Evolution, Robustness and Generality of a Team of Simple Agents with Asymmetric Morphology in the Predator-Prey Pursuit Problem

One of the most desired features of autonomous robotic systems is their ability to accomplish complex tasks with a minimum amount of sensory information. Often, however, the limited amount of information (simplicity of sensors) must be compensated by more precise and complex control. An optimal tradeoff between the simplicity of sensors and control would result in robots featuring better robustness, higher throughput of production and lower production costs, reduced energy consumption, and the potential to be implemented at very small scales. In our work we focus on a society of very simple robots (modeled as agents in a multi-agent system) that feature an "extreme simplicity" of both sensors and control. The agents have a single line-of-sight sensor, two wheels in a differential drive configuration as effectors, and a controller that does not involve any computing but rather performs a direct mapping of the currently perceived environmental state into a pair of velocities of the two wheels. We applied genetic algorithms to evolve a mapping that results in effective behavior of the team of predator agents towards the goal of capturing the prey in the predator-prey pursuit problem (PPPP), and demonstrated that the simple agents featuring the canonical (straightforward) sensory morphology can hardly solve the PPPP. To enhance the performance of the evolved system of predator agents, we propose an asymmetric morphology featuring an angular offset of the sensor relative to the longitudinal axis. The experimental results show that this change brings a considerable improvement in both the efficiency of evolution and the effectiveness of the evolved capturing behavior of the agents.
Finally, we verified that some of the best-evolved behaviors of predators with a sensor offset of 20° are both (i) general, in that they successfully resolve most of the additionally introduced, unforeseen initial situations, and (ii) robust to perception noise, in that they show a limited degradation of the number of successfully solved initial situations.


Introduction
One of the most desired features of autonomous robotic systems is their ability to accomplish complex tasks with a minimum amount of sensory information [1]. Often, however, the limited amount of sensory information (simplicity of sensors) must be compensated by more precise and more complex control [2]. An optimal tradeoff between the amount of available sensory information and the complexity of control would result in robots featuring less complicated design, better robustness, higher throughput of production and lower production costs, reduced energy consumption, and the potential to be implemented at very small (nano- and micro-) scales. Multi-robot systems are highly nonlinear and difficult to formalize. Therefore, their desired sensory morphology and/or behavior are usually developed via heuristic, nature-inspired approaches (such as, for example, evolutionary computation). The simplicity of such robots often implies a reduced size of the search space, and therefore more efficient heuristics [3,4]. Motivated by these advantages of simple robots, we consider a society of very simple robots, modeled as agents in a multi-agent system (MAS), that feature an "extreme simplicity" [5] of both sensors and control.
Bhattacharya et al. [1] presented one of the first works on sensory constraints for robots featuring two wheels in a differential drive configuration (the simplest possible effectors) that are required to solve complex tasks such as navigation. The notion of sensory constraints was later developed into the concept of the minimum amount of sensory information that is adequate for robots with two wheels as effectors to accomplish a task of a given complexity. Yu et al. [2] proposed the simple "windshield" (field of view) sensor. The proposed sensor was further minimized to a single line-of-sight sensor that could be viewed as a special case of the "windshield" featuring a nearly zero angle of the visual field [5]. An important feature of such a simple sensor is that it could be implemented by a single (or, at most, a few) receptor(s), e.g., a single (or a few) pixel(s) of a camera. Gauci et al. [5,6] previously modelled these "extremely simple" agents, featuring a line-of-sight sensor with an unlimited range, two wheels in a differential drive as effectors, and a simple reactive controller that does not compute, and proved that they are able to self-organize in order to solve the (simple) robot aggregation problem. The same framework was also successfully applied to the more complex object-clustering problem [7], in which the agents need to interact with an immobile object. The ability of a team of such agents to conduct an elaborate social (surrounding) behavior in an environment featuring dynamic objects was recently demonstrated by Ozdemir et al. [8] in solving the shepherding problem, where a team of simple agents (shepherds) needs to guide multiple dynamic agents (sheep) toward an a priori defined goal.
In our current work, we adopted similar simple agents with a single line-of-sight sensor, two wheels in a differential drive configuration as effectors, and a controller that requires no memory and involves no computing [5][6][7][8]. Rather, the controller defines the action of the agent as a direct, purely reactive mapping of the currently perceived environmental state to the velocities of the two wheels. However, in order to further reduce the complexity of the sensor (and, at the same time, to improve its realism), we limited the range of its visibility [3,4]. Moreover, in contrast to previous studies [5][6][7][8][9], we challenged these simple agents to solve a special case of the well-studied but difficult predator-prey pursuit problem (PPPP) [10]. This task is more complex in that it requires the agents (predators) to exhibit more diverse behaviors, including exploring the environment, surrounding the prey, and capturing it [11][12][13]. The emergence of such behavior in the proposed PPPP is even more challenging due to the additionally reduced complexity (limited range of visibility) of the sensors, the simple moving abilities, and the direct reactive control of the predator agents [3,4]. The maximum speed of the predators is also limited to that of the prey, since otherwise (if the predators were faster) the task of capturing the prey would be trivial. Also, to make the model more realistic (and the task of capturing more challenging), contrary to the previous studies involving similar simple agents [7,8], the initial position of the predator agents is such that the prey is not surrounded by them. The PPPP is widely used as a benchmark for the effectiveness of emergent complex, coordinated behavior of agents in MAS. It could serve as a model of various potential real-world applications of both macro- [11,14,15] and micro-robots [16][17][18][19].
Our objective is to verify whether such a team of simple predator agents could successfully solve the PPPP, and to investigate what changes to their morphologies could be proposed (without compromising their simplicity) in order to improve the effectiveness of the behavior of these agents. We are also interested in whether genetic algorithms (GA) could be employed to evolve a direct mapping of the perceived environmental states to the agents' wheel velocities that yields a successful capture of the prey by the team of these simple predator agents. Finally, we would like to investigate the generality of the evolved behaviors of the team of simple predator agents to unforeseen initial environmental situations, and the robustness of these behaviors to various levels of perceptual noise.
The remainder of this article is organized as follows. Section 2 describes the entities in the PPPP. In Section 3 we elaborate on the GA adopted for the evolution of predator behaviors. In Section 4 we present the experimental results and introduce the proposed asymmetric sensory morphology of predators. In the same section, we discuss the robustness of the evolved behavior. Section 5 reviews some common questions that might arise. We draw conclusions in Section 6.


The Predators
Our team of predators consists of eight identical simple cylindrical robots, featuring a sensor with restricted range and two wheels controlled by two motors in a differential drive configuration. Table 1 shows the main features of the team of predator agents. The sensor, aligned with the longitudinal axis of the agent, could comprise two photodetectors, sensitive to non-overlapping wavelengths of (ultraviolet, visible, or infrared) light emitted by the predators and the prey, respectively. The sensor reading provides information in binary format for each type of entity (either a predator or a prey) in its range: 1 if the corresponding entity is detected, and 0 otherwise. Such sensors allow the predators to perceive only four discrete environmental states, as shown in Figure 1. The state <11> is the most challenging one, and it could be sensed under the following assumption: when two entities are in the line-of-sight, one does not obscure the other. The perceived environmental states do not provide the predators with any insight about the distance to the perceived entities, nor about their total number.
The agents feature a rather simple, purely reactive architecture that defines the action (i.e., the instant velocities of the two wheels) of the agent as a direct mapping of the currently perceived environmental state. For simplicity, hereafter we will assume a mapping into the linear velocities of the wheels, expressed as a percentage, within the range [−100% … +100%], of their respective maximum linear velocities. Negative values of the velocities would result in a backward rotation of the corresponding wheel(s). The decision-making of the predator agents could be formally expressed by the following octet D:

D = ⟨V00L, V00R, V01L, V01R, V10L, V10R, V11L, V11R⟩ (1)

where V00L, V00R, …, V11L, and V11R are the linear velocities (as a percentage of the maximum linear velocity) of the left and right wheels of the predators for the perceived environmental states <00>, <01>, <10>, and <11>, respectively. Our objective of evolving (via GA) the optimal direct mapping of the four perceived environmental states into the respective velocities of the wheels could be re-phrased as evolving such values of the velocities, shown in the octet in Equation (1), that result in an efficient capturing behavior of the team of predator agents.
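As an illustration, the direct state-to-action mapping of Equation (1) can be sketched as a simple lookup into the octet D; the velocity values below are arbitrary placeholders, not the evolved ones.

```python
# Sketch of the memoryless reactive controller: the perceived state
# <predator_seen, prey_seen> indexes directly into the octet D.
# The velocity values below are illustrative placeholders, not evolved ones.

# Octet D = (V00L, V00R, V01L, V01R, V10L, V10R, V11L, V11R), each value
# a percentage of the maximum linear wheel velocity in [-100, +100].
D = (40, 40, -20, 80, 80, -20, 100, 100)

def act(predator_seen: int, prey_seen: int) -> tuple[int, int]:
    """Map the perceived state directly to (left, right) wheel velocities."""
    state = 2 * predator_seen + prey_seen  # <00>=0, <01>=1, <10>=2, <11>=3
    return D[2 * state], D[2 * state + 1]

# State <00>: nothing perceived; the agent executes its "default" motion.
left, right = act(0, 0)
```

Note that no state is stored between invocations: the same perceived state always produces the same pair of wheel velocities.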

The Prey
The prey is equipped with an omnidirectional sensor with a limited range of visibility. To balance the advantage that the omnidirectional sensor gives to the prey, compared to the single line-of-sight sensor of the predators, the viewing distance of the prey is only 50 units, compared to the 200 units of the predators. The maximum speed of the prey, however, is identical to that of the predators. These conditions would encourage the predator agents to evolve cooperative behaviors, as they will be unable to capture the prey alone. From another perspective, an eventual successful solution to the PPPP, defined in such a way, could demonstrate the virtue of the MAS, as it could solve a problem that a single (predator) agent could not.
In contrast to the evolved behaviors of the predator agents, we implemented a handcrafted behavior for the prey. The prey attempts to escape from the closest predator (if any) by running at its maximum speed in the direction that is exactly opposite to the direction of the predator. The prey remains still if it does not detect any predator. Table 1 shows the main features of the prey agent.
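The handcrafted escape rule can be sketched as follows; the function names and the point-mass representation of the agents are illustrative assumptions.

```python
import math

# Sketch of the handcrafted prey behavior: flee at full speed directly
# away from the closest visible predator; remain still otherwise.
PREY_RANGE = 50.0  # viewing distance of the prey, in world units

def prey_step(prey, predators, speed, dt):
    """One update of the prey position; agents are modeled as 2D points."""
    visible = [p for p in predators if math.dist(prey, p) <= PREY_RANGE]
    if not visible:
        return prey  # no predator detected: remain still
    closest = min(visible, key=lambda p: math.dist(prey, p))
    # Direction exactly opposite to the direction of the closest predator.
    angle = math.atan2(prey[1] - closest[1], prey[0] - closest[0])
    return (prey[0] + speed * dt * math.cos(angle),
            prey[1] + speed * dt * math.sin(angle))
```

For example, a prey at the origin with a predator 10 units to the east moves due west at full speed; a predator 100 units away (beyond the 50-unit range) leaves the prey motionless.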

The World
We modelled the world as a two-dimensional infinite plane with a visualized section of 1600 × 1600 units. We update the perceptions, decision-making, and the resulting new state (e.g., location, orientation, and speed) of the agents with a sampling interval of 100 ms. The duration of a trial is 120 s, modelled in 1200 time steps. We approximate the new state of the predators through the following two steps. First, from the current orientation, the yaw rate, and the duration of the sampling interval, we calculate the new orientation (as an azimuth) of the agents. The yaw rate is obtained from the difference between the linear velocities of the left and right wheels, and the length of the axis between the wheels. Second, we calculate the new position (i.e., the two-dimensional Cartesian coordinates) as a projection (in time, equal to the duration of the sampling interval) of the vector of the linear velocity of the predator. The vector is aligned with the newly calculated orientation, and its magnitude is equal to the mean of the linear velocities of the two wheels.
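The two-step update above can be sketched as standard differential-drive kinematics; the axle length is an assumed value, as the text does not specify it.

```python
import math

# Sketch of the two-step state update for a differential-drive predator.
DT = 0.1      # sampling interval: 100 ms
AXLE = 10.0   # distance between the wheels, in world units (assumed)

def update_state(x, y, heading, v_left, v_right):
    """Return the new pose (x, y, heading) after one sampling interval."""
    # Step 1: new orientation from the yaw rate, which follows from the
    # difference of the two wheel velocities and the axle length.
    yaw_rate = (v_right - v_left) / AXLE
    heading = heading + yaw_rate * DT
    # Step 2: new position, projecting the linear-velocity vector
    # (mean of the two wheel velocities) along the new orientation.
    v = (v_left + v_right) / 2.0
    return (x + v * DT * math.cos(heading),
            y + v * DT * math.sin(heading),
            heading)
```

Equal wheel velocities yield straight-line motion with an unchanged heading, while opposite velocities yield a pure rotation in place.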

Evolving the Behavior of Predator Agents
A MAS, as a complex system, features a significant semantic gap between the hierarchically lower-level properties of the agents and the (emergent) higher-level properties of the system as a whole. Thus, we could not analytically infer the optimal velocity values of the wheels of the agents from the desired behavior of the team of predator agents. Therefore, we applied the GA, a nature-inspired heuristic approach, to gradually evolve good values of the parameters, similar to the evolution of species in nature. GA have proven to be efficient in finding the optimal solution(s) to combinatorial optimization problems featuring large search spaces [20][21][22]. Thus, consonant with the concept of evolutionary robotics [23], we adopted the GA to evolve good values of the eight velocities of the wheels of the predators that result in an efficient behavior of the team of predators, presumably involving exploring the environment, surrounding the prey, and capturing it. The main attributes of the GA are elaborated below.

Evolutionary Setup
We decided to apply a heuristic, evolutionary approach to the "tuning" of the velocities of both wheels for each of the four perceived environmental situations because we are a priori unaware of the values of these velocities that would result in a successful behavior of the team of predator agents. As we briefly mentioned in Section 1, a MAS, as a complex system, features a significant semantic gap between the simple, hierarchically lower-level properties of the agents and the more elaborate, higher-level behavior of the whole system. Consequently, we would be unable to formally infer the values of the octet of velocities of the wheels of the agents from the desired behavior of the team of such agents. Similarly, as we will elaborate later, we shall consider an asymmetric morphology of the predator agents with a sensor that is angularly offset relative to the longitudinal axis of the agents. Naturally, we are unaware of the optimal value of the angular offset of the sensor that would result in a successful capturing behavior of the predator agents, and this optimal value could, in principle, be evolved via GA too. Moreover, the values of the velocities of both wheels and the value of the angular offset of the sensor would, most likely, be dependent on each other; consequently, a co-evolution of the morphology (the sensor offset) and the behavior (velocities of wheels) might be feasible too.
Alternatively, we could have adopted another, deterministic, approach, such as, for example, a complete enumeration of the possible combinations of the eight velocities of the wheels and the sensor offset. If each of these 8 velocities is discretized into, for example, 40 possible integer values, ranging from −100% to +100%, then the size of the resulting search space would be equal to 40^8, or about 6.55 × 10^12. This would render the eventual "brute force" approach, based on a complete enumeration of the possible combinations of values of velocities, computationally intractable.
As an alternative to the brute-force search, we could apply reinforcement learning (RL) in order to define a good mapping of the four perceived environmental states into the four pairs of velocities of the wheels. However, MAS are complex, non-linear systems, and there is a significant gap between the properties of the entities and the (emergent) properties of the system as a whole. RL would obtain a "reward" from the system as a whole (i.e., the efficiency of the team of predators in capturing the prey), and would try to modify the properties (the four pairs of velocities of the wheels) of the entities. Due to the complexity and non-linearity of MAS, defining which pairs of velocities should be modified, and how, is not a straightforward task. This challenge is also related to the intra-agent credit (or blame) assignment problem, as we could not tell which part(s) of the agents is responsible (and therefore should be modified) for the unsatisfactory overall behavior of the team of predators. Conversely, the GA (and evolutionary computing, in general) solves this challenge in an elegant way by obtaining the fitness value as a measure of the performance of the system as a whole (i.e., the efficiency of the team of predators in capturing the prey) and modifying the properties of the entities (pairs of velocities of the wheels of the predators) of the selected best-performing teams of predators via the genetic operations of crossover and mutation.
Yet another challenge in RL is the delayed reward problem. The success (if any) of the system (the team of predators) would occur several hundred time steps into the trial. However, this success might be determined by the earlier behavioral phases of the team of predators, such as the dispersing (exploration of the environment) at the very beginning of the trial. Regarding the delayed reward problem, evolution, as a holistic approach, does not pay specific attention to the timing of the particular (emergent) behaviors of the agents, but rather to the overall (final) outcome of the trial.
In our work we apply the GA, a nature-inspired heuristic approach that gradually evolves the values of a set of parameters (the values of the eight velocities of the wheels) in a way similar to the evolution of species in nature. The main algorithmic steps of our GA are shown in Algorithm 1, and its main attributes, namely the genetic representation, the genetic operations, and the fitness function, are elaborated below.

Algorithm 1. Main steps of genetic algorithms (GA).
Step 1: Creating the initial population of random chromosomes;
Step 2: Evaluating the population;
Step 3: WHILE not (Termination Criteria) DO Steps 4-7:
Step 4: Selecting the mating pool of the next generation;
Step 5: Crossing over random pairs of chromosomes of the mating pool;
Step 6: Mutating the newly created offspring;
Step 7: Evaluating the population.
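The steps of Algorithm 1 can be sketched as a generic loop; `create`, `evaluate`, `crossover`, and `mutate` are hypothetical caller-supplied callbacks (with `evaluate` standing in for the costly multi-agent simulation), while the population size of 400, the 4 elite chromosomes, and the termination threshold of 600 follow the setup described in this section.

```python
import random

# Minimal sketch of Algorithm 1 with binary tournament selection and elitism.
def genetic_algorithm(create, evaluate, crossover, mutate,
                      pop_size=400, elite=4, generations=100, target=600):
    population = [create() for _ in range(pop_size)]               # Step 1
    scored = sorted((evaluate(c), c) for c in population)          # Step 2

    def tournament():
        # Binary tournament: the better of two randomly drawn chromosomes.
        return min(random.sample(scored, 2))[1]

    for _ in range(generations):                                   # Step 3
        if scored[0][0] < target:      # termination criterion reached
            break
        pool = [c for _, c in scored[:elite]]  # elites survive unchanged
        while len(pool) < pop_size:                                # Step 4
            a, b = crossover(tournament(), tournament())           # Step 5
            pool += [mutate(a), mutate(b)]                         # Step 6
        scored = sorted((evaluate(c), c) for c in pool[:pop_size]) # Step 7
    return scored[0]  # (best fitness, best chromosome)
```

Because the fitness is minimized, the list of (fitness, chromosome) pairs is kept sorted in ascending order and the loop stops as soon as the best value drops below the target.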

Genetic Representation
We genetically represent both (i) the decision-making behavior of the predator agents and (ii) their sensory morphology in a single "chromosome". The latter consists of an array of eight integer values of the evolved velocities of the wheels of the agents, and an additional allele encoding the angular offset of their sensor. The values of the velocities are constrained within the range [−100% … +100%], and are divided into 40 possible discrete values, with an interval of 5% between them. This number of discrete values (and, respectively, the interval between these values) provides a good trade-off between the precision of "tuning" (i.e., the expressiveness of the genetic representation) and the size of the search space of the GA. The population size is 400 chromosomes. The breeding strategy is homogeneous, in that the performance of a single chromosome cloned to all predators is evaluated.
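A random chromosome under this representation can be sketched as below. Note that 40 equally spaced 5% values cannot include both endpoints of [−100, +100]; the alphabet here runs from −100 to +95, which is one possible convention and an assumption, as is the discrete set of sensor-offset values.

```python
import random

# 40 discrete velocity values with a 5% interval: -100, -95, ..., +95
# (one convention yielding exactly 40 values; an assumption).
VELOCITY_VALUES = [-100 + 5 * i for i in range(40)]

def random_chromosome():
    """Eight velocity alleles plus one sensor-offset allele (in degrees)."""
    velocities = [random.choice(VELOCITY_VALUES) for _ in range(8)]
    offset = random.choice([0, 10, 20, 30, 40])  # assumed offset alphabet
    return velocities + [offset]
```

The same chromosome is cloned to all eight predators, reflecting the homogeneous breeding strategy.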

Genetic Operations
Binary tournament is used as the selection strategy in the evolutionary framework. It is computationally efficient, and has proven to provide a good trade-off between the diversity of the population and the rate of convergence of the fitness [21]. In addition to tournament selection, we also adopted elitism, in that the four best-performing chromosomes survive unconditionally and are inserted into the mating pool of the next generation. In addition, we implemented, with equal probability, both a one-point and a two-point crossover. The two-point crossover results in an exchange of the values of both velocities (of the left and right wheels) associated with a given environmental state. This reflects our assumption that the velocities of both wheels determine the moving behavior of the agents (for a given environmental state), and therefore should be treated as a whole, as an evolutionary building block. Two-point crossovers have no destructive effect on such building blocks. The one-point crossover is applied to develop such building blocks (exploration of the search space), while the two-point crossover is intended to preserve them (exploitation)."
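The two crossover operators can be sketched as follows, here acting on the eight velocity alleles only; cutting at pair boundaries is what keeps the (left, right) building blocks intact in the two-point variant.

```python
import random

def one_point(parent_a, parent_b):
    # Cut anywhere: may split a (left, right) pair, exploring new pairs.
    cut = random.randrange(1, len(parent_a))
    return (parent_a[:cut] + parent_b[cut:],
            parent_b[:cut] + parent_a[cut:])

def two_point(parent_a, parent_b):
    # Swap the whole (left, right) pair of one environmental state, so the
    # pair, treated as an evolutionary building block, is never split.
    state = random.randrange(len(parent_a) // 2)
    lo, hi = 2 * state, 2 * state + 2
    child_a, child_b = parent_a[:], parent_b[:]
    child_a[lo:hi], child_b[lo:hi] = parent_b[lo:hi], parent_a[lo:hi]
    return child_a, child_b
```

In the framework described above, one of the two operators would be chosen with equal probability for each mating pair.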

Fitness Evaluation
Our aim is to evolve behaviors of the team of predators that are general to multiple initial situations, rather than a behavior that is specialized to one particular situation. To facilitate such an evolution, we evaluated each of the evolving chromosomes in 10 different initial situations. In each of these situations, the prey is located in the center of the world. The predators are scattered in a small cloud situated south of the prey. Figure 2 shows a sample initial situation. The overall fitness is the sum of the fitness values scored in each of the 10 initial situations. For a successful situation (i.e., the predators manage to capture the prey during the 120 s trial), the fitness is equal to the time needed to capture the prey. If the initial situation is unsuccessful, the fitness is calculated as the sum of (i) the closest distance, registered during the entire trial, between the prey and any predator, and (ii) a penalty of 10,000. The former component is intended to provide the evolution with a cue about the comparative quality of the different unsuccessful behaviors. We verified empirically that this heuristic quantifies the "near-misses" well, and correlates with the chances of the predators, given small evolutionary tweaks to their genome, to successfully capture the prey in the future. The second component is introduced with the intention of heavily penalizing the lack of success of the predators in any given initial situation.
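The fitness computation can be sketched directly from this description; `run_trial` (not shown) would be the simulator returning, for each initial situation, whether the prey was captured, the capture time, and the closest registered distance.

```python
PENALTY = 10_000  # added for each unsuccessfully resolved initial situation

def situation_fitness(captured, capture_time, closest_distance):
    """Fitness of a single trial (lower is better)."""
    if captured:
        return capture_time            # time needed to capture the prey
    return closest_distance + PENALTY  # "near-miss" cue plus the penalty

def overall_fitness(trial_results):
    # Sum over the 10 initial situations; each result is a tuple
    # (captured, capture_time, closest_distance) from the simulator.
    return sum(situation_fitness(*r) for r in trial_results)
```

With this definition, capturing the prey in all 10 situations in an average time under 60 s yields an overall fitness below 600, which is exactly the termination threshold used below.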
Our PPPP is an instance of a minimization problem, as a lower objective function (OF) value corresponds to a better-performing team of predator agents. The evolution terminates on OF values lower than 600, which implies a successful capture of the prey in all 10 initial situations in an average time shorter than 60 s (i.e., half of the trial duration). The main parameters of the GA are elaborated in Table 2.

Initially, the predators need to learn how to move and how to locate the prey; therefore, in the first of the ten initial situations the prey is placed very close to the group of predator agents. In each of the nine successive initial situations, the distance between the prey and the group of agents increases. The distance is calculated by the formula: current situation number × 2 + a random value of up to 50 units. Thus, the distance in the first initial situation would be 2 + a random value of up to 50 units, while in the tenth initial situation it would be 20 + a random value of up to 50 units.
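The distance schedule can be sketched in a couple of lines; interpreting the "random of 50 units" term as uniform in [0, 50] is an assumption.

```python
import random

def initial_distance(situation_number):
    """Prey-to-group distance; situation_number runs from 1 (closest) to 10."""
    return situation_number * 2 + random.uniform(0.0, 50.0)
```

Situation #1 thus yields a distance in [2, 52] units, and situation #10 a distance in [20, 70] units.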

Canonical Predators
As we elaborated in our previous work [3], we implemented 32 independent runs of the GA in an attempt to evolve a suitable mapping of the perceived environmental states into corresponding velocities of the wheels of predators with the canonical morphology. The sensor of these agents is aligned with their longitudinal axis. As shown in Figure 3, the evolution is unable to find a solution to the PPPP in all of the 10 tested initial situations. Moreover, the most successful team of predators manages to solve just 6 (of 10) initial situations. The average of about 4 solved initial situations, across the whole set of 32 runs, is an indication that the problem is hard to tackle for the current (canonical) morphology of the predator agents.


Enhancing the Morphology of Predators
To improve the generality of the evolved predator behaviors, we focus on modifying their morphological features. The last of the features listed in Table 1, the orientation of the sensor, implies a straightforward implementation of the agents. This, indeed, is the common configuration of the previously studied simple agents [5][6][7][8][9]. We are interested in whether an a priori fixed asymmetry, an angular offset, would facilitate the evolution of more general behaviors of the team of predators. We speculate that a sensory offset would allow the predators to realize an anticipatory (equiangular, proportional) pursuit of the prey, aiming at the anticipated point of contact with the moving prey, rather than at the currently perceived position of the prey. Notice that the proposed asymmetric morphology does not compromise the intended simplicity of the predator agents.
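Geometrically, the proposed asymmetry amounts to rotating the line-of-sight ray counterclockwise by a fixed angle relative to the heading; a minimal sketch (function name and point-ray representation are illustrative assumptions):

```python
import math

SENSOR_RANGE = 200.0  # viewing distance of the predators, in world units

def sensor_ray_end(x, y, heading_rad, offset_deg):
    """Endpoint of the line-of-sight ray; offset 0 recovers the canonical morphology."""
    a = heading_rad + math.radians(offset_deg)  # counterclockwise offset
    return (x + SENSOR_RANGE * math.cos(a),
            y + SENSOR_RANGE * math.sin(a))
```

Because the controller steers so as to keep the prey on this rotated ray, the agent's velocity vector leads the perceived bearing of the prey by the offset angle, which is what permits the anticipated, rather than purely reactive, pursuit.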
In our experimental setup, we fixed the offset of all predators to 10°, 20°, 30°, and 40° counterclockwise, and conducted 32 evolutionary runs of the GA for each of these 4 configurations. The results are shown in Figure 4a-d, respectively, and summarized in Table 3. As Figure 4a and Table 3 illustrate, offsetting the sensors by only 10° significantly improves the generality of the evolved predator behaviors. They can resolve all 10 situations in 30 (93.75%) of the 32 evolutionary runs. The probability of success (the statistical estimation of the efficiency of evolution, defined for the PPPP as the probability to resolve all 10 initial situations) reaches 90% by generation #60 (Table 3). The terminal value of the OF in the worst evolutionary run is 10,987, corresponding to only one unresolved initial situation. Figure 4 shows the convergence of the values of the best objective function (top) and the number of successful situations (bottom) of the 32 runs of the GA evolving predators with sensor offsets of (a) 10°, (b) 20°, (c) 30°, and (d) 40°, respectively. The bold curves correspond to the mean, while the envelope illustrates the minimum and maximum values in each generation.
More efficient evolution, and behaviors that are more general, were obtained for the sensory offsets of 20° and 30°. As Figure 4b and Table 3 depict for 20°, the predators successfully resolved all 10 initial situations in all 32 evolutionary runs. The probability of success reaches 90% relatively quickly-by generation #9 (Table 3). Both the efficiency of evolution and the generality of the predator behaviors are similar for agents with sensory offsets of 20° and 30°, while these two characteristics deteriorate with the further increase of the offset to 40° (Figure 4c,d, and Table 3).
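The "probability of success" statistic quoted above can be estimated as sketched below. This is our own illustration, assuming per-run records of the generation at which each run first resolved all 10 initial situations; the function name and data layout are ours:

```python
# Sketch: estimate the probability of success at each generation from 32
# independent runs. success_gen[i] is the first generation at which run i
# resolved all 10 initial situations (None if it never did).
def probability_of_success(success_gen, n_generations=200):
    n_runs = len(success_gen)
    prob = []
    for g in range(n_generations + 1):
        solved = sum(1 for s in success_gen if s is not None and s <= g)
        prob.append(solved / n_runs)
    return prob

# Toy data for 32 runs: most succeed early, one run never succeeds.
gens = [5, 7, 9, 9, 12, None] + [8] * 26
p = probability_of_success(gens)
print(p[9] >= 0.9)  # True: 30 of the 32 toy runs have succeeded by generation 9
```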


Generality of the Evolved Behavior
After seeing the success which the sensory offset brought to solving the initial 10 test situations, we investigated the generality of the best-evolved teams of predators on an extended set of 100 initial situations, including the same 10 initial situations used in the evolution and 90 additional situations unforeseen during the evolution. For the 10 initial situations (from situation #1 to situation #10) used during the evolution, the predators are dispersed south of the prey (as illustrated in Figure 2) such that the average distance between the prey and the group of predators slightly increases with each situation. For the additional situations (from situation #11 to situation #100), the average distance between the predators and the prey is kept the same as that of the situation with the most distant predators used during the evolution-situation #10; however, both the position and the orientation of the predators are random. We would like to note that the alternative approach of exploring the generality of the agents by evolving them directly on 100 initial situations would incur a greater computational overhead, as the total number of fitness trials would be almost 10-fold higher.
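The additional situations could be generated as sketched below. This is a hypothetical reconstruction under our own assumptions (field coordinates, distance ranges, and names are ours): positions and orientations are drawn at random and then rescaled so that the average predator-to-prey distance matches that of situation #10.

```python
import math
import random

def random_situation(n_predators, target_avg_dist, prey=(0.0, 0.0)):
    """Place predators at random positions/orientations around the prey, then
    rescale the offsets so the average distance equals target_avg_dist."""
    offsets = []
    for _ in range(n_predators):
        angle = random.uniform(0, 2 * math.pi)    # direction from the prey
        radius = random.uniform(0.5, 1.5)         # unscaled distance (arbitrary)
        heading = random.uniform(0, 2 * math.pi)  # random orientation
        offsets.append((radius * math.cos(angle), radius * math.sin(angle), heading))
    avg = sum(math.hypot(x, y) for x, y, _ in offsets) / n_predators
    scale = target_avg_dist / avg
    return [(prey[0] + x * scale, prey[1] + y * scale, h) for x, y, h in offsets]

# 90 additional situations, all sharing the average distance of situation #10
# (the value 400.0 is an assumed placeholder, not taken from the paper).
situations = [random_situation(8, target_avg_dist=400.0) for _ in range(90)]
dists = [math.hypot(x, y) for x, y, _ in situations[0]]
print(abs(sum(dists) / len(dists) - 400.0) < 1e-6)  # True: average distance preserved
```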
We conducted the experiments on the set of these 100 initial situations with the 32 evolved best-of-run teams of predators featuring a sensor offset of 20°. Our choice of this particular morphological configuration is based on its superiority in terms of both (i) the quality (fitness value) of the evolved best-of-run teams of agents and (ii) the consistency (probability of success) in evolving these teams, as illustrated in Table 3. Figure 5 shows the experimental results of the number of successfully solved initial situations by each of the 32 evolved best-of-run teams of predators. The results shown in Figure 5 demonstrate that the sensory offset of 20° results in behaviors of predator agents that are quite general (rather than over-fitted to particular initial situations). Indeed, only three solutions resolved fewer than 60 of all 100 initial situations (obtained from evolutionary runs #11, #22, and #25), while the best solution (obtained from evolutionary run #17) resolves 97 initial situations. The evolved mapping of the velocities of the wheels of the most general solution #17 is shown in Table 4.


Robustness to Sensory Noise
After verifying the generality of the 32 evolved best-of-run behaviors of the team of predator agents, we decided to investigate how each of these behaviors degrades when random noise is introduced into the perception readings. Revisiting the same set of 100 initial situations (including the same 10 initial situations used in the evolution and 90 additional unforeseen situations), we introduced two types of noise: false positive (FP) and false negative (FN). Under the influence of FP noise, the value of either (randomly chosen with a probability of 50%, individually for every predator agent, on each time step) of the two bits of sensory information is read as one, regardless of the actual reading of the sensor. On the contrary, in the presence of FN noise, the reading is registered as zero, even if the corresponding entity (a predator or the prey) is in the line-of-sight. We conducted experiments for levels of either FP or FN noise, starting from 2% and increasing by a multiplier of 2, up to 16% (i.e., 2%, 4%, 8% and 16%).
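The two noise models can be sketched as follows. This is our own reading of the description above (the function name is ours, and we assume the noise level is the per-step probability that a reading is corrupted at all):

```python
import random

def apply_noise(reading, noise_level, kind):
    """Corrupt a 2-bit sensor reading (predator_seen, prey_seen).

    With probability noise_level the noise fires; when it does, one of the
    two bits (chosen with 50% probability) is overridden: forced to 1 for
    false-positive (FP) noise, or to 0 for false-negative (FN) noise.
    """
    predator_seen, prey_seen = reading
    if random.random() < noise_level:
        bit = random.randrange(2)          # pick either bit with p = 50%
        forced = 1 if kind == "FP" else 0  # FP reads 1, FN reads 0
        if bit == 0:
            predator_seen = forced
        else:
            prey_seen = forced
    return predator_seen, prey_seen

# Example: 8% false-negative noise on a reading where the prey is in sight.
print(apply_noise((0, 1), 0.08, "FN"))
```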
The experimental results obtained, as shown in Figure 6, are somewhat unexpected. We anticipated that the number of solved initial situations would decrease in a monotonic way with the increase of the level of FP noise. However, most of the 32 best-of-run behaviors are quite robust to FP noise in that the number of solved initial situations varies only slightly compared to the number of solved situations in a noiseless environment. Moreover, often the number of solved situations anomalously increases with the increase of noise levels. Notable behaviors that exhibit a slight increase in the number of solved situations with the increase of noise are, for example, those obtained from evolutionary runs #7 and #15. For behavior #15, the number of successful situations for 4% noise is significantly higher than that obtained in a noiseless environment (91 vs. 83, respectively, or a 10% increase).

As illustrated in Figure 7, the detrimental effect of FN noise is as we expected: for most of the 32 best-of-run behaviors, the number of solved initial situations steadily decreases with the increase of noise levels. Even so, there are some behaviors that exhibit an anomalous increase of the number of solved situations with the increase of noise levels. The notable behaviors are #7, #16, #19 and #28. Each of these behaviors results in an increase in the number of solved initial situations for 8%, 2%, 16% and 4% noise, respectively, as seen from Figure 7. These results demonstrate that both the type and the magnitude of the perception noise influence the robustness of the evolved behaviors of predator agents. The effect of FP noise on the behavior of the team of predator agents is sometimes detrimental, sometimes favorable, or often insignificant. On the other hand, FN noise, with few exceptions (e.g., behaviors #7, #17, #19, #23 and #28), is detrimental to the performance of the agents.

Discussion
It is interesting to note that the evolved behavior #17 (shown in Table 4) is rather versatile in that it (i) exhibits good performance on the 10 initial situations used during the evolution, (ii) is general to most unforeseen initial situations (solving 97 of 100 situations), and (iii) is robust to both FP and FN noise. Furthermore, behaviors such as #7, #23 and #28 exhibit robustness to both types of noise and even increased performance for certain levels of noise.

Advantages of the Proposed Asymmetric Morphology
The experimental results indicate that the introduction of the angular offset of the sensor of the agents improves the effectiveness of the team by enabling the emergence of a general and robust (to environmental noise) capturing behavior. It also helps increase the efficiency of the evolution of successful capturing behavior of agents. Our previous work [3], as well as the results provided above, suggest that the behavior evolved with a sensor offset of 20° is the most robust to noise and shows the best generality when subjected to additional, unforeseen (during the evolution) initial situations in a noiseless environment.
The beneficial effect stemming from the addition of a sensory offset is the ability to accomplish reliable tracking of the prey by predicting its position in case it disappears from sight. Due to the counterclockwise sensory offset, and the parallax induced by the movement of the predator, most of the time a prey that has disappeared would be located to the left of the predator. The GA manages to discover this knowledge and to take advantage of it by setting the velocities of the wheels in such a way that allows the predator to turn to the left (V00-Table 4)-i.e., towards the most probable position of the disappeared prey.
Therefore, one of the virtues of the sensor offset is that it results in a more deterministic direction of the disappearance of the prey-almost certainly to the left-which, in turn, facilitates a faster rediscovery, and consequently a more reliable tracking of the prey by the predator. Moreover, as shown in Figure 8, the chase by the predator featuring an asymmetric morphology results in a characteristic circular trajectory of both the predator and the prey. This behavior is more efficient compared to that of the predators with the canonical straightforward morphology. In the latter case, the predator could not make any reliable prediction about the most likely direction of the disappearance of the chased prey. With the challenging but realistic assumption that the prey is not initially surrounded by the predators (as illustrated in Figure 2), such emergent circular trajectories facilitate the surrounding, as the prey is shepherded (driven) by a single predator (driver) towards the pack of the remaining predators. Significantly reducing the sensor offset from 20° would stretch the chasing trajectory of the predator. Moreover, in such a case, as the chased prey becomes closer to the longitudinal axis of the predator, in order not to compromise the certainty that the prey has disappeared to the left, the predator would need to turn slightly to the right (instead of going straight, as shown in Figure 8 for environmental state <01>)-by reducing the speed of the right wheel-during the chase. This, in turn, would reduce the overall chasing speed of the predator. These two factors-stretching the chasing trajectory and reducing the chasing speed of the predator-would increase the time needed to drive the prey towards the dispersed predators. Conversely, increasing the sensor offset would result in a more compact chasing trajectory that might not stretch enough to reach back to the pack of the remaining predators.
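The leftward circling produced by V00 follows directly from differential-drive kinematics. A minimal sketch (the wheel base and the speed units are assumed values, not taken from the original setup):

```python
# Sketch: turning radius of a differential-drive agent for the V00 mapping.
# With the left wheel at 25% and the right wheel at 100% of maximum speed,
# the agent follows a counterclockwise (leftward) circle of radius
# R = (b / 2) * (v_l + v_r) / (v_r - v_l), where b is the wheel base.
def turning_radius(v_left, v_right, wheel_base):
    if v_left == v_right:
        return float("inf")  # equal speeds: straight-line motion
    return (wheel_base / 2) * (v_left + v_right) / (v_right - v_left)

b = 0.1                            # assumed wheel base (arbitrary units)
r = turning_radius(0.25, 1.00, b)  # V00: left wheel 25%, right wheel 100%
print(r > 0)        # True: positive radius, the centre of the turn is to the left
print(round(r, 4))  # 0.0833
```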
Information 2018, 9, x FOR PEER REVIEW 12 of 17



Emergent Behavioral Strategies
The current research, as well as our previous work on agents with asymmetric morphology [3], suggests that the solution most robust to noise and with the greatest success rate in the generality tests is the behavior obtained from evolutionary run #17. In this section, we will use that specific behavior to review the behavioral strategies of the team of predator agents emerging from the evolution of the velocity mappings by the GA framework. The values of the evolved velocities of the motors and the sensor offset of behavior #17 are shown in Table 4. The team of predator agents exhibits the following four behavioral strategies, executed in four consecutive phases of the trial: (i) circling around until they find a peer or the prey (controlled by velocities V00), (ii) exploring the environment by distancing themselves from each other (controlled by velocities V10), (iii) surrounding by shepherding (driving) the prey (by some of the predators-the drivers) in a circular trajectory (V01), and (iv) capturing the prey (V11). Figure 9 illustrates the different phases the agents go through in the process of catching the prey. A video of how the team of predators deals with all 10 initial situations can be found at http://isd-si.doshisha.ac.jp/m.georgiev/2018-12-08-SA20deg.mp4.
As shown in Figure 9a, in the beginning all agents have no vision of either the prey or any of the peers. Following the mapping of V00L = 25% and V00R = 100% (as shown in Table 4), they start circling around, scanning the environment in an attempt to find another entity. Detecting a peer activates the set of velocities V10L = −25% and V10R = −20%, which forces the predators to enter the second stage: to move away from the perceived agent, which facilitates a better dispersion and the coverage of a wider area. This enhances the ability of the predators to explore the environment and to discover the prey. The third stage-surrounding-begins when any of the predators discovers the prey. The mapping V01L = 100% and V01R = 100% results in moving forward at the highest speed, which helps in keeping the prey almost always in the same relative position to the agent-i.e., on the left side, as shown in Figures 8 and 9b-e. Once the prey disappears from view-as shown in the center panel of Figure 8-the predator exhibits an embodied cognition that the disappearance is a result, in part, of its own forward motion; therefore, the new location of the prey is-due to the counterclockwise offset of the sensor-most likely to the left of its own orientation. Then the evolved V00L = 25% and V00R = 100% are activated (Figure 8, right), resulting in a circular motion to the left until the agent rediscovers the disappeared prey.
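A single differential-drive update step under the mapping of Table 4 can be sketched as follows. The mapping values are those quoted above; the time step, wheel base, and maximum speed are assumed values of ours:

```python
import math

# Evolved mapping of Table 4 (percent of maximum wheel speed), indexed by the
# 2-bit state <predator_seen, prey_seen>.
TABLE4 = {
    (0, 0): (25, 100),   # V00: circle left, scanning for entities
    (1, 0): (-25, -20),  # V10: back away from a perceived peer (dispersal)
    (0, 1): (100, 100),  # V01: full speed ahead, shepherding the prey
    (1, 1): (100, 100),  # V11: close in and capture
}

def step(pose, state, v_max=1.0, wheel_base=0.1, dt=0.1):
    """Advance one differential-drive step under the Table 4 mapping."""
    x, y, theta = pose
    vl_pct, vr_pct = TABLE4[state]
    vl, vr = vl_pct / 100 * v_max, vr_pct / 100 * v_max
    v = (vl + vr) / 2           # linear velocity of the agent's centre
    w = (vr - vl) / wheel_base  # angular velocity (counterclockwise positive)
    return (x + v * math.cos(theta) * dt,
            y + v * math.sin(theta) * dt,
            theta + w * dt)

pose = step((0.0, 0.0, 0.0), (0, 0))  # nothing in sight: V00 applies
print(pose[2] > 0)                    # True: the agent turns to the left
```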
Moreover, as Figure 9b-d show, a single predator-a driver-due to its sensor offset, shepherds (drives) the prey in a circular, counterclockwise trajectory into the (already dispersed) other predators. The fourth (final) behavioral phase concludes the chase by capturing the prey, which is surrounded from all sides by both the driver(s) and the newly encountered predators, as illustrated in Figure 9e,f. When approaching from opposite sides, the predators are able to see both the prey and a peer, which activates the mapping V11L = 100% and V11R = 100%. Since they have a slight angular offset, it is possible for only three predators to catch the prey, as illustrated in Figure 9e,f. One of the predators chases the prey from behind and guides it to its front left side, while the other approaches it from the opposite direction.
At the same time, as shown in Figure 9d-f, two of the agents keep distancing themselves from the group of other predators. The agents seem to exhibit an emergent knowledge [24] that not all eight agents are needed to capture the prey. While sacrificing their own chances to capture the prey in this particular initial situation, in general such behavior helps the team (as a whole) by expanding the initially explored field, and the prey is therefore discovered faster, especially when it is further away from the predators.
The manifestation of the shepherding behavior in the third behavioral phase is probably the most significant difference between the evolved behavior of the canonical straightforward predator agents and that of the agents with asymmetric morphology. This behavior, being a vital part of the successful capturing, could not be observed in the behavior of the canonical predator agents. A video showing how the canonical agents struggle to find and capture the prey is available at http://isd-si.doshisha.ac.jp/m.georgiev/2018-12-03-SA-Straightforward.mp4.

Alternative Approaches
A different approach to finding a solution would be to implement a multi-agent system with several different types of predator agents, in which each of them has a specifically assigned role that contributes towards capturing the prey. In our previous work comparing the performance of heterogeneous and homogeneous MAS [25,26], we analyzed in depth the different problems that heterogeneity introduces. The main reason we did not implement a heterogeneous MAS is that the efficiency of evolution of the heterogeneous system might be hindered by the inflated search space. Additionally, due to the a priori defined specialization of the agents in heterogeneous systems, we could not ensure that the generality of the heterogeneous team-e.g., one consisting of two types of agents, drivers and capturers-would not be compromised, because we could not guarantee that the a priori defined drivers would be in the most favorable position relative to the prey in each of the initial situations. Indeed, in real-world situations, placing a particular predator (driver) in a particular position is challenging and not always possible. Alternatively, we decided to implement a priori unspecialized, versatile agents and to give them the ability to execute any role (depending on their perceived environment) that is needed to capture the prey. In our case, the agent that is closest to the prey would assume the role of a "driver". This behavioral heterogeneity emerges dynamically from the interaction between the homogeneous genotype (all of the agents share the same mapping of the rotation velocities of the wheels) and the environment.

Conclusions
We considered a society of very simple robots (modeled as agents in a MAS) that feature an "extreme simplicity" of both sensors and control. The adopted agents have a single line-of-sight sensor, two wheels in a differential drive configuration as effectors, and a controller that does not require a memory and does not involve any computing, but rather a direct mapping of the currently perceived environmental state into a pair of velocities of the two wheels. Also, we applied genetic algorithms to evolve a mapping that results in effective behavior of the team of predator agents towards the goal of capturing the prey in the predator-prey pursuit problem (PPPP). The preliminary experimental results indicated that the simple agents featuring the canonical straightforward sensory morphology could hardly evolve the ability to solve the PPPP.
To enhance the performance of the evolved system of predator agents, we proposed an asymmetric morphology featuring an angular offset of the sensor relative to the longitudinal axis of the agents. The experimental results demonstrated that this modification brings-without compromising the simplicity of the agents-a considerable improvement of both (i) the efficiency of evolution and (ii) the effectiveness of the evolved capturing behavior of the predator agents. Also, we verified that some of the evolved best-of-run behaviors of predators featuring a sensor offset of 20° are both (i) general in that they are able to successfully solve most of the additionally introduced, unforeseen initial situations, and (ii) robust to perception noise in that they show a limited degradation of the number of successfully resolved initial situations.
The results described in our work could be seen as a step towards the verification that the complex behavior needed for solving challenging tasks can emerge from the coordination of very simple robots featuring an asymmetric sensory morphology. The advantages of such robots, in addition to the simple design, include better robustness, higher throughput of production and lower production costs, reduced energy consumption, and the potential to be implemented at very small (nano- and micro-) scales.
In our future work, we are planning to investigate the anomalous increase of the number of successful situations with the increase of false positive noise. While similar phenomena are well known in engineering (e.g., stochastic resonance, dithering), there are no documented results on the beneficial effects of noise on the performance of MAS. We are also planning to develop an even more realistic, three-dimensional model of the environment of the PPPP.
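The false positive (FP) and false negative (FN) perception noise examined in the robustness experiments (Figures 6 and 7) can be modeled with a simple corruption of the line-of-sight reading. The following sketch is an assumed noise model, not the paper's exact implementation: with probability `fp_rate` an empty reading reports a phantom percept, and with probability `fn_rate` a real percept is missed.

```python
import random

def noisy_reading(true_state, fp_rate=0.0, fn_rate=0.0, rng=random):
    """Corrupt a single line-of-sight reading with FP/FN perception noise.

    true_state is None when nothing lies in the line of sight, otherwise
    the label of the perceived entity (e.g., "prey"). The phantom label
    used for false positives is an illustrative assumption.
    """
    if true_state is None:
        # false positive: report a phantom prey with probability fp_rate
        return "prey" if rng.random() < fp_rate else None
    # false negative: miss a real percept with probability fn_rate
    return None if rng.random() < fn_rate else true_state
```

Sweeping `fp_rate` and `fn_rate` over increasing levels and counting the successfully solved initial situations would reproduce the kind of degradation curves shown in Figures 6 and 7.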

Figure 1. The four possible environmental states perceived by (any) predator agent Ai.

Fitness value: sum of the fitness values of each situation; (a) successful situation: time needed to capture the prey; (b) unsuccessful situation: 10,000 + the shortest distance between the prey and any predator during the trial.
Termination criterion: (# Generations = 200) or (stagnation of fitness for 32 consecutive generations) or (Fitness < 600).
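The fitness criteria above translate directly into code. The sketch below follows the stated scoring (capture time for a successful trial; a 10,000 penalty plus the closest approach to the prey for a failed one); the function names and argument layout are illustrative assumptions.

```python
def situation_fitness(captured: bool, capture_time: int,
                      min_distance: float) -> float:
    """Fitness of a single initial situation (lower is better).

    A successful trial scores the time needed to capture the prey; an
    unsuccessful one scores 10,000 plus the shortest distance between
    the prey and any predator observed during the trial.
    """
    if captured:
        return float(capture_time)
    return 10_000.0 + min_distance

def team_fitness(trials) -> float:
    """Sum the per-situation fitness values over all initial situations."""
    return sum(situation_fitness(*t) for t in trials)
```

The large additive penalty guarantees that any behavior solving even one more situation outranks any behavior that merely gets closer to the prey without capturing it.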

Figure 3. Convergence of the values of the best objective function (OF, top) and the number of successful situations (bottom) of 32 independent runs of the GA evolving the behavior of a team of predator agents with canonical morphology. The bold curves correspond to the mean, while the envelope shows the minimum and maximum values in each generation.

Information 2018, 9, x FOR PEER REVIEW

Figure 4. Convergence of the values of the best objective function and the number of successful situations of 32 independent runs of the GA evolving the behavior of a team of predator agents with sensor offset of (a) 10°, (b) 20°, (c) 30°, and (d) 40°, respectively. The bold curves correspond to the mean, while the envelope illustrates the minimum and maximum values in each generation.

Figure 5. Generality of the 32 evolved best-of-run behaviors of the team of predator agents.

Figure 6. Number of successfully solved initial situations for various levels of false positive (FP) perception noise.

Figure 7. Number of successfully solved initial situations for various levels of false negative (FN) perception noise.

Figure 8. Reliable tracking of the prey by a predator Ai.

Table 1. Features of the predator and prey agents.

Table 2. Parameters of the GA.

Table 3. Efficiency of evolution of the team of predator agents.

Table 4. Velocity mappings of the most prominent behavior #17. The sensor offset is 20°.