Information 2019, 10(2), 72; https://doi.org/10.3390/info10020072

Article
Evolution, Robustness and Generality of a Team of Simple Agents with Asymmetric Morphology in Predator-Prey Pursuit Problem
1 Department of Information System Design, Doshisha University, Kyotanabe, Kyoto 610-0321, Japan
2 Department of Biology, University of Oklahoma, Norman, OK 73019, USA
* Correspondence: [email protected]; Tel.: +81-077-465-6951
This paper is an extended version of our paper presented at the 18th International Conference AIMSA 2018, Varna, Bulgaria, 12–14 September 2018.
Received: 21 January 2019 / Accepted: 17 February 2019 / Published: 20 February 2019

Abstract

One of the most desired features of autonomous robotic systems is their ability to accomplish complex tasks with a minimum amount of sensory information. Often, however, the limited amount of information (simplicity of sensors) must be compensated for by more precise and complex control. An optimal tradeoff between the simplicity of sensors and control would result in robots featuring better robustness, higher throughput of production and lower production costs, reduced energy consumption, and the potential to be implemented at very small scales. In our work we focus on a society of very simple robots (modeled as agents in a multi-agent system) that feature an “extreme simplicity” of both sensors and control. The agents have a single line-of-sight sensor, two wheels in a differential drive configuration as effectors, and a controller that does not involve any computing, but rather a direct mapping of the currently perceived environmental state into a pair of velocities of the two wheels. We applied genetic algorithms to evolve a mapping that results in effective behavior of the team of predator agents toward the goal of capturing the prey in the predator-prey pursuit problem (PPPP), and demonstrated that simple agents featuring the canonical (straightforward) sensory morphology could hardly solve the PPPP. To enhance the performance of the evolved system of predator agents, we propose an asymmetric morphology featuring an angular offset of the sensor relative to the longitudinal axis. The experimental results show that this change brings a considerable improvement of both the efficiency of evolution and the effectiveness of the evolved capturing behavior of the agents. Finally, we verified that some of the best-evolved behaviors of predators with a sensor offset of 20° are both (i) general, in that they successfully resolve most of the additionally introduced, unforeseen initial situations, and (ii) robust to perception noise, in that they show a limited degradation of the number of successfully solved initial situations.
Keywords:
simple agents; micro-robots; asymmetric morphology; predator-prey problem; genetic algorithms

1. Introduction

One of the most desired features of autonomous robotic systems is their ability to accomplish complex tasks with a minimum amount of sensory information [1]. Often, however, the limited amount of sensory information (simplicity of sensors) must be compensated for by more precise and more complex control [2]. An optimal tradeoff between the amount of available sensory information and the complexity of control would result in robots featuring less complicated design, better robustness, higher throughput of production and lower production costs, reduced energy consumption, and the potential to be implemented at very small (nano- and micro-) scales. Multi-robot systems are highly nonlinear and difficult to formalize. Therefore, their desired sensory morphology and/or behavior are usually developed via heuristic nature-inspired approaches (such as, for example, evolutionary computation). The simplicity of such robots often implies a reduced size of the search space, and therefore more efficient heuristics [3,4]. Motivated by these advantages of simple robots, we consider a society of very simple robots modeled as agents in a multi-agent system (MAS) that feature an “extreme simplicity” [5] of both sensors and control.
Bhattacharya et al. [1] presented one of the first works on sensory constraints for robots featuring two wheels in a differential drive configuration (the simplest possible effectors) that are required to solve complex tasks such as navigation. The notion of sensory constraints was later developed into the concept of the minimum amount of sensory information that should be adequate for robots with two wheels as effectors to accomplish a task of a given complexity. Yu et al. [2] proposed the simple “windshield” (field of view) sensors. The proposed sensor was further minimized to a single line-of-sight sensor that could be viewed as a special case of the “windshield” featuring a nearly zero angle of the visual field [5]. The important feature of such a simple sensor is that it could be implemented by a single (or, at most, a few) receptor(s)—e.g., a single (or few) pixel(s) of a camera. Gauci et al. [5,6] previously modelled these “extremely simple” agents—featuring a line-of-sight sensor with an unlimited range, two wheels in a differential drive as effectors, and a simple reactive controller that does not compute—and proved that they are able to self-organize in order to solve the (simple) robot aggregation problem. The same framework was also successfully applied to the more complex object-clustering problem [7], in which the agents need to interact with an immobile object. The possibility of a team of such agents conducting an elaborate social (surrounding) behavior in an environment featuring dynamic objects was recently demonstrated by Ozdemir et al. [8] in solving the shepherding problem, where a team of simple agents (shepherds) needs to guide multiple dynamic agents (sheep) toward an a priori defined goal.
In our current work, we adopted similar simple agents with a single line-of-sight sensor, two wheels in a differential drive configuration as effectors, and a controller that requires neither memory nor any computing [5,6,7,8]. Rather, the controller defines the action of the agent as a direct, purely reactive mapping of the currently perceived environmental state to the velocities of the two wheels in a differential drive configuration. However, in order to further reduce the complexity of the sensor (and, at the same time, to improve its realism), we limited the range of its visibility [3,4]. Moreover, in contrast to previous studies [5,6,7,8,9], we challenged these simple agents to solve a special case of the well-studied but difficult predator-prey pursuit problem (PPPP) [10]. This task is more complex in that it requires the agents (predators) to exhibit more diverse behaviors, including exploring the environment, surrounding, and capturing the prey [11,12,13]. The emergence of such behavior in the proposed PPPP is even more challenging due to the additionally reduced complexity (limited range of visibility) of the sensors, the simple moving abilities, and the direct reactive control of the predator agents [3,4]. The predators' speed is also limited to that of the prey, since otherwise (if the predators were faster) the task of capturing the prey would be trivial. Also, to make the model more realistic (and the task of capturing more challenging), and contrary to the previous studies involving similar simple agents [7,8], the initial position of the predator agents is such that the prey is not surrounded by them. The PPPP is widely used as a benchmark for the effectiveness of emergent complex, coordinated behavior of agents in MAS. It could serve as a model of various potential real-world applications of both macro- [11,14,15] and micro-robots [16,17,18,19].
Our objective is to verify whether such a team of simple predator agents could successfully solve the PPPP, and to investigate what changes to their morphologies could be proposed (without compromising their simplicity) in order to improve the effectiveness of the behavior of these agents. We are also interested in whether genetic algorithms (GA) could be employed to evolve a direct mapping of the perceived environmental states to the agents’ wheel velocities that yields a successful capture of the prey by the team of these simple predator agents. Finally, we would like to investigate the generality of the evolved behaviors of the team of simple predator agents to unforeseen initial environmental situations and the robustness of these behaviors to various levels of perceptual noise.
The remainder of this article is organized as follows. Section 2 describes the entities in the PPPP. In Section 3 we elaborate on the GA adopted for the evolution of predator behaviors. In Section 4 we present the experimental results and introduce the proposed asymmetric sensory morphology of predators. In the same section, we discuss the robustness of the evolved behavior. Section 5 discusses some common questions that might arise. Section 6 draws our conclusions.

2. The Entities

2.1. The Predators

Our team of predators consists of eight identical simple cylindrical robots featuring a sensor with a restricted range and two wheels, driven by two motors in a differential drive configuration. Table 1 shows the main features of the team of predator agents.
The sensor, aligned with the longitudinal axis of the agent, could comprise two photodetectors, sensitive to non-overlapping wavelengths of (ultraviolet, visible, or infrared) light emitted by the predators and the prey, respectively. The sensor reading provides binary information for each type of entity (either a predator or the prey) in its range: 1 if the corresponding entity is detected, and 0 otherwise. Such sensors allow the predators to perceive only four discrete environmental states, as shown in Figure 1. The state <11> is the most challenging one, and it can be sensed under the following assumption: when two entities are in the line-of-sight, one does not obscure the other. The perceived environmental states do not provide the predators with any insight about the distance to the perceived entities, nor about their total number.
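As an illustration, such a line-of-sight reading could be modeled by testing whether an entity lies within range and within the small angular window that its body subtends around the sensor ray. The following Python sketch is ours, not from the paper; the entity radius and all names are assumptions (the sensor_offset parameter anticipates the asymmetric morphology introduced in Section 4):

```python
import math

SENSOR_RANGE = 200.0   # predator viewing distance, in world units (Table 1)
ENTITY_RADIUS = 4.0    # assumed body radius; the paper does not state this value

def perceives(px, py, heading, ex, ey, sensor_offset=0.0):
    """Return True if the entity at (ex, ey) falls on the (offset) line of sight.

    `heading` and `sensor_offset` are in radians.  A body of radius
    ENTITY_RADIUS subtends a small angular window around the ray, which is
    what makes a (nearly) zero-width beam able to detect anything at all.
    """
    dx, dy = ex - px, ey - py
    dist = math.hypot(dx, dy)
    if dist == 0 or dist > SENSOR_RANGE:
        return False
    bearing = math.atan2(dy, dx)
    ray = heading + sensor_offset
    # smallest signed angular difference between the ray and the bearing
    diff = (bearing - ray + math.pi) % (2 * math.pi) - math.pi
    return abs(diff) <= math.asin(min(1.0, ENTITY_RADIUS / dist))

def perceived_state(sees_predator, sees_prey):
    """Pack the two binary readings into one of the four states <00>..<11>."""
    return f"<{int(sees_predator)}{int(sees_prey)}>"
```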
The agents feature a rather simple, purely reactive architecture that defines the action (i.e., the instant velocities of the two wheels) of the agent as a direct mapping of the currently perceived environmental state. For simplicity, hereafter we will assume a mapping into the linear velocities of the wheels, expressed as a percentage—within the range [−100%…+100%]—of their respective maximum linear velocities. The negative values of velocities would result in a backward rotation of the corresponding wheel(s). The decision-making of the predator agents could be formally expressed by the following octet D:
D = {V00L, V00R, V01L, V01R, V10L, V10R, V11L, V11R}
where V00L, V00R, V01L, V01R, V10L, V10R, V11L, and V11R are the linear velocities (as a percentage of the maximum linear velocity) of the left and right wheels of the predators for the perceived environmental states <00>, <01>, <10>, and <11>, respectively.
Our objective of evolving (via GA) the optimal direct mapping of the four perceived environmental states into the respective velocities of the wheels can thus be re-phrased as evolving values of the velocities in the octet in Equation (1) that result in an efficient capturing behavior of the team of predator agents.
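The octet in Equation (1) amounts to a simple lookup table from perceived state to wheel velocities. The sketch below illustrates this in Python; the velocity values are arbitrary placeholders, not an evolved solution from the paper:

```python
# A decision octet D maps each of the four perceived states to a pair of wheel
# velocities, expressed as a percentage of the maximum linear velocity.
# The numbers below are placeholders only.
D = {
    "00": (-60, 100),   # nothing perceived: curve around to explore
    "01": (100, 100),   # prey perceived: move straight ahead
    "10": (40, 100),    # teammate perceived: veer away
    "11": (100, 80),    # both perceived: close in while turning
}

def wheel_velocities(sees_predator, sees_prey):
    """Purely reactive controller: state in, (v_left, v_right) out, no memory."""
    state = f"{int(sees_predator)}{int(sees_prey)}"
    return D[state]
```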

2.2. The Prey

The prey is equipped with an omnidirectional sensor with a limited range of visibility. To balance the advantage that the omnidirectional sensor gives the prey over the single line-of-sight sensor of the predators, the viewing distance of the prey is only 50 units, compared to the 200 units of the predators. The maximum speed of the prey, however, is identical to that of the predators. These conditions encourage the predator agents to evolve cooperative behaviors, as no single predator is able to capture the prey alone. From another perspective, an eventual successful solution to the PPPP, defined in such a way, would demonstrate the virtue of MAS: the team can solve a problem that a single (predator) agent could not.
In contrast to the evolved behaviors of the predator agents, we implemented a handcrafted behavior for the prey. The prey attempts to escape from the closest predator (if any) by running at its maximum speed in the direction that is exactly opposite to the direction of the predator. The prey remains still if it does not detect any predator. Table 1 shows the main features of the prey agent.
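A minimal Python sketch of this handcrafted escape behavior (function and parameter names are ours):

```python
import math

PREY_RANGE = 50.0   # prey viewing distance, in world units (Table 1)

def prey_velocity(prey_pos, predator_positions, max_speed=1.0):
    """Handcrafted prey: flee the closest visible predator at full speed,
    in the direction exactly opposite to it; stand still if none is in range."""
    visible = [p for p in predator_positions
               if math.dist(prey_pos, p) <= PREY_RANGE]
    if not visible:
        return (0.0, 0.0)       # no predator detected: remain still
    nearest = min(visible, key=lambda p: math.dist(prey_pos, p))
    dx, dy = prey_pos[0] - nearest[0], prey_pos[1] - nearest[1]
    d = math.hypot(dx, dy)
    if d == 0:                  # degenerate overlap: direction undefined
        return (0.0, 0.0)
    return (max_speed * dx / d, max_speed * dy / d)
```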

2.3. The World

We modelled the world as a two-dimensional infinite plane with a visualized section of 1600 × 1600 units. We update the perceptions, decision-making, and the resulting new state (e.g., location, orientation, and speed) of agents with a sampling interval of 100 ms. The duration of trials is 120 s, modelled in 1200 time steps. We approximate the new state of predators through the following two steps: First, from the current orientation, the yaw rate, and the duration of the sampling interval we calculate the new orientation (as an azimuth) of the agents. The yaw rate is obtained from the difference between the linear velocities of the left and right wheels, and the length of the axis between the wheels. Second, we calculate the new position (i.e., the two-dimensional Cartesian coordinates) as a projection (in time, equal to the duration of the sampling interval) of the vector of the linear velocity of the predator. The vector is aligned with the newly calculated orientation, and its magnitude is equal to the mean of the linear velocities of the two wheels.
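The two-step state update described above can be sketched as follows (Python; the axle length is an assumed value, as the paper does not specify it):

```python
import math

DT = 0.1      # sampling interval: 100 ms
AXLE = 10.0   # distance between the wheels, in world units (assumed value)

def step(x, y, heading, v_left, v_right):
    """One 100 ms state update, following the two steps in the text:
    (1) integrate the yaw rate into a new orientation,
    (2) translate along that orientation at the mean wheel speed."""
    yaw_rate = (v_right - v_left) / AXLE          # rad/s for a differential drive
    heading = (heading + yaw_rate * DT) % (2 * math.pi)
    speed = (v_left + v_right) / 2.0              # mean of the two wheel speeds
    x += speed * DT * math.cos(heading)
    y += speed * DT * math.sin(heading)
    return x, y, heading
```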

3. Evolving the Behavior of Predator Agents

MAS, as complex systems, feature a significant semantic gap between the hierarchically lower-level properties of the agents and the (emergent) higher-level properties of the system as a whole. Thus, we could not analytically infer the optimal velocity values of the wheels of the agents from the desired behavior of the team of predator agents. Therefore, we applied a GA, a nature-inspired heuristic approach that gradually evolves good values of the parameters, similar to the evolution of species in nature. GAs have proven efficient in finding the optimal solution(s) to combinatorial optimization problems featuring large search spaces [20,21,22]. Thus, consonant with the concept of evolutionary robotics [23], we adopted a GA to evolve values of the eight velocities of the wheels of the predators that result in an efficient behavior—presumably involving exploring the environment, surrounding, and capturing the prey—of the team of predators. The main attributes of our GA are elaborated below.

3.1. Evolutionary Setup

We decided to apply a heuristic, evolutionary approach to the “tuning” of the velocities of both wheels for each of the four perceived environmental situations because we are a priori unaware of the values of these velocities that would result in a successful behavior of the team of predator agents. As we briefly mentioned in Section 1, a MAS, as a complex system, features a significant semantic gap between the simple, hierarchically lower-level properties of the agents and the more elaborate, higher-level behavior of the whole system. Consequently, we would be unable to formally infer the values of the octet of velocities of the wheels of the agents from the desired behavior of the team of such agents. Similarly, as we will elaborate later, we shall consider an asymmetric morphology of the predator agents with a sensor that is angularly offset relative to the longitudinal axis of the agents. Naturally, we are unaware of the optimal value of the angular offset of the sensor that would result in a successful capturing behavior of the predator agents, and this optimal value could, in principle, be evolved via GA too. Moreover, the values of the velocities of both wheels and the value of the angular offset of the sensor would, most likely, be dependent on each other; consequently, a co-evolution of the morphology (the sensor offset) and the behavior (the velocities of the wheels) might be feasible too.
Alternatively, we could have adopted a deterministic approach, such as, for example, a complete enumeration of the possible combinations of the eight velocities of the wheels and the sensor offset. If each of these 8 velocities is discretized into, for example, 40 possible integer values ranging from −100% to +100%, then the size of the resulting search space would be equal to 40⁸, or about 6.55 × 10¹². This would render an eventual “brute force” approach, based on a complete enumeration of the possible combinations of values of velocities, computationally intractable.
As an alternative to the brute force search, we could apply reinforcement learning (RL) in order to define a good mapping of the four perceived environmental states into the four pairs of velocities of the wheels. However, MAS are complex, non-linear systems, and there is a significant gap between the properties of the entities and the (emergent) properties of the system as a whole. RL would obtain a “reward” from the system as a whole (i.e., the efficiency of the team of predators in capturing the prey), and would try to modify the properties (the four pairs of velocities of the wheels) of the entities. Due to the complexity and non-linearity of MAS, defining which pairs of velocities should be modified, and how, is not a straightforward task. This challenge is also related to the intra-agent credit (or blame) assignment problem, as we could not tell which part(s) of the agents is responsible (and therefore should be modified) for the unsatisfactory overall behavior of the team of predators. Conversely, the GA (and evolutionary computing in general) solves this challenge in an elegant way by obtaining the fitness value as a measure of the performance of the system as a whole (i.e., the efficiency of the team of predators in capturing the prey) and modifying the properties of the entities (the pairs of velocities of the wheels of the predators) of the selected best-performing teams of predators via the genetic operations of crossover and mutation.
Yet another challenge in RL is the delayed reward problem. The success (if any) of the system (the team of predators) would occur several hundred time steps into the trial. However, this success might be determined by the earlier behavioral phases of the team of predators, such as the dispersing (exploration of the environment) at the very beginning of the trial. Regarding the delayed reward problem, evolution, as a holistic approach, does not pay specific attention to the timing of particular (emergent) behaviors of the agents, but rather rewards the overall (final) outcome of the trial.
In our work we apply GA, a nature-inspired heuristic approach that gradually evolves the values of a set of parameters (the values of the eight velocities of the wheels) in a way similar to the evolution of species in nature. The main algorithmic steps of our GA are shown in Algorithm 1, and its main attributes—genetic representation, genetic operations, and fitness function—are elaborated below.
Algorithm 1. Main steps of genetic algorithms (GA).
 Step 1: Creating the initial population of random chromosomes;
 Step 2: Evaluating the population;
 Step 3: WHILE not (Termination Criteria) DO Steps 4–7:
    Step 4: Selecting the mating pool of the next generation;
    Step 5: Crossing over random pairs of chromosomes of the mating pool;
    Step 6: Mutating the newly created offspring;
    Step 7: Evaluating the population.
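A minimal Python rendering of Algorithm 1 follows. The operators here are simplified placeholders (plain tournament selection, one-point crossover, per-allele mutation, and no elitism); the actual operators used in the paper are detailed in Sections 3.2 and 3.3:

```python
import random

def run_ga(pop_size, chrom_len, fitness, generations, p_mut=0.05):
    """Minimal GA following Algorithm 1.  Lower fitness is better
    (minimisation).  Alleles are 5%-spaced integers in [-100, 100]."""
    def random_allele():
        return random.randrange(-100, 105, 5)

    def random_chrom():
        return [random_allele() for _ in range(chrom_len)]

    pop = [random_chrom() for _ in range(pop_size)]          # Step 1
    scores = [fitness(c) for c in pop]                       # Step 2
    for _ in range(generations):                             # Step 3
        def tournament():                                    # Step 4
            a, b = random.randrange(pop_size), random.randrange(pop_size)
            return pop[a] if scores[a] < scores[b] else pop[b]
        nxt = []
        while len(nxt) < pop_size:                           # Step 5
            p1, p2 = tournament(), tournament()
            cut = random.randrange(1, chrom_len)
            nxt.append(p1[:cut] + p2[cut:])
        for c in nxt:                                        # Step 6
            for i in range(chrom_len):
                if random.random() < p_mut:
                    c[i] = random_allele()
        pop = nxt
        scores = [fitness(c) for c in pop]                   # Step 7
    best = min(range(pop_size), key=lambda i: scores[i])
    return pop[best], scores[best]
```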

3.2. Genetic Representation

We genetically represent both (i) the decision-making behavior of the predator agents and (ii) their sensory morphology in a single “chromosome”. The latter consists of an array of eight integer values of the evolved velocities of the wheels of the agents and an additional allele encoding the angular offset of their sensor. The values of the velocities are constrained within the range [−100%…+100%] and are discretized into 40 possible values, with an interval of 5% between them. This number of discrete values (and the interval between them, respectively) provides a good trade-off between the precision of “tuning” (i.e., the expressiveness of the genetic representation) and the size of the search space of the GA. The population size is 400 chromosomes. The breeding strategy is homogeneous in that the performance of a single chromosome, cloned to all predators, is evaluated.
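A sketch of this representation (Python; the field order and the set of candidate offsets are our assumptions):

```python
import random

STATES = ("00", "01", "10", "11")
OFFSETS = (0, 10, 20, 30, 40)   # candidate sensor offsets (degrees) explored in the paper

def random_chromosome():
    """Eight wheel velocities (5% steps in [-100, 100]) plus one offset allele."""
    velocities = [random.randrange(-100, 105, 5) for _ in range(8)]
    return velocities + [random.choice(OFFSETS)]

def decode(chromosome):
    """Split a chromosome into the decision octet D and the sensor offset."""
    octet = {s: (chromosome[2 * i], chromosome[2 * i + 1])
             for i, s in enumerate(STATES)}
    return octet, chromosome[8]
```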

3.3. Genetic Operations

Binary tournament is used as the selection strategy in the evolutionary framework. It is computationally efficient, and has proven to provide a good trade-off between the diversity of the population and the rate of convergence of the fitness [21]. In addition to tournament selection, we also adopted elitism, in that the four best-performing chromosomes survive unconditionally and are inserted into the mating pool of the next generation. In addition, we implemented, with equal probability, both one-point and two-point crossover. The two-point crossover results in an exchange of the values of both velocities (of the left and right wheels) associated with a given environmental state. This reflects our assumption that the velocities of both wheels determine the moving behavior of the agents (for a given environmental state), and therefore should be treated as a whole, as an evolutionary building block. Two-point crossovers have no destructive effect on such building blocks. The one-point crossover is applied to develop such building blocks (exploration of the search space), while the two-point crossover is intended to preserve them (exploitation).
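The two crossover variants can be sketched as follows (a minimal Python illustration of how the two-point variant, cutting on pair boundaries, preserves the (vL, vR) building blocks):

```python
import random

def one_point(p1, p2):
    """One-point crossover: may cut inside a (vL, vR) pair, so it can create
    new building blocks (exploration of the search space)."""
    cut = random.randrange(1, len(p1))
    return p1[:cut] + p2[cut:]

def two_point(p1, p2):
    """Two-point crossover aligned on pair boundaries: swaps the whole
    (vL, vR) pair of one environmental state, preserving building blocks."""
    i = random.randrange(4) * 2      # start index of one of the four pairs
    child = list(p1)
    child[i:i + 2] = p2[i:i + 2]
    return child
```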

3.4. Fitness Evaluation

Our aim is to evolve behaviors of the team of predators that are general across multiple initial situations, rather than a behavior specialized to one particular situation. To facilitate such an evolution, we evaluated each of the evolving chromosomes in 10 different initial situations. In each of these situations, the prey is located in the center of the world. The predators are scattered in a small cloud situated south of the prey. Figure 2 shows a sample initial situation. The overall fitness is the sum of the fitness values scored in each of the 10 initial situations. For a successful situation (i.e., the predators manage to capture the prey during the 120 s trial), the fitness is equal to the time needed to capture the prey. If the initial situation is unsuccessful, the fitness is calculated as the sum of (i) the closest distance—registered during the entire trial—between the prey and any predator and (ii) a penalty of 10,000. The former component is intended to provide the evolution with a cue about the comparative quality of the different unsuccessful behaviors. We verified empirically that this heuristic quantifies the “near-misses” well, and correlates with the chances of the predators—pending small evolutionary tweaks in their genome—to successfully capture the prey in the future. The latter component is intended to heavily penalize the lack of success of the predators in any given initial situation.
Our PPPP is an instance of a minimization problem, as a lower objective function (OF) value corresponds to a better-performing team of predator agents. The evolution terminates at OF values lower than 600, which implies a successful capture of the prey in all 10 initial situations in an average time shorter than 60 s (i.e., half of the trial duration). The main parameters of the GA are elaborated in Table 2.
Initially, the predators need to learn how to move and locate the prey; therefore, in the first of the ten initial situations the prey is placed very close to the group of predator agents. In each of the nine successive initial situations, the distance between the prey and the group of agents increases. The distance is calculated as the situation number × 2 plus a random value of up to 50 units. Thus, the distance in the first initial situation is 2 plus a random value of up to 50 units, while in the tenth initial situation it is 20 plus a random value of up to 50 units.
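The fitness computation described above can be summarized as follows (Python sketch; lower values are better, and the function names are ours):

```python
PENALTY = 10_000    # penalty for an unsuccessful initial situation
TRIAL_STEPS = 1200  # 120 s trial at 100 ms per time step

def situation_fitness(capture_step, closest_distance):
    """Fitness for one initial situation: capture time in seconds if
    successful (capture_step is the time step of the capture), otherwise
    the closest approach during the trial plus the penalty."""
    if capture_step is not None:
        return capture_step * 0.1        # time to capture, in seconds
    return closest_distance + PENALTY

def overall_fitness(results):
    """Sum over the 10 initial situations; a value < 600 terminates the GA."""
    return sum(situation_fitness(step, dist) for step, dist in results)
```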

4. Experimental Results

4.1. Canonical Predators

As we elaborated in our previous work [3], we implemented 32 independent runs of the GA in an attempt to evolve a suitable mapping of the perceived environmental states into corresponding velocities of the wheels of predators with the canonical morphology. The sensor of these agents is aligned with their longitudinal axis. As shown in Figure 3, the evolution was unable to find a solution to the PPPP in all of the 10 tested initial situations.
Moreover, the most successful team of predators managed to solve just 6 (of 10) initial situations. The average of about 4 solved initial situations across the whole set of 32 runs is an indication that the problem was hard to tackle with the current (canonical) morphology of the predator agents.

4.2. Enhancing the Morphology of Predators

To improve the generality of the evolved predator behaviors, we focus on modifying their morphological features. The last of the features listed in Table 1—the orientation of the sensors—implies a straightforward implementation of the agents. This, indeed, is the common configuration of the previously studied simple agents [5,6,7,8,9]. We are interested in whether an a priori fixed asymmetry—an angular offset—would facilitate the evolution of more general behaviors of the team of predators. We speculate that a sensory offset would allow the predators to realize an anticipatory (equiangular, proportional) pursuit of the prey, aiming at the anticipated point of contact with the moving prey, rather than the currently perceived position of the prey. Notice that the proposed asymmetric morphology does not compromise the intended simplicity of the predator agents.
In our experimental setup, we fixed the offset of all predators to 10°, 20°, 30°, and 40° counterclockwise and conducted 32 evolutionary runs of the GA for each of these four configurations. The results are shown in Figure 4a–d, respectively, and summarized in Table 3. As Figure 4a and Table 3 illustrate, offsetting the sensors by only 10° significantly improves the generality of the evolved predator behaviors: they resolve all 10 situations in 30 (93.75%) of the 32 evolutionary runs. The probability of success (the statistical estimate of the efficiency of evolution, defined for the PPPP as the probability of resolving all 10 initial situations) reaches 90% by generation #60 (Table 3). The terminal value of the OF in the worst evolutionary run is 10,987, corresponding to only one unresolved initial situation. Figure 4 shows the convergence of the best objective function values (top) and the number of successful situations (bottom) over the 32 runs of the GA evolving predators with a sensor offset of (a) 10°, (b) 20°, (c) 30°, and (d) 40°, respectively. The bold curves correspond to the mean, while the envelope illustrates the minimum and maximum values in each generation.
More efficient evolution, and behaviors that are more general, were obtained for the sensory offsets of 20° and 30°. As Figure 4b and Table 3 depict for 20°, the predators successfully resolved all 10 initial situations in all 32 evolutionary runs. The probability of success reaches 90% relatively quickly—by generation #9 (Table 3). Both the efficiency of evolution and the generality of the predator behaviors are similar for agents with sensory offsets of 20° and 30°, while these two characteristics deteriorate with the further increase of the offset to 40° (Figure 4c,d, and Table 3).

4.3. Generality of the Evolved Behavior

Having seen the success that the sensory offset brought to solving the initial 10 test situations, we investigated the generality of the best-evolved teams of predators on an extended set of 100 initial situations, including the same 10 initial situations used in the evolution and 90 additional situations unforeseen during the evolution. For the 10 initial situations (#1 to #10) used during the evolution, the predators are dispersed south of the prey (as illustrated in Figure 2) such that the average distance between the prey and the group of predators slightly increases with each situation. In each of the additional situations (#11 to #100), the average distance between the predators and the prey is the same as in the situation with the most distant predators used during the evolution (situation #10); however, both the positions and the orientations of the predators are random. We would like to note that the alternative approach of exploring the generality of the agents by evolving them directly on 100 initial situations would incur a much greater computational overhead, as the total number of fitness trials would be almost 10-fold higher.
We conducted the experiments on the set of these 100 initial situations with the 32 evolved best-of-run teams of predators featuring a sensor offset of 20°. Our choice of this particular morphological configuration is based on its superiority in terms of both (i) the quality (fitness value) of the evolved best-of-run teams of agents and (ii) the consistency (probability of success) in evolving these teams, as illustrated in Table 3. Figure 5 shows the number of initial situations successfully solved by each of the 32 evolved best-of-run teams of predators. The results shown in Figure 5 demonstrate that the sensory offset of 20° results in behaviors of the predator agents that are quite general (rather than over-fitted to particular initial situations). Indeed, only three solutions (obtained from evolutionary runs #11, #22, and #25) resolved fewer than 60 of all 100 initial situations, while the best solution (obtained from evolutionary run #17) resolves 97 initial situations. The evolved mapping of the velocities of the wheels of the most general solution #17 is shown in Table 4.

4.4. Robustness to Sensory Noise

After verifying the generality of the 32 evolved best-of-run behaviors of the team of predator agents, we decided to investigate how each of these behaviors degrades when random noise is introduced into the perception readings. Revisiting the same set of 100 initial situations (including the same 10 initial situations used in the evolution, and 90 additional unforeseen situations), we introduced two types of noise: false positive (FP) and false negative (FN). Under the influence of FP noise, the value of either of the two bits of sensory information (chosen randomly, with a probability of 50%, individually for every predator agent at each time step) is read as 1, regardless of the actual reading of the sensor. Conversely, in the presence of FN noise, the reading is registered as 0, even if the corresponding entity (a predator or the prey) is in the line-of-sight. We conducted experiments for levels of either FP or FN noise starting from 2% and doubling up to 16% (i.e., 2%, 4%, 8%, and 16%).
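The two noise models can be sketched as follows (Python; combining FP and FN noise in one hypothetical function for brevity, whereas the experiments apply only one type at a time):

```python
import random

def noisy_reading(sees_predator, sees_prey, p_fp=0.0, p_fn=0.0):
    """Corrupt one randomly chosen bit of the two-bit sensory reading.

    The affected bit is picked with 50% probability each, per agent and per
    time step.  With probability p_fp it is forced to 1 (false positive);
    otherwise, with probability p_fn, it is forced to 0 (false negative).
    """
    bits = [sees_predator, sees_prey]
    i = random.randrange(2)          # which of the two bits is affected
    if random.random() < p_fp:       # false positive: force the bit to 1
        bits[i] = True
    elif random.random() < p_fn:     # false negative: force the bit to 0
        bits[i] = False
    return tuple(bits)
```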
The experimental results, as shown in Figure 6, are somewhat unexpected. We anticipated that the number of solved initial situations would decrease monotonically with the increase of the level of FP noise. However, most of the 32 best-of-run behaviors are quite robust to FP noise, in that the number of solved initial situations varies only slightly compared to the number of solved situations in a noiseless environment. Moreover, the number of solved situations often anomalously increases with the increase of noise levels. Notable behaviors that exhibit a slight increase in the number of solved situations with the increase of noise are, for example, those obtained from evolutionary runs #7 and #15. For behavior #15, the number of successful situations for 4% noise is significantly higher than that obtained in a noiseless environment (91 vs. 83, a 10% increase).
As illustrated in Figure 7, the detrimental effect of FN noise is as we expected: for most of the 32 best-of-run behaviors, the number of solved initial situations steadily decreases with the increase of noise levels. Even so, some behaviors exhibit an anomalous increase in the number of solved situations with the increase of noise levels. The notable behaviors are #7, #16, #19 and #28, each of which shows an increase in the number of solved initial situations at 8%, 2%, 16% and 4% noise, respectively, as seen from Figure 7.
These results demonstrate that both the type and the magnitude of perception noise influence the robustness of the evolved behaviors of predator agents. The effect of FP noise on the behavior of the team of predator agents is sometimes detrimental, sometimes favorable, and often insignificant. On the other hand, FN noise, with few exceptions (e.g., behaviors #7, #17, #19, #23 and #28), is detrimental to the performance of the agents.
It is interesting to note that the evolved behavior #17 (shown in Table 4) is rather versatile in that it (i) exhibits good performance on the 10 initial situations used during the evolution, (ii) generalizes to most unforeseen initial situations (solving 97 of 100), and (iii) is robust to both FP and FN noise. Furthermore, behaviors such as #7, #23 and #28 exhibit robustness to both types of noise and even increased performance at certain noise levels.

5. Discussion

5.1. Advantages of the Proposed Asymmetric Morphology

The experimental results indicate that the introduction of an angular offset of the sensor of the agents improves the effectiveness of the team by enabling the emergence of a general and robust (to environmental noise) capturing behavior. It also increases the efficiency of the evolution of successful capturing behavior. Our previous work [3], as well as the results provided above, suggests that the behavior evolved with a sensor offset of 20° is the most robust to noise and shows the best generality when subjected to additional initial situations, unforeseen during the evolution, in a noiseless environment.
The beneficial effect stemming from the addition of a sensory offset is the ability to accomplish reliable tracking of the prey by predicting its position in case it disappears from sight. Due to the counterclockwise sensory offset, and the parallax induced by the movement of the predator, most of the time a prey that has disappeared would be located to the left of the predator. The GA manages to discover this regularity and to take advantage of it by setting the velocities of the wheels in such a way that the predator turns to the left (V00 in Table 4), i.e., towards the most probable position of the disappeared prey.
Therefore, one of the virtues of the sensor offset is that it results in a more deterministic direction of the disappearance of the prey (almost certainly to the left), which, in turn, facilitates a faster rediscovery and, consequently, a more reliable tracking of the prey by the predator. Moreover, as shown in Figure 8, the chase by a predator featuring an asymmetric morphology results in a characteristic circular trajectory of both the predator and the prey. This behavior is more efficient than that of predators with the canonical straightforward morphology, as in the latter case the predator could not make any reliable prediction about the most likely direction of the disappearance of the chased prey. Under the challenging but realistic assumption that the prey is not initially surrounded by the predators (as illustrated in Figure 2), such emergent circular trajectories facilitate the surrounding, as the prey is shepherded (driven) by a single predator (driver) towards the pack of the remaining predators. Significantly reducing the sensor offset from 20° would stretch the chasing trajectory of the predator. Moreover, in such a case, as the chased prey comes closer to the longitudinal axis of the predator, the predator would need to turn slightly to the right during the chase (instead of going straight, as shown in Figure 8 for environmental state <01>) by reducing the speed of the right wheel, in order not to compromise the certainty that the prey has disappeared to the left. This, in turn, would reduce the overall chasing speed of the predator. These two factors (a stretched chasing trajectory and a reduced chasing speed) would increase the time needed to drive the prey towards the dispersed predators. Conversely, increasing the sensor offset would result in a more compact chasing trajectory that might not stretch enough to reach back to the pack of the remaining predators.
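The leftward turn when the prey is lost from sight follows directly from standard differential-drive kinematics. The sketch below uses the wheel axle track of 8 units and the maximum wheel speed of 10 units/s from Table 1; the function name is ours.

```python
def diff_drive_rates(v_left, v_right, axle_track=8.0):
    """Standard differential-drive kinematics: wheel speeds -> body motion.

    Returns (forward speed, turn rate); a positive turn rate means a
    counterclockwise (leftward) turn.
    """
    v = (v_left + v_right) / 2.0               # forward speed of the body
    omega = (v_right - v_left) / axle_track    # turn rate
    return v, omega

# V00 of behavior #17: left wheel at 25%, right at 100% of 10 units/s.
v, omega = diff_drive_rates(2.5, 10.0)   # omega > 0: the predator curves left,
                                         # towards the likely position of the prey
```

Any mapping with the left wheel slower than the right one yields the same counterclockwise turn; the evolved 25%/100% pair merely fixes the radius of the scanning circle.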

5.2. Emergent Behavioral Strategies

The current research, as well as our previous work on agents with asymmetric morphology [3], suggests that the solution most robust to noise and with the greatest success rate in the generality tests is the behavior obtained from evolutionary run #17. In this section, we use that specific behavior to review the behavioral strategies of the team of predator agents that emerge from the evolution of the velocity mappings by the GA framework.
The values of the evolved velocities of motors and the sensor offset of behavior #17 are shown in Table 4. The team of predator agents exhibits the following four behavioral strategies, executed in four consecutive phases of the trial: (i) circling around until they find a peer or the prey (controlled by velocities V00), (ii) exploring the environment by distancing themselves from each other (controlled by velocities V10), (iii) surrounding the prey by shepherding (driving) it (by some of the predators—drivers) in a circular trajectory (V01), and (iv) capturing the prey (V11). Figure 9 illustrates the different phases the agents go through in the process of catching the prey. A video of how the team of predators deals with all 10 initial situations can be found at http://isd-si.doshisha.ac.jp/m.georgiev/2018-12-08-SA20deg.mp4.
As shown in Figure 9a, in the beginning all agents have no vision of either the prey or any of the peers. Following the mapping of V00L = 25% and V00R = 100% (as shown in Table 4), they start circling around, scanning the environment in an attempt to find another entity. Detecting a peer activates the set of velocities V10L = −25% and V10R = −20%, which forces the predator to enter the second stage: moving away from the perceived agent, which facilitates better dispersion and coverage of a wider area. This enhances the ability of the predators to explore the environment and to discover the prey. The third stage, surrounding, begins when any of the predators discovers the prey. The mapping V01L = 100% and V01R = 100% results in moving forward at the highest speed, which helps keep the prey almost always in the same position relative to the agent, i.e., on its left side, as shown in Figure 8 and Figure 9b–e. Once the prey disappears from view, as shown in the center panel of Figure 8, the predator exhibits an embodied cognition that the disappearance is a result, in part, of its own forward motion; therefore, the new location of the prey is, due to the counterclockwise offset of the sensor, most likely to the left of its own orientation. Then the evolved V00L = 25% and V00R = 100% are activated (Figure 8, right), resulting in a circular motion to the left until the agent rediscovers the disappeared prey.
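The whole controller thus reduces to a lookup from the perceived 2-bit state to a pair of wheel velocities. Below is a minimal sketch using the evolved values of behavior #17 from Table 4 (expressed in % of the maximum wheel speed); the names `VELOCITY_MAP` and `control` are ours.

```python
# Direct mapping <peer bit, prey bit> -> (left, right) wheel velocities,
# in % of the maximum linear wheel speed (behavior #17, Table 4).
VELOCITY_MAP = {
    (0, 0): (25, 100),    # V00: nothing seen  -> circle left, scanning
    (1, 0): (-25, -20),   # V10: peer seen     -> back away, disperse
    (0, 1): (100, 100),   # V01: prey seen     -> full speed ahead, chase
    (1, 1): (100, 100),   # V11: peer and prey -> close in, capture
}

def control(peer_bit, prey_bit):
    """Memoryless, computation-free controller: perceived state -> velocities."""
    return VELOCITY_MAP[(peer_bit, prey_bit)]
```

No state is kept between time steps; the prediction of the disappeared prey's position is implicit in the V00 entry combined with the sensor offset.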
Moreover, as Figure 9b–d show, a single predator (driver), due to its sensor offset, shepherds (drives) the prey in a circular, counterclockwise trajectory into the (already dispersed) other predators. The fourth (final) behavioral phase concludes the chase by capturing the prey, which is surrounded from all sides by both the driver(s) and the newly encountered predators, as illustrated in Figure 9e,f. When approaching from opposite sides, the predators are able to see both the prey and a peer, which activates the mapping V11L = 100% and V11R = 100%. Since they have a slight angular offset, it is possible for as few as three predators to catch the prey, as illustrated in Figure 9e,f: one of the predators chases the prey from behind and guides it to its front left side, while another approaches it from the opposite direction.
At the same time, as shown in Figure 9d–f, two of the agents keep distancing themselves from the group of other predators. The agents seem to exhibit an emergent knowledge [24] that not all eight agents are needed to capture the prey. While such behavior sacrifices their own chances to capture the prey in this particular initial situation, in general it helps the team (as a whole) by expanding the initially explored field, and therefore the prey is discovered faster, especially when it is further away from the predators.
The manifestation of shepherding behavior in the third behavioral phase is probably the most significant difference between the evolved behavior of the canonical straightforward predator agents and that of the agents with asymmetric morphology. This behavior, being a vital part of the successful capturing, could not be observed in the behavior of the canonical predator agents. A video showing how the canonical agents struggle to find and capture the prey is available at http://isd-si.doshisha.ac.jp/m.georgiev/2018-12-03-SA-Straightforward.mp4.

5.3. Alternative Approaches

A different approach to finding a solution would be to implement a multi-agent system with several different types of predator agents, each with a specifically assigned role that contributes towards capturing the prey. In our previous work comparing the performance of heterogeneous and homogeneous MAS [25,26], we analyzed in-depth the different problems that heterogeneity introduces. The main reason we did not implement a heterogeneous MAS is that the efficiency of its evolution might be hindered by the inflated search space. Additionally, due to the a priori defined specialization of the agents in heterogeneous systems, we cannot ensure that the generality of a heterogeneous team (e.g., one consisting of two types of agents: drivers and capturers) would not be compromised, because we could not guarantee that the a priori defined drivers would be in the most favorable position relative to the prey in each of the initial situations. Indeed, in real-world situations, placing a particular predator (driver) in a particular position is challenging and not always possible. Instead, we decided to implement a priori unspecialized, versatile agents and to give them the ability to execute any role (depending on their perceived environment) that is needed to capture the prey. In our case, the agent that is closest to the prey assumes the role of a "driver". This behavioral heterogeneity emerges dynamically from the interaction between the homogeneous genotype (all agents share the same mapping of the rotation velocities of the wheels) and the environment.

6. Conclusions

We considered a society of very simple robots (modeled as agents in a MAS) that feature an "extreme simplicity" of both sensors and control. The adopted agents have a single line-of-sight sensor, two wheels in a differential drive configuration as effectors, and a controller that requires no memory and involves no computing, but rather a direct mapping of the currently perceived environmental state into a pair of velocities of the two wheels. We applied genetic algorithms to evolve a mapping that results in effective behavior of the team of predator agents towards the goal of capturing the prey in the predator-prey pursuit problem (PPPP). The preliminary experimental results indicated that the simple agents featuring the canonical straightforward sensory morphology could hardly evolve the ability to solve the PPPP.
To enhance the performance of the evolved system of predator agents, we propose an asymmetric morphology featuring an angular offset of the sensor, relative to the longitudinal axis of the agents. The experimental results demonstrated that this modification brings—without compromising the simplicity of agents—a considerable improvement of both (i) the efficiency of evolution and (ii) the effectiveness of the evolved capturing behavior of predator agents. Also, we verified that some of the evolved best-of-run behaviors of predators featuring a sensor offset of 20° are both (i) general in that they are able to successfully solve most of the additionally introduced, unforeseen initial situations, and (ii) robust to perception noise in that they show a limited degradation of the number of successfully resolved initial situations.
The results described in our work could be seen as a step towards the verification that complex behavior needed for solving challenging tasks could emerge from the coordination of very simple robots featuring an asymmetric sensory morphology. The advantages of such robots, in addition to the simple design, include better robustness, higher throughput of production and lower production costs, reduced energy consumption, and the potential to be implemented at very small (nano- and micro-) scales.
In our future work, we plan to investigate the anomalous increase of the number of successful situations with the increase of false positive noise. While similar phenomena are well known in engineering (e.g., stochastic resonance, dithering), to our knowledge there are no documented results on the beneficial effects of noise on the performance of MAS. We also plan to develop an even more realistic, three-dimensional model of the environment of the PPPP.

Author Contributions

Conceptualization, I.T. and T.R.; methodology, I.T. and T.R.; software, I.T.; investigation, M.G. and I.T.; validation, M.G., I.T., K.S. and T.R.; interpretation of results, M.G., I.T., K.S. and T.R.; resources, I.T.; data curation, M.G. and I.T.; visualization, M.G., I.T. and T.R.; writing—original draft preparation, M.G. and I.T.; writing—review and editing, K.S. and T.R.; project administration, I.T.

Funding

This research received no external funding.

Acknowledgments

This research is an extension to our previous work on “Evolving a Team of Asymmetric Predator Agents That Do Not Compute in Predator-Prey Pursuit Problem” presented at AIMSA-2018. We have included here the results of additional experiments on generality and robustness of the evolved behaviors of the simple predator agents with asymmetric morphology. Also, we have provided additional evidence about the benefits of the proposed asymmetric morphology of agents. Finally, we have addressed the questions, comments, and suggestions that we received from the attendees of AIMSA-2018. Our work was assisted by Doshisha University, which provided the facilities and equipment needed for testing and analyses. The corresponding author—Milen Georgiev, was partly supported by the Japanese Ministry of Education, Culture, Sports, Science and Technology (MEXT).

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

References

  1. Bhattacharya, S.; Murrieta-Cid, R.; Hutchinson, S. Optimal paths for landmark-based navigation by differential-drive vehicles with field-of-view constraints. IEEE Trans. Robot. 2007, 23, 47–59. [Google Scholar] [CrossRef]
  2. Yu, J.; LaValle, S.M.; Liberzon, D. Rendezvous without coordinates. IEEE Trans. Autom. Control. 2012, 57, 421–434. [Google Scholar]
  3. Tanev, I.; Georgiev, M.; Shimohara, K.; Ray, T. Evolving a Team of Asymmetric Predator Agents That Do Not Compute in Predator-Prey Pursuit Problem. In Proceedings of the 18th International Conference on Artificial Intelligence: Methodology, Systems, Applications, Varna, Bulgaria, 12–14 September 2018. [Google Scholar]
  4. Georgiev, M.; Tanev, I.; Shimohara, K. Coevolving behavior and morphology of simple agents that model small-scale robots. In Proceedings of the Genetic and Evolutionary Computation Conference Companion, Kyoto, Japan, 15–19 July 2018. [Google Scholar]
  5. Gauci, M. Swarm Robotic Systems with Minimal Information Processing. Ph.D. Thesis, University of Sheffield, Sheffield, UK, September 2014. [Google Scholar]
  6. Gauci, M.; Chen, J.; Li, W.; Dodd, T.J.; Groß, R. Self-organized aggregation without computation. Int. J. Robot. Res. 2014, 33, 1145–1161. [Google Scholar] [CrossRef]
  7. Gauci, M.; Chen, J.; Li, W.; Dodd, T.J.; Groß, R. Clustering objects with robots that do not compute. In Proceedings of the 2014 International Conference on Autonomous Agents and Multi-agent Systems, Paris, France, 5–9 May 2014. [Google Scholar]
  8. Özdemir, A.; Gauci, M.; Groß, R. Shepherding with Robots That Do Not Compute. In Proceedings of the 14th European Conference on Artificial Life, Lyon, France, 4–8 September 2017; pp. 332–339. [Google Scholar]
  9. Brown, D.S.; Turner, R.; Hennigh, O.; Loscalzo, S. Discovery and exploration of novel swarm behaviors given limited robot capabilities. In Proceedings of the 13th International Symposium on Distributed Autonomous Robotic Systems, London, UK, 7–9 November 2016; pp. 447–460. [Google Scholar]
  10. Benda, M.; Jagannathan, V.; Dodhiawala, R. An Optimal Cooperation of Knowledge Sources; Technical Report BCS-G2010-28; Boeing AI Center, Boeing Computer Services: Bellevue, WA, USA, 1986. [Google Scholar]
  11. Haynes, T.; Sen, S. Evolving behavioral strategies in predators and prey. In Proceedings of the 1995 International Joint Conference on AI, Montreal, QC, Canada, 20–25 August 1995. [Google Scholar]
  12. Luke, S.; Spector, L. Evolving Teamwork and Coordination with Genetic Programming. In Proceedings of the First Annual Conference on Genetic Programming, Stanford, CA, USA, 28–31 July 1996; pp. 150–156. [Google Scholar]
  13. Tanev, I.; Brzozowski, M.; Shimohara, K. Evolution, generality and robustness of emerged surrounding behavior in continuous predators-prey pursuit problem. Genet. Progr. Evolvable Mach. 2005, 6, 301–318. [Google Scholar] [CrossRef]
  14. Rubenstein, M.; Cabrera, A.; Werfel, J.; Habibi, G.; McLurkin, J.; Nagpal, R. Collective transport of complex objects by simple robots: Theory and experiments. In Proceedings of the 2013 International Conference on Autonomous Agents and Multiagent Systems, St. Paul, MN, USA, 6–10 May 2013; pp. 47–54. [Google Scholar]
  15. Lien, J.M.; Rodriguez, S.; Malric, J.P.; Amato, N.M. Shepherding Behaviors with Multiple Shepherds. In Proceedings of the 2005 IEEE International Conference on Robotics and Automation, Barcelona, Spain, 18–22 April 2005. [Google Scholar]
  16. Requicha, A.A. Nanorobots, NEMS, and Nanoassembly. Proc. IEEE 2003, 91, 1922–1933. [Google Scholar] [CrossRef]
  17. Niu, R.; Botin, D.; Weber, J.; Reinmüller, A.; Palberg, T. Assembly and Speed in Ion-Exchange-Based Modular Phoretic Microswimmers. Langmuir 2017, 33, 3450–3457. [Google Scholar] [CrossRef] [PubMed]
  18. Ibele, M.; Mallouk, T.E.; Sen, A. Schooling Behavior of Light-Powered Autonomous Micromotors in Water. Angewandte Chemie Int. Ed. 2009, 48, 3308–3312. [Google Scholar] [CrossRef] [PubMed]
  19. Martinez-Pedrero, F.; Massana-Cid, H.; Tierno, P. Assembly and Transport of Microscopic Cargos via Reconfigurable Photoactivated Magnetic Microdockers. Small 2017, 13, 1603449. [Google Scholar] [CrossRef] [PubMed]
  20. Holland, J. Adaptation in Natural and Artificial Systems; The University of Michigan Press: Ann Arbor, MI, USA, 1975. [Google Scholar]
  21. Goldberg, D.E. Genetic Algorithms in Search, Optimization and Machine Learning; Addison-Wesley Longman Publishing Co., Inc.: Boston, MA, USA, 1989. [Google Scholar]
  22. Mitchell, M. An Introduction to Genetic Algorithms; MIT Press: Cambridge, MA, USA, 1998. [Google Scholar]
  23. Nolfi, S.; Floreano, D. Evolutionary Robotics: The Biology, Intelligence, and Technology of Selforganizing Machines; MIT Press: Cambridge, MA, USA, 2000. [Google Scholar]
  24. Angeline, P.J.E.; Kinnear, K.E., Jr. Genetic Programming and Emergent Intelligence. In Advances in Genetic Programming; MIT Press: Cambridge, MA, USA, 1994. [Google Scholar]
  25. Georgiev, M.; Tanev, I.; Shimohara, K. Performance Analysis and Comparison on Heterogeneous and Homogeneous Multi-Agent Societies in Correlation to Their Average Capabilities. In Proceedings of the 2018 57th Annual Conference of the Society of Instrument and Control Engineers of Japan (SICE), Nara, Japan, 11–14 September 2018. [Google Scholar]
  26. Georgiev, M.; Tanev, I.; Shimohara, K. Exploration of the effect of uncertainty in homogeneous and heterogeneous multi-agent societies with regard to their average characteristics. In Proceedings of the Genetic and Evolutionary Computation Conference Companion GECCO (Companion), Kyoto, Japan, 15–19 July 2018. [Google Scholar]
Figure 1. The four possible environmental states perceived by (any) predator agent Ai.
Figure 2. Sample initial situation.
Figure 3. Convergence of the values of best objective function (OF, top) and the number of successful situations (bottom) of 32 independent runs of the GA evolving the behavior of a team of predator agents with canonical morphology. The bold curves correspond to the mean, while the envelope shows the minimum and maximum values in each generation.
Figure 4. Convergence of the fitness for 32 runs of the GA evolving predators with sensor offset of (a) 10°, (b) 20°, (c) 30°, and (d) 40°, respectively.
Figure 5. Generality of the 32 evolved best-of-run behaviors of the team of predator agents.
Figure 6. Number of successfully solved initial situations for various levels of false positive (FP) perception noise.
Figure 7. Number of successfully solved initial situations for various levels of FN perception noise.
Figure 8. Reliable tracking of the prey by a predator Ai.
Figure 9. Emergent behavioral strategies of a sample evolved team of predator agents with sensor offset of 20°. Environmental state perceived by predator: grey = <00>, red = <10>, blue = <01>, purple = <11>.
Table 1. Features of the predator and prey agents.
Feature                                     Predators                        Prey
Number of agents                            8                                1
Diameter (and wheel axle track), units      8                                8
Max linear velocity of wheels, units/s      10                               10
Max speed of agents, units/s                10                               10
Type of sensor                              Single line-of-sight             Omni-directional
Range of visibility of the sensor, units    200                              50
Orientation of sensor                       Parallel to longitudinal axis    NA
Table 2. Parameters of GA.
Parameter                        Value
Genotype                         Eight integer values of the velocities of wheels (V00L, V00R, V01L, V01R, V10L, V10R, V11L, and V11R)
Population size                  400 chromosomes
Breeding strategy                Homogeneous: the chromosome is cloned to all predator agents before the trial
Selection                        Binary tournament
Selection ratio                  10%
Elitism                          Best 1% (4 chromosomes)
Crossover                        Both single- and two-point
Mutation                         Single-point
Mutation ratio                   5%
Fitness cases                    10 initial situations
Duration of the fitness trial    120 s per initial situation
Fitness value                    Sum of the fitness values of each situation: (a) successful situation: time needed to capture the prey; (b) unsuccessful situation: 10,000 + the shortest distance between the prey and any predator during the trial
Termination criterion            (# Generations = 200) or (Stagnation of fitness for 32 consecutive generations) or (Fitness < 600)
Table 3. Efficiency of evolution of the team of predator agents.
Sensor    Terminal Value of Objective Function            Successful Runs          # Generations Needed to Reach
Offset    Best      Worst     Mean      Std. Dev.         Number    % (of 32)      90% Probability of Success
0°        40,928    70,729    61,064    8516              0         0              NA
10°       504       10,987    1310      2531              30        93.75          60
20°       468       818       588       57.2              32        100            9
30°       495       713       574       38.5              32        100            12
40°       475       40,903    1840      7128              31        96.875         15
Table 4. Velocity mappings of the most prominent behavior #17. The sensor offset is 20°.
V00L    V00R    V01L    V01R    V10L    V10R    V11L    V11R
25%     100%    100%    100%    −25%    −20%    100%    100%

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).