Ant Robotic Swarm for Visualizing Invisible Hazardous Substances

Abstract: Inspired by the simplicity with which nature solves its problems, this paper presents a novel approach that enables a swarm of ant robotic agents (robots with limited sensing, communication, computational and memory resources) to form a visual representation of distributed hazardous substances within an environment dominated by diffusion processes, using a decentralized approach. Such a visual representation could be very useful in enabling a quicker evacuation of a city's population affected by such hazardous substances. This is especially true if the ratio of emergency workers to the population is very small.


Introduction
Monitoring of hazardous substances in our environments is fast becoming a high priority for governments around the world due to global warming [1][2][3]. One of the projects that have been set up to tackle pollution is the European Union Shoal project, in which a shoal of robotic fish is deployed to find the source of pollution in sea ports so that the culprit responsible can be charged accordingly. This work focuses on the spread of pollution by diffusion in the hydrodynamics section of the European Union Shoal project. Monitoring of hazardous substances in environments other than the sea is also a necessity. This is especially true if the hazardous substance is not visible to humans. Such substances include nerve gas, carbon dioxide (which can be fatal in large quantities), Sarin gas and even nuclear radiation. Consider a scenario in which nerve gas has been released in a city due to a terrorist attack; the population of that city needs to be evacuated. However, as the people cannot see the pollutant, they do not know whether they are running towards it or away from it. The emergency services have been called in but are overwhelmed by the ratio of citizens to crew members. If, however, a swarm of flying micro robots were able to visualize the pollutant, the population could keep away from affected areas and in the process evacuate themselves from danger.
This can be done by ensuring that the motion of each individual agent in the swarm of robots is controlled so that eventually, the distribution of the swarm converges in a way that areas containing a high amount of pollutants are more densely covered by swarm individuals than areas that have little or no pollutant. This approach leads to efficient use of mobile sensors especially when their numbers are limited.
Furthermore, the robots could be used as measurement stations to collect information about the pollutant, from which a map of the pollution could be built at a ground station. This information could then be used to predict the path and behaviour of the invisible pollutant. By observing the behaviour of the pollutant on the map, it is possible to infer what type of pollutant it is. In order for the agents to provide such visual information, they need to be equipped with a coverage control algorithm that enables them to perform optimal coverage.
There has been a lot of work done in the area of coverage using multiple agents, such as the use of Voronoi partitions [4][5][6][7][8]. The Voronoi partition method requires a polygonal environment and a large computational time, due to the computation of the Voronoi partition in addition to the computation of the global cost function required to drive the mobile sensors to their cost-minimal positions. Nevertheless, Schwager et al. were able to use the Voronoi partition method and a radial basis function machine learning technique on a group of robots to achieve a visual representation in [6,9]. Shucker et al. [10,11] used virtual springs to form the distribution of a pollutant by tracking points in it. Their work requires long-distance communication, which might not be possible given the hardware limitations of simple agents.
The use of deterministic annealing as in [12] also imposes a large communication burden. Artificial Physics and Computational Fluid Dynamics (CFD) were used by Zarzhitsky et al. in [13] to control a group of robots to find the source of a pollutant in the environment. Artificial Physics was used to make the robots form a lattice, while CFD was used to navigate up pressure fluxes in the environment towards the source. This approach works on the assumption that the pollutant would most likely be in an area with a negative flux or sink.
Furthermore, work has also been carried out regarding coverage using ant robots. Ant robotics is inspired by simple creatures such as ants, whereby individuals following simple rules are able to build complex structures and perform complex feats of engineering. Robots inspired by these creatures often have limited memory, computing, sensing and communication resources [14,15]. These properties make them very cheap and hence disposable in hazardous situations, where they can be thrown away after being used to monitor harmful substances. Additionally, most work in ant robotics tends to focus on simple algorithms for tasks such as lawn mowing, mine sweeping and vacuum cleaning, and exploits the redundancy of multiple agents to achieve graceful degradation of the system [15].
In [14], terrain coverage by ant robots was investigated. The authors took inspiration from natural ants by using markings left by individual robots to communicate indirectly with other robots. The markings were also used to determine an ant's next action, since an ant can sense what is around it. Their ant robots did not need a map of the terrain, localization, path planning or memory. Terrain coverage was achieved and compared across a variety of algorithms such as Learning Real-Time A* (LRTA*), Node Counting, Thrun's Value-Update Rule and Wagner's Value-Update Rule.
In the well-known "cooperative cleaners" task in [16], a cleaning protocol called CLEAN was developed to clean up contaminated tiles in the environment. In [17], a dynamic cleaning algorithm called SWEEP was developed to clean up a dynamically spreading target such as fire. However, this work and that in [18] rely on critical points not being cleaned so that the task can be completed. Furthermore, [18] presented a distributed search for evasive targets using ant robots, while [19] analysed how many pursuers are needed and how they should move. Similarly, [18] analysed the amount of time it would take for a swarm of robots to clean both a static and an expanding region of pollutants, while in [15] the authors showed that even if robots with limited capabilities were moved, a swarm of them is still guaranteed to cover an area of interest. In most of the work above, algorithms were either developed or applied after understanding the nature of the problem.
This work uses mathematical models of the bacterium, flocking behaviour as seen in starlings and a velocity function to solve the problem of providing a visual representation of an invisible pollutant.
The bacterium behaviour was used to explore the environment, find pollutants and subsequently navigate up the gradient of the pollutant; the flocking behaviour was used to keep agents from colliding with each other and for collective foraging; while the velocity function was used to simulate a dwelling behaviour, in which agents tend to dwell in an area that has a rich source of food rather than in areas that lack food.
This work is divided into two parts. The first part uses a swarm of robots to provide a visual representation of a static pollutant and follows in the path of ant robots. Using the criteria in [15], the robots have a finite bounded memory k = 4 and a communication radius of r = 50 in a grid of 2000 by 2000. Additionally, the agents can only sense the pollutant in their immediate locality (x, y) but cannot sense what is beyond this at, for example, (x ± ∆x, y ± ∆y). Using this information, each individual agent moves according to a set of equations that model a bacterium's behaviour when searching for food. The bacterium only needs current local information about the area in which it presently is, in addition to a history of k = 4 past local readings, in order to determine its next move. This is unlike previous work in ant robotics, most of which assumed that the ant knows the conditions in the tiles surrounding it, with those conditions determining its next action.
Also, in the present work, cleaning is not our major objective; the effect of cleaning on the proposed algorithm will be studied in future work. Nevertheless, this is the first time that ant robots have been used in the task of visualizing a pollutant.
The second part investigates the possibility of enhancing the first part by using a Genetic Algorithm machine learning technique. In this case, the ant robots are equipped with knowledge of the pollutant profile to be mapped. We do this in accordance with [20], in which Arkin discussed how simple reactive models of robot control could be made more effective through the introduction of knowledge about the environment. In this work, that was done by updating the parameters of the proposed ant coverage controller through a modified, computationally efficient, Gaussian model based Genetic Algorithm. The Genetic Algorithm was used to estimate the distribution of the pollutant in the environment, and this information was used as a cost function to distribute the agents optimally in the environment. However, it could be argued that introducing this additional capability makes them less of ant robots due to the computationally expensive nature of Genetic Algorithms.
Furthermore, it should be noted that the method presented in this work does not require the geometry of the environment to be known a priori as in the Voronoi partition method [21]. Also, the approach uses computationally efficient ant algorithms to achieve the same level of coverage as was investigated in [22].
The rest of this paper is arranged as follows. In Section 2, the technical approach is described, including the problem description and coverage controller. Section 3 briefly discusses how a Genetic Algorithm was used to estimate the profile of a pollutant in the environment, while Section 4 presents simulation results. Finally, a brief conclusion and future work are presented in Section 5.

Technical Approach
The problem of distributing sensors into an environment under investigation can be seen as a coverage problem in which a pollution profile F(S(x)) present in an area S(x) is to be covered by n robotic agents so that the distribution of the robotic agents A_T(x) is close to that of the pollution profile F(S(x)), where x is the set of coordinates (x, y) in S(x). The robotic agents used in this work are assumed to have omnidirectional kinematics.
However, the proposed coverage controller can also be used on other types of robotic platforms with different dynamics by adding a low level controller to convert the control outputs into ones suitable for the platform being used.
As mentioned previously, the coverage controller is made up of a bacteria controller that gives the agents the ability to navigate up a pollution gradient, or a pressure flux gradient if required. This controller gives a source-directed force to the flock of agents. A flocking controller is used to make the agents move as a group towards the source of the pollutant and to pull each other out of local minima if they occur. Additionally, the flocking controller enables the flock of agents to localise at the source faster than if they were moving individually. The velocity controller is used to tune the distribution of the agents in the environment, as will be discussed in Section 2.3. Each controller will now be discussed separately.

Bacteria Controller
There have been many investigations into the development of pollution source seeking controllers. For example, Baronov and Baillieul [23] used a radial function as a potential function to develop an ascending controller. Their algorithm is not fully dependent on the gradient, but the function to be mapped or traversed must be known a priori for tuning the controller. Mayhew et al. [24] used a hybrid controller that combines a line minimization-based algorithm and a vehicle path planning algorithm to find the extremum of a function. Their approach needs neither GPS point measurements nor any a priori knowledge of the function to be mapped. Dhariwal [25] used the Keller and Segel bacteria model as a basis for developing a controller, whilst Marques et al. [26] investigated a rule-based approach to using bacteria chemotaxis behaviour on a robot. In their experiments, they also investigated the silkworm moth algorithm and a direct gradient following method. Lilienthal et al. [27] used a Braitenberg vehicle approach to find an ethanol source in their own experiments.
However, this work uses a controller based on the Berg and Brown model in [28]. Models such as the one in [29], and other modern complex models, were not considered because a random biased walk model is sufficient for environmental exploration and source declaration in an environment. A bacterium's motion is composed of a combination of tumble and run phases. The frequency of these phases depends on the measured concentration gradient in the surrounding environment. The run phase is generally a straight line, while the tumble phase is a random change in direction with a mean of about 68 degrees in the E. coli bacterium. If the bacterium is moving up a favourable gradient, it tumbles less, thereby increasing the length of the run phase, and vice versa if going down an unfavourable gradient. This behaviour was modelled by Berg and Brown by fitting the results of their experimental observations in [30] with a best fit equation in [28]. This model is shown below:

τ = τ₀ exp(α (dP_b/dt)_w)   (1)

(dP_b/dt)_w = (1/τ_c) ∫_(−∞)^t e^((t′−t)/τ_c) (dP_b(t′)/dt′) dt′   (2)

P_b = C/(C + K_D)   (3)

where τ is the mean run time, τ₀ is the mean run time in the absence of concentration gradients, α is a constant of the system based on the chemotaxis sensitivity factor of the bacteria, and P_b is the fraction of the receptor bound at concentration C. In our work, C was the present reading taken by our robotic agent. K_D is the dissociation constant of the bacterial chemoreceptor, dP_b/dt is the rate of change of P_b, (dP_b/dt)_w is the weighted rate of change of P_b, and τ_c is the time constant of the bacterial system.
The above equations determine the time between tumbles and hence the length of the runs between tumbles. During the tumble phase, the robot agent randomly chooses a new heading from the range {0, …, 360} degrees. This makes it possible for the robot agents to backtrack if there is a favourable gradient behind them. The ability of the implemented bacteria chemotaxis controller to work over long distances has been presented in [31,32], and it works well even in the presence of small disturbances.
The above discussion about the bacterium behaviour can be depicted graphically, as shown in Figure 1.
A counter that increments at every time tick is implemented. Once the counter is greater than the mean run time, a tumble phase is initiated; otherwise, the bacterium stays in the run phase. Once a tumble phase is completed through the assignment of a new heading, the run phase is resumed. A velocity controller, which will be discussed in Section 2.3, was embedded into the bacterium controller.
It could be argued that the bacterium behaviour is a gradient based algorithm and, as such, an ordinary gradient based algorithm could be used in its place. However, Equations (1) to (3) have more to them than meets the eye. These equations take noise in the bacterium's sensors, dynamics and environment into consideration. Equation (2) is an exponentially weighted low pass filter that is capable of filtering out noise. As a result of the filter, the Berg and Brown derived bacterium controller is not easily affected by noise and hence is not readily caught in local maxima when they are encountered [33]. This is further aided by the random nature of the algorithm resulting from the tumble phase of the controller. It could be argued as well that the random nature of the algorithm is not beneficial to bacteria. However, as was proven in [34], it is this very random nature that enables them to form a distribution corresponding to the profile of the pollutant in the environment. This is further supported by the works of Mach and Schweitzer and of Hamann and Heinz [35,36], in which they discussed how a swarm of stochastic agents having a random component (the bacterium tumble phase in this case) and a direct component (the bacterium run phase in this case) can be represented by a Fokker-Planck equation. The Fokker-Planck equation always converges to its stationary distribution, which corresponds to the spatial function of interest.
It is the above properties that make the Berg and Brown derived bacterium controller different from an ordinary gradient ascent algorithm, and make it suited for use on the noisy functions used in the simulations in this work. In this work, the filter represented by Equation (2) was implemented in discrete form, using the ant's limited memory of length k = 4 to remember a history of past readings.
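The run-and-tumble logic described above can be sketched in code. The following is a minimal illustration, not the authors' implementation: the symbols follow the Berg and Brown model as described in the text, the discrete filter is one plausible realisation of the exponentially weighted low pass filter over the k = 4 stored readings, and all parameter values are assumptions.

```python
import math
import random

class BacteriumController:
    """Sketch of a Berg and Brown style run-and-tumble controller.
    tau0 = mean run time with no gradient, alpha = chemotaxis
    sensitivity constant, K_D = receptor dissociation constant.
    All parameter values here are illustrative assumptions."""

    def __init__(self, tau0=1.0, alpha=2.0, K_D=50.0, k=4):
        self.tau0, self.alpha, self.K_D = tau0, alpha, K_D
        self.k = k                 # bounded ant memory: k past readings
        self.history = []
        self.heading = 0.0
        self.counter = 0.0

    def receptor_bound(self, C):
        # Fraction of receptor bound at concentration C (Equation (3) form).
        return C / (C + self.K_D)

    def weighted_dPb(self):
        # Discrete, exponentially weighted rate of change of P_b over the
        # k stored readings -- a simple stand-in for the low pass filter
        # of Equation (2); newest differences get the largest weight.
        if len(self.history) < 2:
            return 0.0
        diffs = [b - a for a, b in zip(self.history, self.history[1:])]
        weights = [math.exp(-(len(diffs) - 1 - i)) for i in range(len(diffs))]
        return sum(w * d for w, d in zip(weights, diffs)) / sum(weights)

    def step(self, C, dt=0.1):
        """Update with a new reading C; return the current heading (rad)."""
        self.history.append(self.receptor_bound(C))
        self.history = self.history[-self.k:]
        # Mean run time grows when moving up a favourable gradient (Eq. (1)).
        tau = self.tau0 * math.exp(self.alpha * self.weighted_dPb())
        self.counter += dt
        if self.counter > tau:
            # Tumble: pick a new random heading, then resume the run phase.
            self.heading = random.uniform(0, 2 * math.pi)
            self.counter = 0.0
        return self.heading
```

Climbing a gradient makes the weighted rate of change positive, lengthening the mean run time and so suppressing tumbles, which is exactly the biased random walk the text describes.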

Flocking Controller
There has been a lot of work done on flocking controllers. The first flocking controller was developed by C.W. Reynolds in [37]. In order to achieve flocking, he defined three rules that need to be followed: keep as close as possible to your neighbours (Aggregation), do not collide with your neighbours (Collision Avoidance), and move in a similar general heading to your neighbours (Alignment).
Following this discovery, many researchers, such as those of [38][39][40], have built on these three basic rules while investigating various approaches to achieving flocking. Most of the approaches agree that Equation (4) should be satisfied in order to achieve flocking [38].

where r is the distance between agents and the remaining terms represent the repulsion and attraction functions respectively. Following this, we use exponential functions to achieve flocking, which results in the Morse potential of Equation (5) [41].
where i is the index of the agents. Gains of G_R = 1 for the repulsion term and G_A = 0.99 for the attraction term were used. It was discovered that it is possible to control how close the agents get to each other, whilst not colliding, by reducing the attraction gain. In Equation (5), r is the distance to the closest agent. This value was obtained as follows: each agent has a communication radius l = 50 and can only buffer up to N = 5 readings obtained from agents within this range. From these readings, the closest agent is identified and its distance r is used in Equation (5) to avoid collisions with other agents. Despite using a limited number N of neighbours, the agents were still able to avoid collisions whilst maintaining cohesion as a flock. The buffer of five readings was used only for flocking; it is not necessary if proximity sensors are used instead, so the ant-like feature is maintained. The final output from the flocking controller and the bacteria controller is calculated as shown in Equation (6), where G_F and G_B are gains applied to the flocking controller and bacteria controller outputs respectively.
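The flocking rule above, together with the combination step of Equation (6), can be sketched as follows. This is an illustrative reconstruction: the gains G_R = 1 and G_A = 0.99, the communication radius, and the five-reading buffer follow the text, but the exponential length scales and the exact functional forms of Equations (5) and (6) are assumptions.

```python
import math

def morse_force(r, G_R=1.0, G_A=0.99, l_R=10.0, l_A=20.0):
    """Signed magnitude of a Morse-style interaction along the line
    between two agents: exponential repulsion minus exponential
    attraction. The length scales l_R and l_A are illustrative."""
    return G_R * math.exp(-r / l_R) - G_A * math.exp(-r / l_A)

def flocking_output(my_pos, neighbours, comm_radius=50.0, buffer_size=5):
    """Force on an agent from its closest buffered neighbour: only
    readings within the communication radius are kept, at most
    buffer_size of them, and r is the distance to the closest one."""
    in_range = [p for p in neighbours
                if math.dist(my_pos, p) <= comm_radius][:buffer_size]
    if not in_range:
        return (0.0, 0.0)
    nearest = min(in_range, key=lambda p: math.dist(my_pos, p))
    r = math.dist(my_pos, nearest)
    if r == 0:
        return (0.0, 0.0)
    f = morse_force(r)
    # Positive f pushes away from the neighbour; negative f pulls towards it.
    return (f * (my_pos[0] - nearest[0]) / r,
            f * (my_pos[1] - nearest[1]) / r)

def combined_output(flock_out, bact_out, G_F=1.0, G_B=1.0):
    """Weighted sum of the flocking and bacteria controller outputs,
    in the spirit of Equation (6)."""
    return (G_F * flock_out[0] + G_B * bact_out[0],
            G_F * flock_out[1] + G_B * bact_out[1])
```

With these values, repulsion dominates only at very short range, so agents are pulled together as a flock while still being pushed apart just before contact.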

Velocity Controller
The agent velocity updates are such that, by controlling them, it is possible for the agents to form a distribution of the pollutant in the environment. In the works of [32,42], this was achieved with an adaptable velocity determined by Equation (7), in which a dynamic velocity depends on the present reading of the environmental quantity and on the standard velocity in the absence of any reading, with v_k a constant for tuning the dynamic velocity and C the pollutant reading obtained by the robot's sensor. The separate controllers are combined using the architecture shown in Figure 2.
The relationship between the bacterium controller and the velocity controller is shown in Equation (8), which relates the distance covered by the agent during a run phase to the time elapsed since the last tumble phase. By using the velocity Equation (7), it was possible to enable the agents to achieve various visual distributions of pollution profiles, as can be seen in Figures 2 to 5. However, it was also discovered that the constant v_k can be used to control the spread of the agents in the pollution profile: high v_k values result in agents behaving more like gas molecules, covering a smaller area and forming less detailed features of the pollution profile, whilst lower v_k values result in agents behaving more like liquid molecules, covering a larger area and showing more detailed features of the pollutant profile, as can be seen in Figures 6 and 7.

However, as choosing the right v_k value is presently guesswork, a way of using machine learning to do this can be investigated. The approach used in this paper relies on a Genetic Algorithm based on Gaussian models. In order to do this, some modifications are made to Equation (7), as shown in Equation (9), where T can be viewed as a system temperature constant and can be used for tuning the system with an adaptable scheme. By adjusting this constant, the spread of the agents in the environment can be controlled to achieve different effects.

In Equation (9), the dynamic velocity of agent i depends on its present reading C_i(t) of the environmental quantity and on the standard velocity in the absence of any reading. This velocity function was embedded in the bacteria controller. The present reading C_i(t) adapts the velocity of the agent so that in an area of higher concentration it moves slowly, covering a smaller area, and vice versa. How this achieves coverage can be explained using Equation (10), where A_i is the area covered by agent i, λ is the mean run length of the bacteria controller and A = πλ² is the area covered by the agent when the bacteria controller is tuned so that the agent's motion is circular. As the agents get closer to the source, the area they cover decreases due to a lower velocity and a higher concentration reading, while they cover a larger area if they have a higher velocity due to a lower concentration. The area covered by the pollutant is covered if the individual areas are covered by the agents, as in Equation (11).

This could be used to develop a control law, as in Equations (12) and (13), if an estimate of the pollutant profile F(S(x)) in the environment S(x) were known and used to tune the system temperature T so that the spread of the agents in the environment is controlled. As a result, Equation (9) becomes Equation (14), in which integral and proportional gains are applied to the error. A Gaussian model based Genetic Algorithm, explained in Section 3, was used to estimate the parameters of the pollutant profile F(S(x)) locally. The standard deviation parameters of the locally estimated Gaussian are then used in Equation (12) so that it becomes Equation (15), where A_T(x)_std is the standard deviation of the flock positions. From machine learning, combinations of Gaussian curves can be used to form various functions [43], so that agents using the present approach are able to form various complex pollutant shapes.
By using machine learning, it is possible to correctly capture the shape of the pollutant and increase the rate of convergence. The disadvantage, however, is that the kernel function used to estimate the pollutant profile must be capable of forming the distribution of the pollutant profile effectively. Investigating the use of machine learning in improving the proposed control system opens up the possibility of studying how various real world conditions could affect the agents' distribution. This knowledge can then be incorporated into the control system so that the flock of agents is better prepared to deal with environmental conditions as they arise.

Pollution Distribution Estimation Using Genetic Algorithm
A Genetic Algorithm was used to approximate the distribution F(S(x)) of the pollution in the environment. As the proposed approach is aimed at environments dominated by diffusion, we assume that a pollutant can be described using a Gaussian function. This can be explained by observing the behaviour of a drop of ink placed in water: it starts out concentrated at the source point and, as time progresses, the concentration at the source point reduces while the surrounding areas become contaminated. This is a result of diffusion.
This behaviour can be viewed as a Gaussian function whose amplitude at the source reduces from an initial value A to a stationary value A_∞ as t → ∞, while the standard deviation σ becomes wider and approaches a stationary value σ_∞ as t → ∞. This can be modelled as in Equations (16) and (17).
where k1 is the rate of decay of the amplitude at time t and k2 is a constant that can be used to tune the conversion of the change in amplitude into the spread of the pollutant.
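A small numerical sketch of this diffusion picture follows. Only the limiting behaviour (amplitude decaying towards a stationary value, spread widening towards a stationary value) is taken from the text; the specific exponential decay laws standing in for Equations (16) and (17), and all parameter values, are illustrative assumptions.

```python
import math

def gaussian_plume(x, y, t, A0=100.0, Ainf=10.0, k1=0.1,
                   x0=400.0, y0=200.0, s0=38.0, k2=5.0):
    """Concentration at (x, y) at time t for an assumed diffusing
    Gaussian plume centred at (x0, y0): the amplitude decays from A0
    towards the stationary value Ainf while the spread widens from s0
    and saturates. k1 and k2 play the tuning roles described in the text."""
    A = Ainf + (A0 - Ainf) * math.exp(-k1 * t)     # A -> Ainf as t -> infinity
    sigma = s0 + k2 * (1.0 - math.exp(-k1 * t))    # sigma widens, then saturates
    r2 = (x - x0) ** 2 + (y - y0) ** 2
    return A * math.exp(-r2 / (2.0 * sigma ** 2))
```

This is the kind of static (near-negligible diffusion) profile whose amplitude, source location and standard deviation the Genetic Algorithm of Section 3.1 is asked to estimate.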

Estimating Simple Gaussian Functions
In this work, a very slow (near negligible) diffusion was assumed. The Genetic Algorithm was used to estimate the parameters (amplitude, source location, and standard deviation) of a Gaussian function that was used to simulate the pollutant. The estimated standard deviation F(S(x))_std was fed back to the robotic flock according to Equations (15), (13) and (14) in order to adjust the velocity of the agents and achieve optimal coverage of the environment.
The Genetic Algorithm was designed with 35 genes on each chromosome, with 7 genes representing each of the Gaussian's amplitude, x and y standard deviations, and x and y means. Each gene could take a value of either 0 or 1. As a result, the estimation range for each parameter was between 0 and 127. A gene pool of 400 chromosomes, a tournament size of 10, and crossover, mutation and reproduction ratios of 10%, 75% and 15% respectively were used, with 10 generations per run.
During simulation runs, agents explored the environment using both the bacteria and flocking algorithms whilst collecting spatial pollutant data. The spatial pollutant data was stored in a running memory of n = 20 elements. This data was then passed to the Genetic Algorithm in order to estimate the pollutant parameters.
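The chromosome encoding just described can be sketched as follows. The bit layout (35 genes, 7 per parameter) follows the text; the decoding order and the squared-error fitness are plausible choices rather than the authors' exact ones, and the selection, crossover and mutation operators are omitted.

```python
import math
import random

GENES_PER_PARAM = 7          # 7 binary genes per parameter -> values 0..127
PARAMS = ['amplitude', 'std_x', 'std_y', 'mean_x', 'mean_y']

def random_chromosome():
    """A 35-gene binary chromosome, one 7-bit field per Gaussian parameter."""
    return [random.randint(0, 1) for _ in range(GENES_PER_PARAM * len(PARAMS))]

def decode(chromosome):
    """Map a 35-gene binary chromosome to the five Gaussian parameters."""
    out = {}
    for i, name in enumerate(PARAMS):
        bits = chromosome[i * GENES_PER_PARAM:(i + 1) * GENES_PER_PARAM]
        out[name] = int(''.join(map(str, bits)), 2)
    return out

def fitness(params, samples):
    """Negative squared error between the candidate Gaussian and the
    spatial readings (x, y, C) collected by the agents (the n = 20
    running memory). Higher is better; 0 is a perfect fit."""
    err = 0.0
    for x, y, C in samples:
        pred = params['amplitude'] * math.exp(
            -((x - params['mean_x']) ** 2 / (2 * max(params['std_x'], 1) ** 2)
              + (y - params['mean_y']) ** 2 / (2 * max(params['std_y'], 1) ** 2)))
        err += (pred - C) ** 2
    return -err
```

The 0 to 127 range of each decoded field is why the agents' sensor readings are later normalized before being handed to the Genetic Algorithm.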

Estimating Multi-Modal Gaussian Functions
A pollutant is known to break up into pieces due to the turbulent nature of the environment. Hence, the pollutant's profile becomes one made up of various local maxima and one global maximum. This could also be viewed as noise. Nevertheless, the aim of the method proposed in this paper is to represent a pollutant's profile as closely as possible. This would be a problem for the method discussed in Section 3.1, as it would attempt to produce a single global estimate from the data collected by the agents.
In order to solve this problem, it is assumed that each agent i transmits its individually estimated Gaussian parameter values, together with its position, to its local neighbours {1…N} present within an l = 5 unit radius. This reduced communication radius ensures that only agents local to a local maximum contribute to each other's estimation.
Based upon this information, each agent uses an averaging scheme depicted by Equation (18) to reach a consensus as to the parameter values of the local maximum.
The consensus scheme works by using a weighted averaging mechanism, so that values closer to what an individual agent i is estimating are given more weight than values farther from its estimate, as in Equation (18), where k_ave,i is the averaged parameter estimate of agent i, G(||.||) is a Gaussian curve with agent i's estimate as the centre value, and N is the number of agents in the neighbourhood of agent i. N ≤ 5, as this simulates the communication limitation of agent i, which can only buffer up to five readings.
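For a single scalar parameter, the weighted averaging of Equation (18) can be sketched as below. The Gaussian weighting kernel centred on the agent's own estimate follows the text; the bandwidth value is an illustrative assumption.

```python
import math

def gaussian_weight(diff, bandwidth=10.0):
    """Gaussian kernel G(||.||) centred on the agent's own estimate."""
    return math.exp(-(diff ** 2) / (2 * bandwidth ** 2))

def consensus_average(own_estimate, neighbour_estimates, bandwidth=10.0):
    """Weighted average of parameter estimates in the spirit of
    Equation (18): a neighbour's estimate is weighted by a Gaussian of
    its distance from agent i's own estimate, so agreeing (nearby)
    estimates dominate and outliers are suppressed. At most five
    neighbour readings are buffered, matching the text."""
    values = [own_estimate] + list(neighbour_estimates[:5])
    weights = [gaussian_weight(v - own_estimate, bandwidth) for v in values]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)
```

An estimate far from the agent's own (for example, one belonging to a different local maximum) receives a near-zero weight and barely shifts the consensus.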
By using the parallelism offered by swarm robotics and the consensus scheme as discussed above, it is possible to obtain an estimation of the pollutant's profile in addition to recovering the function representing it.
The overall structure of the algorithm is presented in Algorithm 1. Lines four to six are necessary for a moving or changing pollutant profile; this has been tested in simulations with satisfactory results. The use of the model based Genetic Algorithm with the previous architecture is shown in Figure 9.

Simple Gaussian Functions
In order to test the approach, different pollutant profiles were generated, with 50 agents deployed. A Gaussian with mean (x, y) = (400, 200), spread (x, y) = (38, 38) and an amplitude of 100 was used. As can be seen in Figure 10, the agents were able to form the distribution even though their motion in Figure 10(b) seems erratic. This behaviour is a result of the bacteria controller searching for a favourable path towards the optimum of the Gaussian distribution. Figure 10(c) shows how the error between the agents' spread and the pollution spread reduces with time; the error was obtained using Equation (14). A profile made up of three Gaussians, with amplitudes of 100, 45 and 80 respectively and one of them centred at (40, 80), was also used, as in Figure 11. It was observed that the approach enabled the agents to distribute according to the simulated pollutant's profile.

Multi-Modal Gaussian Function
Using the approach discussed in Section 3.2, tests on multimodal Gaussian functions generated by C = 160 + (x² − 150cos(ω₁x) + y² − 150cos(2ω₂y)) were carried out with ω₁ = ω₂ = 0.065. During these experiments, 100 agents were deployed, with the agents' sensor readings normalized to the range 0 to 100, as that is the range the Genetic Algorithm was designed to handle. A higher number of agents was deployed because of the nature of the pollutant profile.
As can be observed in Figure 12, 60 of the agents were able to form the shape of the pollutant to a limited extent, with the dips in the multimodal function visible in the agents' distribution at (x, y) = (2.5, 16) and (x, y) = (15, 16). The rest of the agents were exploring the environment.

Conclusion
As discussed in the introduction of this paper, this work consisted of two parts. The first part uses a swarm of robots to provide a visual representation of a noisy static pollutant. For this, the method in this work relied on the bacterium behaviour, the flocking behaviour, and a velocity function. However, this part relied on choosing an appropriate value for the constant v_k in the velocity function. It was shown that the proposed method can be classified in the ant robotics category due to its limited communication, memory and computational requirements. As a result, according to present knowledge, this is one of the first pieces of work to use ant robotics to create a visual representation of an invisible pollutant.
The second part of this work is the development of a feedback system that used the error between the standard deviation of the swarm member positions and that of the pollutant profile to choose an appropriate value automatically. The standard deviation of the pollutant profile was estimated using a computationally efficient model based Genetic Algorithm that relied on the parallelism offered by the swarm to reach a consensus among swarm members when the pollutant profile became more challenging.
However, the implementation of the Genetic Algorithm takes the agents out of the scope of ant robotics because of the amount of memory that would be required to store the gene pool alone. In addition, noise would adversely affect the estimation of the Genetic Algorithm, which raises the question of whether an alternative scheme that is less affected by noise can be developed. A cooling scheme borrowed from the theory of Simulated Annealing might offer promising results, and this will be the task of future investigations.
Other areas to be investigated include estimating the swarm's convergence time to a pollutant profile and studying the effect of dynamic cleaning of the pollutant on the proposed approach.