1. Introduction
In recent years, frequent sudden pollution accidents have seriously threatened aquatic ecosystems. When pollutants are discharged into water, they form a pollution field that varies dynamically in space and time. Identifying the pollution source in a timely and effective manner is therefore a key problem in water quality monitoring, yet traditional monitoring methods have difficulty tracking such dynamic pollution fields. Multiple intelligent monitoring USVs, with their advantages in autonomous detection, offer new solutions for water quality monitoring. However, many issues remain to be studied, especially in water environments such as lakes. Because lake currents are not as directional as river currents and lack the clear tidal characteristics of oceans, it is difficult to estimate the location of pollution sources quickly and accurately during emergency monitoring, when each individual USV has only limited knowledge. Slow flow velocity, large turbulence, wind fields and environmental noise also cause the pollution field to exhibit discrete local extrema, so a USV can easily make incorrect assessments, which reduces detection efficiency.
To improve the efficiency of pollution source tracing, this study adopts multi-USV cooperative monitoring, which offers good spatial extensibility and tolerance to faulty information, and takes the monitoring of sudden lake water pollution as its application scenario. An innovative N-PSO information trend method is proposed. A probability distribution is used to represent the spatial distribution of pollution sources. Sharing information among multiple USVs reduces the uncertainty that arises from limited prior knowledge, and a confidence factor is introduced to coordinate the cognitive differences between USVs. The particle swarm optimization (PSO) algorithm is combined with the ‘Infotaxis’ algorithm to plan the exploration path. This method effectively avoids premature convergence and improves the exploration efficiency of the USVs.
2. Related Work
Recently, intelligent mobile devices such as mobile robots have been used more and more in daily life, in engineering, and in the military. Compared with traditional approaches, mobile robots can reach places in harsh environments [1, 2, 3]. Mobile robots that simulate biological behavior to locate chemical sources are referred to as olfactory robots [4]. Robots equipped with chemical sensors can track a chemical plume and then locate the chemical source. Olfactory robots have broad prospects in many applications, such as searching for signs of life after earthquakes, locating the sources of toxic gas leaks, locating ocean hydrothermal vents, locating fire ignition points, and military detection. Cooperative multi-robot systems, such as unmanned ground vehicles, unmanned aerial vehicles (UAVs), unmanned underwater vehicles (UUVs) or unmanned surface vehicles (USVs), have shown their superiority over sensor networks in the environmental monitoring domain. Such perceptive agents are able not only to adapt their measurement capabilities to changing environmental conditions, but also to cooperate in gathering information and making intelligent decisions. This study focuses on a cooperative search approach for water pollution source localization.
Chemical concentration trend is a commonly used search method for mobile robots. It relies on the local concentration gradient to guide the robot toward the chemical source, simulating the crawling behavior of animals tracking odorants. In [5], a chemical source search method was proposed based on solid formal principles from the field of fluid mechanics. A mobile sensor network composed of multiple robots senses the ambient fluid velocity and chemical concentration and calculates their derivatives. Fluid dynamics and flux tropism are then used to guide the robots to the chemical source. In [6], two three-dimensional moth-inspired odor tracking algorithms, Counter-turner and Modified counter-turner, were tested on a robotic platform. The resulting tracks show promise in mimicking the flight tracks observed in biological experiments with moths. In [7], the results of a 3-D computer simulation of an autonomous unmanned aerial vehicle (UAV) tracking a chemical plume to its source were presented. The simulation study includes a simulated dynamic chemical plume, a six-degrees-of-freedom nonlinear aircraft model, and a bio-inspired navigation algorithm.
For the chemical concentration trend search method, the detected chemical concentration must be high enough that the average concentration difference measured by the robot at two adjacent locations exceeds the fluctuation error. The signal-to-noise ratio depends on the measurement time and increases with waiting time. However, the average concentration may decay rapidly (sometimes exponentially) with increasing distance from the chemical source, and the resulting weak signal-to-noise ratio makes the waiting time needed for concentration detection too long. In addition, the concentration information may change during the waiting time. Meanwhile, different types of pollution sources emit different pheromones. Some, such as liquid chemical pollutants, diffuse with the flow and show a clear concentration gradient. Some, such as oil pollutants, diffuse rapidly at first but, after reaching a certain degree of diffusion, form a uniform oil film that drifts with the water. Others, such as dumped solid waste, emit pheromones only sporadically. Environmental factors also have a great impact on the morphology of the pollution field. Slowly flowing water and wind fields give the pheromones better continuity, whereas turbulence may break them up and make them discrete and irregular. These factors make the chemical concentration trend search method unreliable in complex environments. Robots may lose their tracking targets due to disordered local information, or may obtain inaccurate concentration information due to a low signal-to-noise ratio. Sometimes, they may become trapped at local concentration extrema and make incorrect decisions.
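The waiting-time argument can be made concrete with a rough estimate. The following is only a sketch under assumptions that are not in the original text: the sensor noise is independent between samples with standard deviation $\sigma$, one sample is taken per unit time, the mean concentration decays as $c(d) = c_0 e^{-d/\lambda}$ with distance $d$ from the source, and the two adjacent locations are a distance $\delta \ll \lambda$ apart.

```latex
% Mean concentration difference between two adjacent locations
\Delta c \approx \left|\frac{\mathrm{d}c}{\mathrm{d}d}\right| \delta
         = \frac{\delta}{\lambda}\, c_0\, e^{-d/\lambda}

% Noise on the difference of two means, each averaged over T samples
\sigma_{\Delta}(T) = \sigma \sqrt{\frac{2}{T}}

% Signal-to-noise ratio after waiting time T
\mathrm{SNR}(T) = \frac{\Delta c}{\sigma_{\Delta}(T)}
                = \frac{\Delta c}{\sigma} \sqrt{\frac{T}{2}}

% Waiting time needed for SNR > 1
T > \frac{2\sigma^{2}}{\Delta c^{2}}
  = \frac{2\sigma^{2}\lambda^{2}}{\delta^{2} c_0^{2}}\, e^{2d/\lambda}
```

Under these assumptions, the signal-to-noise ratio grows only as the square root of the waiting time, while the waiting time required for a reliable gradient estimate grows exponentially with distance from the source.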
In the last decade, the ‘Infotaxis’ algorithm was proposed, and it has been developed considerably in the past few years. In [8], an information trend strategy for olfactory tracing was presented. Information entropy plays a role similar to that of the concentration gradient in the chemical concentration trend method. The strategy of the ‘Infotaxis’ algorithm is to maximize the expected information gain: by comparing the predicted information gains, the searcher always moves towards the location with the maximum expected information gain. The uncertainty of the probability distribution is reduced continuously as the robots explore, until the source is located. The ‘Infotaxis’ algorithm thus makes the exploration independent of the concentration gradient, so it can be applied in turbulent environments with unstable concentration cues or in weak sensing environments that are far from the chemical pollution source.
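The move selection described above can be sketched in a few lines. This is a minimal illustration under assumptions that are not in the original text: the map is a list of grid cells, each reading is a binary hit or miss, and `hit_likelihood` is a hypothetical exponential-decay sensor model chosen only for the example.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete probability map."""
    return -sum(q * math.log(q) for q in p if q > 0)

def hit_likelihood(cell, pos, scale=3.0):
    """Hypothetical sensor model: P(detection at pos | source in cell)."""
    return 0.9 * math.exp(-math.dist(cell, pos) / scale)

def expected_entropy(p, cells, pos):
    """Expected posterior entropy after one binary reading taken at pos."""
    like = [hit_likelihood(c, pos) for c in cells]
    total = 0.0
    for outcome in (like, [1.0 - l for l in like]):  # hit, then miss
        joint = [q * l for q, l in zip(p, outcome)]
        evidence = sum(joint)  # probability of this outcome
        if evidence > 0:
            total += evidence * entropy([j / evidence for j in joint])
    return total

def best_move(p, cells, candidates):
    """Return the candidate position with the largest expected information gain."""
    h0 = entropy(p)
    gains = [h0 - expected_entropy(p, cells, c) for c in candidates]
    k = max(range(len(candidates)), key=gains.__getitem__)
    return candidates[k], gains[k]
```

Because the expected information gain computed this way is a mutual information, it is never negative. A fuller implementation would also account for the chance of finding the source at the candidate position itself.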
In lake water environments, because of the wide range of the working space, there are some shortcomings in using a single USV for target detection.
(1) Although the ‘Infotaxis’ algorithm does not depend on the continuous concentration gradient distribution, when a single USV explores within a discrete clue distribution field, it may still terminate the exploration because of losing clues.
(2) Due to the lack of global information, the exploration process is easily affected by false-alarm sensor readings, changing environmental information, and other factors, resulting in incorrect decisions. This greatly lowers the exploration efficiency and extends the exploration time.
(3) Due to the limited environmental information obtained, a single USV may fall into local extreme values, resulting in misjudgment.
(4) Once the single USV fails, the task cannot be completed.
With the development of intelligent robots, more and more attention has been paid to multi-agent theory and technology [9, 10, 11]. Compared with the single robot exploration method, multiple robots can reduce the information entropy more quickly and locate the chemical source more effectively. In [12, 13, 14, 15, 16], the ‘Infotaxis’ algorithm for multiple cooperative robots was proposed and applied. However, the cooperation strategy of multi-USV systems still needs to be improved, especially when used in wide spaces such as lakes or oceans. The main issues include:
(1) Simply overlaying the exploration information of multiple USVs cannot maximize the advantages of the multi-USV system.
(2) Multi-USV systems that lack cooperation make it easy for USVs to search the same area repeatedly. This will lead to the aggregation of multiple USVs in the same area, thus reducing the efficiency of exploration.
(3) How can a reasonable cooperation strategy be designed to minimize the impact of environmental uncertainty?
Based on the existing ‘Infotaxis’ algorithm, this study proposes an improved shared probability updating method. To further optimize the exploration strategy, the PSO algorithm is introduced into the ‘Infotaxis’ algorithm to plan the USVs’ exploration path.
5. ‘PSO-Infotaxis’ Algorithm-Based Exploration of Cooperative USVs
According to the above study, it can be seen that there are still some shortcomings in the application of the multi-USV information trend search method.
(1) Multi-USV exploration can only share information about probabilistic maps, but there are no cooperative measures. When exploring, only the information obtained by the individual USV is considered, which lowers the exploring efficiency, and the USV can easily fall into local extreme values and make incorrect judgments.
(2) The exploration method in consideration of cognitive differences can avoid falling into local optimal solutions. Multi-USV cooperation can achieve better fault tolerance, making the detection results more robust. Nevertheless, the method is still based on a simple cooperation method without considering coordination between individual cognition and the population’s experience, which lowers the search efficiency.
(3) In the standard information trend method, the next target location is always chosen from the positions adjacent to the explorer. This works well in small spaces, but in large areas the exploration step is too small, so the speed of convergence is significantly affected by the size of the space.
If the multi-cooperative USV exploration system is regarded as a social population, the behavior of each individual in it will be affected not only by its own past experience and cognition, but also by the overall social behavior. The next problem to be solved is how to achieve this cooperative role better, and how to adjust the search strategy according to each USV's own historical experience and the group behavior in order to improve the efficiency of the method. Meta-heuristic algorithms, which inspired our study, combine a stochastic search with a local search. In [17], a novel adaptation of the multi-group quasi-affine transformation evolutionary algorithm for global optimization was proposed. In [18], a compact pigeon-inspired optimization algorithm was proposed to solve complex scientific and industrial problems with many data packets, including classical optimization problems, and to find optimal solutions in many solution spaces with limited hardware resources. Such algorithms provide a feasible solution to the problem within acceptable computational time and space, although the solution cannot be predicted in advance [19].
In this study, the PSO algorithm is introduced into the information trend search algorithm to plan and adjust the USVs’ exploration path. This method is called the ‘PSO-Infotaxis’ algorithm.
5.1. Basic Idea of Standard PSO
PSO was initially proposed by Eberhart and Kennedy [20, 21]. Its basic concept originates from the study of the foraging behavior of birds. The basic PSO algorithm is expressed as follows. Assume that the search space has $n$ dimensions and that the population comprises $n$ particles, where the position of the $i$-th particle can be expressed as a vector $X_i = (X_1, X_2, \cdots, X_n)^T$. According to the objective function, the fitness value corresponding to each particle’s position $X_i$ can be calculated. The velocity of the $i$-th particle is expressed as $V_i = (V_1, V_2, \cdots, V_n)^T$, and its individual extreme value denotes the optimum historical position of the particle, which is expressed as $P_i$. The extreme value of the population is the optimum historical position of the particle population, which is expressed as $P_g$. In the $t$-th iteration, the updating formulas of particle velocity and position are as follows:

$$V_i^{t+1} = \omega V_i^{t} + c_1 r_1 \left(P_i^{t} - X_i^{t}\right) + c_2 r_2 \left(P_g^{t} - X_i^{t}\right) \tag{15}$$

$$X_i^{t+1} = X_i^{t} + V_i^{t+1} \tag{16}$$

where $\omega$ is the inertial weight, which represents the degree of inertial motion of a particle in accordance with its own velocity. It is linearly reduced with the number of iterations. $c_1$ and $c_2$ are learning factors which represent the experience learned from the particle and the particle group, respectively. The values of $c_1$ and $c_2$ are usually 2. $r_1$ and $r_2$ are random numbers between 0 and 1 [19].
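The velocity and position update of standard PSO can be sketched as follows. This is a minimal illustration rather than the authors' implementation. The names `w`, `c1`, `c2`, `r1` and `r2` stand for the inertial weight, the two learning factors and the two random numbers. The linear decrease of the inertial weight from 0.9 to 0.4, the search box of [-5, 5] and the choice to maximize the fitness are assumptions made only for this example.

```python
import random

def pso_step(x, v, pbest, gbest, w, c1=2.0, c2=2.0, rng=random):
    """One velocity and position update for a single particle."""
    r1, r2 = rng.random(), rng.random()  # random numbers in [0, 1)
    v_new = [w * vd + c1 * r1 * (pd - xd) + c2 * r2 * (gd - xd)
             for xd, vd, pd, gd in zip(x, v, pbest, gbest)]
    x_new = [xd + vd for xd, vd in zip(x, v_new)]
    return x_new, v_new

def pso_maximize(fitness, dim, n_particles=10, iters=50,
                 w_start=0.9, w_end=0.4, seed=0):
    """Maximize fitness with standard PSO; returns the best position found."""
    rng = random.Random(seed)
    xs = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    pbest = [x[:] for x in xs]             # individual extreme values
    gbest = max(pbest, key=fitness)[:]     # extreme value of the population
    for t in range(iters):
        # Inertial weight reduced linearly with the iteration count.
        w = w_start - (w_start - w_end) * t / max(iters - 1, 1)
        for i in range(n_particles):
            xs[i], vs[i] = pso_step(xs[i], vs[i], pbest[i], gbest, w, rng=rng)
            if fitness(xs[i]) > fitness(pbest[i]):
                pbest[i] = xs[i][:]
        gbest = max(pbest + [gbest], key=fitness)[:]
    return gbest
```

A call such as `pso_maximize(lambda p: -((p[0] - 3) ** 2 + (p[1] + 1) ** 2), dim=2)` returns the best position found for a simple quadratic objective. The result is never worse than the best initial particle, because the individual and population extreme values are only replaced by better ones.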
5.2. ‘Infotaxis’ Algorithm of Multi-USV Exploration Based on Improved PSO
The algorithm proposed in this study is inspired by PSO. The multiple USVs are regarded as particles and form the particle population. When dynamic particles sample in the exploratory space, their own knowledge of their previous experience and the knowledge shared with other particles are used as guidance to make local exploratory behavior more efficient. The PSO identifies the knowledge shared by the group as much as possible. Meanwhile, it retains the consideration of the experiences of the particle itself. This makes the cooperation among the multiple USVs more effective.
The speed and position of particles are decided according to PSO. The optimal values of the probability of the source location detected by the USVs are taken as the fitness function. For particle $i$, in the $t$-th iteration, the individual extreme value is the historically optimal position that possesses the best fitness value on its trajectory. This is $P_i$ in Equation (15). $P_g$ is the location with the best fitness value over the whole particle population. The fitness function of the whole particle population is the shared probability calculated according to Equation (14). $P_i$ and $P_g$ can be expressed as follows:

$$P_i^{t} = \arg\max_{r}\, p_i^{t}(r), \qquad P_g^{t} = \arg\max_{r}\, p^{t}(r)$$

Here, $p_i^{t}(r)$ is the probability of the source position estimated in USV $i$’s $t$-th iteration, and $p^{t}(r)$ is the sharing probability of the source position estimated in the multiple USVs’ $t$-th iteration.
The next exploration position of USV $i$ is calculated from Equations (15) and (16). The velocity $V_i$ can be understood as the step length of the USV’s next movement.
To overcome the shortcoming of having few particles and to avoid premature convergence, it is necessary to enrich the diversity of particle selection. This study therefore further improves the standard PSO. The random coefficients in Equation (15) are drawn several times, each draw taking different random values, to generate a population of candidate velocities for each particle. To avoid excessive computation, the size of this generated population is limited to less than 8.
For the $n$ particles, the candidate velocities are substituted into Equation (16) to obtain the corresponding position population of each particle. According to Equation (10), the combination of positions with the maximum entropy reduction is then selected as the exploration target of the USVs at the next moment. The method is iterated until the source of chemical pollution is found or the iteration limit is reached.
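The candidate generation and selection can be sketched as follows. This is a hedged illustration, not the authors' code. `candidate_moves` redraws the random numbers of Equation (15) `m` times for one USV. `best_combination` scores every combination of one candidate per USV using a caller-supplied `joint_gain` function, which is a hypothetical stand-in for the entropy reduction of Equation (10).

```python
import itertools
import random

def candidate_moves(x, v, pbest, gbest, w, m, rng, c1=2.0, c2=2.0):
    """Generate m candidate (position, velocity) pairs for one USV."""
    moves = []
    for _ in range(m):
        r1, r2 = rng.random(), rng.random()  # fresh random draw per candidate
        v_new = [w * vd + c1 * r1 * (pd - xd) + c2 * r2 * (gd - xd)
                 for xd, vd, pd, gd in zip(x, v, pbest, gbest)]
        x_new = [xd + vd for xd, vd in zip(x, v_new)]
        moves.append((x_new, v_new))
    return moves

def best_combination(positions_per_usv, joint_gain):
    """Pick one candidate position per USV so that joint_gain is maximal.

    positions_per_usv holds one list of candidate positions per USV.
    joint_gain takes a tuple with one position per USV and returns a score.
    """
    return list(max(itertools.product(*positions_per_usv), key=joint_gain))
```

The exhaustive search evaluates every combination, so its cost grows as the candidate count raised to the power of the number of USVs. This is consistent with keeping the candidate population small.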
5.3. The Overall Process of the Method
Step 1: The local map is rasterized, with the X or Y axis of the map aligned with the direction of the water flow. The speed and position of the particles in the population are initialized. The initial probability in each grid cell of the map is $1/N$, where $N$ is the number of cells.
Step 2: According to the clues detected by the particles in their respective positions, the posterior probability distribution on the map is calculated according to Equation (4). The information entropy value based on the historical clues obtained by the particles at time t is calculated through Equation (5). For multiple USVs, the sharing probability is calculated according to Equation (14).
Step 3: The position of the optimal posterior probability of each particle, that is, $P_i$ in Equation (15), is updated. The position $P_g$, which has the optimal fitness value over the whole particle population, is updated.
Step 4: The speed and location set of each particle are calculated according to Equations (15) and (16).
Step 5: According to Equation (10), the best moving position combination of particles with the greatest entropy drop in each particle population is calculated.
Step 6: The USVs move to their next target points and record the clues obtained.
Step 7: Return to Step 3 and iterate.
Step 8: If the global fitness value reaches a set limit (0.9 in this study) and its position does not change during a set time interval, the chemical pollution source confirmation procedure is started. If the chemical pollution source is confirmed, the task ends. Otherwise, the local map is expanded and execution continues from Step 1.
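Steps 1, 2 and 8 can be sketched as follows. This is a minimal illustration, not the authors' implementation. The Bayes update stands in for Equation (4) with a hypothetical per-cell likelihood supplied by the caller, and `hold_steps` is an assumed discretization of the set time interval.

```python
import math

def uniform_prior(n_cells):
    """Step 1: every grid cell starts with probability 1/N."""
    return [1.0 / n_cells] * n_cells

def bayes_update(prior, likelihood):
    """Step 2 (sketch): posterior over cells given one clue.

    likelihood[k] is a hypothetical P(clue | source in cell k).
    """
    joint = [p * l for p, l in zip(prior, likelihood)]
    evidence = sum(joint)
    if evidence == 0:
        raise ValueError("clue has zero probability under the prior")
    return [j / evidence for j in joint]

def entropy(p):
    """Shannon entropy (in nats) of the probability map."""
    return -sum(q * math.log(q) for q in p if q > 0)

def should_confirm(history, threshold=0.9, hold_steps=5):
    """Step 8 (sketch): history is a list of (best_cell, best_probability).

    Confirmation starts only if the best cell has stayed the same and its
    probability has stayed at or above the threshold for hold_steps iterations.
    """
    if len(history) < hold_steps:
        return False
    recent = history[-hold_steps:]
    same_cell = len({cell for cell, _ in recent}) == 1
    return same_cell and all(prob >= threshold for _, prob in recent)
```

For example, starting from `uniform_prior(4)` and applying `bayes_update` with the likelihoods `[0.9, 0.1, 0.1, 0.1]` concentrates the probability on the first cell and lowers the entropy of the map.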