Article

Multi-Agent Cooperative Target Search

1 Singapore Institute of Manufacturing Technology, Singapore 638075, Singapore
2 School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798, Singapore
3 Western Digital Corporation, Singapore 118261, Singapore
4 Institute of High Performance Computing, Singapore 138632, Singapore
* Author to whom correspondence should be addressed.
Sensors 2014, 14(6), 9408-9428; https://doi.org/10.3390/s140609408
Submission received: 25 February 2014 / Revised: 16 May 2014 / Accepted: 20 May 2014 / Published: 26 May 2014
(This article belongs to the Section Physical Sensors)

Abstract

This paper addresses a vision-based cooperative search for multiple mobile ground targets by a group of unmanned aerial vehicles (UAVs) with limited sensing and communication capabilities. The airborne camera on each UAV has a limited field of view and its target discriminability varies as a function of altitude. First, by dividing the whole surveillance region into cells, a probability map can be formed for each UAV indicating the probability of target existence within each cell. Then, we propose a distributed probability map updating model which includes the fusion of measurement information, information sharing among neighboring agents, and information decay and transmission due to environmental changes such as the target movement. Furthermore, we formulate the target search problem as a multi-agent cooperative coverage control problem by optimizing the collective coverage area and the detection performance. The proposed map updating model and the cooperative control scheme are distributed, i.e., each agent only communicates with its neighbors within its communication range. Finally, the effectiveness of the proposed algorithms is illustrated by simulation.

1. Introduction

With the fast development of high-resolution imaging devices and processing technologies, unmanned aerial vehicles (UAVs) with airborne cameras are increasingly employed in civil and military applications such as environmental monitoring, battlefield surveillance and map building, where ground-target search is one of the major applications [1,2]. Target tracking and search are among the most popular applications of UAVs [3,4]. The conventional method for target search by UAVs in a closed region divides the whole surveillance region into cells, and associates each cell with a probability or confidence of target existence in the cell, which constitutes a probability map for the whole region [5,6].

In [7], an online planning and control method is proposed for cooperative search by a group of UAVs, where each agent keeps an individual probability map for the whole region, updated according to the Dempster-Shafer theory. A path planning algorithm is designed using the obtained measurement information, which requires each agent to directly communicate with all other agents. In [8], target detection is considered as part of an integrated mission including coverage control and data collection as parallel tasks for multi-agent networks. The coverage control method aims to maximize the joint detection probability of random events, and the probability of target existence is updated by measurements based on the Bayesian rule. However, only the measurement information of direct neighbors is exchanged, which makes it difficult to obtain the target information of the whole surveillance region. In [9], a decentralized gradient-based control strategy is proposed for multiple autonomous mobile sensor agents searching for targets of interest by minimizing the joint team probability of no detection within an action horizon, based on a range-detection sensing model. However, each agent is required to collect detection information from all other agents. In [10], a decentralized search algorithm is developed which includes a two-step updating procedure for the probability maps. Each agent first obtains observations over the cells within its sensing region and updates its individual probability map by the Bayesian rule. Then, each agent transmits its individual probability map to its neighbors for map fusion. This algorithm is distributed and full network connectivity is not required. However, the lack of information correlation makes the map fusion difficult, and only a heuristic fusion method is given in [10], the performance of which has not been analyzed. In our recent work [11], a distributed iterative map updating model is proposed to fuse the information from measurements and the maps of neighbors based on a logarithmic transformation of the Bayesian rule. Through this, the nonlinear Bayesian update is replaced by a linear one, which simplifies the computation. The convergence speed of the individual probability map of an agent is also analyzed under fixed detection and false alarm probabilities for the search of static targets.

Cooperative control is an important task for efficient target search by a group of UAVs. Compared with centralized control algorithms, distributed control algorithms are more robust to accidental failures of UAVs and breakdowns of communication links [12]. In [13], a distributed multi-agent coverage control method is proposed based on a given sensing performance function related to the distance to the robots, and gradient descent algorithms are designed for a class of utility functions to optimize the coverage and sensing performance. In [14], a distributed, adaptive control law is developed to achieve an optimal sensing configuration for a network of mobile robots which obtain sensory information of a static environment and exchange their estimates of the environment with neighbors. In [15], a three-dimensional distributed control strategy is proposed to deploy hovering robots with downward-facing cameras to collectively monitor an environment. A new optimization criterion is defined as the information obtained by each pixel of a camera. In [16], a dynamic awareness model is proposed to control a multi-vehicle sensor network with intermittent communications. The state of awareness of each individual vehicle is updated by its own sensing model and by sharing information with its neighbors. However, none of the coverage control schemes mentioned above has considered the detection results of target existence, which may affect the UAVs' movement decisions during target search. Moreover, there are very few works addressing the issue of distributed vision-based cooperative search for multiple mobile targets with probabilistic detections.

In this paper, we investigate the vision-based cooperative search for multiple mobile ground targets by a group of UAVs with limited sensing and communication capabilities. The main contribution of this paper is a distributed strategy of information fusion and cooperative control for searching multiple mobile targets using multi-agent networks based on probabilistic detections. In addition, time-varying detection and false alarm probabilities are considered, which result from the varying altitudes of agents with three-dimensional dynamics. Under our search strategy, each agent shares local target information and controls its own behavior in a distributed manner. Based on the probability map updating model proposed in [11], we generalize the model by considering the information decay and transmission between cells due to environmental changes such as the target movement. The influence of the time-varying detection probability on the update of probability maps due to the three-dimensional UAV dynamics is also analyzed. Then, a coverage optimization problem is formulated to balance the coverage area and the detection performance. The proposed map updating model and cooperative control scheme are distributed, i.e., each agent only communicates with the agents within its communication range.

This paper is organized as follows: Section 2 describes the basic notations and assumptions used in this paper. Section 3 presents the probability map updates by measurements and information sharing with time-varying detection probabilities. In Section 4, a three-dimensional coverage control method is presented for target search. Simulation results are shown in Section 5, and the conclusions are drawn in Section 6.

2. Basic Definitions and Assumptions

The surveillance region 𝒪 ⊂ ℝ² is assumed to be on a flat ground plane and is uniformly divided into a set of cells of the same size. We assume that all UAVs (or agents) use the same global Cartesian coordinate system, and the position of agent i (i = 1, 2, ⋯, N) at time k is denoted as $\mu_{i,k} = [c_{i,k}^T, h_{i,k}]^T \in \mathbb{R}^3$ (as shown in Figure 1a), where $c_{i,k} \in \mathbb{R}^2$ is the planar coordinate of its projection on 𝒪, $h_{i,k} \in \mathbb{R}$ is the altitude of the agent above 𝒪, N is the total number of agents and "T" denotes the transpose operation. Each agent is assumed to have access to its own position at any time. Each cell in the surveillance region is associated with a probability or confidence of target existence within the cell, which is modeled by a Bernoulli distribution, i.e., θ_{g,k} = 1 (a target is present) with probability P_i(θ_{g,k} = 1) and θ_{g,k} = 0 (no target is present) with probability 1 − P_i(θ_{g,k} = 1) for agent i and cell g at time k, where g ∈ ℝ² is the location of the cell center in 𝒪. If more than one target is present within a cell, they are treated as a single target.

In this paper, we mainly discuss vision-based detection, where each agent carries an airborne camera facing downward toward the surveillance region (as shown in Figure 1a). Each agent independently takes measurements Z_{i,g,k} over the cells within its sensing region ℂ_{i,k} at time k, where

$$
\mathbb{C}_{i,k} \triangleq \{\, g \in \mathcal{O} : \| g - c_{i,k} \| \leq h_{i,k} \tan\varphi \,\}
$$
and ‖•‖ denotes the 2-norm for vectors. Each agent is assumed to have the same angle of field of view, half of which is denoted by φ. We also assume that the size of each cell is sufficiently small compared with the size of ℂ_{i,k}, so that we can ignore the boundary effect and roughly consider a cell as wholly within ℂ_{i,k} if its center is within ℂ_{i,k}. Only two observation results are defined for each cell, Z_{i,g,k} = 0 or Z_{i,g,k} = 1. For all cells, P(Z_{i,g,k} = 1 | θ_{g,k} = 1) = p_{i,k} and P(Z_{i,g,k} = 1 | θ_{g,k} = 0) = q_{i,k} are assumed to be known by agent i as the detection probability and the false alarm probability, respectively.

The topology of the network of all agents at time k is modeled by an undirected graph 𝒢_k = (𝒱, ε_k), where 𝒱 = {1, 2, …, N} is the vertex set and ε_k = {{i, j} : i, j ∈ 𝒱; ‖μ_{i,k} − μ_{j,k}‖ ⩽ R_c} is the edge set; each edge {i, j} is an unordered pair of distinct agents and R_c is the communication range of each agent. The graph (or the network) is connected if for any two vertices i and j there exists a sequence of edges (a path) {i, ν_1}, {ν_1, ν_2}, …, {ν_{n−1}, ν_n}, {ν_n, j} in ε_k. Let 𝒩_{i,k} = {j ∈ 𝒱 | {i, j} ∈ ε_k} ∪ {i} denote the set of neighbors of agent i at time k, where an agent is assumed to be a neighbor of itself. The degree (number of neighbors) of agent i at time k is denoted as d_{i,k} = |𝒩_{i,k}|.
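For concreteness, the short Python sketch below shows how the discretized surveillance region, the sensing region ℂ_{i,k} and the neighbor sets 𝒩_{i,k} can be computed; it is only an illustration of the definitions above, and all function names and numerical values are ours, not the paper's.

```python
import numpy as np

def cells_in_view(mu, cell_centers, phi):
    """Indices of cells whose centers lie in C_{i,k}, i.e. ||g - c_{i,k}|| <= h_{i,k} tan(phi)."""
    c, h = mu[:2], mu[2]
    radius = h * np.tan(phi)
    dists = np.linalg.norm(cell_centers - c, axis=1)
    return np.where(dists <= radius)[0]

def neighbor_sets(positions, Rc):
    """N_{i,k}: agents within communication range Rc of agent i (each agent is its own neighbor)."""
    diffs = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diffs, axis=2)
    return [set(np.where(dist[i] <= Rc)[0]) for i in range(len(positions))]

# Example: a 50 x 50 grid of 1 m cells and three agents mu = [cx, cy, h].
W = 50
xs, ys = np.meshgrid(np.arange(W) + 0.5, np.arange(W) + 0.5)
cell_centers = np.stack([xs.ravel(), ys.ravel()], axis=1)
agents = np.array([[5.0, 5.0, 8.0], [20.0, 30.0, 6.0], [40.0, 10.0, 9.0]])
print(len(cells_in_view(agents[0], cell_centers, phi=np.deg2rad(30))))
print(neighbor_sets(agents, Rc=20.0))
```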

3. Probability Map Update

3.1. Bayesian Update and Consensus-Based Map Fusion

In [11], we proposed a cooperative control scheme for target search in multi-agent systems. In a group of UAVs, each agent i keeps an individual probability map P_{i,g,k} of the whole region, where P_{i,g,k} ≜ P_i(θ_{g,k} = 1) is updated by the Bayesian rule:

$$
P_{i,g,k} = \frac{P(Z_{i,g,k}\,|\,\theta_{g,k}=1)\,P_{i,g,k-1}}{P(Z_{i,g,k}\,|\,\theta_{g,k}=1)\,P_{i,g,k-1} + P(Z_{i,g,k}\,|\,\theta_{g,k}=0)\,(1-P_{i,g,k-1})}
= \begin{cases}
\dfrac{p_{i,k}\,P_{i,g,k-1}}{p_{i,k}\,P_{i,g,k-1} + q_{i,k}\,(1-P_{i,g,k-1})} & \text{if } Z_{i,g,k}=1 \\[2ex]
\dfrac{(1-p_{i,k})\,P_{i,g,k-1}}{(1-p_{i,k})\,P_{i,g,k-1} + (1-q_{i,k})\,(1-P_{i,g,k-1})} & \text{if } Z_{i,g,k}=0 \\[2ex]
P_{i,g,k-1} & \text{otherwise}
\end{cases}
$$
where 0 < P_{i,g,0} < 1 and 0 < p_{i,k}, q_{i,k} < 1. For the cases with p_{i,k} = 0 or 1, or q_{i,k} = 0 or 1, simplified conclusions can be obtained as shown in [11] and will not be considered in this paper. By letting
$$
Q_{i,g,k} = \ln\Big(\frac{1}{P_{i,g,k}} - 1\Big)
$$
we get the following transformation of Equation (1):
$$
Q_{i,g,k} = Q_{i,g,k-1} + \upsilon_{i,g,k}
$$
where
$$
\upsilon_{i,g,k} \triangleq \begin{cases}
\ln\dfrac{q_{i,k}}{p_{i,k}} & \text{if } Z_{i,g,k}=1 \\[1.5ex]
\ln\dfrac{1-q_{i,k}}{1-p_{i,k}} & \text{if } Z_{i,g,k}=0 \\[1ex]
0 & \text{otherwise}
\end{cases}
$$

Keeping Q_{i,g,k} as the updated term instead of P_{i,g,k} simplifies the nonlinear update in Equation (1) into the linear one in Equation (3). For a group of UAVs, we let each agent i at time k first take measurements and transmit them to its neighbors. After receiving the measurements from all its neighbors, agent i fuses them as follows:

$$
H_{i,g,k} = Q_{i,g,k-1} + \sum_{j\in\mathcal{N}_{i,k}} \upsilon_{j,g,k}
$$

Then, each agent i transmits the updated H_{i,g,k} of the whole region to its neighbors for map fusion, which is given by:

$$
Q_{i,g,k} = \sum_{j\in\mathcal{N}_{i,k}} w_{i,j,k}\, H_{j,g,k}
$$
where $w_{i,i,k} = 1 - \frac{d_{i,k}-1}{N}$, $w_{i,j,k} = \frac{1}{N}$ for j ∈ 𝒩_{i,k} (j ≠ i), and $w_{i,j,k} = 0$ for j ∉ 𝒩_{i,k}. Then, a matrix composed of the w_{i,j,k} can be defined as:
$$
W_k \triangleq [\,w_{i,j,k}\,]_{N\times N} \quad (i, j = 1, \ldots, N)
$$
which is a doubly stochastic matrix [17]. The communications of neighboring agents are assumed to be synchronized within a short time interval. Time synchronization in distributed networks is not the focus of this paper and has been addressed in many works [18–21].
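A minimal Python sketch of the measurement update (5) and the consensus fusion (6) is given below, assuming fixed detection and false alarm probabilities shared by all agents; all names and the toy numbers are ours.

```python
import numpy as np

def increment(z, p, q):
    """upsilon from Equation (4) for one observed cell."""
    return np.log(q / p) if z == 1 else np.log((1.0 - q) / (1.0 - p))

def measurement_step(Q, observations, neighbors, p, q):
    """Equation (5): H_i = Q_i plus the increments received from neighbors.
    observations[j] is a dict {cell_index: z} for agent j; neighbors[i] includes i."""
    H = Q.copy()
    for i in range(Q.shape[0]):
        for j in neighbors[i]:
            for g, z in observations[j].items():
                H[i, g] += increment(z, p, q)
    return H

def fusion_step(H, neighbors):
    """Equation (6) with w_ii = 1 - (d_i - 1)/N and w_ij = 1/N for neighbors."""
    N = H.shape[0]
    Q_new = np.zeros_like(H)
    for i in range(N):
        d = len(neighbors[i])
        Q_new[i] = (1.0 - (d - 1) / N) * H[i]
        for j in neighbors[i]:
            if j != i:
                Q_new[i] += H[j] / N
    return Q_new

# Toy run: 3 agents, 4 cells, agent 0 observes cell 2 and reports a detection.
Q = np.zeros((3, 4))                     # Q = 0 means P = 0.5 everywhere
neighbors = [{0, 1}, {0, 1, 2}, {1, 2}]
obs = [{2: 1}, {}, {}]
Q = fusion_step(measurement_step(Q, obs, neighbors, p=0.8, q=0.2), neighbors)
P = 1.0 / (1.0 + np.exp(Q))              # invert Q = ln(1/P - 1)
print(P.round(3))
```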

3.2. Time-Varying Detection Probability

In [11], we only considered a 2-dimensional control scheme assuming that all agents move on a fixed plane parallel to the ground plane. However, in the real world, UAVs such as helicopters can change their altitudes according to their task requirements so as to enlarge their sensing area (here we only consider cameras with a fixed zooming level). Therefore, in this paper, we will consider the influence of 3-dimensional dynamics of UAVs on the detection performance.

For vision-based detection, the detection probability relies on the picture resolution. Figure 1b shows the basic imaging scheme of an airborne camera, similar to the one given in [15,22]. In general, a desirable property for good target recognition is a "right" ratio between the size of the image and the size of the target, where "right" depends on the target type and the detection algorithm that is employed. For simplicity, it can be assumed that the larger the image of a target in the picture (in terms of the number of occupied pixels) obtained by the UAV, the easier it is for the UAV to discriminate the target, regardless of the recognition method used. Hence, we can model the target discriminability of a UAV as a function ρ proportional to the ratio between the size of a target image taken by the camera, denoted by S_TI, and the size of one pixel, denoted by S_P, i.e., ρ ∝ S_TI/S_P. Here, we assume that all targets have the same visual properties, such as color, shape and size, that influence target discriminability. It is also assumed that each camera has a fixed focal length, so that we only consider the change of ρ due to the variation of agent altitude. Then, by denoting the size of the projection of a target on the ground plane as S_T, we can derive that:

$$
\rho \propto \frac{S_{TI}}{S_T}\cdot\frac{S_T}{S_P} = \frac{b^2 S_T}{h^2 S_P}
$$
where h is the altitude of the UAV and b is the fixed distance between the image plane and the lens (as shown in Figure 1b). In a multi-agent system, for the i-th agent at time k, we have $\rho_{i,k} \propto \frac{b^2 S_T}{h_{i,k}^2 S_P}$. From Equation (8), we may get ρ_{i,k} → ∞ as h_{i,k} → 0. However, in reality, ρ_{i,k} cannot be infinitely large and there should be an upper limit when h_{i,k} is smaller than a threshold h̲. That is to say, the target discrimination ability will not improve any more if a UAV descends very close to the ground.
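As a small numeric illustration of Equation (8) and of the saturation just described, the snippet below evaluates ρ for a few altitudes; the lens distance, target and pixel sizes, and the saturation altitude are assumed values, not taken from the paper.

```python
def discriminability(h, b=0.02, S_T=1.0, S_P=1e-9, h_sat=5.0):
    """rho proportional to b^2 * S_T / (h^2 * S_P), saturated below h_sat (all values assumed)."""
    h_eff = max(h, h_sat)
    return b ** 2 * S_T / (h_eff ** 2 * S_P)

for h in [2.0, 5.0, 10.0, 20.0]:
    print(h, f"{discriminability(h):.3e}")
```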

The target discriminability determines the detection probability when a UAV is detecting the existence of targets within each cell under surveillance. It is natural to expect that the detection probability p_{i,k} increases and the false alarm probability q_{i,k} decreases as ρ_{i,k} increases. When the altitude of the UAV becomes larger than a threshold h̄, it loses the ability to discriminate any target from the background environment, which means that the detection result does not depend on the true existence of the target any more. That is, if h_{i,k} ⩾ h̄, we have P(Z_{i,g,k} = 1 | θ_{g,k} = 1) = P(Z_{i,g,k} = 1 | θ_{g,k} = 0), i.e., p_{i,k} = q_{i,k} = 0.5. Generally, when h_{i,k} ∈ [h̲, h̄] (h̲ < h̄), p_{i,k} should be a monotonically increasing function of ρ_{i,k}, or more explicitly, a monotonically decreasing function of h_{i,k}. Therefore, we assume the following detection probability model:

$$
p_{i,k} = \begin{cases}
0.5 & \text{if } h_{i,k} \geq \bar{h} \\
f_1(h_{i,k}) & \text{if } \underline{h} < h_{i,k} < \bar{h} \\
\hat{p} & \text{if } 0 < h_{i,k} \leq \underline{h}
\end{cases}
$$
where $f_1'(h_{i,k}) < 0$ for $h_{i,k} \in (\underline{h}, \bar{h})$ and $1 > f_1(\underline{h}) = \hat{p} > f_1(\bar{h}) = 0.5$. Similarly, we can assume the false alarm probability model as a monotonically increasing function of $h_{i,k}$:
$$
q_{i,k} = \begin{cases}
0.5 & \text{if } h_{i,k} \geq \bar{h} \\
f_2(h_{i,k}) & \text{if } \underline{h} < h_{i,k} < \bar{h} \\
\hat{q} & \text{if } 0 < h_{i,k} \leq \underline{h}
\end{cases}
$$
where $f_2'(h_{i,k}) > 0$ for $h_{i,k} \in (\underline{h}, \bar{h})$ and $0 < f_2(\underline{h}) = \hat{q} < f_2(\bar{h}) = 0.5$. In this paper, the altitude $h_{i,k}$ of an agent is allowed to vary from 0 to ∞ for theoretical analysis, though this may not happen in the real world due to system limitations.
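The models (9) and (10) leave f_1 and f_2 generic; the sketch below uses a Gaussian-shaped family (the same family later assumed in the simulation section) as one possible choice. The values of p̂, q̂, h̲ and h̄ are assumed for illustration only.

```python
import numpy as np

def detection_prob(h, p_hat=0.99, h_lo=5.0, h_hi=10.0):
    """p_{i,k}(h): saturates at p_hat below h_lo, decreases to 0.5 at h_hi and above."""
    if h >= h_hi:
        return 0.5
    if h <= h_lo:
        return p_hat
    k = np.log(p_hat / 0.5) / (h_hi - h_lo) ** 2   # ensures f1(h_lo) = p_hat, f1(h_hi) = 0.5
    return p_hat * np.exp(-k * (h - h_lo) ** 2)

def false_alarm_prob(h, q_hat=0.01, h_lo=5.0, h_hi=10.0):
    """q_{i,k}(h): mirror image, increasing from q_hat at h_lo to 0.5 at h_hi."""
    if h >= h_hi:
        return 0.5
    if h <= h_lo:
        return q_hat
    k = np.log(0.5 / q_hat) / (h_hi - h_lo) ** 2
    return q_hat * np.exp(k * (h - h_lo) ** 2)

for h in [3.0, 6.0, 8.0, 12.0]:
    print(h, round(detection_prob(h), 3), round(false_alarm_prob(h), 3))
```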

Remark 1

Models (9) and (10) are motivated by the natural understanding of the interaction between the altitude of an agent and its detection and false alarm probabilities. They only reflect the general relation between these parameters and are not restricted to a specific parametric representation of f_1 and f_2. Hence, our method is applicable to any detection probability function that fits the model. An experimental detection probability model of a CCD camera has been given in [22].

Remark 2

p_{i,k} and q_{i,k} can also be cell-dependent, i.e., they may vary from place to place due to environmental conditions. For example, a target is often easier to discriminate on open ground than on land covered with trees. In complex environments, agents must know the detection probability and false alarm probability models of the different types of regions. For ease of exposition, we assume the models to be constant across the whole surveillance region.

Denoting by m_{i,g,k} the number of observations taken over cell g up to time k by agent i, and defining $m_{g,k} = \sum_{i=1}^{N} m_{i,g,k}$, we can get the following conclusions for the update of Q_{i,g,k}.

Theorem 1

Given the initial prior probability map 0 < P_{i,g,0} < 1 ∀i ∈ 𝒱, if there exists a constant ε > 0 such that p_{i,k} ⩾ 0.5 + ε and q_{i,k} ⩽ 0.5 − ε ∀i ∈ 𝒱, and the network topology 𝒢_k is connected at all times, then the following conclusions hold under the map updating rules (5) and (6).

(1)

If a target is present within cell g, $Q_{i,g,k} \xrightarrow{a.s.} -\infty$ (i.e., $P_{i,g,k} \xrightarrow{a.s.} 1$) ∀i ∈ 𝒱 as $m_{g,k} \to +\infty$.

(2)

If no target is present within cell g, $Q_{i,g,k} \xrightarrow{a.s.} +\infty$ (i.e., $P_{i,g,k} \xrightarrow{a.s.} 0$) ∀i ∈ 𝒱 as $m_{g,k} \to +\infty$.

Proof

See Appendix A.

3.3. Environment-Based Probability Map

In the map updates (5) and (6), the effect of environmental changes, such as information decay and transmission between cells, has not been considered. For example, if targets may randomly appear or disappear during the search, the historical information about the target existence cannot reflect the true current situation, and the detected cells need to be revisited at a certain frequency for information update. This problem can be formulated as information decay for each cell. If a target may move from one cell to another, then part of the information for the former cell should be removed and counted as new information for the latter cell. This problem can be formulated as information transmission between two cells. Therefore, we need to generalize the aforementioned map updating model to be applicable to the case with such environmental changes. Similar to the assumption made in [16], we assume that Q_{i,g,k} decays exponentially for each cell if there is no prior knowledge and/or no measurement information. The information transmission between cells due to target movement is modeled based on the transition of probabilities. In addition, the prior knowledge about the environmental change is taken as the system input. All these lead to the following generalized updating model for Q_{i,g,k}:

$$
H_{i,g,k} = e^{-\alpha T}\sum_{r\in\mathcal{O}} a_{i,g,r,k}\,b_{i,g,r,k}\,Q_{i,r,k-1} + \sum_{j\in\mathcal{N}_{i,k}}\upsilon_{j,g,k} + \xi_{i,g,k},
\qquad
Q_{i,g,k} = \sum_{j\in\mathcal{N}_{i,k}} w_{i,j,k}\,H_{j,g,k}
$$
where α ⩾ 0 is the information decaying factor, T is the sampling period of all UAVs, a_{i,g,r,k} and b_{i,g,r,k} are the nonnegative information transmission factors, and ξ_{i,g,k} is the input information given by the prior knowledge about the target existence within cell g. Specifically, b_{i,g,r,k} satisfies b_{i,g,g,k} = 1 and b_{i,g,r,k} = 0 (g ≠ r) for Q_{i,r,k−1} > 0, and b_{i,g,r,k} = P(θ_{g,k} = 1 | θ_{r,k−1} = 1) for Q_{i,r,k−1} ⩽ 0. a_{i,g,r,k} is determined by a_{i,g,r̂_i,k} = 1 and a_{i,g,r,k} = 0 (r ≠ r̂_i), where
$$
\hat{r}_i = \operatorname*{arg\,min}_{r\in\mathbb{B}_{i,g,k}} b_{i,g,r,k}\,Q_{i,r,k-1},
\qquad
\mathbb{B}_{i,g,k} = \{\, r\in\mathcal{O} : b_{i,g,r,k} > 0 \,\}
$$

Remark 3

a_{i,g,r,k} and b_{i,g,r,k} are defined based on the physical meaning of information transmission due to target movement in the real world. Since combining Q_{i,r,k−1} (r ∈ 𝒪) into a cell g involves combining the historical measurement information of all cells r ∈ 𝒪, whose correlation may not be known, we need to be careful in dealing with the fusion of such information. If Q_{i,g,k} > 0, we are more confident that no target exists within cell g; otherwise, we are more confident that a target exists within cell g. Since an information transmission out of a cell at time k is expected to occur only when a target exists within the cell at time k − 1, we let b_{i,g,r,k} = 0 if Q_{i,r,k−1} > 0, which means there is no transmission of information (or target movement) from cell r to cell g. If Q_{i,r,k−1} ⩽ 0, an information transmission occurs from cell r to cell g due to the possible target movement from r to g, and the amount of information transmitted should be proportional to P(θ_{g,k} = 1 | θ_{r,k−1} = 1), i.e., equal to b_{i,g,r,k} Q_{i,r,k−1}. The smaller b_{i,g,r,k} is, the less information is retained for cell g. Furthermore, by assuming that at most one target can exist within a cell at a time, i.e., at most one target can move into a cell at a time, we select the information stream with the largest transmitted amount as the newly stored information for cell g when there are incoming information streams from multiple cells r ∈ 𝔹_{i,g,k}. That is, we take the smallest b_{i,g,r,k} Q_{i,r,k−1} subject to Q_{i,r,k−1} ⩽ 0 as the newly stored information after the transmission, which corresponds to the most probable target movement to g among all possible movements to g from different cells. The information decaying factor α is set to be positive when the prior knowledge of b_{i,g,r,k} is not accurate or when targets may appear and disappear unpredictably during the search. In this case, the information decay makes the agents revisit the detected regions at a certain frequency. As for the input information ξ, it only denotes the effect brought by the prior knowledge, and there is no need to compute it explicitly in real implementations, because any prior knowledge on target existence can be used to update the probabilities of target existence directly, and thus Q_{i,g,k} can be updated directly following its definition in Equation (2).

Here we give a simple example to illustrate how the parameters are designed if the true target dynamic model is given by x_{k+1} = Ψ x_k, where x_k is a vector including the target location. In this case, one can calculate the transition probability P(θ_{g,k+1} = 1 | θ_{r,k} = 1) for any two cells r and g, where θ_{r,k} = 1 represents that the target lies within cell r at time k. Following this, given the current accumulated information Q_{i,r,k} of agent i for cell r at time k, one can calculate b_{i,g,r,k} following its definition. Further, with the results of b_{i,g,r,k} for any two cells r and g, one can calculate a_{i,g,r,k} according to its definition in Equation (12). If targets do not suddenly appear or disappear, the decaying factor α can be set to zero.
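The following hedged sketch shows how, for a single agent, the factors b_{i,g,r,k} and a_{i,g,r,k} of Equations (11)-(12) can be evaluated from a given transition-probability table and combined into the decayed transmission term; the helper names and the toy three-cell example are ours, not the paper's.

```python
import numpy as np

def transmission_step(Q_prev, trans, alpha=0.0, T=1.0):
    """Return e^{-alpha T} * sum_r a_{g,r} b_{g,r} Q_{r,k-1} for every cell g.
    trans[g, r] stands for P(theta_{g,k} = 1 | theta_{r,k-1} = 1)."""
    M = Q_prev.shape[0]
    out = np.zeros(M)
    for g in range(M):
        # b_{g,r}: 1 for r = g if Q_r > 0; transition probability if Q_r <= 0; else 0.
        b = np.zeros(M)
        for r in range(M):
            if Q_prev[r] > 0:
                b[r] = 1.0 if r == g else 0.0
            else:
                b[r] = trans[g, r]
        candidates = np.where(b > 0)[0]                      # the set B_{i,g,k}
        if candidates.size == 0:
            continue
        # a_{g,r}: keep only the stream with the smallest b*Q (the most probable move into g).
        r_hat = candidates[np.argmin(b[candidates] * Q_prev[candidates])]
        out[g] = np.exp(-alpha * T) * b[r_hat] * Q_prev[r_hat]
    return out

# Toy example: 3 cells on a line; the target stays or moves to an adjacent cell.
trans = np.array([[0.5, 0.5, 0.0],
                  [0.5, 0.5, 0.5],
                  [0.0, 0.5, 0.5]])
Q_prev = np.array([-2.0, 1.5, 0.3])                          # cell 0 likely holds a target
print(transmission_step(Q_prev, trans))
```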

Define the following augmented variables:

$$
\begin{aligned}
Q_{i,k} &= [\,Q_{i,g_1,k},\ldots,Q_{i,g_M,k}\,]^T, & Q_k &\triangleq [\,Q_{1,k}^T,\ldots,Q_{N,k}^T\,]^T \\
V_{i,k} &= \Big[\textstyle\sum_{j\in\mathcal{N}_{i,k}}\upsilon_{j,g_1,k},\ldots,\sum_{j\in\mathcal{N}_{i,k}}\upsilon_{j,g_M,k}\Big]^T, & V_k &= [\,V_{1,k}^T,\ldots,V_{N,k}^T\,]^T \\
\xi_{i,k} &= [\,\xi_{i,g_1,k},\ldots,\xi_{i,g_M,k}\,]^T, & \xi_k &= [\,\xi_{1,k}^T,\ldots,\xi_{N,k}^T\,]^T \\
A_{i,k} &= [\,a_{i,g_\tau,g_s,k}\,b_{i,g_\tau,g_s,k}\,]_{M\times M}\ (\tau,s=1,\ldots,M), & A_k &= \operatorname{diag}[\,A_{1,k},\ldots,A_{N,k}\,]
\end{aligned}
$$
where τ and s are respectively the row and column indices of an appropriate cell in Ai,k, and M is the total number of cells, we get the following generalized updating model:
$$
Q_k = e^{-\alpha T}(W_k \otimes I)\,A_k\,Q_{k-1} + (W_k \otimes I)(V_k + \xi_k)
$$
where ⊗ denotes the Kronecker product.

According to Theorem 1, ‖Q_k‖ can be seen as the information gathered for decision making on target existence: the larger ‖Q_k‖ is, the less the uncertainty about the target existence or nonexistence. Hence, our aim in controlling the UAVs is to maximize ‖Q_k‖ in some sense, which will be discussed in the following section.

4. Cooperative Coverage Control

In the previous section, a distributed map updating scheme was proposed for fusing the knowledge of multiple agents. In this section, we design a cooperative control strategy that optimizes the trajectories of the agents for target search based on their real-time updated knowledge about the target information. Within each time interval, an agent first updates its probability map by the map updating scheme designed in Section 3.3 and then makes a control decision on where to move for the next observation through collective optimization, which is addressed in this section. The two steps make the whole network a closed-loop sensing and feedback control system.

Here we consider the waypoint motion model for each agent:

$$
\mu_{i,k} = \mu_{i,k-1} + u_{i,k}
$$
where u_{i,k} ∈ ℝ³ is the control input (or the waypoint displacement) of the i-th agent at time k. Note that the above motion model only deals with the waypoints of agents at discrete time steps. The true dynamics of the agents is not discussed in this paper since we do not want to limit our results to the dynamic model of any specific type of UAV. How to make the agents reach the desired waypoints with their inner-loop flight controllers is a technical issue that is not addressed in this paper but left to our real-system experiments. Our job is to optimize the selection of the next waypoint μ_{i,k} (i.e., the displacement u_{i,k}) for each agent given its current waypoint μ_{i,k−1}.

Following Equation (13), we can get

$$
Q_k = G_k + (W_k \otimes I)\,V_k
$$
where $G_k \triangleq e^{-\alpha T}(W_k \otimes I) A_k Q_{k-1} + (W_k \otimes I)\xi_k$. At time k − 1, G_k can be seen as the prior information, and V_k as the information to be gathered from measurements. Since V_k and E[V_k] are both related to the true target existence, which is unknown, we cannot predict the values of Q_k or E[Q_k] before taking measurements. What we can do at time k − 1 is to find the optimal next sampling position μ_{i,k} so as to maximize the information to be gathered at time k. More precisely, the problem can be formulated as the optimization problem:
$$
\max_{\mu_k}\ \mathbb{E}\big[\,\|Q_k - G_k\|^2 \,\big|\, Q_{k-1}, \xi_k\,\big]
$$
where $\mu_k \triangleq [\mu_{1,k}^T, \ldots, \mu_{N,k}^T]^T$. Considering that W_k involves the global topological information, which is often hard to obtain for each individual agent in a distributed system, and that ‖(W_k ⊗ I)V_k‖ ⩽ ‖V_k‖, we replace Equation (15) with the following suboptimal optimization:
$$
\max_{\mu_k}\ \mathbb{E}\big[\|V_k\|^2\big] = \sum_{g\in\mathcal{O}}\sum_{i=1}^{N}\sum_{j\in\mathcal{N}_{i,k}} \mathbb{E}\big[\upsilon_{j,g,k}^2\big]\,\mathbf{1}_{\{g\in\mathbb{C}_{j,k}\}}
$$

Notice that Equation (16) is not an approximation of Equation (15), but a new cost function we intend to optimize, which is an upper bound of Equation (15). Defining the cost function in this way is very common in statistics and estimation theory; for example, the Cramér–Rao lower bound is often selected as the cost function when the true variance of the estimation error is time-varying and unknown.

Following Equation (16), we should try to maximize $\mathbb{E}[\upsilon_{j,g,k}^2]$ and the collective sensing area of all agents. From Equation (4), we get, for g ∈ ℂ_{j,k}:

$$
\mathbb{E}\big[\upsilon_{j,g,k}^2\big] = \begin{cases}
\Big\|\, p_{j,k}\ln\dfrac{q_{j,k}}{p_{j,k}} + (1-q_{j,k})\ln\dfrac{1-q_{j,k}}{1-p_{j,k}} \,\Big\|^2 & \text{if } \theta_{g,k}=1 \\[2.5ex]
\Big\|\, q_{j,k}\ln\dfrac{q_{j,k}}{p_{j,k}} + (1-p_{j,k})\ln\dfrac{1-q_{j,k}}{1-p_{j,k}} \,\Big\|^2 & \text{otherwise}
\end{cases}
$$

It is straightforward to find that $\mathbb{E}[\upsilon_{j,g,k}^2]$ is monotonically increasing with respect to p_{j,k} and monotonically decreasing with respect to q_{j,k}, whether θ_{g,k} = 1 or not. Thus, Equation (16) is further replaced with the following optimization problem:

$$
\max_{\mu_k}\ \mathcal{H}(\mu_k) = \sum_{i=1}^{N}\int_{\mathcal{M}_{i,k}} \phi_k(r)\,(p_{i,k}-q_{i,k})\,\mathbf{1}_{\{r\in\mathbb{C}_{i,k}\}}\,dr
$$
where ϕ_k is a given nonnegative weighting function of r ∈ 𝒪 at time k, and its influence on the control law will be shown later. {ℳ_{1,k}, …, ℳ_{N,k}} is a partition of 𝒪 at time k subject to μ_{i,k} ∈ ℳ_{i,k}, such as the Voronoi partition. The partition is introduced to avoid collisions between UAVs and to ease the handling of the overlapped sensing regions of neighboring agents, which will be discussed later. Since p_{i,k} ⩾ q_{i,k}, ℋ(μ_k) is always nonnegative, and ℋ(μ_k) = 0 if h_{i,k} ⩾ h̄ for all i. Denoting by ∂(•) the boundary of the corresponding region and by n_{∂(•)}(r) the outward pointing normal vector of the boundary ∂(•) at point r, we can compute the gradient of ℋ(μ_k) as follows.

Theorem 2

The gradient of the cost function ℋ(μ_k) with respect to μ_{i,k} (for h_{i,k} < h̄) is given by

$$
\begin{aligned}
\frac{\partial\mathcal{H}(\mu_k)}{\partial c_{i,k}} &= (p_{i,k}-q_{i,k})\int_{S_i^2}\phi_k(r)\,n_{S_i^2}(r)\,dr \\
\frac{\partial\mathcal{H}(\mu_k)}{\partial h_{i,k}} &= \big(f_1'(h_{i,k}) - f_2'(h_{i,k})\big)\int_{S_i^1}\phi_k(r)\,dr + (p_{i,k}-q_{i,k})\tan\varphi\int_{S_i^2}\phi_k(r)\,dr
\end{aligned}
$$

for hi,k ∈ (h̲, h̄), and

$$
\frac{\partial\mathcal{H}(\mu_k)}{\partial c_{i,k}} = (\hat{p}-\hat{q})\int_{S_i^2}\phi_k(r)\,n_{S_i^2}(r)\,dr,
\qquad
\frac{\partial\mathcal{H}(\mu_k)}{\partial h_{i,k}} = (\hat{p}-\hat{q})\tan\varphi\int_{S_i^2}\phi_k(r)\,dr
$$
for $h_{i,k} \in (0, \underline{h}]$, where $S_i^1 = \mathcal{M}_{i,k}\cap\mathbb{C}_{i,k}$ and $S_i^2 = \mathcal{M}_{i,k}\cap\partial(\mathbb{C}_{i,k})$.

Proof

See Appendix B.

Following Theorem 2, a gradient-based control law is given by

$$
u_{i,k} = K_u \left.\frac{\partial\mathcal{H}(\mu_k)}{\partial\mu_{i,k}}\right|_{\mu_k = \mu_{k-1}}
$$
where K_u is a positive gain parameter. A larger K_u may lead to faster convergence to the sub-optimal configuration, but may also cause larger convergence errors or oscillation around the settling points due to the discrete-time control. In real system implementations, users should choose this parameter by trading off the two performance indices.
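For illustration, the sketch below evaluates the gradient (18) numerically on the cell grid: the area integral over S_i^1 is approximated by a sum over cells inside the sensing disk, and the boundary integral over S_i^2 by a sum over a thin ring of cells. It ignores the partition ℳ_{i,k} (the whole disk is assumed to lie inside it) and all names and numbers are assumptions, not the paper's implementation.

```python
import numpy as np

def coverage_gradient(mu, cell_centers, phi_w, p, q, dp_dh, dq_dh, fov_half_angle,
                      cell_area=1.0, ring_width=1.0):
    """Approximate [dH/dc, dH/dh] of Equation (18) for one agent; phi_w holds the cell weights."""
    c, h = mu[:2], mu[2]
    radius = h * np.tan(fov_half_angle)
    d = np.linalg.norm(cell_centers - c, axis=1)
    inside = d <= radius                          # discretized S_i^1 (disk interior)
    ring = np.abs(d - radius) <= ring_width / 2   # discretized S_i^2 (disk boundary)

    normals = (cell_centers[ring] - c) / np.maximum(d[ring, None], 1e-9)  # outward normals
    ring_weight = phi_w[ring] * cell_area / ring_width                    # line-integral weights

    grad_c = (p - q) * (normals * ring_weight[:, None]).sum(axis=0)
    grad_h = ((dp_dh - dq_dh) * (phi_w[inside] * cell_area).sum()
              + (p - q) * np.tan(fov_half_angle) * ring_weight.sum())
    return np.append(grad_c, grad_h)

# Toy usage on a 50 x 50 grid with uniform weights (no information gathered yet).
xs, ys = np.meshgrid(np.arange(50) + 0.5, np.arange(50) + 0.5)
centers = np.stack([xs.ravel(), ys.ravel()], axis=1)
phi_w = np.ones(centers.shape[0])
g = coverage_gradient(np.array([25.0, 25.0, 7.0]), centers, phi_w,
                      p=0.8, q=0.2, dp_dh=-0.05, dq_dh=0.05, fov_half_angle=np.deg2rad(30))
print(g)  # with uniform weights the planar gradient is ~0 and the altitude term dominates
```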

Remark 4

Note that the control input is always upper bounded in real systems, i.e., ‖u_{i,k}‖ ⩽ u_max for some positive number u_max, and the altitude of each agent is also bounded by h_{i,k} < h̄ to obtain meaningful detections. Moreover, to avoid collisions when neighboring agents are at the same altitude, the motion of each agent should be constrained by c_{i,k+1} ∈ ℳ_{i,k}. Therefore, the control law (20) is modified as follows to adapt to these constraints:

$$
u_{i,k} = \lambda_{i,k} K_u \left.\frac{\partial\mathcal{H}(\mu_k)}{\partial\mu_{i,k}}\right|_{\mu_k = \mu_{k-1}}
$$

where λi,k is a scaling factor defined by

$$
\begin{aligned}
\lambda_{i,k} = \operatorname*{arg\,max}_{0\leq\lambda\leq 1}\ & \Big\|\lambda K_u \frac{\partial\mathcal{H}(\mu_k)}{\partial\mu_{i,k}}\Big\| \\
\text{s.t.}\quad & \Big\|\lambda K_u \frac{\partial\mathcal{H}(\mu_k)}{\partial\mu_{i,k}}\Big\| \leq u_{\max}, \\
& c_{i,k-1} + \lambda K_u \frac{\partial\mathcal{H}(\mu_k)}{\partial c_{i,k}} \in \mathcal{M}_{i,k-1}\setminus\Lambda_{i,k-1}, \\
& h_{i,k-1} + \lambda K_u \frac{\partial\mathcal{H}(\mu_k)}{\partial h_{i,k}} \leq \bar{h} - \epsilon_2
\end{aligned}
$$

Λ_{i,k} is a buffer region enclosing the border of ℳ_{i,k}, defined as follows:

$$
\Lambda_{i,k} = \Big\{\, r\in\mathcal{M}_{i,k} : \min_{s\in\partial\mathcal{M}_{i,k}}\|r-s\| < \min\Big(\epsilon_1,\ \min_{s\in\partial\mathcal{M}_{i,k}}\|c_{i,k}-s\|\Big) \,\Big\}
$$
where ε₁, ε₂ > 0 are given parameters limiting the width of the buffer region and the height of each agent, respectively.
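A hedged sketch of the scaling rule (21)-(22) is given below: λ is found by a simple grid search that keeps the largest feasible value (the objective in (22) grows with λ). The partition-membership test is a user-supplied stand-in for ℳ_{i,k−1}\Λ_{i,k−1}, and all names and values are assumptions.

```python
import numpy as np

def scaled_step(mu_prev, grad, Ku, u_max, h_max, eps2, inside_partition, n_samples=100):
    """Return lambda * Ku * grad with the largest feasible lambda in [0, 1]."""
    best = 0.0
    for lam in np.linspace(0.0, 1.0, n_samples + 1):
        step = lam * Ku * grad
        ok = (np.linalg.norm(step) <= u_max                      # speed limit
              and inside_partition(mu_prev[:2] + step[:2])       # stay out of the border buffer
              and mu_prev[2] + step[2] <= h_max - eps2)          # altitude ceiling with margin
        if ok:
            best = lam
    return best * Ku * grad

# Toy usage: a square partition [0, 25] x [0, 50] shrunk by a 1 m buffer at its border.
inside = lambda c: 1.0 <= c[0] <= 24.0 and 1.0 <= c[1] <= 49.0
u = scaled_step(np.array([20.0, 20.0, 9.5]), np.array([3.0, 0.0, 1.0]),
                Ku=0.3, u_max=2.0, h_max=10.0, eps2=0.2, inside_partition=inside)
print(u)
```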

Generally, it is preferred that UAVs stay longer in regions with less gathered information so as to take more measurements. Thus, we define the weighting function ϕ_{i,k}(g) as a function of the gathered information ‖Q_{i,g,k−1}‖ for each cell, i.e.:

$$
\phi_{i,k}(g) = e^{-K_\phi \|Q_{i,g,k-1}\|}
$$
where K_ϕ is a positive gain parameter. Under this model, cells with less gathered information are given higher weights for detection. There is no specific rule for choosing an optimal K_ϕ, since it only reflects the user's preference on the search priority of different cells. In general, K_ϕ should simply not be too large or too small, so that the weights of different cells are properly scaled. For example, we find that K_ϕ = 2 is one of many suitable settings in our simulation.

Remark 5

The partition {ℳ_{1,k}, …, ℳ_{N,k}} can be static or time-varying. Partitioning is commonly used so that each UAV only takes charge of one part of the whole surveillance region and the searching task is shared by multiple agents. Users can predefine the task regions for each UAV or let the UAVs dynamically compute the partition following some rules. An example of a dynamic partition is the Voronoi partition, which has been widely used in distributed control [13].

5. Simulation

5.1. Simulation Environment

We deploy multiple UAVs to search for four ground targets. The whole surveillance region is a square of [0, 50] × [0, 50] m², as shown in Figure 2a, within which lie two crossing roads denoted by 𝒪_R ⊂ 𝒪. The four targets stay or move only on the roads and no target appears outside the roads in the surveillance region. At time k, each target z (z = 1, 2, 3, 4) randomly moves to one of the cells in the set {g ∈ 𝒪_R : ‖g − Tar_{z,k−1}‖ ⩽ V_Tar}, where Tar_{z,k−1} is the cell it stays in at time k − 1 and V_Tar is the largest possible speed of target movement. Hence, $P(\theta_{g,k} = 1 \,|\, \theta_{r,k-1} = 1) = 1 / \sum_{g\in\mathcal{O}_R} \mathbf{1}_{\{g\in\mathcal{D}_r\}}$ for r ∈ 𝒪_R, where 𝒟_r = {g ∈ 𝒪_R : ‖g − r‖ ⩽ V_Tar}. Initially, we set Q_{i,g,0} = 0 for all i and for all g within the roads (i.e., P_{i,g,0} = 0.5 for g ∈ 𝒪_R), and Q_{i,g,0} to a fixed large value for g outside the roads (i.e., P_{i,g,0} ≈ 0 for g ∉ 𝒪_R). The detection probability function and the false alarm probability function are assumed to be $f_1(h_{i,k}) = K_1 e^{-K_2 (h_{i,k}-\underline{h})^2}$ and $f_2(h_{i,k}) = K_3 e^{K_4 (h_{i,k}-\underline{h})^2}$, respectively, where K_1, K_2, K_3, K_4 are positive parameters satisfying the conditions in Equations (9) and (10). We test the proposed target search method in two scenarios. In Scenario I, all targets appear at k = 0 and keep stationary during the whole searching process, i.e., V_Tar = 0 m/s. In Scenario II, we set V_Tar = 1 m/s to test the influence of target mobility on the convergence of probability maps. The four targets also appear at k = 0 and do not disappear during the search. In these two scenarios, we verify the effectiveness of the proposed target search method by deploying different numbers of UAVs and using different information decaying factors. The initial positions of the UAVs are randomly selected within the region [0, 5]³ m³. The partition {ℳ_{1,k}, …, ℳ_{N,k}} is generated by the Voronoi partition. The communication range is set as R_c = 20 m and the communication control protocol in [11] is applied for connectivity maintenance. A distributed K-connectivity maintenance algorithm has also been developed by the authors in [23], which can be applied in cooperative target search. Readers may refer to the references for more details on the communication protocols and maintenance algorithms. The cell size is fixed as 1 × 1 m². Other key parameters are set as K_u = 0.3, K_ϕ = 2, q = 0.1, p̂ = 0.99, q̂ = 0.01, h̄ = 10 m, h̲ = 5 m, α = 0, u_max = 2 m/s and T = 1 s.

Since the convergence of the individual probability map P_{i,g,k} of agent i implies that the weight ϕ_{i,k}(g) defined by Equation (24) approaches 0 for each cell, we define the following average weight to evaluate the convergence performance of the whole network:

$$
\bar{\phi}_k = \frac{1}{N M_R}\sum_{i=1}^{N}\sum_{g\in\mathcal{O}_R}\phi_{i,k}(g)
$$
where M_R denotes the total number of cells within the roads. It is easy to find that the initial value of $\bar{\phi}_k$ is $\bar{\phi}_0 = \frac{1}{N M_R}\sum_{i=1}^{N}\sum_{g\in\mathcal{O}_R} e^{-K_\phi\|Q_{i,g,0}\|} = 1$. In the simulations, we compare the results of $\bar{\phi}_k$ under different system parameters. The results are averaged over 200 Monte Carlo simulations.
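The metric in Equation (25) is straightforward to compute; the short sketch below illustrates it under the assumption that the accumulated information over the road cells is stored as a matrix, with K_ϕ = 2 as in the setup above.

```python
import numpy as np

def average_weight(Q_road, K_phi=2.0):
    """Equation (25): Q_road has shape (N agents, M_R road cells); returns phi_bar_k."""
    return np.exp(-K_phi * np.abs(Q_road)).mean()

print(average_weight(np.zeros((3, 100))))       # 1.0 before any measurement
print(average_weight(np.full((3, 100), 2.5)))   # close to 0 once cells are well resolved
```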

5.2. Simulation Results

Figure 2 shows an example of the convergence process of the individual probability maps in Scenario I with stationary targets, where the probabilities converge to 1 for the cells within which targets truly exist and to 0 for the cells within which no target exists. The snapshots of the UAVs in Scenario I are shown in Figure 3. Additionally, $\bar{\phi}_k$ finally converges to 0, and the more agents are deployed, the faster it converges, as shown in Figure 4a.

The convergence process of an individual probability map in Scenario II with mobile targets is shown in Figure 5, where the probabilities for the cells around the targets may not converge to 0 or 1 as in Scenario I, due to the random mobility of the targets. However, we can still infer that there are four targets on the roads and obtain a rough estimate of their positions from the envelopes of the final probability maps of the UAVs. The snapshots of the UAVs in Scenario II at the corresponding times are shown in Figure 6. In this case, $\bar{\phi}_k$ does not converge to 0, as shown in Figure 4b. However, a smaller $\bar{\phi}_k$ can be obtained with more agents deployed, since the collective sensing area becomes larger. Compared with the results in Scenario I, the number of deployed agents has a greater impact on the convergence performance of the probability maps in Scenario II with random target mobility. Hence, the algorithm is more robust with more UAVs deployed.

In addition, we test the impact of the information decaying factor on the convergence results. According to the simulation results (as shown in Figure 7), a larger decaying factor leads to larger average uncertainty about the target existence in the whole region, because the accumulated information for each cell decays faster. In fact, the purpose of the decaying factor is to let agents revisit each cell at a certain frequency to update the latest information about target existence in the cell. Therefore, the tradeoff is that a larger decaying factor leads to larger uncertainty, but makes the agents pay more attention to the cells with fewer observations. However, there is no quantitative rule for choosing the decaying factor, and users may find a proper one via simulation.

6. Conclusions

In this paper, we studied the three-dimensional vision-based cooperative control and information fusion for target search by a group of UAVs with limited sensing and communication capabilities. First, heuristic detection probability and false alarm probability models were built, which are related to the target discriminability of a camera and vary as a function of altitude. Then, we formulated the target search problem as a coverage optimization problem by balancing the coverage area and the detection performance. A generalized probability map updating model was proposed by considering the information decay and transmission due to environmental changes such as the target movement. The simulation results showed that the proposed algorithms can make the individual probability maps of all agents converge to the same map, which reflects the true environment when the targets are stationary. The influence of target mobility and of the number of deployed UAVs on the convergence of the probability maps has also been illustrated by simulation. Following this work, there is still large potential for future development and generalization of the proposed method. For example, the extension to detection by heterogeneous sensors is an interesting topic, since more types of information can be combined to improve the detection performance. More realistic environmental and system conditions that can affect the search results also need to be considered, such as the light intensity, occlusion of the line of sight, cameras with adjustable focus, asynchronous communication, etc.

Author Contributions

Jinwen Hu, Lihua Xie and Jun Xu conceived and designed the study. Jinwen Hu and Zhao Xu designed and implemented the simulation and method validation. Jinwen Hu wrote the paper. Lihua Xie and Jun Xu reviewed and edited the manuscript. All authors read and approved the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proof of Theorem 1

First, we consider Case 1, where a target is present within cell g. Define the following augmented notations for the entire system:

$$
\Upsilon_{g,k} \triangleq [\,Q_{1,g,k}, Q_{2,g,k}, \ldots, Q_{N,g,k}\,]^T,
\qquad
\Phi_{g,k} \triangleq [\,\upsilon_{1,g,k}, \upsilon_{2,g,k}, \ldots, \upsilon_{N,g,k}\,]^T
$$

Then, the updating rule (5) and (6) can be replaced by the following equation:

$$
\Upsilon_{g,k} = \prod_{t=1}^{k} W_t\,\Upsilon_{g,0} + \sum_{l=1}^{k}\prod_{t=l}^{k} W_t\,\Phi_{g,l}
$$

Hence,

$$
\mathbb{E}\Big[\sum_{i=1}^{N} Q_{i,g,k}\Big] = \sum_{i=1}^{N}\mathbb{E}[Q_{i,g,0}] + \sum_{t=1}^{k}\sum_{i=1}^{N}\sum_{j\in\mathcal{N}_{i,t}}\mathbb{E}[\upsilon_{j,g,t}]
$$
where
$$
\mathbb{E}[\upsilon_{j,g,t}] = \Big[\, p_{j,t}\ln\frac{q_{j,t}}{p_{j,t}} + (1-p_{j,t})\ln\frac{1-q_{j,t}}{1-p_{j,t}} \,\Big]\,\mathbf{1}_{\{g\in\mathbb{C}_{j,t}\}}
$$
and $\mathbf{1}_{\{g\in\mathbb{C}_{j,t}\}}$ is the indicator function defined as:
$$
\mathbf{1}_{\{g\in\mathbb{C}_{j,t}\}} = \begin{cases} 1 & \text{if } g\in\mathbb{C}_{j,t} \\ 0 & \text{otherwise} \end{cases}
$$

Since $Q_{i,g,0}$, $\upsilon_{j_1,g,\eta_1}$ and $\upsilon_{j_2,g,\eta_2}$ are independent for $j_1 \neq j_2$ or $\eta_1 \neq \eta_2$, we can get the variance

$$
D\Big[\sum_{i=1}^{N} Q_{i,g,k}\Big] = \sum_{i=1}^{N} D[Q_{i,g,0}] + \sum_{t=1}^{k}\sum_{i=1}^{N}\sum_{j\in\mathcal{N}_{i,t}} D[\upsilon_{j,g,t}]
$$
where
$$
D[\upsilon_{j,g,t}] = p_{j,t}(1-p_{j,t})\Big(\ln\frac{q_{j,t}}{p_{j,t}} - \ln\frac{1-q_{j,t}}{1-p_{j,t}}\Big)^2\,\mathbf{1}_{\{g\in\mathbb{C}_{j,t}\}}
$$

Considering that D[υ_{j,g,t}] is a continuous function of p_{j,t} and q_{j,t}, and that 1 > p_{j,t} ⩾ 0.5 + ε > 0.5 and 0 < q_{j,t} ⩽ 0.5 − ε < 0.5, there exists a constant real number σ such that D[υ_{j,g,t}] ⩽ σ² for g ∈ ℂ_{j,t}, which implies

$$
\lim_{m_{g,k}\to+\infty}\sum_{t=1}^{k}\sum_{i=1}^{N}\sum_{j\in\mathcal{N}_{i,t}}\frac{D[\upsilon_{j,g,t}]}{m_{g,t}^2} \leq \lim_{m_{g,k}\to+\infty}\sum_{l=1}^{m_{g,k}}\frac{N^2\sigma^2}{l^2} < \infty
$$

According to the Kolmogorov strong law of large numbers [24], we get

$$
\sum_{t=1}^{k}\sum_{i=1}^{N}\sum_{j\in\mathcal{N}_{i,t}}\frac{\upsilon_{j,g,t}-\mathbb{E}[\upsilon_{j,g,t}]}{m_{g,t}} \xrightarrow{a.s.} 0, \quad \text{as } m_{g,k}\to+\infty
$$

Hence,

$$
\frac{\sum_{i=1}^{N} Q_{i,g,k}}{m_{g,k}} = \frac{\sum_{i=1}^{N} Q_{i,g,0}}{m_{g,k}} + \sum_{t=1}^{k}\sum_{i=1}^{N}\sum_{j\in\mathcal{N}_{i,t}}\frac{\mathbb{E}[\upsilon_{j,g,t}]}{m_{g,t}} + \sum_{t=1}^{k}\sum_{i=1}^{N}\sum_{j\in\mathcal{N}_{i,t}}\frac{\upsilon_{j,g,t}-\mathbb{E}[\upsilon_{j,g,t}]}{m_{g,t}}
\xrightarrow{a.s.} \sum_{t=1}^{k}\sum_{i=1}^{N}\sum_{j\in\mathcal{N}_{i,t}}\frac{\mathbb{E}[\upsilon_{j,g,t}]}{m_{g,t}}, \quad \text{as } m_{g,k}\to+\infty
$$

On the other hand, since pj,t > qj,t, it is straightforward to get

$$
\frac{\partial\mathbb{E}[\upsilon_{j,g,t}]}{\partial p_{j,t}} = \ln\frac{q_{j,t}}{p_{j,t}} - \ln\frac{1-q_{j,t}}{1-p_{j,t}} < 0,
\qquad
\frac{\partial\mathbb{E}[\upsilon_{j,g,t}]}{\partial q_{j,t}} = \frac{p_{j,t}}{q_{j,t}} - \frac{1-p_{j,t}}{1-q_{j,t}} > 0
$$
which implies
$$
\mathbb{E}[\upsilon_{j,g,t}] \leq 2\varepsilon\ln\frac{0.5-\varepsilon}{0.5+\varepsilon} < 0
$$

Hence,

$$
\limsup_{m_{g,k}\to+\infty}\frac{\sum_{i=1}^{N} Q_{i,g,k}}{m_{g,k}} \leq \limsup_{m_{g,k}\to+\infty}\sum_{t=1}^{k}\sum_{i=1}^{N}\sum_{j\in\mathcal{N}_{i,t}}\frac{\mathbb{E}[\upsilon_{j,g,t}]}{m_{g,k}} \leq 2\varepsilon\ln\frac{0.5-\varepsilon}{0.5+\varepsilon} < 0
$$
which implies $\sum_{i=1}^{N} Q_{i,g,k} \xrightarrow{a.s.} -\infty$. Since the network is connected at all times, Q_{i,g,k} of each agent i almost surely converges to −∞ under the average consensus protocol (6) (as shown in [11,25]).

Following the same procedure of the proof above, we can prove the conclusion of Case 2.

Appendix B. Proof of Theorem 2

We first consider hi,k ∈ (h̲, h̄). From Equation (17), we get

$$
\mathcal{H}(\mu_k) = \sum_{i=1}^{N}\int_{\mathcal{M}_{i,k}}\phi_k(r)(p_{i,k}-q_{i,k})\,\mathbf{1}_{\{r\in\mathbb{C}_{i,k}\}}\,dr = \sum_{i=1}^{N}\int_{\mathcal{M}_{i,k}\cap\,\mathbb{C}_{i,k}}\phi_k(r)(p_{i,k}-q_{i,k})\,dr
$$

Defining the set of agents 𝒜_{i,k} = {j : ∂(ℳ_{j,k}) ∩ ℂ_{i,k} ≠ Ø} for agent i, it follows that

$$
\frac{\partial\mathcal{H}(\mu_k)}{\partial\mu_{i,k}} = \int_{S_i^1}\phi_k(r)\Big(\frac{\partial p_{i,k}}{\partial\mu_{i,k}} - \frac{\partial q_{i,k}}{\partial\mu_{i,k}}\Big)dr + \int_{S_i^2}\phi_k(r)(p_{i,k}-q_{i,k})\frac{\partial r^T}{\partial\mu_{i,k}}\,n_{S_i^2}(r)\,dr + \sum_{j\in\mathcal{A}_{i,k}}\int_{S_{i,j}^3}\phi_k(r)(p_{i,k}-q_{i,k})\frac{\partial r^T}{\partial\mu_{i,k}}\,n_{S_{i,j}^3}(r)\,dr + \sum_{j\in\mathcal{A}_{i,k}}\int_{S_{i,j}^4}\phi_k(r)(p_{i,k}-q_{i,k})\frac{\partial r^T}{\partial\mu_{i,k}}\,n_{S_{i,j}^4}(r)\,dr
$$
where $S_i^1 \triangleq \mathcal{M}_{i,k}\cap\mathbb{C}_{i,k}$, $S_i^2 \triangleq \mathcal{M}_{i,k}\cap\partial(\mathbb{C}_{i,k})$, $S_{i,j}^3 \triangleq (\partial(\mathcal{M}_{j,k})\setminus\partial(\mathcal{O}))\cap\mathbb{C}_{i,k}$ and $S_{i,j}^4 \triangleq (\partial(\mathcal{M}_{j,k})\cap\partial(\mathcal{O}))\cap\mathbb{C}_{i,k}$. For $r \in S_{i,j_1}^3 \cap S_{i,j_2}^3$ ($j_1, j_2 \in \mathcal{A}_{i,k}$ and $j_1 \neq j_2$), we have $n_{S_{i,j_1}^3}(r) = -n_{S_{i,j_2}^3}(r)$. For $r \in S_{i,j}^4$ ($j \in \mathcal{A}_{i,k}$), we have $\frac{\partial r}{\partial\mu_{i,k}} = 0$. Hence,
$$
\frac{\partial\mathcal{H}(\mu_k)}{\partial\mu_{i,k}} = \int_{S_i^1}\phi_k(r)\Big(\frac{\partial p_{i,k}}{\partial\mu_{i,k}} - \frac{\partial q_{i,k}}{\partial\mu_{i,k}}\Big)dr + \int_{S_i^2}\phi_k(r)(p_{i,k}-q_{i,k})\frac{\partial r^T}{\partial\mu_{i,k}}\,n_{S_i^2}(r)\,dr
$$

From Equation (9) and according to the results of [15], we get

$$
\frac{\partial p_{i,k}}{\partial c_{i,k}} = 0, \qquad \frac{\partial p_{i,k}}{\partial h_{i,k}} = f_1'(h_{i,k}), \qquad \frac{\partial q_{i,k}}{\partial h_{i,k}} = f_2'(h_{i,k})
$$
and
$$
\frac{\partial r^T}{\partial c_{i,k}}\,n_{S_i^2}(r) = n_{S_i^2}(r), \qquad \frac{\partial r^T}{\partial h_{i,k}}\,n_{S_i^2}(r) = \tan\varphi
$$
where φ is half of the angle width of the field of view of each agent (as shown in Figure 1b). Therefore, Equation (18) holds.

According to Equations (9) and (10), p_{i,k} = p̂ and q_{i,k} = q̂ for h_{i,k} ∈ (0, h̲]. It is straightforward to get Equation (19) following the same procedure of proof as above.

References

  1. Garzón, M.; Valente, J.; Zapata, D.; Barrientos, A. An Aerial-Ground Robotic System for Navigation and Obstacle Mapping in Large Outdoor Areas. Sensors 2013, 13, 1247–1267. [Google Scholar]
  2. Heredia, G.; Caballero, F.; Maza, I.; Merino, L.; Viguria, A.; Ollero, A. Multi-Unmanned Aerial Vehicle (UAV) Cooperative Fault Detection Employing Differential Global Positioning (DGPS), Inertial and Vision Sensors. Sensors 2009, 9, 7566–7579. [Google Scholar]
  3. Kocur, D.; Švecová, M.; Rovňáková, J. Through-the-Wall Localization of a Moving Target by Two Independent Ultra Wideband (UWB) Radar Systems. Sensors 2013, 13, 11969–11997. [Google Scholar]
  4. Xu, Y.; Xu, H.; An, W.; Xu, D. FISST Based Method for Multi-Target Tracking in the Image Plane of Optical Sensors. Sensors 2012, 12, 2920–2934. [Google Scholar]
  5. Bertuccelli, L.; How, J. Robust UAV search for environments with imprecise probability maps. Proceedings of the 44th IEEE Conference on Decision and Control, Seville, Spain, 12–15 December 2005; pp. 5680–5685.
  6. Yang, Y.; Minai, A.; Polycarpou, M. Decentralized cooperative search by networked UAVs in an uncertain environment. Proceedings of the American Control Conference, Boston, MA, USA, 30 June–2 July 2004; Volume 6, pp. 5558–5563.
  7. Yang, Y.; Polycarpou, M.; Minai, A. Multi-UAV cooperative search using an opportunistic learning method. J. Dyn. Syst. Meas. Control 2007, 129, 716–728. [Google Scholar]
  8. Zhong, M.; Cassandras, C. Distributed coverage control and data collection with mobile sensor networks. Proceedings of the IEEE Conference on Decision and Control, Atlanta, GA, USA, 15–17 December 2010; pp. 5604–5609.
  9. Gan, S.K.; Sukkarieh, S. Multi-UAV target search using explicit decentralized gradient-based negotiation. Proceedings of the IEEE International Conference on Robotics and Automation, Shanghai, China, 9–13 May 2011; pp. 751–756.
  10. Millet, P.; Casbeer, D.; Mercker, T.; Bishop, J. Multi-Agent Decentralized Search of a Probability Map with Communication Constraints. Proceedings of the AIAA Guidance, Navigation, and Control Conference, Toronto, ON, Canada, 2–5 August 2010.
  11. Hu, J.; Xie, L.; Lum, K.Y.; Xu, J. Multiagent Information Fusion and Cooperative Control in Target Search. IEEE Trans. Control Syst. Technol. 2013, 21, 1223–1235. [Google Scholar]
  12. Arai, T.; Pagello, E.; Parker, L.E. Editorial: Advances in multi-robot systems. IEEE Trans. Robot. Autom. 2002, 18, 655–661. [Google Scholar]
  13. Cortes, J.; Martinez, S.; Karatas, T.; Bullo, F. Coverage control for mobile sensing networks. IEEE Trans. Robot. Autom. 2004, 20, 243–255. [Google Scholar]
  14. Schwager, M.; Rus, D.; Slotine, J. Decentralized Adaptive Coverage Control for Networked Robots. Int. J. Robot. Res. 2009, 28, 357–375. [Google Scholar]
  15. Schwager, M.; Julian, B.; Angermann, M.; Rus, D. Eyes in the Sky: Decentralized Control for the Deployment of Robotic Camera Networks. Proc. IEEE 2011, 99, 1541–1561. [Google Scholar]
  16. Wang, Y.; Hussein, I. Awareness coverage control over large-scale domains with intermittent communications. IEEE Trans. Autom. Control 2010, 55, 1850–1859. [Google Scholar]
  17. Horn, R.; Johnson, C. Matrix Analysis; Cambridge University Press: Cambridge, UK, 1990. [Google Scholar]
  18. Lu, J.; Chen, G. A time-varying complex dynamical network model and its controlled synchronization criteria. IEEE Trans. Autom. Control 2005, 50, 841–846. [Google Scholar]
  19. Solis, R.; Borkar, V.S.; Kumar, P. A new distributed time synchronization protocol for multihop wireless networks. Proceedings of the 45th IEEE Conference on Decision and Control, San Diego, CA, USA, 13–15 December 2006; pp. 2734–2739.
  20. Kopetz, H.; Ochsenreiter, W. Clock synchronization in distributed real-time systems. IEEE Trans. Comput. 1987, 100, 933–940. [Google Scholar]
  21. Elson, J.E. Time Synchronization in Wireless Sensor Networks. Ph.D. Thesis, University of California Los Angeles, Los Angeles, CA, USA, 2003. [Google Scholar]
  22. Hansen, S.; McLain, T.; Goodrich, M. Probabilistic searching using a small unmanned aerial vehicle. Proceedings of AIAA Infotech@Aerospace, Rohnert Park, CA, USA, 7–10 May 2007; pp. 7–10.
  23. Hu, J.; Xu, Z. Distributed cooperative control for deployment and task allocation of unmanned aerial vehicle networks. IET Control Theory Appl. 2013, 7, 1574–1582. [Google Scholar]
  24. Ash, R.; Doleans-Dade, C. Probability and Measure Theory; Academic Press: San Diego, CA, USA, 2000. [Google Scholar]
  25. Xiao, L.; Boyd, S.; Lall, S. A scheme for robust distributed sensor fusion based on average consensus. Proceedings of the 4th International Symposium on Information Processing in Sensor Networks, Los Angeles, CA, USA, 25–27 April 2005; pp. 63–70.
Figure 1. Target search by multiple UAVs. (a) A network of UAVs; (b) Target image taken by an airborne camera.
Figure 2. The convergence of the probability map of an agent in Scenario I. (a) k = 0 s; (b) k = 10 s; (c) k = 30 s; (d) k = 50 s; (e) k = 70 s; (f) k = 90 s.
Figure 3. Snapshots of UAVs in Scenario I. (a) k = 0 s; (b) k = 10 s; (c) k = 30 s; (d) k = 50 s; (e) k = 70 s; (f) k = 90 s.
Figure 4. Weight average $\bar{\phi}_k$ for different numbers of agents. (a) Scenario I; (b) Scenario II.
Figure 5. The convergence of the probability map of an agent in Scenario II. (a) k = 0 s; (b) k = 10 s; (c) k = 30 s; (d) k = 50 s; (e) k = 70 s; (f) k = 90 s.
Figure 6. Snapshots of UAVs in Scenario II. (a) k = 0 s; (b) k = 10 s; (c) k = 30 s; (d) k = 50 s; (e) k = 70 s; (f) k = 90 s.
Figure 7. Weight average $\bar{\phi}_k$ for different information decaying factors. (a) Scenario I; (b) Scenario II.
