
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license.

This article considers a sensor management problem where a number of road-bounded vehicles are monitored by an unmanned aerial vehicle (UAV) with a gimballed vision sensor. The problem is to keep track of all discovered targets and simultaneously search for new targets by controlling the pointing direction of the vision sensor and the motion of the UAV. A planner based on a state-machine is proposed with three different modes: target tracking, known target search, and new target search. A high-level decision maker chooses among these sub-tasks to obtain an overall situational awareness. A utility measure for evaluating the combined search and target tracking performance is also proposed. This measure makes it possible to evaluate and compare, within the same framework, the rewards of updating known targets versus searching for new targets. The targets are assumed to be road-bounded and the road network information is used to improve both the tracking and the sensor management performance. The tracking and search are based on flexible target density representations provided by particle mixtures and deterministic grids.

Limited sensor resources are a bottleneck for most surveillance systems. It is rarely possible to fulfill the requirements of large area coverage and high resolution sensor data at the same time. This article considers a surveillance scenario where an unmanned aerial vehicle (UAV) with a gimballed infrared/vision sensor monitors a certain area. The field-of-view of the camera is very narrow, so just a small part of the scene can be surveyed at a given moment. The problem is to keep track of all discovered targets, and simultaneously search for new targets, by controlling the pointing direction of the camera and the motion of the UAV. The motion of the targets (e.g., cars) is constrained by a road network which is assumed to be prior information. The tracking and sensor management modules presented in this article are essential parts of (semi-) autonomous surveillance systems corresponding to the UAV framework presented in [

There are many possible approaches to the sensor management problem, but in this work a state-machine planner is proposed with three major modes:

search new target

follow target

locate known target.

The work is based on Bayesian estimation and search methods,

An alternative and interesting setup is when a second, possibly fixed, wide-angle camera provides points of interest for the tele-angle camera to investigate. Such a setup would probably lead to better performance, but the planning problem would still be similar to the one considered in this article, and therefore we think that our problem remains very relevant.

The main contributions of this study are:

Search theory and multi-target tracking are large, but separate, research areas. The problem where search and target tracking are combined has received much less attention, especially in vision-based applications. One of the objectives of this study is to fill this gap.

There are few studies that examine both high-level and low-level aspects of tracking, planning and searching. In this paper, the adopted state-machine framework uses sub-blocks that achieve low-level tracking, planning and searching tasks, while a high-level decision maker chooses among these sub-tasks to obtain an overall situational awareness.

A useful utility measure for the combined search and target tracking performance is proposed.

Road networks have been widely used to improve the tracking performance of road-bound targets, but in this study we also utilize the road network for improved search performance.

We investigate the effect of the multi-target assumption on the search and show that it results in the same planning scheme as in the single target case.

There are few studies where both PFs and PMFs are examined. We show how each algorithm is utilized and discuss their respective merits and disadvantages together with the appropriate applications.

A common assumption, utilized also in this work, is that the system can be modeled by a first-order Markov process,

One of the first algorithms for an exact solution to POMDP was given by Sondik [

In receding horizon control (RHC) only a finite planning horizon is considered. The extreme case is myopic planning where the next action is based only on the immediate consequence of that action. A related approach is roll-out where an optimal solution scheme is used for a limited time horizon and a base policy is applied beyond that time point. The base policy is suboptimal, but should be easy to compute. He and Chong [

One suboptimal approach is to approximate the original problem with a new problem where some theoretical results in planning and control can be applied, making the problem simpler to solve. One example in the sensor management context is when multi-target tracking planning problems are treated as a multi-armed bandit (MAB) problem (see [

The elements of a basic search problem are: (a) a prior probability distribution of the search object location; (b) a detection function relating the search effort and the probability of detecting the object given that the object is in the scanned area; (c) a constrained amount of search effort; and finally (d) an optimization criterion representing the probability of success. There are two common criteria that are used in search strategy optimization. One criterion is the probability of finding the target in a given time interval and the other criterion is the expected time to find the target.

The classical search theory, as developed by Koopman

Classical

The sensor management problem considered in this work is presented in Section 2. A utility measure for evaluating the combined search and target tracking performance is described and a basic example to illustrate its advantages is given. A planner framework based on a state-machine is proposed to handle the complex planning problem and an overview of this planner is also given in Section 2. In Section 3 some fundamental concepts of estimation and search are given and these results are used in later sections. Sections 4 and 5 describe the particle filter (PF) and the point mass filter (PMF), respectively, and how they can be used in search applications. Section 6 returns to the state-machine planner and more detailed descriptions of the planning modes are given together with a presentation of the high-level planner. Section 6 ends with a simulation example where the whole planning framework is applied. Finally in Section 7 some conclusions are drawn. The Appendices collect detailed descriptions of the system model.

The planning problem in this study is to search for road targets and keep track of discovered targets by controlling the pointing direction of a gimballed vision sensor and the UAV trajectory. The vision sensor has a limited field-of-view (FOV) which makes it unlikely that more than one target can be observed at the same time. One tracking filter is used for each detected target since the targets are assumed to be uncorrelated. The association problem is also ignored, since perfect discrimination is assumed. The surveillance area used in the simulations in this article is shown in

In this section, first the overall objective function is presented in order to evaluate different approaches to the planning problem, and then the general planning problem is described. The section ends with an overview of the planner that is proposed to solve the problem.

The objective function outputs a scalar such that different plans can be compared,

Consider the uncertainty measure tr P^j, i.e., the trace of the estimated covariance of target j. This measure grows without bound when target j is not observed, whereas the information measure (tr P^j)^{−1} does not have that problem. On the other hand, the information value decreases quickly when the target is not visible and this rewards very conservative behavior where the search for new targets is not encouraged.

To summarize, we want a reward function with the following properties:

It increases with each observation.

It does not decrease without bound for unobserved or lost targets.

It should favor strategies that find many targets.

It should give a fair compromise when splitting the observation resources among the different targets.

We propose the following reward function:
R_t = Σ_j e^{−α tr P^j_t}, where each term e^{−α tr P^j_t} is a monotonically decreasing function of the uncertainty tr P^j_t of target j.
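As a small illustration (the value of α and the Python names are our own, not from the article's implementation), the proposed per-target reward is bounded in (0, 1], so a lost target cannot be punished without limit, while every discovered target adds a strictly positive term:

```python
import numpy as np

def target_reward(P, alpha=0.1):
    """Reward exp(-alpha * tr(P)) for one target with covariance P.

    Bounded in (0, 1]: it tends to 0, not minus infinity, as the
    uncertainty grows, and approaches 1 for a well-observed target.
    """
    return np.exp(-alpha * np.trace(np.atleast_2d(P)))

def total_reward(covariances, alpha=0.1):
    # Sum over all discovered targets: finding a new target always adds
    # a positive term, so strategies that find many targets are favored.
    return sum(target_reward(P, alpha) for P in covariances)
```

Observing a target shrinks tr P and thus increases its term, which matches the first property in the list above.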

The system is described by the state-space model x_{t+1} = f(x_t, u_t, w_t), y_t = h(x_t, u_t, e_t), with state x_t, control input u_t, process noise w_t, measurement y_t, and measurement noise e_t.

The performance is in this work computed as the reward accumulated over time, where the reward at time t is based on the xy-position covariance P_{xy,t} of each target.

Assume, without loss of generality, that the planning is performed at time t = 0. The control signal is u_t = (u^φ_t, u^θ_t)^⊤, with components controlling the azimuth (pan) and elevation (tilt) angles of the gimballed sensor.

The goal of the planning is to find a control law that maps the available information at time t to a control signal u_t such that the expected future reward is maximized.

The planning problem is an infinite-horizon stochastic control problem and it is too complex to be solved exactly by standard dynamic programming techniques.

A state-machine approach is instead proposed to simplify the planning problem. The state-machine contains three major states, representing three different planning modes, see

The tracking Mode 1 is quite straightforward to develop, and the task is to make sure the target is centered in the sensor frame. Modes 2 and 3 and the high-level planner are the main topics of this article. First some basics of estimation and search are presented in Section 3, then the particle filter used for target density estimation is presented in Section 4. The approach used in Mode 1 is described in Section 5. The whole state-machine is considered again in Section 6, where the high-level planner is also described in detail. Some simulation examples are presented along the way to the final results in Section 6.6.

In this section, the search objective function, which will be the basis of all optimization objective functions used throughout this work, is derived. First we present the Bayesian recursions which will serve as a starting point for the calculation of the objective function. Then the derivation of the search criterion is presented.

The aim of this section is to introduce the recursive state estimation theory. The target state x_t evolves according to the Markov transition density p(x_{t+1}|x_t) and, at each time step, a measurement y_t is obtained. The measurement history is denoted y_{1:t} = {y_1, y_2, ..., y_t}, and the aim of the filtering is to estimate the posterior density of x_t given y_{1:t}.

The above equations represent the so-called Bayesian filter and there are only a few cases where it is possible to derive analytical solutions for them. One case is the linear Gaussian case, leading to the well-known Kalman filter (KF). In the general case, numerical approximations are necessary and one popular technique is to approximate the target density p(x_t|y_{1:t}) with a particle mixture as in the particle filter (PF), or with a point grid as in the point mass filter (PMF).
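For a finite state space the two recursion steps reduce to matrix-vector operations; a minimal sketch (the toy transition matrix and likelihood vector below are our own choices):

```python
import numpy as np

def bayes_predict(p, A):
    """Time update: p(x_{t+1}|y_{1:t}) = sum_j p(x_{t+1}|x_t=j) p(x_t=j|y_{1:t})."""
    return A @ p  # A[i, j] = transition probability from state j to state i

def bayes_update(p, lik):
    """Measurement update: p(x_t|y_{1:t}) is proportional to p(y_t|x_t) p(x_t|y_{1:t-1})."""
    post = lik * p
    return post / post.sum()  # normalization by p(y_t | y_{1:t-1})
```

The PF and PMF described later approximate exactly these two steps when an analytical solution is unavailable.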

As in the estimation theory introduction, assume that the target state x_t evolves according to a Markov model, and let P_D(x_t, u_t) denote the probability of detecting a target in state x_t given the sensor control u_t. The probability of no detection at time t, given no detections up to time t−1, i.e., given y_{0:t−1}, is found by the marginalization
over the prediction density p(x_t|y_{0:t−1}) from the Bayesian filter. Define

In this work the cumulative probability of no detection Λ_{0:t} will be used as the cost function. Note that Λ_{0:t} is non-decreasing in t.

Now consider the case when _{i}

A common choice to represent the target probability density in search applications is to use a grid representation of the world where the target might be [

In a particle filter [Arulampalam et al.], the target density p(x_t|y_{1:t}) is approximated by a particle mixture, where the particles are propagated according to the motion model p(x_{t+1}|x_t). The filter recursion is then evaluated approximately by importance weighting and resampling.

The search objective function based on the cumulative probability of non-detection can be evaluated with the particle representation as follows:

Let the particle set and weights represent the current target density.

Simulate the evolution of the particles according to the target model.

Compute the probability of detection, P_D, for each particle given the candidate sensor action u_t.

Set each particle weight proportional to its probability of non-detection, 1 − P_D.

Set the cost function value as the cumulative probability of non-detection.

When running this algorithm several times, e.g., in an optimization routine, it is possible to pre-compute the particle trajectories of the target to save computational time since the target is independent of the actions of the searcher.
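The steps above can be sketched as follows, where `motion_step` and `p_detect` are hypothetical callables standing in for the target motion model and the sensor detection model of a candidate plan:

```python
import numpy as np

def search_cost(particles, weights, controls, motion_step, p_detect):
    """Cumulative probability of no detection for a candidate control sequence."""
    w = weights.copy()
    cum_nd = 1.0
    for u in controls:
        particles = motion_step(particles)      # simulate target motion
        pd = p_detect(particles, u)             # P_D for each particle under action u
        q = float(np.sum(w * (1.0 - pd)))       # P(no detection now | none so far)
        cum_nd *= q                             # cumulative no-detection probability
        w = w * (1.0 - pd) / q                  # reweight surviving hypotheses
    return cum_nd
```

Since the particle trajectories do not depend on the searcher's actions, the `motion_step` outputs can be precomputed once and reused across candidate plans, as noted above.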

Three snapshots of the simulation are shown in

In this section, we are going to investigate the search problem under the assumption that multiple targets exist in the search region. First, we generalize the single target search case, based on the cumulative probability of no detection, to the case of multiple targets, where it will be shown that the assumption of multiple targets results in the same search plan as the single target case. Based on this result, we propose a model for representing the target probability density of undiscovered targets. The goal of this filter is not to track the state of a discovered target; instead the filter is used to model which areas have been searched. Since the density will be rather flat on the whole road network, the point-mass filter (PMF) will be used, i.e., a grid is defined on the road network and the corresponding weights of the grid points represent the probability of at least one target in that location. This section ends with a simulation showing a target search example.

We here consider the generalization of the methodology in the previous sections to the case that multiple targets exist in the search region. We assume the availability of a discrete distribution over the number of targets, called the cardinality distribution and denoted p_T.


In probabilistic terms, the multi-target probability of no detection is the probability generating function of the cardinality distribution p_T evaluated at the single-target probability of no detection.

By Theorem 1, the search problem for the multiple target case becomes obtaining the search plan that minimizes the single-target cumulative probability of no detection; hence the cardinality distribution p_T does not change the optimal search plan.

In a point-mass filter (PMF), a.k.a. grid method, the target probability density is approximated with a number of grid points. This grid representation in a PMF is actually similar to the particle representation in the PF, but the grid points in the PMF are stationary and defined deterministically while the particles in the PF are stochastic and non-stationary. In the PMF, the target density p(x_t|y_{1:t}) is discretized over the grid and, for each grid point, a weight w^{(j)} is computed such that the weights approximate the probability mass in the corresponding grid cells.

The advantage of the PF approach compared to this grid filter approach is that the resolution adapts automatically so that high probability regions have higher resolution. Conversely, the advantage of a grid approach is that it has better support in low probability areas compared to a particle mixture, which needs a very high number of particles for flat target densities. The particle filter is more flexible in terms of target motion models and, even though the PF has problems with high-dimensional state-spaces, it is in general a better choice than the grid approaches for problems with a state-space dimension of 3 or higher. However, for 2D problems the PMF is a nice filter that is straightforward to implement and that can handle multi-modal probability densities. There exist methods for adapting the grid resolution and its spatial spread as an approach to handle larger state-spaces [
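A minimal 1D sketch of the PMF recursion, assuming a random-walk motion kernel and a per-cell detection probability (both illustrative choices, not the article's exact models):

```python
import numpy as np

def pmf_predict(w, kernel):
    """Prediction: convolve the grid weights with a random-walk kernel."""
    w = np.convolve(w, kernel, mode="same")
    return w / w.sum()

def pmf_no_detection_update(w, p_d):
    """Update for 'looked but saw nothing': scale by miss probability, renormalize."""
    w = w * (1.0 - p_d)
    return w / w.sum()
```

Probability mass drains from the cells inside the sensor footprint (where p_d > 0) and, through the normalization, accumulates in the unsearched cells.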

Consider a random walk model and an observation model with additive noise. For each grid point x^{(i)}, the discretized density p(x_t|y_{1:t}) is represented by a weight w_t^{(i)} proportional to p(x_t^{(i)}|y_{1:t}). In the non-detection case the filter update step multiplies each weight by the probability of non-detection at the corresponding grid point, followed by a normalization.


In Sections 3–5 some tools are described that will be used here to fill the proposed state-machine planner in Section 2.3 with suitable contents. Now recall the modes described in Section 2.3:

A high-level planner decides which mode will be active (Section 6.4). The outcome of all modes is an aiming reference point and this point is also used by the UAV path planner (Section 6.5).

In this search mode the roads are explored to discover new unknown targets and the grid approach presented in Section 5 is used. The search will stop once a target is detected and a mode transition to the

When the

When a previously discovered target has not been observed for some time, its position uncertainty has grown. To keep the uncertainty at an acceptable level, the target must be rediscovered on a regular basis. However, the larger the uncertainty, the longer the expected search time to find the target again. If the uncertainty is above a certain threshold, the target is considered lost. When the high-level planner decides to re-discover a known target, a position on the border of the particle cloud is selected and then the particle cloud is swept over. If the particle cloud is spread over several roads, the most probable road is selected.

The high-level planner decides which mode is active. Basically, the task is to decide if any known target needs to be updated and re-discovered, or if there is time to search for new targets. There is an obvious conflict in this problem. When one target is tracked, the uncertainty of the other targets grows, and with increasing uncertainty the expected search time for re-discovery also increases. At some point the uncertainty is so large that the target can be considered lost. At every time step (except when in the

The proposed high-level planner relies on a stochastic scheduling result in [

Theorem 2: Assume that the time T^j to complete job j is exponentially distributed with rate μ^j, i.e., P(T^j > t) = e^{−μ^j t}, and that a reward R^j is obtained when job j is completed. Then the expected accumulated reward is maximized by performing the jobs in decreasing order of the index μ^j R^j, j = 1, ..., N.

Proof:

Our problem can also be considered a stochastic scheduling problem where the jobs are to re-discover known targets or to search for new ones. However, the time to detection is not exponentially distributed, but the exponential is a reasonable approximation. Actually, Koopman in the early days of search theory addressed a search problem with a patrol aircraft searching for a ship in open sea, and he showed that under random search of an area A, the probability of detection after sweeping an area a is 1 − e^{−a/A}, i.e., the detection time is approximately exponentially distributed.
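The ordering rule of Theorem 2 itself is a one-liner; the sketch below (with illustrative job tuples of our own) sorts the jobs by decreasing index μ^j R^j:

```python
def schedule(jobs):
    """jobs: iterable of (name, mu, reward); return names sorted by decreasing mu * reward."""
    return [name for name, mu, reward in sorted(jobs, key=lambda job: -job[1] * job[2])]
```

Note that a slow job (small μ) can still be scheduled first if its reward is large enough, which is exactly the trade-off the high-level planner exploits.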

A more significant approximation is that the scheduling result above assumes a static setup, but our problem is dynamic since target uncertainties change (in general, grow) as time goes by. Thus, a decision must be made about which time step the scheduling will be based on; the most obvious choice is the current time.

It is important to be aware of these approximations when we apply Theorem 2 to our planning problem. First consider a discovered target whose current position uncertainty determines the expected time to rediscover it.

The reward of rediscovering target j is given by the utility measure based on the position covariance P_xy, and the corresponding index is μ^j R^j, where μ^j is the inverse of the expected search time. The design parameter λ_p tunes this trade-off against searching for new targets.

The reward for discovering a new target is R^0 = 1, and the expected (inverted) time for discovering a new target, μ^0, is estimated from the current search density.

The high-level planner is summarized in Algorithm 1.


The planner is suboptimal in the sense that it evaluates the current state and assumes open loop control over the planning horizon. Moreover, it assumes that all targets will be visited just once. The problem dynamics and the utilization of new measurements are handled by replanning on a regular basis in a RHC manner. One way to improve the high-level planner is to develop a roll-out approach where the proposed planner serves as the base policy.
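Under these approximations, the index comparison behind Algorithm 1 can be sketched as follows. The function names are ours, and placing the design parameter λ_p as a scale factor on the new-target search index is an illustrative assumption, not necessarily the paper's exact formulation:

```python
def choose_task(known_targets, mu_new, lam_p):
    """Pick the alternative with the largest index mu * R.

    known_targets: list of (target_id, mu_j, R_j), where mu_j is the inverse
    expected time to rediscover target j and R_j its rediscovery reward.
    The new-target search alternative has reward R^0 = 1, scaled by lam_p.
    """
    best_task, best_index = "search_new", lam_p * mu_new * 1.0
    for target_id, mu_j, reward_j in known_targets:
        if mu_j * reward_j > best_index:
            best_task, best_index = ("rediscover", target_id), mu_j * reward_j
    return best_task
```

Because each alternative's value is computed independently, this is an index policy in the MAB sense: the decision cost grows only linearly with the number of targets.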

The assumptions of the proposed method have some similarities with the assumptions made in a multi-armed bandit (MAB) problem. Both approaches assume independent targets and that only one single target is processed at a time. Moreover, both approaches assume that the targets are not affected by the action of the sensor platform and both simplify the target dynamics. As mentioned in Section 1.1, the main advantage of a MAB formulation is that the optimal policy is of index type. Algorithm 1 also has this property, since values are computed independently for all alternatives and then the best alternative is given by the largest value. This results in a planner that is cheap to compute yet very efficient, as will be seen in the simulation in Section 6.6. The approximations that are made in the current version of the planner could certainly be improved, but this is left for future work. In particular, alternatives where the target uncertainties are predicted should be examined instead of using the uncertainty at the current time.

The UAV path planner has already been introduced in Example 4. The UAV model is the nearly constant speed model in

Three simulations are here presented to illustrate the qualitative behavior of the proposed planning method in a multi-target tracking scenario with five moving road targets in an urban area. All targets are initially unknown to the sensor platform. A snapshot from one simulation is shown in the corresponding figure. The simulations use three different values of the design parameter λ_p, among them the extreme value λ_p = 10^{−6}.

In

Case (b), λ_p = 10^{−6}, is an extreme case where the planner just focuses on finding new targets. All five targets are actually discovered in the scenario, but since no explicit attempts to rediscover the known targets are made, the position uncertainty of the discovered targets grows quickly. Case (c) represents the opposite weighting, where updating the known targets is prioritized.

The simulation results are evaluated by the reward function. In the column corresponding to λ_p = 10^{−6}, the reward function is basically the total number of discovered targets, while the other columns correspond to the larger values of λ_p.

The occlusion problem is handled surprisingly well most of the time, but future work should investigate ways of including the occlusion model explicitly in the UAV path planner and the camera pointing planners.

The conclusion is that the proposed planner solves the general problem well and that the design parameter λ_p gives an intuitive tuning of the trade-off between searching for new targets and tracking the known ones.

The framework has also been evaluated in experiments with an Axis 233D pan-tilt-zoom network camera and a number of tape-following mobile robots.

We got many valuable experiences from the experiments. The well-known saying

It is also very important to have suitable system models. The model choice is often a compromise: the model cannot be too complex for computational reasons and it cannot be too simplified since the results would then be irrelevant. In particular, the detection probability model is a main component in this study. It is tempting to ignore effects like motion blur and atmospheric disturbances in the simulations, but for a successful real-world application it is important to be aware of such phenomena. In general, it is extremely difficult to obtain a model of the absolute detection performance for vision sensors (see [

This article considers a sensor management problem where a number of road-bounded vehicles are monitored by an unmanned aerial vehicle (UAV) with a gimballed vision sensor. The problem is to keep track of all discovered targets, and simultaneously search for new targets, by controlling the pointing direction of the camera and the motion of the UAV. The motion of the targets (e.g., cars) is constrained by a road network which is assumed to be prior information.

One key to a successful planning result is an appropriate objective function such that different scenarios can be evaluated in a unified framework. Problems with some common uncertainty and information measures are discussed. A suitable utility measure is proposed that makes it possible to compare the tasks of updating a known target and searching for new targets. A general infinite-horizon planning problem is formulated, but it is too complex to solve by standard dynamic programming techniques. A sub-optimal state-machine framework is instead proposed to solve an approximation of the problem. The planner contains three different modes that perform low-level tracking and search tasks while a higher-level decision maker chooses among these modes.

Usually, tracking and search are treated as separate problems. It is not straightforward to develop a method that can compare and evaluate search and tracking subtasks within the same framework. The problem involves the classical planning trade-off between exploration and exploitation. The proposed high-level planner in this work is able to compare different tracking and searching subtasks within the same framework thanks to the proposed utility measure and a theorem from the stochastic scheduling literature. Some approximations are needed to fit the conditions of the theorem, e.g., the detection time is assumed to be exponentially distributed, the dynamics of the sensor platform and targets are simplified, and the targets are assumed to be independent. The trade-off between updating known targets and searching for new ones can easily be tuned by a scalar design parameter.

Road networks have been used both to improve the tracking performance of the road-bound targets and to improve the sensor management performance. A particle filter is used to estimate the state of the known targets and a grid based filter is used to compute the probability density of the unknown targets. The effect of the multi-target assumption during searching is investigated and it is shown that the same planning scheme can be used as in the single target case. The reason to use a grid representation for unknown targets is that the support is better in low probability areas compared to the particle representation. The drawback is that a simplified motion model must be used to decrease the state dimensionality. In the search modes the cumulative probability of detection is used as the objective function, both for known and unknown targets. The detection likelihood model is based on the pointing direction (limited field-of-view), occlusion due to buildings, and a qualitative range dependent model of the sensor performance and the atmospheric disturbances.

Although the proposed planning framework can handle complex aerial surveillance missions in an efficient way, there is room for future improvements. For instance, to fit the scheduling theorem used in the high-level planner, the subtasks are assumed to be static and independent, but in reality they are not. Nevertheless, the current planning method is very useful, but these approximations should be investigated more to achieve even better performance. Visibility aspects, like occlusion due to buildings, are included to some extent, but this should also be improved.

This work has been supported by the Swedish Research Council under the Linnaeus Center CADICS and the frame project grant Extended Target Tracking (621-2010-4301). The work is part of the graduate school Forum Securitatis in Security Link. The experimental setup is part of the FOI project “Signalbehandling för styrbara sensorsystem” (signal processing for controllable sensor systems) funded by FM. The authors are thankful to Erik Eriksson Ianke and Erik Hagfalk for their work with the experimental setup [

In this appendix the details of the system models are presented.

A simple but useful dynamic model of a fixed-wing UAV is the nearly constant speed model, with a state vector containing the position, the heading, and the speed v in the xy-plane.

The sensor gimbal is a mechanical device, typically with two actuated axes for panning and tilting the sensor. The azimuth (pan) and elevation (tilt) angles are denoted φ and θ, and are driven by the control signals u^φ and u^θ, which are limited in magnitude by u^φ_max and u^θ_max.

The road network information I_RN is composed of a number of roads. Each road is a 3D continuous curve whose both ends are connected to other roads in intersections. The target is assumed to be on one of the roads at all times. Which road a target currently travels on is described by a mode parameter r, and the target state is expressed in road coordinates (x^r, v^r)^⊤, where x^r is the position along the road and v^r is the speed. The road coordinates are mapped to global Cartesian coordinates via the road network description, and at an intersection the target switches to one of the connecting roads.

See Skoglar

Let p^s = (x^s, y^s, z^s)^⊤ and p^g = (x^g, y^g, z^g)^⊤ be the positions of the sensor and the target, respectively, relative to a global Cartesian reference system. An observation at time t is denoted y_t.

There are several factors that affect the performance of an electro-optical/infrared (EO/IR) sensor system [

In this work a simplified probability of detection model that captures some important aspects is proposed. Below, the likelihood functions for the limited field-of-view, the occlusion, and the range dependent sensor performance are presented.

A target point p^g = (x^g, y^g, z^g)^⊤ ∈ ℝ^3, expressed in Cartesian coordinates relative to the camera-fixed reference system, is projected on a virtual image plane onto the image point (u, v)^⊤ ∈ ℝ^2 according to the ideal perspective projection formula u = f x^g / z^g, v = f y^g / z^g, where f is the focal length and the z-axis is the optical axis. Let the image plane be the region {(u, v)^⊤ : −c_u ≤ u ≤ c_u, −c_v ≤ v ≤ c_v} and define an indicator function that is one if the projected point is inside this region and zero otherwise.
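A sketch of the projection and the field-of-view indicator, assuming the camera z-axis is the optical axis and a default unit focal length (our conventions; the article's camera parameters may differ):

```python
import numpy as np

def project(p_cam, f=1.0):
    """Ideal pinhole projection of a camera-frame point (x, y, z) with z > 0."""
    x, y, z = p_cam
    return np.array([f * x / z, f * y / z])

def in_fov(uv, c_u, c_v):
    """Indicator: 1.0 if the image point lies inside the image plane rectangle."""
    u, v = uv
    return 1.0 if (-c_u <= u <= c_u and -c_v <= v <= c_v) else 0.0
```

Composing the two functions gives the field-of-view factor of the detection likelihood for a given camera pose.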

The occlusion likelihood ℒ_occ depends on the sensor position p^s and the ground point: it is one if the line-of-sight between them is not intersected by any building, i.e., if the elevation angle of the line-of-sight is larger than the elevation angles θ_i of all potentially occluding building edges, and zero otherwise.

In order to prove Theorem 1, we define the

We can write the probability of no detection P_ND as follows.

An example of a gimballed sensor system with infrared and video sensors. The gimbal has two actuated axes, pan (azimuth) and tilt (elevation), that determine the pointing direction of the sensors. The sensor system is assumed to be mounted on a UAV platform.

Urban surveillance area.

Comparison of different reward functions. The information measure (tr P_j)^{−1} decreases quickly when the target is not observed.

A state-machine with three different modes is proposed to solve the combined search and tracking problem. A high-level planner decides which mode is active.

Single target search where the target density is represented as a particle mixture. Three snapshots are shown at time steps 8, 16, and 24. The particles can be seen as small dots and each particle can be considered a hypothesis of where the target is. The target density vanishes where the sensor is pointing; after the resampling step the number of particles decreases in those areas, but increases everywhere else. The flight path is a solid black/gray line and the sensor footprints on the ground are polygons with four corners.

Search on a gridded road network with a stationary UAV, Example 3. Same area as in

Search on a gridded road network with a moving UAV and with occlusion due to the buildings, Example 4. (A movie is available [

Illustration of the reward function as a function of the target uncertainty.

A snapshot from the multiple target tracking and search simulation, 2D view in

Evaluation of three simulation examples where different values of the design parameter λ_p are used. With λ_p = 10^{−6} the planner focuses on search for new unknown targets. No explicit attempts at rediscovering known targets are made; however, targets are rediscovered anyway as a result of the new target search mode, and all five targets in the scenario are discovered. The other cases use larger values of λ_p.

Evaluation of three simulation examples where different values of λ_p are used, among them λ_p = 10^{−6}. This planner focuses on search for new unknown targets and all five targets are found quite quickly (see left plot). The price is that the information about the discovered targets is quite low (middle and right plots) since no explicit attempts at rediscovering the targets are made. However, in the current simulation targets are rediscovered anyway as a result of the new target search mode. The other cases use larger values of λ_p.

The mobile robot used in the experiments. The visual feature on top is used to facilitate the target identification process. The view shows the zoom level used in the experiment.