A Possibilistic Formulation of Autonomous Search for Targets

Autonomous search is an ongoing cycle of sensing, statistical estimation, and motion control with the objective to find and localise targets in a designated search area. Traditionally, the theoretical framework for autonomous search combines sequential Bayesian estimation with information theoretic motion control. This paper formulates autonomous search in the framework of possibility theory. Although the possibilistic formulation is slightly more involved than the traditional method, it provides a means for quantitative modelling and reasoning in the presence of epistemic uncertainty. This feature is demonstrated in the paper in the context of partially known probability of detection, expressed as an interval value. The paper presents an elegant Bayes-like solution to sequential estimation, with the reward function for motion control defined to take into account the epistemic uncertainty. The advantages of the proposed search algorithm are demonstrated by numerical simulations.


Introduction
Search is a repetitive cycle of sensing, estimation (localisation), and motion control, with the objective to find and localise one target, or as many targets as possible, inside the search volume in the shortest possible time. The searching platform (agent) is assumed to be mobile and capable of sensing, while the detection process is typically imperfect [1], in the sense that the probability of detection is less than one and there is a small (but non-negligible) probability of false alarms. Autonomous search refers to search by an intelligent agent without human intervention.
Search techniques have been used in many situations. Examples include rescue and recovery operations, security operations (e.g., search for toxic or radioactive emissions), understanding animal behaviour, and military operations (e.g., anti-submarine warfare) [2]. Search techniques have also become increasingly important in field robotics [3][4][5][6] for the purpose of carrying out dirty and dangerous missions. Formal search theory has its roots in the works by Koopman [7], and has since been expanded and extended to different problems and applications. It can be categorised into static versus moving target search, reactive versus non-reactive target search, single versus multiple target search, cooperative versus non-cooperative target search, etc. [2].
In this paper, we focus on an area search for an unknown number of static targets using a realistic sensor on a single searching platform, conceptually similar to the problems discussed in [8][9][10]. The searching agent can be a drone equipped with a sensor capable of detecting targets on the ground with a certain probability of detection (as a function of range) as well as with some false alarm probability.
The dominant theoretical framework for the formulation of search is probability theory, where Bayesian inference is used to sequentially update the posterior probability distribution of target locations as new measurements are collected over time [9][11][12][13][14]. Sensor motion control is typically formulated as a partially observed Markov decision process (POMDP) [15]. The information state in the POMDP formulation is represented by the posterior probability distribution of targets. The set of sensor motion controls (actions), which determine where the searching agent should move next, can be formed by looking a single step or multiple steps ahead. The reward function in POMDP maps the set of admissible actions to a set of positive real numbers (rewards) and is typically formulated as a measure of information gain (e.g., the reduction in entropy, Fisher information gain) [16].
Statistical inference is based on mathematical models. In the target search context, we need a model of sensing which incorporates the uncertainty with regard to the probability of true and false detection as well as the statistics of target (positional) measurement errors. This uncertainty in the Bayesian framework is expressed by probability functions; in particular, the probability of detection, the probability of false alarm, and the probability density function (PDF) of a positional measurement, given the true target location. The key limitation of the Bayesian approach, however, is that these probabilistic models must be known precisely. Unfortunately, in many practical situations it is difficult or even impossible to postulate precise probabilistic models. Consider, for example, the probability of detection. It typically depends on the (unknown) size and reflective characteristics of the target, and hence, at best, can be specified as a confidence interval (rather than a precise probability value) for a given distance to the target. Thus, we need to deal with epistemic uncertainty, which incorporates both randomness and partial ignorance.
In order to deal with epistemic uncertainty, an alternative mathematical framework for inference is required. Such theories involve non-additive probabilities [17] for the representation and processing of uncertain information. They include, for example, possibility theory [18], Dempster-Shafer theory [19], and imprecise probability theory [20]. Because the last two theories are fairly complicated, and at present applicable only to discrete state spaces, we focus on possibility theory [21,22]. Recent research in nonlinear filtering and target tracking [23][24][25][26][27] has demonstrated that possibility theory provides an effective tool for uncertain knowledge representation and reasoning.
The main contributions of this paper include a theoretical formulation of autonomous search in the framework of possibility theory and a demonstration of its robustness in the presence of epistemic detection uncertainty. The paper presents an elegant Bayes-like solution to sequential estimation, with a definition of the reward function for motion control which takes into account the epistemic uncertainty. Evaluation of the proposed search algorithm considers scenarios with a large number of targets and two cases for the probability of detection as a function of range: (i) the case when it is known precisely; (ii) the case when it is known only as an interval value.
The paper is organised as follows. Section 2 introduces the autonomous search problem. Section 3 reviews the standard probabilistic formulation of autonomous search and presents the theoretical framework for estimation using possibility functions. Section 4 formulates the new possibilistic solution to autonomous search. Numerical results with a comparison are presented in Section 5, while the conclusions are drawn in Section 6.

Problem Formulation
Consider a search area $S$. A surveillance drone, flying at a fixed altitude, has a mission to autonomously search for and localise the ground-based static targets in $S$, as in [10]. The number and locations of the targets are unknown. Following [9,10], the search area $S$ is discretised into $n_c \gg 1$ cells of equal size. The presence or absence of a target in the $n$th cell at discrete time $k = 0, 1, 2, \ldots$ can be modelled by a Bernoulli random variable (r.v.) $X_{k,n} \in \{0, 1\}$, where by convention $X_{k,n} = 1$ denotes that a target is present and $X_{k,n} = 0$ that it is absent, and $n = 1, \ldots, n_c$ is the cell index.
Suppose the search agent is equipped with a sensor (e.g., a radar) which illuminates a region $L_k \subset S$ at time $k$ and collects a set of detections $Z_k$ within $L_k$. Each detection reports the Cartesian coordinates of a possible target. However, the sensing process is uncertain in two ways: (1) the reported target coordinates are affected by measurement noise; (2) the measurement set $Z_k$ may include false detections and may also miss some of the true target detections. The probability of true target detection is a (monotonically decreasing) function of range and is specified as an interval value for a given range.
The objective is to detect and localise as many targets as possible in the shortest possible time.

Probabilistic Search
Autonomous search in the Bayesian probabilistic framework is typically information driven. The information state at time $k$ is represented by the posterior probability of target presence in each cell of the discretised search area. This posterior probability at time $k$ is denoted by $P_{k,n} = \Pr\{X_{k,n} = 1 \mid Z_{1:k}\}$, where $Z_{1:k} = Z_1, \ldots, Z_k$ is the sequence of measurement sets collected up to time $k$. The posterior probability of target absence is simply $1 - P_{k,n}$, and therefore, is unnecessary to compute.
The target or threat map is defined as the array $P_k = [P_{k,n}]$. Initially, at time $k = 0$, the map is specified as $P_{0,n} = 1/2$ for all $n = 1, \ldots, n_c$, thus expressing the initial ignorance. As time progresses and the search agent collects measurements, the threat map is sequentially updated using Bayes's rule. Consequently, the information content of the threat map $P_k$ increases with time. The information content of the threat map is measured by its entropy, defined as
$$H_k = -\frac{1}{n_c} \sum_{n=1}^{n_c} \left[ P_{k,n} \log_2 P_{k,n} + (1 - P_{k,n}) \log_2 (1 - P_{k,n}) \right].$$
Note that at $k = 0$, $H_0 = 1$ and that entropy decreases with time.
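The entropy of the threat map can be checked numerically with a few lines of Python. This is a minimal sketch, assuming the entropy is the average binary entropy (base 2) over cells, which is consistent with the stated value $H_0 = 1$; the function name `threat_map_entropy` and the flat-list representation of the map are illustrative choices, not part of the paper.

```python
import math

def threat_map_entropy(p_map):
    """Average binary entropy (in bits) of a threat map given as a flat list of P_{k,n}."""
    total = 0.0
    for p in p_map:
        for q in (p, 1.0 - p):
            if q > 0.0:  # convention: 0 * log2(0) = 0
                total -= q * math.log2(q)
    return total / len(p_map)

# Initial ignorance: every cell has probability 1/2 of containing a target.
prior_map = [0.5] * 6
print(threat_map_entropy(prior_map))  # 1.0
```

A map in which every cell is either certainly empty or certainly occupied gives an entropy of zero under the same function.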
In order to explain how the threat map is updated using Bayes's rule, let us introduce another Bernoulli r.v. $Y_{k,n} \in \{0, 1\}$, where $Y_{k,n} = 1$ represents the event that a detection from the set $Z_k$ has fallen inside the $n$th cell ($Y_{k,n} = 0$ represents the opposite event). Bayes's rule is given by
$$\Pr\{X = i \mid Y = j\} = \frac{\Pr\{Y = j \mid X = i\}\,\Pr\{X = i\}}{\sum_{l \in \{0,1\}} \Pr\{Y = j \mid X = l\}\,\Pr\{X = l\}}, \tag{2}$$
where the subscript $(k, n)$ is temporarily removed from $X_{k,n}$ and $Y_{k,n}$ in order to simplify the notation, and $i, j \in \{0, 1\}$.
Note that $\Pr\{Y = 1 \mid X = 1\} = D$ and $\Pr\{Y = 1 \mid X = 0\} = F$ represent the probability of detection and the probability of false alarm, respectively. Then, $\Pr\{Y = 0 \mid X = 1\} = 1 - D$ and $\Pr\{Y = 0 \mid X = 0\} = 1 - F$. Given $P_{k-1}$, if none of the detections in $Z_k$ falls into the $n$th cell (i.e., $Y_{k,n} = 0$), the probability $P_{k,n}$ is updated according to (2), as
$$P_{k,n} = \frac{(1 - D_{k,n})\,P_{k-1,n}}{(1 - D_{k,n})\,P_{k-1,n} + (1 - F_{k,n})\,(1 - P_{k-1,n})}, \tag{3}$$
where $D_{k,n}$ is the probability of detection and $F_{k,n}$ is the probability of false alarm in the $n$th cell of search area $S$ at time $k$.
If $Z_k$ contains a detection in the $n$th cell (i.e., $Y_{k,n} = 1$), then the update equation according to (2) is
$$P_{k,n} = \frac{D_{k,n}\,P_{k-1,n}}{D_{k,n}\,P_{k-1,n} + F_{k,n}\,(1 - P_{k-1,n})}. \tag{4}$$
After collecting the measurement set $Z_{k-1}$, the searching agent must decide on its subsequent action, that is, where to move (and sense) next. Suppose the set of possible actions (for movement) is $A_k$. This set can be formed by considering one or more motion steps ahead (in the future). The reward function associated with every action $\alpha \in A_k$ is typically defined as the reduction in entropy of the threat map [10], that is,
$$R_k(\alpha) = H_{k-1} - \mathbb{E}\{H_k(\alpha)\}. \tag{5}$$
Note the expectation operator $\mathbb{E}$ with respect to the (future) detection set $Z_k(\alpha)$. In practical implementation, in order to simplify computation, we typically adopt an approximation that circumvents $\mathbb{E}$ in (5). This approximation involves the assumption that a single realisation for $Z_k(\alpha)$ is sufficient: the one which results in hypothetical detection(s) at those cells which are characterised by a high probability of target presence, i.e., such that $P_{k-1,n} > \zeta$, where $\zeta$ is a threshold close to 1. The searching agent chooses the action which maximises the reward, i.e.,
$$\alpha_k^{*} = \arg\max_{\alpha \in A_k} R_k(\alpha). \tag{6}$$
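The probabilistic update and the one-step-ahead action selection described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation used in the paper: the names `bayes_update`, `reward`, and `footprint` are hypothetical, each action is represented simply as a mapping from the cell indices it would illuminate to the pair $(D, F)$ valid for that cell, and the hypothetical-detection rule is taken to be "prior probability above $\zeta$".

```python
import math

def cell_entropy(p):
    """Binary entropy of one cell in bits, with the convention 0 * log2(0) = 0."""
    return -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)

def bayes_update(p_prev, detected, d, f):
    """Posterior probability of target presence in one cell.

    p_prev   : prior probability of target presence in the cell
    detected : True if a detection fell inside the cell (Y = 1)
    d, f     : probability of detection and of false alarm in the cell
    """
    like_present = d if detected else 1.0 - d
    like_absent = f if detected else 1.0 - f
    num = like_present * p_prev
    den = num + like_absent * (1.0 - p_prev)
    if den == 0.0:
        raise ValueError("observation has zero probability under the model")
    return num / den

def reward(p_map, footprint, zeta=0.8):
    """Entropy reduction for one action, using a single hypothetical measurement set.

    footprint maps each illuminated cell index to its (d, f) pair; cells outside
    the footprint are left unchanged and contribute nothing to the reward.
    """
    gain = 0.0
    for n, (d, f) in footprint.items():
        hypothetical_detection = p_map[n] > zeta
        p_new = bayes_update(p_map[n], hypothetical_detection, d, f)
        gain += cell_entropy(p_map[n]) - cell_entropy(p_new)
    return gain / len(p_map)

# Toy threat map with four cells and two candidate actions (illustrative values).
p_map = [0.5, 0.5, 0.95, 0.5]
actions = {
    "north": {0: (0.9, 0.005), 1: (0.6, 0.005)},
    "east": {2: (0.9, 0.005), 3: (0.3, 0.005)},
}
best = max(actions, key=lambda a: reward(p_map, actions[a]))
print(best)  # north
```

In this toy example the action that illuminates two still-unknown cells with good detection probability wins over the action that mostly confirms an already likely target.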

The Possibilistic Estimation Framework
Possibility theory was developed for quantitative modelling of epistemic uncertainty. The concept of the uncertain variable in possibility theory plays the same role as the random variable in probability theory. The main difference is that the quantity of interest is not random, but simply unknown, and our aim is to infer its true value out of a set of possible values. The theoretical basis of this approach can be found in [28][29][30]. Briefly, the uncertain variable is a function $X : \Omega \to \mathcal{X}$, where $\Omega$ is the sample space and $\mathcal{X}$ is the state space (the space where the quantity of interest lives). Our current knowledge about $X$ can be encoded in a function $\pi_X : \mathcal{X} \to [0, 1]$, such that $\pi_X(x)$ is the possibility (credibility) of the event $X = x$. Function $\pi_X$ is not a density function; it is referred to as a possibility function, being the primitive object of possibility theory [22]. It can be viewed as a membership function determining the fuzzy restrictions of minimal specificity (in the sense that any hypothesis not known to be impossible cannot be ruled out) about $x$ [18].
In the formulation of the search problem, we will deal with two binary uncertain variables, corresponding to the r.v.s $X_{k,n}$ and $Y_{k,n}$. Hence, let us focus on a discrete uncertain variable $X$ and its state space $\mathcal{X} = \{x_1, \ldots, x_N\}$. The possibility measure of an event $A \subseteq \mathcal{X}$ is a mapping $\Pi_X : 2^{\mathcal{X}} \to [0, 1]$, where $2^{\mathcal{X}}$ is the set of all subsets of $\mathcal{X}$. Mapping $\Pi_X$ satisfies three axioms: (1) $\Pi_X(\emptyset) = 0$; (2) $\Pi_X(\mathcal{X}) = 1$; and (3) the possibility of a union of disjoint events $A_1$ and $A_2$ is given by $\Pi_X(A_1 \cup A_2) = \max\{\Pi_X(A_1), \Pi_X(A_2)\}$. Possibility measure $\Pi_X$ is related to the possibility function $\pi_X$ as follows:
$$\Pi_X(A) = \max_{x \in A} \pi_X(x)$$
for every $A \subseteq \mathcal{X}$. There is also a notion of the necessity of an event, $N_X(A)$, which is dual to $\Pi_X(A)$ in the sense that
$$N_X(A) = 1 - \Pi_X(A^c),$$
where $A^c$ is the complement of $A$ in $\mathcal{X}$. One can interpret the necessity-possibility interval $[N_X(A), \Pi_X(A)]$ as the belief interval, specified by the lower and upper probabilities in the sense of Walley [20]. Note that for a binary variable $X \in \{0, 1\}$, this interval can be expressed for event $A = \{1\}$ as
$$[N_X(\{1\}), \Pi_X(\{1\})] = [1 - \Pi_X(\{0\}), \Pi_X(\{1\})],$$
where, due to normalisation, the following condition must be satisfied: $\max\{\Pi_X(\{0\}), \Pi_X(\{1\})\} = 1$.
Bayes-like updating in possibility theory is described next. Suppose $\pi(x)$ is the prior possibility function over the state space $\mathcal{X} = \{x_1, \ldots, x_N\}$. Let $\gamma(z \mid x)$ be the likelihood of receiving measurement $z \in \mathcal{Z}$ if $x \in \mathcal{X}$ is true. Then, the posterior possibility of $x \in \mathcal{X}$ is given by [28,31,32]
$$\pi(x \mid z) = \frac{\gamma(z \mid x)\,\pi(x)}{\max_{x' \in \mathcal{X}} \gamma(z \mid x')\,\pi(x')}. \tag{8}$$
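The possibility measure, its dual necessity, and the Bayes-like update over a finite state space translate directly into code. The sketch below assumes a possibility function stored as a dictionary from states to values in $[0, 1]$; the function names are illustrative, and the max-normalised product form of the update is the standard one for possibility functions.

```python
def possibility(pi, event):
    """Possibility of an event: the maximum of the possibility function over it."""
    return max((pi[x] for x in event), default=0.0)  # empty event has possibility 0

def necessity(pi, event):
    """Necessity of an event: one minus the possibility of its complement."""
    complement = [x for x in pi if x not in event]
    return 1.0 - possibility(pi, complement)

def possibilistic_update(prior, likelihood):
    """Bayes-like update: likelihood times prior, renormalised by the maximum."""
    unnormalised = {x: likelihood[x] * prior[x] for x in prior}
    top = max(unnormalised.values())
    if top == 0.0:
        raise ValueError("measurement is impossible under every hypothesis")
    return {x: value / top for x, value in unnormalised.items()}

# Binary variable with a vacuous prior (total ignorance) and an illustrative likelihood.
prior = {0: 1.0, 1: 1.0}
likelihood = {0: 0.2, 1: 1.0}
posterior = possibilistic_update(prior, likelihood)
print(posterior)  # {0: 0.2, 1: 1.0}
print(round(necessity(posterior, {1}), 3), possibility(posterior, {1}))  # 0.8 1.0
```

The last line prints the endpoints of the necessity-possibility interval for the event $X = 1$; the normalisation condition holds because the larger of the two posterior values is always 1.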

Information State
The information state at time $k$ in the framework of possibility theory will be represented by two posteriors:
1. The posterior possibility of target presence, $\Pi^1_{k,n} = \Pi_{X_{k,n}}(\{1\} \mid Z_{1:k})$;
2. The posterior possibility of target absence, $\Pi^0_{k,n} = \Pi_{X_{k,n}}(\{0\} \mid Z_{1:k})$.
We need both of them, because $\Pi^0_{k,n}$ cannot be worked out from $\Pi^1_{k,n}$. Consequently, during the search two posterior possibility maps need to be updated sequentially over time, $\Pi^1_k = [\Pi^1_{k,n}]$ and $\Pi^0_k = [\Pi^0_{k,n}]$, where $n = 1, \ldots, n_c$.
Suppose now that the probability of detection is specified by an interval value, that is,
$$D_{k,n} \in [\underline{D}_{k,n}, \overline{D}_{k,n}], \tag{9}$$
where $\underline{D}_{k,n}$ and $\overline{D}_{k,n}$ represent the lower and upper probability of this interval, respectively. Because a detection event is a binary variable, due to the reachability constraint for probability intervals [33], (9) implies that the probability of non-detection is in the interval $[1 - \overline{D}_{k,n}, 1 - \underline{D}_{k,n}]$. Then, via normalisation we can express the possibility of detection $D^1_{k,n}$ and the possibility of non-detection $D^0_{k,n}$ (in cell $n$ at time $k$) as
$$D^1_{k,n} = \frac{\overline{D}_{k,n}}{\max\{\overline{D}_{k,n},\, 1 - \underline{D}_{k,n}\}}, \tag{10}$$
$$D^0_{k,n} = \frac{1 - \underline{D}_{k,n}}{\max\{\overline{D}_{k,n},\, 1 - \underline{D}_{k,n}\}}, \tag{11}$$
where $[1 - D^0_{k,n}, D^1_{k,n}]$ represents the necessity-possibility interval for the probability of detection. Note that specification of a possibility function from a probability mass function expressed by probability intervals is not unique; for example, another more involved method for this task is via the maximum specificity criterion [34].
The probability of detection $D_{k,n}$ by a sensor, as well as the two possibilities $D^0_{k,n}$ and $D^1_{k,n}$, typically depend on the distance $d_{k,n}$ between the $n$th grid cell and the searching agent's position at time $k$.
In a similar manner, we can also assume that the probability of false alarm is specified by an interval value, that is, $F_{k,n} \in [\underline{F}_{k,n}, \overline{F}_{k,n}]$, and obtain from it, by the same normalisation, the values $F^0_{k,n}$ and $F^1_{k,n}$, which represent the possibility of no false alarm and the possibility of false alarm (in cell $n$ at time $k$), respectively.
Next, we explain how to sequentially update, during the search, the two posterior possibility maps $\Pi^1_k$ (for target presence) and $\Pi^0_k$ (for target absence). The proposed update equations follow from (3) and (4), when we apply the Bayes-like update rule (8).
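The conversion from a probability interval to a pair of possibilities can be sketched in a few lines. This is a hedged illustration that assumes the normalisation divides the upper probabilities of the event and of its complement by the larger of the two; the function name `interval_to_possibility` is hypothetical, and the interval $[0.6, 0.9]$ is an arbitrary example value.

```python
def interval_to_possibility(lower, upper):
    """Map a probability interval [lower, upper] of a binary event to the pair
    (possibility of the event, possibility of its complement)."""
    if not 0.0 <= lower <= upper <= 1.0:
        raise ValueError("expected 0 <= lower <= upper <= 1")
    upper_event = upper               # upper probability of the event
    upper_complement = 1.0 - lower    # upper probability of the complement
    norm = max(upper_event, upper_complement)  # at least 0.5, so never zero
    return upper_event / norm, upper_complement / norm

# Interval-valued probability of detection (illustrative numbers).
d1, d0 = interval_to_possibility(0.6, 0.9)
print(round(d1, 3), round(d0, 3))  # 1.0 0.444

# A precisely known false-alarm probability is the degenerate interval [F, F].
f1, f0 = interval_to_possibility(0.005, 0.005)
print(round(f1, 6), f0)  # 0.005025 1.0
```

For the first example the resulting necessity-possibility interval is roughly $[0.556, 1]$, which contains the original interval $[0.6, 0.9]$; the possibilistic description is therefore a conservative (less specific) encoding of the interval.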
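Given $\Pi^1_{k-1}$, $\Pi^0_{k-1}$, and detection set $Z_k$, if none of the detections in $Z_k$ falls into the $n$th cell, the possibility of target presence in the $n$th cell is updated as follows:
$$\Pi^1_{k,n} = \frac{D^0_{k,n}\,\Pi^1_{k-1,n}}{\max\{D^0_{k,n}\,\Pi^1_{k-1,n},\; F^0_{k,n}\,\Pi^0_{k-1,n}\}}$$
for $n = 1, \ldots, n_c$. Similarly, in this case $\Pi^0_{k,n}$ is updated according to
$$\Pi^0_{k,n} = \frac{F^0_{k,n}\,\Pi^0_{k-1,n}}{\max\{D^0_{k,n}\,\Pi^1_{k-1,n},\; F^0_{k,n}\,\Pi^0_{k-1,n}\}}.$$
If a detection from $Z_k$ falls into the $n$th cell, then the update equation for $\Pi^1_{k,n}$ can be expressed as
$$\Pi^1_{k,n} = \frac{D^1_{k,n}\,\Pi^1_{k-1,n}}{\max\{D^1_{k,n}\,\Pi^1_{k-1,n},\; F^1_{k,n}\,\Pi^0_{k-1,n}\}}.$$
And finally, in this case the update equation for $\Pi^0_{k,n}$ is given by
$$\Pi^0_{k,n} = \frac{F^1_{k,n}\,\Pi^0_{k-1,n}}{\max\{D^1_{k,n}\,\Pi^1_{k-1,n},\; F^1_{k,n}\,\Pi^0_{k-1,n}\}}.$$
Note that the probability of target presence in each cell of the search area, using the described possibilistic approach, is expressed by a necessity-possibility interval, i.e., $P_{k,n} \in [1 - \Pi^0_{k,n}, \Pi^1_{k,n}]$ for $n = 1, \ldots, n_c$, where $\max\{\Pi^0_{k,n}, \Pi^1_{k,n}\} = 1$. Initially, at time $k = 0$ (before any sensing action), the posterior possibility maps are set to $\Pi^1_{0,n} = \Pi^0_{0,n} = 1$, meaning that $P_{0,n} \in [0, 1]$, for $n = 1, \ldots, n_c$. This is an expression of initial ignorance about the probability of target presence in the $n$th cell.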
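The four update equations share one structure: multiply each prior possibility by the matching likelihood and divide by the larger of the two products. A minimal per-cell sketch follows, assuming exactly that structure; the function name and the numerical values of the possibilities are illustrative and not taken from the paper.

```python
def possibilistic_cell_update(pi1_prev, pi0_prev, detected, d1, d0, f1, f0):
    """Update the possibilities of target presence (pi1) and absence (pi0) in one cell.

    detected : True if a detection fell inside the cell
    d1, d0   : possibility of detection and of non-detection
    f1, f0   : possibility of false alarm and of no false alarm
    """
    like_present = d1 if detected else d0
    like_absent = f1 if detected else f0
    u1 = like_present * pi1_prev
    u0 = like_absent * pi0_prev
    top = max(u1, u0)
    if top == 0.0:
        raise ValueError("observation is impossible under both hypotheses")
    return u1 / top, u0 / top

# Start from total ignorance in one cell: both possibilities equal 1.
pi1, pi0 = 1.0, 1.0
params = dict(d1=1.0, d0=0.4, f1=0.005, f0=1.0)  # illustrative values

print(possibilistic_cell_update(pi1, pi0, False, **params))  # (0.4, 1.0)
print(possibilistic_cell_update(pi1, pi0, True, **params))   # (1.0, 0.005)
```

After a missed detection the interval for the probability of target presence, $[1 - \Pi^0, \Pi^1]$, shrinks from $[0, 1]$ to $[0, 0.4]$; after a detection it becomes $[0.995, 1]$.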

Epistemic Reward
Let us first quantify the amount of uncertainty contained in the information state, represented by two posterior possibility maps: $\Pi^1_k$ and $\Pi^0_k$. Various uncertainty (and information) measures in the context of non-additive probabilistic frameworks have been proposed in the past [35][36][37]. We adopt the principle that epistemic uncertainty corresponds to the volume under the possibility function [25,37]. For a possibility function $\pi$ over a discrete finite state space $\mathcal{X} = \{x_1, \ldots, x_N\}$, epistemic uncertainty equals the sum $\sum_{i=1}^{N} \pi(x_i)$. The possibilistic entropy $G_k$, contained in the information state, represented by $\Pi^1_k$ and $\Pi^0_k$, is then defined as
$$G_k = \frac{1}{n_c} \sum_{n=1}^{n_c} \left( \Pi^0_{k,n} + \Pi^1_{k,n} \right) - 1. \tag{18}$$
Equation (18) can be interpreted as the average volume of possibility functions of all binary variables $X_{k,n}$, for $n = 1, \ldots, n_c$. Subtraction of 1 on the right-hand side of (18) ensures that $G_k \in [0, 1]$. Thus, at $k = 0$, when $\Pi^0_{0,n} = \Pi^1_{0,n} = 1$, we have $G_0 = 1$. This means that initially (at the start of the search), the amount of information contained in the information state is zero (representing total ignorance). As the searching agent moves and collects measurements it gains knowledge, and as a result either $\Pi^0_{k,n}$ or $\Pi^1_{k,n}$ will reduce its value in some cells (keeping in mind that $\max\{\Pi^0_{k,n}, \Pi^1_{k,n}\} = 1$), thus reducing the possibilistic entropy $G_k$. Finally, $G_k = 0$ if either $\Pi^0_{k,n} = 0$ (and due to normalisation $\Pi^1_{k,n} = 1$) or $\Pi^1_{k,n} = 0$ (and $\Pi^0_{k,n} = 1$) for all cells $n = 1, \ldots, n_c$.
Note that (18) can also be expressed as
$$G_k = \frac{1}{n_c} \sum_{n=1}^{n_c} \left[ \Pi^1_{k,n} - \left( 1 - \Pi^0_{k,n} \right) \right], \tag{19}$$
which gives another interpretation of possibilistic entropy $G_k$: it represents the average width of the necessity-possibility interval over all cells in the search area. This interpretation does not mean that $G_k$ is a measure of uncertainty only due to imprecision, because (18) and (19) are equivalent.
Similar to (5), we define the reward function as the reduction in possibilistic entropy of the information state, expressed by maps $\Pi^1_k$ and $\Pi^0_k$. Mathematically, this is expressed as
$$R_k(\alpha) = G_{k-1} - \mathbb{E}\{G_k(\alpha)\},$$
where, as before, $\alpha \in A_k$ is an action from the set of admissible actions at time $k$ and $\mathbb{E}$ is the expectation with respect to the (random) measurement set $Z_k(\alpha)$. Again, in order to simplify the computation, we make the same assumption described in relation to (5): a single realisation for $Z_k(\alpha)$ consisting of hypothetical detection(s) at those cells in which target presence is already strongly supported by the current information state. Finally, the searching agent chooses the action which maximises the reward, as in (6).
The search mission is terminated when the reduction in possibilistic entropy falls below a specified threshold, i.e., when $G_{k-1} - G_k < \xi$.
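The possibilistic entropy and the termination test are simple to compute from the two maps. The sketch below assumes the maps are stored as flat lists of equal length and takes the entropy as the average of $\Pi^0 + \Pi^1$ over cells minus 1, which matches the stated limits $G_0 = 1$ and $G_k = 0$; the threshold value passed as `xi` is a hypothetical placeholder, since the text does not give one.

```python
def possibilistic_entropy(pi1_map, pi0_map):
    """Average volume of the per-cell possibility functions, minus 1."""
    n_c = len(pi1_map)
    return sum(p1 + p0 for p1, p0 in zip(pi1_map, pi0_map)) / n_c - 1.0

def should_terminate(g_prev, g_curr, xi):
    """Stop when one step reduces the possibilistic entropy by less than xi."""
    return g_prev - g_curr < xi

# Total ignorance: both possibilities equal 1 in every cell.
print(possibilistic_entropy([1.0] * 4, [1.0] * 4))  # 1.0

# Fully resolved map: each cell is either certainly occupied or certainly empty.
print(possibilistic_entropy([1.0, 0.0, 1.0], [0.0, 1.0, 0.0]))  # 0.0

# Hypothetical threshold; the entropy barely changed, so the search stops.
print(should_terminate(0.30, 0.2999, xi=1e-3))  # True
```

Because one of the two possibilities in each cell always equals 1, the per-cell contribution is the width of that cell's necessity-possibility interval, so the same function also evaluates the alternative form of the entropy.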

Simulation Setup and a Single Run
We use a simulation setup similar to [10]. The search area $S$ is a rectangle of size 100 km × 90 km, discretised into $n_c = 100 \times 90$ resolution cells of size 1 km². A total of 80 targets are placed either (a) at uniformly random locations across the search area, or (b) in two squares in diagonal corners of the search area. A typical scenario with a uniform distribution of targets is shown in Figure 1, where cyan coloured asterisks indicate where the targets are placed. The probability of detection $D$ is modelled as a function of the distance between the $n$th grid cell and the searching agent's position at time $k$. The mathematical model (21) adopted for this purpose expresses $D$ in terms of the distance $d \geq 0$ and two modelling parameters, $\mu > 0$ and $\sigma > 0$. Figure 2 illustrates this model; it displays the imprecise model of the probability of detection $D$ as a function of distance $d$, using (21) with two sets of parameters $\mu$ and $\sigma$ (the orange-coloured area). The search algorithm described in Section 4 uses this imprecise model for its search mission. The model provides the upper and lower probabilities $\overline{D}_{k,n}$ and $\underline{D}_{k,n}$ for a given range, from which we can work out $D^1_{k,n}$ and $D^0_{k,n}$, via (10) and (11), respectively. The true value of the probability of detection, which is used in the generation of simulated measurements (but which is unknown to the search algorithm), is plotted with the solid blue line in Figure 2.
The truth is also based on model (21), using one particular pair of $\mu$ and $\sigma$ values. (The actual values used for the orange-coloured area in Figure 2 are $\mu_1 = 8000$, $\sigma_1 = 2200$, $\mu_2 = 18{,}000$, and $\sigma_2 = 2200$; the true probability of detection, i.e., the blue line in Figure 2, is obtained using $\mu = 9000$ and $\sigma = 2200$.) With this specification, the probability of detecting a target located more than a certain distance $\rho_{\max}$ from the searching agent is practically zero. Assuming 360° coverage, the sensing area $L_k$ is a circular area of radius $\rho_{\max}$. The spatial distribution of false alarms is assumed to be uniform over $L_k$, with probability $F_{k,n} = 0.005$ (per cell of $L_k$). For simplicity, we will assume that this parameter is known precisely to the search algorithm of Section 4. The threshold parameter $\zeta$ is set to 0.8.
Sensor measurements are affected by additive Gaussian noise with the standard deviation in range and azimuth of 100 m and 1°, respectively. An additional assumption is that there is at most one target per cell and one detection per cell.
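To illustrate how two parameter pairs produce an interval-valued probability of detection, the sketch below evaluates a lower and an upper curve at a given distance. The functional form used here, a Gaussian survival function in $d$, is purely an assumption made for illustration; it is not taken from the text, and only the parameter values are those quoted above. Distances are assumed to be in the same units as $\mu$ and $\sigma$.

```python
import math

def detection_probability(d, mu, sigma):
    """ASSUMED shape (not from the paper): Gaussian survival function,
    close to 1 for d well below mu and close to 0 for d well above mu."""
    return 0.5 * math.erfc((d - mu) / (sigma * math.sqrt(2.0)))

# Parameter pairs quoted in the text for the bounds of the imprecise model.
BOUND_PARAMS = [(8000.0, 2200.0), (18000.0, 2200.0)]

def detection_interval(d):
    """Lower and upper probability of detection at distance d."""
    values = [detection_probability(d, mu, sigma) for mu, sigma in BOUND_PARAMS]
    return min(values), max(values)

lower, upper = detection_interval(9000.0)
truth = detection_probability(9000.0, 9000.0, 2200.0)  # true parameters from the text
print(lower <= truth <= upper)  # True
```

Under this assumed shape the true curve lies inside the interval at every distance, because the three curves share the same $\sigma$ and the true $\mu$ lies between the two bounding values.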
The results of a single run of the possibilistic search at time $k = 140$ for a uniform placement of targets are shown in Figures 1, 3, and 4. Figure 1 displays the search path (blue dotted line). The searching agent enters the search area $S$ in the bottom left corner, and follows an inward-spiral path, in accordance with the probabilistic search [10]. Figure 3 shows the two posterior possibilistic maps: (a) target presence $\Pi^1_k$; and (b) target absence $\Pi^0_k$. The colour coding is as follows: white cells of the maps indicate zero possibility, while black cells denote a possibility equal to 1. Figure 3a indicates that the area around the travelled path in $\Pi^1_k$ is mainly white, with occasional black cells where targets are possibly located. In those cells of the search area $S$ where $\Pi^1_k$ is high (black colour) and $\Pi^0_k$ is low (white colour), there is a high chance that a target is present. Therefore, a target is declared present in a cell of the search area if the difference $\Pi^1_{k,n} - \Pi^0_{k,n} > 0.8$. The output of the search algorithm at $k = 140$ is shown in Figure 4, which represents a map of estimated target positions: each red asterisk indicates a cell where the search algorithm declared a target. We can visually compare Figure 4 (estimated target positions at $k = 140$) with Figure 1 (true target positions).
If the search were to be continued beyond $k = 140$, the full spiral path would be completed at about $k = 200$ (for an average run). After that, the rate of reduction in possibilistic entropy would significantly drop and the search algorithm would automatically stop (according to the termination criterion).
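The declaration rule stated above amounts to a per-cell threshold test on the difference between the two maps. A minimal sketch, with the maps stored as flat lists and an illustrative function name:

```python
def declare_targets(pi1_map, pi0_map, threshold=0.8):
    """Indices of cells where the possibility of presence exceeds the
    possibility of absence by more than the threshold."""
    return [
        n
        for n, (p1, p0) in enumerate(zip(pi1_map, pi0_map))
        if p1 - p0 > threshold
    ]

# Three cells: strong evidence of a target, strong evidence of none, undecided.
pi1_map = [1.0, 0.1, 1.0]
pi0_map = [0.05, 1.0, 0.6]
print(declare_targets(pi1_map, pi0_map))  # [0]
```

The third cell is not declared even though a target there is fully possible, because target absence has not yet been made sufficiently implausible.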

Monte Carlo Runs
Next, we compare the average search performance of the possibilistic search versus the probabilistic search. The adopted metric for search performance is the optimal sub-pattern assignment (OSPA) error, because it expresses in a mathematically rigorous manner the error both in the target position estimate and in the target number (cardinality error) [39]. The parameters of the OSPA error used are cut-off $c = 50$ km and order $p = 1$. The mean OSPA error is estimated by averaging over 100 Monte Carlo runs, with a random placement of targets on every run. Because the search duration is random, for the sake of averaging the OSPA error, we fixed the duration to $k = 201$ time steps.
In order to apply the probabilistic search to the problem specified in Section 5.1, we must adopt a precise (rather than an interval-valued) probability of detection. For comparison's sake, we will consider two cases: (a) when the true probability of detection versus range (i.e., the blue line in Figure 2) is known; (b) given the interval-valued probability of detection (orange area in Figure 2), we choose the mid-point of the interval at a given range as the true value. Case (a) is ideal and is expected to result in the best performance, whereas case (b), because it uses an incorrect value of the probability of detection, is expected to perform worse.
The resulting three mean OSPA errors are presented in Figure 5 for two different target placements: (i) uniformly random target locations across the search area; (ii) random placement in two squares positioned in diagonal corners of the search area. The mean OSPA line colours in Figure 5 are as follows: black for the possibilistic search; blue for the probabilistic search using the true $D$ (i.e., ideal case (a) above); red for the probabilistic search using the wrong $D$ (i.e., case (b)). All three mean OSPA error curves follow the same trend: they reduce steadily from the initial value of $c$ as the searching agent traverses the area along the spiral path and discovers the targets. Of the three compared methods, as expected, the best performance (i.e., the smallest OSPA error) is achieved using the probabilistic search with the true $D$ (ideal case). The possibilistic solution, which operates using the available interval-valued probability of detection, is fairly close to the ideal case. Finally, the probabilistic search using the wrong value of $D$ is the worst. The difference in performance is particularly dramatic when the placement of targets is non-uniform.
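For readers unfamiliar with the OSPA error, the sketch below computes it for two small sets of 2-D positions using the standard definition (cut-off distance, optimal assignment, and a cardinality penalty). It is an illustration only: the assignment is found by brute force over permutations, which is practical just for a handful of points, whereas a scenario with 80 targets would need a proper assignment solver. Coordinates are assumed to be in kilometres, to match the cut-off $c = 50$ km.

```python
import itertools
import math

def ospa(truth, estimate, c=50.0, p=1):
    """OSPA distance between two finite sets of 2-D points (brute-force assignment)."""
    small, large = sorted((list(truth), list(estimate)), key=len)
    m, n = len(small), len(large)
    if n == 0:
        return 0.0  # both sets are empty
    # Best assignment of the smaller set to distinct members of the larger set.
    best = min(
        sum(min(c, math.dist(x, large[j])) ** p for x, j in zip(small, perm))
        for perm in itertools.permutations(range(n), m)
    )
    # Unassigned points are penalised at the cut-off value c.
    return ((best + (c ** p) * (n - m)) / n) ** (1.0 / p)

truth = [(10.0, 10.0), (40.0, 20.0), (70.0, 80.0)]
estimate = [(10.0, 11.0), (41.0, 20.0)]

print(ospa(truth, []))                  # 50.0
print(round(ospa(truth, estimate), 3))  # 17.333
```

With no declared targets the error equals the cut-off $c$, which is why the mean OSPA curves start at $c$; in the second call two targets are localised to within 1 km and one is still missing, giving $(1 + 1 + 50)/3$.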

Conclusions
This paper formulated a solution to autonomous search for targets in the framework of possibility theory. The main rationale for the possibilistic formulation is its ability to deal with epistemic uncertainty, expressed by partially known probabilistic models. In this paper, we focused on the interval-valued probability of detection (as a function of range). The paper presented Bayes-like update equations for the information state in the possibilistic framework, as well as an epistemic reward function for motion control. The numerical results demonstrated that the proposed possibilistic formulation of search can deal effectively with epistemic uncertainty in the form of interval-valued probability of detection. As expected, the (conventional) probabilistic solution performs (slightly) better when the correct precise model of the probability of detection is known (the ideal model-match case). However, the probabilistic solution can result in dramatically worse performance if an incorrect precise model is adopted.

Figure 1. Simulation setup: the cyan stars indicate the true targets; the blue dotted line is the trajectory of the searching agent up to $k = 140$ steps; the red dots indicate detections at $k = 140$.

Figure 2. The imprecise model of the probability of detection $D$ used in simulations. The true $D$ is plotted with the blue solid line.

Figure 4. Output of the search algorithm: Estimated target locations (indicated by red asterisks).