Defense against Adversarial Swarms with Parameter Uncertainty

This paper addresses the problem of optimal defense of a high-value unit (HVU) against a large-scale swarm attack. We discuss multiple models for intra-swarm cooperation strategies and provide a framework for combining these cooperative models with HVU tracking and adversarial interaction forces. We show that the problem of defending against a swarm attack can be cast in the framework of uncertain parameter optimal control. We discuss numerical solution methods, then derive a consistency result for the dual problem of this framework, providing a tool for verifying computational results. We also show that the dual conditions can be computed numerically, providing further computational utility. Finally, we apply these numerical results to derive optimal defender strategies against a 100-agent swarm attack.


Introduction
Swarms are characterized by large numbers of agents which act individually, yet produce collective, herd-like behaviors. Implementing cooperative strategies for a large-scale swarm is a technical challenge that can be considered from the "insider's perspective": it assumes inside control over the swarm's operating algorithms. However, as large-scale 'swarm' systems of autonomous agents become achievable, such as those proposed for autonomous driving, UAV package delivery, and military applications, interactions with swarms outside our direct control become another challenge. This generates its own "outsider's perspective" issues.
In this paper, we look at the specific challenge of protecting an asset against an adversarial swarm. Autonomous defensive agents are tasked with protecting a high-value unit (HVU) from an incoming swarm attack. The defenders do not fully know the cooperative strategy employed by the adversarial swarm. Nevertheless, the task of the defenders is to maximize the probability of survival of the HVU against an attack by such a swarm. This challenge raises many issues: for instance, how to search for the swarm [1], how to observe and infer swarm operating algorithms [2], and how best to defend against the swarm given algorithm unknowns and only limited, indirect control through external means. In this paper, we restrict ourselves to the last issue. However, these problems share multiple technical challenges. The preliminary approach we apply in this paper demonstrates some basic methods which we hope will stimulate the development of more sophisticated tools.
For objectives achieved via external control of the swarm, several features of swarm behavior must be characterized: capturing the dynamic nature of the swarm, tracking the collective risk profile created by a swarm, and engaging with a swarm via dynamic inputs, such as autonomous defenders. The many modeling layers create a challenge for generating an effective response to the swarm, as model uncertainty and model error are almost certain. In this paper, we look at several dynamic systems where the network structure is determined by parameters. These parameters set neighborhood relations and interaction rules. Additional parameters establish defender input and swarm risk.
We consider the generation of optimal defense strategies given uncertainty in parameter values. We demonstrate that small deviations in parameter values can have catastrophic effects on defense trajectories optimized without taking error into account. We then demonstrate the contrasting robustness obtained by applying an uncertain parameter optimal control framework instead of optimizing with nominal values. This robustness suggests that refined parameter knowledge may not be necessary given appropriate computational tools. These computational tools, and the modeling of the high-dimensional swarm itself, are expensive. To assist with this issue, we provide dual conditions for this problem in the form of a Pontryagin minimum principle and prove the consistency of these conditions for the numerical algorithm. These dual conditions can, thus, be computed from the numerical solution of the computational method and provide a tool for solution verification and parameter sensitivity analysis.
Although, in this paper, optimal strategies against swarms motivate the framework of uncertain parameter optimal control and the subsequent development of the dual conditions, both the framework and the dual conditions have many applications beyond swarm defense. Optimal control with parameter uncertainty is relevant to robotics, where parts such as wheels may have small size and calibration uncertainties; aerospace, where both components and exogenous factors such as wind may be modeled using parameter uncertainty; and search and rescue, where the location of a target object can be treated as an uncertain parameter [3,4]. It is also an instance of mean-field optimal control (which includes this framework, but also more general probability distributions), which is finding application in the training of neural networks [5]. The dual conditions derived in this paper provide both a tool for verification of numerical solutions and another potential route for generating them.
The structure of this paper is as follows. Section 2 provides examples of dynamic swarming models and extensions for defensive interactions. Section 3 discusses optimization challenges and describes a general uncertain parameter optimal control framework in which this problem can be addressed. Section 4 provides a proof of the consistency of the dual problem for this control framework, which expands on the results initially presented in the conference paper [6]. Section 5 gives an example of numerical implementation that demonstrates optimal defense against a large-scale swarm of 100 agents. Section 6 discusses the results and future work.

Cooperative Swarm Models
The literature on the design of swarm strategies which produce coherent, stable collective behavior has become vast. A quick review of the literature points to two main trends/categories in swarm behavior design. The first relies on dynamic modeling of the agents and potential functions to control their behavior (see [7,8] and references therein). The second trend relates to the use of rules to describe agents' motion and local rule-based algorithms to control them [9,10].
We present two examples of dynamic swarming strategies from the literature. These examples are illustrative of the forces considered in many swarming models:
• collision avoidance between swarm members;
• alignment forces between neighboring swarm members;
• stabilizing forces.
These intra-swarm goals are aggregated to provide a swarm control law, which we will refer to as F_S, to each swarm agent. Both example models in this paper share the same double integrator form with respect to this control law. For n swarm agents, the dynamics are defined by ẍ_i = u_i, i = 1, . . . , n.

Example Model 1: Virtual Body Artificial Potential
In this model [11,12], swarm agents track to a virtual body (or bodies) guiding their course, while also reacting to intra-swarm forces of collision avoidance and group cohesion. The input u_i is the sum of intra-swarm forces, virtual body tracking, and a velocity dampening term. In addition, in this adversarial scenario, swarm agents are influenced to avoid intruding defense agents. The intra-swarm force between two swarm agents has magnitude f_I and is a gradient of an artificial potential V_I. The artificial potential V_I depends on the distance ||x_ij|| between swarm agents i and j and is shaped by a scalar control gain α and scalar constants d_0 and d_1 for distance ranges; the magnitude of the interaction force is given by the derivative of V_I with respect to this distance. The swarm body is guided by 'virtual leaders', non-corporeal reference trajectories which lead the swarm. We assign a potential V_h on a given swarm agent i associated with the k-th virtual leader, defined with the distance ||h_ik|| between the swarm agent i and leader k. Mirroring the parameters α, d_0, and d_1 defining V_I, we assign V_h the parameters α_h, h_0, and h_1. An additional dissipative force f_i^v is included for stability. The control law u_i for vehicle i associated with m defenders is the sum of these terms.
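The qualitative shape of this interaction force, repulsion inside d_0, attraction out to d_1, and no force beyond d_1, can be sketched as follows. The specific magnitude profile below is a hypothetical stand-in for the gradient of V_I, whose displayed formula is not reproduced here:

```python
import numpy as np

def interaction_force(x_i, x_j, alpha=1.0, d0=1.0, d1=2.0):
    """Illustrative pairwise intra-swarm force on agent i from agent j.

    Repulsive for separations below d0, attractive between d0 and d1,
    and zero beyond d1 -- the qualitative shape described in the text.
    The magnitude profile is an assumption, not the paper's V_I.
    """
    x_ij = x_i - x_j
    d = np.linalg.norm(x_ij)
    if d >= d1 or d == 0.0:
        return np.zeros_like(x_ij)
    if d < d0:
        mag = alpha * (1.0 / d - 1.0 / d0)   # repulsion, grows as d -> 0
    else:
        mag = -alpha * (d - d0)              # mild attraction back toward d0
    return mag * (x_ij / d)
```

Summing such pairwise forces over all neighbors, plus the virtual-leader and dissipative terms, yields the control law u_i described above.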

Example Model 2: Reynolds Boid Model
In this model [8,13], for radius r and j = 1, . . . , N, define the neighbors of agent i at position x_i ∈ R^n by the set N_i = { j ≠ i : ||x_j − x_i|| ≤ r }. Swarm control is designated by three forces: alignment of velocity vectors with those of neighbors, cohesion of the swarm toward the neighborhood centroid, and separation between agents, weighted by positive constant parameters w_al, w_coh, and w_sep, respectively.
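A minimal sketch of the three forces, under the standard Reynolds rules (the particular normalizations, e.g., the inverse-square separation weighting, are our assumptions rather than the paper's displayed equations):

```python
import numpy as np

def boid_force(i, X, V, r=2.0, w_al=1.0, w_coh=1.0, w_sep=1.0):
    """Reynolds-style control force on agent i.

    X, V: (N, n) arrays of positions and velocities. Alignment matches
    the neighbors' mean velocity, cohesion steers toward their centroid,
    and separation pushes away with inverse-square distance weighting.
    """
    d = np.linalg.norm(X - X[i], axis=1)
    nbrs = [j for j in range(len(X)) if j != i and d[j] <= r]
    if not nbrs:
        return np.zeros(X.shape[1])
    align = V[nbrs].mean(axis=0) - V[i]                 # velocity matching
    coh = X[nbrs].mean(axis=0) - X[i]                   # pull toward centroid
    sep = sum((X[i] - X[j]) / d[j]**2 for j in nbrs)    # short-range repulsion
    return w_al * align + w_coh * coh + w_sep * sep
```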

Adversarial Swarm Models
The previous subsection provides several examples of inner swarm cooperative forces, F_S. In order to enable adversarial behavior and defense, these inner swarm cooperative forces need to be supplemented by additional forces of exogenous input into the collective. As written, the above cooperative swarming models neither respond to outside agents nor 'attack' (swarm towards) a specific target. We, thus, supplement the control laws above with two additional forces. The first we refer to as F_HVU: the goal of the swarm, in this paper, is limited to tracking an HVU. An example of F_HVU is provided in the example of Section 5, in Equation (28).
We also supplement by an adversarial force, which we refer to as F D . The review [7] discusses several approaches to adversarial control. Examples include containment strategies modeled after dolphins [14], sheep-dogs [15,16], and birds of prey [17]. In [18], the authors studied the interaction between two swarms, one of which could be considered adversarial. In these examples of adversarial swarm control, the mechanism of interaction and defense is provided through the swarm's own pursuit and evasion responses. This indirectly uses the swarm's own response strategy against it-an approach which can be termed 'herding'.
In addition to herding reactions, one can consider more direct additional forces of disruption, to model neutralizing swarm agents and/or physically remove them from the swarm. One form this can take, for example, is the removal of agents from the communications network, as considered in [19]. Another approach is taken in [20], which uses survival probabilities based on damage attrition. Defenders and the attacking swarm engage in mutual damage attrition while the swarm also damages the HVU when in proximity to it. Probable damage between agents is tracked as damage rates over time, where the rate of damage is based on features such as distance between agents and angle of attack. The damage rate at time t provides the probability of a successful 'hit' in time period [t, t + ∆t]. The probability of agent survival can be modeled based on the aggregate number of hits it takes to incapacitate the agent. The authors of [20] provide derivations for multiple possibilities, such as single-shot destruction and N-shot destruction. These probabilities take the form of ODE equations. Tracking survival probabilities thus adds an additional state to the dynamics of each agent-a survival probability state.
We, thus, summarize a control scheme with HVU target-tracking and herding driven by the reactive forces of collision avoidance with the defenders as the following, for HVU state y_0 and defender states y_k, k = 1, . . . , K:

u_i = F_S + F_HVU(x_i, y_0) + F_D(x_i, y_1, . . . , y_K).

Example Attrition Model: Single-Shot Destruction
From [20]: let P_0(t) be the probability the HVU has survived up to time t; P_k(t), k = 1, . . . , K, the probability defender k has survived; and Q_j(t), j = 1, . . . , N, the probability swarm attacker j has survived. Let d_y^{j,k}(x_j(t), y_k(t)) be the damage the defender y_k inflicts on swarm attacker x_j and let d_x^{k,j}(y_k(t), x_j(t)) be the damage the swarm attacker x_j inflicts on the defender y_k, with the HVU represented by k = 0.
Then the survival probabilities for attackers and defenders from single-shot destruction are given by the coupled ODEs:

Ṗ_k(t) = −P_k(t) Σ_{j=1}^{N} Q_j(t) d_x^{k,j}(y_k(t), x_j(t)), k = 0, . . . , K,
Q̇_j(t) = −Q_j(t) Σ_{k=1}^{K} P_k(t) d_y^{j,k}(x_j(t), y_k(t)), j = 1, . . . , N.
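These coupled survival dynamics can be stepped numerically alongside the agent states. The following explicit-Euler sketch reflects our reading of the single-shot model of [20]; the weighting of each side's hazard by the opponents' current survival probabilities is the assumption to note:

```python
import numpy as np

def survival_step(P, Q, d_def, d_atk, dt):
    """One explicit-Euler step of single-shot survival probabilities.

    P: (K+1,) survival probabilities of the HVU (index 0) and defenders.
    Q: (N,)   survival probabilities of the attackers.
    d_def[j, k]: damage rate defender k inflicts on attacker j.
    d_atk[k, j]: damage rate attacker j inflicts on defender/HVU k.
    The HVU is assumed unarmed, so only P[1:] enters the attackers' hazard.
    """
    dP = -P * (d_atk @ Q)        # hazard on k: sum_j d_atk[k, j] * Q[j]
    dQ = -Q * (d_def @ P[1:])    # hazard on j: sum_k d_def[j, k] * P[k]
    return P + dt * dP, Q + dt * dQ
```

In practice, the damage-rate matrices are re-evaluated at each step from the current agent positions, coupling these probability states to the swarm and defender dynamics.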

Problem Formulation
The above models depend on a large number of parameters. The dynamic swarming model coupled with attrition functions results in over a dozen key parameters, and many more would result from a non-homogeneous swarm. A concern is that this adds too much model specificity, making optimal defense strategies lack robustness due to sensitivity to the specific set of model parameters. This concern turns out to be justified. When defense strategies are optimized for fixed, nominal parameter values, they display catastrophic failure for small perturbations of certain parameters, as can be seen in Figure 1. In fact, the plots included in Figure 1 clearly demonstrate that the sensitivity of the cost with respect to the uncertain parameters is highly non-linear. Thus, generating robust defense strategies requires the more sophisticated formalism introduced in Section 3.1.

Uncertain Parameter Optimal Control
The class of problems addressed by the computational algorithm, referred to as Problem P, is defined as follows: minimize, over admissible controls u, the cost functional

J[x(·, ·), u(·)] = ∫_Θ [ F(x(T, θ), θ) + ∫_0^T r(x(t, θ), u(t), t, θ) dt ] dθ,

subject to the dynamics ẋ(t, θ) = f(x(t, θ), u(t), t, θ), the initial condition x(0, θ) = x_0(θ), and the control constraint u(t) ∈ U. The states x(·, θ) belong to the Sobolev space of all essentially bounded functions with essentially bounded distributional derivatives, and F, r, and f denote the endpoint cost, running cost, and dynamics functions, respectively. Additional conditions imposed on the state and control space and component functions are specified in Appendix A. In Problem P, the set Θ is the domain of a parameter θ ∈ R^{n_θ}. The format of the cost functional is that of the integral over Θ of a Mayer-Bolza type cost with parameter θ. This parameter can represent a range of values for a feature of the system, such as in ensemble control [21], or a stochastic parameter with a known probability density function.
For computation of numerical solutions, we introduce an approximation of Problem P, referred to as Problem P_M. Problem P_M is created by approximating the parameter space, Θ, with a numerical integration scheme. This numerical integration scheme is well defined given certain function smoothness assumptions; see Appendix A, Assumption A1, for formal assumptions. Throughout the paper, M is used to denote the number of nodes used in this approximation of parameter space. For a given set of nodes {θ_i^M}, let x_i^M(t), i = 1, . . . , M, be defined as the solution to the ODE created by the state dynamics of Problem P evaluated at θ_i^M. Let X̄^M denote the vector of these nodal states. The system of ODEs defining X̄^M has dimension n_x × M, where n_x is the dimension of the original state space and M is the number of nodes. The numerical integration scheme for parameter space creates an approximate objective functional, defined by the weighted sum over nodes of the Mayer-Bolza costs evaluated at θ_i^M. In [4], the consistency of P_M is proved. This is the property that, if optimal solutions to Problem P_M converge as the number of nodes M → ∞, they converge to feasible, optimal solutions of Problem P. See [4] for detailed proof and assumptions.
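The construction of Problem P_M can be illustrated on a toy problem whose per-node states have closed form: scalar dynamics ẋ = −θx, x(0) = 1, with endpoint cost x(T), and Gauss-Legendre nodes over Θ = [a, b]. The quadrature family is an illustrative choice, not one prescribed by the framework:

```python
import numpy as np

def expected_cost(M, T=1.0, a=0.5, b=1.5):
    """Quadrature-in-parameter-space sketch of Problem P_M.

    Each node theta_i carries its own copy of the state ODE (solved in
    closed form here as x(T) = exp(-theta * T)); the weighted sum of the
    per-node endpoint costs approximates the integral over Theta.
    """
    nodes, weights = np.polynomial.legendre.leggauss(M)
    theta = 0.5 * (b - a) * nodes + 0.5 * (a + b)   # map [-1, 1] -> [a, b]
    w = 0.5 * (b - a) * weights
    x_T = np.exp(-theta * T)                        # per-node state at t = T
    return np.sum(w * x_T)
```

For this smooth integrand, a handful of nodes already reproduces the exact value ∫_{0.5}^{1.5} e^{−θ} dθ to high accuracy, which is the behavior that makes small M viable in the swarm problem.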

Computational Efficiency
The computation time of the numerical solution to the discretized problem defined in Equations (16) and (17) will depend on the value of M. Ideally, it should be sufficiently small so as to allow for a fast solution. On the other hand, a value of M that is too small will result in a solution that is not particularly useful, i.e., too far from the optimal. Naturally, the question arises: how far is a particular solution from the optimal? One tool for assessing this lies in computing the Hamiltonian and is addressed in Section 4.

Consistency of Dual Variables
The dual variables provide a method to determine the solution of an optimal control problem or a tool to validate a numerically computed solution. For numerical schemes based on direct discretization of the control problem, analyzing the properties of the dual variables and their resultant Hamiltonian may also lead to insight into the validity of an approximation scheme [22,23]. This can be especially helpful in high-dimensional problems, such as swarming, where parsimonious discretization is crucial to computational tractability.
Previous work shows the consistency of the primal variables of approximate Problem P_M with the original parameter uncertainty framework of Problem P. Here, we build on that and prove the consistency of the dual problem of Problem P as well. This theoretical contribution is diagrammed in Figure 2. The consistency of the dual problem in parameter space enables approximate computation of the Hamiltonian from numerical solutions. In [24], necessary conditions for Problem P were established. These conditions are as follows:

Problem P_λ ([24], pp. 80-82). If (x*, u*) is an optimal solution to Problem P, then there exists an absolutely continuous costate vector λ*(t, θ) such that, for θ ∈ Θ,

λ̇*(t, θ) = −∂H/∂x (x*(t, θ), λ*(t, θ), u*(t), t, θ), λ*(T, θ) = ∂F/∂x (x*(T, θ), θ),

where H is defined as:

H(x, λ, u, t, θ) = r(x, u, t, θ) + λᵀ f(x, u, t, θ).     (19)

Furthermore, the optimal control u* satisfies

u*(t) ∈ arg min_{u ∈ U} H(x*, λ*, u, t),

where H is given by

H(x, λ, u, t) = ∫_Θ H(x(t, θ), λ(t, θ), u, t, θ) dθ.     (20)

Because Problem P_M is a standard non-linear optimal control problem, it admits a dual problem as well: Problem P_Mλ, provided by the Pontryagin minimum principle (a survey of minimum principle conditions is given by [25]). Applied to P_M, this generates a costate vector Λ^M that satisfies the analogous costate dynamics, transversality, and minimization conditions, where H̃^M, the Hamiltonian of the discretized problem, is the quadrature-weighted sum of the nodal Hamiltonians.

An alternate direction from which to approach solving Problem P overall is to approximate the necessary conditions of Problem P, i.e., Problem P_λ, directly rather than to approximate Problem P. This creates, for i = 1, . . . , M, the system of equations obtained by evaluating the costate dynamics and transversality conditions at the quadrature nodes θ_i^M, with H defined as in Equation (19). This system of equations can be re-written in terms of the quadrature approximation of the stationary Hamiltonian defined in Equation (20). Define the nodal costates λ̃_i^M analogously to the semi-discretized states of Equation (16); Equation (23) can then be written in nodal form for i = 1, . . . , M. Thus, we reach the following discretized dual problem, Problem P_λM, where H̃^M is defined as the quadrature approximation of Equation (20).

Theorem 1. The Hamiltonian H̃^M of Problem P_λM, as defined by Equation (26), converges to the Hamiltonian H of Problem P, as defined by Equation (20). The proof of this theorem can be found in Appendix B.
The convergence of the Hamiltonians of the approximate, standard control problems to the Hamiltonian of the general problem, H(x^∞, λ^∞, u^∞, t), means that many of the useful features of the Hamiltonians of standard optimal control problems are preserved. For instance, it is straightforward to show that the satisfaction of Pontryagin's minimum principle by the approximate Hamiltonians implies minimization of H(x^∞, λ^∞, u^∞, t) as well; that is, H(x^∞, λ^∞, u^∞, t) ≤ H(x^∞, λ^∞, u, t) for all feasible u. Furthermore, when applicable, the stationarity properties of the standard control Hamiltonian, such as a constant-valued Hamiltonian in time-invariant problems, or stationarity with respect to u(t) in problems with open control regions, are also preserved.
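The verification idea, checking that a computed Hamiltonian is constant in time and stationary in u, can be demonstrated on a standard scalar example (a minimum-energy transfer unrelated to the swarm model, chosen because its optimal control and costate are known in closed form):

```python
import numpy as np

# Minimum-energy transfer: min ∫ u^2/2 dt, subject to x' = u, x(0) = 0, x(1) = 1.
# The optimal solution is u* = 1 with constant costate λ* = -1.
def hamiltonian(lam, u):
    # H = r + λ f with r = u^2/2 and f = u; the state does not enter H here.
    return 0.5 * u**2 + lam * u

t = np.linspace(0.0, 1.0, 101)
u_star = np.ones_like(t)
lam_star = -np.ones_like(t)
H = hamiltonian(lam_star, u_star)   # evaluated along the optimal trajectory
```

For a time-invariant problem such as this one, H should be constant in t (here identically −1/2), and stationarity ∂H/∂u = u + λ = 0 should hold pointwise; departures from such behavior in a computed solution flag an inadequate discretization, which is how M is selected in Section 5.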

Numerical Example
In a slight refashioning of the notation of Section 2.2, Equation (12), let the parameter vector θ be defined by all the unknown parameters defining the interaction functions. Assuming a prior distribution φ(θ) over these unknowns and parameter bounds Θ, we construct the following optimal control problem for robustness against the unknown parameters.
Problem SD (Swarm Defense). For K defenders and N attackers, determine the defender controls u_k(t) that minimize the expected probability of HVU destruction,

∫_Θ (1 − P_0(T, θ)) φ(θ) dθ,

subject to the swarm and defender dynamics and the survival-probability ODEs of the preceding sections, for swarm attackers j = 1, . . . , N and controlled defenders k = 1, . . . , K.
We implement Problem SD for both swarm models in Section 2.1, for a swarm of N = 100 attackers and K = 10 defenders.

Example Model 1: Virtual Body Artificial Potential
The cooperative swarm forces F_S are defined with the Virtual Body Artificial Potential of Section 2.1 with parameters α, d_0, and d_1. In lieu of a potential for the virtual leaders, we assign an HVU tracking function attracting each agent toward the HVU position y_0 ∈ R^3. The dissipative force f_i^v = −K_2 ẋ_i is employed to guarantee stability of the swarm system. K_1 and K_2 are positive constants. The swarm's collision avoidance response to the defenders is defined by Equation (4) with parameters α_h, h_0, and h_1. Since there is only a repulsive force between swarm members and defenders, not an attractive force, we set h_1 = h_0. For attrition, we use the damage function defined in Equation (21) of [20], in which Φ, the cumulative normal distribution, applied to the 2-norm distance between agents, smoothly penalizes proximity, with the impact decreasing with distance. The parameters λ, F, a, and σ shape the steepness of this function and the decline of damage over distance. For the damage rate defenders inflict on attackers, we calibrate with the parameters λ_D, σ_D. For the damage rate attackers inflict on defenders, we calibrate with the parameters λ_A, σ_A. In both cases, the parameters F and a in [20] are set to F = 0, a = 1. Table 1 provides the parameter values that remain fixed in each simulation, and Table 2 provides the parameters we consider as uncertain. We first use the nominal parameter values provided in Tables 1 and 2 to find a nominal solution: defender trajectories that result in the minimum probability of HVU destruction. With the results of these simulations as a reference point, we consider as uncertain each of the parameters that define the attacker swarm model and weapon capabilities. In this simulation, these parameters are considered individually. The number of discretization nodes for parameter space was chosen by examination of the Hamiltonian.
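A hedged sketch of such a distance-based damage rate, with F = 0 and a = 1 as above (the exact argument of Φ in Equation (21) of [20] is not reproduced here, so the profile below is an assumption):

```python
import numpy as np
from math import erf, sqrt

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def damage_rate(x, y, lam=1.0, a=1.0, sigma=0.5):
    """Smooth proximity-based damage rate between two agents.

    Assumed form: lam * Phi((a - ||x - y||_2) / sigma). The rate is near
    lam at zero separation, lam/2 at separation a, and decays smoothly
    to zero as the separation grows; sigma sets the steepness.
    """
    dist = np.linalg.norm(np.asarray(x) - np.asarray(y))
    return lam * norm_cdf((a - dist) / sigma)
```

The pairs (λ_D, σ_D) and (λ_A, σ_A) in the text correspond to choosing (lam, sigma) separately for defender-on-attacker and attacker-on-defender damage.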
To illustrate this method and the results obtained in Section 4, we compute Hamiltonians for Problem SD and Model 1 with θ = d_0, d_0 ∈ [0.5, 1.5], and M = 5, 8, 11. As M increases, the sequence of Hamiltonians should converge to the optimal Hamiltonian for Problem SD; for Problem SD, that should be a constant, zero-valued Hamiltonian. Figure 3 shows the respective Hamiltonians for M = 5, 8, 11. The value M = 11 was chosen for simulations, based on the approximately zero-valued Hamiltonian it generates. We compare the performance of the solution generated using the uncertain parameter optimal control Problem SD versus a solution obtained with the nominal values. Figure 4 shows the nominal solution trajectories. The comparative results of the nominal solutions versus the uncertain parameter control solutions are shown in Figure 5, where the performance of each is shown for different parameter values. As seen in Figure 5, the trajectories generated by optimization using the nominal values perform poorly over a range of α, d_0, σ_A, α_h, and h_0. In the case of h_0, for example, this is because the attackers are less repelled by the defenders when h_0 is decreased; similarly, they are more able to destroy the HVU from a longer distance as σ_A is increased. The parameter uncertainty solution, however, demonstrates that, using the uncertain parameter optimal control framework, a solution can be provided which is robust over a range of parameter values. We contrast these results with the case of uncertain parameters d_1 and λ_A, also shown in Figure 5. It can be seen that robustness improvements are modest to non-existent for these parameters. This suggests an insensitivity of the problem to the d_1 and λ_A parameters. This kind of analysis can be used to guide inference and observability priorities.

Example Model 2: Reynolds Boid Model
To demonstrate the flexibility of the proposed framework to include diverse swarm models, we have applied the same analysis as in Section 5.1 to the Reynolds Boid Model introduced in Section 2.1. We apply the same HVU tracking function as Equation (28). The herding force F_D of the defenders repelling attackers is applied as a separation force in the form of Equation (10). The fixed parameter values are the same as those in Table 1; the uncertain parameters and ranges are given in Table 3. The results are shown in Figure 6. Again, we see that the tools developed in this paper can be used to gain insight into the robustness properties of the nominal versus uncertain parameter solutions. For example, we can see that the uncertain parameter solutions perform much better than the nominal ones for the cases where λ, σ, and w_I are uncertain.

Conclusions
In this paper, we have built on our previous work on developing an efficient numerical framework for solving uncertain parameter optimal control problems. Unlike uncertainties introduced into systems due to stochastic "noise", parameter uncertainties do not average or cancel out in regard to their effects. Instead, each possible parameter value creates a specific profile of possibility and risk. The uncertain optimal control framework which has been developed for these problems exploits this inherent structure by producing answers which have been optimized over all parameter profiles. This approach takes into account the possible performance ranges due to uncertainty, while also utilizing what information is known about the uncertain features, such as parameter domains and prior probability distributions over the parameters. Thus, we are able to contain risk, while providing plans which have been optimized for performance under all known conditions. The results reported in this paper include analysis of the consistency of the adjoint variables of the numerical solution. In addition, the paper includes a numerical analysis of a large scale adversarial swarm engagement that clearly demonstrates the benefits of using the proposed framework.
There are many directions for future work on the topics of this paper. The numerical simulations in this paper consider the parameters individually, as one-dimensional parameter spaces. However, Problem P allows for multi-dimensional parameter spaces. A more dedicated implementation, taking advantage of the parallelizable form of Equation (16), for example, could certainly manage several simultaneous parameters. Exponential growth as the parameter space dimension increases is an issue for both the quadrature format of Equation (15) and the handling of the state space size for Equation (16). This can be somewhat mitigated by using sparse grid methods for high-dimensional integration to define the nodes in Equation (15). For large enough sizes, Monte Carlo sampling, rather than quadrature, might be more appropriate for designating parameter nodes.
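The node-count trade-off described above can be made concrete. A tensor-product quadrature grid with a fixed number of nodes per dimension grows exponentially with the parameter dimension, whereas a Monte Carlo design keeps a fixed budget (this sketch concerns node placement only; the per-node ODE systems are unchanged):

```python
import numpy as np

rng = np.random.default_rng(0)

def tensor_grid_size(nodes_per_dim, d):
    """Node count of a full tensor-product quadrature grid in d dimensions."""
    return nodes_per_dim ** d

def mc_nodes(budget, d, lo=0.5, hi=1.5):
    """Monte Carlo parameter nodes over the box [lo, hi]^d.

    Returns uniform sample points and equal weights summing to the
    box volume, so the weighted sum plays the same role as the
    quadrature rule in the approximate objective.
    """
    theta = rng.uniform(lo, hi, size=(budget, d))
    w = np.full(budget, (hi - lo) ** d / budget)
    return theta, w
```

For example, 5 nodes per dimension in a 4-dimensional parameter space already requires 625 coupled copies of the swarm ODE, while a Monte Carlo design can cap the count at any chosen budget at the cost of slower convergence.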
A further direction for future work would be to incorporate these methods into the design of more responsive closed-loop control solutions. The optimization methods in this paper provide open-loop controls. While useful, closed-loop controls would be preferable for dynamic situations with uncertainty. There are many ways, however, that open-loop solutions can provide stepping stones to developing closed-loop solutions. For instance, Ref. [26] utilizes open-loop solutions to train a neural network to learn an optimal closed-loop control strategy. Open-loop solutions can also be used to provide initial guesses to discretized closed-loop optimizations, seeding the optimization algorithm.
Another direction for future work is in the greater application of the duality results of Section 4. The numerical results in this paper simply utilize the Hamiltonian consistency. The proof of Theorem 1, however, additionally demonstrates the consistency of the adjoint variables for the problem. As the results demonstrate, parameter sensitivity for these swarm models is highly non-linear. The numerical solutions of Section 5 are able to demonstrate this sensitivity by applying the solution to varied parameter values. However, this is actually a fairly expensive method for a large swarm, as it involves re-evaluation of the swarm ODE for each parameter value. More importantly, it would not be scalable to high-dimensional parameter spaces, as the exponential growth of that approach to sensitivity analysis would be unavoidable. The development of an analytical adjoint sensitivity method for this problem could be of great utility for paring down numerical simulations to only focus on the parameters most relevant to success.

Appendix B
Lemma A1. Let (x^∞(t, θ), λ^∞(t, θ)) be the solution to the system defined by the dynamics of Problem P, with x^∞(0, θ) = x_0(θ), and the costate conditions of Problem P_λ, where H is defined as per Equation (19), and let {(x^M(t, θ), λ^M(t, θ))} for M ∈ V be the sequence of solutions to the corresponding dynamical systems generated by the controls u^M. Then, the sequence {(x^M(t, θ), λ^M(t, θ))} converges pointwise to (x^∞(t, θ), λ^∞(t, θ)), and this convergence is uniform in θ.
Proof. The convergence of {x^M(t, θ)} is given by Lemmas 3.4 and 3.5 of [4]. The convergence of the sequence of solutions {λ^M(t, θ)} is guaranteed by the optimality of {u^M}. The convergence of {λ^M(t, θ)} then follows the same arguments, given the convergence of {x^M(t, θ)}, utilizing the regularity assumptions placed on the derivatives of F, r, and f with respect to x to enable the use of Lipschitz conditions on the costate dynamics and transversality conditions.

Remark A1. Note that λ^M(t, θ) is not a costate of Problem P_λM, since it is a function of θ. However, when θ = θ_i^M, then λ^M(t, θ_i^M) = λ̃_i^M(t), where λ̃_i^M is the costate of Problem P_λM generated by the pair of solutions to Problem P_M, (x_i^M, u*^M). In other words, the function λ^M(t, θ) matches the costate values at all collocation nodes. Since these values satisfy the dynamics equations of Problem P_λM, a further implication is that the values of λ^M(t, θ_i^M) produce feasible solutions to Problem P_λM.
Remark A2. Since the functions {(x^M(t, θ), λ^M(t, θ))} obey the respective identities x^M(t, θ_i^M) = x_i^M(t) and λ^M(t, θ_i^M) = λ̃_i^M(t), their convergence to (x^∞(t, θ), λ^∞(t, θ)) also implies the convergence of the sequence of discretized primals and duals, {X̄^M} and {Λ̄^M}, to accumulation points given by the corresponding nodal relations.

We now prove Theorem 1. Let {(x^M(t, θ), λ^M(t, θ))} for M ∈ V be the sequence of solutions defined by Equation (A3) and let (x^∞(t, θ), λ^∞(t, θ)) be the accumulation functions defined by Equation (A1). Incorporating Remarks A1 and A2, we may evaluate H̃^M along these nodal solutions. Due to the results of Lemma A1, and applying Remark 1 of [4] on the convergence of the quadrature scheme for uniformly convergent sequences of continuous functions, we find that H̃^M converges to the Hamiltonian H of Equation (20), which completes the proof.