This section provides some preliminaries and briefly discusses the problem formulation. First, the dynamics of the networked multi-UAS are given; then, the flock topology is modeled by means of a dynamic graph. Next, the network-induced delay and the BELBIC model are introduced, and, finally, the problem is formulated.

#### 2.1. Flock Modelling

Assuming that the flock moves in an $m$-dimensional space $(m=2,3)$, the motion of the $i$th agent with continuous-time double-integrator dynamics can be described by the following set of equations:

$$\begin{cases}\dot{q}_{i}(t)=p_{i}(t)\\ \dot{p}_{i}(t)=u_{i}(t)\end{cases}\qquad i=1,2,\dots ,n,$$

where ${u}_{i}(t)\in \mathbb{R}^{m}$ is the control input, and ${q}_{i}(t),{p}_{i}(t)\in \mathbb{R}^{m}$ are the position and velocity of the $i$th agent, respectively. Consider a dynamic graph $\mathcal{G}(\upsilon ,\epsilon (t))$ that consists of a set of vertices $\upsilon =\{1,2,\dots ,n\}$ and edges $\epsilon (t)\subseteq \{(i,j):i,j\in \upsilon ,j\ne i\}$. Each vertex represents an agent of the flock, while an edge represents a communication link between a pair of agents. ${N}_{i}^{\alpha}(t)=\{j\in {\upsilon}_{\alpha}:\parallel {q}_{j}(t)-{q}_{i}(t)\parallel <r,\ j\ne i\}$ is the neighborhood set of agent $i$, where the positive constant $r$ is the range of interaction between agents $i$ and $j$, and $\parallel \cdot \parallel$ is the Euclidean norm in $\mathbb{R}^{m}$. Solving the set of algebraic conditions $\parallel {q}_{j}(t)-{q}_{i}(t)\parallel =d\quad \forall j\in {N}_{i}^{\alpha}(t)$ describes the geometric model of the flock, i.e., the $\alpha$-lattice [3], where the positive constant $d$ is the distance between two neighbors $i$ and $j$.
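As a minimal sketch of the two ingredients above, the following Python snippet advances one agent of the double-integrator model by an explicit Euler step and builds the neighborhood set ${N}_{i}^{\alpha}$ from a list of positions. The function names, the Euler discretization, the step size `dt`, and the sample positions are illustrative assumptions, not part of the paper.

```python
import math

def step_double_integrator(q, p, u, dt):
    """One explicit-Euler step of q' = p, p' = u for a single agent (assumed discretization)."""
    q_next = [qk + dt * pk for qk, pk in zip(q, p)]
    p_next = [pk + dt * uk for pk, uk in zip(p, u)]
    return q_next, p_next

def neighbors(i, positions, r):
    """Indices j != i with ||q_j - q_i|| < r, i.e., the set N_i^alpha."""
    return [j for j, qj in enumerate(positions)
            if j != i and math.dist(qj, positions[i]) < r]

# Three agents in the plane (m = 2); only agent 1 lies within range of agent 0.
positions = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
print(neighbors(0, positions, r=1.5))
q_next, p_next = step_double_integrator([0.0, 0.0], [1.0, 0.0], [0.0, 2.0], dt=0.1)
```

Because the neighborhood depends on the current positions, the edge set $\epsilon(t)$ changes as the agents move, which is why the graph is called dynamic.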

To avoid the singularity of the collective potential function at ${q}_{i}(t)={q}_{j}(t)$, the $\sigma$-norm (i.e., ${\parallel \cdot \parallel}_{\sigma}$) is defined, where ${\parallel z\parallel}_{\sigma}=\frac{1}{\varepsilon}[\sqrt{1+\varepsilon {\parallel z\parallel}^{2}}-1]$, and $\varepsilon$ is a positive constant. To resolve the singularity problem, the set of algebraic conditions can be rewritten as: ${\parallel {q}_{j}(t)-{q}_{i}(t)\parallel}_{\sigma}={d}_{\alpha}\quad \forall j\in {N}_{i}^{\alpha}(t)$.
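The $\sigma$-norm can be sketched in a few lines of Python. The value of the positive constant (`eps = 0.1`) and the function names are assumptions chosen for illustration. The gradient is included because it stays well defined at $z=0$, which is the reason for replacing the Euclidean norm. The last line assumes ${d}_{\alpha}={\parallel d\parallel}_{\sigma}$, by analogy with ${r}_{\alpha}={\parallel r\parallel}_{\sigma}$ used later in this section.

```python
import math

def sigma_norm(z, eps=0.1):
    """sigma-norm of a vector z: (sqrt(1 + eps * ||z||^2) - 1) / eps."""
    sq = sum(c * c for c in z)
    return (math.sqrt(1.0 + eps * sq) - 1.0) / eps

def sigma_norm_grad(z, eps=0.1):
    """Gradient of the sigma-norm, z / sqrt(1 + eps * ||z||^2); finite at z = 0."""
    sq = sum(c * c for c in z)
    return [c / math.sqrt(1.0 + eps * sq) for c in z]

print(sigma_norm([3.0, 4.0]))       # sigma-norm of a vector with Euclidean norm 5
print(sigma_norm_grad([0.0, 0.0]))  # no singularity at the origin
d_alpha = sigma_norm([7.0])         # scalar distance d = 7 mapped to d_alpha (assumed value)
```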

A smooth collective potential function $V\left(q\right)=\frac{1}{2}{\sum}_{i}{\sum}_{j\ne i}{\psi}_{\alpha}\left({\parallel {q}_{j}\left(t\right)-{q}_{i}\left(t\right)\parallel}_{\sigma}\right)$ can be obtained by considering the above-mentioned constraints, where ${\psi}_{\alpha}\left(z\right)={\int}_{{d}_{\alpha}}^{z}{\varphi}_{\alpha}\left(s\right)ds$ is a smooth pairwise potential function with ${\varphi}_{\alpha}\left(z\right)={\rho}_{h}(z/{r}_{\alpha})\varphi (z-{d}_{\alpha})$, $\varphi \left(z\right)=\frac{1}{2}[(a+b){\sigma}_{1}(z+c)+(a-b)]$, and ${\sigma}_{1}\left(z\right)=z/\sqrt{1+{z}^{2}}$.

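A possible choice for the scalar bump function ${\rho}_{h}(z)$, which varies smoothly between 0 and 1, is [3]:

$${\rho}_{h}(z)=\begin{cases}1, & z\in [0,h)\\ \frac{1}{2}\left[1+\cos \left(\pi \frac{z-h}{1-h}\right)\right], & z\in [h,1]\\ 0, & \text{otherwise,}\end{cases}$$

where $h\in (0,1)$.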
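The bump function and the pairwise potential can be checked numerically with the sketch below. It uses the piecewise bump function of [3] (equal to 1 on $[0,h)$, a raised cosine on $[h,1]$, and 0 elsewhere) and integrates ${\varphi}_{\alpha}$ with a trapezoidal rule to approximate ${\psi}_{\alpha}$. The constants `H`, `A`, `B`, the choice $c=|a-b|/\sqrt{4ab}$ (taken from [3] so that $\varphi(0)=0$), and the sample values of ${r}_{\alpha}$ and ${d}_{\alpha}$ are assumptions for illustration only.

```python
import math

H, A, B = 0.2, 5.0, 5.0                  # illustrative constants (assumed)
C = abs(A - B) / math.sqrt(4.0 * A * B)  # offset c as chosen in [3], so that phi(0) = 0

def bump(z, h=H):
    """Bump function rho_h(z) that decays smoothly from 1 to 0 on [h, 1]."""
    if 0.0 <= z < h:
        return 1.0
    if h <= z <= 1.0:
        return 0.5 * (1.0 + math.cos(math.pi * (z - h) / (1.0 - h)))
    return 0.0

def sigma_1(z):
    return z / math.sqrt(1.0 + z * z)

def phi(z):
    return 0.5 * ((A + B) * sigma_1(z + C) + (A - B))

def phi_alpha(z, r_alpha, d_alpha):
    """Action function: rho_h(z / r_alpha) * phi(z - d_alpha)."""
    return bump(z / r_alpha) * phi(z - d_alpha)

def psi_alpha(z, r_alpha, d_alpha, steps=1000):
    """Pairwise potential: integral of phi_alpha from d_alpha to z (trapezoidal rule)."""
    ds = (z - d_alpha) / steps
    total = 0.5 * (phi_alpha(d_alpha, r_alpha, d_alpha) + phi_alpha(z, r_alpha, d_alpha))
    total += sum(phi_alpha(d_alpha + k * ds, r_alpha, d_alpha) for k in range(1, steps))
    return total * ds

r_alpha, d_alpha = 18.0, 14.0  # sample sigma-norm range and spacing (assumed)
for z in (10.0, 14.0, 17.0):
    print(z, psi_alpha(z, r_alpha, d_alpha))
```

In this sketch the potential vanishes at $z={d}_{\alpha}$ and is positive on either side, which is the property that drives neighboring agents toward the lattice spacing.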

The flocking control algorithm introduced in [3], ${u}_{i}(t)={u}_{i}^{\alpha}+{u}_{i}^{\beta}+{u}_{i}^{\gamma}$, consists of three main terms:

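- (i).
${u}_{i}^{\alpha}$ is the interaction component between two $\alpha$-agents and is defined as follows:

$${u}_{i}^{\alpha}={c}_{1}^{\alpha}\sum_{j\in {N}_{i}^{\alpha}}{\varphi}_{\alpha}\left({\parallel {q}_{j}-{q}_{i}\parallel}_{\sigma}\right){\mathit{n}}_{i,j}+{c}_{2}^{\alpha}\sum_{j\in {N}_{i}^{\alpha}}{a}_{ij}(q)({p}_{j}-{p}_{i}),$$

where ${c}_{1}^{\alpha}$ and ${c}_{2}^{\alpha}$ are positive constants. The terms ${\mathit{n}}_{i,j}$ and ${a}_{ij}(q)$ are a vector and the elements of the spatial adjacency matrix $A(q)$, respectively, which are described as follows:

$${\mathit{n}}_{i,j}=\frac{{q}_{j}-{q}_{i}}{\sqrt{1+\varepsilon {\parallel {q}_{j}-{q}_{i}\parallel}^{2}}},\qquad {a}_{ij}(q)={\rho}_{h}\left({\parallel {q}_{j}-{q}_{i}\parallel}_{\sigma}/{r}_{\alpha}\right),\quad j\ne i,$$

where ${r}_{\alpha}={\parallel r\parallel}_{\sigma}$, and ${a}_{ii}(q)=0$ for all $i$ and $q$.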

- (ii).
${u}_{i}^{\beta}$ is the interaction component between the $\alpha$-agent and an obstacle (named the $\beta$-agent) and is defined as follows:

$${u}_{i}^{\beta}={c}_{1}^{\beta}\sum_{k\in {N}_{i}^{\beta}}{\varphi}_{\beta}\left({\parallel {\widehat{q}}_{i,k}-{q}_{i}\parallel}_{\sigma}\right){\widehat{\mathit{n}}}_{i,k}+{c}_{2}^{\beta}\sum_{k\in {N}_{i}^{\beta}}{b}_{i,k}(q)({\widehat{p}}_{i,k}-{p}_{i}),$$

where ${c}_{1}^{\beta}$ and ${c}_{2}^{\beta}$ are positive constants, and ${\widehat{q}}_{i,k}$ and ${\widehat{p}}_{i,k}$ are the position and velocity of the $k$th obstacle (i.e., $\beta$-agent), respectively. The terms ${\widehat{\mathit{n}}}_{i,k}$ and ${b}_{i,k}(q)$ are a vector and the elements of the heterogeneous adjacency matrix $B(q)$, respectively, which are defined as follows:

$${\widehat{\mathit{n}}}_{i,k}=\frac{{\widehat{q}}_{i,k}-{q}_{i}}{\sqrt{1+\varepsilon {\parallel {\widehat{q}}_{i,k}-{q}_{i}\parallel}^{2}}},\qquad {b}_{i,k}(q)={\rho}_{h}\left({\parallel {\widehat{q}}_{i,k}-{q}_{i}\parallel}_{\sigma}/{d}_{\beta}\right).$$

Here, ${\varphi}_{\beta}(z)={\rho}_{h}(z/{d}_{\beta})({\sigma}_{1}(z-{d}_{\beta})-1)$ is a repulsive action function and ${N}_{i}^{\beta}=\{k\in {\upsilon}_{\beta}:\parallel {\widehat{q}}_{i,k}-{q}_{i}\parallel <{r}^{\prime}\}$ is the set of $\beta$-neighbors of an $\alpha$-agent $i$, where the positive constant ${r}^{\prime}$ is the range of interaction of an $\alpha$-agent with obstacles. Moreover, ${d}_{\beta}={\parallel {d}^{\prime}\parallel}_{\sigma}$ and ${r}_{\beta}={\parallel {r}^{\prime}\parallel}_{\sigma}$.

- (iii).
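${u}_{i}^{\gamma}$ is a goal component that consists of a distributed navigational feedback term and is defined as follows:

$${u}_{i}^{\gamma}=-{c}_{1}^{\gamma}{\sigma}_{1}({q}_{i}-{q}_{r})-{c}_{2}^{\gamma}({p}_{i}-{p}_{r}),$$

where ${c}_{1}^{\gamma}$ and ${c}_{2}^{\gamma}$ are positive constants and, following [3], ${q}_{r}$ and ${p}_{r}$ denote the position and velocity of the $\gamma$-agent (the navigation reference).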

A more detailed treatment of flocking control algorithms can be found in [3].
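To make the structure of the control law concrete, the sketch below evaluates ${u}_{i}^{\alpha}+{u}_{i}^{\gamma}$ for one agent in an obstacle-free setting, using the forms given in [3]. It is an illustration under stated assumptions rather than the authors' implementation: the obstacle term ${u}_{i}^{\beta}$ is omitted, all gains are set to 1, the constants `EPS`, `H`, `A`, `B` and the distances `d`, `r` are sample values, ${d}_{\alpha}={\parallel d\parallel}_{\sigma}$ is assumed, and `q_r`, `p_r` stand for the navigation reference of the goal term.

```python
import math

EPS, H, A, B = 0.1, 0.2, 5.0, 5.0        # illustrative constants (assumed)
C = abs(A - B) / math.sqrt(4.0 * A * B)  # offset c as chosen in [3], so that phi(0) = 0

def sigma_norm(z):
    """sigma-norm of a vector z."""
    return (math.sqrt(1.0 + EPS * sum(c * c for c in z)) - 1.0) / EPS

def bump(z):
    """Bump function rho_h(z)."""
    if 0.0 <= z < H:
        return 1.0
    if H <= z <= 1.0:
        return 0.5 * (1.0 + math.cos(math.pi * (z - H) / (1.0 - H)))
    return 0.0

def phi_alpha(z, r_alpha, d_alpha):
    """Action function rho_h(z / r_alpha) * phi(z - d_alpha)."""
    s = z - d_alpha + C
    sigma_1 = s / math.sqrt(1.0 + s * s)
    return bump(z / r_alpha) * 0.5 * ((A + B) * sigma_1 + (A - B))

def flocking_control(i, q, p, q_r, p_r, r, d, c_alpha=(1.0, 1.0), c_gamma=(1.0, 1.0)):
    """Return u_i = u_i^alpha + u_i^gamma for agent i (obstacle term omitted)."""
    r_alpha, d_alpha = sigma_norm([r]), sigma_norm([d])
    u = [0.0] * len(q[i])
    for j in range(len(q)):
        if j == i or math.dist(q[j], q[i]) >= r:
            continue  # j is outside the neighborhood N_i^alpha
        diff = [qj - qi for qj, qi in zip(q[j], q[i])]
        z = sigma_norm(diff)
        scale = math.sqrt(1.0 + EPS * sum(c * c for c in diff))
        gain = c_alpha[0] * phi_alpha(z, r_alpha, d_alpha) / scale  # multiplies q_j - q_i
        a_ij = bump(z / r_alpha)                                    # adjacency element
        for k in range(len(u)):
            u[k] += gain * diff[k] + c_alpha[1] * a_ij * (p[j][k] - p[i][k])
    err = [qi - qr for qi, qr in zip(q[i], q_r)]
    err_scale = math.sqrt(1.0 + sum(e * e for e in err))  # sigma_1 applied to a vector
    for k in range(len(u)):
        u[k] += -c_gamma[0] * err[k] / err_scale - c_gamma[1] * (p[i][k] - p_r[k])
    return u

# Two agents at rest, exactly d apart, with the reference placed at agent 0.
d, r = 7.0, 8.4
q = [[0.0, 0.0], [d, 0.0]]
p = [[0.0, 0.0], [0.0, 0.0]]
u_rest = flocking_control(0, q, p, q_r=q[0], p_r=p[0], r=r, d=d)
u_close = flocking_control(0, [[0.0, 0.0], [3.5, 0.0]], p, q_r=[0.0, 0.0], p_r=[0.0, 0.0], r=r, d=d)
print(u_rest, u_close)
```

With the neighbor at the desired spacing the control effort vanishes, while a neighbor closer than $d$ produces a force pointing away from it. This matches the attractive and repulsive roles of ${\varphi}_{\alpha}$.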

**Remark 1.** In practical networked multi-UAS flocking control systems, due to the complexity of the overall system, autonomous agents are commonly described by double-integrator dynamics. Since our analysis focuses on developing an intelligent distributed controller for flocking of networked multi-UAS, double-integrator dynamics are adopted in this paper. Because the double integrator is a highly reduced model of quadrotor dynamics, one can extend our results by employing model-free inner-loop controllers such as those in [23,30].

#### 2.3. Brain Emotional Learning-Based Intelligent Controller

The Brain Emotional Learning-Based Intelligent Controller (BELBIC) is a neurobiologically motivated intelligent methodology based on the computational model of emotional learning observed in the mammalian limbic system proposed in [25]. This model (depicted in Figure 1) has two main parts: the Amygdala and the Orbitofrontal Cortex. The Amygdala is responsible for immediate learning, while the Orbitofrontal Cortex is responsible for inhibiting any inappropriate learning that happens in the Amygdala.

Sensory Inputs ($SI$) and the Emotional Signal ($ES$) are the two main inputs to the BELBIC model.

The output of the BELBIC model ($MO$) can be defined as

$$MO=\sum_{l}{A}_{l}-\sum_{l}O{C}_{l},$$

which is calculated as the difference between the Amygdala outputs (${A}_{l}$) and the Orbitofrontal Cortex outputs ($O{C}_{l}$). Here, $l$ is the index of the sensory input.

The Orbitofrontal Cortex and the Amygdala outputs are calculated by summing the outputs of all their corresponding nodes, where the output of each node is described as:

$${A}_{l}={V}_{l}S{I}_{l},\qquad O{C}_{l}={W}_{l}S{I}_{l},$$

where $S{I}_{l}$ is the $l$th sensory input, ${V}_{l}$ is the weight of the Amygdala, and ${W}_{l}$ is the weight of the Orbitofrontal Cortex. The following equations are employed for updating ${V}_{l}$ and ${W}_{l}$, respectively:

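$$\Delta {V}_{l}={K}_{v}S{I}_{l}\max (0,ES-{A}_{l}),\qquad \Delta {W}_{l}={K}_{w}S{I}_{l}(MO-ES),$$

where ${K}_{v}$ and ${K}_{w}$ are the learning rates.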

The maximum of all $SI$s is another input considered in the model. This signal (i.e., ${A}_{th}$), which is sent directly from the Thalamus to the Amygdala, is defined as:

$${A}_{th}={V}_{th}\max_{l}(S{I}_{l}),$$

where ${V}_{th}$ is the weight and the corresponding update law is the same as Equation (10).
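The BELBIC signal flow described above can be sketched as a small Python class. Several details are assumptions made for illustration and are not confirmed by the text: the update laws follow a commonly used BELBIC form, $\Delta {V}_{l}={K}_{v}S{I}_{l}\max(0,ES-{A}_{l})$ and $\Delta {W}_{l}={K}_{w}S{I}_{l}(MO-ES)$; the thalamic term ${A}_{th}$ is counted among the Amygdala outputs when forming $MO$; ${V}_{th}$ is updated with the same rule as ${V}_{l}$; and the class name, zero initial weights, and learning rates are hypothetical.

```python
class Belbic:
    """Minimal BELBIC sketch; the update-law variant is an assumption (see the text above)."""

    def __init__(self, n_inputs, k_v=0.1, k_w=0.1):
        self.v = [0.0] * n_inputs  # Amygdala weights V_l
        self.w = [0.0] * n_inputs  # Orbitofrontal Cortex weights W_l
        self.v_th = 0.0            # thalamic weight V_th
        self.k_v, self.k_w = k_v, k_w

    def step(self, si, es):
        """Compute MO for sensory inputs si and emotional signal es, then adapt the weights."""
        a = [v * s for v, s in zip(self.v, si)]   # A_l = V_l * SI_l
        oc = [w * s for w, s in zip(self.w, si)]  # OC_l = W_l * SI_l
        si_max = max(si)
        a_th = self.v_th * si_max                 # A_th = V_th * max_l(SI_l)
        mo = sum(a) + a_th - sum(oc)              # assumed: A_th counted with Amygdala outputs
        for l, s in enumerate(si):
            self.v[l] += self.k_v * s * max(0.0, es - a[l])  # can only grow for s >= 0
            self.w[l] += self.k_w * s * (mo - es)            # may grow or shrink
        self.v_th += self.k_v * si_max * max(0.0, es - a_th)
        return mo

controller = Belbic(n_inputs=1, k_v=0.5, k_w=0.5)
first = controller.step([1.0], es=1.0)   # all weights start at zero
second = controller.step([1.0], es=1.0)  # weights adapted once
print(first, second)
```

The `max(0, ...)` in the Amygdala update means its weights never decrease for non-negative sensory inputs, whereas the Orbitofrontal Cortex weights can move in both directions. This reflects the division of roles between immediate learning and inhibition.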

Several techniques have been adopted for tuning the BELBIC parameters [26,31,32,33,34,35]. In this paper, to significantly reduce the computational complexity, a heuristic approach is utilized for tuning the BELBIC parameters.

#### 2.4. Objectives

The objective is to design a biologically-inspired distributed intelligent controller for flocking control of multi-unmanned aircraft systems (i.e., ${u}_{i}$, $i=1,\dots ,n$, where $n$ is the number of UASs), specifically in the presence of network-induced delay. The proposed intelligent control method leverages the computational model of emotional learning in the mammalian limbic system (i.e., BELBIC) introduced in Section 2.3, and is applied to the flocking model of networked multi-UAS described in Section 2.1.

In other words, the solution proposed in this paper is a model-free distributed intelligent controller designed to maintain the motion of all agents in the flock in the presence of network-induced delay.