Biologically-Inspired Intelligent Flocking Control for Networked Multi-UAS with Uncertain Network Imperfections

In this paper, a biologically-inspired distributed intelligent control methodology is proposed to overcome the challenges of networked multi-Unmanned Aircraft Systems (UAS) flocking, i.e., network imperfections and uncertainty from the environment and the system. The proposed method is based on the emotional learning phenomenon in the mammalian limbic system and takes into account the limited computational capability of practical onboard controllers. Its learning capability and low computational complexity make it a promising tool for real-time networked multi-UAS flocking under network imperfections and uncertainty from the environment and the system. Computer-aided numerical results demonstrate the effectiveness of the proposed algorithm for distributed intelligent flocking control of networked multi-UAS.


Motivation
Distributed coordination of networked multi-Unmanned Aircraft Systems (UAS) has been studied by diverse research communities in recent years [1][2][3][4]. Due to the broad applications of flocking in real-world scenarios, most networked multi-UAS control methodologies are based on the mathematical model of flocking [5][6][7]. In general, three basic rules (i.e., separation, alignment, and cohesion) are considered for simulating the flocking behavior [1], which is observed in many living beings (e.g., birds, fish, bacteria, and insects) [8].
Several research groups have contributed to improving the flocking behavior of networked multi-UAS in recent years [9][10][11][12]. In many applications involving networked multi-UAS, groups of aircraft need to communicate in order to successfully accomplish their assigned tasks. Network imperfection, e.g., delay, commonly exists in communication due to limited communication resources and heavy network traffic [13,14]. As a result, it is of paramount importance to address the challenges of network-induced delay and to take its influence into account when designing control algorithms for multi-unmanned aircraft systems. Besides network imperfections, the uncertainty from the complex environment and the system dynamics is another critical challenge that cannot be ignored when developing practically applicable controllers. Therefore, it is important to consider the uncertainties from the environment and the system in designing control algorithms.

Related Works
Diverse research groups have attempted to address the issues arising from the effects of network-induced delay in flocking control of multi-unmanned aircraft systems/multi-agent systems. For example, delay-independent flocking control of multi-agent systems has been addressed in [15,16]. Closely related, a distributed control design for discrete-time directed multi-agent systems with distributed network-induced delay has been proposed in [17]. The authors in [18] presented a distributed algorithm for sensor networks that considers the effects of imperfect communication such as link failures and channel noise. In [19], the authors studied the design of distributed formation recovery control for nonlinear heterogeneous multi-agent systems. A distance-constrained adaptive flocking control for multi-agent systems with network-induced delay was investigated in [20]. Recently, coordinated control of two-wheel mobile robots with input network-induced delay was presented in [21]. Although all of these methods perform well in dealing with the effects of network-induced delay, they still need detailed information about the system. In this regard, the development of control strategies that deal with network-induced delay while depending less on full knowledge of the system dynamics is of paramount importance.
In recent years, intelligent approaches have been extensively utilized for successfully solving diverse complex problems [22][23][24]. Among them, biologically-inspired intelligent approaches have received tremendous interest from many researchers because of their inherent potential to deal with computationally complex systems. Emotional learning is one such approach, which takes advantage of a computational model of the amygdala in the mammalian limbic system [25]. This model, known as the Brain Emotional Learning Based Intelligent Controller (BELBIC), consists of the Amygdala, Orbitofrontal Cortex, Thalamus, and Sensory Input Cortex as its main components. From a control systems point of view, BELBIC is a model-free controller (i.e., the model dynamics are fully or partially unknown) that has shown promising performance under noise and system uncertainty [26].
Sensory Inputs (SI) and Emotional Signal (ES) are the two main inputs to the BELBIC model, and it has been shown that multi-objective problems can be solved by defining appropriate SI and ES [27,28]. The flexibility in assigning different SI and ES makes this controller a practical tool for implementation in real-time applications. Furthermore, BELBIC can effectively control a system even when the states of the system and the controller performance feedback are the only available information [26]. In addition, compared to other existing learning-based intelligent control methods, the computational complexity of BELBIC is on the order of O(n), which makes it a suitable approach for real-time implementation.

Main Contributions
The main contribution of this paper is a model-free distributed intelligent control methodology that overcomes the challenges of network-induced delay and uncertainties from the environment and the system in networked multi-UAS. To this end, we propose a biologically-inspired distributed intelligent controller, which takes advantage of the computational model of emotional learning in the mammalian limbic system. Our recent work in [29] introduced the theoretical framework for flocking control of networked multi-UAS using BELBIC. These results are improved upon and extended in this paper to accomplish an intelligent and practical flocking of networked multi-UAS subjected to uncertain network imperfections. The proposed methodology has a low computational complexity, which makes it a promising method for real-time applications. Furthermore, while keeping the system complexity within a practically achievable limit, the proposed method delivers a controller with multi-objective properties (i.e., control effort optimization, network-induced delay handling, and noise/disturbance rejection). Moreover, we provide a Lyapunov stability analysis to demonstrate that the proposed methodology guarantees the convergence of the designed control signals and maintains the system stability during learning. The learning capability of the proposed approach is validated for flocking control of multi-unmanned aircraft systems influenced by network-induced delay, with promising performance. Computer-based numerical results demonstrate the effectiveness of the proposed algorithm for distributed intelligent flocking control of networked multi-UAS.
In other words, the solution proposed in this paper is a model-free distributed intelligent controller, which provides the following benefits:

• Full or partial knowledge of the system dynamics is not required.
• It is capable of overcoming the network-induced delay, handling the uncertainties, and rejecting noise/disturbances.
• It is appropriate for real-time implementation due to its low computational complexity (i.e., the developed algorithm is a real-time applicable learning technique).
• It ensures the stability of the system.

The rest of the paper is organized as follows. Section 2 presents the problem formulation and some preliminaries about flock modeling with network-induced delay and emotional learning. Our main contribution is introduced in Section 3, which consists of a distributed intelligent flocking control strategy based on emotional learning. Section 4 presents numerical simulation results. The conclusions of the paper and future directions of our work are provided in Section 5.

Problem Formulation and Preliminaries
In this section, some preliminaries are provided and the problem formulation is briefly discussed. First, the dynamics of the networked multi-UAS are given; then, the flock topology is modeled by means of a dynamic graph. Next, the network-induced delay and the BELBIC model are introduced, and, ultimately, the problem is formulated.

Flock Modelling
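Assuming the flock moves in an $m$-dimensional space ($m = 2, 3$), the motion of the $i$th agent with continuous-time double-integrator dynamics can be described by the following set of equations:

$$\dot{q}_i(t) = p_i(t), \qquad \dot{p}_i(t) = u_i(t), \qquad i = 1, 2, \ldots, n,$$

where $u_i(t) \in \mathbb{R}^m$ is the control input, and $q_i(t), p_i(t) \in \mathbb{R}^m$ are the position and velocity of the $i$th agent, respectively. Consider a dynamic graph $G(\upsilon, \varepsilon(t))$ that consists of a set of vertices $\upsilon = \{1, 2, \ldots, n\}$ and edges $\varepsilon(t) \subseteq \{(i, j) : i, j \in \upsilon, j \neq i\}$. Each vertex represents an agent of the flock, while a communication link between a pair of agents is represented by an edge. The set of neighbors of agent $i$ is

$$N^{\alpha}_i(t) = \{ j \in \upsilon : \|q_j(t) - q_i(t)\| < r, \; j \neq i \},$$

where the range of interaction between agents $i$ and $j$ is defined by a positive constant $r$, and $\|\cdot\|$ is the Euclidean norm in $\mathbb{R}^m$. Solving the set of algebraic conditions $\|q_j(t) - q_i(t)\| = d \;\; \forall j \in N^{\alpha}_i(t)$, we can describe the geometric model of the flock, i.e., the α-lattice [3], where the distance between two neighbors $i$ and $j$ is represented by a positive constant $d$.

To avoid the singularity of the collective potential function at $q_i(t) = q_j(t)$, the σ-norm (i.e., $\|\cdot\|_{\sigma}$) is defined as

$$\|z\|_{\sigma} = \frac{1}{\epsilon}\left[\sqrt{1 + \epsilon \|z\|^2} - 1\right],$$

where $\epsilon$ is a positive constant. To resolve the singularity problem, the set of algebraic conditions can be rewritten as $\|q_j(t) - q_i(t)\|_{\sigma} = d_{\alpha} \;\; \forall j \in N^{\alpha}_i(t)$, where $d_{\alpha} = \|d\|_{\sigma}$. Following [3], a smooth collective potential function

$$V(q) = \frac{1}{2} \sum_i \sum_{j \neq i} \psi_{\alpha}(\|q_j - q_i\|_{\sigma})$$

can be obtained by considering the above-mentioned constraints, where

$$\psi_{\alpha}(z) = \int_{d_{\alpha}}^{z} \phi_{\alpha}(s)\, ds, \qquad \phi_{\alpha}(z) = \rho_h(z / r_{\alpha})\, \phi(z - d_{\alpha}),$$

$$\phi(z) = \frac{1}{2}\left[(a + b)\, \sigma_1(z + c) + (a - b)\right], \qquad \sigma_1(z) = \frac{z}{\sqrt{1 + z^2}}, \qquad c = \frac{|a - b|}{\sqrt{4ab}}.$$

A possible choice for defining $\rho_h(z)$, which is a scalar bump function that smoothly varies between 0 and 1, is [3]:

$$\rho_h(z) = \begin{cases} 1, & z \in [0, h), \\ \frac{1}{2}\left[1 + \cos\left(\pi \frac{z - h}{1 - h}\right)\right], & z \in [h, 1], \\ 0, & \text{otherwise}. \end{cases}$$

The flocking control algorithm introduced in [3] is $u_i = u^{\alpha}_i + u^{\beta}_i + u^{\gamma}_i$, which consists of three main terms:

(i). $u^{\alpha}_i$ is the interaction component between two α-agents and is defined as follows:

$$u^{\alpha}_i = c^{\alpha}_1 \sum_{j \in N^{\alpha}_i} \phi_{\alpha}(\|q_j - q_i\|_{\sigma})\, n_{i,j} + c^{\alpha}_2 \sum_{j \in N^{\alpha}_i} a_{ij}(q)\,(p_j - p_i),$$

where $c^{\alpha}_1$ and $c^{\alpha}_2$ are positive constants. The terms $n_{i,j}$ and $a_{ij}(q)$ are a vector and the elements of the spatial adjacency matrix $A(q)$, respectively, which are described as follows:

$$n_{i,j} = \frac{q_j - q_i}{\sqrt{1 + \epsilon \|q_j - q_i\|^2}}, \qquad a_{ij}(q) = \rho_h\left(\|q_j - q_i\|_{\sigma} / r_{\alpha}\right) \in [0, 1], \; j \neq i,$$

where $r_{\alpha} = \|r\|_{\sigma}$, and $a_{ii}(q) = 0$ for all $i$ and $q$.

(ii). $u^{\beta}_i$ is the interaction component between the α-agent and an obstacle (named the β-agent) and is defined as follows:

$$u^{\beta}_i = c^{\beta}_1 \sum_{k \in N^{\beta}_i} \phi_{\beta}(\|\hat{q}_{i,k} - q_i\|_{\sigma})\, \hat{n}_{i,k} + c^{\beta}_2 \sum_{k \in N^{\beta}_i} b_{i,k}(q)\,(\hat{p}_{i,k} - p_i),$$

where $c^{\beta}_1$ and $c^{\beta}_2$ are positive constants, and $\hat{q}_{i,k}$ and $\hat{p}_{i,k}$ are the position and velocity of the $k$th obstacle (i.e., β-agent), respectively. The terms $\hat{n}_{i,k}$ and $b_{i,k}(q)$ are a vector and the elements of the heterogeneous adjacency matrix $B(q)$, respectively, which are defined as follows:

$$\hat{n}_{i,k} = \frac{\hat{q}_{i,k} - q_i}{\sqrt{1 + \epsilon \|\hat{q}_{i,k} - q_i\|^2}}, \qquad b_{i,k}(q) = \rho_h\left(\|\hat{q}_{i,k} - q_i\|_{\sigma} / d_{\beta}\right).$$

Here, $\phi_{\beta}(z) = \rho_h(z / d_{\beta})\,(\sigma_1(z - d_{\beta}) - 1)$ is a repulsive action function and $N^{\beta}_i(t) = \{k \in \upsilon_{\beta} : \|\hat{q}_{i,k} - q_i\| < r'\}$ is the set of β-neighbors of an α-agent $i$, where the range of interaction of an α-agent with obstacles is the positive constant $r'$. Here, $d_{\beta} = \|d'\|_{\sigma}$ and $r_{\beta} = \|r'\|_{\sigma}$.

(iii). $u^{\gamma}_i$ is a goal component that consists of a distributed navigational feedback term and, following [3], is defined as follows:

$$u^{\gamma}_i = -c^{\gamma}_1\, \sigma_1(q_i - q_r) - c^{\gamma}_2\,(p_i - p_r),$$

where $c^{\gamma}_1$ and $c^{\gamma}_2$ are positive constants, $\sigma_1(z) = z / \sqrt{1 + \|z\|^2}$ for a vector argument, and $q_r$ and $p_r$ are the position and velocity of the γ-agent (the group objective). More detailed studies about flocking control algorithms can be found in [3].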
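As an illustration of the α-agent interaction term described above, the following is a minimal Python sketch, not the authors' implementation. It assumes the σ-norm, bump function, and action function in the forms given in [3]; the function names (`sigma_norm`, `bump`, `phi`, `phi_alpha`, `u_alpha`) and the default gains `c1 = c2 = 1` are hypothetical choices made for this example, while `EPS`, `A`, `B`, and `H_ALPHA` take the values reported in the simulation section.

```python
import numpy as np

EPS = 0.1        # sigma-norm parameter (value used in the simulations)
A, B = 5.0, 5.0  # parameters a and b of phi(z) (values used in the simulations)
H_ALPHA = 0.2    # bump-function parameter h for phi_alpha


def sigma_norm(z, eps=EPS):
    """Sigma-norm of a scalar or vector z: (sqrt(1 + eps*||z||^2) - 1) / eps."""
    z = np.atleast_1d(np.asarray(z, dtype=float))
    return (np.sqrt(1.0 + eps * np.dot(z, z)) - 1.0) / eps


def bump(z, h):
    """Scalar bump function rho_h(z): 1 on [0, h), smooth decay to 0 on [h, 1]."""
    if 0.0 <= z < h:
        return 1.0
    if h <= z <= 1.0:
        return 0.5 * (1.0 + np.cos(np.pi * (z - h) / (1.0 - h)))
    return 0.0


def phi(z, a=A, b=B):
    """Uneven sigmoidal function phi(z) built from sigma_1(z) = z / sqrt(1 + z^2)."""
    c = abs(a - b) / np.sqrt(4.0 * a * b)
    s = z + c
    return 0.5 * ((a + b) * s / np.sqrt(1.0 + s * s) + (a - b))


def phi_alpha(z, r_alpha, d_alpha, h=H_ALPHA):
    """Action function: vanishes at z = d_alpha (for a = b) and for z >= r_alpha."""
    return bump(z / r_alpha, h) * phi(z - d_alpha)


def u_alpha(i, q, p, r, d, c1=1.0, c2=1.0, eps=EPS, h=H_ALPHA):
    """Alpha-agent interaction term for agent i; q and p are (n, m) arrays."""
    q = np.asarray(q, dtype=float)
    p = np.asarray(p, dtype=float)
    r_alpha, d_alpha = sigma_norm(r, eps), sigma_norm(d, eps)
    u = np.zeros(q.shape[1])
    for j in range(len(q)):
        diff = q[j] - q[i]
        if j == i or np.linalg.norm(diff) >= r:
            continue  # agent j is not an alpha-neighbor of agent i
        z = sigma_norm(diff, eps)
        n_ij = diff / np.sqrt(1.0 + eps * np.dot(diff, diff))
        a_ij = bump(z / r_alpha, h)
        u += c1 * phi_alpha(z, r_alpha, d_alpha, h) * n_ij + c2 * a_ij * (p[j] - p[i])
    return u
```

With two agents at rest, the term is zero when they are exactly `d` apart, repulsive when they are closer, and attractive when they are farther apart but still within range `r`, which is the behavior expected of an α-lattice.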
Remark 1. In practical networked multi-UAS flocking control systems, due to the complexity of the overall system, autonomous agents are commonly described by double-integrator dynamics. Since our analysis is focused on developing an intelligent distributed controller for flocking of networked multi-UAS, the double-integrator dynamics are adopted in this paper. Since the double-integrator model is a highly reduced representation of quadrotor dynamics, our results can be extended by employing model-free inner-loop controllers such as those in [23,30].

Network-Induced Delays
Assuming that the state of agent $i$ reaches agent $j$ after passing through a communication channel with network-induced delay $\tau_{ij} > 0$, $u^{\alpha}_i$ can be rewritten in terms of the delayed neighbor states, i.e., with $q_j(t - \tau_{ij})$ and $p_j(t - \tau_{ij})$ in place of $q_j(t)$ and $p_j(t)$. In this paper, we consider the case where the network-induced delays in all channels are equal to $\tau > 0$. Although the delay is deterministic, it is unknown to the controller. The proposed method can effectively handle this unknown delay.
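In a discrete-time simulation, a constant network-induced delay can be emulated with a fixed-length buffer on each communication channel. The following is a minimal sketch under that assumption; the class name `DelayedChannel`, the step size `dt = 0.1`, and the rounding of τ to an integer number of steps are hypothetical choices for illustration and are not taken from the paper.

```python
from collections import deque


class DelayedChannel:
    """Fixed-delay channel: a sample pushed at step k is read at step k + delay_steps."""

    def __init__(self, delay_steps, initial):
        # Pre-fill the buffer so that reads during the first delay_steps
        # iterations return the initial state.
        self.buf = deque([initial] * (delay_steps + 1), maxlen=delay_steps + 1)

    def push(self, sample):
        """Send the current state into the channel (the oldest sample is dropped)."""
        self.buf.append(sample)

    def read(self):
        """Return the state that was sent delay_steps iterations ago."""
        return self.buf[0]


# Example: tau = 0.3 with an assumed simulation step dt = 0.1 gives 3 steps of delay.
tau, dt = 0.3, 0.1
channel = DelayedChannel(round(tau / dt), initial=0.0)
received = []
for k in range(1, 7):
    channel.push(float(k))      # neighbor j transmits its state at step k
    received.append(channel.read())  # agent i sees the state from 3 steps earlier
```

In a flocking loop, agent i would call `read()` on the channel from each neighbor j and use the returned (delayed) position and velocity when evaluating its interaction term.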

Brain Emotional Learning-Based Intelligent Controller
The Brain Emotional Learning Based Intelligent Controller (BELBIC) is a neurobiologically-motivated intelligent methodology based on the computational model of emotional learning in the mammalian limbic system proposed in [25]. This model (depicted in Figure 1) has two main parts: the Amygdala and the Orbitofrontal Cortex. The Amygdala is responsible for immediate learning, while the Orbitofrontal Cortex is responsible for inhibiting any inappropriate learning happening in the Amygdala. Sensory Inputs (SI) and Emotional Signal (ES) are the two main inputs to the BELBIC model.

The output of the BELBIC model ($MO$) can be defined as

$$MO = \sum_l A_l - \sum_l OC_l,$$

which is calculated as the difference between the Amygdala outputs ($A_l$) and the Orbitofrontal Cortex outputs ($OC_l$). Here, $l$ indexes the sensory inputs. The Orbitofrontal Cortex and the Amygdala outputs are calculated by the summation of all their corresponding nodes, where the output of each node is described as:

$$A_l = V_l\, SI_l, \qquad OC_l = W_l\, SI_l,$$

where $SI_l$ is the $l$th sensory input, $V_l$ is the weight of the Amygdala, and $W_l$ is the weight of the Orbitofrontal Cortex. The following equations are employed for updating $V_l$ and $W_l$, respectively:

$$\Delta V_l = K_v\, SI_l \max\left(0, ES - \sum_l A_l\right), \qquad (10)$$

$$\Delta W_l = K_w\, SI_l \left(MO - ES\right), \qquad (11)$$

where $K_w$ and $K_v$ are the learning rates.

The maximum of all SIs is another input considered in the model. This signal (i.e., $A_{th}$), which is sent directly from the Thalamus to the Amygdala, is defined as:

$$A_{th} = V_{th} \max_l (SI_l),$$

where $V_{th}$ is the weight, and the corresponding update law is the same as Equation (10). Several techniques have been adopted for tuning the BELBIC parameters [26,31,32,33,34,35]. In this paper, to significantly reduce the computational complexity, a heuristic approach is utilized for tuning the BELBIC parameters.
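The node outputs and weight updates described above can be sketched in a few lines of Python. This is an illustrative sketch of the commonly used BELBIC formulation, not the authors' code: the class name `Belbic`, the default learning rates, and the choice to include the thalamic term $A_{th}$ in the Amygdala sum are assumptions made here.

```python
import numpy as np


class Belbic:
    """Minimal BELBIC model with n_inputs sensory inputs (illustrative sketch)."""

    def __init__(self, n_inputs, k_v=0.1, k_w=0.1):
        self.V = np.zeros(n_inputs)  # Amygdala weights V_l
        self.W = np.zeros(n_inputs)  # Orbitofrontal Cortex weights W_l
        self.V_th = 0.0              # weight of the thalamic input
        self.k_v, self.k_w = k_v, k_w

    def step(self, si, es):
        """Compute the model output MO for (SI, ES), then update the weights."""
        si = np.asarray(si, dtype=float)
        si_max = np.max(si)
        a = self.V * si              # Amygdala node outputs A_l
        oc = self.W * si             # Orbitofrontal Cortex node outputs OC_l
        a_th = self.V_th * si_max    # thalamic input A_th
        a_sum = a.sum() + a_th       # assumption: A_th is added to the Amygdala sum
        mo = a_sum - oc.sum()

        # Amygdala learning (Equation (10)); max(0, .) means that, for
        # non-negative SI, these weights never decrease.
        gain = max(0.0, es - a_sum)
        self.V += self.k_v * si * gain
        self.V_th += self.k_v * si_max * gain
        # Orbitofrontal Cortex learning (Equation (11)): inhibits the mismatch
        # between the model output and the emotional signal.
        self.W += self.k_w * si * (mo - es)
        return mo


# Usage: with constant SI and ES, the model output approaches ES.
controller = Belbic(n_inputs=1)
for _ in range(200):
    output = controller.step([1.0], 2.0)
```

Each call to `step` costs a handful of vector operations on the sensory inputs, which is consistent with the linear computational complexity mentioned in the related-work discussion.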

Objectives
The objective is to design a biologically-inspired distributed intelligent controller for flocking control of multi-unmanned aircraft systems (i.e., $u_i$, $i = 1, \ldots, n$, where $n$ is the number of UASs), specifically in the presence of network-induced delay. The proposed intelligent control method leverages the computational model of emotional learning in the mammalian limbic system (i.e., BELBIC) introduced in Section 2.3, and is applied to the flocking model of networked multi-UAS described in Section 2.1.
In other words, the solution proposed in this paper is a model-free distributed intelligent controller, which is designed to maintain the motion of all agents in the flock in the presence of network-induced delay.

System Design
Specifically, our focus is on the design of a biologically-inspired distributed intelligent controller for flocking of networked multi-UAS using BELBIC, because it can be implemented without increasing the complexity of the overall system. The BELBIC architecture implemented in this work is shown in Figure 2. This figure shows a closed-loop configuration that consists of the following blocks: (i) the BELBIC block, (ii) the sensory inputs (SI) function block, (iii) the emotional signal (ES) generator block, and (iv) a block for the plant. This architecture implicitly demonstrates the overall emotional-learning-based control concept, which consists of the action selection mechanism, the critic, and the learning algorithm [26].

Emotional Signal and Sensory Input Development
Fundamentally, BELBIC is an action selection technique, in which the action is produced based on the sensory input (SI) and the emotional signal (ES). In their general forms, SI and ES are functions of $e$, $r$, $y$, and $u$, where $e$ is the system error, $r$ is the system input, $y$ is the system output, and $u$ is the control effort. The control objectives (e.g., reference tracking and optimal control) can be implicitly decided by choosing an adequate ES. For example, it is possible to choose the ES to achieve a better reference tracking performance, to reduce the overshoot, and/or to minimize the energy expense, among others.
Aiming at designing a model-free distributed intelligent control for flocking control of multi-unmanned aircraft systems, the proposed biologically-inspired technique will focus on improving: (i) reference tracking performance, (ii) network-induced delay handling, and (iii) disturbance rejection.
To accomplish these objectives, for each of the control inputs (i.e., $\{u_1, \ldots, u_n\}$), $SI_i$ and $ES_i$ are designed as in Equations (15) and (16), whose gains (including $K^{\gamma}_{ES}$) are positive. The ES will change its impact on the system behavior when different values are assigned to these positive gains. In this work, different gains are assigned for each one of the control inputs (i.e., $u_i$, $i = 1, \ldots, n$) of the system.
It should be mentioned that we designed the ES in such a way that an increase in the reference tracking error generates a negative emotion in the system, which is then taken as evidence of unsatisfactory performance. Therefore, the proposed controller always acts to minimize the negative emotion, which leads to satisfactory performance of the system.

Learning-Based Intelligent Flocking Control
In flocking control of networked multi-UAS, multiple performance considerations have to be taken into account all at the same time; therefore, it is a very interesting case for using biologically-inspired learning-based multi-objective methodologies like BELBIC.Designing a model-free distributed intelligent control for flocking of networked multi-UAS by considering the network-induced delay, in addition to designing a suitable controller for real-time implementation, encourages us to take advantage of the computational model of emotional learning in the mammals' limbic system, i.e., BELBIC.
From Equations (15) and (16), the BELBIC-inspired distributed intelligent flocking control strategy for networked multi-UAS, $u^{BEL}_i$, is defined in Equation (17). Here, $i = 1, \ldots, n$ refers to each control input and $n$ is the number of UASs. Considering the results obtained in Theorem 1 and substituting the Emotional Signal from Equation (16), the BELBIC model output of the distributed intelligent control for flocking of networked multi-UAS is obtained in Equation (18), which satisfies our goal of distributed intelligent flocking control. In other words, the model output accomplishes all the flocking properties, i.e., collision avoidance, obstacle avoidance, and navigational feedback. The overall model-free biologically-inspired distributed intelligent flocking control methodology proposed in this paper is summarized as pseudo-code in Algorithm 1.
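Algorithm 1
for each iteration t = t_s do
    for each agent i do
        Compute SI_i and ES_i (Equations (15) and (16))
        Compute the control input u_i (Equation (17))
        Update V_i and V_th (Equation (10))
        Update W_i (Equation (11))
    end for
end for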
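The loop structure of Algorithm 1 can be sketched in Python as follows. This is a simplified illustration under stated assumptions rather than the method evaluated in the paper: the function name `run_belbic_flock` is hypothetical, SI and ES are both set to a plain navigational feedback toward a fixed goal (the α and β interaction terms and the network delay are omitted), each axis of each agent is treated as an independent single-input BELBIC without the thalamic term, and the double-integrator dynamics are integrated with a forward Euler step.

```python
import numpy as np


def run_belbic_flock(q0, goal, steps, dt=0.05, c1=1.0, c2=2.0, k_v=0.05, k_w=0.05):
    """Run `steps` iterations of a simplified BELBIC flocking loop.

    q0 is an (n, m) array of initial positions; agents start at rest.
    Returns the final positions, velocities, and the weights V and W.
    """
    q = np.array(q0, dtype=float)
    p = np.zeros_like(q)
    goal = np.asarray(goal, dtype=float)
    V = np.zeros_like(q)  # Amygdala weights, one per agent and axis
    W = np.zeros_like(q)  # Orbitofrontal Cortex weights, one per agent and axis

    for _ in range(steps):                  # for each iteration
        u = np.zeros_like(q)
        for i in range(len(q)):             # for each agent i
            # Hypothetical SI/ES choice: navigational feedback toward the goal.
            si = -c1 * (q[i] - goal) - c2 * p[i]
            es = si.copy()
            a = V[i] * si                   # Amygdala outputs
            oc = W[i] * si                  # Orbitofrontal Cortex outputs
            mo = a - oc                     # model output, used as u_i
            V[i] += k_v * si * np.maximum(0.0, es - a)  # update V_i
            W[i] += k_w * si * (mo - es)                # update W_i
            u[i] = mo
        p += dt * u                         # double-integrator dynamics
        q += dt * p
    return q, p, V, W
```

Because all weights start at zero, the first iteration produces zero control effort and only adapts the weights; from the second iteration on, the learned weights generate a control input that accelerates each agent toward the goal.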

Stability Analysis
Theorem 1 presents the convergence of the weights of the Amygdala ($V_l$) and the Orbitofrontal Cortex ($W_l$). Theorem 2 provides the closed-loop stability of the proposed controller, and Remark 2 explains how the proposed method converges to distributed intelligent flocking control of networked multi-UAS.

Theorem 1. Given the BELBIC design in Equations (15)-(18), there exist positive BELBIC tuning parameters $K_v$ and $K_w$, satisfying the associated tuning conditions, such that the BELBIC estimated weights of the Amygdala ($V_l$) and the Orbitofrontal Cortex ($W_l$) converge to the desired targets asymptotically.

Proof. See Appendix A.

Theorem 2. (Closed-loop Stability): Given that the initial networked multi-UAS state $x(0)$ and the BELBIC estimated weights of the Amygdala ($V_l(0)$) and the Orbitofrontal Cortex ($W_l(0)$) are bounded in the set $\Lambda$, let the BELBIC be tuned and the estimated control policy be given as in Equations (10), (11), and (17), respectively. Then, there exist positive constants $K_v$ and $K_w$ satisfying Theorem 1 such that the networked multi-UAS state $x(t)$ and the BELBIC weight estimation errors are all asymptotically stable.

Proof. See Appendix B.

Remark 2. Based on the BELBIC theory [26] and Equation (17), the distributed intelligent flocking control of networked multi-UAS is obtained as the estimated weights of the Amygdala ($V_l$) and the Orbitofrontal Cortex ($W_l$) converge to the desired targets. According to Theorem 1, the estimated weights converge to the desired targets asymptotically. Therefore, the designed BELBIC input $u^{BEL}_i$ (i.e., Equation (17)) converges to the distributed intelligent flocking control of networked multi-UAS asymptotically.

Simulation Results
This section presents computer-based simulation results showing the performance of the proposed biologically-inspired distributed intelligent flocking control of multi-unmanned aircraft systems (Section 4.2) and multi-unmanned ground systems (Section 4.1) in an obstacle-free environment. A total of 50 unmanned aircraft and 150 unmanned ground vehicles (UGVs) were employed, with initial velocities equal to zero and positions randomly distributed in a square area. The following parameters are used throughout the simulation: for the σ-norm, the parameter $\epsilon = 0.1$; for $\phi(z)$, the parameters $a = b = 5$; for the bump functions of $\phi_{\alpha}(z)$ and $\phi_{\beta}(z)$, $h = 0.2$ and $h = 0.9$, respectively. The same network-induced delay (i.e., $\tau_{i,j} = \tau = 0.3$, $\forall i, j$) is considered for all unmanned aircraft and unmanned ground vehicles. All simulations are carried out on a platform with the following specifications: Windows Server 2012 R2 Standard; Processor: Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70 GHz (4 processors); RAM: 8.00 GB.

Flocking of UGVs in an Obstacle-Free Environment
Figure 3 shows two snapshots of the simulation of the multi-unmanned ground systems in the obstacle-free environment. Figure 3 (Left) shows the 150 UGVs in their initial positions at t = 0 s, while Figure 3 (Right) shows the UGVs at t = 40 s, where they are flocking and have successfully formed a connected network.
For comparison purposes, two similar experiments were performed using the flocking algorithm introduced in [3] and the Multirobot Cooperative Learning for Predator Avoidance (MCLPA) flocking algorithm introduced in [36] instead of the proposed algorithm. Figure 4 shows the velocities of all UGVs on the x- and y-axes for all methods in an obstacle-free environment under the influence of the network-induced delay. The plot shows that the proposed method effectively handles the influence of the network-induced delay, which is deterministic but unknown. Figure 5 shows the Mean Square Error (MSE), $\frac{1}{n} \sum_{i=1}^{n} (p_i - p_d)^2$, of the velocities of the overall group of agents for the flocking method in [3], the MCLPA flocking strategy in [36], and the BELBIC-based flocking when evolving in an obstacle-free environment.
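The MSE metric above can be computed per axis with a short NumPy function. This is a small sketch for illustration; the function name `velocity_mse` and the array layout (one row per agent, one column per axis) are assumptions of this example.

```python
import numpy as np


def velocity_mse(p, p_d):
    """Mean square error (1/n) * sum_i (p_i - p_d)^2, evaluated on each axis.

    p is an (n, m) array of agent velocities and p_d is the desired
    velocity (length m). Returns an array with one MSE value per axis.
    """
    p = np.asarray(p, dtype=float)
    p_d = np.asarray(p_d, dtype=float)
    return np.mean((p - p_d) ** 2, axis=0)


# Example: two agents, desired velocity (2, 0).
mse_xy = velocity_mse([[1.0, 0.0], [3.0, 0.0]], [2.0, 0.0])
```

Evaluating this function at every time step of a simulation yields curves of the kind compared across the three flocking strategies, one for the x-axis and one for the y-axis.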
Table 1 shows the characteristics of the MSE of the velocities of all flocking strategies for UGVs in the obstacle-free environment. The BELBIC-based flocking is presented in dot-dashed red, the MCLPA flocking strategy in [36] in dashed green, and the flocking in [3] in solid blue. Notice that the MSE of the BELBIC-based flocking is smaller, which makes this method more appropriate for implementation on real robots.
Table 1. Characteristics of the MSE of all flocking strategies for UGVs in the obstacle-free environment.
Flocking in [3]    MCLPA [36]

For comparison purposes, two similar experiments were performed using the flocking algorithm introduced in [3] and the Multirobot Cooperative Learning for Predator Avoidance (MCLPA) flocking algorithm introduced in [36] instead of the proposed algorithm. Figure 7 shows the velocities of all UASs on the x- and y-axes for all methods in an obstacle-free environment under the influence of the network-induced delay. The plot shows that the proposed method effectively handles the influence of the network-induced delay, which is deterministic but unknown.
Figure 7. Velocities of all UASs on the x- and y-axes for all methods in an obstacle-free environment under the influence of the network-induced delay. The proposed method (top row), the MCLPA [36] (middle row), and the flocking algorithm proposed in [3] (bottom row).
Figure 8 shows the Mean Square Error, $\frac{1}{n} \sum_{i=1}^{n} (p_i - p_d)^2$, of the velocities of the overall group of agents for the flocking method in [3], the MCLPA flocking strategy in [36], and the BELBIC-based flocking when evolving in an obstacle-free environment. The BELBIC-based flocking is presented in dot-dashed red, the MCLPA flocking strategy in [36] in dashed green, and the flocking in [3] in solid blue. Notice that the MSE of the BELBIC-based flocking is smaller, which makes this method more appropriate for implementation on real robots.
Table 2 shows the characteristics of the MSE of velocities of all flocking strategies for UAVs in the obstacle-free environment.

Conclusions
The challenges of network-induced delay in flocking of networked multi-unmanned aircraft systems have been studied in this paper. A biologically-inspired distributed intelligent control methodology based on the emotional learning phenomenon in the mammalian limbic system has been proposed to overcome these challenges. Considering the influence of network-induced delay in real-time networked multi-unmanned aircraft systems flocking, the proposed technique proved to be a promising tool for practical implementation because of its learning capability and low computational complexity. Computer-based numerical results show the effectiveness of the proposed algorithm for distributed intelligent flocking control of networked multi-UAS.
Future work will consider the implementation of a biologically-inspired intelligent control strategy for addressing real-world scenarios such as tracking, search and rescue, underwater explorations, etc.
To provide the stability analysis of the actual system, consider $u_a$ as the actual controller for the following system:

$$\dot{x} = f(x) + g(x)\, u_a, \qquad (A17)$$

where $u_a = u_s + \tilde{u}$, and $\tilde{u}$ is the controller given by the BELBIC model output $MO$. The Lyapunov function $L_{MO}(x)$ and its first derivative are considered first. Then, considering the Lyapunov function $L_a(x)$, the stability proof of the overall system proceeds by taking the first derivative:

$$\dot{L}_a(x) = x^T [f(x) + g(x)\, u_a] = x^T [f(x) + g(x)\, u_s] + x^T g(x)\, \tilde{u}. \qquad (A20)$$

Figure 1. Computational model of emotional learning.

Figure 2. Brain Emotional Learning Based Intelligent Controller (BELBIC) in the control loop.

Figure 3. Simulation in an obstacle-free environment. (Left) 150 UGVs randomly distributed in a square area at t = 0 s. (Right) At t = 40 s, the 150 UGVs are flocking and have successfully formed a connected network.

Figure 4. Velocities of all UGVs on the x- and y-axes for all methods in an obstacle-free environment under the influence of the network-induced delay. The proposed method (top row), the MCLPA [36] (middle row), and the flocking algorithm proposed in [3] (bottom row).

Figure 5. Mean square value of the velocities of all UGVs on the x-axis (Left) and y-axis (Right) generated by the overall group of agents when flocking in an obstacle-free environment. The BELBIC-based flocking is presented in dot-dashed red, the MCLPA flocking strategy in [36] in dashed green, and the flocking in [3] in solid blue. Notice that the MSE of the BELBIC-based flocking is smaller, which makes this method more appropriate for implementation on real robots.

Figure 6 shows two snapshots of the simulation of the multi-unmanned aircraft systems in the obstacle-free environment. Figure 6 (Left) shows the 50 UASs in their initial positions at t = 0 s, while Figure 6 (Right) shows the UASs at t = 40 s, where they are flocking and have successfully formed a connected network.

Figure 6. Simulation in an obstacle-free environment. (Left) 50 UASs randomly distributed in a square area at t = 0 s. (Right) At t = 40 s, the 50 UASs are flocking and have successfully formed a connected network.

Figure 8. Mean square value of the velocities of all UAVs on the x-axis (Left) and y-axis (Right) generated by the overall group of agents when flocking in an obstacle-free environment. The BELBIC-based flocking is presented in dot-dashed red, the MCLPA flocking strategy in [36] in dashed green, and the flocking in [3] in solid blue. Notice that the MSE of the BELBIC-based flocking is smaller, which makes this method more appropriate for implementation on real robots.

Table 2. Characteristics of the MSE of all flocking strategies for UAVs in the obstacle-free environment.