Intelligent Control of Swarm Robotics Employing Biomimetic Deep Learning

Abstract: The collective motion of biological species is robust and flexible. Because each individual in a biological group interacts asymmetrically with its neighbors, the pairwise interaction is itself asymmetric during collective motion, and modeling it remains challenging. Based on deep learning (DL) technology, experimental data on the collective motion of Hemigrammus rhodostomus fish are analyzed to build an individual interaction model with multi-parameter input. First, a Deep Neural Network (DNN) structure for pairwise interaction is designed. Then, the interaction model is obtained by training the DNN. We propose a novel key neighbor selection strategy, called the Largest Visual Pressure Selection (LVPS) method, to deal with multi-neighbor interaction. Based on the information of the key neighbor identified by LVPS, the individual uses the trained DNN model for the pairwise interaction. Compared with other key neighbor selection strategies, the statistical properties of the collective motion simulated by our proposed DNN model are more consistent with those of fish experiments. The simulation shows that our proposed method can extend to large-scale group collective motion for aggregation control. Thereby, the individual can use quite limited local information to collaboratively achieve large-scale collective motion. Finally, we demonstrate swarm robotics collective motion on an experimental platform. The proposed control method is simple to use, applicable at different scales, and fast to compute. Thus, it has broad application prospects in multi-robot control, intelligent transportation systems, saturated cluster attacks, and multi-agent logistics, among other fields.


Introduction
Collective motion occurs widely in group-living animals and helps groups adapt to their environment by solving problems collectively, for example through predator avoidance or cluster foraging. Hence, extensive research has focused on this topic. For instance, the collective motion of desert locusts was studied in [1], which revealed the principle of the insects' aggregation. Cavagna used machine vision technology to capture the trajectory data of large-scale purple-winged pheasant groups and built a model of social interaction for collective motion [2]. Altshuler et al. [3], inspired by Helbing [4], discovered an interesting collective behavior of ant colonies: symmetry breaking in collective escape motion. These works show that most collective motions are mainly caused by the social interactions between individuals [5,6].
Social interaction is defined as the transmission and processing of distributed information by the individual, and it can be divided into two main structures: hierarchical and egalitarian [7]. Most mammals in nature are organized in hierarchical structures. In contrast, the collective motion of bacteria is often assumed to have an egalitarian structure. Bird flocks and fish schools lie between these two structures. The hierarchical structure of interaction, such as leader-follower relationships in the group, is more ubiquitous and can lead to more effective organization [8,9]. Therefore, since the work on leadership by Couzin et al. [10], the hierarchical interaction model has attracted more and more attention.
However, hierarchical interaction is difficult to study. First, the pairwise interaction between two individuals should be treated asymmetrically, which leads to different models for the two paired individuals. Second, the input of most existing interaction models is based on the positions of the individual and its neighbors [11], whereas the literature [12] proposes that collective motion should also be related to the speed of each individual. In fact, many parameters of each individual could potentially explain the interaction, but it is difficult to build analytical mathematical models with both relative position and speed inputs that explain the asymmetrical interaction [11]. Therefore, building a multi-parameter model of distributed interaction for large-scale collective behavior is still an open and challenging problem.
Data-driven models, especially those based on a Deep Neural Network (DNN), have a strong ability to reveal the relationship between multi-parameter inputs and the individual decision. Because deep learning (DL) technology is suitable for solving complex mapping problems, DNN models have been widely applied in recent years in pattern recognition [13][14][15][16], behavior prediction [17,18], and collective motion analysis [19,20]. However, there are few cases of using DNN models to control swarm robotics.
Swarm robotics has many important potential applications, such as nanoparticles controlled by a magnetic field, which can be used for medical treatments [21][22][23][24]. However, it is very difficult to control swarm robots so that they form collective motion. Most methods for swarm robotics control rely on control theory [25][26][27][28][29][30], but these methods lack flexibility. In contrast, Vasarhelyi G. et al. successfully used a social interaction model to realize the flexible collective motion of 30 drones [31]. Inspired by them, we used a data-driven method on real fish data to build a social interaction model, which could be used to drive our self-made swarm robots (Cuboid) to move collectively [32]. However, neither of these two works used a DNN interaction model.
Due to the computational limitation of the individual, very few works have explicitly addressed the question about how the individual sparsely integrates pairwise interaction with all its neighbors in an animal group [33]. Instead of using average contributions of all neighbors, as many models previously proposed [34][35][36][37][38][39], our previous work suggests that an individual fish pays attention to a few neighbors [32]. This mechanism has the advantage of overcoming the natural limitation of individual information processing [40].
The contributions of this work are as follows. First, we use DL technology to build a pairwise interaction model and analyze the pairwise interaction between real fish. Then, we integrate this pairwise model into the collective motion control of multi-agent groups at different scales. Most researchers argue that each individual should use the information of all neighbors in a certain domain around the focal individual [11,32], as in the Aoki model [34], the Couzin model [35], and the Vicsek model [36]. In the Vicsek model, each individual makes its directional decision together with all individuals within a certain range. However, in starling flocks [37,38], it is believed that the motion decision of each individual in collective motion depends only on a limited number of neighbors. We aim to show that an individual that interacts with only one key neighbor can still form collective motion through our proposed model. The key neighbor is selected by our algorithm based on the visual information of the focal individual. We name this algorithm the Largest Visual Pressure Selection (LVPS) strategy. The assumption that attention is paid to only one neighbor can significantly reduce the computational load of the individual [32]. The fish experiment is used to verify how closely the simulation of our method resembles real fish. Finally, we extend our method to the collective motion control of large-scale multi-agent groups and swarm robotics.

Experimental Data
The fish (Hemigrammus rhodostomus) collective motion experimental data were downloaded from the supplementary materials of [11,32]. These data were extracted from experimental video via the idTracker software [41] (see Figure 1A). The fish used in the collective motion experiment were 30 mm long and 2.5 mm wide. The fish had a burst-and-coast swimming pattern, which means that a fish first changes its direction while increasing its speed and then decelerates in a straight glide [11]. Most of the heading angle change takes place at the beginning of the acceleration phase. We call the speed increase of a fish a "kick." At each kick, the fish makes a motion decision, which consists of the moving distance, the kick duration, and the heading variation. We detected 60,312 kicks in the five-fish experiment and 147,776 kicks in the two-fish experiment. We used the two-fish experimental data to build our DNN model and the five-fish data to verify it.
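The notion of a kick as the moment the speed starts to increase can be illustrated with a small sketch. It assumes that a kick onset is a local minimum of the speed time series; the function name `detect_kick_onsets`, the `min_rise` threshold, and the toy speed trace are hypothetical and are not taken from the original data processing.

```python
def detect_kick_onsets(speeds, min_rise=0.0):
    """Return indices where the speed starts to increase (assumed kick onsets).

    Index k is reported when speeds[k] is a local minimum: the speed was not
    increasing before k and rises by more than `min_rise` right after k.
    This criterion is an assumption made for illustration only.
    """
    onsets = []
    for k in range(1, len(speeds) - 1):
        falling_or_flat = speeds[k] <= speeds[k - 1]
        rising = speeds[k + 1] - speeds[k] > min_rise
        if falling_or_flat and rising:
            onsets.append(k)
    return onsets


# Toy burst-and-coast speed trace (mm/s): two bursts separated by glides.
speeds = [80, 60, 45, 120, 100, 70, 50, 130, 110]
print(detect_kick_onsets(speeds))
```

On real tracking data, a positive `min_rise` or prior smoothing would probably be needed so that tracking noise is not counted as a kick.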
The heading angle $\phi_i$ of fish $i$ is the angle between the fish velocity vector $\vec{v} = (v_x, v_y)$ and the horizontal $x$ axis, $\phi_i = \operatorname{atan2}(v_y, v_x)$. The positive direction of the heading angle is counterclockwise, and its value is restricted to the range $(-\pi, \pi]$. The fish velocities are calculated from the tracked positions. The relative orientation of the focal fish with respect to its neighbor $j$ is obtained from the two heading angles.
From the positions and headings, the following local quantities are defined for the focal fish $i$ and its neighbor $j$: the distance to the wall $r_W$, the angle to the wall $\theta_W$, the speed $V_i$ of the focal fish, the distance to the neighbor $j$, the viewing angle $\psi_{ij}$ of the neighbor $j$, and the orientation difference $\Delta\phi_{ij}$ from the focal fish to its neighbor $j$. Since the two positions of one neighbor at two consecutive decision moments reflect how the neighbor's position changes with respect to the focal fish, the average relative speed of neighbor $j$ is defined over this interval, and the average speed of the focal fish $i$ is defined in the same way.
The motion decisions of the focal fish $i$ are the heading angle change $\delta\phi_i$, the kick distance $l_i$, and the kick duration $KT_i$, which are calculated from the positions and heading angles of two sequential kicks, where $t_n^d$ is the decision time at which one kick occurs and $t_{n+1}^d$ is the decision time of the next kick.
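The quantities listed above can be computed from tracked positions and headings along the following lines. This is a minimal sketch under stated assumptions, not the authors' exact formulas: the tank is taken as a circle of radius 250 mm centred at the origin, the angle to the wall is taken as the heading minus the angular position of the fish, and the viewing angle as the bearing of the neighbor minus the focal heading. The function and key names are hypothetical, and the average relative speed of the neighbor is left out because its exact form is not recoverable from the text.

```python
import math

TANK_RADIUS_MM = 250.0  # 25-cm circular tank, assumed centred at the origin


def wrap_angle(a):
    """Wrap an angle to the interval (-pi, pi]."""
    a = math.fmod(a, 2.0 * math.pi)
    if a <= -math.pi:
        a += 2.0 * math.pi
    elif a > math.pi:
        a -= 2.0 * math.pi
    return a


def pairwise_features(pos_i, phi_i, speed_i, pos_j, phi_j):
    """Local information of focal fish i about the wall and neighbor j."""
    xi, yi = pos_i
    xj, yj = pos_j
    return {
        # distance from the focal fish to the circular wall
        "r_w": TANK_RADIUS_MM - math.hypot(xi, yi),
        # heading relative to the outward normal of the wall (assumed form)
        "theta_w": wrap_angle(phi_i - math.atan2(yi, xi)),
        "v_i": speed_i,
        # distance to the neighbor
        "d_ij": math.hypot(xj - xi, yj - yi),
        # bearing of the neighbor seen from the focal heading (assumed form)
        "psi_ij": wrap_angle(math.atan2(yj - yi, xj - xi) - phi_i),
        # heading of the neighbor relative to the focal heading
        "dphi_ij": wrap_angle(phi_j - phi_i),
    }


def kick_decision(pos_n, phi_n, t_n, pos_next, phi_next, t_next):
    """Heading change, kick length and kick duration between two kicks."""
    dphi = wrap_angle(phi_next - phi_n)
    length = math.hypot(pos_next[0] - pos_n[0], pos_next[1] - pos_n[1])
    return dphi, length, t_next - t_n


# Focal fish 50 mm from the wall, swimming parallel to it, neighbor ahead.
print(pairwise_features((200.0, 0.0), math.pi / 2, 100.0,
                        (200.0, 30.0), math.pi / 2))
```

With these conventions, a fish swimming parallel to the wall has an angle to the wall of about ±90 degrees, and a neighbor directly ahead has a viewing angle of zero.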

Deep Neural Network (DNN) Model
Based on the trajectories of the two-fish experiment, a data-driven model of the interaction of the focal fish with the environment and with its neighbor was trained. The interaction model can be represented by a function that maps local information to the motion decision. The environment (the circular wall of radius 25 cm in Figure 1A) can be regarded as a static obstacle, while the moving neighbor can be treated as a dynamic obstacle. Thus, the focal fish has to take both types of information into account to determine the action for the next kick. Because a large amount of perceptual information enters the decision, building the interaction model with analytical mathematical functions was difficult. Hence, we used a DNN model to solve this problem.
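For the focal fish $i$ at decision time $t_n^d$, the model maps the local information about the wall and the neighbor to the motion decision, namely the heading angle change $\delta\phi_i(t_n^d)$, the kick distance $l_i(t_n^d)$, and the kick duration $KT_i(t_n^d)$ (see Figure 1D).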
The distributions of the input and output data of the two-fish experiment are illustrated in Figure 2. The individuals almost always swam near the wall, as shown by the distribution of the relative distance to the wall, $r_W(t)$. The viewing angle of the neighbor (see Figure 2E) was close to zero (for the follower) and close to $\pm\pi$ (for the leader). One fish was always aligned with and close to the other (see also Figure 2H,I). We used these data as the record set to train our DNN model for pairwise interaction. The data recorded at each decision time contain the relative position and average speed information for both the sudden speed increase (kick) and the following passive gliding period. Therefore, we designed two DNN models to mimic these two decision phases (see Figure 3). The first DNN model was named the Angle Changing Network (ACN), and the second one was named the Length and Duration Network (LDN). Since the functionality and input information of the two DNN models were similar, the input layer and the hidden layer of the ACN and LDN had the same structure, and both networks had 7 input neurons. The activation function of the output layer of both networks was a linear activation function, which is suitable for regression applications.
The output layer of the ACN has a single neuron, which generates the heading angle change. Given the static information of the focal fish $i$ and the dynamic information of its neighbor $j$, the ACN outputs $\delta\phi_i(t_n^d)$. After the ACN generates this output, the LDN uses the value to update its input information. The static information is recomputed for the new heading, because the changed orientation angle to the wall affects the subsequent straight motion. For the same reason, the dynamic information of neighbor $j$ is also updated before it is fed to the LDN. The LDN, whose weight parameters are denoted by $w_{LDN}$, then produces two outputs: the kick distance $l_i(t_n^d)$ and the kick duration $KT_i(t_n^d)$.
Following the standard procedure for training a regression DNN, we used the mean squared error, accumulated over all decision times $t_n^d \in T^d$, as the loss function for both the ACN and the LDN, where $T^d$ is the set of all decision times. We employed the Adam optimizer [42] to minimize the loss function. The learning rate was set to 0.0005. We randomly selected 20% of all recorded samples as the test set. The dropout algorithm [43] was adopted to improve the generalization ability of the model.
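The two-stage decision described above (ACN first, then LDN on updated inputs) can be sketched as follows. The sketch assumes that the angle to the wall is the heading minus the angular position of the fish, that the viewing angle is the bearing of the neighbor minus the heading, and that the orientation difference is the neighbor heading minus the focal heading, so a heading change of `delta_phi` shifts these three angles by a known amount. The dictionary keys and the stand-in `toy_acn` and `toy_ldn` callables are hypothetical; a trained network would take their place.

```python
import math


def wrap_angle(a):
    """Wrap an angle to the interval (-pi, pi]."""
    a = math.fmod(a, 2.0 * math.pi)
    if a <= -math.pi:
        a += 2.0 * math.pi
    elif a > math.pi:
        a -= 2.0 * math.pi
    return a


def update_inputs_for_ldn(static_info, dynamic_info, delta_phi):
    """Refresh the heading-dependent inputs after the ACN turn.

    Under the angle conventions assumed here, turning by delta_phi adds
    delta_phi to the angle to the wall and subtracts it from the viewing
    angle and from the orientation difference.
    """
    new_static = dict(static_info)
    new_dynamic = dict(dynamic_info)
    new_static["theta_w"] = wrap_angle(static_info["theta_w"] + delta_phi)
    new_dynamic["psi_ij"] = wrap_angle(dynamic_info["psi_ij"] - delta_phi)
    new_dynamic["dphi_ij"] = wrap_angle(dynamic_info["dphi_ij"] - delta_phi)
    return new_static, new_dynamic


def decide(static_info, dynamic_info, acn, ldn):
    """Run the ACN, update the inputs, then run the LDN."""
    delta_phi = acn(static_info, dynamic_info)
    new_static, new_dynamic = update_inputs_for_ldn(
        static_info, dynamic_info, delta_phi)
    length, duration = ldn(new_static, new_dynamic)
    return delta_phi, length, duration


def toy_acn(s, d):
    """Stand-in for the trained ACN: turn halfway toward the neighbor."""
    return 0.5 * d["psi_ij"]


def toy_ldn(s, d):
    """Stand-in for the trained LDN: constant kick of 60 mm in 0.5 s."""
    return 60.0, 0.5


static = {"r_w": 50.0, "theta_w": math.pi / 2, "v_i": 100.0}
dynamic = {"d_ij": 30.0, "psi_ij": 0.4, "dphi_ij": 0.2, "v_ij": 0.0}
print(decide(static, dynamic, toy_acn, toy_ldn))
```

Keeping the update in its own function makes it easy to check that the LDN sees inputs consistent with the new heading rather than the heading at the start of the decision.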

The Fusion Method of Pairwise Interaction for the Multi-Agents
Instead of using the average contributions of all neighbors, as many previously proposed models do [34][35][36][37][38][39], our previous work suggests that an individual paying attention to only a few neighbors can lead to collective motion [32]. This mechanism may overcome the natural limit on the information that each individual can process [40]. In this paper, we wanted to test whether collective motion could emerge in the group when each individual interacted with only one neighbor. Hence, we tested three different neighbor selection strategies to investigate their impact on collective motion. We then compared the results of five-agent simulations implemented with the different neighbor selection strategies with the five-fish experiments. The three neighbor selection strategies were Nearest Neighbor Selection (NNS), Random Neighbor Selection (RNS), and Largest Visual Pressure Selection (LVPS).
For NNS, each individual considers only the information of its nearest neighbor at the decision time. For RNS, the neighbor is selected at random. For LVPS, the focal fish chooses as its leader the neighbor with the largest visual pressure. Due to the importance of vision for fish, social interaction based on visual sensory input has been intensely researched [44]. Here, we define the visual pressure on the focal fish as the visual angle that a neighbor subtends at the focal fish (see Figure 4). The larger the visual angle of the neighbor, the greater the visual pressure exerted on the focal fish by this neighbor. In order to simplify the calculation of the visual pressure angle, each neighbor can be considered as a straight vector. In Figure 4, the red fish $i$ is the focal fish. The blue fish $j$ and the yellow fish $k$ are its neighbors. A relative coordinate system is established at the center of the focal fish body.
In this coordinate system, two angles $\alpha_{ij}$ and $\beta_{ij}$ are computed from BL, the average body length of the fish (30 mm in the fish experiment), and from the local information of fish $i$ with respect to the neighbor $j$. Based on $\alpha_{ij}$ and $\beta_{ij}$, the visual pressure angle $\theta_{ij}$ of neighbor $j$ is then calculated. Note that the visual pressure angles of the blue fish $j$ and the yellow fish $k$ overlap. Hence, it might seem that the overlapped part should be subtracted from the visual pressure angle of the blue fish. However, fish have memory and can anticipate what they do not see. For instance, fish are able to predict the behavior of prey that is hidden for a short time [45]. For this reason, even when it sees only part of a neighbor, the focal fish should be able to perceive the full body of that neighbor. Thus, we used the visual pressure angle with respect to the full body of each neighbor, which is different from the method in the literature [44].
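The three selection strategies can be compared with a short sketch. It assumes that each neighbor is a straight segment of one body length centred on its position and aligned with its heading, and that the two angles are the bearings of the segment's head and tail seen from the focal fish, so the visual pressure angle is the angle between them. Occlusion is ignored, in line with the full-body argument above. The function names are hypothetical.

```python
import math
import random

BODY_LENGTH_MM = 30.0  # average body length BL in the fish experiment


def wrap_angle(a):
    """Wrap an angle to the interval (-pi, pi]."""
    a = math.fmod(a, 2.0 * math.pi)
    if a <= -math.pi:
        a += 2.0 * math.pi
    elif a > math.pi:
        a -= 2.0 * math.pi
    return a


def visual_pressure_angle(pos_i, pos_j, phi_j, body_length=BODY_LENGTH_MM):
    """Angle subtended at fish i by neighbor j modelled as a segment."""
    half = 0.5 * body_length
    head = (pos_j[0] + half * math.cos(phi_j), pos_j[1] + half * math.sin(phi_j))
    tail = (pos_j[0] - half * math.cos(phi_j), pos_j[1] - half * math.sin(phi_j))
    # Bearings of head and tail; their difference does not depend on the
    # heading of the focal fish, so the global frame is used for brevity.
    alpha = math.atan2(head[1] - pos_i[1], head[0] - pos_i[0])
    beta = math.atan2(tail[1] - pos_i[1], tail[0] - pos_i[0])
    return abs(wrap_angle(alpha - beta))


def select_neighbor(i, positions, headings, strategy="LVPS", rng=random):
    """Pick the single neighbor that agent i interacts with."""
    others = [j for j in range(len(positions)) if j != i]
    if strategy == "NNS":
        return min(others, key=lambda j: math.hypot(
            positions[j][0] - positions[i][0],
            positions[j][1] - positions[i][1]))
    if strategy == "RNS":
        return rng.choice(others)
    if strategy == "LVPS":
        return max(others, key=lambda j: visual_pressure_angle(
            positions[i], positions[j], headings[j]))
    raise ValueError("unknown strategy: " + strategy)


# Neighbor 1 is closer but seen end-on; neighbor 2 is farther but broadside.
positions = [(0.0, 0.0), (0.0, 40.0), (60.0, 0.0)]
headings = [0.0, math.pi / 2, math.pi / 2]
print(select_neighbor(0, positions, headings, "NNS"),
      select_neighbor(0, positions, headings, "LVPS"))
```

The example shows how the two deterministic strategies can disagree: the nearest neighbor is seen end-on and subtends almost no angle, while the farther neighbor is seen broadside and exerts the larger visual pressure.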

Software Configuration of the Simulation Platform
The DNN interaction model was a core module of the simulation, which was written in Python and LabVIEW. The DNN training software was written with TensorFlow, a Python library. The multi-agent simulation and the Graphical User Interface (GUI) were written in LabVIEW because it is convenient for Object-Oriented Programming (OOP). OOP makes it easy to keep the information about each agent separate in a single software unit. With this facility, all agents in the simulation software are organized by simulation time. At each decision time, the focal agent independently asks Python for a new motion decision. The communication interface between LabVIEW and Python is a client-server program. The server program runs in Python. It receives the DNN input from the focal agent running in the LabVIEW simulation software. The input includes the local static and dynamic information of the focal agent. After neighbor selection for interaction, the Python server program uses TensorFlow to compute the motion decision output of the DNN model (ACN and LDN). This output is then returned to the LabVIEW client, which sends the decision to the focal agent (see Figure 5).
After each decision, agent $i$ moves along its new heading while a timer $T_i(t)$ counts down the kick duration; its speed during this period is the average speed of the decision made at $t_n^d$. When the timer value of agent $i$ is less than zero (i.e., $T_i(t) < 0$), agent $i$ reaches a new decision moment $t_{n+1}^d$ and asks the server for a new motion target. If the distance to the wall is less than one body length, the agent resets its timer to stop the current kick and then asks the Python server for a new decision (see Algorithm 1 for details).
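The timer logic of one agent can be sketched as below. This is an illustration under assumptions, not a transcription of Algorithm 1: the agent is assumed to move at the constant average speed of the current decision (kick length divided by kick duration), the tank is a 250 mm circle centred at the origin, and `decide` is a hypothetical callback standing in for the request to the Python server. The test `timer <= 0` is used instead of the strict inequality so that a kick lasts exactly its duration on a discrete time grid.

```python
import math

BODY_LENGTH_MM = 30.0
TANK_RADIUS_MM = 250.0  # assumed circular tank centred at the origin


class Agent:
    def __init__(self, x, y, phi):
        self.x, self.y, self.phi = x, y, phi
        self.speed = 0.0
        self.timer = 0.0      # remaining kick time; <= 0 means decide now
        self.decisions = 0    # number of decisions requested so far

    def distance_to_wall(self):
        return TANK_RADIUS_MM - math.hypot(self.x, self.y)

    def step(self, dt, decide):
        """Advance the agent by dt, requesting a new decision when due."""
        near_wall = self.distance_to_wall() < BODY_LENGTH_MM
        if self.timer <= 0.0 or near_wall:
            # New decision moment: heading change, kick length, duration.
            dphi, length, duration = decide(self)
            self.phi += dphi
            self.speed = length / duration  # average speed of this decision
            self.timer = duration           # reset the kick timer
            self.decisions += 1
        # Straight motion along the current heading.
        self.x += self.speed * dt * math.cos(self.phi)
        self.y += self.speed * dt * math.sin(self.phi)
        self.timer -= dt


def constant_decision(agent):
    """Hypothetical stand-in for the server: go straight, 60 mm in 0.5 s."""
    return 0.0, 60.0, 0.5


agent = Agent(0.0, 0.0, 0.0)
for _ in range(4):
    agent.step(0.125, constant_decision)
print(agent.x, agent.decisions)
```

As written, an agent that stays within one body length of the wall requests a decision at every step, which follows the literal reading of the wall rule.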

Statistical Properties of Collective Motion
The trajectories of the five-agent simulations were compared with those of the five-fish experiment to evaluate the effectiveness of our model. We selected six different statistical properties for this comparison. They rely on the barycenter of the group, whose position is the mean of the positions of all individuals, $x_B(t) = \frac{1}{N}\sum_{i=1}^{N} x_i(t)$ and $y_B(t) = \frac{1}{N}\sum_{i=1}^{N} y_i(t)$, where $N$ is the total number of agents in the group. The speed of the barycenter is obtained from the position of the barycenter, and the direction of the barycenter is then given by the direction of its velocity. The barycenter defines a reference system in which the relative position and velocity of each fish are expressed.
In the global reference system, we defined the first four statistical properties as follows:
1. The distance from all fish (agents) to the wall.
2. The angle of all fish (agents) to the wall.
3. The polarization of the group, $P(t)$.
4. Group size $C(t)$: when the group is compact, $C(t)$ is low, and vice versa.
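In the relative coordinate system with respect to the barycenter, we defined the remaining two properties of the collective behavior: the milling behavior of the individuals, that is, their rotation around the barycenter, and the relative speed of the individuals with respect to the barycenter.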
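Three of the group-level measures can be sketched in a few lines. The sketch assumes the usual definitions: the barycenter is the mean position, the polarization is the norm of the mean unit heading vector, and the group size is the root mean square distance of the individuals to the barycenter. These forms are assumptions made for illustration and may differ in detail from the formulas used in the study; the function names are hypothetical.

```python
import math


def barycenter(positions):
    """Mean position of the group."""
    n = len(positions)
    return (sum(p[0] for p in positions) / n,
            sum(p[1] for p in positions) / n)


def polarization(headings):
    """Norm of the mean unit heading vector: 1 when all headings agree."""
    n = len(headings)
    cx = sum(math.cos(h) for h in headings) / n
    cy = sum(math.sin(h) for h in headings) / n
    return math.hypot(cx, cy)


def group_size(positions):
    """Root mean square distance to the barycenter: low when compact."""
    bx, by = barycenter(positions)
    n = len(positions)
    return math.sqrt(sum((x - bx) ** 2 + (y - by) ** 2
                         for x, y in positions) / n)


positions = [(0.0, 0.0), (60.0, 0.0), (0.0, 60.0), (60.0, 60.0)]
headings = [math.pi / 2] * 4
print(barycenter(positions), polarization(headings), group_size(positions))
```

Evaluating these functions at every time step of a trajectory and building histograms of the results gives the kind of PDFs compared between simulation and experiment.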

Results
The results are divided into two parts. The first part analyzes the pairwise interaction of the model by comparing the two-agent simulation with the two-fish experiment. The second part analyzes the neighbor selection hypotheses by comparing the five-agent simulation with the five-fish experiment.

The Effect of Model Pairwise Interaction
In Figure 6, we present the distributions of six properties of the pairwise interaction behavior of the DNN model. These properties include the three output values (heading angle change, kick distance, and kick duration), which are shown in Figure 6A-C. The mismatch between the DNN outputs for the training data and the recorded fish decisions can be regarded as the training error on the fish motion data. Figure 6A-C shows that the distributions of both the simulation outputs (blue lines) and the training data outputs (black lines) of the DNN were narrower than those of the real fish (red lines), because the DNN model filtered the noise of the original fish schooling data. Thus, even when the DNN model is driven by the training data input, the PDF of the output values is narrower than that of the original training labels. In order to prevent overfitting, we stopped the learning iteration when the training error became larger. Hence, the DNN model learned the general characteristics of the data and filtered out special features such as the noise of the individual. As a result, the distribution of the DNN model output was sharper than that of the real fish. The peaks of the kick length for both the real fish and the DNN simulation were around 2 BL (60 mm), and the average values of the kick duration distributions of the DNN output, the simulation, and the real fish were similar. We then compared the signed relative orientation between the two fish, which illustrates their alignment (see Figure 6D). The simulation showed that the two agents stayed aligned, as the real fish did. In Figure 6E, the solid and dashed lines show the relative distance to the wall of the leader and the follower, respectively. We defined the leader as the agent (fish) with the larger viewing angle, so that $\psi_{ij}(t)$ was in the range $[90^\circ, 180^\circ]$. Note that the literature [11] indicates that the leader and follower relationship is not stable in the experiment (i.e., the fish change roles all the time).
To analyze the pairwise interaction in more depth, we plotted the heading angle change of the focal individual 1 as a function of the relative position $(\Delta x, \Delta y)$ and orientation $\Delta\phi_{12}$ of its neighbor, individual 2 (see Figure 7A,B). Since the DNN pairwise interaction model has seven input parameters, the parameters describing the wall were held fixed, with the angle to the wall chosen so that the heading of the individual was always parallel to the wall. We selected these two parameter values because the fish group frequently swam at this position in the environment (see Figure 6E). The heading angle change is shown in Figure 7A. The green color means turning left, while yellow means turning right. Because the wall was on the left, the focal fish mainly turned right to avoid a collision with the environment. This is why the main color of each panel in Figure 7A is yellow. We also investigated the speed variation $l_1 / KT_1 - V_1$ of the focal fish, which reflects the influence of the neighbor and is determined by the output of the LDN and the speed of the focal fish (see Figure 7B). The red color means acceleration of the focal fish, while blue means deceleration.
The focal fish was sensitive to the heading angle of a neighbor in front of it (see the strong alignment in the three top central panels of Figure 7A). If the neighbor's heading pointed to the left or to the right, the focal fish turned left or right accordingly (green and yellow). Furthermore, the focal fish decelerated for collision avoidance when its orientation was the same as that of the neighbor. On the other hand, the focal fish accelerated toward the neighbor when the orientation of the neighbor was different from its own (see the three top central panels of Figure 7B).
The bottom three central panels of Figure 7A,B show the situation where the neighbor was behind the focal fish. If the neighbor moved to the left, the focal fish turned right (yellow) and decelerated to wait for the neighbor. In the opposite case, the focal fish turned left (green), toward the outside of the tank, and accelerated to keep the leading position. Additionally, the focal fish kept its direction of motion when the neighbor had the same orientation.
If the neighbor was on the left of the focal fish, the focal fish decelerated when $\Delta\phi_{12} = 0$ and accelerated toward the neighbor when $\Delta\phi_{12}$ was large. If the neighbor was on the right and the wall was on the left, the focal fish decelerated to avoid colliding with the wall. If the neighbor was on the left and in front, the focal fish tended to align with the neighbor. Conversely, if the neighbor was on the left and behind, the focal fish ignored it and turned right to avoid the wall.

The Analysis of the Multi-Fusion Method of Pairwise Interaction
The cohesion of the group in the fish experiment (red lines) and in the DNN model simulation was high, with $C \approx 50$ mm (see Figure 8D). However, the cohesion of the DNN model simulation with the Nearest Neighbor Selection (NNS, green lines) strategy was low. The cohesion PDF of the simulation with the Random Neighbor Selection (RNS, blue lines) strategy was wider than that of the fish. Only the simulation with the Largest Visual Pressure Selection (LVPS, black lines) strategy was more compact than the fish group. Figure 8C shows the PDF of the polarization of the group. All strategies except NNS were highly polarized (with a large peak at $P \approx 1$), which means that all individuals swam in the same direction. All individuals swam near the wall ($r_W \approx 50$ mm) (see Figure 8A) and were always parallel to the wall ($\theta_W \approx 90^\circ$) (see Figure 8B). The PDF of the DNN model simulation was sharper than that of the fish experiment, because the DNN model filtered the noise of the pairwise interaction. The relative speed shown in Figure 8F was similar for the DNN model simulation and the fish experiment. Counter-milling behavior was observed more frequently than over-milling behavior in the fish experiment (see Figure 8E). The counter-milling behavior arose because the leader fish (at the front of the group) decelerated, as they were closest to the wall. In contrast, the follower fish had more space on the inner side, away from the wall, and hence they moved faster than the leader. Thus, a follower would catch up with the leader and become the new leader at the front of the group. This process repeated continuously, causing all fish to rotate around the group barycenter. In counter-milling behavior, the direction of rotation around the barycenter is opposite to the direction in which the group swims around the experimental tank. This collective behavior was mainly caused by the asymmetric interaction.
The results in Figure 8 show that the LVPS strategy led to a more compact and stable collective motion than the other neighbor selection strategies, and that this motion was more similar to that of the real fish.
We extended our pairwise interaction model with the LVPS strategy to a simulation with 100 agents (see Figure 9). It took about 2.5 min for a compact collective motion group to aggregate from a random state. Compared with other social interaction models, our model could form collective motion by interacting with only one neighbor, which was selected by the LVPS strategy. This property reduces the computational load that an individual spends on forming the collective motion.
Finally, we applied the DNN model with the LVPS strategy to three Cuboid robots for collective motion (see Figure 10). The diameter of the circular robot platform was 1000 mm, and the body length and width of the Cuboid robot were both 40 mm. Figure 10 shows the control structure and related functions of the Cuboid robots' experimental platform; detailed information about the platform is given in [32]. We ran a one-hour collective motion experiment with three robots. Figure 10A-E shows a top-view sequence of the robots' motion in the experiment. The PDFs of the Cuboid robot experiment are shown in Figure 11. All robots moved along the wall in the experiment. The relative distance to the wall was small (see Figure 11A). Figure 11B shows that the relative angle to the wall was kept at approximately 90 degrees. The polarization of the group was high, which was similar to that of the fish group (see Figure 11C). Meanwhile, Figure 11D shows that the cohesion of the group was also high, which means that the Cuboid robot group was always compact.

Discussion and Conclusions
In collective motion, each individual should adjust its behavior to adapt to its neighbors. Previous works suggest that one individual needs to interact with many neighbors to achieve group cohesion [34][35][36]. However, focusing on one neighbor can overcome the low information processing ability of the individual. Therefore, selecting the most influential neighbor in the group with which to interact is very important for understanding the coordination mechanisms of the group.
Here, we developed a pairwise interaction model of collective motion based on a DNN, which could achieve stable collective motion with one-neighbor interaction. The neighbor was selected by the largest visual pressure. The comparison between the two-fish and five-fish experiments and the DNN model simulations verified the similarity between the motion produced by our method and that of natural fish. All simulated agents moved in the same direction in a compact group. Counter-milling occurred both in the fish groups and in the agents' simulation with the LVPS strategy. This property of collective behavior enabled all individuals to alternate their positions in the group. For large-scale collective motion, we extended our method to a simulation with 100 agents to verify its aggregation ability. The simulation showed that our method could form stable large-scale collective movement in a short time.
The deep attention network model proposed in [46] can only provide the possibility of the individual changing its direction. In contrast, our model outputs not only the specific steering angle, but also the straight moving distance and time. Pairwise interaction analysis showed that the follower preferred to decelerate for safety and maintain the alignment rather than change its heading angle to avoid the neighbor in front, which is different from the results of other studies in the literature [11]. When the focal fish becomes the leader, it keeps its speed unchanged while the followers are aligned, whether the follower's speed is fast or slow. When the follower's orientation angle is different from that of the leader, the leader turns to maintain its leadership.
The proposed method has the potential to control swarm robotics. The performance of the Cuboid robots' motion was similar to that of the fish. This means that the DNN model control of real swarm robots was stable, flexible, and scalable, because the fish collective motion from which it was learned is robust and flexible. These control characteristics can help swarm robotics to be applied in many areas, such as multirobot cooperative pursuit [47] and exploration missions in dangerous areas with swarms [48].
Our pairwise interaction DNN model can integrate the information of both the static environment and the dynamic neighbors. Since this is only preliminary research on deep learning technology in swarm robotic control, the model still needs the ability to avoid collisions by prediction. In the future, we will add predictive information about the neighbor to the deep network model and explore more complex environments, such as large-scale traffic congestion of swarm robotics.