1. Introduction
In recent years, the distributed control of unmanned aerial vehicles (UAVs) has gained significant attention among researchers. Indeed, the scalability of modern control protocols, alongside the price convenience of the platforms, allows for more accessible planning of missions with multiple agents. This provides robustness against a single point of failure, increased coverage, and greater time efficiency.
Formation [
1] is a prominent issue to tackle when controlling a group of UAVs, as drones commonly have to reach a safe relative distance during the execution of a mission. In the literature, several strategies are exploited to achieve this in a distributed way. Formation has been tackled through the use of artificial potential fields [
2], treating each agent as a charged particle in an electric field. The geometric properties of the agents have been employed to deal with this problem [
3]. Optimization methods [
4,
5,
6] have also been explored to minimize a cost function subject to several constraints.
The aim of creating a formation usually revolves around a single [
7] or multiple [
8,
9] targets to be chased by the swarm, to perform tasks such as bridge inspection, patrolling, or surveillance. Unlike many case studies reported in the literature [
10,
11], in real-life scenarios, the state of the target may not be accessible to the members of the swarm, as the target may be non-collaborative. In this situation, its precise state variables are unknown and must be estimated through certain estimation filters, as in [
12]. Therefore, the agents have to follow the target location as provided by some collaborative estimation process, rather than by its exact position.
Hence, it is evident that the convergence of the estimation procedure is crucial for the outcome of the formation task [
13]. The purpose of this paper is to outline and analyze an approach for a target estimation performed collaboratively onboard by a swarm of UAVs in formation flight.
A flocking consensus protocol, first established in [
14], is adopted as a starting point for the formation task. In particular, we implement a tailored version of the algorithm in [
14] presented by the authors in [
15]. Indeed, the proposed version eliminates the errors occurring at steady-state in the relative distance between agents. Moreover, it is able to achieve smoother transient behavior with respect to the protocol in [
14]. However, in [
15], the target states were straightforwardly sent to all members of the swarm, so that the resulting topology behaved as a leader–followers one.
Therefore, in [
16], the authors applied the formation protocol presented in [
15] alongside a collaborative estimation method. In particular, the information form of a decentralized Kalman filter was used. The selection of this filter form was dictated by the many advantages it yields in decentralized sensor networks [
17]. With this configuration, every drone acted as a mobile sensor whose measurements of the target state were affected by some noise. Namely, a range-bearing sensor was considered. The information provided by it was first employed locally and then integrated with the one coming from neighboring agents. This resulted in a leaderless topology that brought the configuration closer to an actual setup.
However, in [
16], a linear process was selected to model the target dynamics during the prediction step of the filter. That is, the target was assumed to be a non-maneuvering one. When the estimation process was tested against a target covering a sinusoidal trajectory, the algorithm provided satisfactory results only when the interaction topology between UAVs was fully connected. Instead, when no information fusion was performed, the linearity of the process model was not able to handle the non-linear path, resulting in higher estimation errors.
That is why, in this work, we compare the performance of the linear process model, which we will refer to as the constant velocity (CV) model, with two non-linear models, i.e., a constant turn (CT) and a full-state (FS) model. The former assumes a constant turn rate for the target motion, while the latter takes into account the inner-loop dynamics of the target UAV. In this way, it is possible to evaluate the benefits of employing non-linear models, even though this leads to increased complexity and a higher computational load. Such an approach towards multiple model estimations has been applied in the literature for a single observer in [
18,
19], or for a team of agents in [
20] to the classic Kalman filter formulation. Instead, this paper investigates the effects of different model estimations in the information form of the filter.
A robot operating system (ROS) is used to provide the simulation results. In particular, the software-in-the-loop (SITL) mode available through the PX4 autopilot stack [
21] allows it to run multiple vehicles simultaneously, through the Gazebo simulator. Also, it is possible to exploit the navigation variables that would be accessible on-board actual UAVs, so that the autopilot can be interfaced as in a real implementation.
Metrics such as the estimation root mean square error and the drones’ target distances will be presented to analyze the influence of the process model selection on the outcome of the process.
The remainder of this paper is organized as follows:
Section 2 introduces the flocking protocol, alongside the tailored version developed by the authors. In
Section 3, the decentralized estimation task is discussed and the difference between the three selected process models is highlighted.
Section 4 presents the features of the simulation and illustrates the numerical examples. In
Section 5, a general discussion of the results is given.
Section 6 draws our conclusions and provides some comments on future work.
2. Tailored Flocking Algorithm
In this section, we discuss the formation protocol. In particular, we employ, as a starting point, the popular flocking algorithm established in [
14]. In this paper, we denote it as the
standard protocol. In [
14], it is shown that this algorithm is able to provide, for double-integrator agents, the three flocking rules introduced in [
22]. Namely, these rules are: staying in the vicinity of the center of the swarm; avoiding collision with other agents; and having a common velocity value.
Consider a group of
N agents with double-integrator dynamics
where
and
indicate the position and velocity of agent
i, respectively. The input of the
standard protocol consists of the summation of three contributions, as
The first one,
, regulates the inter-agent distances to a desired value
d. It is given by the gradient of a potential field acting between any two agents, whose minimum is located in
d. The gradient is defined as
where
is a positive gain and
is the set of neighbors of agent
i. The function
is defined as
where
indicates a map
differentiable everywhere given by
, with
. The parameter
depends on the communication range
and the relative distance between agents. In particular,
and
In this way, the force repelling two agents that are dangerously close to each other is stronger than the attractive one. Furthermore, agents that are further than do not contribute to the control input.
The second term
of Equation (
1) is an average consensus step, expressed as
where
is a positive gain. This term brings the agents to an agreement on their velocity values.
Finally, denoting
and
as the position and velocity of a target, the contribution
is a PD (proportional-derivative) controller driving the position and velocity of the agents toward the target ones, where
and
are positive gains.
However, the
standard protocol presents some issues when it is applied to non-linear agents. First, a steady-state offset arises on the relative distance between agents, as found in [
23]. Hence, in [
15], the authors modified the
standard protocol to overcome this issue. Specifically, the error in the inter-agent distances
allows us to define the following integral action
where
is a positive gain. This term was included to eliminate the steady-state offset, even though the transient phase started to display a larger overshoot and considerable oscillations. Thus, the following integral action on the velocity mismatch between each agent and the target
was further employed to dampen the initial transient, where
is a positive gain. Finally, a dynamic gain that multiplies the position error between the agents and the target was used, as
This was done to decrease the attractive force of the target in its vicinity, where
is a parameter regulating the degree of this adjustment.
After these adaptations, the formation protocol could eliminate the steady-state offset while providing satisfactorily smooth transient behavior. A more thorough analysis of the flocking protocol can be found in a previous work by the authors [
15]. In this study, the steps from the standard algorithm to the tailored one are showcased through numerical simulations.
4. Results
In this section, we present numerical results to illustrate the described application and compare the proposed dynamic models.
The flocking algorithm and the decentralized estimation using the three different models are implemented in C++ exploiting the robot operating system (ROS). Thus, the Gazebo simulator and the SITL mode available through the PX4 autopilot are used to carry out simulation experiments. Specifically, we simulate three Iris quadcopters through three separate ROS nodes. Through the publisher/subscriber communication pattern, the nodes share with each other the information described in the previous sections. Then, the computed control input is published via Mavros to the PX4 Autopilot in the offboard mode. The UAVs have to attain a flocking task whose desired inter-agent distance is and whose communication radius is . The previously discussed EKF is employed to perform a decentralized estimate of the state of a target Iris platform. A global reference system has to be shared among all the members of the swarm to yield coherent information fusion.
To evaluate the performance of the three models, the target trajectory is split into three phases. First, the target moves in a straight line, with a constant velocity of along the x-axis, and a null velocity on the y-axis. We refer to this phase as the linear one. Second, the target follows a sinusoidal path with a velocity on the y-axis equal to , while keeping the previous constant velocity on the x-axis. Finally, in the third phase, the target switches to a circular motion at a higher frequency, while still proceeding along the x-axis. The velocity values in this phase are and . The different types of trajectories are chosen to challenge the capabilities of the selected models.
We assume that the process noise covariance matrix is constant, so that for the CV model
for the CT model
and for the FS model
Instead, we assume that the measurement noise covariance matrix is expressed as
where
is the estimated distance between UAV
i and the target, and
is a rotation matrix with
being the estimated relative bearing angle, as in [
30]. The term
in
makes the covariance matrix coefficients shrink as the relative distance between the agent and the target decreases. Hence, sensors provide measurements with lower covariance matrices as they get closer to the target location [
13].
The estimation task is performed at . Since no a priori information is assumed to be known at the beginning of the mission, the initial estimate values are far from the true target states, and the estimation process requires several observation steps before converging to reasonable values. Thus, the drones only begin to chase the target after the first 20 iterations, i.e., after 1 second, so that the initial estimates do not drift them away.
In this study, we carried out three rounds of simulations, one for each dynamic model described previously. This was done to assess the advantages of using a more complex dynamic model in the prediction step. Note that increasing the level of complexity of the prediction model in an information filter comes with the price of inverting higher dimension matrices. In our simulation scenario, each agent must perform the inversion of a matrix for the CV model, of a matrix for the CT model, and of a matrix for the FS model. Instead, a classic Kalman filter would need each agent to invert a matrix, given by the total number of observations (two measurements performed by three UAVs). Thus, it is straightforward to notice that, in terms of computational complexity, it is convenient to use the FS model only when the number of UAVs in the swarm is sufficiently large. Similar reasoning can be applied with respect to the communication cost due to the filter decentralization. Indeed, denote as n the dimension of the state estimate, and as the number of neighbors of agent i. Thus, at each filter iteration, agent i sends out a message of size and collects messages of the same size. However, in this work, the computational and communication costs are not the main issues, as the focus is establishing how the performance of the filter changes with different dynamic models.
In all scenarios, each drone shares the measurement information computed in Equation (
6) according to the line interaction topology shown in
Figure 1.
In this way, only agent 2 can access the complete information contained in the graph. This represents a middle way between a fully connected topology and a situation in which no communication takes place. Refer to a previous work by the authors in [
16] for numerical simulations showcasing the effects of these two opposite configurations. In the first case, all the information is available to each agent, so that the swarm mimics a centralized filter, but a higher number of communication links is demanded. In the latter case, each agent carries on the estimation process in an isolated manner. Thus, a line topology represents a compromise between performance and communication cost.
The estimates provided by the three UAVs of the target positions and velocities on the
x- and
y-axis are shown in
Figure 2 for the CV, in
Figure 3 for the CT, and in
Figure 4 for the FS.
In particular, we report the true target states
and
in green, as well as the estimates
and
provided by the UAVs
. Note that the simulation of the target UAV runs in a ROS node through the offboard mode, as described at the beginning of the section for the three drones in the swarm. Hence, its trajectory is not ideal and is affected by realistic noises and localization errors. This is why, in
Figure 2,
Figure 3 and
Figure 4, the trajectories of the target do not appear to be exactly the same.
We also report, in
Table 1,
Table 2 and
Table 3, the mean of the root mean square errors (RMSEs) over the three UAVs for each model. It is clear that the performance of the estimation filter is affected by the choice of the model.
The CV handles linear motion very well. Its related position and velocity estimates provide low RMSE values, as seen in
Table 1. Moving from the linear to the sinusoidal phase, the RMSEs regarding the
X coordinate remain comparable, due to the fact that the velocity along this axis is still constant. However, it is possible to notice a substantial increase in the RMSE provided by the estimates in the
Y coordinates. Finally, the strong non-linearity of the circular maneuver makes the CV model performance drop drastically, especially in the velocity estimation. This shows the poor performance of the model when dealing with highly maneuvered trajectories.
Regarding the CT model, the initial linear path challenges its capability in tracking the constant non-zero velocity along the
x-axis. Indeed, while the position estimates are still satisfactorily close to the real ones, the
X velocity estimate is quite far from the actual value in this phase. The CT performs similarly to the CV model during the sinusoidal phase. Indeed, despite being curvilinear, this motion is not characterized by a constant turn rate. Instead, the position and velocity estimates during the circular phase yield lower estimation errors. A remarkable improvement can be noticed, especially in the velocity estimates RMSE. As seen in
Table 2, they are about four times smaller than the ones provided by the CV model.
Lastly, the FS provides, in general, noisier estimates with respect to the previous two models, as seen in
Figure 4. However, the analysis of its RMSEs in
Table 3 provides some insights. The FS handles the first linear trajectory with average performances with respect to the other two models, especially from the velocity RMSE point of view. However, it performs quite well in the sinusoidal phase, providing lower estimation errors with respect to the CV and CT models. Finally, in the circular motion, it yields again average RMSEs with respect to the two other models. This may be due to the fact that, while the CV and the CT are specifically designed for a linear and a circular path, the FS does not assume anything about the target’s motion. Thus, it behaves worse than these models in the first and last phases, while it outperforms them in the second one.
The different performances of the estimation process influence the flocking behavior. Indeed, the two tasks are executed in cascade, and the target tracking term in Equation (
2) follows the estimates provided by the filter. In
Figure 5,
Figure 6 and
Figure 7, it is possible to analyze this influence. In particular, in
Figure 5, the actual trajectories of the target and the agents are presented. For clarity of the pictures, only agent 1 is considered.
Figure 6 displays the relative distances between the target and the three UAVs in the swarm, while
Figure 7 showcases the inter-agent distances between the three agents, for each described model. Looking at the trajectory of agent 1 in
Figure 5, it is possible to see that, overall, the three models can make the UAVs correctly and timely follow the target. Some more in-depth differences may be noticed in
Figure 6 and
Figure 7. During the linear phase, the three models provide more or less the same flocking behavior. The desired inter-agent distance of 4 m is reached in all cases with a satisfactory transient and is kept almost constant. However, when the sinusoidal part takes place, an oscillatory behavior emerges. This is because, in the proximity of the target, the flocking protocol has an adaptive gain as in Equation (
3), so that the reduced attractive force makes the agents less prompt in following the target. However, this distance is kept reasonably small and this behavior may be adjusted through parameter
for a prompter formation. The FS provides the smoothest transient, while the CV and CT models act similarly. This can be clearly seen in
Figure 7, where the inter-agent distances remain almost constant also in the sinusoidal phase. Finally, when the circular motion starts, the CV model yields the highest oscillations both in
Figure 6 and
Figure 7. Instead, the CT provides lower oscillations, especially in terms of the target–agent distances. Finally, the FS model results in the lowest oscillations in both metrics. The better performance of the FS with respect to the CT is due to the noisier nature of the velocity estimates that it provides. Indeed, the actual velocity tracked by the agents through (
2) ends up being smaller with the FS compared to the CT, resulting in lower oscillations.