Distributed Model-Free Bipartite Consensus Tracking for Unknown Heterogeneous Multi-Agent Systems with Switching Topology

This paper proposes a distributed model-free adaptive bipartite consensus tracking (DMFABCT) scheme. The proposed scheme does not depend on a precise mathematical model, yet it achieves both time-invariant and time-varying bipartite trajectory tracking for discrete-time heterogeneous multi-agent systems (MASs) with unknown dynamics, switching topologies, and coopetition networks. The main innovation of the algorithm is to estimate an equivalent dynamic linearization data model by the pseudo partial derivative (PPD) approach, where only the input–output (I/O) data of each agent is required, and the cooperative interactions among agents are investigated. A rigorous convergence proof is given for DMFABCT, which shows that the trajectory tracking errors can be reduced. Finally, three simulation results show that the DMFABCT scheme is effective and robust for unknown heterogeneous discrete-time MASs with switching topologies in completing bipartite consensus tracking tasks.


Introduction
Multi-agent systems (MASs) and machine learning, two exciting trends in the robotics field, have recently attracted increasing attention from researchers due to the new epoch of artificial intelligence (AI) [1,2]. How to introduce intelligent algorithms into traditional control theories is one of the hottest and most significant research topics. Specifically, utilizing intelligent algorithms to improve the robustness of MASs and reducing the computational burden of designing controllers [3][4][5] to achieve consensus tracking are two of the challenges we need to address.
In the past half-century, most of the excellent control schemes have been developed based on explicit or implicit mathematical models. Examples are sliding mode control, intermittent control, impulse control, and fuzzy control, to name but a few. In addition, most of these control theories were successfully applied to consensus tracking tasks of MASs. In [5], Barbot et al. first introduced the concept of a second-order sliding mode. Many novel approaches have been developed since then. For instance, a novel sliding-mode-based discrete differentiator was proposed that can accurately estimate the derivatives of the input of the controlled plant [6], and output constraint problems are considered in the second-order sliding mode controller design in [7]. In [8], Xu et al. studied the second-order consensus problems of MASs, where local intermittent information among the agents is utilized to design a distributed adaptive completely intermittent controller to achieve second-order and/or angle. After that, bipartite consensus (BC) sparked the interest of many researchers and has been discussed for MASs with linear, nonlinear, and even heterogeneous dynamics. Moreover, the BC for MASs with Lipschitz-type, second-order, or high-order dynamics is investigated in [24,26,27]. Inspired by the above contributions, several theories have been extended. In [28], a distributed extended state observer is employed to guarantee leader-follower BC for MASs with mismatched unknown disturbance. It can be observed that formulating a BC controller is more challenging for high-order MASs than for low-order ones. The BC problem for high-order MASs with input saturation is studied by combining distributed event-triggered control and a low-gain feedback technique in [29]. The finite-time and fixed-time BC for MASs are explored in [30,31], respectively.
A novel protocol based on reinforcement learning (RL) is presented in [32], which is the first use of RL for unknown discrete-time leader-follower MASs. There, data-driven actor-critic-based neural networks (NNs) are utilized to address the BC problem for unknown MASs, but this increases the computational load. Moreover, a training process is necessary.
Although much effort has been made toward solving the BC problem [33][34][35][36], to the best of our knowledge, pseudo partial derivative (PPD) approaches have not been taken into account in the existing results. Motivated by the above observations and analysis, this paper employs a PPD method to estimate an equivalent dynamic linearization data model of each agent, where merely the measured I/O data of neighboring agents is necessary. Then, a distributed model-free adaptive bipartite consensus tracking (DMFABCT) scheme is designed for unknown discrete-time heterogeneous nonaffine nonlinear MASs with switching topologies to realize time-invariant and time-varying reference trajectory bipartite consensus tracking tasks by using the neighbor-based tracking error. It is worth pointing out that although only a few agents can receive the desired trajectory, the rigorous theoretical proof confirms that the proposed algorithm can guarantee convergence of all agents. Compared with the existing consensus approaches of MASs, the main contributions of this work can be summarized as follows:
(1) A DMFABCT framework is established for unknown heterogeneous nonaffine nonlinear discrete-time MASs with switching topologies and a coopetition network. It is a data-driven distributed intelligent algorithm that performs well on the BC problem under both time-invariant and time-varying reference trajectories. Although Bu et al. [37] proposed a novel data-driven framework for MASs, it only discussed the cooperative interactions.
(2) The proposed DMFABCT scheme is designed with neighbor-based online measured I/O data, which bypasses the need of existing consensus algorithms, as seen in [5][6][7][8][9][10][11][24][25][26][27][28][29][30][31][32][33][34][35], to obtain an accurate mathematical model, so that the designed scheme is more robust and reduces the energy cost of massive computation.
(3) Both collaborative and antagonistic interactions among agents are considered in the proposed protocol. Compared with the protocols in , the proposed protocol is more reasonable. Moreover, unlike the algorithm proposed in [32], DMFABCT copes with the BC problem through the PPD approach, so neither a training process nor external testing signals are necessary.
The remainder of this paper is structured as follows. Several essential preliminaries are presented in Section 2. The introduction of the DMFABCT algorithm and the tracking performance of fixed and time-varying reference trajectory analysis are presented in Section 3. Three numerical simulation experiments are provided in Section 4. Finally, conclusions and future work are provided in Section 5.

Graph Theory and Some Notations
Let $\mathbb{R}$ denote the set of real numbers. The Euclidean norm of $X \in \mathbb{R}^{n \times n}$ is expressed by $\|X\|$. The identity matrix and diagonal matrix are expressed by $I$ and $\mathrm{diag}(\cdot)$, respectively, where the dimension depends on the context.
Sensors 2020, 20, 4164
In this paper, algebraic graph theory is employed to analyze the interaction topologies of MASs. It should be pointed out that the graphs are directed, and the weighted directed graph is expressed by $G = (V, E, A)$, where $V$, $E$, and $A$ are the set of vertices, the set of edges, and the adjacency matrix, respectively. Vertex $i$ is called the parent and vertex $j$ the child if $i$ can transmit information to $j$ directly, which is expressed as $(i, j) \in E$. If $i$ is not the parent of $j$, then $a_{i,j} = 0$; otherwise $a_{i,j} \neq 0$. Since a vertex may have many children, the set $N_i = \{ j \mid j \neq i, (V_j, V_i) \in E \}$ is used to describe the relationships among the agents; it is called the neighborhood of agent $i$ in the literature. In this paper, both cooperative and competitive relationships between agents are considered, so the elements of $A = (a_{i,j}) \in \mathbb{R}^{N \times N}$ take three different values: $-1$, $0$, and $1$. If nodes $i$ and $j$ belong to the same group and agent $i$ can get information from agent $j$, then $a_{i,j} = 1$; otherwise $a_{i,j} \neq 1$. When $a_{i,j} = -1$, agents $i$ and $j$ must be in opposite groups, which is called a competitive relationship between agents $i$ and $j$. The alternative relationship is cooperation. The term coopetition is used to represent the two situations together in the MAS network. The Laplacian matrix of $G$ can be calculated by $L = D - A$, where $D = \mathrm{diag}(d^{in}_1, d^{in}_2, \ldots, d^{in}_N)$ and $d^{in}_i = \sum_{j=1}^{N} |a_{i,j}|$ is called the in-degree of vertex $i$. The coopetition network $G$ is called structurally balanced if all nodes in $V$ can be divided into two disjoint subsets, $V_1$ and $V_2$.
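As a minimal sketch of the notation above, the following Python snippet builds a signed adjacency matrix, the in-degree matrix $D$, and the Laplacian $L = D - A$. The four-agent network is hypothetical, and taking absolute values in the in-degree is an assumption (it is the usual convention for signed graphs), not something confirmed by the text.

```python
import numpy as np

def signed_laplacian(A):
    """Return (D, L) for a signed adjacency matrix A with entries in {-1, 0, 1}.

    Assumption: the in-degree of vertex i is sum_j |a_ij|, the usual
    convention for signed (coopetition) graphs.
    """
    A = np.asarray(A, dtype=float)
    D = np.diag(np.abs(A).sum(axis=1))
    return D, D - A

# Hypothetical 4-agent coopetition network: agents {0, 1} and {2, 3} form
# two groups; a_ij = 1 marks cooperation and a_ij = -1 marks competition.
A = np.array([
    [0,  1,  0,  0],
    [1,  0, -1,  0],
    [0, -1,  0,  1],
    [0,  0,  1,  0],
], dtype=float)

D, L = signed_laplacian(A)
print("in-degrees:", np.diag(D))
print("Laplacian:\n", L)
```

Note that a competitive edge contributes a positive off-diagonal entry to $L$ under this convention, while a cooperative edge contributes a negative one.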
They satisfy the following three conditions:
Furthermore, if the MAS graph $G$ contains a spanning tree, information can be transmitted from a root node to any other node, and so this graph is considered to be a strongly connected graph.
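The structural-balance conditions are not reproduced in this text. The sketch below therefore assumes the standard definition from the signed-graph literature: $V_1 \cup V_2 = V$ and $V_1 \cap V_2 = \emptyset$, $a_{i,j} \geq 0$ for agents in the same subset, and $a_{i,j} \leq 0$ for agents in different subsets. The function name and example matrices are hypothetical.

```python
from collections import deque

def balanced_partition(A):
    """Try to split the vertices of a signed graph into two groups.

    Returns a list of group labels (+1 / -1) if the graph is structurally
    balanced under the assumed standard definition, otherwise None.
    """
    n = len(A)
    label = [0] * n                      # 0 means "not visited yet"
    for start in range(n):
        if label[start] != 0:
            continue
        label[start] = 1
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in range(n):
                # Treat edges as undirected for the purpose of grouping.
                w = A[i][j] if A[i][j] != 0 else A[j][i]
                if w == 0 or i == j:
                    continue
                expected = label[i] if w > 0 else -label[i]
                if label[j] == 0:
                    label[j] = expected
                    queue.append(j)
                elif label[j] != expected:
                    return None          # sign conflict: not balanced
    return label

# Hypothetical examples: a balanced signed chain and an unbalanced triangle.
chain = [[0, 1, 0, 0], [1, 0, -1, 0], [0, -1, 0, 1], [0, 0, 1, 0]]
triangle = [[0, 1, -1], [1, 0, 1], [-1, 1, 0]]
print(balanced_partition(chain))
print(balanced_partition(triangle))
```

The search colors each connected component with two labels; a cooperative edge forces equal labels and a competitive edge forces opposite labels, so any contradiction means no valid split into $V_1$ and $V_2$ exists.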
In order to investigate time-varying switching topologies, let $G(k)$ denote a time-varying switching graph with a virtual leader, which depends on the time step $k$, and let $A_F(k) = [a_{ij}(k)] \in \mathbb{R}^{N \times N}$, $d_i(k) = \sum_{j \in N(i)} |a_{ij}(k)|$ (the $i$th diagonal entry of $D(k)$), and $L(k) = -A_F(k) + D(k) \in \mathbb{R}^{N \times N}$ be the corresponding adjacency matrix, degree matrix, and Laplacian matrix, respectively. $N(i)$ denotes the neighborhood of the $i$th agent, and $B(k) = \mathrm{diag}(b_1(k), \ldots, b_N(k)) \in \mathbb{R}^{N \times N}$ is employed to depict the relationship between the virtual leader 0 and each follower. If agent $i$ can directly get the desired trajectory from virtual leader 0, i.e., $\{0, i\} \in E$, then $b_i(k) = 1$; otherwise, $b_i(k) = 0$. To describe the time-varying topology, let $G_l = \{G_1, G_2, \ldots, G_\kappa\}$ denote the set of all directed graphs for the agents, where $\kappa \in \mathbb{Z}^+$ denotes the total number of possible interaction graphs.
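To make the switching notation concrete, here is a small sketch that stores a finite set of interaction graphs and returns $L(k)$ and $B(k)$ for a given step $k$. The two three-agent graphs, the dwell time, and the periodic switching rule are all hypothetical; the absolute value in the degree is the assumed signed-graph convention.

```python
import numpy as np

def laplacian_and_leader(A, b):
    """Build L = D - A and B = diag(b) for one interaction graph."""
    A = np.asarray(A, dtype=float)
    D = np.diag(np.abs(A).sum(axis=1))   # assumed: d_i(k) = sum_j |a_ij(k)|
    return D - A, np.diag(np.asarray(b, dtype=float))

# Hypothetical set of kappa = 2 interaction graphs for three followers.
graphs = [
    {"A": [[0, 1, 0], [1, 0, -1], [0, -1, 0]], "b": [1, 0, 0]},
    {"A": [[0, 0, -1], [1, 0, 0], [-1, 0, 0]], "b": [0, 1, 0]},
]

def switching_signal(k, dwell=100):
    """Hypothetical rule: move to the next graph every `dwell` steps."""
    return (k // dwell) % len(graphs)

def topology_at(k):
    """Return (L(k), B(k)) for the graph that is active at step k."""
    g = graphs[switching_signal(k)]
    return laplacian_and_leader(g["A"], g["b"])

L0, B0 = topology_at(0)      # first graph is active
L1, B1 = topology_at(150)    # second graph is active
print(L0, B0, L1, B1, sep="\n")
```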

Problem Formulation
In existing studies, the consensus problem, especially the bipartite consensus problem, is often considered for a group of agents with identical dynamics. However, heterogeneity is an intrinsic property of multi-agent systems. Therefore, the problem of bipartite consensus for heterogeneous agents presents many challenges. It is noteworthy that the following assumptions are fundamental conditions on the nonlinear dynamics for our analysis.
Definition 1. Consider a discrete-time heterogeneous SISO (single-input single-output) MAS with $N$ agents, where the nonlinear dynamics of agent $i$ satisfy the following equation: where $y_i(k) \in \mathbb{R}$ is the output, $i = 1, 2, \ldots, N$, $f_i(\cdot)$ is an unknown nonlinear function, and $u_i(k) \in \mathbb{R}$ is the control input. $y_0(k)$ denotes the trajectory of a virtual leader, which is represented by vertex 0 in the graph. Furthermore, only a subset of agents can receive information from the virtual leader directly. Hence, the directed graph $G$ of the MAS consists of $N + 1$ agents, and the corresponding edge set and weighted adjacency matrix are expressed by $E$ and $A$, respectively.
Assumption 1. The partial derivative of the nonlinear function $f_i(\cdot)$ with respect to the control input $u_i(k)$ is continuous.
The reasonableness of Assumptions 1 and 2 for practical nonlinear systems and MASs has been discussed in [12,37] and the references therein.
where $\Gamma_i(k) \leq r$, $r$ is a positive constant, and $\Gamma_i(k)$ is a variable named the pseudo partial derivative (PPD).

Remark 2.
Using the PPD to establish a dynamic linearization data model is called the PPD approach, where the PPD depends only on $\Delta y_i(k+1)$ and $\Delta u_i(k)$. Moreover, the dynamic linearization data model is updated through the PPD, which allows it to approximate the practical dynamics of the controlled plant better. Since $\Gamma_i(k)$ is not easy to obtain, we design the parameter estimation law (4) to obtain the estimate $\hat{\Gamma}_i(k)$ of $\Gamma_i(k)$. Meanwhile, the estimation error of $\Gamma_i(k)$ is analyzed in Theorem 1. Since the PPD approach is not complex and the resulting dynamic linearization data model is simple, the PPD approach is a hot topic in data-driven control for researchers studying discrete-time nonlinear systems. However, utilizing the PPD approach to solve consensus problems of multi-agent systems is still an open topic, especially for bipartite consensus problems with switching topologies.
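The idea that the PPD depends only on $\Delta y_i(k+1)$ and $\Delta u_i(k)$ can be illustrated numerically. The sketch below assumes the usual compact-form data model $\Delta y(k+1) = \Gamma(k)\,\Delta u(k)$ from the model-free adaptive control literature and recovers $\Gamma(k)$ from I/O data alone. The toy plant and the ramp input are hypothetical and chosen only so that $\Delta u(k) \neq 0$.

```python
import numpy as np

def plant(y, u):
    """Hypothetical toy plant, treated as unknown by the data model."""
    return 0.5 * y / (1.0 + y**2) + u

steps = 50
u = 0.1 * np.arange(steps)          # ramp input, so du(k) = 0.1 for k >= 1
y = np.zeros(steps + 1)
for k in range(steps):
    y[k + 1] = plant(y[k], u[k])

# PPD implied by the I/O data through dy(k+1) = Gamma(k) * du(k), k = 1..steps-1.
du = u[1:] - u[:-1]                 # du(k)   = u(k) - u(k-1)
dy = y[2:] - y[1:-1]                # dy(k+1) = y(k+1) - y(k)
gamma = dy / du

print("first PPD values:", gamma[:5])
print("largest |PPD|:", np.abs(gamma).max())
```

For this particular plant the recovered values stay bounded, which is the property that the bound $r$ in the data model expresses. In the control scheme itself the PPD is not computed by division but estimated recursively, because $\Delta u_i(k)$ can be close to zero.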

Definition 2.
The distributed measurement output is defined as follows: If agent $i$ can directly get the desired trajectory from virtual leader 0, i.e., $\{0, i\} \in E$, then $b_i(k) = 1$; otherwise, $b_i(k) = 0$.
Assumption 3. All of the time-varying switching communication graphs are strongly connected, and the trajectory information of the virtual leader can be transmitted to one or more follower agents directly.

Assumption 4.
In the relevant literature,
Remark 3. Assumption 3 is a fundamental condition for studying bipartite consensus tracking problems. Moreover, Assumption 4 is implicit in traditional model-based control algorithms as a type of linear-like characteristic. Furthermore, this assumption is widely used in practical multi-agent systems, for instance, unmanned aerial vehicles and mobile robots.

Main Results
In order to solve the bipartite consensus tracking problem stated in Section 2.2, we propose the DMFABCT approach below: where $p > 0$ and $\rho > 0$ are the step sizes, which will be defined in the next section, and $w > 0$ and $\lambda > 0$ are weight factors. According to Assumption 4, let $\hat{\Gamma}_i(1) > 0$, which is the initial value of $\hat{\Gamma}_i(k)$, the estimated value of $\Gamma_i(k)$. In practice, $c$ is a very small positive threshold used to detect that $\hat{\Gamma}_i(k)$ is no longer being updated; thus, $c$ is selected as $10^{-4}$.
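Equations (4)-(6) and the distributed measurement output are not reproduced in this text, so the following is only a sketch under stated assumptions. It uses the standard model-free adaptive control forms found in the cited literature [12,37]: a projection-type PPD estimator, a reset rule with threshold $c$, and an integrating control law driven by $\xi_i(k)$. The assumed measurement output is $\xi_i(k) = \sum_j |a_{ij}|(\mathrm{sign}(a_{ij})\,y_j(k) - y_i(k)) + b_i(y_0(k) - y_i(k))$. The four-agent graph, the plant gains, and the toy plants are hypothetical; the parameter values $p$, $w$, $\lambda$, $\rho$, and $c$ match those quoted in the simulation section.

```python
import numpy as np

# Hypothetical setup (not from the paper): four followers on a signed chain,
# groups {0, 1} and {2, 3}; only agent 0 hears the virtual leader.
A = np.array([[0, 1, 0, 0],
              [1, 0, -1, 0],
              [0, -1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
b = np.array([1.0, 0.0, 0.0, 0.0])
gains = np.array([0.8, 1.0, 1.2, 1.5])   # heterogeneous, unknown to the controller

def plant(u):
    """Toy nonaffine static plants y_i(k+1) = c_i * (u_i(k) + 0.3 sin u_i(k))."""
    return gains * (u + 0.3 * np.sin(u))

def measurement_output(y, y0):
    """Assumed xi_i(k): signed neighbor errors plus the leader error."""
    return A @ y - np.abs(A).sum(axis=1) * y + b * (y0 - y)

def ppd_update(gamma_hat, du_prev, dy, p=0.5, w=1.0, c=1e-4, gamma_init=1.0):
    """Assumed projection-type PPD estimator with the reset rule."""
    new = gamma_hat + p * du_prev / (w + du_prev**2) * (dy - gamma_hat * du_prev)
    reset = ((np.abs(new) <= c) | (np.abs(du_prev) <= c)
             | (np.sign(new) != np.sign(gamma_init)))
    return np.where(reset, gamma_init, new)

def control_update(u_prev, gamma_hat, xi, rho=0.3, lam=0.5):
    """Assumed control law: integrate the weighted measurement output."""
    return u_prev + rho * gamma_hat / (lam + gamma_hat**2) * xi

def simulate(y0=10.0, steps=3000):
    """Run the closed loop for a constant leader trajectory y0."""
    n = len(gains)
    u_prev2 = np.zeros(n)            # u(k-2)
    u_prev = np.zeros(n)             # u(k-1)
    y_prev = plant(u_prev2)          # y(k-1)
    y = plant(u_prev)                # y(k)
    gamma_hat = np.ones(n)           # initial PPD estimate, positive
    for _ in range(steps):
        gamma_hat = ppd_update(gamma_hat, u_prev - u_prev2, y - y_prev)
        u = control_update(u_prev, gamma_hat, measurement_output(y, y0))
        u_prev2, u_prev = u_prev, u
        y_prev, y = y, plant(u)
    return y

y_final = simulate()
print("final outputs:", y_final)
```

In this toy setting the first group is expected to settle near $y_0$ and the second group near $-y_0$, even though only agent 0 receives the leader signal. The controller never evaluates `plant` directly in its update rules; it only sees the measured inputs and outputs.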

Remark 4.
It is noted that $\hat{\Gamma}_i(k)$ can be obtained by merely using the output data $\Delta y_i(k)$ in the parameter estimation scheme (4). It is also worth pointing out that the convergence of the parameter estimation scheme (4) can be guaranteed, as shown in [12] and [37]. The control law (6) shows that the control input $u_i(k)$ is updated by using the distributed measurement output $\xi_i(k)$ of agent $i$, so the algorithm is a kind of DMFABCT scheme.

Remark 5.
The key feature of this DMFABCT scheme is that the agents' model dynamics are not required: the PPD parameter estimation algorithm uses only the measured I/O data of the multi-agent system to complete the formulation. Therefore, it is a classic data-driven control approach for solving the BC problem of MASs.

Remark 6.
Both $\lambda$ and $\rho$ are important parameters of the DMFABCT algorithm. A suitable weight parameter $\lambda$ can ensure the stability of the MAS, and $\rho$ is a controller parameter that guarantees that the tracking error is reduced. Furthermore, the admissible range of $\rho$ is analyzed in the following theorems.
To analyze the stability of MASs, Lemma 2 is one of the important conditions.

Lemma 2.
A time-varying irreducible substochastic matrix and the set of all possible T(Q) are denoted by T(Q) and T respectively [39]. Also, the diagonal entries of T(Q) are positive. Then, we can obtain where 0 << 1 and T(Q), K = 1, 2, . . . , Q, are Q matrices arbitrarily selected from T.
The stability analysis of the DMFABCT approach is presented by Theorem 1.

Theorem 1.
Suppose that the MAS (1) satisfies Assumptions 1, 2, and 4 and its communication topology satisfies Assumption 3. Apply the proposed DMFABCT algorithm (4)-(6) to track the desired reference trajectory $y_0(k)$, which is time-invariant, i.e., $y_0(k) = \mathrm{const}$, if $\rho$ satisfies the following condition
Proof: We prove this theorem using the three steps below.
According to Equation (7) the following equation can be obtained.
The inequalities $p\,\|\Delta u_i(k-1)\|^2 \leq \|\Delta u_i(k-1)\|^2 \leq w + \|\Delta u_i(k-1)\|^2$ can be obtained by selecting $p$ and $w$ that satisfy $0 < p \leq 1$ and $w \geq 0$. Here $\|\Delta u_i(k-1)\|^2 = |\Delta u_i(k-1)|^2$ because the system studied in this paper is single-input single-output. Thus, a constant can be selected to satisfy the following inequality.
Since Γ i (k) ≤ r, according to Assumption 4, the following inequalities can be obtained.
Step 3 (Obtaining the Convergence Condition of MASs): In this step, the convergence condition of MASs will be derived.
According to the conditions , $\lambda_{\min} > 0$, and $\lambda > \lambda_{\min}$ for all $i = 1, 2, \ldots, N$, the following inequalities can be obtained: First of all, in order to guarantee the strongly connected property of the MAS under all of the communication topologies, $I - \rho\Xi(k)$ must be an irreducible matrix. Secondly, $0 < \Phi_i(k) < 1$ for all $i = 1, 2, \ldots, N$ and $\rho$ satisfies the following inequality which means that all of the diagonal entries of $L(k) + B(k)$ are smaller than the reciprocal of $\rho$. In this case, at least one row sum of $I - \rho\Xi(k)$ is strictly less than one, so $I - \rho\Xi(k)$ is an irreducible substochastic matrix and its diagonal entries are positive. According to (15), the following inequality can be obtained.
According to Lemma 1, the following inequality can be obtained.
where $\lfloor \cdot \rfloor$ stands for the floor function. Hence, the bipartite consensus fixed trajectory tracking errors of the MAS can converge to the origin.
The communication topology of the considered MAS is shown in Figure 1. The virtual leader is denoted by vertex 0, and the followers are divided into two alliances in each topology. Moreover, in Figure 1, the black solid lines express the cooperative relationships among agents, and the competitive relationships are denoted by dotted lines. Note that only a subset of agents can directly receive information from the leader. Moreover, information among agents is transmitted only along the arrows, and the direction is fixed. Although the other agents cannot directly get the commands from the virtual leader, all of the communication graphs satisfy Assumption 3, so the virtual leader can intervene in the two competitive alliances. As the matrices above show, the reciprocal of the greatest diagonal entry of $L(l) + B(l)$ is 0.5 for $l = 1, 2, 3$. In order to satisfy the convergence condition for all $i = 1, 2, \ldots, 7$ in Theorem 2, we choose the controller parameter $\rho = 0.3$ for each simulation, and the other parameters are selected as $p = 0.5$, $w = 1$, $\lambda = 0.5$, and $c = 10^{-4}$.
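The choice of $\rho$ can be checked with a few lines of code. The sketch below reads the convergence condition as requiring $\rho$ to be smaller than the reciprocal of the greatest diagonal entry of $L(l) + B(l)$ over all topologies, which is how the parameter choice above is justified. The topology matrices are not reproduced in this text, so the two small matrices used here are hypothetical stand-ins whose greatest diagonal entry is also 2.

```python
import numpy as np

def rho_upper_bound(laplacians, leader_matrices):
    """Reciprocal of the greatest diagonal entry of L(l) + B(l) over all l."""
    worst = max(float(np.diag(L + B).max())
                for L, B in zip(laplacians, leader_matrices))
    return 1.0 / worst

# Hypothetical stand-in topologies (three followers, two graphs).
L1 = np.array([[1., -1., 0.], [-1., 2., 1.], [0., 1., 1.]])
B1 = np.diag([1., 0., 0.])
L2 = np.array([[1., 0., 1.], [-1., 1., 0.], [1., 0., 1.]])
B2 = np.diag([0., 1., 0.])

bound = rho_upper_bound([L1, L2], [B1, B2])
rho = 0.3
print("upper bound on rho:", bound)
print("rho = 0.3 admissible:", rho < bound)
```

Taking the maximum over every topology in the switching set matters, because the same $\rho$ is used regardless of which graph is active.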

Fixed Trajectory Tracking Example
In order to obtain a clear result of this simulation, a piecewise function and the desired reference trajectory are given below:
The simulation results of the bipartite tracking performance, tracking errors, and PPD estimation of each agent are shown in Figures 2-4, respectively.
From Figures 2-4 it can be seen that the outputs of the followers deviate strongly from the leader initially, but the bipartite tracking errors decrease rapidly and bipartite tracking is realized after a few steps. For example, in Figure 2, the value of the trajectory changes from 10 to 20 at k = 400, and several agents exchange their groups at the same time, but a new bipartite consensus is achieved after only about 100 steps, which Figure 3 also reveals. Furthermore, Figure 4 shows that the changes of the topology and the desired trajectory affect the estimated PPD values of each agent, but they reach stable values immediately, which shows that the proposed DMFABCT has good robustness.

Time-Varying Trajectory Tracking Example
In this example, bipartite consensus tracking of a time-varying trajectory is discussed. The desired trajectory is $y_0(k+1) = 90\cos(k\pi/\Psi) + 100$, where $\Psi = 2200$ is the output gain rate, and the time-varying topologies are governed by where the initial data of $y_i(k)$, $u_i(k)$, the dynamics of each agent, and the other parameters were defined at the beginning of this section. The bipartite consensus tracking performance of this example and the tracking errors of each agent are presented in Figure 5, which shows that the DMFABCT scheme reduces the errors dramatically. Although the bipartite tracking errors cannot be removed entirely, they converge to a small bound, which is demonstrated in Figures 6 and 7. Compared with the desired output data of the agents, the maximum distortion rate can be read from Figure 7 and is 0.084%. This result demonstrates that MASs with switching topologies can also perform bipartite time-varying tracking tasks. Figure 8 leads to the same conclusion: the MAS can adjust the PPD values to adapt to environmental changes and thereby obtain a high fault-tolerance property.
Comparing the tracking performance for the different trajectories in Figures 3 and 6, we can conclude that fixed trajectory tracking performs better than time-varying trajectory tracking, which further validates the correctness of the theoretical analysis in Section 3. In addition, in order to further analyze what drives the errors in time-varying trajectory tracking, we change the output gain rate $\Psi$ of the desired trajectory $y_0(k+1) = 90\cos(k\pi/\Psi) + 100$ from 500 to 4000 and analyze the tracking performance. From Figure 9, we can see that the error rates of all agents decrease when the value of $\Psi$ increases. The error rates of the MAS at $\Psi = 2200$, $\Psi = 500$, and $\Psi = 4000$ are shown in Figures 7, 10, and 11, respectively.
Although the largest error rate of the MAS at $\Psi = 500$ is about 0.418%, it bounds the error rates of all agents, which means that the errors of the MAS are also bounded. Furthermore, the error rates of each agent shown in Figure 11 are close to the origin, which further demonstrates the correctness of Theorem 2. Meanwhile, we can conclude that the MAS is stable under the proposed DMFABCT scheme and that the tracking errors depend on the increment $\Delta y_0(k)$ of the reference trajectory.
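The dependence on $\Delta y_0(k)$ can be made concrete by computing the largest one-step change of the reference $y_0(k+1) = 90\cos(k\pi/\Psi) + 100$ for the three values of $\Psi$ discussed above. This is a small numerical sketch, not part of the reported simulations; the helper name is hypothetical. For large $\Psi$ the largest increment is close to $90\pi/\Psi$, so it shrinks as $\Psi$ grows.

```python
import numpy as np

def max_reference_increment(psi, amplitude=90.0, offset=100.0):
    """Largest |y0(k+1) - y0(k)| over one full period of the reference."""
    k = np.arange(0, 2 * psi + 2)              # cos(k*pi/psi) has period 2*psi
    y0 = amplitude * np.cos(k * np.pi / psi) + offset
    return float(np.abs(np.diff(y0)).max())

increments = {psi: max_reference_increment(psi) for psi in (500, 2200, 4000)}
for psi, inc in increments.items():
    # Compare the numerical maximum with the small-angle estimate 90*pi/psi.
    print(psi, inc, 90 * np.pi / psi)
```

A slower reference (larger $\Psi$) therefore changes less per step, which is consistent with the observation that the error rates decrease as $\Psi$ increases.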

Realistic DC Linear Motors Example
In this case, we utilize seven permanent magnet DC linear motors to perform fixed and time-varying trajectory bipartite consensus tracking tasks. The realistic dynamics of the DC linear motor are investigated in [37,40] and have been modeled as below: where $t$ is continuous time (s), $x(t)$ is the position (m), $v(t)$ is the speed (m/s), $m$ is the combined mass of translator and load, $u(t)$ is the developed force (N), $f_{friction}(t)$ is the friction force (N), and $f_{ripple}(t)$ is the ripple force (N). The friction and ripple forces have been identified as: where $f_c$ is the minimum level of Coulomb friction, $f_s$ is the level of static friction, and $\dot{x}_\delta$ and $f_v$ are lubricant and load parameters, respectively. $\delta$ is an additional empirical parameter. In this example, these parameters are selected as: $m = 0.59$ kg, The desired velocity is given as $y_0(t) = 90\cos(t\pi/4000) + 100$, $t \in [0, 8]$. Using the Euler formula to discretize the above model and selecting the sampling time as $h = 0.001$, we have $T = 1000$.
In this case, a random noise is introduced in the output measurement data for each DC motor. Moreover, we define the bound of the noise as [−0.02, 0.02]. Here, we use the same parameters and the communication topology as those of example 2 to perform the simulation.
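The discretization and the bounded measurement noise can be sketched as follows. The motor equations and the friction and ripple expressions are not reproduced in this text, so the code assumes the common form $\dot{x} = v$, $\dot{v} = (u - f_{friction} - f_{ripple})/m$ and leaves both force terms as placeholder functions that return zero. Only $m = 0.59$ kg, $h = 0.001$, and the noise bound $[-0.02, 0.02]$ come from the text; the uniform noise distribution, function names, and test input are hypothetical.

```python
import numpy as np

def euler_step(x, v, u, h=0.001, m=0.59,
               friction=lambda v: 0.0, ripple=lambda x: 0.0):
    """One forward-Euler step of the assumed motor model.

    `friction` and `ripple` are placeholders; the identified force
    expressions would be plugged in here.
    """
    x_next = x + h * v
    v_next = v + h * (u - friction(v) - ripple(x)) / m
    return x_next, v_next

def noisy_measurement(v, rng, bound=0.02):
    """Velocity output corrupted by noise drawn uniformly from [-bound, bound]."""
    return v + rng.uniform(-bound, bound)

rng = np.random.default_rng(0)
x, v = 0.0, 0.0
for _ in range(1000):                  # 1000 steps of h = 0.001 s, i.e. 1 s
    x, v = euler_step(x, v, u=0.59)    # constant force, friction and ripple off
y_measured = noisy_measurement(v, rng)
print("position:", x, "velocity:", v, "measured velocity:", y_measured)
```

With the force terms switched off, a constant force equal to the mass gives unit acceleration, which makes the step easy to verify by hand before the real friction and ripple models are added.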
The fixed trajectory bipartite consensus tracking performances of seven DC motors are shown in Figure 12 and another tracking task is presented in Figure 13. From the two simulation results, we observe that several agents have changed their alliance, but the results of the two different bipartite consensus tracking tasks show that the tracking errors of MASs can be reduced, which further proves the effectiveness and applicability of the designed DMFABCT.
As shown above, the proposed DMFABCT scheme is correct and effective.

Conclusions
In this work, a data-driven bipartite consensus tracking scheme has been proposed for unknown nonlinear discrete-time multi-agent systems with switching topologies, and a compact form linearization model has been established. The algorithm ensures that all agents can track both fixed and time-varying desired trajectories and realize bipartite tracking. Compared with model-based control algorithms, one of the main advantages of our method is that it does not need the agents' dynamics and requires only input–output data. Moreover, both cooperative and competitive relationships among the agents are considered, and the convergence and stability of the algorithm are proven by rigorous mathematical analysis. Meanwhile, simulations of the bipartite consensus tracking algorithm have been presented to validate the effectiveness of the proposed algorithm. In future work, we will consider the bipartite consensus problem for multi-input multi-output multi-agent systems with delays and disturbances.