Hidden Markov Model-Based Control for Cooperative Output Regulation of Heterogeneous Multi-Agent Systems under Switching Network Topology

Abstract: This paper investigates the problem of stochastically cooperative output regulation of heterogeneous multi-agent systems (MASs) subject to hidden Markov jumps using observer-based distributed control. To address a more realistic situation than prior studies, this paper focuses on the following issues: (1) asynchronous phenomena in the transmission of the system mode to the controller; (2) the impact of system mode switching on the network topology; and (3) the emergence of coupled terms between the mode-dependent Lyapunov matrix and the control gain in the control design conditions. Specifically, to reduce the complexity arising from the asynchronous controller-side mode, a leader-state observer is developed so that the solution pair of the regulator equations can be integrated into the observer. Furthermore, a linear decoupling method is proposed to handle the emergence of the aforementioned coupled terms; this provides sufficient LMI conditions to achieve stochastically cooperative output regulation for heterogeneous MASs. Finally, the validity of the proposed method is shown through two illustrative examples.


Introduction
Multi-agent systems (MASs) are complex systems composed of multiple autonomous agents that interact with each other and with their environment; they have been used in various research fields, including robotics [1,2], automated vehicles [3,4], unmanned autonomous vehicles [5][6][7][8], and urban networks [9,10]. Recently, with the advent of these systems, effective techniques for coordinating MASs with different structures and parameters (referred to as heterogeneous MASs) have been swiftly developed for various purposes, including leader following and formation.
Over the past few years, the cooperative output regulation problem has been regarded as one of the most fundamental consensus problems for MASs. One essential requirement in this problem is to develop a control strategy that ensures stable and efficient cooperation between agents while achieving the desired overall performance. Additionally, the control strategy should achieve appropriate behaviors of MASs by explicitly considering the interconnections and interactions between multiple agents. Accordingly, various methods have been proposed to deal with the cooperative output regulation problem of heterogeneous MASs on the premise of a fixed network topology and fixed system parameters (see [11][12][13][14][15] and the references therein). However, the network topology and system parameters can change randomly due to obstacles posed by network size, functional connectivity disturbances, limited communication ranges, and random packet losses. As a mathematical model for handling such random changes, Markov jump multi-agent systems have been widely utilized in many control problems, such as leader-following consensus control [16,17], scaled consensus control [18], formation control [19], and cooperative output regulation [20][21][22][23]. The above studies mainly focused on control problems for MASs with deterministic dynamics (with no sudden changes), where the Markov process was only used to model sudden changes in the network topology, or vice versa. However, little effort has been devoted to the realistic case in which rapid changes in the system modes of MASs affect the network topology. A more serious problem is that, due to network issues such as packet dropout and data transmission delay, the controller modes cannot be synchronized accurately with the system or network topology modes.
Thus, it is necessary to carefully consider the impact of this asynchronous problem when designing a controller that operates in such an environment. According to this need, [24] used a hidden Markov model (HMM) to deal with the problem of the leader-following consensus for MASs with asynchronous control modes. However, in [24], random changes in the network topology are modeled as a Markov process, but the system parameters of MASs are assumed to be deterministic. Hence, to overcome these weaknesses, more progress needs to be made toward addressing the impact of changes in both system parameters and network topology while achieving the HMM-based cooperative output regulation for continuous-time heterogeneous MASs.
Based on the above discussion, the main goal of this paper is to address the problem of stochastically cooperative output regulation for continuous-time heterogeneous MASs with hidden Markov jumps in the system mode and network topology. First, a mode-dependent leader-state observer is designed to transmit an estimated leader-state to each follower agent. After that, an asynchronous mode-dependent distributed controller is designed so that it can ensure the stochastically cooperative output regulation of MASs. To be specific, the main contributions of this paper can be summarized as follows.

•
This paper makes a first attempt to reflect the influence of the asynchronous mode between heterogeneous MASs and observer-based distributed controllers while achieving stochastically cooperative output regulation subject to Markov jumps. Different from [22,23,25,26], the realistic case where rapid changes in the system modes of MASs affect the network topology is considered in the control design process.

•

This paper proposes a method to design a continuous-time leader-state observer capable of estimating the leader-state value for each agent under abrupt changes in both the systems and the network topology. Also, it introduces an alternative mechanism that integrates the system-mode-dependent solutions of the regulator equations into the output of the leader-state observer to reduce the complexity arising from the asynchronous controller-side mode.

•
In the control design process, the asynchronous mode-dependent control gain is coupled with the system-mode-dependent Lyapunov matrix, which makes it difficult to directly use the well-known variable replacement technique [27]. For this reason, this paper suggests a suitable linear decoupling method that is capable of handling the aforementioned coupling problem.
The rest of the paper is organized as follows. Section 2 presents the class of heterogeneous MASs with hidden Markov jumps under our consideration. Next, Section 3 presents methods for designing a mode-dependent leader-state observer and asynchronous mode-dependent distributed controllers for MASs. In Section 4, two illustrative examples are provided to demonstrate the validity of the proposed method. Finally, concluding remarks are given in Section 5.
Notations: spec(A) denotes the set of eigenvalues of matrix A. In symmetric block matrices, (∗) is used as an ellipsis for terms induced by symmetry. diag(·) stands for a block-diagonal matrix; col(v_1, v_2, ..., v_n) = [v_1^T v_2^T ... v_n^T]^T stacks the scalars or vectors v_i; He{Q} = Q + Q^T for any square matrix Q; ⊗ denotes the Kronecker product; I_n denotes the n-dimensional identity matrix; ||x||_2 denotes the Euclidean norm of vector x; λ_max(A) denotes the maximum eigenvalue of matrix A; ||A||²_2 = λ_max(A^T A) denotes the squared spectral norm of matrix A; Re(λ) denotes the real part of λ; N_n denotes the set {1, 2, ..., n}; E{·} denotes the mathematical expectation; "T" and "−1" represent matrix transposition and matrix inversion, respectively. The triplet (Ω, F, Pr) denotes a probability space, where Ω, F, and Pr represent the sample space, the σ-algebra of events, and the probability measure defined on F, respectively.

Heterogeneous Multi-Agent System Description
Let us consider the following continuous-time Markov jump dynamics of the ith follower and the leader (or exogenous system), defined on a complete probability space (Ω, F, Pr):

ẋ_i(t) = A_i(φ(t)) x_i(t) + B_i(φ(t)) u_i(t), y_i(t) = C_i(φ(t)) x_i(t),
ẋ_0(t) = A_0(φ(t)) x_0(t), y_0(t) = C_0(φ(t)) x_0(t), (1)

where x_i(t) ∈ R^n, u_i(t) ∈ R^m, and y_i(t) ∈ R^p denote the state, the control input, and the output of the ith follower, respectively; x_0(t) ∈ R^l and y_0(t) ∈ R^p are the state and the output of the leader, respectively; N_N = {1, 2, ..., N} is the set of agents; φ(t) ∈ N_α denotes the time-varying switching system mode with N_α denoting the set of system modes; and N is the number of follower agents. In (1), A_i(φ(t) = g) = A_ig, B_i(φ(t) = g) = B_ig, C_i(φ(t) = g) = C_ig, A_0(φ(t) = g) = A_0g, and C_0(φ(t) = g) = C_0g are known system matrices with appropriate dimensions; and the pairs (A_ig, B_ig) and (C_ig, A_ig) are controllable and detectable, respectively. In addition, the process {φ(t) ∈ N_α, t ≥ 0} is characterized by a continuous-time homogeneous Markov process with the following transition probabilities (TPs):

Pr(φ(t + δ) = h | φ(t) = g) = π_gh δ + o(δ) for h ≠ g, and 1 + π_gg δ + o(δ) for h = g,

where o(δ) is the little-o notation defined by lim_{δ→0} o(δ)/δ = 0; and π_gh indicates the transition rate (TR) from mode g to mode h and satisfies ∑_{h∈N_α} π_gh = 0 and π_gh ≥ 0 for h ∈ N_α \ {g}.
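The switching of φ(t) under these transition rates can be sampled with the standard construction for continuous-time Markov chains: an exponential holding time in the current mode followed by a jump drawn from the embedded chain. The following Python sketch is illustrative only; the two-mode rate matrix is hypothetical.

```python
import numpy as np

def simulate_ctmc(Pi, T, rng, g0=0):
    """Sample a path of the continuous-time mode process phi(t).

    Pi is the transition-rate matrix: Pi[g, h] >= 0 for h != g and each
    row sums to zero, so Pi[g, g] = -sum_{h != g} Pi[g, h].
    Returns the jump times and the mode held on each interval.
    """
    times, modes = [0.0], [g0]
    t, g = 0.0, g0
    while t < T:
        rate = -Pi[g, g]
        if rate <= 0.0:                    # absorbing mode: stays forever
            break
        t += rng.exponential(1.0 / rate)   # holding time ~ Exp(rate)
        if t >= T:
            break
        probs = Pi[g].copy()
        probs[g] = 0.0
        g = rng.choice(len(probs), p=probs / rate)   # embedded jump chain
        times.append(t)
        modes.append(g)
    return times, modes

rng = np.random.default_rng(0)
Pi = np.array([[-2.0, 2.0],
               [1.0, -1.0]])               # hypothetical two-mode TR matrix
times, modes = simulate_ctmc(Pi, T=10.0, rng=rng)
```

Each row of the rate matrix sums to zero, exactly as required of π_gh above; the sampled path is piecewise constant between the returned jump times.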

Remark 1.
Different from other studies [22,23,25,26,28], this paper deals with the case where changes in system parameters (caused by the lifespan of system components, motor heating, wear and tear, etc.) directly lead to signal loss or degraded transmission quality between agents, thereby inducing changes in the network topology. In addition, external influences could impact the system parameters and the communication network of agents at the same time. Therefore, in this paper, the network topology mode is set to be the same as the system mode φ(t).
To be specific, Figure 1 shows a block diagram of the cooperative output regulation of multi-agent systems under our consideration, which contains four main parts: the multi-agent system (v_i), the network analyzer, the leader-state observer, and the distributed controller (c_i). Functionally, the network analyzer finds the current network topology mode using data transmitted over a wireless local area network (black dashed line), and the mode information is transmitted to the observer directly and to the distributed controller over a wireless wide area network (blue dashed line). As in [23,29], the leader-state observer provides the estimated leader-state x̂_0i for the ith agent to overcome the difficulty that some followers cannot receive information directly from the leader. Next, based on x_i, Π_ig x̂_0i, and Λ_ig x̂_0i, the controller c_i generates the control input u_i and transmits it to the ith agent so that the output y_i approaches y_0 as time increases. In a real environment, since the transmission of the network topology mode can be affected by various network issues in a wireless wide area network, such as network-induced delay, packet dropout, network congestion, and interference, this paper employs another asynchronous observation mode, ρ(t) ∈ N_β. In other words, this paper adopts a hidden Markov model (HMM) with φ(t) and ρ(t) when designing the distributed controller, where the asynchronism between the two is described by the following conditional probability: Pr(ρ(t) = s | φ(t) = g) = ϖ_gs, which satisfies ∑_{s∈N_β} ϖ_gs = 1 and ϖ_gs ≥ 0 for g ∈ N_α and s ∈ N_β.

Remark 2.
If the network topology mode observed by the network analyzer is not affected by network issues in the transmission process, there is no asynchronous phenomenon between φ(t) and ρ(t), i.e., ρ(t) = φ(t) and N_β = N_α. To cover this special case, the conditional probability matrix [ϖ_gs]_{g∈N_α, s∈N_α} ∈ R^{α×α} can be set to the identity matrix I ∈ R^{α×α}.
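The observation channel of the HMM can be mimicked by sampling the controller-side mode from the row of the conditional-probability matrix selected by the current system mode. A minimal sketch; the matrix W and the function name are illustrative, not from the paper:

```python
import numpy as np

# Hypothetical conditional-probability matrix: row g is the distribution
# of the controller-side mode rho(t) given the system mode phi(t) = g.
W = np.array([[0.9, 0.1],
              [0.2, 0.8]])
assert np.allclose(W.sum(axis=1), 1.0)     # each row must sum to one

def observe_mode(phi, W, rng):
    """Sample rho(t) from Pr(rho = s | phi = g) = W[g, s]."""
    return rng.choice(W.shape[1], p=W[phi])

rng = np.random.default_rng(1)
# Synchronous special case of Remark 2: W = I forces rho(t) = phi(t).
I = np.eye(2)
assert all(observe_mode(g, I, rng) == g for g in range(2))
```

Setting W to the identity reproduces the synchronous special case exactly, since each row then puts all probability mass on s = g.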

Assumption 1.
For each g ∈ N_α, Re(λ) ≥ 0 holds for every λ ∈ spec(A_0g).

Assumption 2.
For φ(t) = g, there exist pairs of solutions (Λ_ig, Π_ig) to the following regulator equations:

Λ_ig A_0g = A_ig Λ_ig + B_ig Π_ig,
C_ig Λ_ig = C_0g,

where Λ_ig ∈ R^{n×l} and Π_ig ∈ R^{m×l}.
Remark 3. Assumption 1 is made only for convenience and loses no generality. In fact, if the linear output regulation problem is solvable by any controller under Assumption 1, it is also solvable by the same controller even if Assumption 1 is violated. More explanations can be found in [30]. Assumption 2 provides the well-known regulator equations [23,31] whose solutions impose a necessary condition for the control design process. Furthermore, the feasibility of Assumption 2 is guaranteed according to Remark 4.
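For given mode-g data, the regulator equations are linear in (Λ_ig, Π_ig), so they can be solved by vectorization, using vec(MX) = (I ⊗ M)vec(X) and vec(XN) = (Nᵀ ⊗ I)vec(X). The sketch below (function name and data are illustrative) includes a worked double-integrator instance whose solvability is easy to check by hand:

```python
import numpy as np

def solve_regulator(A, B, C, A0, C0):
    """Solve the regulator equations
         Lam @ A0 = A @ Lam + B @ Pi,   C @ Lam = C0
    for (Lam, Pi) by column-major vectorization; solvability is assumed.
    """
    n, m, l = A.shape[0], B.shape[1], A0.shape[0]
    In, Il = np.eye(n), np.eye(l)
    # (A0.T (x) I_n - I_l (x) A) vec(Lam) - (I_l (x) B) vec(Pi) = 0
    top = np.hstack([np.kron(A0.T, In) - np.kron(Il, A), -np.kron(Il, B)])
    # (I_l (x) C) vec(Lam) = vec(C0)
    bot = np.hstack([np.kron(Il, C), np.zeros((C0.size, m * l))])
    M = np.vstack([top, bot])
    rhs = np.concatenate([np.zeros(n * l), C0.flatten(order="F")])
    sol, *_ = np.linalg.lstsq(M, rhs, rcond=None)
    Lam = sol[: n * l].reshape((n, l), order="F")
    Pi = sol[n * l:].reshape((m, l), order="F")
    return Lam, Pi

# Worked instance: double-integrator follower, harmonic-oscillator leader.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
A0 = np.array([[0.0, 1.0], [-1.0, 0.0]])
C0 = np.array([[1.0, 0.0]])
Lam, Pi = solve_regulator(A, B, C, A0, C0)
```

For this instance the exact solution is Λ = I_2 and Π = [−1 0], which the least-squares solve recovers since the stacked system is square and full rank.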

Communication Topology
As mentioned in Remark 1 and Figure 1, the network topology of (1) with φ(t) = g ∈ N_α is represented as a mode-dependent weighted and directed graph (digraph) G_g = (V, E_g, A_g) formed by N follower agents. Here, V = {v_1, v_2, ..., v_N} is the set of nodes; E_g ⊆ V × V is the set of edges, where (v_j, v_i) ∈ E_g means that there is an information flow leaving from agent v_j to agent v_i at time t; and A_g = [a_ij,g]_{i,j∈{1,2,...,N}} denotes the adjacency matrix with a_ij,g > 0 if and only if (v_j, v_i) ∈ E_g, and a_ij,g = 0 otherwise. In addition, the neighbor set and degree of v_i ∈ V are defined as N_i,g = {v_j ∈ V : (v_j, v_i) ∈ E_g} and d_i,g = ∑_{j=1}^N a_ij,g, respectively, and the Laplacian matrix of G_g is given by L_g = D_g − A_g with D_g = diag(d_1,g, d_2,g, ..., d_N,g). Furthermore, a directed path leaving from node v_j to node v_i is a sequence of ordered edges [24]. In particular, the multi-agent system under our consideration consists of N follower agents and one leader v_0 with a spanning tree from the leader to each agent in the network topology. Thus, to represent such a system, this paper considers an extended graph G_0,g with node set V_0 = V ∪ {v_0}, where M_g = diag(m_1,g, m_2,g, ..., m_N,g) denotes the leader adjacency matrix with m_i,g > 0 if and only if the leader v_0 transmits information to v_i, and m_i,g = 0 otherwise. That is, the union of graphs is given by G_0 := ∪_{g=1}^α G_0,g, and its node set is equal to V_0.
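These graph quantities are straightforward to assemble numerically. The sketch below builds the Laplacian and leader adjacency matrix for a hypothetical four-follower digraph and evaluates the standard spanning-tree test: when the leader roots a spanning tree, L_g + M_g is nonsingular with all eigenvalues in the open right half-plane.

```python
import numpy as np

# Hypothetical mode-g digraph on N = 4 followers: Adj[i, j] = a_ij,g is the
# weight of the edge from v_{j+1} to v_{i+1}; m[i] > 0 iff the leader v_0
# transmits information to v_{i+1}.  Here: v0 -> v1 -> v4 -> v3 -> v2.
Adj = np.array([[0, 1, 0, 0],
                [0, 0, 1, 0],
                [0, 0, 0, 1],
                [1, 0, 0, 0]], dtype=float)
m = np.array([1.0, 0.0, 0.0, 0.0])

D = np.diag(Adj.sum(axis=1))       # degree matrix D_g
L = D - Adj                        # Laplacian L_g = D_g - A_g
M = np.diag(m)                     # leader adjacency matrix M_g

# Spanning-tree test: all eigenvalues of L_g + M_g have positive real parts.
eig = np.linalg.eigvals(L + M)
assert np.all(eig.real > 0)
```

Removing the leader link (m = 0) would make L + M singular, since every Laplacian has the zero row-sum eigenvalue.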

Remark 5.
By considering the bidirectional information link between agents, this paper ensures that all agents have the opportunity to communicate with each other, depending on the network topology modes.
The following definitions and lemma will be adopted in this paper.
Definition 1 ([32,33]). Consider a Markov jump linear system with state χ(t). If, for any initial conditions χ(0) and φ(0),

E{ ∫_0^∞ ||χ(t)||²_2 dt | χ(0), φ(0) } < ∞,

then the Markov jump linear system is said to be stochastically stable.
Definition 2 ([20,23,34]). The heterogeneous MAS (1) is said to achieve stochastically cooperative output regulation if

lim_{t→∞} E{ ||e_i(t)||_2 } = 0

holds for any initial conditions x_i(0) and x_0(0), where e_i(t) = y_i(t) − y_0(t) represents the error between the output of the ith agent and the output of the leader.

Lemma 1 ([35]). For any matrix U > 0 and matrices X and Y of compatible dimensions, the following inequality holds:

X^T Y + Y^T X ≤ X^T U X + Y^T U^{-1} Y.
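This lemma is a completion-of-squares bound, X^T Y + Y^T X ≤ X^T U X + Y^T U^{-1} Y for any U > 0, which follows from (U^{1/2}X − U^{-1/2}Y)^T(U^{1/2}X − U^{-1/2}Y) ≥ 0. It can be spot-checked numerically on random data:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
X = rng.standard_normal((n, n))
Y = rng.standard_normal((n, n))
R = rng.standard_normal((n, n))
U = R @ R.T + n * np.eye(n)          # any positive-definite weight U

lhs = X.T @ Y + Y.T @ X
rhs = X.T @ U @ X + Y.T @ np.linalg.inv(U) @ Y
gap = rhs - lhs                      # must be positive semidefinite
assert np.min(np.linalg.eigvalsh((gap + gap.T) / 2)) >= -1e-9
```

The free matrix U trades off the two quadratic terms; in the proofs below it is what allows the observer-error cross terms to be absorbed.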

Main Results
As in [29], this paper first designs a mode-dependent leader-state observer that provides an estimated leader-state for each agent. Following that, this paper designs a distributed controller that achieves cooperative output regulation for heterogeneous multi-agent systems (1).

Leader-State Observer Design
As depicted in Figure 1, the leader-state observer directly receives g and x_0 from the network analyzer and the leader. Thus, the mode-dependent observer for the ith agent can be established as follows:

d x̂_0i(t)/dt = A_0g x̂_0i(t) + F_g ( ∑_{j=1}^N a_ij,g ( x̂_0i(t) − x̂_0j(t) ) + m_i,g ( x̂_0i(t) − x_0(t) ) ), (8)

for i ∈ N_N and g ∈ N_α, where x̂_0i(t) ∈ R^l is the estimated leader-state for the ith agent, and F_g ∈ R^{l×l} is the observer gain. Thus, based on the ith observation error state ê_0i(t) = x̂_0i(t) − x_0(t), the resultant error system is represented as follows:

d ê_0(t)/dt = [ I_N ⊗ A_0g + (L_g + M_g) ⊗ F_g ] ê_0(t), (10)

where ê_0(t) = col( ê_01(t), ê_02(t), ..., ê_0N(t) ).
The following theorem provides the stabilization condition for system (10), formulated in terms of LMIs.

Theorem 1. Suppose that there exist 0 < Q_g = Q_g^T ∈ R^{l×l} and F̃_g ∈ R^{l×l} such that the following condition holds for g ∈ N_α:

0 > I_N ⊗ He{ Q_g A_0g } + He{ (L_g + M_g) ⊗ F̃_g } + I_N ⊗ Q̄_g, (11)

where Q̄_g = ∑_{h∈N_α} π_gh Q_h. Then, system (10) is stochastically stable under the Markovian network topology, and the observer gain is obtained by F_g = Q_g^{-1} F̃_g.
Proof. Let us choose a mode-dependent Lyapunov function of the following form:

V(t, φ(t) = g) = ê_0^T(t) ( I_N ⊗ Q_g ) ê_0(t),

where I_N ⊗ Q_g = diag( Q_g, ..., Q_g ) > 0. Then, the weak infinitesimal operator acting on V(t, φ(t)) provides

∇V(t) = ê_0^T(t) [ He{ (I_N ⊗ Q_g)( I_N ⊗ A_0g + (L_g + M_g) ⊗ F_g ) } + I_N ⊗ Q̄_g ] ê_0(t). (13)

Thus, by (13), it can be seen that ∇V(t) < 0 holds if

0 > He{ (I_N ⊗ Q_g)( I_N ⊗ A_0g + (L_g + M_g) ⊗ F_g ) } + I_N ⊗ Q̄_g, (14)

where Q̄_g = ∑_{h∈N_α} π_gh Q_h. That is, (14) implies that there exists a small scalar κ > 0 such that ∇V(t) ≤ −κ ||ê_0(t)||²_2, and by a generalization of Dynkin's formula, it is obtained that

E{ V(t, φ(t)) } − V(0, φ(0)) ≤ −κ E{ ∫_0^t ||ê_0(τ)||²_2 dτ },

which results in

E{ ∫_0^∞ ||ê_0(t)||²_2 dt } ≤ κ^{-1} V(0, φ(0)) < ∞.

Hence, based on Definition 1, system (10) can be said to be stochastically stable. Moreover, defining F̃_g = Q_g F_g, condition (14) can be converted into 0 > I_N ⊗ He{ Q_g A_0g } + He{ (L_g + M_g) ⊗ F̃_g } + I_N ⊗ Q̄_g, which becomes (11).
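In practice, condition (11) would be handed to an SDP/LMI solver; the sketch below merely evaluates the condition for given candidate matrices to illustrate its structure. All numerical data are hypothetical (scalar leader dynamics, two modes, two followers):

```python
import numpy as np

def He(M):
    return M + M.T

def theorem1_feasible(A0, Q, Ft, H, Pi_rate):
    """Check, for every mode g, the matrix inequality
       0 > I_N (x) He{Q_g A0_g} + He{(L_g + M_g) (x) Ft_g} + I_N (x) Qbar_g,
    with Qbar_g = sum_h pi_gh Q_h and Ft_g = Q_g F_g.  This only verifies
    candidate matrices; it is not an LMI solver.
    """
    alpha, N = len(A0), H[0].shape[0]
    for g in range(alpha):
        Qbar = sum(Pi_rate[g, h] * Q[h] for h in range(alpha))
        lmi = (np.kron(np.eye(N), He(Q[g] @ A0[g]))
               + He(np.kron(H[g], Ft[g]))
               + np.kron(np.eye(N), Qbar))
        if np.max(np.linalg.eigvalsh((lmi + lmi.T) / 2)) >= 0:
            return False
    return True

# Hypothetical two-mode data with a scalar leader (l = 1, N = 2 followers).
A0 = [np.array([[0.1]]), np.array([[0.2]])]
Q = [np.eye(1), np.eye(1)]
Ft = [np.array([[-1.0]]), np.array([[-1.0]])]     # Ft_g = Q_g F_g
H = [np.array([[1.0, 0.0], [-1.0, 1.0]]),         # L_g + M_g per mode
     np.array([[2.0, 0.0], [-1.0, 1.0]])]
Pi_rate = np.array([[-1.0, 1.0], [1.0, -1.0]])
assert theorem1_feasible(A0, Q, Ft, H, Pi_rate)
# Observer gain recovery: F_g = inv(Q_g) @ Ft_g, here simply F_g = -1.
```

With Q_g = 1 the check reduces to 2 a_0g I + f̃_g (H_g + H_g^T) ≺ 0, which the chosen f̃_g = −1 satisfies for both modes.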

Remark 6.
It is worth noting that not all follower agents can obtain the leader-state values directly under the switching network topology, but only those agents connected to the leader. For distributed control purposes, the mode-dependent leader-state observer is designed for each agent such that the estimated states can track the leader-state values through intermittent communication. Indeed, following Theorem 1, system (10) is stochastically stable, which leads to lim_{t→∞} ê_0i(t) = lim_{t→∞} ( x̂_0i(t) − x_0(t) ) = 0. As shown in Figure 1, x̂_0i(t) is multiplied by each component of the solution pair (Λ_ig, Π_ig) of the regulator equations at the output of the leader-state observer and then sent to the controller to reduce the complexity arising from the asynchronous controller-side mode.
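The convergence described in this remark can be illustrated by integrating the stacked error dynamics ê̇_0 = [I_N ⊗ A_0g + (L_g + M_g) ⊗ F_g] ê_0 under a switching mode signal. The data below are hypothetical, the mode schedule is deterministic for reproducibility, and the integration is a plain forward-Euler sketch:

```python
import numpy as np

# Hypothetical two-mode data (scalar leader, N = 2 followers).
A0 = [np.array([[0.1]]), np.array([[0.2]])]
F = [np.array([[-1.0]]), np.array([[-1.0]])]       # observer gains F_g
H = [np.array([[1.0, 0.0], [-1.0, 1.0]]),          # L_g + M_g per mode
     np.array([[2.0, 0.0], [-1.0, 1.0]])]

dt, N = 1e-3, 2
e0 = np.array([1.0, -2.0])                 # stacked initial estimation error
schedule = lambda t: 0 if int(t) % 2 == 0 else 1   # mode switch every 1 s
for k in range(10_000):                    # 10 s horizon, forward Euler
    g = schedule(k * dt)
    Acl = np.kron(np.eye(N), A0[g]) + np.kron(H[g], F[g])
    e0 = e0 + dt * (Acl @ e0)
assert np.linalg.norm(e0) < 1e-2           # error has (nearly) converged
```

Both closed-loop matrices here are stable, so the error decays under arbitrary switching; in the stochastic setting of Theorem 1 this holds in the mean-square sense.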

Distributed Controller Design
Let us define x̃_i(t) = x_i(t) − Λ_ig x_0(t). Then, the error system of the ith agent is given as follows:

d x̃_i(t)/dt = A_ig x̃_i(t) + B_ig u_i(t) + ( A_ig Λ_ig − Λ_ig A_0g ) x_0(t). (17)

Furthermore, for (17), the following distributed control law is considered:

u_i(t) = K_is ( x_i(t) − Λ_ig x̂_0i(t) ) + Π_ig x̂_0i(t), (18)

where K_is = K_i(ρ(t) = s) denotes the asynchronous mode-dependent controller gain with s ∈ N_β. Thus, based on Assumption 2, the closed-loop system with (17) and (18) is described as follows:

d x̃_i(t)/dt = ( A_ig + B_ig K_is ) x̃_i(t) + B_ig ( Π_ig − K_is Λ_ig ) ê_0i(t). (19)

The following theorem presents the LMI-based cooperative output regulation conditions of (19).
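Assuming the control law takes the standard form u_i = K_is(x_i − Λ_ig x̂_0i) + Π_ig x̂_0i (consistent with the signals Π_ig x̂_0i and Λ_ig x̂_0i named in Figure 1), the cancellation of the x_0-dependent terms via the regulator equations can be verified numerically on a concrete instance. All data below are illustrative:

```python
import numpy as np

# Instance whose (Lam, Pi) solve the regulator equations exactly.
A = np.array([[0.0, 1.0], [0.0, 0.0]])       # double-integrator follower
B = np.array([[0.0], [1.0]])
A0 = np.array([[0.0, 1.0], [-1.0, 0.0]])     # harmonic-oscillator leader
Lam = np.eye(2)                              # Lam @ A0 = A @ Lam + B @ Pi_
Pi_ = np.array([[-1.0, 0.0]])
K = np.array([[-2.0, -3.0]])                 # any stabilizing gain K_is

rng = np.random.default_rng(4)
x = rng.standard_normal(2)                   # follower state x_i
x0 = rng.standard_normal(2)                  # leader state
e0 = rng.standard_normal(2)                  # observer error e0_i
xhat0 = x0 + e0                              # estimated leader state
xt = x - Lam @ x0                            # tracking error ~x_i

u = K @ (x - Lam @ xhat0) + Pi_ @ xhat0      # assumed control law
lhs = A @ x + B @ u - Lam @ (A0 @ x0)        # d/dt ~x_i, computed directly
rhs = (A + B @ K) @ xt + B @ (Pi_ - K @ Lam) @ e0   # closed-loop form
assert np.allclose(lhs, rhs)
```

Because the regulator equations hold, the x_0-dependent residual (A Λ + B Π − Λ A_0) x_0 vanishes and only the tracking error and the observer error drive the closed loop, exactly as in (19).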

Theorem 2.
For any scalars µ > 0, ϵ > 0, and δ > 0, suppose that there exist 0 < P̄_ig = P̄_ig^T ∈ R^{n×n}, 0 < Z_igh = Z_igh^T ∈ R^{n×n}, W_i = W_i^T ∈ R^{n×n}, and K̃_is ∈ R^{m×n} such that the following conditions hold for i ∈ N_N and g ∈ N_α: where Then, system (1) achieves the stochastically cooperative output regulation, where the controller gain is constructed by K_is = K̃_is W_i^{-1}.
Proof. Let us consider the following mode-dependent Lyapunov function:

V_i(t, φ(t) = g) = x̃_i^T(t) P_ig x̃_i(t),

where P_ig = P_ig^T > 0. Then, applying the weak infinitesimal operator to V_i(t, φ(t)) leads to

Also, by Lemma 1, it is obtained that

where K̄_ig = ∑_{s∈N_β} ϖ_gs K_is, and

Thus, the condition Υ_ig < 0 implies

where γ = ||Λ_ig||²_2 + ||Π_ig||²_2. Furthermore, using a generalization of Dynkin's formula, it is obtained that

Remark 8.
In the synchronous case (φ(t) ≡ ρ(t) = g), the bilinear terms (the third and fifth terms on the right-hand side) of condition (29) could be handled using the conventional variable replacement technique, that is, by denoting K̃_ig = K_ig P̄_ig and then applying the Schur complement to derive the final LMI conditions. However, because of the difference between the system and controller modes in the asynchronous case (φ(t) = g ≠ ρ(t) = s), this approach can no longer be used. Hence, additional numerical manipulations and equivalent LMI conditions are introduced here to handle the coupling problems in condition (29).
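The Schur complement step mentioned here rests on the equivalence, for symmetric A and C, that [[A, B], [Bᵀ, C]] < 0 if and only if C < 0 and A − B C⁻¹ Bᵀ < 0. A numeric spot-check on randomly generated well-conditioned data:

```python
import numpy as np

def neg_def(M):
    """True iff the symmetric part of M is negative definite."""
    return np.max(np.linalg.eigvalsh((M + M.T) / 2)) < 0

rng = np.random.default_rng(5)
n = 3
R1 = rng.standard_normal((n, n))
R2 = rng.standard_normal((n, n))
A = -(R1 @ R1.T) - 5 * np.eye(n)     # strictly negative-definite blocks
C = -(R2 @ R2.T) - 5 * np.eye(n)
B = 0.5 * rng.standard_normal((n, n))

block = np.block([[A, B], [B.T, C]])
schur = A - B @ np.linalg.inv(C) @ B.T
assert neg_def(block) == (neg_def(C) and neg_def(schur))
```

This is how a quadratic term such as K̃ᵀ X K̃ inside an LMI is moved into an off-diagonal block to restore linearity in the decision variables.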

Remark 9.
Most recent studies have concentrated on the cooperative output regulation problem of heterogeneous Markov jump MASs in which the system and controller modes operate synchronously [23,34], or on deterministic system parameters with a switching network topology [20,21]. Meanwhile, the proposed control strategy can cover both of these problems. Furthermore, we make a first attempt to reflect the influence of the asynchronous mode between continuous-time heterogeneous MASs and observer-based distributed controllers on achieving stochastically cooperative output regulation subject to Markov jumps. Thus, it is hard to draw a direct comparison, given the lack of comparable results developed in a framework similar to this study.

Illustrative Examples
To show the validity of the proposed method, this paper provides two examples. Example 1. Let us consider the following heterogeneous multi-agent system with g ∈ N_2 and i ∈ N_4, adopted in [23]: In addition, the network topology used is depicted in Figure 2 (each agent is a numbered circle), which can be characterized as follows: Also, based on Assumption 2, Λ_ig = I_2, Π_i1 = 0, and Π_i2 = [0 −0.1]. Furthermore, to handle the synchronous and asynchronous cases between the system and controller modes, we consider the following transition rate π_gh and conditional probability ϖ_gs:

Case 1 (Synchronous case):
[ϖ_gs]_{g∈N_2, s∈N_2}

Case 2 (Asynchronous case):

Thereupon, for µ = 0.05, δ = 0.05, and ϵ = 0.1, Theorems 1 and 2 provide the following leader-state observer gain F_g and controller gain K_is:

Case 1 (Synchronous case):

The details of the hardware used in this experiment are provided in Table 1, and the experiment is conducted in MATLAB for both cases. Based on the sample mean over 10,000 trials, the expected computation times are nearly 0.0363 s for the synchronous case and 0.0394 s for the asynchronous case. Although there is a slight difference, the computation times to derive the final results are approximately similar for both cases. Table 1. System specifications.

Hardware Resources	Information
Operating System	Microsoft Windows 10 Pro

As shown in Figure 3, suppose that the hidden mode (also called the system mode) and the observed mode (also called the control mode) are generated according to (36)-(38). Then, based on (39), Figure 4 shows the leader-state estimation error ê_0i = x̂_0i − x_0, where ê_0i = [ê_0i1 ê_0i2]^T. That is, from Figure 4, it can be seen that x̂_0i steadily approaches x_0 as time increases, which reveals that the observer (8) with (39) can accurately estimate the leader-state despite abrupt changes in the system mode (related to the network topology mode). Next, based on (40), Figure 5a shows the leader and agent outputs for the synchronous case, which demonstrates the cooperative output regulation, ensuring that all agent outputs (see the solid lines) follow the leader's output (see the green-dotted line). Furthermore, Figure 6a shows the output error e_i = y_i − y_0, clearly illustrating how the agent output approaches the leader's output from the given initial conditions. Meanwhile, based on (41), Figure 5b shows the leader and agent outputs for the asynchronous case, which illustrates that all agent outputs follow the leader's output as time increases, despite the emergence of Markov switching and the asynchronous phenomenon. Figure 6b shows that the output error e_i steadily converges to zero as time increases; hence, it validates the results presented in Figure 5b. Eventually, from Figure 5a,b, it can be seen that the proposed method can be successfully used to achieve the cooperative output regulation for heterogeneous multi-agent systems with hidden Markov jumps, covering both the synchronous and asynchronous cases. Example 2 (Practical application).
Consider the following double integrator dynamics driven by different types of actuators, adopted in [36], for g ∈ N_2 and i ∈ N_4: where the system state x_i = col(x_1i, x_2i, x_3i) consists of the position x_1i, the velocity x_2i, and the actuator state x_3i; a_i > 0 denotes the speed of the actuator; b_i and c_i > 0 are gains; and d_ig ≥ 0 represents the influence of velocity on the actuator. Specifically, the value of d_ig changes according to the Markov process of the system mode g ∈ N_2, which is affected by the environment of the actuator or by internal factors, such as temperature, humidity, and lifespan.
If d_ig > 0, the actuator is influenced by the velocity; otherwise, d_ig = 0 (refer to [36] for more details). In this paper, the system parameters are set to a_1 = 1, a_2 = 10, a_3 = a_4 = 2, , and d_32 = 5. Furthermore, to synchronize the position and velocity of agents, the leader is described according to [36] as follows: In addition, the network topology used is depicted in Figure 7 (each agent is a numbered circle), which can be characterized as follows: Also, based on Assumption 2, we can see that Furthermore, to handle the synchronous and asynchronous cases between the system and controller modes, we consider the following transition rate π_gh and conditional probability ϖ_gs: Case 2 (Asynchronous case): Thereupon, for µ = 0.05, δ = 0.05, and ϵ = 0.1, Theorems 1 and 2 provide the following observer gain F_g and controller gain K_is: Let us consider the following initial conditions: As shown in Figure 8, suppose that the hidden mode (also called the system mode) and the observed mode (also called the control mode) are generated according to (44)-(46). Then, based on (47), Figure 9 shows the leader-state estimation error ê_0i = x̂_0i − x_0, where ê_0i = [ê_0i1 ê_0i2]^T. That is, as shown in Figure 9, x̂_0i steadily approaches x_0 as time increases, which reveals that the observer (8) with (47) can accurately estimate the leader-state regardless of the abrupt changes in the system (42). Subsequently, Figure 10a shows the leader and agent outputs for the synchronous case, which verifies that (48) achieves the cooperative output regulation of (42), since all the agent outputs (see the solid lines) follow the leader's output (see the green-dotted line). Meanwhile, based on (49), Figure 10b shows the leader and agent outputs for the asynchronous case; this also illustrates that all agent outputs follow the leader's output as time increases, despite the emergence of the Markov switching and the asynchronous phenomenon.
Eventually, from Figure 10a,b, it can be seen that the proposed method can be effectively used to realize the cooperative output regulation for (42) with hidden Markov jumps containing both synchronous and asynchronous cases.

Concluding Remarks
In this paper, we investigated the stochastically cooperative output regulation problem of heterogeneous multi-agent systems subject to hidden Markov jumps. In particular, we also considered a time-varying network topology that changes according to the system operation mode. First, a leader-state observer was designed using a mode-dependent Lyapunov function to ensure that all agents can accurately estimate the leader-state. Then, an asynchronous mode-dependent distributed controller was designed to ensure the stochastically cooperative output regulation of heterogeneous multi-agent systems with hidden Markov jumps. In addition, motivated by recent studies [37][38][39], we plan to extend the proposed strategy to cover more practical control problems, such as stochastic time delays, input saturation, and unknown system dynamics, in the continuous-time (discrete-time) domain for a wider range of applications.

Data Availability Statement:
The authors confirm that the data supporting the findings of this study are available within the article.

Conflicts of Interest:
The authors declare no conflict of interest.