Local stability of McKean-Vlasov equations arising from heterogeneous Gibbs systems using limit of relative entropies

A family of heterogeneous mean-field systems with jumps is analyzed. These systems are constructed as a Gibbs measure on block graphs. When the total number of particles goes to infinity, a law of large numbers is shown to hold in a multi-class context resulting in the weak convergence of the empirical vector towards the solution of a McKean-Vlasov system of equations. We then investigate the local stability of the limiting McKean-Vlasov system through the construction of a local Lyapunov function. We first compute the limit of adequately scaled relative entropy functions associated with the explicit stationary distribution of the N-particles system. Using a Laplace principle for empirical vectors we show that the limit takes an explicit form. Then we demonstrate that this limit satisfies a descent property which, combined with some mild assumptions shows that it is indeed a local Lyapunov function.


Introduction
The study of heterogeneous mean-field systems is a growing area of research. The main motivation is that the homogeneity assumption underlying the classical mean-field models often no longer holds when considering systems outside statistical physics. Therefore, many researchers studied systems with different heterogeneous assumptions; see, e.g., Contucci et al. (2008); Dawson et al. (2020); Knöpfel et al. (2021); Lacker et al. (2019) and the references therein for an overview of the recent development in the subject. The focus in the current paper is on a particular family of mean-field systems of Gibbs type constructed on block graphs. This family is a particular instance of the models introduced in Dawson et al. (2020Dawson et al. ( , 2021 with the particularity laying in the specification of a heterogeneous Gibbs measure as a stationary distribution. This was in particular inspired by the interacting particle systems of Gibbs type on complete graphs analyzed in Budhiraja et al. (2015a,b).
Classical questions in the study of mean-field systems include their asymptotic behavior when the total number of particles N in the system and/or the time t tends to infinity. Moreover, under which conditions one can justify the interchange of the limits N → ∞ and t → ∞. In particular, we will show for the specific family of Gibbs systems introduced in Section 1 that, under mild assumptions, the associated empirical vector converges weakly and uniformly over compact time intervals, as N → ∞, towards the solution of a McKean-Vlasov system of equations (see Theorem 2.1). This kind of result is known in the literature as law of large numbers. Thus, as a consequence, the McKean-Vlasov limiting system can be used to approximate the behavior of the large particles system over finite time intervals. Thence, one might wonder whether or not this approximation is still relevant when time goes to infinity. This question turns out to be related to the stability properties of the limiting McKean-Vlasov system. More precisely, if the latter contains a unique asymptotically stable equilibrium, then the interchange N → ∞ and t → ∞ is fully justified. However, if the McKean-Vlasov system contains multiple equilibria or a unique equilibrium that is not stable, care must be taken since the approximation is not necessarily accurate. The intuition behind this is that if the McKean-Vlasov limit system has several ω-limit sets, the question is which of these sets characterizes the large-time behavior of the system. In addition, in such a case, metastable phenomena might be observed resulting in transitions of the empirical vector process from one ω-limit set to another. For a more detailed discussion, one can consult Dawson et al. (2021); Benaïm & Boudec (2008) and the references therein. Therefore, the stability of the limiting system is of great interest, which motivated us to investigate it in the current paper.
The classical approach for studying the stability of dynamical systems is through the construction of Lyapunov functions. However, for McKean-Vlasov equations, given the nonlinearity, finding Lyapunov functions in the general case is very challenging. Nevertheless, the Gibbs nature of the systems studied here allows us to adopt the limit of relative entropy approach introduced in Budhiraja et al. (2015a,b). The main idea behind this approach takes root in the observation that the relative entropy is a natural Lyapunov function for linear ergodic Markov processes; see, e.g. Spitzer (1971). Moreover, despite the nonlinearity of the McKean-Vlasov system, the corresponding N -particle processes describe a linear Markov process. Therefore, one then proposes the limit of suitably normalized relative entropies associated with the stationary distribution of the N -particles system as the candidate Lyapunov function, providing that the limit takes a useful form. This approach was shown in Budhiraja et al. (2015a,b) to be successful for a family of homogeneous mean-field models with jumps of Gibbs type. The goal of the current paper is to extend the approach to the multi-class setting tackled here.
Notice that for the general non-Gibbs family of mean-field models on block graphs introduced in Dawson et al. (2020Dawson et al. ( , 2021, the stationary distribution usually will not take an explicit form, and thus a different approach is needed. One possible line of work is the approach proposed in Budhiraja et al. (2015a) for particular systems with exchangeable particles, where instead of considering limits of relative entropies associated with the stationary distribution, one may consider the large time t and large particles N limits associated with the exchangeable joint probability distribution. Therefore, given the multi-class structure of the systems in Dawson et al. (2020Dawson et al. ( , 2021, one might consider specific systems with multi-exchangeable particles. Another possible approach to construct Lyapunov functions is through the Friedlin-Wentzell quasipotential Freidlin & Wentzell (2012). Indeed, a close tie between the quasipotential associated with small noise stochastic systems and Lyapunov functions for the underlying deterministic models was observed in the literature; see, e.g., DeMarco & Qian (1989); Marco & Bergen (1987). Therefore, for the general models introduced in Dawson et al. (2020Dawson et al. ( , 2021, one might view the finite N -particles system as a small noise perturbation of the limiting McKean-Vlasov system. Notice that the specific quasipotential for these models was introduced in Dawson et al. (2021). The idea is then to investigate under which condition the quasipotential can serve as a Lyapunov function for the McKean-Vlasov system. This however goes beyond the scope of the current paper.
The rest of the paper is organized as follows. We introduce in Section 1 the family of Gibbs systems on block graphs and show some preliminary properties. We then prove in Section 2 the law of large numbers and the convergence of the N -particles empirical vector toward the solution of a McKean-Vlasov system of equations. Therefore, we investigate in Section 3 the local stability of the limiting McKean-Vlasov system by constructing a candidate Lyapunov function. Using a particular Laplace principle associated with the vector of empirical measures (see Proposition 3.2), we start by computing in Proposition 3.3 the limit of suitably normalized relative entropies and show that the limit takes an explicit form. Then, we show in Lemma 3.1 that the limiting function characterizes the fixed points of the McKean-Vlasov system. Finally, Proposition 3.4 shows that this limiting function satisfies a descent property, which, combined with mild assumptions, shows that it is indeed a local Lyapunov function for the McKean-Vlasov system of equations.

Gibbs measures on block graphs
The Gibbs measure concept has a long history and plays an important role in statistical physics. The underlying principle is that, when a system is in equilibrium, states with lower energy are more likely than those with higher energy. Thence, J.W. Gibbs proposed the probability measure exp{−U (x x x)/κT } to capture this principle, where κ is the Boltzmann constant, T is the temperature and the function U (x x x) gives the energy of the system when it is in the state x x x. Thence, the Gibbs measure being a model of equilibrium, given an energy function, one might seek a Markov process for which the Gibbs measure is the stationary distribution. Such Markov processes are often referred to as Glauber dynamics thanks to his seminal paper Glauber (1963). For a detailed introduction to the topic, one can consult, e.g., Martinelli (1999); Stroock (2005).
Consider a graph G = (V, Ξ) composed of r blocks C 1 , . . . , C r of sizes, respectively, N 1 , . . . , N r , where V is the set of the nodes and Ξ is the set of the edges. Denote by |V| = N 1 + · · · + N r = N the total number of nodes in the graph. Suppose that each block C j is a clique, i.e., all the N j nodes of the same block are connected. Furthermore, the nodes within the same block C j are decomposed into two subsets: • The set of central nodes C c j : composed of the nodes that are connected to all the other nodes within the same block but not to any node from the other blocks. We set |C c j | = N c j .
• The set of peripheral nodes C p j : composed of the nodes that are connected to all the other nodes within the same block and to all the peripheral nodes from the other blocks. We set |C p j | = N p j . Thus, the sub-graph engendered by all the peripheral nodes in G is complete.
The graph G = (V, Ξ) is thus composed of 2r components. This will play an important role in the upcoming analysis. We associate each node of the graph with a particle taking values in a finite state space Z = {1, 2, . . . , K} ⊂ N. Denoting by x x x = (x n , x m , n ∈ C c j , m ∈ C p j , 1 ≤ j ≤ r) ∈ Z N a configuration of the N particles over the graph G = (V, Ξ), the corresponding local empirical measures describing the state of each component are defined by for any z ∈ Z, ι ∈ {c, p}, and 1 ≤ j ≤ r. (1.1) Notice that for each 1 ≤ j ≤ r and ι ∈ {c, p}, the local empirical measure ̺ j,ι,N is the space of probability measures on Z endowed with the topology of weak convergence and Z is the set of integers. The corresponding empirical vector describing the state of the entire system is denoted by For ease of readability, we suppress in the sequel the dependency of ̺ j,ι,N and ̺ N upon N . Thus we simply write ̺ j,ι and ̺ instead. We associate with the configuration x x x of the N -particles the following energy function where V : Z → R is the potential function, W : Z × Z → R is the symmetric interaction function, and β > 0 is the interaction parameter. Thence, the corresponding Gibbs measure is given by where Z N is a normalization constant. We will now construct a Markov process, namely a Glauber dynamics, having the Gibbs measure π N as its stationary distribution. To this end, let us first introduce the directed graph (Z, E) with E ⊂ Z × Z\{(z, z)|z ∈ Z} representing the set of admissible jumps. Thence, whenever (z, z ′ ) ∈ E, a particle at state z is allowed to transit to z ′ ∈ Z at a rate that depends on the current state of the node and the state of its neighbors. Before going further, suppose the following assumptions throughout the paper.
Assumption 1 1. The set of edges E is symmetric.
3. For each block 1 ≤ j ≤ r, there exist p c j , p p j , α j ∈ (0, 1) such that, as N → ∞, Define the matrix (α(z, z ′ )) z,z ′ ∈Z that identifies the allowed transitions for one particle as Moreover, let the matrix A N (x x x, y y y) indexed by the elements x x x, y y y ∈ Z N be defined as A N (x x x, y y y) = a(x l , y l ) if x x x and y y y differ exactly in one index l ∈ C j , for some 1 ≤ j ≤ r. 0 else, Hence A N determines which states of the N -particle system can be reached in one jump. At this stage, there are several ways to construct Glauber dynamics. See, e.g., (Martinelli, 1999, Sect. 3) for an overview. We propose here the Metropolis dynamic characterized by the following rate matrix: 5) and ψ N (x x x, x x x) = − y y y =x x x ψ N (x x x, y y y), for all x x x ∈ Z N . One can verify that the rate matrix ψ N has π N as its stationary distribution. In fact, consider two configurations x x x, y y y ∈ Z N . By symmetry, one has A N (x x x, y y y) = A N (y y y, x x x). Moreover, using (1.3) and (1.5), it is easily to check that π N (x x x)ψ N (x x x, y y y) = π N (y y y)ψ N (y y y, x x x). Then ψ N (x x x, y y y) satisfies the detailed balance condition with respect to π N . Hence, π N is a stationary distribution of the Markov chain, and the rate matrix ψ N is reversible with respect to π N . Furthermore, since the graph of allowed transitions (Z, E) is irreducible by Assumption 1, the rate matrix ψ N is also irreducible and thus, π N is the unique stationary distribution. One might observe from the rate matrix introduced in (1.5) that the transition between two configurations x x x and y y y depends on the difference between their total energy. Therefore, to investigate the large-scale behavior of the system, we first estimate this difference when the total number N of particles in the system is very large. Given the multi-class structure of the system, we take throughout the paper the convention that as N → ∞, min 1≤j≤r ι∈{c,p} N ι j → ∞. In addition, and for the aim of simplicity, we will ignore from now on the environment potential by supposing that V ≡ 0 and thus focus on the interaction component of the system. Nevertheless, our results can be easily extended to the case with non-zero potential. Let us define, for x, y ∈ Z, q = (q 1,c , q 1,p , . . . , q r,c , q r,p ) ∈ (M 1 (Z)) 2r , and a, b 1 , . . . , b r ∈ R, the following real-valued functions Lemma 1.1 Let x x x, y y y ∈ Z N be two configurations such that A N (x x x, y y y) = 1. If the unique index satisfying x l = y l is a central node, i.e., l ∈ C c j * , for some block 1 ≤ j * ≤ r, then (1.9) If the unique index such that x l = y l is a peripheral node, i.e. l ∈ C p j * , for some block 1 ≤ j * ≤ r, then (1.11) Proof Let x x x, y y y ∈ Z N be two configurations such that A N (x x x, y y y) = 1, and let l be the unique index such that x l = y l . The index l can either refer to a central or a peripheral node. We treat the two cases separately.
Case 1. Suppose that the two vectors x x x and y y y differ in one central node, i.e., l ∈ C c j * for some 1 ≤ j * ≤ r. Therefore by (1.2) one obtains (1.12) By the symmetry of the interaction function W and using (1.1) one further obtains (1.14) Therefore, one concludes that (1.15) from which (1.8) and (1.9) follow.
Case 2: Suppose now that the two vectors x x x and y y y differs in one peripheral node, i.e., l ∈ C p j * for some 1 ≤ j * ≤ r. Therefore from (1.2) one gets (1.16) Again, using the symmetry of W together with (1.1) one further obtains Thus, one deduces that from which (1.10) and (1.11) follow. ✷ One might notice from Lemma 1.1 that if two configurations differ in only one index l, then the jump rates of the Markov process governed by the rate matrix ψ N depend on the states x i of the other particles i = l only through the local empirical measures To emphasize this fact, we introduce the rate functions λ j,c,N and λ j,p,N defined, for all 1 ≤ j ≤ r, z, z ′ ∈ Z, q = (q 1,c , q 1,p , . . . , q r,c , q r,p ) ∈ (M 1 (Z)) 2r , and a, b 1 , . . . , b r ∈ R, by Thus, when the system is in configuration x x x, a central node l ∈ C c j of a given block 1 ≤ j ≤ r jumps from a state z to z ′ at rate and a peripheral node l ∈ C p j within the block 1 ≤ j ≤ r jumps from a state z to z ′ at rate (1.23) Now, using (1.6) and (1.7) together with Assumption 1, one can easily prove that the functions ψ j,c,N (·, N c j , N p j ) and ψ j,p,N (·, N c j , N p 1 , . . . , N p r ) converge respectively, as N → ∞, toward the functions ψ c and ψ p defined, for all 1 ≤ j ≤ r, z, z ′ ∈ Z and q = (q 1,c , q 1,p , . . . , q r,c , q r,p ) ∈ (M 1 (Z)) 2r by Using this together with the inequalities in (1.9) and (1.11), one can further prove that the rate functions ( 1.24) Fix q = (q 1,c , q 1,p , . . . , q r,c , q r,p ) ∈ (M 1 (Z)) 2r then, for all 1 ≤ j ≤ r, the rate matrix A j,c , is the generator of an ergodic Markov chain with a unique invariant measure defined, for all z ∈ Z, by where Z N (q) is a normalization constant given by In fact, this can be easily verified by checking the detailed balance condition. Similarly, the rate matrix , is the generator of an ergodic Markov chain with a unique stationary distribution defined, for all z ∈ Z, by where again Z N (q) is the normalization constant given by

Law of large numbers and McKean-Vlasov limiting system
Denote by x x x N (t) t≥0 = x n (t), x m (t), n ∈ C c j , m ∈ C p j , 1 ≤ j ≤ r t≥0 the stochastic process characterizing the state of the system at each time t ≥ 0. Hence x x x N (t) t≥0 is a Markov process with state space Z 2r and transition rate matrix ψ N (x x x, y y y) x x x,y y y∈Z N given by (1.5). Define for convenience the following local empirical measures . Therefore, at any time t ≥ 0, a given central (resp. peripheral) particle within the block j jumps from z to z ′ , for z, (1.21)). Recall the important notion of multi-exchangeability for multi-class systems in the definition given below.
is said to be multi-exchangeable if its law is invariant under permutation of the indexes within the classes, that is, for 1 ≤ k ≤ K and any permutations σ k of {1, . . . , N k }, the following equality in distribution holds Given the symmetry of the rate functions within each of the 2r classes C c 1 , C p 1 , . . . , C c r , C p r , a multiexchangeability assumption made on the initial condition x x x N (0) guarantees that, at any t > 0, x x x N (t) is also multi-exchangeable. It follows that the local empirical measures µ ι,N j (t) are Markov processes taking values respectively in the state spaces M N ι j 1 (Z). For a detailed proof of this claim, one can consult (Dawson, 1991, Prop. 2.3.3). The random evolution of the empirical vector µ N (t) is summarized as follows: First, notice that the current setting allows almost surely at most one particle to jump at any given time. Therefore the jumps of µ N (t) happen within each component µ ι,N j (t) at a time and have the form j q j,ι z particles of the class C ι j at each state z ∈ Z. Each of these particles jumps independently to z ′ ∈ Z at rate λ j,c,N z,z ′ q, N c j , N p j for ι = c and λ j,p,N z,z ′ q, N c j , N p 1 , . . . , N p r for ι = p. Therefore, the local central empirical processes µ c,N j (t) transit from q j,c to q j,c + 1 Similarly the peripheral local empirical processes µ p,N j (t) transit from q j,p to q j,p + 1 . . , N p r . Thence, the infinitesimal generator associated with the empirical vector µ N (t) is given for any real-valued function Φ on (M 1 (Z)) 2r by One might also describe the changes in the system locally by introducing the infinitesimal generators corresponding to the classes C ι j and defined for functions f on M 1 (Z) by As mentioned in the introduction, one classical question in the study of interacting particle systems is their large-scale behavior, namely, their behavior when the total number of particles goes to infinity. In the next result, we state that a law of large numbers holds for the empirical vector µ N (t) as N → ∞. It is worthwhile to mention that contrary to the models studied in Dawson et al. (2020Dawson et al. ( , 2021, the rate functions here are convergent. Proposition 2.1 Suppose that the initial condition µ N (0) = µ c,N 1 (0), µ p,N 1 (0), . . . , µ c,N r (0), µ p,N r (0) converges weakly, as N → ∞, towards ν = (ν c,N 1 , ν p,N 1 , . . . , ν c,N r , ν p,N r ) ∈ (M 1 (Z)) 2r . Then, the empirical vector process µ N (t) converges in probability and uniformly over compact time intervals toward the solution µ to the following McKean-Vlasov system (2.1) Proof Since µ N (t) is a pure jump Markov process, one might rely on the classical Kurtz theorem Kurtz (1970). We first verify the conditions of its application. Denote E = M 1 (Z) and E N ι j = M N ι j 1 (Z) for all 1 ≤ j ≤ r and ι ∈ {c, p}. Define the following functions for 1 ≤ j ≤ r and q = (q 1,c , q 1,p , . . . , q r,c , q r,p ) ∈ r j=1 ι∈{c,p} E N ι j . Observe from (1.24) that the rate functions λ j,ι z,z ′ are Lipschitz in q j,ι for all 1 ≤ j ≤ r and ι ∈ {c, p}. Therefore, since q j,ι are probability measures, one can find a constant M such that, for anyq = (q 1,c , q 1,p , . . . ,q r,ι , . . . , q r,c , q r,p ) ∈ r j=1 where · denotes a distance on R K . In addition, recall that the functions λ j,ι,N z,z ′ convergence toward λ j,ι z,z ′ as N → ∞. Hence, it is easy to see that Now, straightforward calculations allow us to verify that the z-th component of the are bounded given the form of the rate functions λ j,c,N z,z ′ and λ j,p,N z,z ′ in (1.20) and (1.21) and the fact that Z is finite. Thus condition (2.9) in Kurtz (1970) is verified. To verify condition (2.10) of Kurtz (1970) where C > e z ′ − e z . Thus ε N c j ↓ 0 as N → ∞, and the following converges hold true Thus condition (2.10) in Kurtz (1970) holds. Therefore, we are now in the position to apply (Kurtz, 1970, Theorem. 2.11) since all the related conditions are satisfied. Thus each µ ι,N j (t) converges in probability and uniformly over any time interval [0, T ] towards the solution µ ι j (t) of the differential equation d dt µ ι j (t) = µ ι j (t)A j,ι µ(t) . Define F (q) = F 1,c q (q 1,c ), F 1,p q (q 1,p ), . . . , F r,c q (q r,c ), F r,p q (q r,p ) .
From (2.2), it follows that F is Lipschitz. Therefore standard arguments show that the differential equationμ(t) = F (µ(t)) has a unique solution. This concludes the proof. ✷

Stability of the McKean-Vlasov system
The law of large numbers established in the last section characterizes the large-scale behavior of the N -particles system over finite time intervals. In particular, as N → ∞, the empirical vector µ N (t) converges in probability towards the solution µ(t) to the McKean-Vlasov system (2.1). Thence, when the total number of particles in the system is very large, one might approximate the behavior of µ N (t) by µ(t) over finite time intervals. However, when time t → ∞, this approximation might no longer be accurate. In particular, this will depend on the uniqueness of equilibrium points of the McKean-Vlasov system and their stability. In the case where there are multiple equilibria or a unique unstable equilibrium, metastable phenomena can arise and care must be taken in using the approximation. One can consult Borkar & Sundaresan (2012); Dawson et al. (2021) for a detailed discussion. The long-time behavior of the large N -particles system is thus related to the stability of the McKean-Vlasov system (2.1). The goal of the current section is to investigate it by constructing a suitable Lyapunov function. The approach we take is based on the calculation of the limit of suitably normalized relative entropies. This idea was introduced in Budhiraja et al. (2015a,b) to study the stability of Kolmogorov forward equations arising as the limit of mean-field systems with jumps on complete graphs. We thus generalize this method to multi-class mean-field systems with jumps.
Let us first introduce some definitions. Recall that |Z| = K is the number of possible states. Let S = {p ∈ R K : p z ≥ 0, z ∈ Z, z∈Z p z = 1} be the (K − 1)-dimensional simplex, and let S • = {p ∈ S : p z > 0, for all z ∈ Z} denotes its relative interior. Notice that the space M 1 (Z) of probability measures on Z can be identified with S. Thus, the empirical vector (̺(t), t ≥ 0) takes values in S 2r .
Definition 3.1 A point φ = (φ c j , φ p j , 1 ≤ j ≤ r) ∈ S 2r is said to be a fixed point of the McKean-Vlasov system (2.1) if the right-hand side of (2.1) evaluated at µ = φ is equal to zero, namely, (3.1) Definition 3.2 A fixed point φ = (φ c j , φ p j , 1 ≤ j ≤ r) ∈ S •2r of the McKean-Vlasov system (2.1) is said to be locally stable if there exists a relatively open subset Γ of S 2r that contains φ and has the property that whenever µ(0) ∈ Γ, the solution µ(t) to (2.1) with initial condition µ(0) converges to φ as t → ∞. Definition 3.3 Let φ = (φ c j , φ p j , 1 ≤ j ≤ r) ∈ S •2r be a fixed point of (2.1) and let Γ be a relatively open subset of S 2r that contains φ. A function J : Γ → R is called positive definite if for some K * ∈ R, the sets M K = {r ∈Γ : J(r) ≤ K} decrease continuously to φ as K ↓ K * .
The next classical result shows that the existence of a local Lyapunov function is equivalent to the local stability of an equilibrium of the McKean-Vlasov system (2.1). We give detailed proof tailored for the specific system (2.1).
Proposition 3.1 Let φ ∈ S •2r be a fixed point of the McKean-Vlasov system (2.1), and let Γ be some relatively open subset of S 2r that contains φ. Suppose that there exists a local Lyapunov function associated with (Γ, φ). Then φ is locally stable.
Proof Let J be a Lyapunov function associated with (Γ, φ). Therefore, J is positive definite. Thus, for some K * ∈ R, the sets M K = {r ∈Γ : J(r) ≤ K} decrease continuously to φ as K ↓ K * . One can then find some L > K * and an open set O such that φ ∈ O ⊂ M L ⊂ Γ. Let µ(0) ∈ O such that µ(0) = φ. By the chain rule property together with (2.1) we find where DJ(µ(t)) is the derivative (gradient's transpose) of J at the solution µ(t) of the McKean-Vlasov system (2.1), and ·, · denotes the scalar product on R 2r . Define the stopping time τ = inf{t > 0 : µ(t) ∈ Γ c }. Suppose that τ < ∞. By definition of Lyapunov function we have that d dt J(µ(t)) < 0 for all 0 ≤ t ≤ τ and µ(t) = φ. Moreover, observe that the function q → DJ q , q c 1 A 1,c q , q p 1 A 1,p q , . . . , q c r A r,c µ , q p r A r,p q is continuous on Γ. Then we have necessarily d dt J(µ(t)) = 0 for µ(t) = φ since d dt J(µ(t)) < 0 elsewhere on Γ (no discontinuity). Thus d dt J(µ(t)) ≤ 0 for all 0 ≤ t ≤ τ . Furthermore, since S 2r is a metric space, R is a complete metric space, and J is uniformly continuous, we have that the function J can be extended continuously on the closureΓ of Γ. Using this together with d dt J(µ(t)) ≤ 0 for all 0 ≤ t ≤ τ and since µ(0) ∈ M L proves that J(µ(τ )) ≤ L (since J is decreasing when t ∈ [0, τ ]). We thus deduce that µ(τ ) ⊂ M L ⊂ Γ. This means that τ = ∞ and hence again by definition of Lyapunov function d dt J(µ(t)) < 0 for all t ≥ 0 with µ(t) = φ. (3.3) Let K * < K n < L for n ≥ 1 be a decreasing sequence of real numbers going to K * , and define the corresponding stopping times We next prove that τ n < ∞ for all n ≥ 1. First, if µ(0) ∈ M Kn , this follows immediately by the above arguments. Suppose that µ(0) / ∈ M K 1 . Then, by the positive definitness of the function J, we can always find ε > 0 such that the ball B(φ, ε) centered at φ with radius ε satisfies B(φ, ε) ∩ S 2r ⊂ M K 1 ⊂ M L . Let q ∈ (B(φ, ε)) c ∩ M L and define by µ q (t) the solution to the McKean-Vlasov system (2.1) with initial condition µ q (0) = q. Therefore, using (3.3), one obtains that d dt J(µ q (t)) t=0 < 0. Recall that the function q → DJ q , q c 1 A 1,c q , q p 1 A 1,p q , . . . , q c r A r,c µ , q p r A r,p q is continuous on Γ , and that B(φ, ε) ∩ M L is a closed subset of Γ, thus In addition, since B(φ, ε) ∩ S 2r ⊂ M K 1 , we have that µ(t) ∈ (B(φ, ε)) c ∩ M L for all t ≤ τ 1 from which we deduce that sup t≤τ 1 d dt J(µ(t)) ≤ 0. Then τ 1 < ∞. Repeating the same argument for all n > 1 gives that τ n < ∞ for all n ≥ 1. This concludes the proof. ✷

Limit of relative entropies
We now construct a candidate Lyapunov function for the McKean-Vlasov system (2.1) as the limit of suitably scaled relative entropies. We start by recalling the form of the relative entropy function denoted by R(· ·) and defined, for all p ∈ (M 1 (Z)) 2r , by Notice that the relative entropy function plays an important role in various fields of mathematics. For an account of its properties and applications, one can consult e.g. COVER & THOMAS (2006). Define the functionF N given, for any q = (q 1,c , q 1,p , . . . , q r,c , q r,p ) ∈ M 1 (Z) 2r , bȳ where π N (x x x) is the Gibbs measure introduced in (1.3). The idea now is to identify lim N →∞F N (q) and then investigate its Lyapunov function properties for the McKean-Vlasov limiting system (2.1). The calculation of this limit relies on the following Laplace principle for empirical vectors which is an extension of the Sanov's theorem for empirical measures given in (Dupuis & Ellis, 1997, Th. 2

.2.1).
Proposition 3.2 Let Z be a Polish space and let ν be a probability measure on Z. Let r ≥ 1, and let {x j i } 1≤i≤N j , for 1 ≤ j ≤ r, be r sequences of Z-valued i.i.d random variables with a common distribution ν. Suppose that r j=1 N j = N and that there exist positive reals α 1 , . . . , α r such that lim N →∞ N j N = α j and j α j = 1. Denote by µ N = µ N 1 , . . . , µ N r the empirical vector associated with these sequences, Then, for all sequence {h N , N ∈ N} of bounded continuous functions mapping (M 1 (Z)) r into R and converging to some h as N → ∞, the following Laplace principle holds Proof We follow the strategy elaborated in Dupuis & Ellis (1997). We first establish a variational formulation, then we prove the convergence by establishing lower and upper bounds.
Step 2: Upper bound. Let {γ j } 1≤j≤r ∈ M 1 (Z) be a sequence of probability measures on Z and set η N ji = γ j for all 1 ≤ i ≤ N j . Then, by (3.6), one obtains the following upper bound Moreover, since η N ji = γ j for all 1 ≤ i ≤ N j , the random variablesx j i , for 1 ≤ i ≤ N j , are i.i.d. Furthermore, since the measurable functions h N are bounded continuous and convergent towards h, using the dominated convergence theorem, the laws of large numbers, and the convergence of the proportions with γ = (γ 1 , . . . , γ r ). Therefore, since the measures {γ j , 1 ≤ j ≤ r} are arbitrary chosen, the following upper bound holds true lim sup (3.7) Step 3: Lower bound. First, by the convexity of relative entropy and the Jensen's inequality, the following upper bound holds true Set, for all 1 ≤ j ≤ r, Therefore, for any ε > 0, one can always find a sequence {η N ji } such that Now, for all 1 ≤ j ≤ r and 1 ≤ i ≤ N j , let F j,i be the σ-algebra generated by the random variables x 1 1 , . . . ,x 1 N 1 , . . . ,x j 1 , . . . ,x j i . Then, for any bounded measurable function g : Z → R, it is easy to verify thatĒ One thus deduces that forms a martingale difference sequence with respect to F j,i . In addition, straightforward calculations give, for all 1 ≤ j ≤ r, Hence, recalling that the terms of a martingale difference sequence are uncorrelated we obtain by the Markov's inequality that, for any ε > 0, From the last inequality one deduces that, for each g ∈ U b (Z), with U b (Z) being the space of bounded and uniformly continuous functions on Z, the sequence { Z g(y)dμ N j − Z g(y)dχ N j } N ≥1 converges in probability to 0, and hence in distribution. Moreover, the sequence {(χ N j ,μ N j ), N ≥ 1} takes values in M 1 (Z) × M 1 (Z) which is a compact space by the Prokhorov's theorem given that Z is compact. In addition, M 1 (Z)×M 1 (Z) endowed with the topology of weak convergence is a metric space. Therefore, any subsequence admits a further subsubsequence that converges weakly to some (χ j ,μ j ) ∈ M 1 (Z) × M 1 (Z). Thence, by the Skorokhod's representation theorem, there exists some probability space such that, for all g ∈ U b (Z), the following convergences hold almost surely, We thus deduce that Z gdχ j = Z gdμ j . Then, since the space U b (Z) is measure-determining, one deduces that χ j =μ j almost surely. We then conclude that any subsequence of {(χ N j ,μ N j ), N ≥ 1} contains a further subsubsequence that converge in distribution to (μ j ,μ j ), for someμ j ∈ M 1 (Z). Denote by {N ∈ N} this subsequence. Again by the Skorokhod's representation theorem, there exists some probability space such that (χ N j ,μ N j ) converges almost surely to (μ j ,μ j ) along the subsequence {N ∈ N}. Thence, using the nonnegativy and semi-continuity of the relative entropy function R(·, ν) on M 1 (Z), the boundedness and continuity of the function h N on (M 1 (Z)) r together with its convergence towards h, and the convergence Finally, by Fatou's lemma we obtain lim inf Letting ε ↓ 0, we conclude that any subsequence of the original sequence {W N , N ∈ N} contains a further subsubsequence that satisfies the following lower bound Thence we deduce that the original sequence {W N , N ∈ N} satisfies the lower limit in (3.9). Combining this with the upper bound (3.7) leads to the stated result. ✷ We are now ready to state the main result of this section.
Proposition 3.3 There exists a constant C ∈ R such that for all q = (q 1,c , q 1,c , . . . , q r,c , q r,p ) ∈ M 1 (Z) (3.10) Proof Fix q = (q 1,c , q 1,p , . . . , q r,c , q r,p ) ∈ M 1 (Z) 2r . Then, by (1.3) one obtains (3.11) For all 1 ≤ j ≤ r, let {X j,c i } i∈N and {X j,p i } i∈N be sequences of i.i.d. Z-valued random variables with common distributions q j,c and q j,p respectively. We then can write Taking the limit in the last display and using Assumption 1 we find (3.12) Moreover, it is easy to see that distribution ν given by ν z = 1 |Z| , z ∈ Z. Then have (3.14) Furthermore, recalling (1.1), one finds

Fixed points of the McKean-Vlasov system
One of the important features of the limiting function F (q) given by (3.10) is that it characterizes the critical points of the McKean-Vlasov system in (2.1). To prove this fact, let us introduce some additional notations. Define the hyperplane and the corresponding shifted version For any v ∈ H 0 , the directional derivative of the function F (q) with respect to v is given by where, for x ∈ Z and 1 ≤ j ≤ r, Moreover by (1.25) and (1.26) one can rewrite the last equality as Let q = (q 1,c , q 1,p , . . . , q r,c , q r,p ) ∈ M 1 (Z) 2r be a fixed point of the McKean-Vlasov system in (2.1) corresponding to the rate functions λ j,c z,z ′ and λ j,p z,z ′ defined in (1.24). Thus, by the definition,    q j,c A j,c q = 0, q j,p A j,p q = 0, 1 ≤ j ≤ r.
Thence q = π(q), and thus q is a fixed point of the McKean-Vlasov system. This concludes the proof. ✷ The previous result allows us to identify the equilibrium points of the McKean-Vlasov system in (2.1) by the critical points of the limit function F . Also, note that the dynamic system can contain multiple ω -limit sets as shown in the following example.
Hence, the critical points of f (q 1,c 1 , q 1,p 1 , q 2,c 1 , q 2,p 1 ) on [0, 1] 4 correspond to the critical points of F on (M 1 (Z)) 4 which, by Lemma 3.1, correspond to the equilibria of the McKean-Vlasov system (2.1). Therefore, one has to solve the following system of equations: Moreover, recall that, since ν is fixed, π j,c (ν j,c ) and π j,p (ν j,p ) are the stationary distributions of ergodic Markov processes generated by the rate functions (1.24). The linear Kolmogorov forward equations associated with these Markov processes are given by     η c j (t) = η c j (t)A j,c ν , η p j (t) = η p j (t)A j,p ν , 1 ≤ j ≤ r, t ≥ 0. (3.21) Fix t ≥ 0. Since ν is arbitrary, then one can take ν = µ(t) in the last equation. Therefore, at time t, both µ(t) = (µ 1,c (t), µ 1,p (t), . . . , µ r,c (t), µ r,p (t)) and η(t) = (η 1,c (t), η 1,p (t), . . . , η r,c (t), η r,p (t)) solve a Kolmogorov forward equation with the same rate matrix A µ(t) . Moreover, recall that the relative entropy function has descent property along the solution to the linear Kolmogorov forward equation. In particular, the proof of this relies on the fact that two solutions to the linear equation with different initial conditions satisfy the forward equation with the same fixed rate matrix; See, e.g., the proof in (Budhiraja et al., 2015a, Lem 3.1). Therefore, since this fact is also satisfied here the descent property of the relative entropy function still holds true and thus d dt F (µ(t)) = d dt r j=1 α j p c j R µ j,c (t) π j,c (ν j,c ) + α j p p j R µ j,p (t) π j,p (ν j,p ) ν=µ(t) ≤ 0.
Using again (Budhiraja et al., 2015a, Lem 3.1.) gives that d dt F (µ(t)) = 0 if and only if µ j,ι (t) = π j,ι (µ j,ι (t)), for all ι ∈ {c, p} and 1 ≤ j ≤ r. ✷ The descent property established in Proposition 3.4 together with the observation that F (q) is C 1 and uniformly continuous shows that, if F is positive definite in the neighborhood Γ of a fixed point φ of the McKean-Vlasov system in (2.1), then F (q) is a local Lyapunov function in the sense of Definition 3.4. In such a case, the corresponding fixed point φ is locally stable by Proposition 3.1.

Remark 1
1. If the McKean-Vlasov system in (2.1) contains multiple fixed points, then under positive definiteness assumptions, the function F serves as a Lyapunov function for all of these points and thus allows studying their local stability. Note that if the McKean-Vlasov system in (2.1) contains multiple (local) stable equilibria, one can investigate the metastability of the corresponding N -particles system. It amounts to studying the transitions of the empirical vector µ N between the different attractor states as time becomes large. Note that these transitions happen even though the existence of a unique invariant measure for µ N . It is then of interest to estimate quantities such as the mean time spent by the process near a stable point, the probability of reaching a given stable point before reaching another one, or also the probability of transiting between a collection of ω-limit sets in a particular order, and so on. Notice that the metastability analysis was conducted in Dawson et al. (2021) for the general finite-state mean-field systems on block graphs under the classical conditions of Friedlin and Wentzell Freidlin & Wentzell (2012).
2. In the general case studied in Dawson et al. (2020Dawson et al. ( , 2021, the explicit form of the invariant measure is in general not available. Therefore, one cannot evaluate the N → ∞ limit of the function F N (q) defined in (3.5). An alternative approach consists in evaluating the limit of F N t (q) = 1 N R ⊗ N c 1 q 1,c ⊗ N p 1 q 1,p · · · ⊗ N c r q r,c ⊗ N p r q r,p p p p N (t) , as N → ∞ and t → ∞ where p p p N (t) is a multi-exchangeable probability distribution of x x x N (t) = x n (t), x m (t), n ∈ C c j , m ∈ C p j , 1 ≤ j ≤ r at time t. Then, providing that this limit takes a useful form, one can investigate its Lyapunov properties. Note that this approach was shown to work in Budhiraja et al. (2015a,b) for some family of finite-state mean-field models on complete graphs. The main condition is the existence of large deviations principle for the empirical measure. It is then of interest to apply similar strategy to our multi-class setting. One need to rely on the large deviations principles for the empirical vector process, which was proved to hold in Dawson et al. (2021). However this goes beyond the scope of this paper.