Article

Construction of an Optimal Strategy: An Analytic Insight Through Path Integral Control Driven by a McKean–Vlasov Opinion Dynamics

by
Paramahansa Pramanik
Department of Mathematics and Statistics, University of South Alabama, 411 University Boulevard North, Mobile, AL 36688, USA
Mathematics 2025, 13(17), 2842; https://doi.org/10.3390/math13172842
Submission received: 3 July 2025 / Revised: 17 August 2025 / Accepted: 21 August 2025 / Published: 3 September 2025
(This article belongs to the Section D1: Probability and Statistics)

Abstract

In this paper, we construct a closed-form optimal strategy within a social network using stochastic McKean–Vlasov dynamics. Each agent independently minimizes their dynamic cost functional, driven by stochastic differential opinion dynamics. These dynamics reflect agents' opinion differences from others and from their own past opinions, with random influences and stubbornness adding to the volatility. To gain analytic insight into the optimal feedback opinion, we employ a Feynman-type path integral approach with an appropriate integrating factor, marking a novel methodology in this field. Additionally, we utilize a variant of the Friedkin–Johnsen-type opinion dynamics to derive a closed-form optimal strategy for an agent and conduct a comparative analysis.

1. Introduction

In recent years, there has been growing interest in the study of opinion dynamics, which explores how the beliefs of a typically large population of individuals evolve over time through repeated interactions within a social network [1]. In continuous opinion dynamics, beliefs (referred to interchangeably as opinions) are depicted as scalar values or vectors, with each individual's belief moving towards a weighted average of other agents' beliefs, reflecting the influence of social interactions [2]. Traditional models generally suggest that, given a connected social network, consensus among agents will eventually be reached over time [3]. However, exceptions arise in bounded confidence models, where agents disregard the influence of others whose beliefs diverge significantly from their own, and in models featuring stubborn agents who resist changing their opinions but can still sway the opinions of others. These stubborn agents may represent leaders, political parties, or media sources aiming to influence the beliefs of the broader population [4]. Studies on scaling limits indicate that in a homogeneous population, as the population size increases, the observed distribution of beliefs tends to converge towards the solution of a specific deterministic mean-field differential equation in the space of probability measures. These findings align with the concept of the propagation of chaos in interacting particle systems [3]. It is well understood that when agents interact, their opinions tend to converge towards an average [5]. Global interactions often lead to consensus, while local interactions typically result in clusters of similar opinions. This clustering happens when agents only engage with neighbors who share their views, avoiding those with differing opinions. To understand the differences in final opinion values, researchers have developed a Lagrangian model of dissensus, employing graph theory and stochastic stability theory [6].
Social networks exert significant influence on various aspects of behavior, such as educational attainment [7], employment opportunities [8], technology uptake [9], consumer habits [10], and even smoking habits [11,12]. Since social networks emerge from individual decisions, understanding consensus plays a pivotal role in deciphering their formation. Despite extensive theoretical exploration of social networks, there has been minimal investigation into consensus as a Nash equilibrium within stochastic networks. Ref. [12] characterizes the network as a simultaneous-move game, where social ties are formed based on the utility effects of indirect connections, and introduces a computationally feasible method for partially identifying large social networks. The statistical examination of network formation traces back to pioneering work [13], where a random graph is constructed with independent links governed by a fixed probability. Beyond the Erdős–Rényi model, numerous techniques have been devised to simulate graphs exhibiting characteristics such as varying degree distributions, small-world properties, and Markovian traits. Model-based approaches are valuable if they can be effectively fitted and yield realistic simulations. The Exponential Random Graph Model (ERGM) is widely used due to its ability to capture observed network statistics [14,15,16,17]. However, the ERGM lacks the microfoundations crucial for counterfactual analyses, and economists view networks as the outcome of rational agents seeking to optimize their utilities. Another popular framework involves viewing networks as evolving through stochastic processes, where the focus lies on the parameters governing the process rather than on individual network realizations [16].
A distinction should be drawn between the classical Nash equilibrium in finite-agent stochastic games and the Mean Field Game (MFG) equilibrium. In a Nash equilibrium, each of the $n$ agents optimizes their cost functional given the strategies of all other agents, and no individual agent can unilaterally reduce their cost. The resulting strategies depend explicitly on the empirical distribution of the population. In contrast, an MFG equilibrium arises in the limit $n \to \infty$, where a representative agent optimizes against the distribution of the population, typically modeled by a McKean–Vlasov process. The MFG equilibrium is then defined by a fixed-point condition: the optimal strategy of the representative agent generates exactly the same distribution that the agent anticipates. This limiting framework was formalized by Lasry and Lions [18,19], and a rigorous probabilistic foundation was later developed by Carmona and Delarue [20,21]. In this paper, we derive the optimal strategy for each finite agent and interpret its limiting behavior as an approximation of the MFG equilibrium when the number of agents is large.
To provide intuition before introducing the technical framework, consider a simplified opinion dynamics model with $n = 50$ agents. Each agent's opinion is represented by a point on a one-dimensional line, where proximity reflects similarity in views. At each time step, agents adjust their opinions toward those of nearby agents, with closer opinions exerting stronger influence. Control inputs may be applied to selected agents to steer the group toward a consensus or desired configuration, while random fluctuations represent unpredictable external influences. Figure 1 illustrates a group of 50 agents represented as dots on a random surface. The surface indicates an underlying field that changes across space. Each agent is connected to the other agents by lines (i.e., shares opinions); these lines show that every agent interacts with the other agents. The colors of the surface reflect variation in the background opinion environment. The positions of the agents can change over time in response to these interactions and the environment.
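To make this toy picture concrete, the following minimal Python sketch simulates such a group on a one-dimensional opinion line, assuming a Gaussian influence weight so that closer opinions exert a stronger pull; the weight function and all parameter values are illustrative choices, not part of the model developed below.

```python
import numpy as np

rng = np.random.default_rng(0)

n, steps, dt = 50, 200, 0.05        # agents, time steps, step size (illustrative)
sigma, kappa = 0.05, 0.3            # noise level, influence range (assumed)
x = rng.uniform(-1.0, 1.0, n)       # opinions on a one-dimensional line

for _ in range(steps):
    diff = x[None, :] - x[:, None]              # diff[i, j] = x_j - x_i
    weight = np.exp(-(diff / kappa) ** 2)       # nearby opinions pull harder
    drift = (weight * diff).mean(axis=1)        # attraction toward neighbors
    x = x + drift * dt + sigma * np.sqrt(dt) * rng.standard_normal(n)

print("opinion spread after simulation:", x.std())
```

With the noise switched off the spread collapses toward clusters, while the stochastic term keeps injecting variability, mirroring the qualitative behavior discussed throughout the paper.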
In this paper, we propose an approach to opinion dynamics within the framework of McKean–Vlasov dynamics. We focus on a stochastic model with time-continuous beliefs comprising a homogeneous population of agents. Pioneering research into McKean–Vlasov equations was conducted by [22,23], who examined uncontrolled Stochastic Differential Equations (SDEs) to establish propagation-of-chaos results. Kac's work delved into mean-field SDEs driven by classical Brownian motion, offering deeper insights into the Boltzmann and Vlasov kinetic equations. Sznitman [24] also contributed significantly to the understanding of this equation. While subsequent authors have expanded on this research, there has been a notable surge in interest over the past decade, largely due to its intersection with Mean Field Game (MFG) theory, which was independently introduced by [19,25]. The McKean–Vlasov equation is instrumental in analyzing scenarios where numerous identical agents, interacting via the empirical distribution of their state variables, aim for a Nash equilibrium. Conversely, for agents seeking a Pareto equilibrium, stochastic control of McKean–Vlasov SDEs is considered. While these two equilibria are related, nuances exist, as highlighted by Carmona et al. [26]. As indicated by these studies, as the number of agents tends towards infinity, the individual agents' opinions are expected to evolve independently. Each agent's opinion is governed by a specific stochastic differential equation whose coefficients depend on the statistical distribution of the agent's private state. The optimization of each agent's objective function under this new dynamic state amounts to the stochastic control of McKean–Vlasov dynamics, a mathematical problem that is not yet fully understood; see [27] for an initial exploration in this direction. Returning to our original inquiry, one might ask whether optimizing feedback strategies in this new control scenario yields a form of approximate equilibrium for the initial $N$-agent game. If such a problem can be resolved, the crux lies in understanding how an optimal feedback strategy in this context aligns with the outcomes of MFG theory.
Furthermore, it is possible to transform a class of non-linear HJB equations into linear equations through a logarithmic transformation. This technique dates back to the early days of quantum mechanics, where Schrödinger first used it to connect the HJB equation to the Schrödinger equation. Due to this linearity, backward integration of the HJB equation over time can be replaced by computing expectation values under a forward diffusion process, which involves stochastic integration over trajectories described by a path integral [28,29,30]. In more complex cases, such as the Merton–Garman–Hamiltonian system, finding a solution through Pontryagin's maximum principle is impractical, but the Feynman path integral method provides a solution. Previous works using the Feynman path integral method include applications in motor control theory [31,32,33,34]. The use of Feynman path integrals in finance has been extensively discussed in [35]. Additionally, ref. [36] introduced a Feynman-type path integral to determine a feedback control.
Classical mean-field game theory, initiated by Lasry and Lions [18,19,37], provides a general framework for studying Nash equilibria in large populations of agents where the cost functional depends on the state distribution. In that setting, the analysis is typically conducted through a coupled system of HJB and Fokker–Planck (FP) equations. By contrast, the present work is based on controlled McKean–Vlasov dynamics, in which the emphasis is placed on the stochastic evolution of interacting particles and their empirical measure. This perspective avoids the need to solve the full PDE system and is closer to probabilistic methods for interacting diffusions. On the numerical side, a considerable body of work has focused on approximating mean-field equilibria using grid-based PDE solvers [38,39], as well as particle methods and deep learning approaches [40,41,42]. Our contribution is complementary to these developments: rather than approximating the Lasry–Lions equations directly, we formulate an optimal control problem over the particle system itself, with numerical implementation carried out through simulation of interacting agents. This approach allows us to capture finite-population effects and to design control strategies that remain interpretable in terms of agent-level interactions.
From a numerical standpoint, the literature on the mean-field approach has largely been developed along two complementary directions. On the one hand, finite-difference and related PDE-based solvers have been extensively applied to the Lasry–Lions system, yielding accurate approximations of the coupled HJB and FP equations on discretized grids [38,39]. While these approaches provide precise resolution in low-dimensional settings, their computational cost grows rapidly with the dimension of the state space, which limits their applicability in high-dimensional environments. On the other hand, simulation-based approaches, often leveraging interacting particle systems, provide a more scalable alternative. By directly evolving a large but finite set of controlled dynamics, such methods approximate the mean-field limit probabilistically and naturally incorporate stochasticity and finite-population effects. More recently, machine learning and deep reinforcement learning techniques [40,41,42] have extended this particle-based philosophy by introducing data-driven function approximations, enabling numerical treatment of high-dimensional and non-linear problems. The methodology developed in this paper belongs to this latter class; rather than discretizing PDEs, we simulate controlled particle systems and analyze their empirical distributions, thereby retaining interpretability at the agent level while avoiding the curse of dimensionality inherent in PDE solvers.
The rest of this paper is structured as follows: In Section 3, we formulate the opinion dynamics along with the cost functional. Section 4 outlines the assumptions and essential properties of stochastic McKean–Vlasov dynamics. In Section 5, we derive the deterministic Hamiltonian and stochastic Lagrangian for the system. Section 6 presents the main results related to Feynman-type path integral control and its applications to the Friedkin–Johnsen-type model. Finally, Section 7 concludes the paper and suggests possible future extensions.

2. Notations and Symbols

For convenience, the main notations and symbols used throughout the paper (including Appendix A) are summarized below.
Symbol | Description
$n$ | Number of agents in the population.
$i, j$ | Agent indices, $i, j \in \{1, \dots, n\}$.
$s, t$ | Continuous time variables.
$x_i(s)$ | Opinion state of agent $i$ at time $s$.
$x_0^i$ | Initial opinion of agent $i$.
$\eta_i$ | Set of neighbors of agent $i$, defined as $\eta_i := \{ j \in N : (i,j) \in E \}$.
$\mathcal{A}_i$ | Neighborhood of agent $i$, i.e., $\{ x_j(s) : \| x_i(s) - x_j(s) \|_2^2 \le r \}$.
$\theta = (\theta_1, \theta_2)^T$ | Parameter vector of the interaction kernel.
$\theta_1$ | Scale parameter of the kernel.
$\theta_2$ | Range parameter of the kernel.
$\phi_\theta(\beta)$ | Interaction kernel governing influence between agents (Equation (7)).
$\alpha(s)$ | Time-varying weight on self-opinion dynamics (decay rate).
$u_i(s)$ | Control input of agent $i$ at time $s$, $u_i(s) \in \mathcal{U}([0,t])$.
$u_i^*(s)$ | Optimal control strategy of agent $i$ (Equation (16)).
$h_i$ | Feedback control function mapping agent states to admissible controls (Assumption 3).
$\sigma_i$ | Noise intensity (volatility) for agent $i$.
$B_i(s)$ | Standard Brownian motion for agent $i$.
$\gamma$ | Measure parameter in the mean-field formulation.
$w_{ij}$ | Peer influence weight of agent $j$ on agent $i$.
$k_i$ | Stubbornness parameter of agent $i$.
$L_i(s, x, u_i)$ | Cost functional of agent $i$ (Equation (1)).
$L_0, L_0^i$ | Initial cost functional of society and of agent $i$, assumed concave (Assumption 4).
$\mathcal{U}$ | Admissible control space (convex, open subset of $\mathbb{R}^n$).
$\mathcal{Z}$ | Information (knowledge) space of the society (Assumption 4).
$\mathcal{Z}_i$ | Information subset available to agent $i$.
$\mathbb{E}[\cdot]$ | Expectation operator.
$\mathbb{E}_0[\cdot]$ | Conditional expectation given the initial state $x_0^i$.
$V_i(s, x)$ | Value function of agent $i$ solving the HJB equation.
$\lambda_i(s)$ | Adjoint (co-state) variable from stochastic optimal control.
$T_1, T_2, T_3$ | Coefficients in the quadratic-formula representation of $u_i^*(s)$ (Equation (16)).
$\frac{d}{dx_i}\phi_\theta, \ \frac{d^2}{d(x_i)^2}\phi_\theta$ | First and second derivatives of the kernel (used in the optimality conditions).
$\mathcal{N}(0,1)$ | Standard Gaussian distribution, used for noise terms.
$\mathcal{M}$ | Set of probability measures on $\mathbb{R}$.
$\| \cdot \|$ | Euclidean norm.
$\Delta$ | Laplacian operator.
$\partial_s$ | Partial derivative with respect to time.
$P(x_i)$ | Probability law of the opinion of agent $i$.

3. Construction of a Stochastic Differential Game of Opinion Dynamics

Following [43], consider a social network of $n$ agents represented by a weighted directed graph $G = (N, E, w_{ij})$, where $N = \{1, \dots, n\}$ is the set of all agents, $E \subseteq N \times N$ is the set of all ordered pairs of connected agents, and $w_{ij}$ is the influence of agent $j$ on agent $i$ for all $(i,j) \in E$. There are two types of connections, one-sided and two-sided. For the principal–agent problem, the connection is one-sided (i.e., the Stackelberg model), and for the agent–agent problem, it is two-sided (i.e., the Cournot model). Suppose $x_i(s) \in [0,1]$ is the opinion of the $i$th agent at time $s \in [0,t]$, with initial opinion $x_i(0) = x_0^i \in [0,1]$. The opinion $x_i(s)$ is normalized into $[0,1]$, where $x_i(s) = 0$ stands for strong disagreement, $x_i(s) = 1$ represents strong agreement, and all intermediate levels of agreement lie in between. Let $x(s) = [x_1(s), x_2(s), \dots, x_n(s)]^T \in [0,1]^n$ be the opinion profile vector of the $n$ agents at time $s$, where $T$ denotes transposition. Following [43], define the cost function of agent $i$ as
$$L_i(s, x, u_i) := \mathbb{E}\left\{ \frac{1}{2} \int_0^t \left[ \sum_{j \in \eta_i} w_{ij} \big( x_i(s) - x_j(s) \big)^2 + k_i \big( x_i(s) - x_0^i \big)^2 + \big( u_i(s) \big)^2 \right] ds \right\},$$
where $w_{ij} \in [0, \infty)$ is a parameter that weighs agent $j$'s influence on agent $i$, $k_i \in [0, \infty)$ is agent $i$'s stubbornness, $u_i(s) \in \mathcal{U}([0,t])$ is an adaptive control process of agent $i$ taking values in a convex open set in $\mathbb{R}^n$, and the set of all agents with whom $i$ interacts is $\eta_i := \{ j \in N : (i,j) \in E \}$. In this paper, $u_i$ represents agent $i$'s control over their own opinion as well as their influence on other agents' opinions. The cost function $L_i(s, x, u_i)$ is twice differentiable with respect to time (in order to admit a Wick rotation), continuously differentiable with respect to the $i$th agent's control $u_i(s)$, non-decreasing in opinion $x_i(s)$, non-increasing in $u_i(s)$, and convex and continuous in all opinions and controls [44,45]. The opinion dynamics of agent $i$ follow a McKean–Vlasov stochastic differential equation,
$$dx_i(s) = \mu_i[s, x_i(s), P(x_i), u_i(s)]\, ds + \sigma_i[s, x_i(s), P(x_i), u_i(s)]\, dB_i(s),$$
with the initial condition $x_0^i$, where $\mu_i$ and $\sigma_i$ are the drift and diffusion functions and $P(x_i)$ is the probability law of the opinion of agent $i$, with Brownian motion $B_i = \{B_i(s), s \in [0,t]\}$. The reason for incorporating Brownian motion into agent $i$'s opinion dynamics is Hebbian learning, which states that neurons increase the synaptic connection strength between them when they are active simultaneously; this behavior is probabilistic in the sense that resource availability from a particular place is random [46,47]. For example, for a given stubbornness and influence from agent $j$, agent $i$'s opinion dynamics have some randomness: agent $i$ may learn from other sources that the information provided through agent $j$'s influence is misleading. Moreover, if humans are regarded as automatons, motor control and foraging for food become prominent examples of cost (or expected return) minimization [47]. Since control problems such as motor control are stochastic in nature, with noise entering the relation between muscle contraction and the actual displacement of the joints as the information environment changes over time, we consider the Feynman path integral approach to determine the stochastic control after assuming the opinion dynamics of Equation (2) [48,49]. The coefficient of the control term in Equation (1) is normalized to 1 without loss of generality. The cost functional in Equation (1) is viewed as a model of the motive of agent $i$ towards a prevailing social issue [43]. The aim of this paper is to characterize a feedback Nash equilibrium $u_i^* \in \mathcal{U}([0,t])$ such that
$$L_i(u_i^*) = \arg\min_{u_i \in \mathcal{U}([0,t])} \mathbb{E}_s\left[ L_i(s, x, u_i) \,\middle|\, \mathcal{F}_0^x \right],$$
subject to Equation (2), where $\mathbb{E}_0(L_i \mid \mathcal{F}_0^x)$ represents the expectation of $L_i$ at time 0 subject to agent $i$'s opinion filtration $\mathcal{F}_0^x$ generated by the Brownian motion $B_i$ starting at the initial time 0, on a complete probability space $(\Omega, \mathcal{F}, \mathcal{F}_0^x, P)$. A solution to this problem is a feedback Nash equilibrium, since the control of agent $i$ is updated based on the opinion at the same time $s$.
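As a numerical illustration of the control problem (1)–(3), the following sketch estimates agent $i$'s expected cost by Euler–Maruyama simulation of a dynamic of the form (2). The drift, the feedback rule, and the frozen neighbor opinions are placeholder assumptions for illustration only; they are not the closed-form objects derived later in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def expected_cost(w, k, x0, t=1.0, steps=200, paths=500, sigma=0.1):
    """Monte Carlo estimate of the quadratic cost (1) for agent i.

    w  : peer weights w_ij over the neighbors of agent i
    k  : stubbornness k_i
    x0 : initial opinions; x0[0] is agent i, the rest are neighbors
         (held fixed here for simplicity).
    """
    dt = t / steps
    total = 0.0
    for _ in range(paths):
        x = np.array(x0, dtype=float)
        c = 0.0
        for _ in range(steps):
            u = -0.5 * (x[0] - x[1:].mean())            # illustrative feedback rule
            drift = (w * (x[1:] - x[0])).sum() + u      # placeholder for mu_i in (2)
            c += 0.5 * ((w * (x[0] - x[1:]) ** 2).sum()
                        + k * (x[0] - x0[0]) ** 2 + u ** 2) * dt
            x[0] += drift * dt + sigma * np.sqrt(dt) * rng.standard_normal()
        total += c
    return total / paths

print(expected_cost(w=np.array([0.3, 0.2]), k=0.5, x0=[0.2, 0.6, 0.8]))
```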

4. Preliminaries

Let $t > 0$ be a fixed finite horizon. Assume $B_i = \{B_i(s)\}_{s=0}^t$ is a one-dimensional Brownian motion defined on a probability space $(\Omega, \mathcal{F}, P)$, and $\mathcal{F}_s = \{\mathcal{F}_s^x\}_{s=0}^t$ is its natural filtration augmented with an independent $\sigma$-algebra $\mathcal{F}_0^x$, where $P$ is the probability law defined above. The McKean–Vlasov stochastic opinion dynamics of agent $i$ are represented by Equation (2), where the drift and diffusion coefficients of opinion $x_i(s)$ are given by a pair of deterministic functions $(\mu_i, \sigma_i) : [0,t] \times \mathbb{R} \times \mathcal{P}_2(\mathbb{R}) \times U \to \mathbb{R} \times \mathbb{R}$, and $u_i = \{u_i(s)\}_{s=0}^t$ is the admissible control of agent $i$, assumed to be a progressively measurable process with values in a measurable space $(U, \mathcal{U}^*)$. Here $U$ is an open subset of a Euclidean space $\mathbb{R}$, and $\mathcal{U}^*$ is the $\sigma$-field induced by the Borel $\sigma$-field of the same Euclidean space [50]. For a metric space $E$ with Borel $\sigma$-field $\mathcal{E}$, we write $\mathcal{P}(E)$ for the set of all probability measures on $(E, \mathcal{E})$, endowed with the topology of weak convergence [51]. If $E$ is a Polish space $G$ with metric $d_G$, then for all $r \ge 1$ define
$$\mathcal{P}_r(G) := \left\{ \gamma \in \mathcal{P}(G) : \int_G d_G(x_0^i, x_i)^r\, \gamma(dx_i) < \infty \right\},$$
where $x_0^i \in G$ is arbitrary. For $r \ge 1$, the $r$-Wasserstein distance $W_r(\gamma, \gamma')$ on $\mathcal{P}_r(G)$ is defined as
$$W_r(\gamma, \gamma') := \inf\left\{ \left[ \int_{G \times G} d_G(x_i, y_i)^r\, \pi(dx_i, dy_i) \right]^{\frac{1}{r}} : \ \pi \in \mathcal{P}(G \times G) \ \text{such that} \ \pi(\,\cdot \times G) = \gamma \ \text{and} \ \pi(G \times \cdot\,) = \gamma' \right\},$$
for all $\gamma, \gamma' \in \mathcal{P}_r(G)$. The space $(\mathcal{P}_r(G), W_r)$ is indeed a Polish space [51]. The term 'non-linear', used to describe Equation (2), does not mean that the drift ($\mu_i$) and diffusion ($\sigma_i$) coefficients are non-linear functions of the state; rather, they depend not only on the value of the unknown process $x_i(s)$ but also on its distribution $P(x_i)$ [50]. In the $r$-Wasserstein distance $W_r(\gamma, \gamma')$, the infimum is taken over all couplings $\pi \in \mathcal{P}(G \times G)$ such that the marginal laws
$$\pi(A \times G) = \gamma(A) \quad \text{and} \quad \pi(G \times A) = \gamma'(A), \quad \text{for all Borel sets } A \subseteq G,$$
are satisfied; in other words, $\pi$ has marginals $\gamma$ and $\gamma'$. To lighten the notation, we write "$\cdot$" in place of the set argument "$A$". In this paper, $\gamma = P(x)$.
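For one-dimensional opinions, the infimum over couplings in the $r$-Wasserstein distance is attained by the quantile (sorted-sample) coupling, which gives a simple way to compute $W_r$ between empirical measures. A minimal sketch, assuming equal sample sizes:

```python
import numpy as np

def wasserstein_r(xs, ys, r=2):
    """W_r between two empirical measures on the line (equal sample sizes).

    In one dimension the optimal coupling pairs sorted samples (the quantile
    coupling), so the infimum over couplings pi reduces to a sorted-difference
    formula.
    """
    xs, ys = np.sort(xs), np.sort(ys)
    return np.mean(np.abs(xs - ys) ** r) ** (1.0 / r)

rng = np.random.default_rng(2)
a = rng.normal(0.0, 1.0, 2000)
b = rng.normal(0.5, 1.2, 2000)
print(wasserstein_r(a, b, r=2))   # near the Gaussian value sqrt(0.5**2 + 0.2**2)
```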
Define the set $\mathcal{U}$ of admissible controls $u_i$ as the set of $U$-valued progressively measurable processes $u_i \in \mathcal{H}^{2,\tilde m}$, where $\mathcal{H}^{2,\tilde m}$ is the Hilbert space
$$\mathcal{H}^{2,\tilde m} := \left\{ y_i \in \mathcal{H}^{0,\tilde m} \, ; \ \mathbb{E}\int_0^t |y_i(s)|^2\, ds < \infty \right\},$$
with $\mathcal{H}^{0,\tilde m}$ being the collection of all $\mathbb{R}^{\tilde m}$-valued progressively measurable processes on $[0,t]$. Let $\mathcal{V}$ be a sub-$\sigma$-algebra of $\mathcal{F}$ so that the following assumption holds.
Assumption 1. (i). $\mathcal{V}$ and the filtration $\mathcal{F}$ generated by the Brownian motion $B_i(s)$ of the $i$th agent are independent.
  • (ii). $\mathcal{V}$ is "rich enough" in the sense of the following condition:
$$\mathcal{P}_2\big(C([0,t], \mathcal{H}^{2,\tilde m})\big) = \Big\{ P(Y) : \ Y : [0,t] \times \Omega \to \mathcal{H}^{2,\tilde m} \ \text{continuous and} \ \mathcal{B}([0,t]) \otimes \mathcal{V}\text{-measurable, satisfying} \ \mathbb{E}\int_0^t |y_i(s)|^2\, ds < \infty \Big\}.$$
In other words, for every $\gamma \in \mathcal{P}_2(C([0,t], \mathcal{H}^{2,\tilde m}))$ there exists a continuous, $\mathcal{B}([0,t]) \otimes \mathcal{V}$-measurable process $y_i : [0,t] \times \Omega \to \mathcal{H}^{2,\tilde m}$ such that $\mathbb{E}\int_0^t |y_i(s)|^2\, ds < \infty$ and $Y$ has law (distribution) equal to $\gamma$.
Lemma 1 
([51]). Let $\tilde{\mathcal{V}}$ be another sub-$\sigma$-algebra of $\mathcal{F}$ on the probability space $(\Omega, \mathcal{F}, P)$. If Assumption 1 holds, then the following statements are equivalent.
  • (i). There exists a $\tilde{\mathcal{V}}$-measurable random variable $z_i : \Omega \to \mathbb{R}$ with the uniform distribution on $[0,1]$.
  • (ii). $\tilde{\mathcal{V}}$ is "rich enough" in the sense of the following condition:
$$\mathcal{P}_2\big(C([0,t], \mathcal{H}^{2,\tilde m})\big) = \Big\{ P(y_i) : \ y_i : [0,t] \times \Omega \to \mathcal{H}^{2,\tilde m} \ \text{continuous and} \ \mathcal{B}([0,t]) \otimes \tilde{\mathcal{V}}\text{-measurable, satisfying} \ \mathbb{E}\int_0^t |y_i(s)|^2\, ds < \infty \Big\}.$$
Remark 1. 
Consider two events $F_1, F_2 \in \tilde{\mathcal{V}}$ in the probability space $(\Omega, \mathcal{F}, P)$ such that $P(F_1) > 0$ and $F_1 \subset F_2$ with $0 < P(F_1) < P(F_2)$ (an atomless space). Then, statements (i) and (ii) in Lemma 1 are equivalent.
Assumption 2. (i) There exists a linear, unbounded operator O : D ( O ) H 2 , m ˜ H 2 , m ˜ , which facilitates a C 0 -semigroup of pseudo-contractions { exp ( s O ) ; s 0 in H 2 , m ˜ .
  • (ii). The drift μ i and the diffusion coefficients σ i are measurable.
  • (iii). There exists a constant C * , such that
$$\begin{aligned} &|\mu_i(s, \gamma, x_i, u_i) - \mu_i(s, \gamma', x_i', u_i')| \le C^* \big( W_2(\gamma, \gamma') + |x_i(s) - x_i'(s)| + |u_i(s) - u_i'(s)| \big), \\ &\qquad \text{for all } (\gamma, x_i, u_i), (\gamma', x_i', u_i') \in \mathcal{P}_2(\mathbb{R} \times U) \times \mathbb{R} \times U, \\ &|\sigma_i(s, \gamma, x_i, u_i) - \sigma_i(s, \gamma', x_i', u_i')| \le C^* \big( W_2(\gamma, \gamma') + |x_i(s) - x_i'(s)| + |u_i(s) - u_i'(s)| \big), \quad \text{for all } \gamma, \gamma' \in \mathcal{P}_2(\mathbb{R}), \\ &|L_i(s, x, u_i) - L_i(s, x', u_i')| \le C^* \big( W_2(\gamma, \gamma') + |x_i(s) - x_i'(s)| + |u_i(s) - u_i'(s)| \big), \\ &\qquad \text{for all } (\gamma, x_i, u_i), (\gamma', x_i', u_i') \in \mathcal{P}_2(\mathbb{R} \times U) \times \mathbb{R} \times U. \end{aligned}$$
  • (iv). $\sigma_i$ is differentiable in $(\gamma, x_i, u_i) \in \mathcal{P}_2(\mathbb{R} \times U) \times \mathbb{R} \times U$, and the derivative $\partial_\gamma \sigma_i : \mathcal{P}_2(\mathbb{R} \times U) \times \mathbb{R} \times U \to \mathbb{R} \times U$ is bounded and Lipschitz continuous. Hence, there exists a positive constant $C^*$ such that
$$\begin{aligned} &|\partial_\gamma \sigma_i(s, \gamma, x_i, u_i)| \le C^*, \quad \text{for all } (\gamma, x_i, u_i) \in \mathcal{P}_2(\mathbb{R} \times U) \times \mathbb{R} \times U, \\ &|\partial_\gamma \sigma_i(s, \gamma, x_i, u_i) - \partial_\gamma \sigma_i(s, \gamma', x_i', u_i')| \le C^* \big( W_2(\gamma, \gamma') + |x_i(s) - x_i'(s)| + |u_i(s) - u_i'(s)| \big), \\ &\qquad \text{for all } (\gamma, x_i, u_i), (\gamma', x_i', u_i') \in \mathcal{P}_2(\mathbb{R} \times U) \times \mathbb{R} \times U. \end{aligned}$$
  • (v). Let $\omega_i$ stand for either $\mu_i$ or $L_i$. Then $\omega_i$ is differentiable in $(\gamma, x_i, u_i) \in \mathcal{P}_2(\mathbb{R} \times U) \times \mathbb{R} \times U$, and the derivatives $\partial_\gamma \omega_i : \mathcal{P}_2(\mathbb{R} \times U) \times \mathbb{R} \times U \times (\mathbb{R} \times U) \to \mathbb{R} \times U$, $\partial_x \omega_i : \mathcal{P}_2(\mathbb{R} \times U) \times \mathbb{R} \times U \to \mathbb{R}$, and $\partial_u \omega_i : \mathcal{P}_2(\mathbb{R} \times U) \times \mathbb{R} \times U \to U$ are bounded and Lipschitz continuous. Hence, for a given $s \in [0,t]$, there exists a constant $C^* > 0$ such that
$$\begin{aligned} &|\partial_\gamma \omega_i(\gamma, x_i, u_i)| + |\partial_x \omega_i(\gamma, x_i, u_i)| + |\partial_u \omega_i(\gamma, x_i, u_i)| \le C^*, \\ &|\partial_\gamma \omega_i(\gamma, x_i, u_i) - \partial_\gamma \omega_i(\gamma', x_i', u_i')| \le C^* \big( W_2(\gamma, \gamma') + |x_i - x_i'| + |u_i - u_i'| \big), \\ &|\partial_x \omega_i(\gamma, x_i, u_i) - \partial_x \omega_i(\gamma', x_i', u_i')| \le C^* \big( W_2(\gamma, \gamma') + |x_i - x_i'| + |u_i - u_i'| \big), \\ &|\partial_u \omega_i(\gamma, x_i, u_i) - \partial_u \omega_i(\gamma', x_i', u_i')| \le C^* \big( W_2(\gamma, \gamma') + |x_i - x_i'| + |u_i - u_i'| \big), \end{aligned}$$
for all $(\gamma, x_i, u_i), (\gamma', x_i', u_i') \in \mathcal{P}_2(\mathbb{R} \times U) \times \mathbb{R} \times U$.
Assumption 3. 
Under a feedback control structure of a society, there exists a measurable function $h_i : [0,t] \times C([0,t]; \mathbb{R}) \times U \to U$ for which $u_i(s) = h_i[x_i(s), u_i]$ ensures that Equation (2) admits a solution.
Remark 2. 
This assumption is standard in stochastic control theory, where feedback (or closed-loop) controls guarantee measurability and admissibility of control strategies [52,53]. In the context of social networks, feedback-type control structures capture the fact that agents adjust their opinions dynamically based on their current state and the observed states of their neighbors. Empirical and theoretical studies in opinion dynamics also support this modeling choice, since adaptive responses to social influence are more realistic than open-loop strategies [1,54,55]. Such assumptions have also been employed in the control of mean-field models and networked systems, where measurable feedback ensures both tractability and real-world interpretability [56].
Assumption 4. (i) Let $\mathcal{Z}$ denote the information (or knowledge) space of the society, assumed to be a measurable space. For each agent $i$, we define an information set $\mathcal{Z}_i \subseteq \mathcal{Z}$, which represents the collection of signals or knowledge available to agent $i$ at time $s \in [0,t]$. The family $\{\mathcal{Z}_i\}_{i=1}^n$ is such that $\mathcal{Z}_i \neq \mathcal{Z}_j$ whenever $i \neq j$, reflecting heterogeneity in access to information across agents. Moreover, we assume that each $\mathcal{Z}_i$ is non-empty, measurable, and evolves monotonically with respect to time, i.e., $\mathcal{Z}_i(s_1) \subseteq \mathcal{Z}_i(s_2)$ whenever $s_1 < s_2$.
(ii) The initial cost functional of the society is given by
$$L_0 : [0,t] \times \mathbb{R} \times U \to \mathbb{R},$$
which is concave in its arguments. For each agent $i$, the individual cost functional is denoted $L_0^i : [0,t] \times \mathbb{R} \times U \to \mathbb{R}$, satisfying $L_0^i \le L_0$, and the concavity of $L_0^i$ is assumed. This condition is equivalent to the Slater condition (see [57]), ensuring feasibility and dual attainability in the associated optimization problem.
(iii) For each admissible control u i ( · ) U ( [ 0 , t ] ) , there exists ϵ > 0 sufficiently small, such that the quadratic form
$$\mathbb{E}_0\left\{ \int_0^t \frac{1}{2} \left[ \sum_{j \in \eta_i} w_{ij} \big( x_i(s) - x_j(s) \big)^2 + k_i \big( x_i(s) - x_0^i \big)^2 + \big( u_i(s) \big)^2 \right] ds \right\} \ge \epsilon,$$
holds for all $i = 1, \dots, n$ with $i \neq j$, where $\mathbb{E}_0[\cdot] = \mathbb{E}[\cdot \mid x_0^i]$. This ensures a uniform coercivity condition on the expected quadratic cost.
Remark 3. 
Assumption 4 reflects the realistic heterogeneity of information across agents in social networks, a feature emphasized in the literature on bounded confidence models, and Bayesian learning in networks [1]. The monotonicity of information sets formalizes the idea that knowledge is non-decreasing over time, consistent with cumulative learning models [58]. The concavity of the cost functional L 0 and its agent-specific restrictions L 0 i parallels standard convex optimization assumptions (e.g., Slater condition), ensuring well-posedness of equilibrium and duality [57]. Finally, the uniform coercivity condition in (iii) is common in quadratic cost formulations of stochastic control and mean-field games [20,56], guaranteeing the stability of optimal strategies and preventing degenerate solutions.
Remark 4. 
Assumption 3 guarantees the possibility of at least one fixed point in the knowledge space. It is important to note that the agent makes decisions based on all available information. Then, the following Lemma 2 shows that the fixed point is indeed unique. Assumption 4 implies that each agent has some initial cost functional L 0 i at the beginning of [ 0 , t ] , and the conditional expected cost functional E 0 { L i } is positive throughout this time interval.
Lemma 2. 
Suppose the $i$th agent's initial opinion $x_0^i \in G$ is independent of $B_i(s)$, and $\mu_i$ and $\sigma_i$ satisfy Assumptions 1 and 2. Then there exists a unique solution to the opinion dynamics represented by Equation (2) in $\mathcal{H}^{2,\tilde m}$. Moreover, for some positive constant $\hat{c}$ depending on the time $t$ and the Lipschitz constants of $\mu_i$ and $\sigma_i$, the unique solution satisfies
$$\mathbb{E}\left\{ \sup_{s \in [0,t]} |x_i(s)|^2 \right\} \le \hat{c}\, \big( 1 + \mathbb{E}|x_0^i|^2 \big) \exp(\hat{c}\, t),$$
for all $i = 1, 2, \dots, n$.
Proof. 
See Appendix A. □
Remark 5. 
Lemma 2 guarantees that the stochastic opinion dynamics in Equation (2) exhibit a unique fixed point, and the expectation is bounded in the Polish space $G$.
Assume the set of admissible strategies $\mathcal{U}([0,t])$ is convex and $u_i \in \mathcal{U}([0,t])$. Define $x_i(s) := x_i(s, u_i)$ as the optimal opinion, which is the solution of Equation (2) with initial opinion $x_0^i$. The first objective is to determine the Gâteaux derivative of the cost functional $L_i(s, x, u_i)$ at $u_i$ in all directions. Consider another admissible strategy $u_i' \in \mathcal{U}([0,t])$ and set $v_i(s) = u_i'(s) - u_i(s)$; hence, $v_i \in \mathcal{U}([0,t])$, and $v_i$ can be regarded as the direction of the Gâteaux derivative of $L_i(s, x, u_i)$ [59]. For every sufficiently small $\epsilon > 0$, define a strategy $u_i^\epsilon(s) = u_i(s) + \epsilon v_i(s)$ and the corresponding controlled opinion vector $x^\epsilon := x^\epsilon(s, u_i)$. Furthermore, define the variational process $V = \{V_i(s)\}_{s=0}^t$ as the solution of the equation
$$\begin{aligned} dV_i(s) = \Big[ &\partial_{x_i} \mu_i\big(s, x_i(s), P(x_i), u_i(s)\big)\, V_i(s) + \zeta\big(s, P(x_i, V_i)\big) + \partial_{u_i} \mu_i\big(s, x_i(s), P(x_i), u_i(s)\big) \Big]\, ds \\ + \Big[ &\partial_{x_i} \sigma_i\big(s, x_i(s), P(x_i), u_i(s)\big)\, V_i(s) + \hat{\zeta}\big(s, P(x_i, V_i)\big) + \partial_{u_i} \sigma_i\big(s, x_i(s), P(x_i), u_i(s)\big) \Big]\, dB_i(s), \end{aligned}$$
where
$$\zeta(\cdot) = \tilde{\mathbb{E}}\Big[ \partial_\gamma \mu_i\big(s, x_i(s), P(x_i), u_i(s)\big)\big(\tilde{x}_i(s)\big) \cdot \hat{V}_i(s) \Big]\Big|_{x_i = x_i(s),\, u_i = u_i(s)},$$
and
$$\hat{\zeta}(\cdot) = \tilde{\mathbb{E}}\Big[ \partial_\gamma \sigma_i\big(s, x_i(s), P(x_i), u_i(s)\big)\big(\tilde{x}_i(s)\big) \cdot \hat{V}_i(s) \Big]\Big|_{x_i = x_i(s),\, u_i = u_i(s)},$$
with $\{\tilde{x}_i(s), \hat{V}_i(s)\}$ being an independent copy of $\{x_i(s), V_i(s)\}$. Fréchet differentiability has been used to define $\tilde{\mathbb{E}}$. This type of functional-analytic differentiability was introduced by Pierre-Louis Lions at the Collège de France [50,59]. It is based on the lifting of functions $\mathcal{P}_2(\mathbb{R}^n) \ni \gamma \mapsto H(\gamma)$ into functions $\hat{H}$ defined on the Hilbert space $\mathcal{H}^{2,\tilde m}(\tilde{\Omega}; \mathbb{R}^n)$ over some probability space $(\tilde{\Omega}, \tilde{\mathcal{F}}, \tilde{P})$, by setting $\hat{H}(\tilde{x}) = H(\tilde{P}_{\tilde{x}})$ for all $\tilde{x} \in \mathcal{H}^{2,\tilde m}(\tilde{\Omega}; \mathbb{R}^n)$, with $\tilde{\Omega}$ a Polish space and $\tilde{P}$ an atomless measure [50]. Since there are $n$ agents in the system with $n \to \infty$, instead of considering the opinions of the other agents individually, agent $i$ considers the distribution of all opinions in the system, $H(\tilde{P}_{\tilde{x}})$, and forms their opinion accordingly. Therefore, in this case the distribution function of opinions $H$ is said to be differentiable at $\bar{\gamma} \in \mathcal{P}_2(\mathbb{R}^n)$ if there exists a set of random opinions $\tilde{x}^*$ with probability distribution $\bar{\gamma}$ (i.e., $\tilde{P}_{\tilde{x}^*} = \bar{\gamma}$) at which the lifted function $\hat{H}$ is Fréchet differentiable. The Fréchet derivative of $\hat{H}$ at $\tilde{x}^*$ is an element of the Hilbert space $\mathcal{H}^{2,\tilde m}(\tilde{\Omega}; \mathbb{R}^n)$, identified with its dual [50]. One important aspect of Fréchet differentiation in this setting is that the distribution of the derivative depends on $\bar{\gamma}$, not on $\tilde{x}^*$. The Fréchet expansion of $H$ is
$$H(\gamma) = H(\bar{\gamma}) + [D_f \hat{H}](\tilde{x}^*) \cdot (\tilde{x} - \tilde{x}^*) + o\big( \| \tilde{x} - \tilde{x}^* \|_2 \big),$$
where $[D_f \hat{H}](\tilde{x}^*)$ is the Fréchet derivative, the dot denotes the inner product of the Hilbert space over $(\tilde{\Omega}, \tilde{\mathcal{F}}, \tilde{P})$, and $\|\cdot\|_2$ is the norm of that Hilbert space. For a deterministic function $\tilde{g} : \mathbb{R}^n \to \mathbb{R}^n$, it is well understood that a Fréchet derivative of the form $\tilde{g}(\tilde{x}^*)$ is uniquely defined $\bar{\gamma}$-almost everywhere in $\mathbb{R}^n$ [59,60]. The unique equivalence class of $\tilde{g}$ is denoted by $\partial_\gamma H(\bar{\gamma})$, the partial derivative of $H$ at $\bar{\gamma}$, such that
$$\partial_\gamma H(\bar{\gamma})(\cdot) : \mathbb{R}^n \ni y \mapsto \partial_\gamma H(\bar{\gamma})(y) \in \mathbb{R}^n.$$
The partial derivative $\partial_\gamma H(\bar{\gamma})$ allows one to express the Fréchet derivative $[D_f \hat{H}](\tilde{x}^*)$ as a function of any random variable $\tilde{x}^*$ with law $\bar{\gamma}$, irrespective of the particular choice of $\tilde{x}^*$. For example, suppose $H(\gamma) = \int_{\mathbb{R}^n} g(y)\, \gamma(dy) = \langle g, \gamma \rangle$ for some scalar differentiable function $g$ on $\mathbb{R}^n$. Then $\hat{H}(\tilde{y}) = \tilde{\mathbb{E}}[g(\tilde{y})]$ and $D_f \hat{H}(\tilde{y}) \cdot (\tilde{x}) = \tilde{\mathbb{E}}[\nabla g(\tilde{y}) \cdot \tilde{x}]$, so that $\partial_\gamma H(\gamma)$ can be identified with the deterministic function $\nabla g$ [50].
Lemma 3. 
For a small $\epsilon > 0$ and the admissible strategy $u_i^\epsilon$ defined by $u_i^\epsilon(s) = u_i(s) + \epsilon v_i(s)$, with the corresponding opinion of agent $i$ denoted $x^\epsilon := x^\epsilon(s, u_i)$, the following condition holds:
$$\lim_{\epsilon \downarrow 0} \mathbb{E}\left\{ \sup_{s \in [0,t]} \left| V_i(s) - \frac{x_i^\epsilon(s) - x_i(s)}{\epsilon} \right| \right\} = 0,$$
where $x_i^\epsilon(s) \in x^\epsilon$ is agent $i$'s opinion at time $s$ coming from the set of all opinions in the environment.
Proof. 
See Appendix A. □
Remark 6. 
Based on Assumptions 1–3, Lemma 3 guarantees the existence and uniqueness of $V_i(s)$. Furthermore, for any $\varrho \in [1, \infty)$ this $V_i(s)$ satisfies $\mathbb{E}\big\{\sup_{s \in [0,t]} |V_i(s)|^\varrho\big\} < \infty$. By Lemma 3, in the Hilbert space $\mathcal{H}^{2,\tilde m}$, $V_i(s)$ is the derivative of the opinion driven by the $i$th agent's strategy when the direction of the derivative is $v_i(s)$.
Lemma 4. 
For $\epsilon > 0$ small enough and a time interval $[s, s+\epsilon] \subseteq [0,t]$, there exists some $\delta \in [0, \epsilon)$ such that the cost functional $u_i(s) \mapsto L_i(s, x, u_i)$ of agent $i$ is Gâteaux differentiable and
$$\frac{\partial}{\partial \delta} L_i\big(s, x, u_i + \delta v_i\big)\Big|_{\delta = 0} = \mathbb{E}_s\left\{ \int_s^{s+\epsilon} \left[ V_i(\nu) \Big( \sum_{j \in \eta_i} w_{ij} \big( x_i(\nu) - x_j(\nu) \big) + k_i \big( x_i(\nu) - x_0^i \big) \Big) + u_i(\nu)\, v_i(\nu) \right] d\nu \right\},$$
where $\mathbb{E}_s\{\cdot\} = \mathbb{E}\{\cdot \mid x_i(s)\}$ for all $\nu \in [s, s+\epsilon]$.
Proof. 
See Appendix A. □
Remark 7. 
The above Lemma determines the directional derivative of the cost functional $L_i(s, x, u_i)$ for some sufficiently small $\epsilon > 0$ with $[s, s+\epsilon] \subseteq [0,t]$.
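Lemma 4 can be checked numerically in a stripped-down deterministic setting (zero noise, one frozen neighbor, dynamics $dx/ds = u$, so that the variational process reduces to $V(s) = \int_0^s v(\nu)\, d\nu$). The sketch below compares the directional-derivative formula against the finite difference $(L(u + \epsilon v) - L(u))/\epsilon$; all functional forms and parameter values are illustrative assumptions.

```python
import numpy as np

# Deterministic toy check of Lemma 4: one frozen neighbor x_j, zero noise,
# dynamics dx/ds = u(s), so the variational process is V(s) = int_0^s v.
T, steps = 1.0, 2000
s = np.linspace(0.0, T, steps)
ds = s[1] - s[0]
w, k, xj, x0 = 0.8, 0.5, 0.7, 0.2

u = 0.3 * np.cos(2 * np.pi * s)            # baseline control
v = np.sin(np.pi * s)                      # direction of the derivative

def cost(ctrl):
    x = x0 + np.cumsum(ctrl) * ds          # x(s) = x0 + int_0^s ctrl
    return 0.5 * np.sum(w * (x - xj) ** 2 + k * (x - x0) ** 2 + ctrl ** 2) * ds

x = x0 + np.cumsum(u) * ds
V = np.cumsum(v) * ds
analytic = np.sum(V * (w * (x - xj) + k * (x - x0)) + u * v) * ds   # Lemma 4 formula

eps = 1e-6
finite_diff = (cost(u + eps * v) - cost(u)) / eps
print(analytic, finite_diff)               # these should agree closely
```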

5. The Lagrangian and the Adjoint Processes

For $s \in [0,t]$, let $g : [p,q] \to C$ be an opinion path of the $i$th agent with initial and terminal points $g(p)$ and $g(q)$, respectively, so that the line path integral is $\int_C f(\gamma)\, ds = \int_p^q f(g(s))\, |g'(s)|\, ds$, where $g'(s) = \partial g(s)/\partial s$. In this paper, a functional path integral approach is considered, where the domain of the integral is assumed to be the space of functions [36]. In [61], the theoretical physicist Richard Feynman introduced the Feynman path integral and popularized it in quantum mechanics. Mathematicians subsequently developed the measure-theoretic foundations of this functional integral, and in recent years it has become popular in probability theory [49]. In quantum mechanics, when a particle moves from one point to another, it chooses, out of infinitely many paths between those points (some of which touch the edge of the universe), the path of least action. After introducing $n$ small intervals of equal length $[s, s+\epsilon] \subseteq [0,t]$ with $\epsilon > 0$ small enough, and using the Riemann–Lebesgue lemma, if at time $s$ a particle touches the end of the universe, then at a later time point it comes back in the direction opposite to its previous one, which makes the path integral a measurable function [62]. Similarly, since agent $i$ has infinitely many possible opinions, they choose the opinion associated with the least cost, subject to the constraint in Equation (2). Furthermore, the Feynman approach is useful in both linear and non-linear stochastic differential systems where constructing an HJB equation numerically is quite difficult [35].
Definition 1. 
For a particle, let $\hat{L}[s, y(s), \dot{y}(s)] = \frac{1}{2} \hat{m}\, \dot{y}(s)^2 - \hat{V}(y)$ be the Lagrangian in the classical sense in the generalized coordinate $y$ with mass $\hat{m}$, where $\frac{1}{2} \hat{m} \dot{y}^2$ and $\hat{V}(y)$ are the kinetic and potential energies, respectively. The transition function of the Feynman path integral corresponding to the classical action $Z^* = \int_0^t \hat{L}(s, y(s), \dot{y}(s))\, ds$ is defined as $\Psi(y) = \int_{\mathbb{R}} \exp\{Z^*\}\, \mathcal{D}Y$, where $\dot{y} = \partial y/\partial s$ and $\mathcal{D}Y$ is an approximated Riemann measure representing the positions of the particle at different time points $s$ in $[0,t]$ [36].
Remark 8. 
Definition 1 describes the construction of the Feynman path integral in a physical sense. This definition is important for constructing the stochastic Lagrangian of agent $i$.
From Equation (45) of [63] for agent i, the stochastic Lagrangian at time s [ 0 , t ] is defined as
$$\begin{aligned} \hat{L}_i\big(s, x, P(x), \lambda_i, u_i\big) = \mathbb{E}_0 \Big\{ &\frac{1}{2}\int_0^t \Big[ \sum_{j \in \eta_i} w_{ij}\big(x_i(s) - x_j(s)\big)^2 + k_i \big(x_i(s) - x_0^i\big)^2 + \big(u_i(s)\big)^2 \Big]\, ds \\ &+ \int_0^t \Big[ x_i(s) - x_0^i - \int_0^s \Big( \mu_i[\nu, x_i(\nu), P(x_i), u_i(\nu)]\, d\nu + \sigma_i[\nu, x_i(\nu), P(x_i), u_i(\nu)]\, dB_i(\nu) \Big) \Big]\, d\lambda_i(s) \Big\}, \end{aligned}$$
where $\lambda_i(s)$ is the Lagrangian multiplier and $\mathbb{E}_0 = \mathbb{E}\{\cdot \mid x_0^i\}$.
Proposition 1 
([63,64]). Suppose for agent $i$, $u_i^*(s)$ is an admissible strategy and $x_i^*(s)$ is the corresponding opinion. Furthermore, assume that there exists a progressively measurable Lagrangian multiplier $\lambda_i(s)$ so that the following two conditions hold:
$$\frac{\partial L_i}{\partial x_i}\big(s, x^*(s), P(x^*), \lambda_i(s), u_i^*(s)\big) = 0,$$
$$\frac{\partial L_i}{\partial u_i}\big(s, x^*(s), P(x^*), \lambda_i(s), u_i^*(s)\big) = 0.$$
Moreover, assume that the mapping
$$(x_i, u_i) \mapsto L_i(s, x, u_i) + \mu_i[\nu, x_i, P(x_i), u_i]\, \lambda_i(s) + \sigma_i[\nu, x_i, P(x_i), u_i]\, \frac{d\lambda_i(s)\, dB(s)}{ds},$$
is almost surely concave for $s \in [0,t]$. Then, the admissible strategy $u_i^*(s)$ is an optimal strategy of agent $i$.
Remark 9. 
Following [63], we know that if $u_i^*$ is the solution of the system represented by Equations (1) and (2) and Condition (7), then there exists a progressively measurable Itô process $\lambda_i(s)$ such that Equations (5) and (6) hold. In Proposition 1, the Lagrangian multiplier $\lambda_i(s)$ is indeed a progressively measurable process.
At the beginning of the continuous interval $[s, s+\epsilon]$, for all $\epsilon \downarrow 0$, agent $i$ does not have any future information on which to build their opinion. Thus, $\mathbb{E}_{[s, s+\epsilon]}\{\cdot\} \approx \mathbb{E}_s\{\cdot\} = \mathbb{E}\{\cdot \mid x_i(s)\}$. Furthermore, as $\epsilon \downarrow 0$, the Lagrangian expressed in Equation (4) becomes
$$\begin{aligned} L_i\big(s, x, P(x), \lambda_i, u_i\big) :=\ & \lim_{\epsilon \downarrow 0} \mathbb{E}_s \Big\{ \frac{1}{2}\int_s^{s+\epsilon} \Big[ \sum_{j \in \eta_i} w_{ij}\big(x_i(\nu) - x_j(\nu)\big)^2 + k_i\big(x_i(\nu) - x_0^i\big)^2 + \big(u_i(\nu)\big)^2 \Big]\, d\nu \\ &+ \int_s^{s+\epsilon} \Big[ x_i(\nu) - x_0^i - \int_s^\nu \Big( \mu_i[\hat\nu, x_i(\hat\nu), P(x_i), u_i(\hat\nu)]\, d\hat\nu + \sigma_i[\hat\nu, x_i(\hat\nu), P(x_i), u_i(\hat\nu)]\, dB_i(\hat\nu) \Big) \Big]\, d\lambda_i(\nu) \Big\} \\ \approx\ & \mathbb{E}_s \Big\{ \frac{1}{2} \Big[ \sum_{j \in \eta_i} w_{ij}\big(x_i(s) - x_j(s)\big)^2 + k_i\big(x_i(s) - x_0^i\big)^2 + \big(u_i(s)\big)^2 \Big]\, ds \\ &+ \Big[ x_i(s) - x_0^i - \mu_i[s, x_i(s), P(x_i), u_i(s)]\, ds - \sigma_i[s, x_i(s), P(x_i), u_i(s)]\, dB_i(s) \Big]\, d\lambda_i(s) \Big\}, \end{aligned}$$
where [ s , ν ] [ s , s + ϵ ] .
The adjoint process of the system is
$$d\lambda_1^i(s) = -\Big[ \partial_x \mu_i[s, x_i(s), P(x_i), u_i(s)]\, \lambda_1^i(s) + \partial_x \sigma_i[s, x_i(s), P(x_i), u_i(s)]\, \lambda_1^i(s) + \partial_x L_i(s, x, u_i) \Big]\, ds + \lambda_2^i(s)\, dB_i(s),$$
where $\lambda_1^i(s)$ and $\lambda_2^i(s)$ are two new dual variables belonging to the dual spaces of the spaces in which $\mu_i$ and $\sigma_i$ take their values, so that $\lambda_1^i \in \mathbb{R}$, like $x_i$, and $\lambda_2^i \in \mathbb{R}^2$. Notice that the deterministic Hamiltonian of the system is
$$H_i\big(s, x, P(x), \lambda_1^i, \lambda_2^i, u_i\big) = L_i(s, x, u_i) + \lambda_1^i(s)\, \mu_i[s, x_i(s), P(x_i), u_i(s)] + \lambda_2^i(s)\, \sigma_i[s, x_i(s), P(x_i), u_i(s)].$$
The differences between the above Hamiltonian and Equation (4) are the presence of $\Delta x_i(s) := x_i(s) - x_0^i$, $\lambda_i(s)$, $\mathbb{E}_s\{\cdot\}$, $ds$, and $dB_i(s)$. If $\Delta x_i(s) \to 0$, then in the deterministic case the Hamiltonian and the Lagrangian share a similar structure. Since the Feynman path integral approach is used, $dB_i(s)$ determines the true fluctuation of $L_i$, and the further inclusion of $\mathbb{E}_s$ provides the conditional expectation of a forward-looking process on $[s, s+\epsilon]$. The Lagrangian used in Equation (4) is stochastic, while the usual Hamiltonian of control theory is deterministic.
Definition 2. 
For a set of admissible strategies $u_i = \{u_i(s)\}_{s=0}^t \in \mathcal{U}([0,t])$ of agent $i$, denote by $x_i(s) = x_i(s, u_i)$ the set of corresponding controlled opinions, and let $(\lambda_1^i, \lambda_2^i) = \{\lambda_1^i(s), \lambda_2^i(s)\}_{s=0}^t$ be any coupled adjoint progressively measurable stochastic processes satisfying
$$d\lambda_1^i(s) = -\partial_x H_i\big(s, x, P_x, \lambda_1^i, \lambda_2^i, u_i\big)\, ds + \lambda_2^i(s)\, dB_i(s) - \tilde{\mathbb{E}}\Big[ \partial_\gamma \tilde{H}_i\big(s, \tilde{x}, P_{\tilde{x}}, \tilde{\lambda}_1^i, \tilde{\lambda}_2^i, \tilde{u}_i\big)\big[x_i(s)\big] \Big]\, ds,$$
where $(\tilde{x}, \tilde{\lambda}_1^i, \tilde{\lambda}_2^i, \tilde{u}_i, \tilde{L}_i)$ is an independent copy of $(x, \lambda_1^i, \lambda_2^i, u_i, L_i)$ and $\tilde{\mathbb{E}}$ is the expectation with respect to the independent copy. In the adjoint equation, $\partial_x = \partial/\partial x$ and $\partial_\gamma = \partial/\partial \gamma$.
Remark 10. 
If $\mu_i$ and $\sigma_i$ are independent of the marginal distributions of the process, the extra terms appearing in the adjoint equation in Definition 2 vanish, and the equation reduces to the classical adjoint equation.
In the present setup, the adjoint equation can be written as
$$\begin{aligned} d\lambda_1^i(s) =\ & -\Big[ \partial_x \mu_i[s, x_i(s), P(x_i), u_i(s)]\, \lambda_1^i(s) + \partial_x \sigma_i[s, x_i(s), P(x_i), u_i(s)]\, \lambda_1^i(s) + \partial_x L_i(s, x, u_i) \Big]\, ds + \lambda_2^i(s)\, dB_i(s) \\ & - \tilde{\mathbb{E}}\Big[ \Big( \partial_\gamma \tilde{\mu}_i\big(s, \tilde{x}, x, \tilde{\lambda}_1^i, \tilde{\lambda}_2^i, \tilde{u}_i\big) + \partial_\gamma \tilde{\sigma}_i\big(s, \tilde{x}, x, \tilde{\lambda}_1^i, \tilde{\lambda}_2^i, \tilde{u}_i\big) + \partial_\gamma \tilde{L}_i\big(s, \tilde{x}, x, \tilde{\lambda}_1^i, \tilde{\lambda}_2^i, \tilde{u}_i\big) \Big)\Big|_{x = x(s)} \Big]\, ds. \end{aligned}$$
It is important to note that, for a given admissible strategy $u_i \in \mathcal{U}([0,t])$ and the controlled opinion $x_i$, despite the boundedness assumptions on the partial derivatives of $\mu_i$ and $\sigma_i$, and despite the fact that the first part of the above adjoint equation is linear in $\lambda_1^i(s)$ and $\lambda_2^i(s)$, the existence and uniqueness of a solution $\{\lambda_1^i, \lambda_2^i\}$ of the adjoint equation cannot be established by the standard argument (for example, Theorem 2.2 in [59]). The main reason is that the joint distribution of the solution process appears in $\mu_i$ and $\sigma_i$ [50,59].
Lemma 5. 
Under (v) of Assumption 2, there exists a unique adapted solution $(\lambda_1^i, \lambda_2^i)$ of the coupled adjoint progressively measurable stochastic processes satisfying
$$d\lambda_1^i(s) = -\partial_x L_i\big(s, x, P(x), \lambda_1^i, \lambda_2^i, u_i\big)\, ds + \lambda_2^i(s)\, dB_i(s) - \tilde{\mathbb{E}}\Big[ \partial_\gamma \tilde{L}_i\big(s, \tilde{x}, P(\tilde{x}), \tilde{\lambda}_1^i, \tilde{\lambda}_2^i, \tilde{u}_i\big)\big[x_i(s)\big] \Big]\, ds,$$
in $\mathcal{H}_{\lambda_1}^{2,\tilde m} \times \mathcal{H}_{\lambda_2}^{2,\tilde m}$, where
$$\mathcal{H}_{\lambda_1}^{2,\tilde m} := \Big\{ \lambda_1^i \in \mathcal{H}_{\lambda_1}^{0,\tilde m} \, ; \ \mathbb{E}\int_0^t |\lambda_1^i(s)|^2\, ds < \infty \Big\},$$
and
$$\mathcal{H}_{\lambda_2}^{2,\tilde m} := \Big\{ \lambda_2^i \in \mathcal{H}_{\lambda_2}^{0,\tilde m} \, ; \ \mathbb{E}\int_0^t |\lambda_2^i(s)|^2\, ds < \infty \Big\}.$$
Sketch of Proof. 
Here we provide the sketch of the proof. Details are discussed in Appendix A. We first define the weighted norm,
$$\big\| (\lambda_1^i, \lambda_2^i) \big\|_\rho^2 := \mathbb{E}\int_0^t \Big( |\lambda_1^i(s)|^2 + |\lambda_2^i(s)|^2 \Big) \exp(\rho s)\, ds, \qquad \rho > 0.$$
For a candidate pair $(\lambda_1^{i\#}, \lambda_2^{i\#}) \in \mathcal{H}^{2,\tilde m}$, Proposition 2.2 of [65] and Theorem 2.2 of [59] ensure the existence of a solution $(\lambda_1^i, \lambda_2^i)$ to the adjoint equation
$$d\lambda_1^i(s) = -\partial_x L_i\big(s, x, P_x, \lambda_1^{i\#}, \lambda_2^{i\#}, \lambda_1^i, \lambda_2^i, u_i\big)\, ds + \lambda_2^i(s)\, dB_i(s) - \tilde{\mathbb{E}}\Big[ \partial_\gamma \tilde{L}_i\big(s, \tilde{x}, P_{\tilde{x}}, \lambda_1^{i\#}, \lambda_2^{i\#}, \tilde{\lambda}_1^i, \tilde{\lambda}_2^i, \tilde{u}_i\big)\big[x_i(s)\big] \Big]\, ds.$$
This defines a mapping $\mathcal{M} : (\lambda_1^{i\#}, \lambda_2^{i\#}) \mapsto (\lambda_1^i, \lambda_2^i)$ on $\mathcal{H}^{2,\tilde m}$. To establish uniqueness, let $(\lambda_1^{i,1}, \lambda_2^{i,1}) = \mathcal{M}(\lambda_1^{i\#,1}, \lambda_2^{i\#,1})$ and $(\lambda_1^{i,2}, \lambda_2^{i,2}) = \mathcal{M}(\lambda_1^{i\#,2}, \lambda_2^{i\#,2})$. Denote the differences
$$\tilde{\lambda}_1^i := \lambda_1^{i,2} - \lambda_1^{i,1}, \quad \tilde{\lambda}_2^i := \lambda_2^{i,2} - \lambda_2^{i,1}, \quad \tilde{\lambda}_1^{i\#} := \lambda_1^{i\#,2} - \lambda_1^{i\#,1}, \quad \tilde{\lambda}_2^{i\#} := \lambda_2^{i\#,2} - \lambda_2^{i\#,1}.$$
Applying Itô's formula to $|\tilde{\lambda}_1^i(s)|^2 \exp(\rho s)$ yields the estimate
$$\mathbb{E}\int_0^t e^{\rho s} \Big( |\tilde{\lambda}_1^i(s)|^2 + |\tilde{\lambda}_2^i(s)|^2 \Big)\, ds \le \frac{C}{\rho}\, \mathbb{E}\int_0^t e^{\rho s} \Big( |\tilde{\lambda}_1^{i\#}(s)|^2 + |\tilde{\lambda}_2^{i\#}(s)|^2 \Big)\, ds,$$
where $C > 0$ depends on the Lipschitz constants of the derivatives of $\mu_i$, $\sigma_i$, and $L_i$ from Assumption 2 (v). Choosing $\rho$ sufficiently large ensures $C/\rho < 1$, i.e., $\mathcal{M}$ is a contraction under $\|\cdot\|_\rho$.
By the Banach fixed point theorem, there exists a unique fixed point $(\lambda_1^i, \lambda_2^i) \in \mathcal{H}_{\lambda_1}^{2,\tilde m} \times \mathcal{H}_{\lambda_2}^{2,\tilde m}$, which is the unique adapted solution of the adjoint system. □
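The contraction mechanism behind this proof can be illustrated on a deterministic toy version of the mapping $\mathcal{M}$, in which the adjoint equation is replaced by a scalar backward integral equation. The constants $a$, $b$, and $\rho$ below are arbitrary stand-ins for the Lipschitz constant and the weight; the point is only that the weighted distance contracts once $\rho$ dominates the Lipschitz constant.

```python
import numpy as np

# Toy, deterministic illustration of the contraction in Lemma 5: the map M
# sends a candidate lambda^# to the solution of the backward equation
#   lambda(s) = int_s^T ( a * lambda^#(nu) + b ) d nu,   lambda(T) = 0,
# where a plays the role of the Lipschitz constant. (Assumed toy setup.)
T, steps = 1.0, 1000
s = np.linspace(0.0, T, steps)
ds = s[1] - s[0]
a, b, rho = 2.0, 1.0, 50.0

def M(lam_hash):
    integrand = a * lam_hash + b
    return np.cumsum(integrand[::-1])[::-1] * ds     # approx. integral from s to T

def norm_rho(lam):
    # weighted norm with weight exp(rho * s), as in the proof sketch
    return np.sqrt(np.sum(lam ** 2 * np.exp(rho * s)) * ds)

lam1 = np.zeros(steps)                               # two different candidates
lam2 = np.sin(4 * np.pi * s)
for it in range(5):
    before = norm_rho(lam2 - lam1)
    lam1, lam2 = M(lam1), M(lam2)
    print(f"iteration {it}: distance {before:.3e} -> {norm_rho(lam2 - lam1):.3e}")
```

A Cauchy–Schwarz bound gives a per-iteration contraction factor of order $a\sqrt{T/\rho}$ in this toy case, matching the $C/\rho < 1$ requirement in the proof.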
Remark 11. 
Lemma 5 states that for each admissible strategy $u_i$ there exists a pair of adjoint processes $(\lambda_1^i, \lambda_2^i)$ such that $\mathbb{E}\big\{\sup_{s \in [0,t]} |\lambda_1^i(s)|^2\big\} + \mathbb{E}\int_0^t |\lambda_2^i(s)|^2\, ds < \infty$.
For a normalizing constant $L_\epsilon^i > 0$, define a transition function from $s$ to $s + \epsilon$ as
$$\Psi_{s, s+\epsilon}^i(x_i) := \frac{1}{L_\epsilon^i} \int_{\mathbb{R}^n} \exp\big[ -\epsilon\, \mathcal{A}_{s, s+\epsilon}(x_i) \big]\, \Psi_s^i(x_i)\, dx_i(s),$$
where $\Psi_s^i(x_i)$ is the value of the transition function based on opinion $x_i$ at time $s$, with the initial condition $\Psi_0^i(x_i) = \Psi_0^i$. The normalizing constant $L_\epsilon^i$ is chosen in such a way that the right-hand side of expression (10) integrates to unity. Therefore, the action function of agent $i$ on the time interval $[s, s+\epsilon]$ is
$$\mathcal{A}_{s, s+\epsilon}(x_i) = \int_s^{s+\epsilon} \mathbb{E}_\nu \Big\{ \frac{1}{2} \Big[ \sum_{j \in \eta_i} w_{ij}\big(x_i(\nu) - x_j(\nu)\big)^2 + k_i\big(x_i(\nu) - x_0^i\big)^2 + \big(u_i(\nu)\big)^2 \Big]\, d\nu + h_i\big[\nu + \Delta\nu,\, x_i(\nu) + \Delta x_i(\nu)\big]\, d\lambda_i(\nu) \Big\},$$
where $h_i[\nu + \Delta\nu, x_i(\nu) + \Delta x_i(\nu)] \in C^2([0,t] \times \mathbb{R})$ is an Itô process such that
$$h_i\big[\nu + \Delta\nu,\, x_i(\nu) + \Delta x_i(\nu)\big] = x_i(\nu) - x_0^i - \mu_i[\nu, x_i(\nu), P(x_i), u_i(\nu)]\, d\nu - \sigma_i[\nu, x_i(\nu), P(x_i), u_i(\nu)]\, dB_i(\nu).$$
It is important to note that the implicit form of the McKean–Vlasov SDE is replaced by an Itô process. The action A s , s + ϵ ( x i ) tells us that within [ s , s + ϵ ] the action of agent i depends on their opinion x i under a feedback structure.
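A grid-based sketch of the recursion (10) is given below: the transition function is propagated one step at a time, with a Gaussian short-time kernel standing in for the dynamics and a quadratic running cost standing in for the action $\mathcal{A}_{s,s+\epsilon}$; both choices, and all parameter values, are illustrative assumptions rather than the objects constructed in this paper.

```python
import numpy as np

# Grid-based sketch of the recursion (10): propagate the transition function
# Psi one step at a time. A Gaussian short-time kernel stands in for the
# opinion dynamics and a quadratic running cost stands in for the action.
grid = np.linspace(0.0, 1.0, 201)
dx = grid[1] - grid[0]
eps, sigma, k, x0 = 0.01, 0.2, 1.0, 0.5

psi = np.ones_like(grid)                          # flat initial condition Psi_0
sq = (grid[:, None] - grid[None, :]) ** 2
kernel = np.exp(-sq / (2.0 * sigma ** 2 * eps))   # short-time Gaussian kernel
running_cost = 0.5 * k * (grid - x0) ** 2         # stand-in for the action

for _ in range(100):                              # 100 steps of length eps
    psi = kernel @ (np.exp(-eps * running_cost) * psi) * dx
    psi /= psi.sum() * dx                         # normalization, the role of L_eps

value = -np.log(psi)                              # log transform recovers a value shape
print("minimizing opinion:", grid[np.argmin(value)])
```

Because the recursion is linear in $\Psi$, the backward HJB integration is replaced here by repeated forward smoothing and reweighting, which is precisely the computational appeal of the path-integral formulation discussed in Section 1.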
Definition 3. 
For the optimal opinion $x_i^*(s)$ there exists an optimal admissible control $u_i^*(s)$ such that, for all $s \in [0,t]$, the conditional expectation of the cost function satisfies
$$\mathbb{E}_0\left\{ \int_0^t \frac{1}{2} \Big[ \sum_{j \in \eta_i} w_{ij}\big(x_i^*(s) - x_j(s)\big)^2 + k_i\big(x_i^*(s) - x_0^i\big)^2 + \big(u_i^*(s)\big)^2 \Big] ds \,\Big|\, \mathcal{F}_0^{x^*} \right\} \le \mathbb{E}_0\left\{ \int_0^t \frac{1}{2} \Big[ \sum_{j \in \eta_i} w_{ij}\big(x_i(s) - x_j(s)\big)^2 + k_i\big(x_i(s) - x_0^i\big)^2 + \big(u_i(s)\big)^2 \Big] ds \,\Big|\, \mathcal{F}_0^{x} \right\},$$
such that Equation (2) holds, where $\mathcal{F}_0^{x^*}$ is the optimal filtration satisfying $\mathcal{F}_0^{x^*} \subseteq \mathcal{F}_0^{x}$.

6. Main Results

Consider that, for an opinion space $X_0 = \{x(s) : s \in [0,t]\}$ and agent $i$'s control space $\mathcal{U}$, there exists an admissible control $u_i : [0,t] \times X_0 \to \mathcal{U}$, and for all $i \in N$ define the integrand of the cost function $L_i(\cdot)$ as
$$L_i(s, x, u_i) = \mathbb{E}\left\{ \frac{1}{2} \int_0^t \Big[ \sum_{j \in \eta_i} w_{ij}\big(x_i(s) - x_j(s)\big)^2 + k_i\big(x_i(s) - x_0^i\big)^2 + \big(u_i(s)\big)^2 \Big] ds \right\}.$$
Proposition 2. 
For agent i
  • (i). The feedback control $u_i(s, x_i) : [0,t] \times \mathbb{R} \times \mathcal{U} \to \mathcal{U}$ is a continuously differentiable function,
  • (ii). The cost functional $L_i(s, x, u_i) : [0,t] \times \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}$ is smooth on $\mathbb{R} \times \mathcal{U}$.
  • (iii). If $X_0 = \{x(s), s \in [0,t]\}$ is an opinion trajectory of agent $i$, then the feedback Nash equilibrium $u_i^*(s, x_i) \in \mathcal{U}$, $i \in N$, is the solution of the following equation:
$$\frac{\partial f_i(s, x, u_i)}{\partial u_i}\, \frac{\partial^2 f_i(s, x, u_i)}{\partial (x_i)^2} = 2\, \frac{\partial f_i(s, x, u_i)}{\partial x_i}\, \frac{\partial^2 f_i(s, x, u_i)}{\partial x_i\, \partial u_i},$$
where, for an Itô process $h_i(s, x_i)$ defined on $[0,t] \times \mathcal{P}_2(\mathbb{R}) \times \mathbb{R}$,
$$\begin{aligned} f_i(s, x, \gamma, u_i) = L_i(s, x, u_i) &+ h_i(s, x_i)\, d\lambda_i(s) + \frac{\partial h_i(s, x_i)}{\partial s}\, d\lambda_i(s) + \frac{d\lambda_i(s)}{ds}\, h_i(s, x_i) \\ &+ \frac{\partial h_i(s, x_i)}{\partial x_i}\, \mu_i\big[s, x_i, P(x_i), u_i\big]\, d\lambda_i(s) + \frac{1}{2}\, \sigma_i\big[s, x_i, P(x_i), u_i\big]^2\, \frac{\partial^2 h_i(s, x_i)}{\partial (x_i)^2}\, d\lambda_i(s). \end{aligned}$$
Proof. 
See Appendix A. □
Remark 12. 
The central idea of Proposition 2 is to choose h i appropriately. Therefore, one natural candidate should be a function of the integrating factor of the stochastic opinion dynamics represented in Equation (2).
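The first-order condition in Proposition 2 can be manipulated symbolically. The following sketch applies Equation (15) to a toy quadratic stand-in for $f_i$ (an assumed form, not the $f_i$ of the paper) and solves for the feedback control:

```python
import sympy as sp

# Symbolic sketch of the optimality condition (15) applied to a toy quadratic
# stand-in for f_i (an assumed form chosen only so the algebra is visible).
x, u, w = sp.symbols('x u w', real=True)
f = sp.Rational(1, 2) * (w * x**2 + u**2) + u * x

lhs = sp.diff(f, u) * sp.diff(f, x, 2)        # f_u * f_xx
rhs = 2 * sp.diff(f, x) * sp.diff(f, x, u)    # 2 * f_x * f_xu
u_star = sp.solve(sp.Eq(lhs, rhs), u)
print(u_star)                                 # [w*x/(w - 2)] for this toy f
```

For richer choices of $f_i$ the same condition becomes quadratic in $u_i$, consistent with the quadratic-formula representation of $u_i^*(s)$ with coefficients $T_1$, $T_2$, $T_3$ referenced in Section 2.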
To demonstrate the preceding proposition, we present a detailed example to identify an optimal strategy in McKean–Vlasov SDEs and the corresponding systems of interacting particles. Specifically, we examine a stochastic opinion dynamics model involving six unknown parameters. To construct this example, we combine equations from [66,67,68]. Following [67], consider a one-dimensional stochastic opinion dynamics model, parameterized by $\theta = (\theta_1, \theta_2)^T \in \mathbb{R}^2$, of the form
$$dx_i(s) = -\left[ \int_{\mathbb{R}} \phi_\theta\big( \| x_i(s) - x_j(s) \| \big)\, \big( x_i(s) - x_j(s) \big)\, \gamma(dx_j) \right] ds + \sigma_i\, x_i(s)\, dB_i(s),$$
where $\sigma_i > 0$, $B_i = \{B_i(s)\}_{s \ge 0}$ is a standard Brownian motion, and the interaction kernel $\phi_\theta : \mathbb{R}_+ \to \mathbb{R}_+$ has the form
$$\phi_\theta(\beta) = \begin{cases} \theta_1 \exp\left( -\dfrac{0.01}{1 - (\beta - \theta_2)^2} \right), & \text{if } \beta > 0, \\ 0, & \text{if } \beta \le 0. \end{cases}$$
This model is often described in terms of the corresponding system of interacting particles, which is represented by
$$dx_i(s) = -\frac{1}{n} \sum_{j=1}^n \phi_\theta\big( \| x_i(s) - x_j(s) \| \big)\, \big( x_i(s) - x_j(s) \big)\, ds + \sigma_i\, x_i(s)\, dB_i(s),$$
where $\theta_1$ is the scale parameter and $\theta_2$ is the range parameter. Models of this type appear in a variety of fields, ranging from biology to the social sciences, where $\phi_\theta$ determines how the behavior of one particle (such as an agent's opinion) affects the behavior of the other particles (such as the opinions of others). For a comprehensive discussion of these models, see [66,69,70,71]. In deterministic versions of these models, it is well established that, over time, particles converge into clusters. The number of clusters depends on both the interaction kernel (i.e., the scope and intensity of interactions between particles) and the initial conditions.
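A minimal Euler–Maruyama sketch of the interacting particle system (14) follows, using the compactly supported kernel (7) as reconstructed above (with an attractive sign convention) and illustrative parameter values:

```python
import numpy as np

rng = np.random.default_rng(4)

def phi(beta, theta1=1.0, theta2=0.5):
    """Compactly supported bump kernel, as in the reconstruction of (7)."""
    out = np.zeros_like(beta)
    inside = (beta > 0) & (np.abs(beta - theta2) < 1.0)
    out[inside] = theta1 * np.exp(-0.01 / (1.0 - (beta[inside] - theta2) ** 2))
    return out

n, steps, dt, sigma = 100, 500, 0.01, 0.05
x = rng.uniform(0.0, 1.0, n)                           # initial opinions

for _ in range(steps):
    diff = x[:, None] - x[None, :]                     # x_i - x_j
    drift = -(phi(np.abs(diff)) * diff).mean(axis=1)   # attractive interaction
    x = x + drift * dt + sigma * x * np.sqrt(dt) * rng.standard_normal(n)

print("final opinion spread:", x.std())
```

Varying $\theta_1$ and $\theta_2$ in this sketch changes the number and sharpness of clusters, which is the dependence explored in the contour plots of Figure 2.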
Figure 2 presents contour plots that illustrate the dependence of opinion dynamics on the interaction parameters θ 1 (the strength of interaction) and θ 2 (the effective range of interaction) for different population sizes n = 20 , 100 , 200 , 500 in the McKean–Vlasov framework. Each panel corresponds to a different value of n and depicts how the joint influence of θ 1 and θ 2 shapes the stability and convergence of agents’ opinions over time. In the top-left panel ( n = 20 ), where the population is relatively small, the contour structure is irregular and less symmetric. This reflects the strong influence of stochastic fluctuations due to the limited number of interacting agents. In this regime, opinion clusters form in a less predictable way, with random effects playing a central role in determining whether consensus or polarization emerges. The system is therefore more sensitive to both initial conditions and noise.
As the population increases to n = 100 (top-right), the contours become more structured and approximately symmetric. This indicates that as the number of agents grows, the law of large numbers begins to smooth out random fluctuations, and the underlying mean-field effect of the interaction kernel ϕ θ starts to dominate. At this stage, opinion formation exhibits more stable patterns, and clusters are determined less by chance and more by systematic interaction dynamics. For n = 200 (bottom-left), the contour lines become even more regular, exhibiting near-circular symmetry centered around the interaction parameter region where the system is most stable. This suggests that the population is now large enough for the McKean–Vlasov approximation to provide an accurate description of the collective dynamics. The influence of randomness is significantly reduced, and the system’s macroscopic behavior is largely governed by the deterministic mean-field equation. Opinion clusters become sharper, and their positions in the opinion space depend primarily on θ 1 and θ 2 .
Finally, in the bottom-right panel ( n = 500 ), the contour structure shows almost perfect symmetry and smoothness. At this scale, the dynamics are essentially deterministic, with negligible stochastic deviations. The clustering process is entirely driven by the balance between the strength of interactions (how strongly agents adjust their opinions toward their neighbors) and the range of interactions (how far across the opinion space such influence extends). As a result, the opinion distribution evolves in a predictable manner, converging toward stable equilibrium clusters.
Following [68], we use a modified version of the Friedkin and Johnsen model [72]
$$dx_i(s) = -\alpha(s)\, x_i(s)\, ds - \alpha(s)\, F_i(\mathcal{A}_i)\, ds + G\big(s, x_i(s)\big)\, [u_i(s)]^2\, ds + \sigma_i\, x_i(s)\, dB_i(s),$$
where $F_i : [0,1] \times [0,t] \to [0,1]$, $\mathcal{A}_i$ denotes the set of neighbors of the $i$th agent, $\mathcal{A}_i = \big\{ x_j(s) : \| x_i(s) - x_j(s) \|_2^2 \le r \big\}$, with $r$ the radius of the neighborhood, and $G : [0,t] \times [0,1] \to [0,1] \times \mathcal{U}$ is the actuator dynamics [68]. After assuming $F_i(\mathcal{A}_i) = \frac{1}{n} \sum_{j=1}^n \phi_\theta\big( \| x_i(s) - x_j(s) \| \big)\, \big( x_i(s) - x_j(s) \big)$ and $G(s, x_i(s)) = x_i(s)$, Equation (16) becomes
$$dx_i(s) = -\alpha(s)\, x_i(s)\, ds - \alpha(s)\, \frac{1}{n} \sum_{j=1}^n \phi_\theta\big( \| x_i(s) - x_j(s) \| \big)\, \big( x_i(s) - x_j(s) \big)\, ds + x_i(s)\, [u_i(s)]^2\, ds + \sigma_i\, x_i(s)\, dB_i(s).$$
To justify the structure of Equation (17), we note that the model synthesizes key elements from opinion dynamics, mean-field interaction kernels, and stochastic control theory. The first term, $-\alpha(s)\, x_i(s)$, captures the natural tendency of an agent's opinion to decay toward a neutral baseline over time and reflects internal anchoring behavior consistent with the Friedkin–Johnsen-type framework. The second term incorporates the aggregate influence from other agents through a non-linear interaction kernel $\phi_\theta$, whose support is bounded and shaped by the parameters $\theta_1$ and $\theta_2$. This kernel modulates the strength and scope of pairwise interactions based on opinion proximity and is consistent with kinetic models of opinion exchange. The third term, $x_i(s)\, [u_i(s)]^2$, arises from an actuator-based control framework, where the actuator gain is assumed to be state-dependent, i.e., proportional to the current opinion magnitude. This formulation ensures that the control input scales with opinion intensity, aligning with models where individuals exert greater effort when holding stronger views. Finally, the multiplicative noise term $\sigma_i\, x_i(s)\, dB_i(s)$ introduces stochasticity into the dynamics, with volatility increasing with opinion magnitude, a structure motivated by empirical observations in behavioral and social systems. Collectively, these components yield a state-dependent stochastic differential equation with non-linear drift, capturing the coupled effects of self-dynamics, social influence, control, and uncertainty in the evolution of individual opinions.
Figure 3 presents the temporal evolution of opinions in the McKean–Vlasov opinion dynamics (17) for different population sizes: n = 20 , 100 , 200 , and 500. Each panel corresponds to a specific value of n, with the standardized opinion values plotted against standardized time. The colored trajectories represent the dynamics of individual agents, while the bold black curve traces the average opinion trajectory across the population.
The opinion dynamics are governed by the stochastic differential equation
\[
dx_i(s) = -\alpha(s)\,x_i(s)\,ds - \alpha(s)\,\frac{1}{n}\sum_{j=1}^{n}\phi_\theta\big(\|x_i(s)-x_j(s)\|\big)\big(x_i(s)-x_j(s)\big)\,ds + x_i(s)\,[u_i(s)]^2\,ds + \sigma_i\,x_i(s)\,dB_i(s),
\]
where the terms capture, respectively, opinion decay due to stubbornness ($-\alpha(s)x_i(s)$), attraction to neighbors through the interaction kernel $\phi_\theta$ (the second term), a quadratic reinforcement or destabilization effect due to control ($x_i(s)[u_i(s)]^2$), and stochastic diffusion in opinions driven by multiplicative noise ($\sigma_i x_i(s)\,dB_i(s)$).
The panels reveal a clear pattern: although the trajectories appear organized and the average opinion stabilizes around a central trend, the opinions do not converge to a single cluster as might be expected in classical deterministic consensus models. This lack of convergence can be explained directly by the structure of (17). In particular, the multiplicative noise term $\sigma_i x_i(s)\,dB_i(s)$ continually injects variability into the system, preventing the formation of stable consensus clusters. Furthermore, the presence of the quadratic control term $x_i(s)[u_i(s)]^2$ amplifies deviations in opinion trajectories, acting as a self-reinforcing mechanism that counteracts the attractive forces of the interaction kernel. Unless $u_i(s)$ is carefully regulated, this term destabilizes the trajectories and disperses opinions rather than concentrating them.
The influence of the attraction kernel $\phi_\theta(\cdot)$ is also crucial in determining whether or not convergence occurs. If the scale parameter $\theta_1$ or the range parameter $\theta_2$ is too small, the effective strength of interactions between agents is insufficient to dominate the stochastic fluctuations and control-induced dispersion. As a result, even though the decay term $-\alpha(s)x_i(s)$ biases opinions toward zero and the interaction kernel exerts a weak consensus force, these effects cannot fully overcome the opposing forces of noise and quadratic reinforcement. This balance explains why the trajectories remain spread out, with only partial alignment visible in the average opinion curve.
The dependence on n highlights the statistical smoothing properties of the system. For small populations ( n = 20 ), individual trajectories are highly dispersed and the average opinion curve is relatively volatile. As n increases to 100 , 200 , and 500, the law of large numbers exerts a stabilizing effect: the average opinion trajectory becomes smoother and more predictable, and fluctuations of the population mean diminish. Nevertheless, even in the largest population size shown ( n = 500 ), individual opinions remain widely scattered and do not collapse into consensus clusters. This illustrates that simply increasing the number of agents does not resolve the fundamental non-convergence induced by the interplay of noise, self-reinforcement, and weak interaction forces.
Our main aim is to minimize Equation (1) subject to Equation (17). The integrating factor of Equation (17) is $\exp\{-\int_0^s\sigma_i\,dB_i(\nu)+\frac{1}{2}\int_0^s\sigma_i^2\,d\nu\}$. Therefore, the $h_i(s)$ function is
\[
h_i(s) = x_i(s)\exp\left\{-\int_0^s\sigma_i\,dB_i(\nu)+\frac{1}{2}\int_0^s\sigma_i^2\,d\nu\right\} = x_i(s)\exp\left\{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s\right\}.
\]
Equation (12) becomes
\[
\begin{aligned}
f_i(s,x,\gamma,u_i) ={}& \frac{1}{2}\Big[\sum_{j\in\eta_i} w_{ij}\big(x_i(s)-x_j(s)\big)^2 + k_i\big(x_i(s)-x_0^i\big)^2 + \big(u_i(s)\big)^2\Big] + x_i(s)\,e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\,d\lambda_i(s) \\
&+ \frac{1}{2}\,x_i(s)\,e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\,\sigma_i^2\,d\lambda_i(s) + \frac{d\lambda_i(s)}{ds}\,x_i(s)\,e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s} \\
&+ e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\Big[-\alpha(s)x_i(s) - \alpha(s)\frac{1}{n}\sum_{j=1}^{n}\phi_\theta\big(\|x_i(s)-x_j(s)\|\big)\big(x_i(s)-x_j(s)\big) + x_i(s)\big(u_i(s)\big)^2\Big]\,d\lambda_i(s).
\end{aligned}
\]
Now,
\[
\begin{aligned}
\frac{\partial}{\partial x_i}f_i(s,x,\gamma,u_i) ={}& \sum_{j\in\eta_i} w_{ij}\big(x_i(s)-x_j(s)\big) + k_i\big(x_i(s)-x_0^i\big) + e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\,d\lambda_i(s) + e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\,\frac{d\lambda_i(s)}{ds} \\
&+ e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\Big[-\alpha(s) - \alpha(s)\frac{1}{n}\sum_{j=1}^{n}\Big(\frac{d}{dx_i}\phi_\theta\big(\|x_i(s)-x_j(s)\|\big)\big(x_i(s)-x_j(s)\big) + \phi_\theta\big(\|x_i(s)-x_j(s)\|\big)\Big) + \big(u_i(s)\big)^2\Big]\,d\lambda_i(s), \\
\frac{\partial^2}{\partial (x_i)^2}f_i(s,x,\gamma,u_i) ={}& \sum_{j\in\eta_i} w_{ij} + k_i - e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\,\alpha(s)\,\frac{1}{n}\sum_{j=1}^{n}\Big(\frac{d^2}{d(x_i)^2}\phi_\theta\big(\|x_i(s)-x_j(s)\|\big)\big(x_i(s)-x_j(s)\big) \\
&+ 2\,\frac{d}{dx_i}\phi_\theta\big(\|x_i(s)-x_j(s)\|\big)\Big)\,d\lambda_i(s), \\
\frac{\partial}{\partial u_i}f_i(s,x,\gamma,u_i) ={}& u_i(s)\Big[1 + 2\,x_i(s)\,e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\,d\lambda_i(s)\Big], \\
\frac{\partial^2}{\partial x_i\,\partial u_i}f_i(s,x,\gamma,u_i) ={}& 2\,e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\,d\lambda_i(s).
\end{aligned}
\]
Using the above results, Equation (11) yields
\[
\begin{aligned}
&4\,u_i(s)^2\,e^{-2\sigma_i B_i(s)+\sigma_i^2 s}\big(d\lambda_i(s)\big)^2 - 2\,u_i(s)\Big[1 + 2\,x_i(s)\,e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\,d\lambda_i(s)\Big] \\
&\quad\times\Big[\sum_{j\in\eta_i} w_{ij} + k_i - e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\,\alpha(s)\,\frac{1}{n}\sum_{j=1}^{n}\Big(\frac{d^2}{d(x_i)^2}\phi_\theta\big(\|x_i(s)-x_j(s)\|\big)\big(x_i(s)-x_j(s)\big) + 2\,\frac{d}{dx_i}\phi_\theta\big(\|x_i(s)-x_j(s)\|\big)\Big)\,d\lambda_i(s)\Big] \\
&\quad+ 4\,e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\,d\lambda_i(s)\Big[\sum_{j\in\eta_i} w_{ij}\big(x_i(s)-x_j(s)\big) + k_i\big(x_i(s)-x_0^i\big) + e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\,d\lambda_i(s) + e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\,\frac{d\lambda_i(s)}{ds} \\
&\qquad+ e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\Big(-\alpha(s) - \alpha(s)\frac{1}{n}\sum_{j=1}^{n}\Big(\frac{d}{dx_i}\phi_\theta\big(\|x_i(s)-x_j(s)\|\big)\big(x_i(s)-x_j(s)\big) + \phi_\theta\big(\|x_i(s)-x_j(s)\|\big)\Big)\Big)\,d\lambda_i(s)\Big] = 0.
\end{aligned}
\]
Clearly, Equation (20) is a quadratic equation with respect to the strategy $u_i(s)$ and can be written as $T_1\,u_i(s)^2 + T_2\,u_i(s) + T_3 = 0$. Therefore, the optimal strategy of agent $i$ is
\[
u_i(s) = \frac{-T_2 \pm \sqrt{T_2^2 - 4\,T_1\,T_3}}{2\,T_1},
\]
where
\[
\begin{aligned}
T_1 ={}& 4\,e^{-2\sigma_i B_i(s)+\sigma_i^2 s}\big(d\lambda_i(s)\big)^2 > 0, \\
T_2 ={}& -2\Big[1 + 2\,x_i(s)\,e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\,d\lambda_i(s)\Big]\Big[\sum_{j\in\eta_i} w_{ij} + k_i \\
&- e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\,\alpha(s)\,\frac{1}{n}\sum_{j=1}^{n}\Big(\frac{d^2}{d(x_i)^2}\phi_\theta\big(\|x_i(s)-x_j(s)\|\big)\big(x_i(s)-x_j(s)\big) + 2\,\frac{d}{dx_i}\phi_\theta\big(\|x_i(s)-x_j(s)\|\big)\Big)\,d\lambda_i(s)\Big], \\
T_3 ={}& 4\,e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\,d\lambda_i(s)\Big[\sum_{j\in\eta_i} w_{ij}\big(x_i(s)-x_j(s)\big) + k_i\big(x_i(s)-x_0^i\big) + e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\,d\lambda_i(s) + e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\,\frac{d\lambda_i(s)}{ds} \\
&+ e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\Big(-\alpha(s) - \alpha(s)\frac{1}{n}\sum_{j=1}^{n}\Big(\frac{d}{dx_i}\phi_\theta\big(\|x_i(s)-x_j(s)\|\big)\big(x_i(s)-x_j(s)\big) + \phi_\theta\big(\|x_i(s)-x_j(s)\|\big)\Big)\Big)\,d\lambda_i(s)\Big],
\end{aligned}
\]
and
\[
\begin{aligned}
\frac{d}{dx_i}\phi_\theta\big(\|x_i(s)-x_j(s)\|\big) ={}& -0.02\,\theta_1\exp\Big\{-0.01\Big[1-\Big(\frac{x_i(s)-x_j(s)}{\theta_2}\Big)^2\Big]^{-1}\Big\}\Big[1-\Big(\frac{x_i(s)-x_j(s)}{\theta_2}\Big)^2\Big]^{-2}\,\frac{x_i(s)-x_j(s)}{\theta_2^2}, \\
\frac{d^2}{d(x_i)^2}\phi_\theta\big(\|x_i(s)-x_j(s)\|\big) ={}& -\frac{0.02\,\theta_1}{\theta_2^2}\exp\Big\{-0.01\Big[1-\Big(\frac{x_i(s)-x_j(s)}{\theta_2}\Big)^2\Big]^{-1}\Big\}\Big\{-0.02\Big[1-\Big(\frac{x_i(s)-x_j(s)}{\theta_2}\Big)^2\Big]^{-4}\Big(\frac{x_i(s)-x_j(s)}{\theta_2}\Big)^2 \\
&+ 4\Big[1-\Big(\frac{x_i(s)-x_j(s)}{\theta_2}\Big)^2\Big]^{-3}\Big(\frac{x_i(s)-x_j(s)}{\theta_2}\Big)^2 + \Big[1-\Big(\frac{x_i(s)-x_j(s)}{\theta_2}\Big)^2\Big]^{-2}\Big\}.
\end{aligned}
\]
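As a quick numerical sanity check on these expressions, the analytic first derivative of a compactly supported bump kernel of this type can be compared against a central finite difference; the kernel form below is an assumption chosen to be consistent with the constants $0.02$ and $0.01$ appearing above.

```python
import math

def phi(r, theta1=1.0, theta2=0.5):
    # Assumed bump kernel phi_theta(r) = theta1 * exp(-0.01 / (1 - (r/theta2)^2))
    # on |r| < theta2, and zero outside its support.
    z = (r / theta2) ** 2
    return theta1 * math.exp(-0.01 / (1.0 - z)) if z < 1.0 else 0.0

def dphi(r, theta1=1.0, theta2=0.5):
    # Analytic first derivative matching the displayed expression.
    z = (r / theta2) ** 2
    if z >= 1.0:
        return 0.0
    g = 1.0 / (1.0 - z)
    return -0.02 * theta1 * math.exp(-0.01 * g) * g ** 2 * r / theta2 ** 2

r0, h = 0.2, 1e-6
fd = (phi(r0 + h) - phi(r0 - h)) / (2 * h)   # central finite difference
print(abs(fd - dphi(r0)))                     # agreement to roughly 1e-9
```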
Figure 4 shows the time evolution of the optimal control strategy u i ( s ) for a representative agent. The strategy starts at a high level near u i ( 0 ) = 1 , reflecting strong initial adjustment efforts, and then monotonically decreases as time progresses. The decay indicates that agents exert less control over their opinions in the long run, as the system stabilizes and the marginal benefit of intervention diminishes.
Figure 5 illustrates how the optimal control strategy u i ( s ) varies as a function of the system parameters α and θ . Parameter α represents the decay rate of opinion inertia, while θ captures the intensity and effective range of interactions through the kernel ϕ θ . The figure demonstrates that higher values of α lead to stronger suppression of opinion deviations, thus requiring lower control intensity by the agents. Conversely, when α is small, the system exhibits slower natural convergence, and agents must apply stronger control (larger u i ( s ) ) to stabilize their opinions. Similarly, parameter θ regulates the sensitivity of opinion interactions: when θ is large, peer effects dominate, diminishing the necessity for active control. These results highlight the non-linear interplay between individual control efforts and collective opinion dynamics, showing that the optimal control strategy adapts endogenously to both internal resistance (governed by α ) and external interaction strength (captured by θ ).
It is important to clarify the admissible choice of root in (21). Since $T_1>0$ by construction, the quadratic form is strictly convex in $u_i(s)$. Hence, the minimizer of the quadratic cost is uniquely determined by the smaller root of (21), i.e.,
\[
u_i(s) = \frac{-T_2 - \sqrt{T_2^2 - 4\,T_1\,T_3}}{2\,T_1}.
\]
The larger root would correspond to a maximizer of the quadratic form and is therefore not admissible for an optimal control. Moreover, the discriminant condition $T_2^2-4T_1T_3\ge 0$ guarantees real-valued strategies. In the case $T_2^2-4T_1T_3=0$, both roots coincide and yield the same admissible control. Thus, under the standing assumption $T_1>0$ and the discriminant condition, the unique optimal strategy is given by the negative branch of the quadratic solution above.
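A minimal sketch of this root selection, assuming $T_1$, $T_2$, and $T_3$ have already been evaluated along a trajectory:

```python
import math

def admissible_control(T1, T2, T3):
    # Negative-branch root of T1*u^2 + T2*u + T3 = 0, which minimizes the
    # strictly convex quadratic when T1 > 0 and the discriminant is non-negative.
    if T1 <= 0.0:
        raise ValueError("T1 must be positive for strict convexity")
    disc = T2 ** 2 - 4.0 * T1 * T3
    if disc < 0.0:
        raise ValueError("negative discriminant: no real-valued strategy")
    return (-T2 - math.sqrt(disc)) / (2.0 * T1)

# Example: roots of u^2 - 3u + 2 are 1 and 2; the admissible branch returns 1.
print(admissible_control(1.0, -3.0, 2.0))
```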
Remark 13. 
From a dynamical perspective, the choice of the negative branch in (21) is also consistent with the stability of the controlled state process. Indeed, substituting the larger root into the dynamics amplifies the control magnitude and leads to trajectories that may diverge due to the exponential terms present in $T_1$, $T_2$, and $T_3$. In contrast, the smaller root ensures that the control remains bounded and that the associated quadratic cost functional is coercive. This observation is consistent with standard results in stochastic control, where admissible controls must not only minimize the cost functional but also guarantee well-posedness of the controlled dynamics [52,73].
Let us discuss how the optimal strategy of agent $i$ derived in Equation (21) varies with the difference in opinions between the $i$th and $j$th agents.
Case I.
Suppose there is no difference in opinions between agents $i$ and $j$; that is, $x_i(s)-x_j(s)=0$, which implies $\phi_\theta(\|x_i(s)-x_j(s)\|)=0$, and
\[
\begin{aligned}
T_1 &= 4\,e^{-2\sigma_i B_i(s)+\sigma_i^2 s}\big(d\lambda_i(s)\big)^2 \neq 0, \\
T_2 &= -2\Big(\sum_{j\in\eta_i} w_{ij} + k_i\Big)\Big[1 + 2\,x_i(s)\,e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\,d\lambda_i(s)\Big], \\
T_3 &= 4\,e^{-2\sigma_i B_i(s)+\sigma_i^2 s}\,d\lambda_i(s)\Big[k_i\big(x_i(s)-x_0^i\big)\,e^{\sigma_i B_i(s)-\frac{1}{2}\sigma_i^2 s} + d\lambda_i(s) + \frac{d\lambda_i(s)}{ds} - \alpha(s)\,d\lambda_i(s)\Big].
\end{aligned}
\]
Therefore,
\[
\begin{aligned}
\frac{\partial}{\partial x_i}u_i(s) ={}& 2\,(T_1)^{-1}\Bigg\{\Big(\sum_{j\in\eta_i} w_{ij}+k_i\Big)e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\,d\lambda_i(s) \pm 2\Big\{\Big(\sum_{j\in\eta_i} w_{ij}+k_i\Big)^2\Big[1+2\,x_i(s)\,e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\,d\lambda_i(s)\Big]^2 \\
&- 64\,e^{-4\sigma_i B_i(s)+2\sigma_i^2 s}\big(d\lambda_i(s)\big)^3\Big[k_i\big(x_i(s)-x_0^i\big)\,e^{\sigma_i B_i(s)-\frac{1}{2}\sigma_i^2 s}+d\lambda_i(s)+\frac{d\lambda_i(s)}{ds}-\alpha(s)\,d\lambda_i(s)\Big]\Big\}^{3/2} \\
&\times e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\Big[1+2\,x_i(s)\,e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\,d\lambda_i(s)\Big]\,64\,e^{-3\sigma_i B_i(s)+\frac{3}{2}\sigma_i^2 s}\big(d\lambda_i(s)\big)^3\Bigg\}.
\end{aligned}
\]
Since $T_1>0$, the sign of the above partial derivative depends on the terms on either side of the $\pm$ sign. Furthermore, as the optimal strategy cannot be negative and the term after $\pm$ in Equation (21) is a negatively dominant term, we ignore the $+$ sign. Moreover, assuming $w_{ij}=k_i=0$, Equation (22) yields
\[
\frac{\partial}{\partial x_i}u_i(s) = 2\Big\{64\,e^{-4\sigma_i B_i(s)+2\sigma_i^2 s}\big(d\lambda_i(s)\big)^3\Big[d\lambda_i(s)+\frac{d\lambda_i(s)}{ds}-\alpha(s)\,d\lambda_i(s)\Big]\Big\}^{3/2}\,e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\Big[1+2\,x_i(s)\,e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\,d\lambda_i(s)\Big]\,64\,e^{-3\sigma_i B_i(s)+\frac{3}{2}\sigma_i^2 s}\big(d\lambda_i(s)\big)^3 > 0.
\]
The above result holds for all positive values of $w_{ij}$ and $k_i$. This implies that an agent's opinion positively influences their optimal strategy.
Case II.
Suppose the opinion of agent $i$ is less influential than that of agent $j$, i.e., $x_i(s)-x_j(s)<0$. By construction, $\phi_\theta(\|x_i(s)-x_j(s)\|)=0$. The terms $T_1$ and $T_2$ take the same values as in Case I. The other term is
\[
T_3 = 4\,e^{-2\sigma_i B_i(s)+\sigma_i^2 s}\,d\lambda_i(s)\Big[\Big(\sum_{j\in\eta_i} w_{ij}\big(x_i(s)-x_j(s)\big)+k_i\big(x_i(s)-x_0^i\big)\Big)\,e^{\sigma_i B_i(s)-\frac{1}{2}\sigma_i^2 s} + d\lambda_i(s) + \frac{d\lambda_i(s)}{ds} - \alpha(s)\,d\lambda_i(s)\Big].
\]
Hence,
\[
\begin{aligned}
\frac{\partial}{\partial x_i}u_i(s) ={}& 2\,(T_1)^{-1}\Bigg\{\Big(\sum_{j\in\eta_i} w_{ij}+k_i\Big)e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\,d\lambda_i(s) \pm 2\Big\{\Big(\sum_{j\in\eta_i} w_{ij}+k_i\Big)^2\Big[1+2\,x_i(s)\,e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\,d\lambda_i(s)\Big]^2 \\
&- 64\,e^{-4\sigma_i B_i(s)+2\sigma_i^2 s}\big(d\lambda_i(s)\big)^3\Big[\Big(\sum_{j\in\eta_i} w_{ij}\big(x_i(s)-x_j(s)\big)+k_i\big(x_i(s)-x_0^i\big)\Big)\,e^{\sigma_i B_i(s)-\frac{1}{2}\sigma_i^2 s}+d\lambda_i(s)+\frac{d\lambda_i(s)}{ds}-\alpha(s)\,d\lambda_i(s)\Big]\Big\}^{3/2} \\
&\times e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\Big[1+2\,x_i(s)\,e^{-\sigma_i B_i(s)+\frac{1}{2}\sigma_i^2 s}\,d\lambda_i(s)\Big]\,64\,e^{-3\sigma_i B_i(s)+\frac{3}{2}\sigma_i^2 s}\big(d\lambda_i(s)\big)^3\Bigg\}.
\end{aligned}
\]
After assuming $w_{ij}=k_i=0$, Equation (25) becomes Equation (24). Therefore, $\frac{\partial}{\partial x_i}u_i(s)>0$. This implies that agent $i$'s opinion positively influences $u_i(s)$, even if agent $j$'s opinion is more influential in the society. Furthermore,
\[
\frac{\partial}{\partial x_j}u_i(s) = -4\Big[\sum_{j\in\eta_i} w_{ij}\big(x_j(s)-x_i(s)\big)\Big]^{3/2}\sum_{j\in\eta_i} w_{ij}\,e^{-\frac{3}{2}\sigma_i B_i(s)+\frac{3}{4}\sigma_i^2 s}\big(d\lambda_i(s)\big)^{3/2} < 0.
\]
The above equation shows a negative correlation between the $i$th agent's optimal strategy and the $j$th agent's opinion. This implies that as the opinion of the more influential $j$th agent becomes stronger, the $i$th agent becomes more hesitant to make a decision.

7. Corresponding HJB Approach

We begin with the controlled dynamics of agent $i$, given by Equation (17), and the associated cost functional described in Equation (1). Since the cost is quadratic in the state variables and the dynamics are affine in $x_i(s)$, it is natural to postulate a quadratic value function of the form
\[
V_i(s,x) = A_i(s)\,\big(x_i(s)\big)^2 + B_i(s)\,x_i(s) + C_i(s),
\]
where $A_i(s)$, $B_i(s)$, $C_i(s)$ are scalar time-dependent functions to be determined. Differentiating with respect to the state variable, one obtains
\[
\frac{\partial V_i}{\partial x_i}(s,x) = 2\,A_i(s)\,x_i(s) + B_i(s), \qquad \frac{\partial^2 V_i}{\partial (x_i)^2}(s,x) = 2\,A_i(s).
\]
The HJB equation involves the control-dependent term
\[
\Psi(u_i) = \Big[\frac{1}{2} + x_i(s)\big(2\,A_i(s)\,x_i(s) + B_i(s)\big)\Big]\big(u_i(s)\big)^2.
\]
If the prefactor satisfies $\frac{1}{2} + 2A_i(s)(x_i(s))^2 + B_i(s)x_i(s) > 0$, then the minimizer is attained at $u_i^*(s)=0$. This is consistent with the standard structure of linear-quadratic (LQ) control problems.
Further, substituting $u_i^*(s)=0$ into the HJB equation yields
\[
-\dot A_i(s)\,(x_i)^2 - \dot B_i(s)\,x_i - \dot C_i(s) = \frac{1}{2}\Big[\sum_{j\in\eta_i} w_{ij}(x_i-x_j)^2 + k_i(x_i-x_0^i)^2\Big] + \Big[-\alpha(s)\,x_i - \alpha(s)\frac{1}{n}\sum_{j=1}^{n}\phi_\theta(x_i-x_j)(x_i-x_j)\Big]\big(2A_i(s)\,x_i + B_i(s)\big) + (\sigma_i x_i)^2 A_i(s).
\]
The right-hand side can now be decomposed into contributions from the quadratic running cost, the drift, and the diffusion. Expanding the running cost gives
\[
\frac{1}{2}\sum_{j\in\eta_i} w_{ij}\big(x_i(s)-x_j(s)\big)^2 = \frac{1}{2}\sum_{j\in\eta_i} w_{ij}\Big[\big(x_i(s)\big)^2 - 2\,x_i(s)\,x_j(s) + \big(x_j(s)\big)^2\Big],
\]
\[
\frac{1}{2}k_i\big(x_i(s)-x_0^i\big)^2 = \frac{1}{2}k_i\Big[\big(x_i(s)\big)^2 - 2\,x_i(s)\,x_0^i + \big(x_0^i\big)^2\Big],
\]
so that the quadratic terms collect as
\[
\frac{1}{2}\Big(\sum_{j\in\eta_i} w_{ij} + k_i\Big)\big(x_i(s)\big)^2 - \Big(\sum_{j\in\eta_i} w_{ij}\,x_j(s) + k_i\,x_0^i\Big)\,x_i(s) + \text{const}.
\]
The drift term involving $-\alpha(s)x_i$ contributes
\[
\big(-\alpha(s)\,x_i(s)\big)\big(2\,A_i(s)\,x_i + B_i(s)\big) = -2\,\alpha(s)\,A_i(s)\,\big(x_i(s)\big)^2 - \alpha(s)\,B_i(s)\,x_i(s),
\]
while the interaction part
\[
-\alpha(s)\,\frac{1}{n}\sum_{j=1}^{n}\phi_\theta\big(x_i(s)-x_j(s)\big)\big(x_i(s)-x_j(s)\big)\big(2\,A_i(s)\,x_i(s) + B_i(s)\big)
\]
remains non-linear in $x_i(s)$, unless $\phi_\theta$ is approximated by a linearization around equilibrium. Finally, the diffusion term yields
\[
\big(\sigma_i\,x_i(s)\big)^2 A_i(s) = \sigma_i^2\,A_i(s)\,\big(x_i(s)\big)^2.
\]
Equating coefficients of powers of $x_i(s)$ on both sides of (28), one obtains the coupled Riccati-type ordinary differential equations
\[
-\dot A_i(s) = \frac{1}{2}\Big(\sum_{j\in\eta_i} w_{ij} + k_i\Big) - 2\,\alpha(s)\,A_i(s) + \sigma_i^2\,A_i(s) + (\text{interaction terms}),
\]
\[
-\dot B_i(s) = -\Big(\sum_{j\in\eta_i} w_{ij}\,x_j(s) + k_i\,x_0^i\Big) - \alpha(s)\,B_i(s) + (\text{interaction terms}),
\]
\[
-\dot C_i(s) = \frac{1}{2}\Big(\sum_{j\in\eta_i} w_{ij}\big(x_j(s)\big)^2 + k_i\big(x_0^i\big)^2\Big) + (\text{interaction terms}),
\]
with terminal conditions
\[
A_i(t) = 0, \qquad B_i(t) = 0, \qquad C_i(t) = 0.
\]
Thus, the value function of agent i admits the explicit quadratic representation
\[
V_i(s,x) = A_i(s)\,\big(x_i(s)\big)^2 + B_i(s)\,x_i(s) + C_i(s),
\]
where $(A_i,B_i,C_i)$ solve the Riccati system (29)–(31). The optimal control is then given by
\[
u_i^*(s) = 0, \quad \text{whenever} \quad \frac{1}{2} + 2\,A_i(s)\big(x_i(s)\big)^2 + B_i(s)\,x_i(s) > 0.
\]
This completes the construction of the explicit solution.
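As an illustration, the Riccati system can be integrated backward from the terminal conditions with an explicit Euler sweep. This is a sketch under the simplifying assumption that the non-linear interaction terms are dropped and the neighbor average is frozen at a constant; all parameter values are hypothetical.

```python
def solve_riccati(t=1.0, steps=1000, alpha=1.0, sigma=0.1,
                  w_sum=1.0, k=0.5, xbar=0.5, x0=0.5):
    # Backward Euler sweep for (29)-(31) with interaction terms dropped and
    # the neighbor average frozen at xbar (a linearized sketch).
    dt = t / steps
    A = B = C = 0.0                      # terminal conditions A(t)=B(t)=C(t)=0
    for _ in range(steps):
        dA = 0.5 * (w_sum + k) - 2.0 * alpha * A + sigma ** 2 * A
        dB = -(w_sum * xbar + k * x0) - alpha * B
        dC = 0.5 * (w_sum * xbar ** 2 + k * x0 ** 2)
        # Since -A'(s) = dA, stepping backward in time adds dA * dt, etc.
        A, B, C = A + dA * dt, B + dB * dt, C + dC * dt
    return A, B, C

print(solve_riccati())
```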
Remark 14. 
It is important to note that the Feynman path integral avoids solving non-linear PDEs (like the HJB equation), which are computationally expensive and high-dimensional. Instead, it reformulates the control problem into a path-integral expectation, allowing direct sampling and Monte Carlo evaluation. This makes it well-suited for McKean–Vlasov systems where the law dependence complicates the PDE approach. In contrast, Pontryagin’s maximum principle requires solving coupled forward-backward SDEs, which can be unstable and difficult to compute in large-scale opinion dynamics. Path-integral control turns the optimization into a reweighting of trajectories, which can be both numerically stable and parallelizable.
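The reweighting idea in Remark 14 can be sketched in a few lines: sample uncontrolled trajectories, weight each by the exponentiated negative cost, and average the initial noise increments. The quadratic running cost and temperature $\lambda$ below are illustrative assumptions rather than the paper's exact cost functional.

```python
import numpy as np

def path_integral_first_control(x0=0.6, steps=100, t=1.0, alpha=1.0,
                                sigma=0.1, n_paths=5000, lam=1.0, seed=0):
    # Monte Carlo path-integral control: weight uncontrolled trajectories by
    # exp(-cost/lam) and return the reweighted average of the first noise step.
    rng = np.random.default_rng(seed)
    dt = t / steps
    noise = rng.normal(size=(n_paths, steps)) * np.sqrt(dt)
    x = np.full(n_paths, x0)
    cost = np.zeros(n_paths)
    for k in range(steps):
        cost += 0.5 * x ** 2 * dt                # illustrative running cost
        x = x - alpha * x * dt + sigma * x * noise[:, k]
    w = np.exp(-(cost - cost.min()) / lam)       # stabilized importance weights
    w /= w.sum()
    return (w * noise[:, 0]).sum() / dt          # estimated first-step control

print(path_integral_first_control())
```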

8. Numerical Experiments

In this section, we illustrate the effectiveness of the proposed optimal control strategy derived in Equation (21) for the McKean–Vlasov SDE (Equation (17)). We conduct simulations under varying network topologies and different initial conditions and compare our strategy to benchmark control laws from the literature, such as MFG-based strategies and classical consensus protocols.

8.1. Experimental Setup

We simulate a system of $n\in\{20,100,200,500\}$ agents, where each agent's opinion $x_i(s)$ evolves according to the controlled stochastic differential Equation (17). The control $u_i(s)$ is implemented using the closed-form solution (21). For comparison, we also consider the following:
  • Consensus baseline: a Friedkin–Johnsen type update without optimization, i.e., fixed weights and no feedback control;
  • MFG strategy: a feedback Nash equilibrium control derived from the HJB equation of the mean field limit [18,20,37].
The interaction kernel $\phi_\theta(\cdot)$ is parameterized as in Equation (14) with scale $\theta_1=1$ and range $\theta_2=0.5$. The time horizon is rescaled to $[0,1]$, and initial opinions are drawn from one of the following:
1.
Clustered initialization: two polarized groups at 0.25 and 0.75;
2.
Uniform initialization: independent and identically distributed (i.i.d.) opinions from a uniform distribution, Unif [ 0 , 1 ] ;
3.
Gaussian initialization: opinions are drawn from a normal distribution, $\mathcal{N}(0.5,0.1^2)$.
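As a sketch, the three initializations can be drawn as follows (clipping the Gaussian draw to $[0,1]$ is an added assumption to keep opinions in the unit interval):

```python
import numpy as np

def initial_opinions(kind, n=100, seed=0):
    # Draw initial opinions for the three initializations listed above.
    rng = np.random.default_rng(seed)
    if kind == "clustered":    # two polarized groups at 0.25 and 0.75
        half = n // 2
        return np.concatenate([np.full(half, 0.25), np.full(n - half, 0.75)])
    if kind == "uniform":      # i.i.d. Unif[0, 1]
        return rng.uniform(0.0, 1.0, size=n)
    if kind == "gaussian":     # N(0.5, 0.1^2), clipped to [0, 1]
        return np.clip(rng.normal(0.5, 0.1, size=n), 0.0, 1.0)
    raise ValueError(f"unknown initialization: {kind}")
```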

8.2. Network Configurations

We analyze the role of network topology by considering the following:
1.
A complete network (fully connected);
2.
An Erdős–Rényi random graph with link probability p = 0.2 ;
3.
A Watts–Strogatz small-world network with rewiring probability 0.1 ;
4.
A Barabási–Albert scale-free network.
In each case, the adjacency matrix defines the neighborhood set $\eta_i$ used in the cost functional (1).
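A sketch of the four topologies using networkx is given below; the small-world degree $k=4$ and the scale-free attachment parameter $m=2$ are illustrative choices not specified in the text.

```python
import networkx as nx

n, seed = 100, 0
networks = {
    "complete": nx.complete_graph(n),
    "erdos_renyi": nx.erdos_renyi_graph(n, p=0.2, seed=seed),
    "watts_strogatz": nx.watts_strogatz_graph(n, k=4, p=0.1, seed=seed),
    "barabasi_albert": nx.barabasi_albert_graph(n, m=2, seed=seed),
}

for name, G in networks.items():
    eta = {i: set(G.neighbors(i)) for i in G.nodes}   # neighborhood sets eta_i
    print(name, "average degree:", 2 * G.number_of_edges() / n)
```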
Figure 6 displays the time evolution of opinions for $n\in\{20,100,200,500\}$ in the fully connected setting. As the population grows, the controlled system stabilizes faster and opinion fluctuations diminish due to the law of large numbers in the McKean–Vlasov dynamics. In Figure 7, we compare the proposed control against the MFG strategy and the classical consensus protocol under the same initialization. Finally, Table 1 summarizes the performance comparison. We report the average cost functional (1), the time to $\epsilon$-consensus ($\epsilon=10^{-2}$), and the average control effort $\int_0^t (u_i(s))^2\,ds$. The proposed strategy consistently achieves lower cost and faster convergence compared to both benchmarks, while requiring moderate control effort.
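The two reported diagnostics can be computed from simulated opinion and control paths as follows (a sketch assuming arrays of shape (timesteps, n)):

```python
import numpy as np

def time_to_eps_consensus(X, times, eps=1e-2):
    # First time after which the spread max_i |x_i - mean| stays below eps.
    spread = np.abs(X - X.mean(axis=1, keepdims=True)).max(axis=1)
    below = spread < eps
    for idx in range(len(times)):
        if below[idx:].all():
            return times[idx]
    return np.inf

def average_control_effort(U, times):
    # Trapezoidal approximation of int_0^t u_i(s)^2 ds, averaged over agents.
    return np.trapz(U ** 2, times, axis=0).mean()
```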
Moreover, Figure 6 illustrates the dynamic evolution of opinions under the McKean–Vlasov SDE derived from Equation (17) for different population sizes: $n=20,100,200,500$. Each trajectory represents the opinion path of an individual agent, standardized to the interval $[0,1]$ over time. As $n$ increases, the distribution of opinion trajectories becomes more structured and exhibits reduced fluctuations, reflecting the law of large numbers in the mean-field limit. For small populations ($n=20$), the system displays higher variance and localized clustering, whereas for larger $n$ ($n=500$), opinions evolve more smoothly and align towards stable collective patterns. This demonstrates how the interaction kernel $\phi_\theta$ and the averaging effect in large-scale networks drive the system toward mean-field consensus-like behavior, despite the presence of stochastic perturbations.
Figure 7 presents a comparative visualization of the dynamic evolution of opinions under three distinct control paradigms: the proposed optimal strategy (path integral control), the MFG equilibrium strategy, and the classical consensus model. Each panel depicts the trajectories of n = 100 agents’ opinions, normalized to lie within the unit interval [ 0 , 1 ] over a standardized time horizon [ 0 , 1 ] . The left panel corresponds to the proposed optimal strategy derived from Equation (21), where agents dynamically adjust their control inputs to minimize individual costs while accounting for network interactions. In this case, opinions evolve in a relatively smooth manner, displaying gradual alignment and reduced dispersion as time progresses, which highlights the stabilizing effect of the optimized control. The middle panel illustrates the MFG equilibrium strategy, in which agents adopt best-response dynamics consistent with the mean field distribution of the population. Here, opinions demonstrate slower convergence compared to the proposed strategy, and greater heterogeneity persists throughout the time horizon, reflecting the decentralized and equilibrium nature of the MFG solution. The right panel corresponds to the classical consensus model, where agents update their states purely based on averaging dynamics without optimal control intervention. While some partial clustering is observed, consensus is not fully achieved, and significant polarization remains, with subgroups of opinions diverging rather than converging toward a common value.
These results confirm that the proposed Feynman-type path integral control framework is not only analytically tractable but also effective in steering the collective dynamics toward stable consensus at reduced cost, outperforming both benchmarks: the proposed strategy accelerates convergence and reduces opinion dispersion more effectively than either the MFG equilibrium or the classical consensus dynamics. Embedding optimal control into the McKean–Vlasov opinion dynamics thus substantially improves coordination across agents, yielding faster consensus formation while maintaining robustness against stochastic fluctuations. In contrast, the MFG solution represents a less centralized, equilibrium-driven adjustment that slows convergence, and the classical consensus scheme fails to eliminate long-run polarization. This underscores the effectiveness of the proposed approach relative to existing methods in both the control theory and opinion dynamics literature.
Furthermore, to assess the robustness of the proposed strategy, we conducted a sensitivity analysis of the peer influence weights $w_{ij}$. The results (Figure 8) show that increasing peer influence accelerates convergence and lowers the cost functional up to a saturation point, after which the benefit diminishes. We also compared the path integral approach with conventional stochastic control methods (HJB, Pontryagin). In small-scale systems, all methods yield similar strategies; however, the path integral approach substantially reduces computational cost and scales better with agent population size. These results confirm the dual advantage of the proposed method: robustness to parameter variations and computational tractability.

9. Extension to Heterogeneous Agents

The baseline formulation in (17) assumes a homogeneous population in which all agents interact through identical influence kernels and are subject to identical noise characteristics. While this assumption facilitates tractable analysis, it restricts the applicability of the model to real-world social networks, where individuals typically differ in influence, stubbornness, and susceptibility to randomness. To address this limitation, we extend the McKean–Vlasov opinion dynamics to incorporate heterogeneous agents. Specifically, we consider the dynamics
\[
dx_i(s) = -\alpha(s)\,\kappa_i\big(x_i(s)-x_0^i\big)\,ds - \alpha(s)\,\frac{1}{n}\sum_{j=1}^{n}w_{ij}\,\phi_\theta\big(x_i(s)-x_j(s)\big)\big(x_i(s)-x_j(s)\big)\,ds + x_i(s)\big(u_i(s)\big)^2\,ds + \sigma_i\,x_i(s)\,dB_i(s).
\]
In this extended framework, three key sources of heterogeneity are introduced. First, the coefficients $w_{ij}\ge 0$ encode asymmetric influence between agents, allowing some individuals (e.g., leaders, experts, or influencers) to exert disproportionately large effects on others' opinions. Second, the parameter $\kappa_i\ge 0$ represents the stubbornness of agent $i$, which models its resistance to deviating from its initial opinion $x_0^i$. Agents with larger $\kappa_i$ values act as opinion anchors and can prevent full consensus, consistent with empirical findings in social influence research [1,72]. Finally, heterogeneous noise levels are incorporated through $\sigma_i$, which accounts for individual variability in susceptibility to random shocks or external perturbations. In particular, low-noise agents may represent stable opinion leaders, while high-noise agents correspond to more volatile or uncertain individuals.
This heterogeneous formulation enables the model to reproduce a richer set of behaviors observed in empirical social networks. For instance, stubborn agents can maintain persistent polarization even under strong interaction, while asymmetric influence weights $w_{ij}$ can generate leader–follower dynamics. Similarly, heterogeneous noise terms $\sigma_i$ can amplify opinion diversity by broadening the distribution of final opinion states. By incorporating these mechanisms, the model strengthens its descriptive realism and enhances its ability to capture social phenomena such as echo chambers, polarization, and the disproportionate influence of elites.
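The following sketch simulates the heterogeneous dynamics above with one high-influence leader, a stubborn minority, and a high-noise subgroup; the smooth Gaussian kernel and all parameter values are illustrative stand-ins, and the control is set to zero for simplicity.

```python
import numpy as np

def simulate_heterogeneous(n=50, steps=1000, t=1.0, alpha=1.0, seed=0):
    rng = np.random.default_rng(seed)
    dt = t / steps
    x0 = rng.uniform(0.0, 1.0, size=n)
    x = x0.copy()
    W = np.ones((n, n)) / n
    W[:, 0] = 10.0 / n                           # agent 0: high-influence leader
    kappa = np.ones(n); kappa[1:6] = 10.0        # agents 1-5: stubborn minority
    sigma = np.full(n, 0.05); sigma[-5:] = 0.5   # last 5 agents: volatile subgroup
    for _ in range(steps):
        diff = x[:, None] - x[None, :]
        kernel = np.exp(-(diff / 0.5) ** 2)      # smooth stand-in for phi_theta
        interaction = (W * kernel * diff).sum(axis=1)
        drift = -alpha * kappa * (x - x0) - alpha * interaction
        x = x + drift * dt + sigma * x * rng.normal(size=n) * np.sqrt(dt)
    return x

print(simulate_heterogeneous().round(2))
```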

Numerical Results

In this section, we complement the theoretical formulation with simulations under heterogeneous settings. These experiments illustrate how the proposed optimal control strategy adapts when agents differ in influence, stubbornness, and noise, and allow a comparison with alternative strategies such as mean field game equilibria and classical consensus models.
In Figure 9, we incorporate heterogeneity into the opinion dynamics framework and illustrate the resulting effects for a system of $n=50$ agents. In the leader–follower network configuration, one agent is assigned disproportionately large influence, and the trajectories reveal that opinions are drawn toward this leader's stance, producing a centralized convergence pattern in contrast to the homogeneous baseline. In the stubborn minority scenario, a subset of 5 of the 50 agents is endowed with higher stubbornness parameters $\kappa_i$, reflecting resistance to change; here the majority of agents drift toward a consensus band, but the stubborn subgroup maintains divergent positions, leading to partial polarization and preventing full consensus. Finally, when a subgroup of agents is subjected to significantly larger volatility levels $\sigma_i$, modeling erratic or unpredictable behaviors, the trajectories exhibit persistent fluctuations and slower stabilization, with noticeable dispersion relative to the low-noise majority. These numerical experiments demonstrate that the proposed control framework accommodates heterogeneity in influence weights, stubbornness levels, and noise intensities, and they highlight how such features fundamentally alter convergence patterns in social networks compared to the idealized uniform case.

10. Conclusions

In this paper, we have investigated the estimation problem of an optimal strategy for an agent within a stochastic McKean–Vlasov opinion dynamics framework and its related system of weakly interacting particles. By employing a Feynman-type path integral approach with an integrating factor, we identify an optimal feedback strategy $u_i^*(s)$ for a social cost function governed by a stochastic McKean–Vlasov equation. Utilizing a variant of the Friedkin–Johnsen-type model, we derive a closed-form solution for agent $i$'s optimal strategy. We analytically examine the effects of $x_i(s)$ and $x_j(s)$ on agent $i$'s optimal strategy when $x_i(s)=x_j(s)$ and when $x_i(s)<x_j(s)$. Due to the complexity of the terms involved, we cannot conclusively determine the impact of $x_i(s)$ and $x_j(s)$ on $u_i^*(s)$ in general. However, we find that the optimal strategy of agent $i$ is positively correlated with their own opinion, irrespective of agent $j$'s influence.
We also developed a novel analytical and computational framework for the control of stochastic opinion dynamics modeled through McKean–Vlasov SDEs. Building upon the modified Friedkin–Johnsen-type structure, we introduced a Feynman-type path integral formulation of the control problem, which enables us to bypass the curse of dimensionality inherent in conventional approaches such as HJB equations. Specifically, we derived the explicit form of the optimal control strategy for each agent, expressed as the solution of a quadratic equation, and rigorously clarified the admissible root selection conditions under convexity and discriminant constraints. The proposed strategy minimizes a quadratic cost functional that balances peer influence, individual stubbornness, and control effort, while remaining consistent with the stochastic dynamics of agents' opinions. The path integral approach further provides a tractable and scalable method for evaluating the optimal strategy, offering both computational efficiency and analytical transparency compared to traditional stochastic control methods. We also examined the relationship between Nash equilibria in finite-agent settings and MFG equilibria, showing that the proposed control law converges to the MFG solution as the number of agents grows large. To ensure practical relevance, we relaxed the assumption of agent homogeneity by incorporating heterogeneous influence weights, stubbornness levels, and noise intensities, and demonstrated that the control framework remains functional under such heterogeneity while producing distinctive phenomena such as leader–follower effects, stubborn minorities, and persistent fluctuations in high-noise subgroups. Through numerical experiments and sensitivity analyses, we illustrated the dynamic evolution of opinions under varying network configurations and parameters, confirming that stronger peer influence accelerates consensus up to a saturation threshold. Comparative results highlighted that, although conventional HJB equation-based methods perform adequately in small-scale systems, path integral control is more efficient and scalable in large-scale networks. Taken together, our results establish a unified framework that not only advances the theoretical understanding of controlled opinion dynamics but also provides practical computational tools for analyzing consensus, polarization, and control in large and heterogeneous social networks.
Future work may extend this framework to adaptive network topologies, incorporate learning-based strategies for real-time control, and explore empirical calibration of model parameters in applications ranging from online social media dynamics to collective decision-making systems.
The optimal control strategy $u_i^*(s)$ obtained in (21) should be understood as the Nash equilibrium strategy in the finite-agent stochastic differential game defined by (17). A natural question concerns its relation to the MFG equilibrium. Under standard regularity conditions, including Lipschitz continuity of the drift and diffusion coefficients, convexity of the cost functional, and exchangeability of the agents, the sequence of Nash equilibria of the $n$-player game converges to the unique MFG equilibrium as $n\to\infty$; see, e.g., refs. [19,20,37]. Thus, the explicit strategy derived here may be interpreted not only as a finite-agent Nash equilibrium but also as an approximation to the MFG solution when the agent population is sufficiently large. This interpretation provides an important bridge between the finite stochastic control formulation used in this paper and the classical MFG framework.
For future research, extending our results to scenarios where the diffusion coefficient is unknown and conducting numerical analyses under various social network structures would be valuable. This extension is intriguing because, for a wide range of McKean–Vlasov SDEs, the uniqueness (or non-uniqueness) of the invariant measure(s) is influenced by the magnitude of the noise coefficient. Additionally, extending this problem to fractional McKean–Vlasov opinion dynamics could allow the influence of $x_i(s)$ and $x_j(s)$ on $u_i^*(s)$ to be observed more accurately.

Funding

This research received no external funding.

Data Availability Statement

No new data were created or analyzed in this study.

Conflicts of Interest

The author declares that he has no conflicts of interest.

Appendix A

Appendix A.1. Proof of Lemma 2

Let $G$ be a Polish space with metric $d_G$, and let $\mathcal{P}_2(G)$ be the space of Borel probability measures on $G$ with finite second moment. For $\gamma,\gamma'\in\mathcal{P}_2(G)$, the 2-Wasserstein distance is defined as
\[
W_2(\gamma,\gamma')^2 := \inf_{\pi\in\Gamma(\gamma,\gamma')}\int_{G\times G} d_G(x,y)^2\,\pi(dx,dy),
\]
where $\Gamma(\gamma,\gamma')$ denotes the set of couplings of $\gamma$ and $\gamma'$.
Fix $\pi\in\mathcal{P}_2(G\times G)$ such that $\gamma,\gamma'$ are its marginals. By the Lebesgue dominated convergence theorem, we obtain
\[
W_2(\gamma_{s_1},\gamma_{s_2})^2 \le \int \big|x_i(s_1,\omega)-x_i(s_2,\omega)\big|^2\,\pi(d\omega)
\]
for all $s_1,s_2\in[0,t]$, $\omega\in G$. This implies that $s\mapsto\gamma_s\in\mathcal{P}_2(G)$ is continuous in the Wasserstein topology.
For a given initial state $x_0^i\in G$, replacing $\mathcal{P}(x_i)$ by $\gamma$ in Equation (2) gives
\[
dx_i(s) = \mu_i\big(s,x_i(s),\gamma,u_i(s)\big)\,ds + \sigma_i\big(s,x_i(s),\gamma,u_i(s)\big)\,dB_i(s),
\]
with random coefficients $\mu_i,\sigma_i$. By Theorem 1.2 of [59], (A2) admits a unique strong solution, $x_\pi^i=\{x_\pi^i(s)\}_{s=0}^{t}$. Its law $\mathcal{P}(x_\pi^i)$ belongs to $\mathcal{P}_2(G)$.
Define a mapping
\[
\Xi : \mathcal{P}_2(G)\to\mathcal{P}_2(G), \qquad \Xi(\pi) := \mathcal{P}(x_\pi^i).
\]
A process $x_i=\{x_i(s)\}_{s=0}^{t}$ with $\mathbb{E}\sup_{s\in[0,t]}|x_i(s)|^2<\infty$ is a solution of (A2) if and only if $\mathcal{P}(x_i)$ is a fixed point of $\Xi$.
Fix $\pi,\pi'\in\mathcal{P}_2(G)$. Since $x_\pi^i$ and $x_{\pi'}^i$ share the same initial condition $x_0^i$, Doob's maximal inequality together with Assumption 2 yields
\[
\begin{aligned}
\mathbb{E}\sup_{0\le s_1\le s_2}\big|x_\pi^i(s_1)-x_{\pi'}^i(s_1)\big|^2 \le{}& 2\,\mathbb{E}\sup_{0\le s_1\le s_2}\Big|\int_0^{s_1}\Big[\mu_i\big(\nu,x_\pi^i(\nu),\pi_\nu,u_\pi^i(\nu)\big)-\mu_i\big(\nu,x_{\pi'}^i(\nu),\pi'_\nu,u_{\pi'}^i(\nu)\big)\Big]\,d\nu\Big|^2 \\
&+ 2\,\mathbb{E}\sup_{0\le s_1\le s_2}\Big|\int_0^{s_1}\Big[\sigma_i\big(\nu,x_\pi^i(\nu),\pi_\nu,u_\pi^i(\nu)\big)-\sigma_i\big(\nu,x_{\pi'}^i(\nu),\pi'_\nu,u_{\pi'}^i(\nu)\big)\Big]\,dB_i(\nu)\Big|^2.
\end{aligned}
\]
Using the Lipschitz bounds from Assumption 2, we collect constants into $\hat c>0$ and obtain
\[
\mathbb{E}\sup_{0\le s_1\le s_2}\big|x_\pi^i(s_1)-x_{\pi'}^i(s_1)\big|^2 \le \hat c\,(1+t)\Big[\int_0^{s_2}\mathbb{E}\sup_{\nu\in[0,s_1]}\big|x_\pi^i(\nu)-x_{\pi'}^i(\nu)\big|^2\,ds_1 + \int_0^{s_2}\mathbb{E}\sup_{\nu\in[0,s_1]}\big|u_\pi^i(\nu)-u_{\pi'}^i(\nu)\big|^2\,ds_1 + \int_0^{s_2}W_2(\pi_{s_1},\pi'_{s_1})^2\,ds_1\Big].
\]
Applying the Gronwall–Bellman inequality to (A6) gives
\[
\mathbb{E}\sup_{0\le s_1\le s_2}\big|x_\pi^i(s_1)-x_{\pi'}^i(s_1)\big|^2 \le \hat c\,t\,e^{\hat c t}\int_0^{s_2}W_2(\pi_{s_1},\pi'_{s_1})^2\,ds_1.
\]
Since
\[
W_2\big(\Xi(\pi),\Xi(\pi')\big)^2 \le \mathbb{E}\sup_{0\le s_1\le s_2}\big|x_\pi^i(s_1)-x_{\pi'}^i(s_1)\big|^2,
\]
and
\[
W_2(\pi_{s_1},\pi'_{s_1}) \le W_2(\pi,\pi'),
\]
Equation (A7) yields
\[
W_2\big(\Xi(\pi),\Xi(\pi')\big)^2 \le \hat c\,t\,e^{\hat c t}\int_0^{s_2}W_2(\pi,\pi')^2\,ds_1.
\]
Iterating (A10) $\rho$ times, and denoting by $\Xi^\rho$ the $\rho$-fold composition of $\Xi$, we obtain
\[
W_2\big(\Xi^\rho(\pi),\Xi^\rho(\pi')\big)^2 \le \frac{(\hat c\,t)^\rho}{\rho!}\,W_2(\pi,\pi')^2.
\]
For sufficiently large ρ , the right-hand side of (A11) becomes strictly contractive. Hence, Ξ admits a unique fixed point in P 2 ( G ) .
This establishes the existence and uniqueness of a solution to (A2), completing the proof. □
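The contraction argument can be illustrated numerically by a crude Picard iteration that tracks only the mean of the law: freeze a flow of means, simulate the particle system against it, and re-insert the simulated mean flow. The linear attraction toward the frozen mean is a stand-in for the kernel interaction, so this is an illustration of the fixed-point mechanism rather than the proof's exact setting.

```python
import numpy as np

def picard_iteration(n_particles=2000, steps=200, t=1.0, alpha=1.0,
                     sigma=0.1, iters=6, seed=0):
    rng = np.random.default_rng(seed)
    dt = t / steps
    x0 = rng.uniform(0.0, 1.0, size=n_particles)
    mean_flow = np.full(steps, x0.mean())        # initial guess for the law's mean
    for _ in range(iters):
        x = x0.copy()
        means = np.empty(steps)
        for k in range(steps):
            drift = -alpha * x - alpha * (x - mean_flow[k])  # frozen-measure drift
            x = x + drift * dt + sigma * x * rng.normal(size=n_particles) * np.sqrt(dt)
            means[k] = x.mean()
        print(f"sup-distance between successive flows: "
              f"{np.max(np.abs(means - mean_flow)):.2e}")
        mean_flow = means                        # one application of Xi

picard_iteration()
```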

Appendix A.2. Proof of Lemma 3

For agent $i$, define the perturbation
\[
V_i^\epsilon(s) := \frac{1}{\epsilon}\big[x_i^\epsilon(s)-x_i(s)\big]-V_i(s), \qquad V_i^\epsilon(0)=0.
\]
Hence,
\[
x_i^\epsilon(s) = x_i(s) + \epsilon\big[V_i(s)+V_i^\epsilon(s)\big].
\]
Substituting (A12) into the controlled dynamics yields
\[
\begin{aligned}
dV_i^\epsilon(s) ={}& \Big\{\frac{1}{\epsilon}\Big[\mu_i\big(s,x_i^\epsilon(s),\mathcal{P}(x_i^\epsilon),u_i^\epsilon(s)\big)-\mu_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\Big] - \partial_{x_i}\mu_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)V_i(s) \\
&\quad - \zeta\big(s,\mathcal{P}(x_i,V_i)\big) - \partial_{u_i}\mu_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\Big\}\,ds \\
&+ \Big\{\frac{1}{\epsilon}\Big[\sigma_i\big(s,x_i^\epsilon(s),\mathcal{P}(x_i^\epsilon),u_i^\epsilon(s)\big)-\sigma_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\Big] - \partial_{x_i}\sigma_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)V_i(s) \\
&\quad - \hat\zeta\big(s,\mathcal{P}(x_i,V_i)\big) - \partial_{u_i}\sigma_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\Big\}\,dB_i(s),
\end{aligned}
\]
where the expectation operators are defined by
\[
\zeta(\cdot) = \tilde{\mathbb{E}}\Big[\partial_\gamma\mu_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\big(\hat x_i(s)\big)\cdot\hat V_i(s)\,\Big|\,x_i=x_i(s),\,u_i=u_i(s)\Big],
\]
and
\[
\hat\zeta(\cdot) = \tilde{\mathbb{E}}\Big[\partial_\gamma\sigma_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\big(\hat x_i(s)\big)\cdot\hat V_i(s)\,\Big|\,x_i=x_i(s),\,u_i=u_i(s)\Big].
\]
After inserting the values of $\zeta(\cdot)$ and $\hat\zeta(\cdot)$, Equation (A13) yields
\[
\begin{aligned}
dV_i^\epsilon(s) ={}& \Big\{\frac{1}{\epsilon}\Big[\mu_i\big(s,x_i^\epsilon(s),\mathcal{P}(x_i^\epsilon),u_i^\epsilon(s)\big)-\mu_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\Big] - \partial_{x_i}\mu_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)V_i(s) \\
&\quad - \tilde{\mathbb{E}}\Big[\partial_\gamma\mu_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\big(\hat x_i(s)\big)\cdot\hat V_i(s)\,\Big|\,x_i=x_i(s),u_i=u_i(s)\Big] - \partial_{u_i}\mu_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\Big\}\,ds \\
&+ \Big\{\frac{1}{\epsilon}\Big[\sigma_i\big(s,x_i^\epsilon(s),\mathcal{P}(x_i^\epsilon),u_i^\epsilon(s)\big)-\sigma_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\Big] - \partial_{x_i}\sigma_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)V_i(s) \\
&\quad - \tilde{\mathbb{E}}\Big[\partial_\gamma\sigma_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\big(\hat x_i(s)\big)\cdot\hat V_i(s)\,\Big|\,x_i=x_i(s),u_i=u_i(s)\Big] - \partial_{u_i}\sigma_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\Big\}\,dB_i(s) \\
=:{}& \mathcal{A}\,ds + \mathcal{B}\,dB_i(s).
\end{aligned}
\]
For each continuous time point $s\in[0,t]$ there exists an $\epsilon>0$ small enough such that
\[
\begin{aligned}
&\frac{1}{\epsilon}\Big[\mu_i\big(s,x_i^\epsilon(s),\mathcal{P}(x_i^\epsilon),u_i^\epsilon(s)\big)-\mu_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\Big] \\
&= \frac{1}{\epsilon}\Big[\mu_i\Big(s,\,x_i(s)+\epsilon\big[V_i(s)+V_i^\epsilon(s)\big],\,\mathcal{P}\big(x_i(s)+\epsilon[V_i(s)+V_i^\epsilon(s)]\big),\,u_i(s)+\epsilon v_i(s)\Big)-\mu_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\Big] \\
&= \int_0^1 \partial_{x_i}\mu_i\Big(s,\,x_i(s)+\epsilon\beta\big[V_i(s)+V_i^\epsilon(s)\big],\,\mathcal{P}\big(x_i(s)+\epsilon\beta[V_i(s)+V_i^\epsilon(s)]\big),\,u_i(s)+\epsilon\beta v_i(s)\Big)\big[V_i(s)+V_i^\epsilon(s)\big]\,d\beta \\
&\quad+ \int_0^1 \tilde{\mathbb{E}}\Big[\partial_\gamma\mu_i\Big(s,\,x_i(s)+\epsilon\beta\big[V_i(s)+V_i^\epsilon(s)\big],\,\mathcal{P}\big(x_i(s)+\epsilon\beta[V_i(s)+V_i^\epsilon(s)]\big),\,u_i(s)+\epsilon\beta v_i(s)\Big)\big(\hat x_i(s)+\epsilon\beta[\hat V_i(s)+\hat V_i^\epsilon(s)]\big)\big[\hat V_i(s)+\hat V_i^\epsilon(s)\big]\Big]\,d\beta \\
&\quad+ \int_0^1 \partial_{u_i}\mu_i\Big(s,\,x_i(s)+\epsilon\beta\big[V_i(s)+V_i^\epsilon(s)\big],\,\mathcal{P}\big(x_i(s)+\epsilon\beta[V_i(s)+V_i^\epsilon(s)]\big),\,u_i(s)+\epsilon\beta v_i(s)\Big)\,v_i(s)\,d\beta.
\end{aligned}
\]
To avoid notational clutter, define $x_\beta^{i\epsilon}(s) := x_i(s)+\epsilon\beta[V_i(s)+V_i^\epsilon(s)]$, $\hat x_\beta^{i\epsilon}(s) := \hat x_i(s)+\epsilon\beta[\hat V_i(s)+\hat V_i^\epsilon(s)]$, and $u_\beta^{i\epsilon}(s) := u_i(s)+\epsilon\beta v_i(s)$. Computing the "$ds$" term of Equation (A14) implies
\[
\begin{aligned}
\mathcal{A} ={}& \int_0^1 \partial_{x_i}\mu_i\big(s,x_\beta^{i\epsilon}(s),\mathcal{P}(x_\beta^{i\epsilon}(s)),u_\beta^{i\epsilon}(s)\big)\,V_i^\epsilon(s)\,d\beta + \int_0^1 \tilde{\mathbb{E}}\Big[\partial_\gamma\mu_i\big(s,x_\beta^{i\epsilon}(s),\mathcal{P}(x_\beta^{i\epsilon}(s)),u_\beta^{i\epsilon}(s)\big)\big(\hat x_\beta^{i\epsilon}(s)\big)\,\hat V_i^\epsilon(s)\Big]\,d\beta \\
&+ \underbrace{\int_0^1\Big[\partial_{x_i}\mu_i\big(s,x_\beta^{i\epsilon}(s),\mathcal{P}(x_\beta^{i\epsilon}(s)),u_\beta^{i\epsilon}(s)\big)-\partial_{x_i}\mu_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\Big]V_i(s)\,d\beta}_{=:I_1} \\
&+ \underbrace{\int_0^1\tilde{\mathbb{E}}\Big[\Big(\partial_\gamma\mu_i\big(s,x_\beta^{i\epsilon}(s),\mathcal{P}(x_\beta^{i\epsilon}(s)),u_\beta^{i\epsilon}(s)\big)-\partial_\gamma\mu_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\Big)\big(\hat x_\beta^{i\epsilon}(s)\big)\,\hat V_i(s)\Big]\,d\beta}_{=:I_2} \\
&+ \underbrace{\int_0^1\Big[\partial_{u_i}\mu_i\big(s,x_\beta^{i\epsilon}(s),\mathcal{P}(x_\beta^{i\epsilon}(s)),u_\beta^{i\epsilon}(s)\big)-\partial_{u_i}\mu_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\Big]v_i(s)\,d\beta}_{=:I_3}.
\end{aligned}
\]
As $\epsilon\to 0$, the integral terms $I_1$, $I_2$, and $I_3$ converge to zero in $\mathcal{H}^{2,\tilde m}([0,t]\times G)$. Moreover, for a finite constant $c>0$,
\[
\begin{aligned}
\mathbb{E}\int_0^t|I_1|^2\,ds &= \mathbb{E}\int_0^t\Big|\int_0^1\Big[\partial_{x_i}\mu_i\big(s,x_\beta^{i\epsilon}(s),\mathcal{P}(x_\beta^{i\epsilon}(s)),u_\beta^{i\epsilon}(s)\big)-\partial_{x_i}\mu_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\Big]V_i(s)\,d\beta\Big|^2\,ds \\
&\le \mathbb{E}\int_0^t\int_0^1\Big|\partial_{x_i}\mu_i\big(s,x_\beta^{i\epsilon}(s),\mathcal{P}(x_\beta^{i\epsilon}(s)),u_\beta^{i\epsilon}(s)\big)-\partial_{x_i}\mu_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\Big|^2\,\big|V_i(s)\big|^2\,d\beta\,ds \\
&\le c\,\mathbb{E}\int_0^t\int_0^1\Big[(\epsilon\beta)^2\big|V_i^\epsilon(s)+V_i(s)\big|^2 + |v_i(s)|^2\Big]\big|V_i(s)\big|^2\,d\beta\,ds \\
&\le c\Big(\int_0^t\int_0^1\mathbb{E}\big|\epsilon\beta\big[V_i^\epsilon(s)+V_i(s)\big]\big|^4\,d\beta\,ds\Big)^{1/2}\Big(\mathbb{E}\int_0^t\big|V_i(s)\big|^4\,ds\Big)^{1/2} + c\Big(\int_0^t\int_0^1\mathbb{E}\big|\epsilon\beta\,v_i(s)\big|^4\,d\beta\,ds\Big)^{1/2}\Big(\mathbb{E}\int_0^t\big|V_i(s)\big|^4\,ds\Big)^{1/2},
\end{aligned}
\]
which converges to $0$ as $\epsilon\to 0$, since all of the above expectations are finite. A similar argument applies to $I_2$ and $I_3$. To control the quadratic variation of the "$\mathcal{B}$" term in Equation (A14), the Burkholder–Davis–Gundy inequality can be used instead of Jensen's inequality. Then,
\[
\mathbb{E}\sup_{s\in[0,t]}\big|V_i^\epsilon(s)\big|^2 \le c\int_0^t\mathbb{E}\sup_{s_1\in[0,s]}\big|V_i^\epsilon(s_1)\big|^2\,ds + \int_0^t\sup_{s_1\in[0,s]}\mathbb{E}\big|V_i^\epsilon(s_1)\big|^2\,ds + b_\epsilon \le c\int_0^t\mathbb{E}\sup_{s_1\in[0,s]}\big|V_i^\epsilon(s_1)\big|^2\,ds + b_\epsilon,
\]
where $\lim_{\epsilon\to 0}b_\epsilon=0$. The desired result follows from Gronwall's inequality. This completes the proof. □

Appendix A.3. Proof of Lemma 4

For agent $i$ and $\nu\in[s,s+\epsilon]$, define $V_i^\delta(\nu) := \frac{1}{\delta}\big[x_i^\delta(\nu)-x_i(\nu)\big]-V_i(\nu)$. Therefore, $V_i^\delta(0)=0$ and
\[
x_i^\delta(\nu) = x_i(\nu) + \delta\big[V_i(\nu)+V_i^\delta(\nu)\big].
\]
Hence,
\[
\begin{aligned}
\partial_\delta L_i(s,x,u_i)\big|_{\delta=0} &= \lim_{\delta\to 0}\frac{1}{\delta}\,\mathbb{E}_s\Big\{\int_s^{s+\epsilon}\Big[L_i\big(\nu,\,x_i(\nu)+\delta[V_i(\nu)+V_i^\delta(\nu)],\,u_i(\nu)+\delta v_i(\nu)\big)-L_i\big(\nu,x(\nu),u_i(\nu)\big)\Big]\,d\nu\Big\} \\
&= \lim_{\delta\to 0}\frac{1}{\delta}\,\mathbb{E}_s\Big\{\int_s^{s+\epsilon}\int_0^1\frac{d}{d\beta}\,L_i\big(\nu,\,x_i(\nu)+\delta\beta[V_i(\nu)+V_i^\delta(\nu)],\,u_i(\nu)+\delta\beta v_i(\nu)\big)\,d\beta\,d\nu\Big\} \\
&= \lim_{\delta\to 0}\mathbb{E}_s\Big\{\int_s^{s+\epsilon}\int_0^1\Big[\big(V_i(\nu)+V_i^\delta(\nu)\big)\,\partial_{x_i}L_i\big(\nu,\,x_i(\nu)+\delta\beta[V_i(\nu)+V_i^\delta(\nu)],\,u_i(\nu)+\delta\beta v_i(\nu)\big) \\
&\qquad\qquad + v_i(\nu)\,\partial_{u_i}L_i\big(\nu,\,x_i(\nu)+\delta\beta[V_i(\nu)+V_i^\delta(\nu)],\,u_i(\nu)+\delta\beta v_i(\nu)\big)\Big]\,d\beta\,d\nu\Big\} \\
&= \mathbb{E}_s\int_s^{s+\epsilon}\Big[V_i(\nu)\,\partial_{x_i}L_i\big(\nu,x(\nu),u_i(\nu)\big)+v_i(\nu)\,\partial_{u_i}L_i\big(\nu,x(\nu),u_i(\nu)\big)\Big]\,d\nu.
\end{aligned}
\]
Since, for $\nu\in[s,s+\epsilon]$,
\[
\partial_{x_i}L_i\big(\nu,x(\nu),u_i(\nu)\big) = \int_s^{s+\epsilon}\Big[\sum_{j\in\eta_i}w_{ij}\big(x_i(\nu)-x_j(\nu)\big)+k_i\big(x_i(\nu)-x_0^i\big)\Big]\,d\nu,
\]
and
\[
\partial_{u_i}L_i\big(\nu,x(\nu),u_i(\nu)\big) = \int_s^{s+\epsilon}u_i(\nu)\,d\nu,
\]
then
\[
\partial_\delta L_i(s,x,u_i)\big|_{\delta=0} = \mathbb{E}_s\int_s^{s+\epsilon}\Big\{V_i(\nu)\Big[\sum_{j\in\eta_i}w_{ij}\big(x_i(\nu)-x_j(\nu)\big)+k_i\big(x_i(\nu)-x_0^i\big)\Big]+u_i(\nu)\,v_i(\nu)\Big\}\,d\nu.
\]
This completes the proof. □

Appendix A.4. Proof of Lemma 5

For all $\omega\in\Omega$, the expectation of the independent copy is defined as
\[
\begin{aligned}
&\tilde{\mathbb{E}}\Big[\partial_\gamma\tilde\mu_i\big(s,\tilde x,x,\tilde\lambda_1^i,\tilde\lambda_2^i,\tilde u_i\big)+\partial_\gamma\tilde\sigma_i\big(s,\tilde x,x,\tilde\lambda_1^i,\tilde\lambda_2^i,\tilde u_i\big)+\partial_\gamma\tilde L_i\big(s,\tilde x,x,\tilde\lambda_1^i,\tilde\lambda_2^i,\tilde u_i\big)\,\Big|\,x=x(s)\Big] \\
&\quad:= \int_\Omega\Big[\partial_\gamma\tilde\mu_i\big(s,\omega,\tilde\omega,\tilde x,x,\tilde\lambda_1^i(\omega),\tilde\lambda_1^i(\tilde\omega),\tilde\lambda_2^i(\omega),\tilde\lambda_2^i(\tilde\omega),\tilde u_i\big) + \partial_\gamma\tilde\sigma_i\big(s,\omega,\tilde\omega,\tilde x,x,\tilde\lambda_1^i(\omega),\tilde\lambda_1^i(\tilde\omega),\tilde\lambda_2^i(\omega),\tilde\lambda_2^i(\tilde\omega),\tilde u_i\big) \\
&\qquad\quad + \partial_\gamma\tilde L_i\big(s,\omega,\tilde\omega,\tilde x,x,\tilde\lambda_1^i(\omega),\tilde\lambda_1^i(\tilde\omega),\tilde\lambda_2^i(\omega),\tilde\lambda_2^i(\tilde\omega),\tilde u_i\big)\Big]\Big|_{x=x(s)}\,\mathbb{P}(d\tilde\omega).
\end{aligned}
\]
For a constant $\rho\in(0,\infty)$, define the norm
\[
\big\|(\lambda_1^i,\lambda_2^i)\big\|_\rho^2 := \mathbb{E}\int_0^t\Big[\big|\lambda_1^i(s)\big|^2+\big|\lambda_2^i(s)\big|^2\Big]\exp(\rho s)\,ds.
\]
Let $(\lambda_1^{i\#},\lambda_2^{i\#})\in\mathcal{H}^{2,\tilde m}$ be another pair of adjoint processes. By Proposition 2.2 in [65] and Theorem 2.2 in [59], there exists a unique solution $(\lambda_1^i,\lambda_2^i)$ of the adjoint process
\[
d\lambda_1^i(s) = -\Big\{\partial_x L_i\big(s,x,\mathcal{P}_x,\lambda_1^{i\#},\lambda_2^{i\#},\lambda_1^i,\lambda_2^i,u_i\big) + \tilde{\mathbb{E}}\Big[\partial_\gamma\tilde L_i\big(s,\tilde x,\mathcal{P}_{\tilde x},\lambda_1^{i\#},\lambda_2^{i\#},\tilde\lambda_1^i,\tilde\lambda_2^i,\tilde u_i\big)\big[x_i(s)\big]\Big]\Big\}\,ds + \lambda_2^i(s)\,dB_i(s).
\]
Since the above adjoint process is forward-looking, the linear part $\partial_x L_i(s,x,\mathcal{P}_x,\lambda_1^{i\#},\lambda_2^{i\#},\lambda_1^i,\lambda_2^i,u_i)$ admits a unique solution $(\lambda_1^i,\lambda_2^i)$, since at time $0$ agent $i$ forms expectations based only on the information available at that time. There exists a map $M$ such that $(\lambda_1^{i\#},\lambda_2^{i\#})\mapsto(\lambda_1^i,\lambda_2^i)=M(\lambda_1^{i\#},\lambda_2^{i\#})$ from $\mathcal{H}^{2,\tilde m}$ into itself. Since $\lambda_1^i\in\mathcal{H}_{\lambda_1}^{2,\tilde m}$, an appropriate choice of $\rho$ is necessary to show that $M$ is a strict contraction. Consider two pairs $(\lambda_1^{i\#1},\lambda_2^{i\#1})\in\mathcal{H}^{2,\tilde m}$ and $(\lambda_1^{i\#2},\lambda_2^{i\#2})\in\mathcal{H}^{2,\tilde m}$. Let $(\lambda_1^{i1},\lambda_2^{i1})=M(\lambda_1^{i\#1},\lambda_2^{i\#1})$, $(\lambda_1^{i2},\lambda_2^{i2})=M(\lambda_1^{i\#2},\lambda_2^{i\#2})$, $(\tilde\lambda_1^{i\#},\tilde\lambda_2^{i\#})=(\lambda_1^{i\#2}-\lambda_1^{i\#1},\lambda_2^{i\#2}-\lambda_2^{i\#1})$, and $(\tilde\lambda_1^{i},\tilde\lambda_2^{i})=(\lambda_1^{i2}-\lambda_1^{i1},\lambda_2^{i2}-\lambda_2^{i1})$. For all $s\in[0,t]$, applying the Itô formula to $|\tilde\lambda_1^i(s)|^2\exp(\rho s)$ yields
\[
\begin{aligned}
&\big|\tilde\lambda_1^i(s)\big|^2 + \mathbb{E}\Big[\int_s^t\rho\,e^{\rho(\nu-s)}\big|\tilde\lambda_1^i(\nu)\big|^2\,d\nu\,\Big|\,\mathcal{F}_s\Big] + \mathbb{E}\Big[\int_s^t\rho\,e^{\rho(\nu-s)}\big|\tilde\lambda_2^i(\nu)\big|^2\,d\nu\,\Big|\,\mathcal{F}_s\Big] \\
&= \mathbb{E}\Big\{2\int_s^t e^{\rho(\nu-s)}\,\tilde\lambda_1^i(\nu)\Big[\Xi\big(\nu,\tilde x,x,\lambda_1^{i\#2}(\nu),\lambda_2^{i\#2}(\nu),\lambda_1^{i2}(\nu),\lambda_2^{i2}(\nu),u_i\big)-\Xi\big(\nu,\tilde x,x,\lambda_1^{i\#1}(\nu),\lambda_2^{i\#1}(\nu),\lambda_1^{i1}(\nu),\lambda_2^{i1}(\nu),u_i\big)\Big]\,d\nu\,\Big|\,\mathcal{F}_s\Big\},
\end{aligned}
\]
where
\[
\begin{aligned}
\Xi\big(\nu,\tilde x,x,\lambda_1^{i\#2}(\nu),\lambda_2^{i\#2}(\nu),\lambda_1^{i2}(\nu),\lambda_2^{i2}(\nu),u_i\big) ={}& \partial_\gamma\tilde\mu_i\big(\nu,\tilde x,x,\lambda_1^{i\#2}(\nu),\lambda_2^{i\#2}(\nu),\lambda_1^{i2}(\nu),\lambda_2^{i2}(\nu),u_i\big) \\
&+ \partial_\gamma\tilde\sigma_i\big(\nu,\tilde x,x,\lambda_1^{i\#2}(\nu),\lambda_2^{i\#2}(\nu),\lambda_1^{i2}(\nu),\lambda_2^{i2}(\nu),u_i\big) \\
&+ \partial_\gamma\tilde L_i\big(\nu,\tilde x,x,\lambda_1^{i\#2}(\nu),\lambda_2^{i\#2}(\nu),\lambda_1^{i2}(\nu),\lambda_2^{i2}(\nu),u_i\big),
\end{aligned}
\]
and
\[
\begin{aligned}
\Xi\big(\nu,\tilde x,x,\lambda_1^{i\#1}(\nu),\lambda_2^{i\#1}(\nu),\lambda_1^{i1}(\nu),\lambda_2^{i1}(\nu),u_i\big) ={}& \partial_\gamma\tilde\mu_i\big(\nu,\tilde x,x,\lambda_1^{i\#1}(\nu),\lambda_2^{i\#1}(\nu),\lambda_1^{i1}(\nu),\lambda_2^{i1}(\nu),u_i\big) \\
&+ \partial_\gamma\tilde\sigma_i\big(\nu,\tilde x,x,\lambda_1^{i\#1}(\nu),\lambda_2^{i\#1}(\nu),\lambda_1^{i1}(\nu),\lambda_2^{i1}(\nu),u_i\big) \\
&+ \partial_\gamma\tilde L_i\big(\nu,\tilde x,x,\lambda_1^{i\#1}(\nu),\lambda_2^{i\#1}(\nu),\lambda_1^{i1}(\nu),\lambda_2^{i1}(\nu),u_i\big).
\end{aligned}
\]
Let $\rho = 16a(1+4a)+1$. Then, implementing condition (v) of Assumption 2 and the $\mathcal{F}$-integrability of $\Xi$ in $\mathcal{H}^{2,\tilde m}$ yields
\[
\Big(\frac{1}{2}\rho - 2a - 2a^2\Big)\mathbb{E}\int_0^t\exp(\rho\nu)\big|\tilde\lambda_1^i(\nu)\big|^2\,d\nu + \frac{1}{2}\,\mathbb{E}\int_0^t\exp(\rho\nu)\big|\tilde\lambda_2^i(\nu)\big|^2\,d\nu \le \frac{4a^2}{\rho}\,\mathbb{E}\int_0^t\exp(\rho\nu)\big|\tilde\lambda_1^{i\#}(\nu)\big|^2\,d\nu + \frac{1}{2}\,\mathbb{E}\int_0^t\exp(\rho\nu)\big|\tilde\lambda_2^{i\#}(\nu)\big|^2\,d\nu,
\]
which implies
\[
\mathbb{E}\int_0^t\exp(\rho s)\Big[\big|\tilde\lambda_1^i(s)\big|^2+\big|\tilde\lambda_2^i(s)\big|^2\Big]\,ds \le \frac{1}{2}\,\mathbb{E}\int_0^t\exp(\rho s)\Big[\big|\tilde\lambda_1^{i\#}(s)\big|^2+\big|\tilde\lambda_2^{i\#}(s)\big|^2\Big]\,ds.
\]
Hence, $\big\|(\tilde\lambda_1^i,\tilde\lambda_2^i)\big\|_\rho \le (1/2)^{1/2}\,\big\|(\tilde\lambda_1^{i\#},\tilde\lambda_2^{i\#})\big\|_\rho$. This completes the proof. □

Appendix A.5. Proof of Proposition 2

The Euclidean action function of the system can be represented as
\[
\mathcal{A}_{0,t}^i(x_i) = \int_0^t \mathbb{E}_s\Big\{L_i(s,x,u_i)\,ds + \Big[x_i(s)-x_0^i-\mu_i\big(s,x_i,\mathcal{P}(x_i),u_i\big)\,ds - \sigma_i\big(s,x_i,\mathcal{P}(x_i),u_i\big)\,dB_i(s)\Big]\,d\lambda_i(s)\Big\},
\]
where $\mathbb{E}_s$ is the conditional expectation given the opinion $x_i(s)$ at the beginning of time $s$. For all $\varepsilon>0$ and a normalizing constant $L_\varepsilon^i>0$, define a transition function on a small time interval as
\[
\Psi_{s,s+\varepsilon}^i(x_i) := \frac{1}{L_\varepsilon^i}\int_{\mathbb{R}}\exp\big\{-\varepsilon\,\mathcal{A}_{s,s+\varepsilon}^i(x)\big\}\,\Psi_s^i(x_i)\,dx_i(s),
\]
as $\varepsilon\to 0$, where $\Psi_s^i(x_i)$ is the value of the transition function at time $s$ and opinion $x_i(s)$, with the initial condition $\Psi_0^i(x_i)=\Psi_0^i$ for all $i$.
For the continuous time interval $[s,\tau]$, where $\tau=s+\varepsilon$, the stochastic Lagrangian can be represented as
\[
\mathcal{A}_{s,\tau}^i(x) = \int_s^\tau \mathbb{E}_s\Big\{L_i\big[\nu,x(\nu),x_0^i,u_i(\nu)\big]\,d\nu + \Big[x_i(\nu)-x_0^i-\mu_i\big(\nu,x_i,\mathcal{P}(x_i),u_i\big)\,d\nu - \sigma_i\big(\nu,x_i,\mathcal{P}(x_i),u_i\big)\,dB_i(\nu)\Big]\,d\lambda_i(\nu)\Big\},
\]
with the constant initial condition $x_i(0)=x_0^i$. This conditional expectation is valid when the control $u_i(\nu)$ of agent $i$'s opinion dynamics is determined at time $\nu$ such that all $n$ agents' opinions $x(\nu)$ are given [74]. The evolution takes place as the action function is stationary. Therefore, the conditional expectation with respect to time depends only on the expectation at the initial time point of the interval $[s,\tau]$.
Fubini's theorem implies
\[
\mathcal{A}_{s,\tau}^i(x_i) = \mathbb{E}_s\int_s^\tau\Big\{L_i\big[\nu,x(\nu),x_0^i,u_i(\nu)\big]\,d\nu + \Big[x_i(\nu)-x_0^i-\mu_i\big(\nu,x_i,\mathcal{P}(x_i),u_i\big)\,d\nu - \sigma_i\big(\nu,x_i,\mathcal{P}(x_i),u_i\big)\,dB_i(\nu)\Big]\,d\lambda_i(\nu)\Big\}.
\]
By Itô's theorem, there exists a function $h_i[\nu,x_i(\nu)]\in C^2([0,\infty)\times\mathbb{R})$ such that $Y_i(\nu)=h_i[\nu,x_i(\nu)]$, where $Y_i(\nu)$ is an Itô process [75]. Assuming
\[
h_i\big[\nu+\Delta\nu,\,x_i(\nu)+\Delta x_i(\nu)\big] = x_i(\nu)-x_0^i-\mu_i\big(\nu,x_i(\nu),\mathcal{P}(x_i),u_i(\nu)\big)\,d\nu - \sigma_i\big(\nu,x_i(\nu),\mathcal{P}(x_i),u_i(\nu)\big)\,dB_i(\nu),
\]
Equation (A18) implies
\[
\mathcal{A}_{s,\tau}^i(x_i) = \mathbb{E}_s\int_s^\tau\Big\{g_i\big[\nu,x(\nu),u_i(\nu)\big]\,d\nu + h_i\big[\nu+\Delta\nu,\,x_i(\nu)+\Delta x_i(\nu)\big]\,d\lambda_i(\nu)\Big\}.
\]
Itô's lemma implies
\[
\begin{aligned}
\varepsilon\,\mathcal{A}_{s,\tau}^i(x_i) = \mathbb{E}_s\Big\{&\varepsilon\,L_i\big[s,x(s),u_i(s)\big] + \varepsilon\,h_i\big[s,x_i(s)\big]\,d\lambda_i(s) + \varepsilon\,h_s^i\big[s,x_i(s)\big]\,d\lambda_i(s) + \varepsilon\,h_x^i\big[s,x_i(s)\big]\,\mu_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\,d\lambda_i(s) \\
&+ \varepsilon\,h_x^i\big[s,x_i(s)\big]\,\sigma_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\,d\lambda_i(s)\,dB_i(s) + \frac{1}{2}\,\varepsilon\,\big[\sigma_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\big]^2\,h_{xx}^i\big[s,x_i(s)\big]\,d\lambda_i(s) + o(\varepsilon)\Big\},
\end{aligned}
\]
where $h_s^i=\partial h_i/\partial s$, $h_x^i=\partial h_i/\partial x_i$, and $h_{xx}^i=\partial^2 h_i/\partial(x_i)^2$, and we use the condition $[dx_i(s)]^2\approx\varepsilon$ with $dx_i(s)\approx\varepsilon\,\mu_i[s,x_i(s),u_i(s)]+\sigma_i[s,x_i(s),u_i(s)]\,dB_i(s)$. We use Itô's lemma and a similar approximation to approximate the integral. With $\varepsilon\downarrow 0$, dividing throughout by $\varepsilon$ and taking the conditional expectation yields
\[
\begin{aligned}
\varepsilon\,\mathcal{A}_{s,\tau}^i(x_i) = \mathbb{E}_s\Big\{&\varepsilon\,L_i\big[s,x(s),u_i(s)\big] + \varepsilon\,h_i\big[s,x_i(s)\big]\,d\lambda_i(s) + \varepsilon\,h_s^i\big[s,x_i(s)\big]\,d\lambda_i(s) \\
&+ \varepsilon\,h_x^i\big[s,x_i(s)\big]\,\mu_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\,d\lambda_i(s) + \frac{1}{2}\,\varepsilon\,\big[\sigma_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\big]^2\,h_{xx}^i\big[s,x_i(s)\big]\,d\lambda_i(s) + o(1)\Big\},
\end{aligned}
\]
since $\mathbb{E}_s[dB_i(s)]=0$ and $\mathbb{E}_s[o(\varepsilon)]/\varepsilon\to 0$ as $\varepsilon\downarrow 0$, with the initial condition $x_0^i$. For $\varepsilon\downarrow 0$, denote the transition function at $s$ as $\Psi_s^i(x_i)$ for all $i$. Hence, using Equation (A16), the transition function yields
\[
\begin{aligned}
\Psi_{s,\tau}^i(x_i) = \frac{1}{L_\varepsilon^i}\int_{\mathbb{R}}\exp\Big\{-\varepsilon\Big[&L_i\big[s,x(s),u_i(s)\big] + h_i\big[s,x_i(s)\big]\,d\lambda_i(s) + h_s^i\big[s,x_i(s)\big]\,d\lambda_i(s) + h_x^i\big[s,x_i(s)\big]\,\mu_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\,d\lambda_i(s) \\
&+ \frac{1}{2}\,\big[\sigma_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\big]^2\,h_{xx}^i\big[s,x_i(s)\big]\,d\lambda_i(s)\Big]\Big\}\,\Psi_s^i(x)\,dx_i(s) + o(\varepsilon^{1/2}).
\end{aligned}
\]
Since $\varepsilon\downarrow 0$, a first-order Taylor series expansion of the left-hand side of Equation (A22) yields
\[
\begin{aligned}
\Psi_s^i(x_i) + \varepsilon\,\frac{\partial\Psi_s^i(x_i)}{\partial s} + o(\varepsilon) = \frac{1}{L_\varepsilon^i}\int_{\mathbb{R}}\exp\Big\{-\varepsilon\Big[&L_i\big[s,x(s),u_i(s)\big] + h_i\big[s,x_i(s)\big]\,d\lambda_i(s) + h_s^i\big[s,x_i(s)\big]\,d\lambda_i(s) \\
&+ h_x^i\big[s,x_i(s)\big]\,\mu_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\,d\lambda_i(s) \\
&+ \frac{1}{2}\,\big[\sigma_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\big]^2\,h_{xx}^i\big[s,x_i(s)\big]\,d\lambda_i(s)\Big]\Big\}\,\Psi_s^i(x)\,dx_i(s) + o(\varepsilon^{1/2}).
\end{aligned}
\]
For fixed $s$ and $\tau$, let $x_i(s)-x_i(\tau)=\xi_i$ so that $x_i(s)=x_i(\tau)+\xi_i$. When $\xi_i$ is not around zero, for a positive number $\eta<\infty$ we assume $|\xi_i|\le\sqrt{\eta\varepsilon/x_i(s)}$, so that for $\varepsilon\downarrow 0$, $\xi_i$ takes even smaller values and agent $i$'s opinion satisfies $0<x_i(s)\le\eta\varepsilon/(\xi_i)^2$. Therefore,
\[
\begin{aligned}
\Psi_s^i(x_i) + \varepsilon\,\frac{\partial\Psi_s^i(x_i)}{\partial s} = \frac{1}{L_\varepsilon^i}\int_{\mathbb{R}}\Big[\Psi_s^i(x_i)+\xi_i\,\frac{\partial\Psi_s^i(x_i)}{\partial x_i}+o(\varepsilon)\Big]\exp\Big\{-\varepsilon\Big[&L_i\big[s,x(s),u_i(s)\big] + h_i\big[s,x_i(s)\big]\,d\lambda_i(s) \\
&+ h_x^i\big[s,x_i(s)\big]\,\mu_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\,d\lambda_i(s) \\
&+ \frac{1}{2}\,\big[\sigma_i\big(s,x_i(s),\mathcal{P}(x_i),u_i(s)\big)\big]^2\,h_{xx}^i\big[s,x_i(s)\big]\,d\lambda_i(s)\Big]\Big\}\,d\xi_i + o(\varepsilon^{1/2}).
\end{aligned}
\]
Before solving the Gaussian integral of each term on the right-hand side of the above equation, define a $C^2$ function
\[
\begin{aligned}
f_i\big[s,\xi,\lambda_i(s),\gamma,u_i(s)\big] ={}& L_i\big[s,x(s)+\xi,u_i(s)\big] + h_i\big[s,x_i(s)+\xi_i\big]\,d\lambda_i(s) + h_s^i\big[s,x_i(s)+\xi_i\big]\,d\lambda_i(s) \\
&+ h_x^i\big[s,x_i(s)+\xi_i\big]\,\mu_i\big(s,x_i(s)+\xi,\mathcal{P}(x_i+\xi),u_i(s)\big)\,d\lambda_i(s) \\
&+ \frac{1}{2}\,\big[\sigma_i\big(s,x_i(s)+\xi,\mathcal{P}(x_i+\xi),u_i(s)\big)\big]^2\,h_{xx}^i\big[s,x_i(s)+\xi_i\big]\,d\lambda_i(s) + o(1),
\end{aligned}
\]
where $\xi$ is the vector of all $n$ agents' $\xi_i$'s. Hence,
\[
\Psi_s^i(x_i) + \varepsilon\,\frac{\partial\Psi_s^i(x_i)}{\partial s} = \Psi_s^i(x_i)\,\frac{1}{L_\varepsilon^i}\int_{\mathbb{R}}\exp\big\{-\varepsilon\,f_i[s,\xi,\lambda_i(s),\gamma,u_i(s)]\big\}\,d\xi_i + \frac{\partial\Psi_s^i(x_i)}{\partial x_i}\,\frac{1}{L_\varepsilon^i}\int_{\mathbb{R}}\xi_i\,\exp\big\{-\varepsilon\,f_i[s,\xi,\lambda_i(s),\gamma,u_i(s)]\big\}\,d\xi_i + o(\varepsilon^{1/2}).
\]
Taking $\varepsilon\downarrow 0$ and $\Delta u\downarrow 0$, a Taylor series expansion of $f_i[s,\xi,\lambda_i(s),\gamma,u_i(s)]$ with respect to $x_i$ yields
\[
f_i\big[s,\xi,\lambda_i(s),\gamma,u(s)\big] = f_i\big[s,x(\tau),\lambda_i(s),\gamma,u_i(s)\big] + f_x^i\big[s,x(\tau),\lambda_i(s),\gamma,u_i(s)\big]\big[\xi_i-x_i(\tau)\big] + \frac{1}{2}\,f_{xx}^i\big[s,x(\tau),\lambda_i(s),\gamma,u_i(s)\big]\big[\xi_i-x_i(\tau)\big]^2 + o(\varepsilon).
\]
Define $y_i := \xi_i - x_i(\tau)$ so that $d\xi_i = dy_i$. The first integral on the right-hand side of Equation (A24) becomes
\[
\int_{\mathbb{R}}\exp\big\{-\varepsilon\,f_i[s,\xi,\lambda_i(s),\gamma,u_i(s)]\big\}\,d\xi_i = \exp\big\{-\varepsilon\,f_i[s,x(\tau),\lambda_i(s),\gamma,u_i(s)]\big\}\int_{\mathbb{R}}\exp\Big\{-\varepsilon\Big[f_x^i\big[s,x(\tau),\lambda_i(s),\gamma,u_i(s)\big]\,y_i + \frac{1}{2}\,f_{xx}^i\big[s,x(\tau),\lambda_i(s),\gamma,u_i(s)\big]\,(y_i)^2\Big]\Big\}\,dy_i.
\]
Assuming $a_i = \frac{1}{2}f_{xx}^i[s,x(\tau),\lambda_i(s),\gamma,u_i(s)]$ and $b_i = f_x^i[s,x(\tau),\lambda_i(s),\gamma,u_i(s)]$, the argument of the exponential function in Equation (A25) becomes
\[
a_i(y_i)^2 + b_i\,y_i = a_i\Big[(y_i)^2 + \frac{b_i}{a_i}\,y_i\Big] = a_i\Big(y_i + \frac{b_i}{2a_i}\Big)^2 - \frac{(b_i)^2}{4(a_i)^2}.
\]
Therefore,
\[
\begin{aligned}
\exp\big\{-\varepsilon f_i[s,x(\tau),\lambda_i(s),\gamma,u_i(s)]\big\}\int_{\mathbb{R}}\exp\big\{-\varepsilon\big[a_i(y_i)^2+b_iy_i\big]\big\}\,dy_i &= \exp\Big\{\varepsilon\Big[\frac{(b_i)^2}{4(a_i)^2}-f_i[s,x(\tau),\lambda_i(s),\gamma,u_i(s)]\Big]\Big\}\int_{\mathbb{R}}\exp\Big\{-\varepsilon\,a_i\Big(y_i+\frac{b_i}{2a_i}\Big)^2\Big\}\,dy_i \\
&= \sqrt{\frac{\pi}{\varepsilon a_i}}\,\exp\Big\{\varepsilon\Big[\frac{(b_i)^2}{4(a_i)^2}-f_i[s,x(\tau),\lambda_i(s),\gamma,u_i(s)]\Big]\Big\},
\end{aligned}
\]
and
\[
\Psi_s^i(x_i)\,\frac{1}{L_\varepsilon^i}\int_{\mathbb{R}}\exp\big\{-\varepsilon f_i[s,\xi,\lambda_i(s),\gamma,u_i(s)]\big\}\,d\xi_i = \Psi_s^i(x)\,\frac{1}{L_\varepsilon^i}\sqrt{\frac{\pi}{\varepsilon a_i}}\,\exp\Big\{\varepsilon\Big[\frac{(b_i)^2}{4(a_i)^2}-f_i[s,x(\tau),\lambda_i(s),\gamma,u_i(s)]\Big]\Big\}.
\]
Substituting $\xi_i = x_i(\tau)+y_i$, the second integral on the right-hand side of Equation (A24) yields
\[
\begin{aligned}
\int_{\mathbb{R}}\xi_i\,\exp\big\{-\varepsilon f_i[s,\xi,\lambda_i(s),\gamma,u_i(s)]\big\}\,d\xi_i &= \exp\big\{-\varepsilon f_i[s,x(\tau),\lambda_i(s),\gamma,u_i(s)]\big\}\int_{\mathbb{R}}\big[x_i(\tau)+y_i\big]\exp\big\{-\varepsilon\big[a_i(y_i)^2+b_iy_i\big]\big\}\,dy_i \\
&= \exp\Big\{\varepsilon\Big[\frac{(b_i)^2}{4(a_i)^2}-f_i[s,x(\tau),\lambda_i(s),\gamma,u_i(s)]\Big]\Big\}\Big[x_i(\tau)\sqrt{\frac{\pi}{\varepsilon a_i}}+\int_{\mathbb{R}}y_i\,\exp\Big\{-\varepsilon\,a_i\Big(y_i+\frac{b_i}{2a_i}\Big)^2\Big\}\,dy_i\Big].
\end{aligned}
\]
Substituting $k_i = y_i + b_i/(2a_i)$ in Equation (A29) yields
\[
\begin{aligned}
&\exp\Big\{\varepsilon\Big[\frac{(b_i)^2}{4(a_i)^2}-f_i[s,x(\tau),\lambda_i(s),\gamma,u_i(s)]\Big]\Big\}\Big[x_i(\tau)\sqrt{\frac{\pi}{\varepsilon a_i}}+\int_{\mathbb{R}}\Big(k_i-\frac{b_i}{2a_i}\Big)\exp\big\{-a_i\varepsilon\,(k_i)^2\big\}\,dk_i\Big] \\
&\quad= \exp\Big\{\varepsilon\Big[\frac{(b_i)^2}{4(a_i)^2}-f_i[s,x(\tau),\lambda_i(s),\gamma,u_i(s)]\Big]\Big\}\Big[x_i(\tau)-\frac{b_i}{2a_i}\Big]\sqrt{\frac{\pi}{\varepsilon a_i}}.
\end{aligned}
\]
Hence,
\[
\frac{1}{L_\varepsilon^i}\,\frac{\partial\Psi_s^i(x_i)}{\partial x_i}\int_{\mathbb{R}}\xi_i\,\exp\big\{-\varepsilon f[s,\xi,\lambda_i(s),\gamma,u_i(s)]\big\}\,d\xi_i = \frac{1}{L_\varepsilon^i}\,\frac{\partial\Psi_s^i(x_i)}{\partial x_i}\,\exp\Big\{\varepsilon\Big[\frac{(b_i)^2}{4(a_i)^2}-f_i[s,x(\tau),\lambda_i(s),\gamma,u_i(s)]\Big]\Big\}\Big[x_i(\tau)-\frac{b_i}{2a_i}\Big]\sqrt{\frac{\pi}{\varepsilon a_i}}.
\]
Plugging Equations (A28) and (A31) into Equation (A24) implies
\[
\begin{aligned}
\Psi_s^i(x_i) + \varepsilon\,\frac{\partial\Psi_s^i(x_i)}{\partial s} ={}& \frac{1}{L_\varepsilon^i}\sqrt{\frac{\pi}{\varepsilon a_i}}\,\Psi_s^i(x_i)\,\exp\Big\{\varepsilon\Big[\frac{(b_i)^2}{4(a_i)^2}-f_i[s,x(\tau),\lambda_i(s),\gamma,u_i(s)]\Big]\Big\} \\
&+ \frac{1}{L_\varepsilon^i}\,\frac{\partial\Psi_s^i(x_i)}{\partial x_i}\sqrt{\frac{\pi}{\varepsilon a_i}}\,\exp\Big\{\varepsilon\Big[\frac{(b_i)^2}{4(a_i)^2}-f_i[s,x(\tau),\lambda_i(s),\gamma,u_i(s)]\Big]\Big\}\Big[x_i(\tau)-\frac{b_i}{2a_i}\Big] + o(\varepsilon^{1/2}).
\end{aligned}
\]
Let $f_i$ be in Schwartz space, so that its derivatives fall rapidly. Further assuming $0<|b_i|\le\eta\varepsilon$, $0<|a_i|\le\frac{1}{2}\big[1-(\xi_i)^2\big]^{-1}$, and $x_i(s)-x_i(\tau)=\xi_i$ yields
\[
x_i(\tau)-\frac{b_i}{2a_i} = x_i(s)-\xi_i-\frac{b_i}{2a_i} = x_i(s)-\frac{b_i}{2a_i}, \qquad \xi\to 0,
\]
such that
\[
\Big|x_i(s)-\frac{b_i}{2a_i}\Big| = \Big|\frac{\eta\varepsilon}{(\xi_i)^2}-\eta\varepsilon\big[1-(\xi_i)^2\big]\Big| \le \eta\varepsilon.
\]
Therefore, the Fokker–Planck-type equation for agent $i$ is
\[
\frac{\partial\Psi_s^i(x)}{\partial s} = \Big[\frac{(b_i)^2}{4(a_i)^2}-f_i\big[s,x(\tau),\lambda_i(s),\gamma,u_i(s)\big]\Big]\Psi_s^i(x).
\]
Differentiating Equation (A33) with respect to $u_i$ yields an optimal control of agent $i$ under these opinion dynamics,
\[
\Big[\frac{2f_x^i\big(f_{xx}^i\,f_{xu}^i - f_x^i\,f_{xxu}^i\big)}{(f_{xx}^i)^3} - f_u^i\Big]\Psi_s^i(x) = 0,
\]
where $f_x^i=\partial f_i/\partial x_i$, $f_{xx}^i=\partial^2 f_i/\partial(x_i)^2$, $f_{xu}^i=\partial^2 f_i/(\partial x_i\,\partial u_i)$, and $f_{xxu}^i=\partial^3 f_i/(\partial(x_i)^2\,\partial u_i)=0$. Thus, the optimal feedback control of agent $i$ in the stochastic opinion dynamics is represented as $u_i^*(s,x_i)$ and is found by setting Equation (A34) equal to zero. Hence, $u_i^*(s,x_i)$ is the solution of the following equation:
\[
f_u^i\,\big(f_{xx}^i\big)^2 = 2\,f_x^i\,f_{xu}^i.
\]

References

  1. Acemoğlu, D.; Ozdaglar, A. Opinion dynamics and learning in social networks. Dyn. Games Appl. 2011, 1, 3–49. [Google Scholar] [CrossRef]
  2. Pramanik, P. Consensus as a Nash equilibrium of a stochastic differential game. Eur. J. Stat. 2023, 3, 10. [Google Scholar] [CrossRef]
  3. Stella, L.; Bagagiolo, F.; Bauso, D.; Como, G. Opinion dynamics and stubbornness through mean-field games. In Proceedings of the 2013 52nd IEEE Conference on Decision and Control, Firenze, Italy, 10–13 December 2013; IEEE: Piscataway, NJ, USA, 2013; pp. 2519–2524. [Google Scholar] [CrossRef]
  4. Acemoğlu, D.; Como, G.; Fagnani, F.; Ozdaglar, A. Opinion fluctuations and disagreement in social networks. Math. Oper. Res. 2013, 38, 1–27. [Google Scholar] [CrossRef]
  5. Castellano, C.; Fortunato, S.; Loreto, V. Statistical physics of social dynamics. Rev. Mod. Phys. 2009, 81, 591. [Google Scholar] [CrossRef]
  6. Bauso, D.; Tembine, H.; Basar, T. Opinion dynamics in social networks through mean-field games. SIAM J. Control Optim. 2016, 54, 3225–3257. [Google Scholar] [CrossRef]
  7. Calvó-Armengol, A.; Patacchini, E.; Zenou, Y. Peer effects and social networks in education. Rev. Econ. Stud. 2009, 76, 1239–1267. [Google Scholar] [CrossRef]
  8. Calvo-Armengol, A.; Jackson, M.O. The effects of social networks on employment and inequality. Am. Econ. Rev. 2004, 94, 426–454. [Google Scholar] [CrossRef]
  9. Conley, T.G.; Udry, C.R. Learning about a new technology: Pineapple in Ghana. Am. Econ. Rev. 2010, 100, 35–69. [Google Scholar] [CrossRef]
  10. Moretti, E. Social learning and peer effects in consumption: Evidence from movie sales. Rev. Econ. Stud. 2011, 78, 356–393. [Google Scholar] [CrossRef]
  11. Nakajima, R. Measuring peer effects on youth smoking behaviour. Rev. Econ. Stud. 2007, 74, 897–935. [Google Scholar] [CrossRef]
  12. Sheng, S. A structural econometric analysis of network formation games through subnetworks. Econometrica 2020, 88, 1829–1858. [Google Scholar] [CrossRef]
  13. Erdős, P.; Rényi, A. On random graphs. Publ. Math. 1959, 6, 290–297. [Google Scholar] [CrossRef]
  14. Snijders, T.A. Markov chain Monte Carlo estimation of exponential random graph models. J. Soc. Struct. 2002, 3, 1–40. [Google Scholar]
  15. Hua, L.; Polansky, A.; Pramanik, P. Assessing bivariate tail non-exchangeable dependence. Stat. Probab. Lett. 2019, 155, 108556. [Google Scholar] [CrossRef]
  16. Polansky, A.M.; Pramanik, P. A motif building process for simulating random networks. Comput. Stat. Data Anal. 2021, 162, 107263. [Google Scholar] [CrossRef]
  17. Pramanik, P. Optimal lock-down intensity: A stochastic pandemic control approach of path integral. Comput. Math. Biophys. 2023, 11, 20230110. [Google Scholar] [CrossRef]
  18. Lasry, J.-M.; Lions, P.-L. Jeux à champ moyen. I–Le cas stationnaire. Comptes Rendus Math. 2006, 343, 619–625. [Google Scholar] [CrossRef]
  19. Lasry, J.M.; Lions, P.L. Mean field games. Jpn. J. Math. 2007, 2, 229–260. [Google Scholar] [CrossRef]
  20. Carmona, R.; Delarue, F. Probabilistic Theory of Mean Field Games with Applications I: Mean Field FBSDEs, Control, and Games; Springer: Berlin/Heidelberg, Germany, 2018. [Google Scholar]
  21. Carmona, R.; Delarue, F. Probabilistic Theory of Mean Field Games with Applications II: Mean Field Games with Common Noise and Master Equations; Springer: Berlin/Heidelberg, Germany, 2018. [Google Scholar]
  22. McKean, H.P. Propagation of chaos for a class of non-linear parabolic equations. In Stochastic Differential Equations; (Lecture Series in Differential Equations, Session 7, Catholic Univ., 1967); The Catholic University of America: Washington, DC, USA, 1967; pp. 41–57. [Google Scholar]
  23. Kac, M. Foundations of kinetic theory. In Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA, USA, December 1954 and July–August 1955; University of California Press: Berkeley, CA, USA; Volume 3, pp. 171–197.
  24. Sznitman, A.S. Topics in propagation of chaos. Ecole d’été de probabilités de Saint-Flour XIX–1989 1991, 1464, 165–251. [Google Scholar]
  25. Huang, M.; Caines, P.E.; Malhamé, R.P. Individual and mass behaviour in large population stochastic wireless power control problems: Centralized and Nash equilibrium solutions. In Proceedings of the 2003 42nd IEEE International Conference on Decision and Control (IEEE Cat. No. 03CH37475), Maui, HI, USA, 9–12 December 2003; IEEE: Piscataway, NJ, USA, 2003; Volume 1, pp. 98–103. [Google Scholar] [CrossRef]
  26. Carmona, R.; Delarue, F.; Lachapelle, A. Control of McKean–Vlasov dynamics versus mean field games. Math. Financ. Econ. 2013, 7, 131–166. [Google Scholar] [CrossRef]
  27. Andersson, D.; Djehiche, B. A maximum principle for SDEs of mean-field type. Appl. Math. Optim. 2011, 63, 341–356. [Google Scholar] [CrossRef]
  28. Pramanik, P. Effects of water currents on fish migration through a Feynman-type path integral approach under 8/3 Liouville-like quantum gravity surfaces. Theory Biosci. 2021, 140, 205–223. [Google Scholar] [CrossRef] [PubMed]
  29. Pramanik, P.; Polansky, A.M. Optimization of a dynamic profit function using Euclidean path integral. SN Bus. Econ. 2023, 4, 8. [Google Scholar] [CrossRef]
  30. Pramanik, P. Stubbornness as Control in Professional Soccer Games: A BPPSDE Approach. Mathematics 2025, 13, 475. [Google Scholar] [CrossRef]
  31. Kappen, H.J. Path integrals and symmetry breaking for optimal control theory. J. Stat. Mech. Theory Exp. 2005, 2005, P11011. [Google Scholar] [CrossRef]
  32. Theodorou, E.; Buchli, J.; Schaal, S. Reinforcement learning of motor skills in high dimensions: A path integral approach. In Proceedings of the 2010 IEEE International Conference on Robotics and Automation (ICRA), Anchorage, AK, USA, 3–7 May 2010; IEEE: Piscataway, NJ, USA, 2010; pp. 2397–2403. [Google Scholar] [CrossRef]
  33. Theodorou, E.A. Iterative Path Integral Stochastic Optimal Control: Theory and Applications to Motor Control; University of Southern California: Los Angeles, CA, USA, 2011. [Google Scholar]
  34. Pramanik, P.; Polansky, A.M. Motivation to Run in One-Day Cricket. Mathematics 2024, 12, 2739. [Google Scholar] [CrossRef]
  35. Baaquie, B.E. Quantum Finance: Path Integrals and Hamiltonians for Options and Interest Rates; Cambridge University Press: Cambridge, UK, 2007. [Google Scholar]
  36. Pramanik, P. Optimization of market stochastic dynamics. In SN Operations Research Forum; Springer: Berlin/Heidelberg, Germany, 2020; Volume 1, pp. 1–17. [Google Scholar]
  37. Lasry, J.-M.; Lions, P.-L. Jeux à champ moyen. II—Horizon fini et contrôle optimal. Comptes Rendus Math. 2006, 343, 679–684. [Google Scholar] [CrossRef]
  38. Achdou, Y.; Capuzzo-Dolcetta, I. Mean field games: Numerical methods. SIAM J. Numer. Anal. 2010, 48, 1136–1162. [Google Scholar] [CrossRef]
  39. Achdou, Y.; Camilli, F.; Capuzzo-Dolcetta, I. Finite difference methods for mean field games. SIAM J. Numer. Anal. 2013, 51, 2585–2612. [Google Scholar] [CrossRef]
  40. Carmona, R.; Laurière, M. Convergence analysis of machine learning algorithms for the numerical solution of mean field control and games: I–The ergodic case. SIAM J. Numer. Anal. 2021, 59, 1455–1485. [Google Scholar] [CrossRef]
  41. Chan, P.; Sircar, R. Fracking, Renewables, and Mean Field Games. Siam Rev. 2017, 59, 588–615. [Google Scholar] [CrossRef]
  42. Han, J.; Long, J.; Zhou, Q. Deep fictitious play for finding Markovian Nash equilibrium in multi-agent games. Math. Control Relat. Fields 2020, 10, 701–722. [Google Scholar]
  43. Niazi, M.U.B.; Özgüler, A.B.; Yildiz, A. Consensus as a Nash equilibrium of a dynamic game. In Proceedings of the 2016 12th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS), Naples, Italy, 28 November–1 December 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 365–372. [Google Scholar] [CrossRef]
  44. Mas-Colell, A.; Whinston, M.D.; Green, J.R. Microeconomic Theory; Oxford University Press: New York, NY, USA, 1995; Volume 1. [Google Scholar]
  45. Pramanik, P. Path integral control of a stochastic multi-risk SIR pandemic model. Theory Biosci. 2023, 142, 107–142. [Google Scholar] [CrossRef]
  46. Hebb, D.O. The Organization of Behavior: A Neuropsychological Theory; Psychology Press: East Sussex, UK, 2005. [Google Scholar]
  47. Kappen, H.J. An introduction to stochastic control theory, path integrals and reinforcement learning. AIP Conf. Proc. 2007, 887, 149–181. [Google Scholar] [CrossRef]
  48. Feynman, R.P. Space-time approach to quantum electrodynamics. Phys. Rev. 1949, 76, 769. [Google Scholar] [CrossRef]
  49. Fujiwara, D. Rigorous Time Slicing Approach to Feynman Path Integrals; Springer: Berlin/Heidelberg, Germany, 2017. [Google Scholar]
  50. Carmona, R.; Delarue, F. Forward–backward stochastic differential equations and controlled McKean–Vlasov dynamics. Ann. Probab. 2015, 43, 2647–2700. [Google Scholar] [CrossRef]
  51. Cosso, A.; Gozzi, F.; Kharroubi, I.; Pham, H.; Rosestolato, M. Optimal control of path-dependent McKean–Vlasov SDEs in infinite dimension. Ann. Appl. Probab. 2023, 33, 2863–2918. [Google Scholar] [CrossRef]
  52. Yong, J.; Zhou, X.Y. Stochastic Controls: Hamiltonian Systems and HJB Equations; Stochastic Modelling and Applied Probability; Springer: New York, NY, USA, 1999. [Google Scholar]
  53. Karatzas, I.; Shreve, S.E. Methods of Mathematical Finance; Springer: Berlin/Heidelberg, Germany, 1998. [Google Scholar]
  54. Friedkin, N.E.; Johnsen, E.C. Social Influence Network Theory: A Sociological Examination of Small Group Dynamics; Cambridge University Press: Cambridge, UK, 2011. [Google Scholar]
  55. Proskurnikov, A.V.; Tempo, R. A tutorial on modeling and analysis of dynamic social networks. Part I. Annu. Rev. Control 2017, 43, 65–79. [Google Scholar] [CrossRef]
  56. Bensoussan, A.; Frehse, J.; Yam, P. Mean Field Games and Mean Field Type Control Theory; Springer: Berlin/Heidelberg, Germany, 2013. [Google Scholar]
  57. Marcet, A.; Marimon, R. Recursive contracts. Econometrica 2019, 87, 1589–1631. [Google Scholar] [CrossRef]
  58. Golub, B.; Jackson, M.O. Naïve learning in social networks and the wisdom of crowds. Am. Econ. J. Microecon. 2010, 2, 112–149. [Google Scholar] [CrossRef]
  59. Carmona, R. Lectures on BSDEs, Stochastic Control, and Stochastic Differential Games with Financial Applications; SIAM: Philadelphia, PA, USA, 2016. [Google Scholar]
  60. Cardaliaguet, P. Notes on Mean Field Games; Notes from P.L. Lions’ Lecture at the Collège de France; Technical Report; Collège de France: Paris, France, 2012. [Google Scholar]
  61. Feynman, R.P. Space-time approach to non-relativistic quantum mechanics. Rev. Mod. Phys. 1948, 20, 367. [Google Scholar] [CrossRef]
  62. Bochner, S.; Chandrasekharan, K. Fourier Transforms; No. 19; Princeton University Press: Princeton, NJ, USA, 1949. [Google Scholar]
  63. Ewald, C.O.; Nolan, C. On the adaptation of the Lagrange formalism to continuous time stochastic optimal control: A Lagrange–Chow redux. J. Econ. Dyn. Control 2024, 162, 104855. [Google Scholar] [CrossRef]
  64. Love, C.; Turner, M. Note on utilizing stochastic optimal control in aggregate production planning. Eur. J. Oper. Res. 1993, 65, 199–206. [Google Scholar] [CrossRef]
  65. Pardoux, E.; Peng, S. Adapted solution of a backward stochastic differential equation. Syst. Control Lett. 1990, 14, 55–61. [Google Scholar] [CrossRef]
  66. Sharrock, L.; Kantas, N.; Parpas, P.; Pavliotis, G.A. Parameter estimation for the McKean–Vlasov stochastic differential equation. arXiv 2021, arXiv:2106.13751. [Google Scholar] [CrossRef]
  67. Sharrock, L.; Kantas, N.; Parpas, P.; Pavliotis, G.A. Online parameter estimation for the McKean–Vlasov stochastic differential equation. Stoch. Processes Their Appl. 2023, 162, 481–546. [Google Scholar] [CrossRef]
  68. Chen, T.; Wang, Z.; Theodorou, E.A. Deep graphic FBSDEs for opinion dynamics stochastic control. In Proceedings of the 2022 IEEE 61st Conference on Decision and Control (CDC), Cancun, Mexico, 6–9 December 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 4652–4659. [Google Scholar] [CrossRef]
  69. Brugna, C.; Toscani, G. Kinetic models of opinion formation in the presence of personal conviction. Phys. Rev. E 2015, 92, 052818. [Google Scholar] [CrossRef]
  70. Chazelle, B.; Jiu, Q.; Li, Q.; Wang, C. Well-posedness of the limiting equation of a noisy consensus model in opinion dynamics. J. Differ. Equ. 2017, 263, 365–397. [Google Scholar] [CrossRef]
  71. Lu, F.; Maggioni, M.; Tang, S. Learning interaction kernels in heterogeneous systems of agents from multiple trajectories. J. Mach. Learn. Res. 2021, 22, 1–67. [Google Scholar]
  72. Friedkin, N.E.; Johnsen, E.C. Social influence and opinions. J. Math. Sociol. 1990, 15, 193–206. [Google Scholar] [CrossRef]
  73. Fleming, W.H.; Soner, H.M. Controlled Markov Processes and Viscosity Solutions, 2nd ed.; Stochastic Modelling and Applied Probability; Springer: New York, NY, USA, 2006. [Google Scholar]
  74. Chow, G.C. The Lagrange method of optimization with applications to portfolio and investment decisions. J. Econ. Dyn. Control 1996, 20, 1–18. [Google Scholar] [CrossRef]
  75. Øksendal, B. Stochastic differential equations. In Stochastic Differential Equations; Springer: Berlin/Heidelberg, Germany, 2003; pp. 65–84. [Google Scholar]
Figure 1. Visualization of agents (dots) and their pairwise interactions on a random surface.
Figure 2. Contour plots of the opinion distribution under the McKean–Vlasov opinion dynamics model (17) for different population sizes: $n = 20$, $n = 100$, $n = 200$, and $n = 500$. For small $n$ (top left), the distribution is dominated by stochastic fluctuations, resulting in irregular and unstable clusters. As $n$ increases (top right and bottom left), the effect of averaging across more interactions becomes visible, leading to more coherent and persistent clustering. For large $n$ (bottom right), the empirical distribution closely approximates the mean-field limit, with stable and well-separated clusters. The figure illustrates the propagation of chaos phenomenon: as $n \to \infty$, the stochastic particle system converges to the deterministic McKean–Vlasov dynamics, yielding sharp and stable opinion clusters.
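A minimal NumPy sketch of the qualitative behavior in Figure 2 follows. Since Equation (17) is not reproduced in this excerpt, the mean-reverting drift $-w\,(x^i - \bar{x})$ toward the empirical average is an assumed stand-in for the McKean–Vlasov interaction, and the parameters w, sigma, and T are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n, T=5.0, dt=0.01, w=1.0, sigma=0.5):
    """Euler-Maruyama for n interacting agents with an assumed
    mean-reverting drift toward the empirical average opinion."""
    steps = int(T / dt)
    x = rng.normal(0.0, 1.0, size=n)   # initial opinions
    for _ in range(steps):
        drift = -w * (x - x.mean())    # attraction to the empirical mean
        x = x + drift * dt + sigma * np.sqrt(dt) * rng.normal(size=n)
    return x

# As n grows, the empirical spread stabilizes near the mean-field stationary
# value sigma/sqrt(2*w), echoing the propagation of chaos behavior described
# in the caption above.
for n in (20, 100, 200, 500):
    print(n, round(simulate(n).std(), 3))
```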
Figure 3. Evolution of standardized opinions under McKean–Vlasov dynamics for different population sizes: $n = 20, 100, 200, 500$.
Figure 4. Evolution of the optimal strategy $u^i(s)$ over time.
Figure 5. Variation of the optimal strategy $u^i(s)$ with respect to parameters $\alpha$ and $\theta$.
Figure 6. Time evolution of opinions under optimal control for different system sizes: $n = 20, 100, 200, 500$.
Figure 7. Comparison of opinion trajectories under three strategies: (Left) proposed optimal control, (Middle) mean field game equilibrium control, and (Right) classical consensus protocol.
Figure 8. Sensitivity analysis and comparative performance. (Left): Consensus time decreases as peer influence $w_{ij}$ increases, illustrating the sensitivity of collective opinion dynamics to network coupling. (Middle): Runtime comparison shows that the path integral approach achieves substantial computational gains relative to the HJB method. (Right): Average cost comparison highlights that the proposed path integral control achieves lower costs than conventional stochastic control strategies.
Figure 9. Dynamic evolution of opinions under heterogeneous agent configurations for n = 50 agents. (Left) A leader–follower structure leads to centralization around the leader’s stance. (Middle) The presence of a stubborn minority prevents full consensus and sustains polarization. (Right) Agents with higher noise intensities induce persistent fluctuations and slower stabilization.
Table 1. Performance comparison of strategies.
Strategy                 | Average Cost $L^i$ | Time to $\epsilon$-Consensus | Avg. Control Effort
Proposed optimal control | lowest             | fastest                      | moderate
MFG equilibrium          | medium             | slower                       | higher
Classical consensus      | highest            | no consensus (polarization)  | none