1. Introduction
Social networks have become the primary medium for information dissemination and opinion exchange in modern society, profoundly shaping public opinion formation, marketing strategies, and social governance effectiveness [1,2,3,4,5]. Understanding how collective opinions evolve over time (opinion dynamics) and designing effective interventions under resource constraints (optimal control) have emerged as a frontier research area at the intersection of control theory and network science [6,7]. The classic DeGroot model [8] and its extensions, such as the Friedkin–Johnsen model [9], elegantly capture interpersonal influence mechanisms through linear weighting rules, laying the foundation for subsequent theoretical control studies.
Currently, research on opinion control proceeds mainly along two technical routes. The first focuses on optimal reconstruction of the network topology, indirectly guiding opinion convergence by adjusting edge weights or connection relationships [10,11]. However, such methods often face rigid constraints in practice, where the network structure cannot be adjusted in real time. The second focuses on designing external intervention strategies that directly influence node states by applying control inputs, which better matches real-world scenarios such as advertising placement and policy propaganda [12,13].
The problem of opinion intervention in reality inherently involves a profound “cost–benefit” trade-off. Decision-makers expect group opinions to approach the target gradually while strictly controlling long-term input costs, and the marginal utility of future benefits often decreases with time [14,15]. This structural feature naturally corresponds to the infinite-horizon discounted optimal control framework. By introducing a discount factor, the objective function penalizes opinion deviations while balancing long-term control energy, better reflecting the logic of resource optimization in the economic sense.
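As an illustrative sketch (the formal problem statement is given in Section 2; the symbols $\gamma$, $Q$, $R$, and $x^{\ast}$ used here are assumed notation), a discounted quadratic objective of this type takes the form

$$
J = \sum_{t=0}^{\infty} \gamma^{t}
\Bigl[ (x_t - x^{\ast}\mathbf{1})^{\top} Q \,(x_t - x^{\ast}\mathbf{1})
 + u_t^{\top} R\, u_t \Bigr],
\qquad \gamma \in (0,1),
$$

where the first term penalizes deviation of the opinions $x_t$ from the target consensus and the second penalizes control effort, with the discount factor geometrically down-weighting future costs.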
Despite these advances, current research still faces two critical limitations in addressing optimal control problems for opinion dynamics over large-scale complex networks:
L1: Dimensionality versus analytical tractability. Representative studies [13,15,16,17], exemplified by Jiang et al. [13], focus on deriving analytical, closed-form solutions for optimal control. While mathematically rigorous, such approaches are typically limited to highly symmetric network topologies (e.g., complete graphs, star graphs) and rely on scalar-valued analysis. Their computational complexity grows rapidly with the network size n, making them impractical for realistic networks with moderate or large numbers of nodes and heterogeneous structures.
L2: Insufficient exploration of general complex topologies. Existing numerical methods, such as the greedy strategies in [18,19], improve scalability but often overlook the fundamental role of network structure in shaping optimal control performance. In particular, few works systematically compare control performance between representative complex networks [20] such as small-world networks [21] and scale-free networks [22], which capture the core structural features of real social networks.
To overcome the above limitations, this paper develops a unified optimal control framework for opinion dynamics by recasting the problem as a discounted discrete-time LQR problem. By introducing a deviation-state transformation, the original opinion control problem is converted into a standard discounted LQR form, allowing us to avoid the complicated linear terms that arise in traditional dynamic programming derivations [23]. The proposed matrix-based approach naturally handles general network topologies and remains computationally tractable for medium-scale networks (tens to hundreds of nodes).
The main contributions of this paper are summarized as follows:
Theoretical Contribution: We establish a discounted LQR formulation tailored to row-stochastic opinion dynamics. The main theoretical clarification is not the invention of a new Riccati theory, but the observation that the row-stochastic structure of opinion dynamics, together with the discount factor, automatically makes the transformed pair stabilizable. This removes the need for case-by-case stabilizability verification with respect to network connectivity, controllability, or control-node placement.
Methodological Contribution: We derive the corresponding discounted DARE and optimal state-feedback law in the deviation-state coordinates. Unlike analytical approaches that are restricted to small or highly symmetric networks, the resulting matrix-based computation can be applied to arbitrary row-stochastic topologies. The proposed fixed-point Riccati iteration is used primarily as a transparent implementation of the discounted DARE, and its numerical consistency is checked against a standard DARE solver.
Applied Contribution: Numerical simulations validate the framework on benchmark cases, complete graphs, scale-free networks, and small-world networks. The results reveal that network heterogeneity significantly affects convergence speed and control energy consumption, providing insights for real-world opinion guidance strategies.
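The stabilizability observation above can be illustrated numerically: a row-stochastic matrix has spectral radius exactly 1, so scaling it by the square root of any discount factor in (0, 1) yields a Schur-stable matrix, and the scaled pair is stabilizable regardless of the actuation. A minimal sketch (random matrix and parameter values are illustrative assumptions):

```python
import numpy as np

# Any row-stochastic A has spectral radius rho(A) = 1, so the
# discount-scaled matrix sqrt(gamma) * A is Schur stable for any
# gamma in (0, 1), making the scaled pair trivially stabilizable.
rng = np.random.default_rng(0)
n, gamma = 8, 0.95

W = rng.random((n, n))                  # random nonnegative weights
A = W / W.sum(axis=1, keepdims=True)    # row-stochastic normalization

rho_A = max(abs(np.linalg.eigvals(A)))
rho_scaled = max(abs(np.linalg.eigvals(np.sqrt(gamma) * A)))

print(round(rho_A, 6))    # 1.0 (row-stochastic)
print(rho_scaled < 1.0)   # True: Schur stable after discount scaling
```

No connectivity or control-placement condition enters this argument, which is exactly why the case-by-case verification can be dropped.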
The remainder of this paper is organized as follows.
Section 2 formulates the controlled opinion dynamics model and the discounted LQR problem.
Section 3 presents the theoretical framework, including the discounted DARE and the stabilizability theorem.
Section 4 describes the numerical solution algorithm.
Section 5 reports numerical experiments, including benchmark validation and network topology comparisons.
Section 6 concludes the paper.
5. Numerical Experiments and Verification
This section evaluates the proposed discounted LQR formulation from three complementary perspectives. First, a four-agent complete graph benchmark is used to verify numerical consistency with the analytical case in [13]. Second, a 20-agent complete graph is used to test whether the matrix-based Riccati computation remains stable and efficient beyond the small analytical setting. Third, scale-free and small-world networks are compared to clarify how topology, through the row-stochastic matrix A, affects convergence speed, control energy allocation, and final regulation accuracy.
Unless otherwise stated, all experiments use the fixed-point Riccati iteration in Algorithm 1 with a fixed convergence tolerance and initialization. The control energy and the mean absolute error are computed as
$$E_{\mathrm{total}} = \sum_{t} u_t^{2}, \qquad e(t) = \frac{1}{n}\sum_{i=1}^{n}\bigl|x_i(t) - x^{\ast}\bigr|,$$
where $x^{\ast}$ denotes the target opinion. The convergence time is defined as the first time step at which $e(t)$ drops below a given threshold and remains below it over the reported horizon. The implementation uses 600 dpi figures, enlarged labels, and fixed random seeds wherever random network generation is involved. The current numerical evidence is intended to validate the proposed formulation and clarify structural mechanisms, rather than to provide an exhaustive large-scale simulation benchmark.
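As a minimal sketch of such a fixed-point iteration (a single-input illustration under assumed notation, not the paper's exact Algorithm 1; the initialization P0 = Q and all parameter values are placeholders):

```python
import numpy as np

def discounted_riccati(A, b, Q, r, gamma, tol=1e-10, max_iter=10000):
    """Fixed-point iteration for the single-input discounted DARE:
    P <- Q + g*A'PA - g^2 * (A'Pb)(b'PA) / (r + g*b'Pb)."""
    P = Q.copy()                      # assumed initialization P0 = Q
    for _ in range(max_iter):
        Pb = P @ b
        s = r + gamma * (b @ Pb)      # scalar denominator, single input
        P_next = (Q + gamma * (A.T @ P @ A)
                  - (gamma**2 / s) * np.outer(A.T @ Pb, Pb @ A))
        if np.max(np.abs(P_next - P)) < tol:
            P = P_next
            break
        P = P_next
    # Optimal state feedback on the deviation state: u_t = -K (x_t - target)
    K = gamma * (b @ P @ A) / (r + gamma * (b @ P @ b))
    return P, K

# Illustrative 4-agent symmetric trust matrix (placeholder values)
n = 4
A = 0.4 * np.eye(n) + 0.2 * (np.ones((n, n)) - np.eye(n))
b = np.zeros(n); b[0] = 1.0           # control acts on Agent 1
P, K = discounted_riccati(A, b, np.eye(n), 1.0, 0.9)
```

Because the discount-scaled matrix is Schur stable for row-stochastic A, this value-iteration-style recursion converges from P0 = Q without any extra stabilizability check.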
5.1. Benchmark Case Validation
We first reproduce the classical four-agent example in [13]. This test serves two purposes: it checks whether the deviation-state discounted LQR formulation reproduces the known opinion control behavior, and it compares Algorithm 1 with a standard DARE solver applied to the transformed undiscounted system obtained by the usual discount scaling. The influence matrix is the symmetric four-agent benchmark matrix from [13]. The scalar control input is applied to Agent 1 only; the target consensus value, the control parameters, and the initial state likewise follow the benchmark setting. The results are shown in Figure 1 and Table 1.
Figure 1a shows that the controlled agent initially overshoots the target, which is typical of an optimal feedback policy that first applies a relatively strong intervention and then lets the network interaction redistribute the effect. Agents 2–4 have nearly identical trajectories because of the symmetry of the benchmark matrix. All opinions approach the target
within the displayed horizon.
Figure 1b shows that the control input decays rapidly from its initial value, reflecting the decreasing marginal need for external intervention as the deviation state becomes small.
For numerical verification, Algorithm 1 is compared with the standard DARE solver after applying the usual discount scaling to the system matrices. The resulting Frobenius-norm error of the Riccati matrix P and Euclidean-norm error of the feedback gain K are both negligibly small. This comparison confirms consistency with a mature DARE solver; it is not intended to claim computational superiority of the fixed-point implementation.
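The cross-check can be reproduced with a standard solver, assuming the usual square-root-of-gamma scaling that converts a discounted LQR problem into an undiscounted DARE (the matrix below is an illustrative stand-in for the benchmark matrix, not the exact one from [13]):

```python
import numpy as np
from scipy.linalg import solve_discrete_are

n, gamma, r = 4, 0.9, 1.0
A = 0.4 * np.eye(n) + 0.2 * (np.ones((n, n)) - np.eye(n))  # placeholder trust matrix
b = np.zeros((n, 1)); b[0, 0] = 1.0                        # input on Agent 1

# Discounted problem -> undiscounted DARE via sqrt(gamma) scaling
At, bt = np.sqrt(gamma) * A, np.sqrt(gamma) * b
P = solve_discrete_are(At, bt, np.eye(n), np.array([[r]]))

# Gain of the scaled system coincides with the discounted-LQR gain
K = np.linalg.solve(np.array([[r]]) + bt.T @ P @ bt, bt.T @ P @ At)
```

Comparing this P and K entrywise with the fixed-point output yields the Frobenius- and Euclidean-norm error checks described above.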
Table 1 complements Figure 1 by reporting the evolution of agent opinions and the trajectory of the optimal control input. As shown in Figure 1a, Agent 1’s initial opinion of 0.7 first jumps to approximately 0.9 under the control action, after which all agents’ opinions converge to the target consensus value within the displayed horizon. The control input exhibits a typical exponential decay (Figure 1b): its initial value of 0.331 decreases to the order of 0.001 after 20 steps, consistent with the theoretical behavior of optimal control for linear systems. These results match the analytical solution in [13] to within numerical tolerance, further verifying the correctness and effectiveness of the proposed numerical framework.
5.2. Scalability Validation
We next consider a 20-agent complete graph to test the computational scalability of the matrix-based implementation. Each agent assigns 40% of its trust to itself and distributes the remaining 60% uniformly among the other agents, leading to
$$A = 0.4\,I_{20} + \tfrac{0.6}{19}\bigl(\mathbf{1}\mathbf{1}^{\top} - I_{20}\bigr).$$
A single scalar control signal is simultaneously applied to two selected agents, weighted by different actuation gains (0.7 and 0.3); the target value and cost parameters are kept fixed, and the initial opinions are linearly distributed over their admissible range. This example therefore corresponds to a distributed single-input control scheme, not a true multi-input control experiment: in a standard multi-input LQR formulation, the actuation would be described by a matrix B and the control input would be a vector u.
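The trust matrix and the distributed single-input actuation described above can be constructed directly (the indices of the two actuated agents and the opinion interval are illustrative assumptions):

```python
import numpy as np

n = 20
# Complete-graph trust: 40% self-trust, remaining 60% split over the 19 peers
A = 0.4 * np.eye(n) + (0.6 / (n - 1)) * (np.ones((n, n)) - np.eye(n))

# Distributed single-input actuation: one scalar signal, two actuated agents
b = np.zeros(n)
b[0], b[1] = 0.7, 0.3       # gains from the text; agent indices assumed

# Initial opinions spread linearly (interval chosen for illustration)
x0 = np.linspace(0.0, 1.0, n)
```

Because every row of A sums to 0.4 + 19 · (0.6/19) = 1, the matrix is row-stochastic by construction.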
While the extension to multi-input LQR is straightforward within the proposed framework, we leave this direction for future work. Our ongoing research will focus on multi-agent control design under game-theoretic frameworks, where multiple independent decision-makers optimize their own objectives, leading to a richer class of distributed intervention problems.
As shown in
Figure 2a, all 20 opinions approach the target despite the broad initial opinion spread.
Figure 2b shows a strong initial input followed by a smooth decay. The Riccati iteration converges in 26 iterations in this setting. Since each iteration is dominated by dense matrix products of the form $A^{\top} P A$, the per-iteration complexity is $O(n^{3})$ for dense matrices. This supports the applicability of the proposed implementation to medium-scale networks, while very large-scale sparse networks would require specialized sparse or low-rank Riccati solvers.
Figure 3 provides a direct numerical explanation of the role of the discount factor. As the discount factor increases over the tested range, the Riccati iteration count grows from 17 to 38 and the control energy from 0.1787 to 0.4100, indicating a higher numerical and intervention cost. At the same time, the convergence time decreases from 50 to 33 steps and the final error shrinks, indicating better long-term regulation accuracy. Hence, the discount factor should be interpreted as a planning-horizon parameter rather than as a purely technical constant: larger values emphasize long-term accuracy, whereas smaller values reduce control effort and numerical burden.
5.3. Complex Network Topology Comparison
We compare a scale-free network and a small-world network to isolate the effect of topology on optimal opinion control (see Figure 4). Both networks contain the same number of nodes and use the same control parameters as in Section 5.2, so the observed differences arise solely from the structure of the row-stochastic influence matrix A.
Scale-Free Network: Generated using the Barabási–Albert model. This network exhibits significant degree heterogeneity and contains a few highly connected hub nodes.
Small-World Network: Generated using the Watts–Strogatz model. This network combines the local clustering of regular lattices with the short path lengths of random networks.
Trust matrices are generated through row normalization: edge weights are first assigned at random according to the network topology, then each row is scaled so that, together with a fixed self-trust weight, it sums to 1.
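A sketch of this generation procedure, using networkx generators with assumed parameters (the paper's exact edge-weight range, self-trust value, and generator parameters are not fully specified here):

```python
import numpy as np
import networkx as nx

def trust_matrix(G, self_trust=0.4, seed=0):
    """Row-stochastic trust matrix: random symmetric edge weights,
    rows scaled so self-trust plus neighbor trust sums to 1."""
    rng = np.random.default_rng(seed)
    n = G.number_of_nodes()
    W = np.zeros((n, n))
    for i, j in G.edges():
        W[i, j] = W[j, i] = rng.uniform(0.1, 1.0)   # assumed weight range
    A = np.zeros((n, n))
    for i in range(n):
        s = W[i].sum()
        A[i] = (1.0 - self_trust) * W[i] / s if s > 0 else 0.0
        A[i, i] = self_trust if s > 0 else 1.0      # isolated node trusts itself
    return A

# Assumed generator parameters for a 20-node comparison
A_sf = trust_matrix(nx.barabasi_albert_graph(20, 2, seed=1))      # scale-free
A_sw = trust_matrix(nx.watts_strogatz_graph(20, 4, 0.3, seed=1))  # small-world
```

Fixing the generator seeds, as noted earlier, makes the random network instances reproducible across runs.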
Network topology directly determines the row-stochastic matrix A, which governs the system’s convergence speed, control energy distribution, and steady-state error. This section quantifies the impact of different topologies on control performance, clarifying the role of network structure in the proposed framework. For the discount-factor sensitivity test on the complete graph, we additionally report the stricter final-tracking behavior over a longer horizon; for the topology Monte Carlo test, the same convergence threshold is used so that the results remain comparable with Figure 5.
In the scale-free network, several trajectories move rapidly toward the target after the initial intervention, indicating that hub-mediated diffusion helps spread the control effect as shown in
Figure 5a. In the small-world network, trajectories are more homogeneous but converge more slowly, because no small set of hubs dominates information propagation as shown in
Figure 5b. This contrast explains why a topological difference can be visible even though the same discounted LQR law is used.
The control inputs in Figure 5c further reveal different energy allocation mechanisms. The scale-free network requires a larger early input that then decays faster; the small-world network starts with a smaller input but requires a more persistent intervention. Thus, the scale-free topology favors concentrated early action through hub-mediated diffusion, whereas the small-world topology favors a slower and more sustained regulation process.
The convergence error comparison in Figure 5d confirms this mechanism: the scale-free case reaches the convergence threshold earlier and attains a smaller final error than the small-world case. These results suggest that heterogeneous topologies can improve convergence speed and final accuracy when the actuation vector is aligned with influential nodes.
To further check whether this topology-dependent tendency is robust to random network generation, we conducted a Monte Carlo comparison with 50 random realizations of each topology. For each run, the network structure, edge weights, and initial opinions are regenerated, while the control parameters remain fixed. The resulting means and standard deviations of the final error, the convergence time, and the control energy are summarized in Figure 6.
The Monte Carlo results support the mechanism observed in the representative example. The scale-free networks yield a smaller average final error and a shorter average convergence time than the small-world networks, while the average control energies of the two topologies are comparable. These results do not replace a full statistical hypothesis test, but they show that the faster convergence and smaller final error of scale-free networks are not artifacts of a single network instance.
The comparison with pinning control should also be understood carefully. Pinning control methods often emphasize the importance of controlling hub or leader nodes. Our results are consistent with this intuition, but the present discounted LQR formulation optimizes the feedback gain for a prescribed actuation vector b rather than solving the separate combinatorial actuator placement problem. A systematic comparison among hub node, peripheral node, and budget-constrained optimized actuation sets is therefore left as future work.
Thus, the present experiments should be read as evidence that topology affects the performance of a prescribed LQR actuator, not as a complete solution to optimal actuator placement. This distinction keeps the numerical claim aligned with the theoretical scope of the paper.
6. Conclusions
This paper developed a discounted LQR framework for the optimal control of opinion dynamics on row-stochastic networks. By shifting the original opinion state to a deviation state measured relative to the target, the target-consensus regulation problem is converted into a standard quadratic regulation problem. This transformation removes the affine terms that would otherwise appear in a direct dynamic programming treatment and makes it possible to compute the optimal feedback law through a discounted DARE. The main theoretical clarification is that discounting automatically guarantees the stabilizability condition for the transformed pair whenever A is row-stochastic and the discount factor lies strictly between 0 and 1. This observation should be understood as a problem-specific verification of classical discounted LQR assumptions, rather than as a new Riccati theory. Accordingly, the revised stability statements are made in the transformed discounted coordinates, while the Schur stability of the unscaled matrix is treated as a numerical property to be checked in concrete examples.
The numerical results support the usefulness of the formulation in three ways. The four-agent benchmark confirms consistency with a standard DARE solver and with the known analytical example. The 20-agent complete graph demonstrates that the matrix-based implementation remains computationally feasible for medium-scale networks. The scale-free and small-world comparison shows that topology affects the temporal distribution of control energy, convergence speed, and final error through the influence matrix A. In particular, hub-mediated scale-free networks enable faster diffusion of early interventions, whereas small-world networks require more persistent control.
From a practical viewpoint, these findings suggest that opinion guidance strategies should not only choose the magnitude of intervention, but also account for how the network structure propagates that intervention. When influential hub nodes are available, early concentrated actuation may be more efficient; when the network is more homogeneous, sustained moderate intervention may be required. This provides a control-theoretic explanation for intuitions commonly used in pinning or leader-based opinion control, while keeping the optimality criterion explicit through the discounted LQR cost.
Several limitations remain. The actuation vector is prescribed rather than optimized, so the combinatorial problem of actuator placement is outside the scope of this paper. Moreover, the current model assumes a known and time-invariant row-stochastic influence matrix. Future work will therefore consider budget-constrained actuator selection, multi-input distributed control, time-varying or partially observed networks, stubborn agents, multiplex structures, and reinforcement learning methods for model-free opinion guidance. In such model-free settings, the discounted LQR solution derived here can serve as a benchmark for evaluating learned policies. The sensitivity and Monte Carlo checks partially address the remaining numerical concerns: the former clarifies how the discount factor trades off regulation accuracy, control effort, and Riccati iteration count, while the latter shows that the scale-free advantage in convergence speed and final error persists across repeated random realizations. Nevertheless, broader large-scale benchmarks and formal actuator placement optimization remain important future work.