Nonlinear Optimal Control for Stochastic Dynamical Systems

This paper presents a comprehensive framework addressing optimal nonlinear analysis and feedback control synthesis for nonlinear stochastic dynamical systems. The focus lies on establishing connections between stochastic Lyapunov theory and stochastic Hamilton–Jacobi–Bellman theory within a unified perspective. We demonstrate that the closed-loop nonlinear system’s asymptotic stability in probability is ensured through a Lyapunov function, identified as the solution to the steady-state form of the stochastic Hamilton–Jacobi–Bellman equation. This dual assurance guarantees both stochastic stability and optimality. Additionally, optimal feedback controllers for affine nonlinear systems are developed using an inverse optimality framework tailored to the stochastic stabilization problem. Furthermore, the paper derives stability margins for optimal and inverse optimal stochastic feedback regulators. Gain, sector, and disk margin guarantees are established for nonlinear stochastic dynamical systems controlled by nonlinear optimal and inverse optimal Hamilton–Jacobi–Bellman controllers.

Expanding on the findings of [15,16,18,19], this paper introduces a framework for analyzing and designing feedback controllers for nonlinear stochastic dynamical systems. Specifically, it addresses a feedback stochastic optimal control problem with a nonlinear-nonquadratic performance measure over an infinite horizon. The key is the connection between the performance measure and a Lyapunov function, ensuring asymptotic stability in probability for the nonlinear closed-loop system. The framework establishes the groundwork for extending linear-quadratic control to nonlinear-nonquadratic problems in stochastic dynamical systems.
The focus lies on the role of the Lyapunov function in ensuring stochastic stability and its seamless connection to the steady-state solution of the stochastic Hamilton-Jacobi-Bellman equation, which characterizes the optimal nonlinear feedback controller. To simplify the solution of the stochastic steady-state Hamilton-Jacobi-Bellman equation, the paper adopts an approach of parameterizing a family of stochastically stabilizing controllers. This corresponds to addressing an inverse optimal stochastic control problem [20][21][22][23][24][25][26].
The inverse optimal control design approach constructs the Lyapunov function for the closed-loop system, which serves as an optimal value function. It achieves desired stability margins, particularly for nonlinear inverse optimal controllers minimizing a meaningful nonlinear-nonquadratic performance criterion. The paper derives stability margins for optimal and inverse optimal nonlinear stochastic feedback regulators, considering gain, sector, and disk margin guarantees. These guarantees are obtained for nonlinear stochastic dynamical systems controlled by nonlinear optimal and inverse optimal Hamilton-Jacobi-Bellman controllers.
Furthermore, the paper establishes connections between stochastic stability margins, stochastic meaningful inverse optimality, and stochastic dissipativity [27,28], showcasing the equivalence between stochastic dissipativity and optimality for stochastic dynamical systems. Specifically, utilizing extended Kalman-Yakubovich-Popov conditions characterizing stochastic dissipativity, our optimal feedback control law satisfies a return difference inequality predicated on the infinitesimal generator of a controlled Markov diffusion process, connecting optimality to stochastic dissipativity with a specific quadratic supply rate. This integrated framework provides a comprehensive understanding of optimal nonlinear control strategies for stochastic dynamical systems, encompassing stability, optimality, and dissipativity considerations.

Mathematical Preliminaries
We start by reviewing some basic results on nonlinear stochastic dynamical systems [29][30][31][32]. First, however, we require some definitions. A sample space Ω is the set of possible outcomes of an experiment. Given a sample space Ω, a σ-algebra F on Ω is a collection of subsets of Ω such that ∅ ∈ F; if F ∈ F, then Ω \ F ∈ F; and if F_i ∈ F, i ∈ N, then ⋃_{i=1}^∞ F_i ∈ F. The pair (Ω, F) is called a measurable space, and a probability measure P defined on (Ω, F) is a function P : F → [0, 1] such that P(Ω) = 1. Furthermore, if F_1, F_2, ... ∈ F and F_i ∩ F_j = ∅, i ≠ j, then P(⋃_{i=1}^∞ F_i) = ∑_{i=1}^∞ P(F_i). The triple (Ω, F, P) is called a probability space. The subsets of Ω belonging to F are called F-measurable sets. A probability space is complete if every subset of every null set is measurable.
The σ-algebra generated by the open sets in R^n, denoted by B^n, is called the Borel σ-algebra, and the elements B of B^n are called Borel sets. Given the probability space (Ω, F, P), a random variable is a mapping x : Ω → R^n such that {ω ∈ Ω : x(ω) ∈ B} ∈ F for all Borel sets B ⊆ R^n; that is, x is F-measurable. A stochastic process {x(t) : t ∈ R_+} is a collection of random variables defined on the complete probability space (Ω, F, P) indexed by the set R_+ that take values on a common measurable space (S, Σ). Since t ∈ R_+, we say that {x(t) : t ∈ R_+} is a continuous-time stochastic process.
Occasionally, we write x(t, ω) for x(t) to denote the explicit dependence of the random variable x(t) on the outcome ω ∈ Ω. For every fixed time t ∈ R_+, the random variable ω → x(t, ω) assigns a vector to every outcome ω ∈ Ω, and for every fixed ω ∈ Ω, the mapping t → x(t, ω) generates a sample path of the stochastic process x(•), where for convenience we write x(•) to denote the stochastic process {x(t) : t ∈ R_+}. In this paper, S = R^n and Σ = B^n.
A filtration {F_t : t ≥ 0} on (Ω, F, P) is a collection of sub-σ-fields of F, indexed by R_+, such that F_s ⊆ F_t, 0 ≤ s ≤ t. A filtration is complete if F_0 contains the (F, P)-negligible sets. The stochastic process x(•) is progressively measurable with respect to {F_t : t ≥ 0} if, for every t ≥ 0, the map (s, ω) → x(s, ω) defined on [0, t] × Ω is B([0, t]) × F_t-measurable, where B(A) denotes the Borel σ-algebra on A. The stochastic process x(•) is said to be adapted with respect to {F_t : t ≥ 0}, or simply F_t-adapted, if x(t) is F_t-measurable for every t ≥ 0. An adapted stochastic process with right-continuous (or left-continuous) sample paths is progressively measurable [33]. We say that a stochastic process satisfies the Markov property if the conditional probability distribution of the future states of the stochastic process depends only on the present state.
In this paper, we consider controlled stochastic dynamical systems G given by a stochastic differential equation (1) and an output equation (2). The stochastic processes x(•), u(•), and y(•) represent the system state, input, and output, respectively.
Here, U is a set of admissible inputs that contains the input processes u(•) that can be applied to the system, x_0 is a random system initial condition vector, and w(•) is a d-dimensional Brownian motion process. For every t ≥ 0, the random variables x(t), u(t), and y(t) take values in the state space R^n, the control space R^m, and the output space R^l, respectively. The measurable mappings F : R^n × R^m → R^n, D : R^n × R^m → R^{n×d}, and H : R^n × R^m → R^l are known as the system drift, diffusion, and output functions.
The stochastic differential Equation (1) is interpreted as a way of expressing the integral Equation (3), where the first integral in (3) is a Lebesgue integral and the second integral is an Itô integral [34]. When considering processes whose initial condition is a fixed deterministic point rather than a distribution, we will find it convenient to introduce the notation x_{s,x_0}(t) to denote the solution process at time t when the initial condition at time s is the fixed point x_0 ∈ R^n almost surely. Similarly, P_{x_0}[·] and E_{x_0}[·] denote probability and expected value, respectively, given that the initial condition x(0) is the fixed point x_0 ∈ R^n almost surely.
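Although the paper works with exact solutions of (3), a numerical sample-path view can help build intuition. The following sketch (not part of the original development; the scalar test system, step size, and function names are our own illustrative choices) approximates a strong solution of an equation of the form (1) with the Euler-Maruyama scheme, discretizing the Lebesgue and Itô integrals in (3).

```python
import numpy as np

def euler_maruyama(F, D, x0, u, T=1.0, dt=1e-3, rng=None):
    """Simulate one sample path of dx = F(x, u(x)) dt + D(x, u(x)) dw
    with the Euler-Maruyama scheme (illustrative sketch, not from the paper)."""
    rng = np.random.default_rng(0) if rng is None else rng
    steps = int(T / dt)
    x = np.array(x0, dtype=float)
    path = [x.copy()]
    for _ in range(steps):
        # Brownian increment dw ~ N(0, dt I_d)
        dw = rng.normal(0.0, np.sqrt(dt), size=D(x, u(x)).shape[1])
        x = x + F(x, u(x)) * dt + D(x, u(x)) @ dw
        path.append(x.copy())
    return np.array(path)

# Scalar test system dx = -x dt + 0.1 dw with zero input.
path = euler_maruyama(lambda x, u: -x, lambda x, u: np.array([[0.1]]),
                      x0=[1.0], u=lambda x: 0.0)
```

The convergence of such schemes to the strong solution of (1) is, of course, a separate topic; the sketch is only meant to make the two integrals in (3) concrete.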
Let (Ω, F, {F_t : t ≥ 0}, P) be a fixed complete filtered probability space, w(•) be an F_t-adapted Brownian motion, u(•) be an R^m-valued F_t-progressively measurable input process, and x_0 be an F_0-measurable initial condition. A solution to (1) with input u(•) is an R^n-valued F_t-adapted process x(•) with continuous sample paths such that the integrals in (3) exist and (3) holds almost surely (a.s.) for all t ≥ 0. For a Brownian motion disturbance, input process, and initial condition given in a prescribed probability space, the solution to (3) is known as a strong solution [35]. In this paper, we focus on strong solutions, and we will simply use the term "solution" to refer to a strong solution. A solution to (1) is unique if, for any two solutions x_1(•) and x_2(•) that satisfy (1), x_1(t) = x_2(t) for all t ≥ 0 almost surely.
We assume that every u(•) ∈ U is an R^m-valued Markov control process. An input process u(•) is a Markov control process if there exists a function ϕ : R_+ × R^n → R^m such that u(t) = ϕ(t, x(t)), t ≥ 0. Note that the class of Markov controls encompasses both time-varying inputs (i.e., possibly open-loop control input processes) as well as state-dependent inputs (i.e., possibly a state feedback control input u(t) = ϕ(x(t)), where ϕ : R^n → R^m is a feedback control law). If u(•) is a Markov control process, then the stochastic differential Equation (1) is an Itô diffusion, and if its solution is unique, then the solution is a Markov process.
For an Itô diffusion system with solution x(•), the (infinitesimal) generator A of x(•) is an operator acting on continuous functions V : R^n → R and is defined as in (4) ([31]). The set of functions V : R^n → R for which the limit in (4) exists is the domain of A, and in (5) we write V′(x) for the gradient of V at x and V″(x) for the Hessian of V at x. Note that the differential operator L introduced in (5) is defined for every V ∈ C^2(R^n) and is characterized by the system drift and diffusion functions. We will refer to the differential operator L as the (infinitesimal) generator of the system G. However, if control inputs that are discontinuous in the state variables are considered, then the concept of the extended generator [36] should be used.
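For V ∈ C^2(R^n), the operator in (5) acts as LV(x) = V′(x)f(x) + (1/2)tr[D^T(x)V″(x)D(x)]. As an illustrative aside (the helper name and the example system are ours, not from the paper), this can be computed symbolically:

```python
import sympy as sp

def generator(V, f, D, x):
    """Symbolic infinitesimal generator LV = V'(x) f(x) + (1/2) tr(D^T V'' D),
    i.e., the differential-operator form of (5) (illustrative helper)."""
    grad = sp.Matrix([V]).jacobian(x)        # 1 x n row vector V'(x)
    hess = sp.hessian(V, list(x))            # n x n Hessian V''(x)
    return sp.simplify((grad * f)[0, 0]
                       + sp.Rational(1, 2) * (D.T * hess * D).trace())

x1, x2 = sp.symbols('x1 x2', real=True)
x = sp.Matrix([x1, x2])
V = x1**2 + x2**2                  # quadratic Lyapunov candidate
f = sp.Matrix([-x1, -x2])          # stable linear drift
D = sp.Matrix([[x1], [x2]])        # multiplicative (linear) diffusion
LV = generator(V, f, D, x)         # simplifies to -(x1**2 + x2**2)
```

Here the diffusion contributes the Itô correction (1/2)tr(D^T V″ D) = x1² + x2², which is dominated by the drift term, so LV remains negative definite.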
If V ∈ C^2(R^n), then it follows from Itô's formula [35] that the stochastic process {V(x(t)) : t ≥ 0} satisfies (6). If the terms appearing in (6) are integrable and the Itô integral in (6) is a martingale, then (7) follows from (6). The next result is standard and establishes existence and uniqueness of solutions for the controlled Itô diffusion system (1).

Theorem 1 ([32]). Consider the stochastic dynamical system (1) with initial condition x_0 such that E[∥x_0∥^p] < ∞, p ∈ N. Let u(•) ∈ U be a Markov control process given by u(t) = ϕ(t, x(t)), t ≥ 0, such that the following conditions hold:
(i) Local Lipschitz continuity. For every a ≥ 0, there exists a constant K_a > 0 such that
∥F(x, ϕ(t, x)) − F(y, ϕ(t, y))∥ + ∥D(x, ϕ(t, x)) − D(y, ϕ(t, y))∥ ≤ K_a ∥x − y∥, (8)
for every x, y ∈ R^n with ∥x∥ + ∥y∥ ≤ a, and every t ≥ 0.
(ii) Linear growth. There exists a constant K > 0 such that the growth condition (9) holds for all x ∈ R^n and t ≥ 0.
Then, there exists a unique solution to (1) with input u(•), and, furthermore, the moment bound (10) holds.

Assumption 1. For the remainder of the paper we assume that the conditions for existence and uniqueness given in Theorem 1 are satisfied for the system (1) and (2).

Stability Theory for Stochastic Dynamical Systems
Given a feedback control law ϕ, the closed-loop system (1) takes the form (11), where, for convenience, we have defined the closed-loop drift function f(x) ≜ F(x, ϕ(x)) and we have omitted the dependence of D on its second argument so that D(x) ≜ D(x, ϕ(x)). In this case, the infinitesimal generator of the closed-loop system (11) is given by (12). Next, we define the notion of stochastic stability for the closed-loop system (11). An equilibrium point of (11) is a point x_e ∈ R^n such that f(x_e) = 0 and D(x_e) = 0. If x_e is an equilibrium point of (11), then the constant stochastic process x(•) ≡ x_e is a solution of (11) with initial condition x(0) = x_e. The following definition introduces several notions of stability in probability for the equilibrium solution x(•) ≡ x_e of the stochastic dynamical system (11). Here, the initial condition x_0 is assumed to be a constant, and hence, whenever we write x_0 ∈ R^n we mean that x_0 is a constant vector. It is important to note that if we assume that x_0 is an F_0-measurable random vector, then we replace x_0 ∈ B_δ(x_e) with x_0 ∈ B_δ(x_e) almost surely in Definition 1. As shown in [32, p. 111], this is without loss of generality in addressing stochastic stability of an equilibrium point.
Definition 1 ([29,32]).
(i) The equilibrium solution x(•) ≡ x_e to (11) is Lyapunov stable in probability if, for every ε > 0, (13) holds. Equivalently, the equilibrium solution x(•) ≡ x_e to (11) is Lyapunov stable in probability if, for every ε > 0 and ρ ∈ (0, 1), there exists δ = δ(ρ, ε) > 0 such that (14) holds for all x_0 ∈ B_δ(x_e).
(ii) The equilibrium solution x(•) ≡ x_e to (11) is asymptotically stable in probability if it is Lyapunov stable in probability and (15) holds. Equivalently, the equilibrium solution x(•) ≡ x_e to (11) is asymptotically stable in probability if it is Lyapunov stable in probability and, for every ρ ∈ (0, 1), there exists δ = δ(ρ) > 0 such that if x_0 ∈ B_δ(x_e), then (16) holds.
(iii) The equilibrium solution x(•) ≡ x_e to (11) is globally asymptotically stable in probability if it is Lyapunov stable in probability and (17) holds for all x_0 ∈ R^n.
(iv) The equilibrium solution x(•) ≡ x_e to (11) is exponentially p-stable in probability if there exist scalars α, β, δ > 0 and p ≥ 1 such that if x_0 ∈ B_δ(x_e), then (18) holds. If, in addition, (18) holds for all x_0 ∈ R^n, then the equilibrium solution x(•) ≡ x_e to (11) is globally exponentially p-stable in probability. Finally, if p = 2, we say that the equilibrium solution x(•) ≡ x_e to (11) is globally exponentially mean-square stable in probability.
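The ε-escape probabilities appearing in Definition 1 can be estimated by Monte Carlo simulation. The sketch below (our own illustrative scalar test system dx = −x dt + x dw and parameter choices, not taken from the paper) estimates P_{x_0}[sup_t ∥x(t)∥ > ε] and illustrates that it shrinks as x_0 approaches the equilibrium, as Lyapunov stability in probability requires.

```python
import numpy as np

def escape_probability(x0, eps=0.5, n_paths=200, T=5.0, dt=1e-2, seed=0):
    """Monte Carlo estimate of P_x0[ sup_t |x(t)| > eps ] for the scalar
    test system dx = -x dt + x dw (illustrative, not from the paper)."""
    rng = np.random.default_rng(seed)
    steps = int(T / dt)
    escapes = 0
    for _ in range(n_paths):
        x = x0
        sup = abs(x)
        for _ in range(steps):
            # Euler-Maruyama step with multiplicative noise
            x = x + (-x) * dt + x * rng.normal(0.0, np.sqrt(dt))
            sup = max(sup, abs(x))
        escapes += sup > eps
    return escapes / n_paths

p_far = escape_probability(0.4)    # starts close to the epsilon-ball boundary
p_near = escape_probability(0.05)  # starts near the equilibrium
```

For this system the origin is asymptotically stable in probability (the Lyapunov exponent of log|x| is negative), and the estimated escape probability indeed decreases with the initial distance to the origin.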
We now provide sufficient conditions for local and global asymptotic stability in probability for the nonlinear stochastic dynamical system (11).

Theorem 2 ([29]). Let D be an open subset containing the point x_e. Consider the nonlinear stochastic dynamical system (11) and assume that there exists a two-times continuously differentiable function V : D → R such that V(x_e) = 0, V(x) > 0 for all x ∈ D, x ≠ x_e, and LV(x) ≤ 0, x ∈ D.
The equilibrium solution x(•) ≡ x_e to (11) is then Lyapunov stable in probability. If, in addition, LV(x) < 0 for all x ∈ D, x ≠ x_e, then the equilibrium solution x(•) ≡ x_e to (11) is asymptotically stable in probability. Finally, if, in addition, D = R^n and V is radially unbounded, then the equilibrium solution x(•) ≡ x_e to (11) is globally asymptotically stable in probability.
Finally, the next result gives a Lyapunov theorem for global exponential stability in probability.

Theorem 3 ([29]). Consider the nonlinear stochastic dynamical system (11) and assume that there exist a two-times continuously differentiable function V : R^n → R and scalars α, β, γ > 0 and p ≥ 1 such that α∥x − x_e∥^p ≤ V(x) ≤ β∥x − x_e∥^p and LV(x) ≤ −γV(x) for all x ∈ R^n. Then the equilibrium solution x(•) ≡ x_e to (11) is globally exponentially p-stable in probability.

Dissipativity Theory for Stochastic Dynamical Systems
In this section, we recall several key results from [28] on stochastic dissipativity that are necessary for the developments of this paper. For the dynamical system G given by (1) and (2), a function r : R^m × R^l → R is called a supply rate if the integrability condition (25) holds for all t ≥ 0 and u(•) ∈ U.

Definition 2 ([28]). A nonlinear stochastic dynamical system G given by (1) and (2) is stochastically dissipative with respect to the supply rate r if there exists a nonnegative-definite measurable function V_s : R^n → R, called a storage function, such that the stochastic process {V_s(x(t)) − ∫_0^t r(u(s), y(s)) ds : t ≥ 0} is a supermartingale, where x(•) is the solution to (1) with u(•) ∈ U. In this case, the dissipation inequality (26) holds for all t_1 ≤ t_2 or, equivalently, since (25) holds, the expected dissipation inequality (27) holds.

The next result shows that if the system storage function V_s is two-times continuously differentiable, then, under certain regularity conditions, stochastic dissipativity given by the energetic dissipation inequality in expectation (27) can be characterized by the infinitesimal generator LV_s.

Theorem 4 ([28]). Consider the nonlinear stochastic dynamical system G given by (1) and (2). Let V_s ∈ C^2(R^n) be nonnegative definite and let r be a supply rate for G. Assume that, for all u(•) ∈ U, the stochastic process {∫_0^t V′_s(x(s))D(x(s), u(s)) dw(s) : t ≥ 0} is a martingale and E[∫_0^t |LV_s(x(s), u(s))| ds] < ∞, t ≥ 0.
Furthermore, assume that, for every x ∈ R^n and u ∈ R^m, there exists an input u_u(•) ∈ U, with u_u(0) = u, such that, with input u_u(•) and deterministic initial condition x_0 = x, the mappings t → E[LV_s(x(t), u_u(t))] and t → E[r(u_u(t), H(x(t), u_u(t)))] are continuous at t = 0. Then, G is stochastically dissipative with respect to the supply rate r and with the storage function V_s if and only if E[V_s(x(t))] < ∞ for every t ≥ 0 and u(•) ∈ U, and the power balance inequality (28) holds.

The next theorem shows that the regularity conditions needed in Theorem 4 to characterize dissipativity using the power balance inequality (28) are satisfied for a broad class of stochastic dynamical systems. For the statement of this result, we say that a function g : R^n → R is of polynomial growth if there exist positive constants C and m such that |g(x)| ≤ C(1 + ∥x∥^m), x ∈ R^n, and we say that g is of class C_p^r(R^n) if g is r-times continuously differentiable and g and all its partial derivatives up to order r are of polynomial growth.
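The power balance inequality (28), LV_s(x, u) ≤ r(u, y), can be checked symbolically for simple systems. The following sketch (our own scalar example with the passivity supply rate r(u, y) = uy; not from the paper) verifies it for dx = (−x + u)dt + σx dw with output y = x and storage V_s(x) = x²/2:

```python
import sympy as sp

x, u, sigma = sp.symbols('x u sigma', real=True)

# Illustrative scalar system (not from the paper):
#   dx = (-x + u) dt + sigma*x dw,   y = x
f = -x + u          # drift
d = sigma * x       # diffusion
y = x               # output

Vs = x**2 / 2       # candidate storage function

# Generator: LVs = Vs'(x) f + (1/2) d^2 Vs''(x)
LVs = sp.diff(Vs, x) * f + sp.Rational(1, 2) * d**2 * sp.diff(Vs, x, 2)

r = u * y           # passivity supply rate

surplus = sp.simplify(r - LVs)   # simplifies to (1 - sigma**2/2) * x**2
```

The surplus r − LV_s equals (1 − σ²/2)x², so the power balance inequality (28) holds whenever σ² ≤ 2; the Itô correction term thus erodes, and can destroy, dissipativity as the noise intensity grows.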
Theorem 5 ([28]). Consider the nonlinear stochastic dynamical system G given by (1) and (2), let V_s ∈ C_p^2(R^n) be a nonnegative definite function, and let the set of admissible inputs U be a set of Markov control processes such that, for every u(•) ∈ U with ϕ : R_+ × R^n → R^m, there exist a positive constant m_1 and a continuous function bounding the growth of ϕ. Assume that, for every u ∈ R^m, the constant input u(t) ≡ u belongs to U and the mapping x → r(u, H(x, u)) is continuous on R^n. Then, r is a supply rate of G, and the stochastic process {V_s(x(t)) : t ≥ 0} is integrable for every u(•) ∈ U. Furthermore, G is stochastically dissipative with respect to the supply rate r and with the storage function V_s if and only if (28) holds.
Theorem 4 gives an equivalent characterization of stochastic dissipativity, as defined by the energetic (i.e., supermartingale) Definition 2, using the power balance inequality (28). The energetic (i.e., supermartingale) definition of dissipativity requires the verification of (26), which is sample-path dependent and can be difficult to verify in practice, whereas (28) is an algebraic condition for dissipativity involving a local power balance inequality using the system drift and diffusion functions of the stochastic dynamical system. This equivalence holds under the regularity conditions stated in Theorem 4.

Assumption 2.
For the rest of the paper we assume that the regularity conditions for the equivalence between the supermartingale definition of dissipativity (27) and the power balance inequality (28) are satisfied. That is, we assume that the system given by (1) and (2) is dissipative if and only if (28) holds. Note that Theorem 5 gives sufficient conditions for the regularity conditions to hold by imposing polynomial growth constraints on the storage and supply rate functions.

Connections Between Stability Analysis and Nonlinear-Nonquadratic Performance Evaluation
In this section, we provide connections between stochastic Lyapunov functions and nonlinear-nonquadratic performance evaluation. Specifically, we present sufficient conditions for stability and performance for a given nonlinear stochastic dynamical system with a nonlinear-nonquadratic performance measure. As in the deterministic theory [15,16], the cost functional can be explicitly evaluated as long as it is related to an underlying Lyapunov function. For the following result, let f : R^n → R^n and D : R^n → R^{n×d} be such that f(0) = 0 and D(0) = 0.

Theorem 6. Consider the nonlinear stochastic dynamical system given by (11) with nonlinear-nonquadratic performance measure (31), where x(•) is the solution to (11). Furthermore, assume that there exists a two-times continuously differentiable, radially unbounded function V ∈ C_p^1(R^n) satisfying (32)-(35). Then the zero solution x(•) ≡ 0 to (11) is globally asymptotically stable in probability and J(x_0) = V(x_0).

Proof. Conditions (32)-(34) are a restatement of (19)-(21). This, along with V being radially unbounded, implies that the zero solution x(•) ≡ 0 of (11) is globally asymptotically stable in probability by Theorem 2.
Next, we show that the Itô integral appearing in the proof is a martingale. The integrand is F_t-adapted because of the measurability of the mappings involved and the properties of the process x(•). Now, using Tonelli's theorem [37], it follows that, for all t ≥ 0, the bound (37) holds for some positive constants α and β, and hence, the Itô integral is a martingale. To arrive at (37), we used the fact that V ∈ C_p^1(R^n), the linear growth condition (9), and the finiteness of the expected value of the supremum of the moments of the system state (10). Note that the supremum in (37) exists because of the continuity of the sample paths of x(•).
It follows from (35) and Itô's lemma [31] that (38) holds for all t ≥ 0. Taking the expected value on both sides of (38) and using the martingale property of the stochastic integral in (38) yields (39), where we used the fact that V ∈ C_p^1(R^n) and (10) holds. Now, taking the limit as t → ∞ in (39), we obtain J(x_0) = V(x_0), where we used the fact that global asymptotic stability in probability implies that V(x(t)) is a nonnegative supermartingale with lim_{t→∞} V(x(t)) = 0 almost surely [29] and, by Theorem 5.1 of [29], in expectation. Finally, note that the interchanging of the integration with the expectation operator in (40) follows from the Lebesgue monotone convergence theorem [38] by noting that ∫_0^t L(x(s)) ds, t ≥ 0, is monotone increasing, and hence, converges pointwise to lim_{t→∞} ∫_0^t L(x(s)) ds, and by noting (34) and (35).

Next, we specialize Theorem 6 to linear stochastic systems. For this result, let A ∈ R^{n×n}, let σ ∈ R^d, and let R ∈ R^{n×n} be a positive-definite matrix.
Corollary 1. Consider the linear stochastic dynamical system with multiplicative noise given by

dx(t) = Ax(t)dt + x(t)σ^T dw(t), x(0) = x_0 a.s., t ≥ 0, (41)

and with quadratic performance measure (42). Furthermore, assume that there exists a positive-definite matrix P ∈ R^{n×n} such that

0 = A^T P + PA + ∥σ∥^2 P + R. (43)

Then, the zero solution x(•) ≡ 0 to (41) is globally asymptotically stable in probability and J(x_0) = x_0^T P x_0.

Proof. The result is a direct consequence of Theorem 6 with f(x) = Ax, D(x) = xσ^T, L(x) = x^T Rx, and V(x) = x^T Px. Specifically, conditions (32) and (33), along with the requirements that V be two-times continuously differentiable, radially unbounded, and of class C_p^1(R^n), are trivially satisfied. Now, LV(x) = 2x^T PAx + ∥σ∥^2 x^T Px = x^T(A^T P + PA + ∥σ∥^2 P)x, and hence, it follows from (43) that conditions (34) and (35) hold. Thus, all the conditions of Theorem 6 are satisfied.
Note that (43) is a Lyapunov equation, and hence, for every positive-definite matrix R, there exists a positive-definite matrix P satisfying (43) as long as A + (1/2)∥σ∥^2 I_n is Hurwitz, that is, as long as the eigenvalues of A have real part less than −(1/2)∥σ∥^2. Thus, a continuous-time linear stochastic system driven by a multiplicative Wiener process is globally asymptotically stable in probability if the spectral abscissa of A is less than −(1/2)∥σ∥^2.
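Numerically, the condition surrounding (43) amounts to checking that A + (1/2)∥σ∥²I_n is Hurwitz and solving a shifted Lyapunov equation. A sketch using illustrative matrices (the numerical values are ours, not from the paper):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Illustrative data (not from the paper).
A = np.array([[-2.0, 1.0], [0.0, -3.0]])
sigma = np.array([0.5, 0.5])     # multiplicative-noise vector
R = np.eye(2)

s2 = sigma @ sigma               # ||sigma||^2
shift = A + 0.5 * s2 * np.eye(2)

# Stability condition: spectral abscissa of A less than -||sigma||^2 / 2,
# i.e., the shifted matrix is Hurwitz.
stable = np.max(np.linalg.eigvals(shift).real) < 0

# Solve 0 = A^T P + P A + ||sigma||^2 P + R, rewritten as the standard
# Lyapunov equation shift^T P + P shift = -R.
P = solve_continuous_lyapunov(shift.T, -R)
```

Since `solve_continuous_lyapunov(a, q)` solves a·X + X·aᴴ = q, passing the transposed shifted drift and −R yields exactly the Lyapunov equation (43), and P is positive definite whenever the shifted matrix is Hurwitz.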

Optimal Nonlinear Feedback Control for Stochastic Systems
In this section, we consider a control problem involving a notion of optimality with respect to a nonlinear-nonquadratic cost functional. We use the results developed in Theorem 6 to characterize optimal feedback controllers that guarantee closed-loop global stabilization in probability. Specifically, sufficient conditions for optimality are given in a form that corresponds to a steady-state version of the stochastic Hamilton-Jacobi-Bellman equation. For the following result, let F : R^n × R^m → R^n and D : R^n × R^m → R^{n×d} be such that F(0, 0) = 0 and D(0, 0) = 0.

Theorem 7. Consider the nonlinear stochastic dynamical system given by (1) with nonlinear-nonquadratic performance measure (45), where x(•) is the solution to (1) with control input u(•). Furthermore, assume that there exist a two-times continuously differentiable, radially unbounded function V ∈ C_p^1(R^n) and a feedback control law ϕ : R^n → R^m satisfying (46)-(49), the steady-state Hamilton-Jacobi-Bellman condition (50), and the Hamiltonian inequality (51), where H denotes the Hamiltonian associated with (1) and (45). Then, with the feedback control u(•) = ϕ(x(•)), the zero solution x(•) ≡ 0 of the closed-loop system (11) is globally asymptotically stable in probability and the closed-loop cost is given by V(x_0). In addition, the feedback control u(•) = ϕ(x(•)) minimizes (45) in the sense that (53) holds, where S(x_0) denotes the set of controllers given by

S(x_0) ≜ {u(•) : u(•) is admissible and x(•) given by (1) satisfies the transversality condition (55)}.

Proof. Global asymptotic stability in probability is a direct consequence of (46)-(49) by applying Theorem 6 to the closed-loop system (11). Furthermore, using (50), (53) is a restatement of (37) as applied to the closed-loop system.
Note that (50) is the steady-state version of the stochastic Hamilton-Jacobi-Bellman equation. To see this, recall that the stochastic Hamilton-Jacobi-Bellman equation characterizes the optimal control for stochastic time-varying systems on a finite or infinite time interval [30]. For infinite-horizon time-invariant systems, V(t, x) = V(x), and hence, the stochastic Hamilton-Jacobi-Bellman equation collapses to (50) and (51), which guarantee optimality with respect to the set of admissible controllers S(x_0). Note that an explicit characterization of the set S(x_0) is not required and that the optimal stabilizing feedback control law u = ϕ(x) is independent of the initial condition x_0.
Even though for an optimal controller u(•) = ϕ(x(•)) the transversality condition in (55) is satisfied, the transversality condition is a sample-path dependent condition that can be difficult to verify for an arbitrary control input u(•) ∈ S(x_0). The next theorem circumvents this problem by requiring additional restrictions on the cost integrand L and the Lyapunov function V.

Theorem 8. Consider the nonlinear stochastic dynamical system given by (1) with the nonlinear-nonquadratic performance measure (45), where the cost integrand satisfies a polynomial lower bound in ∥x∥ for some positive constants γ and p ≥ 1. Assume that there exist a two-times continuously differentiable function V ∈ C_p^1(R^n) and a control law ϕ : R^n → R^m such that (48), (50), and (51) hold and, for positive constants α and β, the bounds (66) and (67) hold. Then, with the feedback control u(•) = ϕ(x(•)), the zero solution x(•) ≡ 0 of the closed-loop system (11) is globally exponentially p-stable in probability and (53) holds. In addition, the feedback control u(•) = ϕ(x(•)) minimizes (45) in the sense that (68) holds, where Ŝ(x_0) denotes the set of controllers given by (69).

Proof. Global exponential p-stability in probability is a direct consequence of (66), (67), and (50) by applying Theorem 3 to the closed-loop system (11). To show (53) and (68), first note that Theorem 7 holds. Therefore, we need only show that, with (66) and (67), S(x_0) = Ŝ(x_0). That is, any input u(•) with finite cost (and hence belonging to Ŝ(x_0)) automatically satisfies the transversality condition (and hence belongs to S(x_0)).
Next, we specialize Theorem 8 to linear stochastic dynamical systems and provide connections to the stochastic optimal linear-quadratic regulator problem with multiplicative noise. For the following result, let A ∈ R^{n×n}, B ∈ R^{n×m}, σ ∈ R^d, and let R_1 ∈ R^{n×n} and R_2 ∈ R^{m×m} be given positive-definite matrices.
Corollary 2. Consider the linear controlled stochastic dynamical system with multiplicative noise given by (76) and with a quadratic performance measure. Furthermore, assume that there exists a positive-definite matrix P ∈ R^{n×n} such that

0 = A^T P + PA + ∥σ∥^2 P + R_1 − PBR_2^{−1}B^T P.

Then, with the feedback control u = ϕ(x) = −R_2^{−1}B^T Px, the zero solution x(•) ≡ 0 to (76) is globally exponentially mean-square stable in probability and the quadratic performance measure is minimized over Ŝ(x_0), where Ŝ(x_0) is the set of controllers defined in (69) for (76) and x_0 ∈ R^n.
The optimal feedback control law ϕ in Corollary 2 is derived using the properties of H as defined in Theorem 7. Specifically, since H is quadratic in u, the minimizing control is obtained by setting ∂H/∂u = 0.
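For the linear-quadratic case of Corollary 2, setting ∂H/∂u = 0 gives u = −R_2^{−1}B^T Px with P solving a Riccati equation in which the multiplicative-noise term ∥σ∥²P can be absorbed into a shifted drift matrix. A numerical sketch (illustrative data, ours, not from the paper):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Illustrative data (not from the paper).
A = np.array([[0.0, 1.0], [1.0, 0.0]])
B = np.array([[0.0], [1.0]])
sigma = np.array([0.3, 0.4])
R1, R2 = np.eye(2), np.eye(1)

s2 = sigma @ sigma                    # ||sigma||^2
A_shift = A + 0.5 * s2 * np.eye(2)    # absorb the noise term into the drift

# Solves 0 = A^T P + P A + ||sigma||^2 P + R1 - P B R2^{-1} B^T P,
# since the first three terms equal A_shift^T P + P A_shift.
P = solve_continuous_are(A_shift, B, R1, R2)
K = np.linalg.solve(R2, B.T @ P)      # optimal gain, u = -K x
```

Because `solve_continuous_are(a, b, q, r)` solves aᵀX + Xa − Xbr⁻¹bᵀX + q = 0, the shift trick reduces the multiplicative-noise Riccati equation to a standard continuous-time algebraic Riccati equation.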

Inverse Optimal Stochastic Control
In this section, we specialize Theorem 7 to systems that are affine in the control. Specifically, we devise nonlinear feedback controllers within a stochastic optimal control framework, aiming to minimize a nonlinear-nonquadratic performance criterion. This is achieved by selecting the controller in such a way that the infinitesimal generator of the Lyapunov function is negative definite along the sample trajectories of the closed-loop system. We also establish sufficient conditions for the existence of asymptotically stabilizing solutions (in probability) to the stochastic Hamilton-Jacobi-Bellman equation. Consequently, these findings present a set of globally stabilizing controllers, parameterized by the minimized cost functional.
The controllers developed in this section are based on an inverse optimal stochastic control problem [20][21][22][23][24][25][26]. To simplify the solution of the stochastic steady-state Hamilton-Jacobi-Bellman equation, we do not attempt to minimize a given cost functional. Instead, we parameterize a family of stochastically stabilizing controllers that minimize a derived cost functional, offering flexibility in defining the control law. The performance integrand explicitly depends on the nonlinear system dynamics, the Lyapunov function for the closed-loop system, and the stabilizing feedback control law. This coupling is introduced through the stochastic Hamilton-Jacobi-Bellman equation. Therefore, by adjusting parameters in the Lyapunov function and the performance integrand, the proposed framework can characterize a class of globally stabilizing controllers in probability, meeting specified constraints on the closed-loop system response.
Consider the nonlinear stochastic dynamical system affine in the control given by (81), where f : R^n → R^n satisfies f(0) = 0, G : R^n → R^{n×m}, and D : R^n → R^{n×d} satisfies D(0) = 0. Furthermore, we consider performance integrands L of the form (82), where L_1 : R^n → R, L_2 : R^n → R^{1×m}, and R_2 : R^n → P^m, and where P^m denotes the set of m × m positive-definite matrices, so that (45) becomes (83).

Theorem 9. Consider the nonlinear controlled affine stochastic dynamical system (81) with performance measure (83). Assume that there exist a two-times continuously differentiable, radially unbounded function V ∈ C_p^1(R^n) and a function L_2 : R^n → R^{1×m} such that V(0) = 0 (84), V(x) > 0 for x ≠ 0 (85), L_2(0) = 0 (86), and conditions (87) and (88) hold. Then the zero solution x(•) ≡ 0 of the closed-loop system (89) is globally asymptotically stable in probability with the feedback control law (90), and the performance measure (83), with L_1 given by (91), is minimized in the sense that (92) holds.

Proof. The result is a direct consequence of Theorem 7 with F(x, u) = f(x) + G(x)u, D(x, u) = D(x), and L(x, u) = L_1(x) + L_2(x)u + u^T R_2(x)u. Specifically, with (82) the Hamiltonian takes the form H(x, u) = L(x, u) + V′(x)[f(x) + G(x)u] + (1/2)tr D^T(x)V″(x)D(x). Now, the feedback control law (90) is obtained by setting ∂H/∂u = 0. With (90), it follows that (84), (85), and (88) imply (46), (47), and (49), respectively. Next, since V is two-times continuously differentiable and x = 0 is a local minimum of V, it follows that V′(0) = 0, and hence, since by assumption L_2(0) = 0, it follows that ϕ(0) = 0, which implies (48). Next, with L_1 given by (91) and ϕ given by (90), (50) holds. Finally, since H(x, u) − H(x, ϕ(x)) = [u − ϕ(x)]^T R_2(x)[u − ϕ(x)] and R_2(x) is positive definite for all x ∈ R^n, condition (51) holds. The result now follows as a direct consequence of Theorem 7.
Note that (88) is equivalent to (94) with ϕ given by (90). Furthermore, conditions (84), (85), and (94) ensure that V is a Lyapunov function for the closed-loop system (89). As outlined in [16], it is important to note that the function L_2 appearing in the integrand of the performance measure (82) is an arbitrary function of x ∈ R^n subject only to conditions (86) and (88). Therefore, L_2 offers flexibility in the selection of the control law. With L_1 given by (91) and ϕ given by (90), L is given by (95). Since R_2(x) > 0, x ∈ R^n, the first term on the right-hand side of (95) is nonnegative, whereas (94) implies that the second, third, and fourth terms collectively are nonnegative; nevertheless, L itself may be negative. As a result, there may exist a control input u(•) for which the performance measure J(x_0, u(•)) is negative. However, if the control u(•) is a regulation controller, that is, u(•) ∈ S(x_0), then it follows from (92) and (93) that J(x_0, u(•)) ≥ 0. Furthermore, in this case, substituting u = ϕ(x) into (95) yields an expression for L(x, ϕ(x)) which, by (94), is positive.
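Since the Hamiltonian of Theorem 9 is quadratic in u, the stationarity condition ∂H/∂u = 0 can be verified symbolically. The scalar-input sketch below (symbol names are ours) recovers the familiar form u = −(1/2)R_2^{−1}[L_2 + V′G] of the minimizing feedback; the Itô correction term is independent of u and is therefore omitted.

```python
import sympy as sp

u, L1, L2, Vp, f, G = sp.symbols('u L1 L2 Vp f G', real=True)
R2 = sp.Symbol('R2', positive=True)   # scalar positive "weight" R_2(x)

# Scalar-input Hamiltonian for the affine system (81) with cost
# integrand (82); Vp stands for V'(x), and the u-independent Ito
# correction (1/2) tr(D^T V'' D) is dropped.
H = L1 + L2*u + R2*u**2 + Vp*(f + G*u)

phi = sp.solve(sp.diff(H, u), u)[0]   # stationary point of H in u
expected = -(L2 + Vp*G) / (2*R2)      # u = -(1/2) R2^{-1} (L2 + V' G)
```

Because H is strictly convex in u (R_2 > 0), the stationary point is indeed the global minimizer, which is exactly the completion-of-squares argument used in the proof of Theorem 9.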
Example 1. To illustrate the utility of Theorem 9, we consider the global stabilization of a stochastic version of the Lorenz equations [43]. These equations model fluid convection and are known to exhibit chaotic behavior. To construct inverse optimal controllers for the controlled Lorenz stochastic dynamical system, consider the system (99)-(101), where α, r, b, σ 1 , σ 2 , and σ 3 > 0. Note that (99)-(101) can be written in the form of (81). In order to design an inverse optimal control law for the controlled Lorenz stochastic dynamical system (99)-(101), consider a quadratic Lyapunov function candidate V satisfying (88). Hence, the feedback control law ϕ(x) = −((p 1 /p 2 )α + r)x 1 given by (90) globally stabilizes the controlled Lorenz dynamical system (99)-(101). Furthermore, the performance functional (83) is minimized in the sense of (92).
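The explicit form of (99)-(101) is not recoverable here, but the structure of the feedback law ϕ(x) = −((p 1 /p 2 )α + r)x 1 suggests the control enters the second Lorenz equation. A minimal Monte Carlo sketch, assuming that placement, multiplicative noise D(x) = diag(σ 1 x 1 , σ 2 x 2 , σ 3 x 3 ) (consistent with D(0) = 0), and illustrative values p 1 = p 2 = 1, α = 10, r = 28, b = 8/3:

```python
import numpy as np

# Hypothetical realization of the controlled stochastic Lorenz system (99)-(101).
# The control placement, noise structure, and parameter values are illustrative
# assumptions, not taken from the paper.
alpha, r, b = 10.0, 28.0, 8.0 / 3.0
sigma = np.array([0.05, 0.05, 0.05])   # sigma_i: multiplicative noise gains
p1, p2 = 1.0, 1.0                      # weights of the assumed quadratic Lyapunov function

def phi(x1):
    # Inverse optimal feedback law from (90): phi(x) = -((p1/p2)*alpha + r)*x1
    return -((p1 / p2) * alpha + r) * x1

def drift(x):
    # Closed-loop Lorenz drift with the control entering the second equation; x: (paths, 3)
    u = phi(x[:, 0])
    return np.stack([alpha * (x[:, 1] - x[:, 0]),
                     r * x[:, 0] - x[:, 1] - x[:, 0] * x[:, 2] + u,
                     x[:, 0] * x[:, 1] - b * x[:, 2]], axis=1)

def simulate(x0, T=5.0, dt=1e-3, paths=200, seed=0):
    # Euler-Maruyama sample paths of the closed-loop diffusion
    rng = np.random.default_rng(seed)
    x = np.tile(np.asarray(x0, dtype=float), (paths, 1))
    for _ in range(int(T / dt)):
        dw = rng.standard_normal(x.shape) * np.sqrt(dt)
        x = x + drift(x) * dt + sigma * x * dw   # D(x) dw with D(x) = diag(sigma_i x_i)
    return x

final = simulate([1.0, 1.0, 1.0])
print("mean final state norm:", np.mean(np.linalg.norm(final, axis=1)))
```

With these assumptions the deterministic part of the Lyapunov derivative is negative definite, so the sample paths contract toward the origin, mirroring the behavior reported in Figure 1.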
Figure 1 shows the mean, along with the standard deviation, of 1000 sample paths of the closed-loop system.

The next theorem is similar to Theorem 9 and is included here because it provides the basis for the stability margin results given in the next sections.
Theorem 10. Consider the nonlinear controlled affine stochastic dynamical system (81) with performance measure (83). Assume that there exists a two-times continuously differentiable, radially unbounded function V ∈ C 1 p (R n ) and a function L 2 : R n → R 1×m satisfying conditions (105)-(109), with L 2 (0) = 0 (107). Then the zero solution x(•) ≡ 0 of the closed-loop system is globally asymptotically stable in probability with the feedback control law (111), and the performance functional (83) is minimized.

Proof. The proof is identical to the proof of Theorem 9 and, hence, is omitted.

Relative Stability Margins for Optimal Nonlinear Stochastic Regulators
In this section, we establish relative stability margins for both optimal and inverse optimal nonlinear stochastic feedback regulators. Specifically, we derive sufficient conditions ensuring gain, sector, and disk margin guarantees for nonlinear stochastic dynamical systems under the control of nonlinear optimal and inverse optimal Hamilton-Jacobi-Bellman controllers. These controllers minimize a nonlinear-nonquadratic performance criterion that includes cross-weighting terms. When the cross-weighting term in the performance criterion is omitted, our results align with the gain, sector, and disk margins derived for the deterministic optimal control problem in [25].
Alternatively, by retaining the cross-terms in the performance criterion and specializing the optimal nonlinear-nonquadratic problem to a stochastic linear-quadratic problem with a noise disturbance, our results recover the gain and phase margins for the deterministic linear-quadratic optimal control problem presented in [44]. Although the inclusion of cross-weighting terms degrades the gain, sector, and disk margins, the added flexibility afforded by these terms allows optimal and inverse optimal nonlinear controllers to exhibit superior transient performance compared to meaningful inverse optimal controllers.
To develop relative stability margins for nonlinear stochastic regulators, consider the nonlinear stochastic dynamical system G given by (114) and (115), where f : R n → R n satisfies f (0) = 0, G : R n → R n×m , D : R n → R n×d satisfies D(0) = 0, and ϕ : R n → R m is an admissible feedback controller such that G is globally asymptotically stable in probability with u = −y, with a nonlinear-nonquadratic performance criterion in which L 1 : R n → R, L 2 : R n → R 1×m , and R 2 : R n → R m×m are given such that R 2 (x) > 0, x ∈ R n , and L 2 (0) = 0.

Next, we define the relative stability margins for G given by (114) and (115). Specifically, let u c ≜ −y, y c ≜ u, and consider the negative feedback interconnection u = ∆(−y) of G and ∆ given in Figure 2, where ∆ is either a linear operator ∆(u c ) = ∆u c , a nonlinear static operator ∆(u c ) = σ(u c ), or a nonlinear dynamic operator ∆ with input u c and output y c . Furthermore, we assume that in the nominal case ∆ = I the nominal closed-loop system is globally asymptotically stable in probability.
Figure 2. Multiplicative input uncertainty of G and input operator ∆.
For the next two definitions, we assume that the system G and the nonlinear operator ∆ are asymptotically zero-state observable.

Definition 6 ([16]). Let α, β ∈ R be such that 0 < α ≤ 1 ≤ β < ∞. Then the nonlinear stochastic dynamical system G given by (114) and (115) is said to have a disk margin (α, β) if the negative feedback interconnection of G and ∆ is globally asymptotically stable in probability for all dynamic operators ∆ such that ∆ is stochastically dissipative with respect to the supply rate r c (u c , y c ) and with a two-times continuously differentiable, positive definite storage function.

Definition 7 ([16]). Let α, β ∈ R be such that 0 < α ≤ 1 ≤ β < ∞. Then the nonlinear stochastic dynamical system G given by (114) and (115) is said to have a structured disk margin (α, β) if the negative feedback interconnection of G and ∆ is globally asymptotically stable in probability for all dynamic operators ∆ such that ∆(u c ) = diag[δ 1 (u c1 ), . . ., δ m (u cm )] and δ i , i = 1, . . ., m, is stochastically dissipative with respect to the supply rate r ci (u ci , y ci ) and with a two-times continuously differentiable, positive definite storage function.

Note that if G has a disk margin (α, β), then G has gain and sector margins (α, β). The following lemma is needed for developing the main results of this section.

Lemma 1. Consider the nonlinear stochastic dynamical system G given by (114) and (115), where ϕ is a stochastically stabilizing feedback control law given by (111) and where V satisfies (117). Furthermore, suppose there exists θ ∈ R such that 0 < θ < 1 and (118) holds. Then, for all x ∈ R n and u ∈ R m , the corresponding dissipation inequality holds.
Proof. Note that it follows from (117) and (118) that, for all x ∈ R n and u ∈ R m , the stated inequality holds. This completes the proof.
Next, we present disk margins for the nonlinear-nonquadratic optimal regulator given by Theorem 10. We consider both the case in which R 2 (x), x ∈ R n , is a constant diagonal matrix and the case in which it is not.
Theorem 11. Consider the nonlinear stochastic dynamical system G given by (114) and (115), where ϕ is the stochastically stabilizing feedback control law given by (111) and where V ∈ C 1 p (R n ) is a two-times continuously differentiable, radially unbounded function that satisfies (105)-(109). Assume that G is asymptotically zero-state observable. If the matrix R 2 (x) = diag[r 1 , . . ., r m ], where r i > 0, i = 1, . . ., m, and there exists θ ∈ R such that 0 < θ < 1 and (118) is satisfied, then the nonlinear stochastic dynamical system G has a structured disk margin (1/(1 + θ), 1/(1 − θ)). If, in addition, R 2 (x) ≡ I and there exists θ ∈ R such that 0 < θ < 1 and (118) is satisfied, then the nonlinear stochastic dynamical system G has a disk margin (1/(1 + θ), 1/(1 − θ)).

Proof. Note that it follows from Lemma 1 that G satisfies a dissipation inequality. Hence, with the storage function V s (x) = (1/2)V(x), it follows that G is stochastically dissipative with respect to the supply rate r(u, y) = u T R 2 y + ((1 − θ^2)/2)u T R 2 u + (1/2)y T R 2 y. Now, the result is a direct consequence of Definitions 6 and 7, and the stochastic version of Corollary 6.2 given in [16], with α = 1/(1 + θ) and β = 1/(1 − θ).
For the next result, define the constants γ and γ̄ by (123), where R 2 (x) is such that γ̄ < ∞ and γ > 0.
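The display defining (123) was lost in extraction. A plausible reading, stated here only as an assumption, is that γ and γ̄ bound the extreme eigenvalues of R 2 (x); the ratio η = γ/γ̄ ≤ 1 then shrinks the guaranteed disk margin of Theorem 12 relative to the diagonal case of Theorem 11. A sketch under that assumption:

```python
import numpy as np

# Hypothetical reading of (123): gamma = inf_x lambda_min(R2(x)) and
# gamma_bar = sup_x lambda_max(R2(x)). For a constant R2 these are just its
# extreme eigenvalues; the matrix below is an illustrative choice.
R2 = np.array([[2.0, 0.5],
               [0.5, 1.0]])          # illustrative positive definite weight
eigs = np.linalg.eigvalsh(R2)        # ascending eigenvalues
gamma, gamma_bar = eigs[0], eigs[-1]
eta = gamma / gamma_bar              # eta <= 1 shrinks the guaranteed margin
theta = 0.5                          # any theta in (0, 1) satisfying (118)
disk = (1.0 / (1.0 + eta * theta), 1.0 / (1.0 - eta * theta))
print("eta =", eta, "disk margin =", disk)
```

For a diagonal R 2 with equal entries, η = 1 and the interval reverts to the (1/(1 + θ), 1/(1 − θ)) margin of Theorem 11.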
Theorem 12. Consider the nonlinear stochastic dynamical system G given by (114) and (115), where ϕ is the stochastically stabilizing feedback control law given by (111) and where V ∈ C 1 p (R n ) is a two-times continuously differentiable, radially unbounded function that satisfies (105)-(109). Assume that G is asymptotically zero-state observable. If there exists θ ∈ R such that 0 < θ < 1 and (118) is satisfied, then the nonlinear stochastic system G has a disk margin (1/(1 + ηθ), 1/(1 − ηθ)), where η = γ/γ̄ and γ, γ̄ are given by (123).

Proof. Note that, with the storage function V s (x) = (1/(2γ))V(x), G is stochastically dissipative with respect to the supply rate r(u, y) = u T y + ((1 − η^2 θ^2)/2)u T u + (1/2)y T y. The result now is a direct consequence of Definition 6 and the stochastic version of Corollary 6.2 given in [16], with α = 1/(1 + ηθ) and β = 1/(1 − ηθ).
It is important to note that Theorem 13 also holds in the case where (125) is replaced with (118), under the additional assumption that the function appearing in (118) is radially unbounded. To see this, note that (127) can be rewritten appropriately, in which case the result follows from Theorem 3.1 of [45]. Furthermore, note that in the case where R 2 (x), x ∈ R n , is diagonal, Theorem 13 guarantees larger gain and sector margins than those provided by Theorem 12. However, Theorem 13 does not provide disk margin guarantees.

Nonlinear Stochastic Feedback Regulators with Relative Stability Margin Guarantees
In this section, we give sufficient conditions that guarantee that a given nonlinear feedback controller has prespecified disk, sector, and gain margins.

Proposition 1. Let θ ∈ (0, 1) and let R 2 ∈ R m×m be a positive-definite matrix. Consider the nonlinear stochastic dynamical system G given by (114) and (115), where ϕ is a stochastically stabilizing feedback control law. Then there exist functions V : R n → R, L 1 : R n → R, and L 2 : R n → R 1×m such that V ∈ C 1 p (R n ) is a two-times continuously differentiable, radially unbounded function, V(0) = 0, V(x) > 0, x ∈ R n , x ̸ = 0, and (129) and (130) hold for all x ∈ R n , if and only if there exists a two-times continuously differentiable, radially unbounded function V ∈ C 1 p (R n ) satisfying (131).

Proof. If there exist functions V : R n → R, L 1 : R n → R, and L 2 : R n → R 1×m such that (129) and (130) are satisfied, then it follows from Lemma 1 that (131) is satisfied.
Conversely, if (131) is satisfied, then, with Q = R 2 , S = R 2 , and R = (1 − θ^2)R 2 , it follows from the stochastic version of Theorem 5.6 of [16] that, for all x ∈ R n , the required dissipation conditions hold, and the result follows by constructing L 1 and L 2 accordingly.

Note that if (129) and (130) are satisfied, then it follows from Theorem 9 that the feedback control law ϕ minimizes the cost functional (83). Hence, Proposition 1 provides necessary and sufficient conditions for optimality of a given stochastically stabilizing feedback control law with prespecified disk margin guarantees.
The following result presents specific disk margin guarantees for inverse optimal controllers.

Theorem 14. Let θ ∈ (0, 1) be given. Consider the nonlinear stochastic dynamical system G given by (114) and (115), where ϕ is a stochastically stabilizing feedback control law. Assume that G is asymptotically zero-state observable and that there exist functions V : R n → R and R 2 : R n → R m×m such that V ∈ C 1 p (R n ) is a two-times continuously differentiable, radially unbounded function, R 2 (x) > 0, x ∈ R n , and (135) holds. Then the nonlinear stochastic dynamical system G has a disk margin (1/(1 + ηθ), 1/(1 − ηθ)), where η = γ/γ̄ and γ, γ̄ are given by (123). Furthermore, with the feedback control law ϕ, the performance functional is minimized.

Proof. The result is a direct consequence of Theorems 9 and 12 with L 1 constructed as in Theorem 9. Specifically, in this case, all the conditions of Theorem 9 are trivially satisfied. Furthermore, note that (135) is equivalent to (118). The result now follows from Theorems 9 and 12.
The next result provides sufficient conditions that guarantee that a given nonlinear feedback controller has prespecified gain and sector margins.

Theorem 15. Let θ ∈ (0, 1) be given. Consider the asymptotically zero-state observable nonlinear stochastic dynamical system G given by (114) and (115), where ϕ is a stochastically stabilizing feedback control law. Assume there exist functions R 2 (x) = diag[r 1 (x), . . ., r m (x)], where r i : R n → R, r i (x) > 0, i = 1, . . ., m, and V ∈ C 1 p (R n ) is a two-times continuously differentiable, radially unbounded function satisfying (132)-(135). Then the nonlinear stochastic dynamical system G has guaranteed gain and sector margins.

Corollary 4. Consider the linear stochastic dynamical system given by (139) and (140), and assume that R 2 is diagonal. Then, with K = −R 2 −1 (B T P + R 12 ), where P > 0 satisfies (142), the system (139) and (140) has a structured disk (and hence, gain and sector) margin (1/(1 + θ), 1/(1 − θ)), with θ given by (144).

Proof. The result is a direct consequence of Theorem 11. Specifically, note that (142) is equivalent to (109). Now, with θ given by (144), it follows that (118) is satisfied, so that all the conditions of Theorem 11 are satisfied.
The gain margins specified in Corollary 4 precisely match those presented in [44] for deterministic linear-quadratic optimal regulators incorporating cross-weighting terms in the performance criterion. Additionally, since Corollary 4 ensures structured disk margins of (1/(1 + θ), 1/(1 − θ)), the system possesses a guaranteed phase margin ϕ in each input-output channel. In the scenario where R 12 = 0, it follows from (144) that θ = 1. Consequently, Corollary 4 ensures a phase margin of ±60 • in each input-output channel. Additionally, stipulating R 1 ≥ 0 leads to the conclusion, based on Corollary 4, that the system described by (139) and (140) possesses a gain and sector margin of (1/2, ∞).
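The explicit phase margin formulas were lost in extraction. A standard relation for a disk margin (1/(1 + θ), 1/(1 − θ)) is ϕ = ±2 sin⁻¹(θ/2); this is stated here as an assumption, but it is consistent with the ±60° claim for θ = 1 and with the (1/2, ∞) gain/sector margin. A quick numerical check:

```python
import math

def disk_margin(theta):
    # Disk margin (1/(1+theta), 1/(1-theta)) as in Corollary 4; theta in (0, 1]
    lo = 1.0 / (1.0 + theta)
    hi = math.inf if theta == 1.0 else 1.0 / (1.0 - theta)
    return lo, hi

def phase_margin_deg(theta):
    # Assumed standard relation phi = 2*asin(theta/2) for the disk margin above;
    # it reproduces +/-60 degrees in each channel when theta = 1 (R_12 = 0)
    return math.degrees(2.0 * math.asin(theta / 2.0))

print(disk_margin(1.0))        # (0.5, inf): the (1/2, infinity) gain/sector margin
print(phase_margin_deg(1.0))   # 60 degrees, as asserted for R_12 = 0
```

Retaining cross-weighting (θ < 1) shrinks both the disk interval and the guaranteed phase margin, which is the margin degradation discussed above.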

Stability Margins and Meaningful Inverse Optimality
In this section, we establish explicit links between stochastic stability margins, stochastic meaningful inverse optimality, and stochastic dissipativity, focusing on a specific quadratic supply rate. More precisely, we derive a stochastic counterpart to the classical return difference inequality for continuous-time systems with continuously differentiable flows [21,46] in the context of stochastic dynamical systems. Furthermore, we establish connections between stochastic dissipativity and optimality for stochastic nonlinear controllers. Notably, we demonstrate the equivalence between stochastic dissipativity and optimality in the realm of stochastic dynamical systems. Specifically, we show that an optimal nonlinear feedback controller ϕ satisfying a return difference condition based on the infinitesimal generator of a controlled Markov diffusion process is equivalent to the stochastic dynamical system, with input u and output y = −ϕ(x), being stochastically dissipative with respect to the supply rate r(u, y) = y T y + 2u T y.

Here, we assume that L(x, u) is nonnegative for all (x, u) ∈ R n × R m , which, in the terminology of [25,47], corresponds to a meaningful cost functional. Furthermore, we assume that L 2 (x) ≡ 0 and that L 1 (x) ≥ 0, x ∈ R n , is radially unbounded. In this case, we establish connections between stochastic dissipativity and optimality for nonlinear stochastic controllers. The first result specializes Theorem 10 to the case in which L 2 (x) ≡ 0.

Theorem 16. Consider the nonlinear stochastic dynamical system (114) with performance functional (83), with L 2 (x) ≡ 0 and L 1 (x) ≥ 0, x ∈ R n . Assume that there exists a two-times continuously differentiable, radially unbounded function V ∈ C 1 p (R n ) such that

V(x) > 0, x ∈ R n , x ̸ = 0, (148)

0 = L 1 (x) + V ′ (x) f (x) + (1/2)tr D T (x)V ′′ (x)D(x) − (1/4)V ′ (x)G(x)R 2 −1 (x)G T (x)V ′T (x), x ∈ R n . (149)

Then the zero solution x(•) ≡ 0 of the closed-loop system

dx(t) = [ f (x(t)) + G(x(t))ϕ(x(t))]dt + D(x(t))dw(t), x(0) = x 0 , t ≥ 0, (150)

is globally asymptotically stable in probability with the feedback control law ϕ(x) = −(1/2)R 2 −1 (x)G T (x)V ′T (x), and the performance functional (83) is minimized in the sense that

J(x 0 , ϕ(x(•))) = min u(•)∈S (x 0 ) J(x 0 , u(•)), x 0 ∈ R n . (152)

Finally, J(x 0 , ϕ(x(•))) = V(x 0 ), x 0 ∈ R n .

Proof. The proof is similar to the proof of Theorem 9 and, hence, is omitted.
Next, we show that for a given nonlinear stochastic dynamical system G given by (114) and (115), there exists an equivalence between optimality and stochastic dissipativity. For the following result, we assume that for a given nonlinear stochastic system (114), if there exists a feedback control law ϕ that minimizes the performance functional (83) with R 2 (x) ≡ I, L 2 (x) ≡ 0, and L 1 (x) ≥ 0, x ∈ R n , then there exists a two-times continuously differentiable, radially unbounded function V ∈ C 1 p (R n ) such that (149) is satisfied.
Theorem 17. Consider the nonlinear stochastic dynamical system G given by (114) and (115). The feedback control law u = ϕ(x) is optimal with respect to a performance functional (83) with R 2 (x) ≡ I, L 2 (x) ≡ 0, and L 1 (x) ≥ 0, x ∈ R n , if and only if the nonlinear stochastic system G is stochastically dissipative with respect to the supply rate r(u, y) = y T y + 2u T y and has a two-times continuously differentiable, positive-definite, radially unbounded storage function V ∈ C 1 p (R n ).
Proof. If the control law ϕ is optimal with respect to a performance functional (83) with R 2 (x) ≡ I, L 2 (x) ≡ 0, and L 1 (x) ≥ 0, x ∈ R n , then, by assumption, there exists a two-times continuously differentiable, radially unbounded function V ∈ C 1 p (R n ) such that (149) is satisfied. Hence, it follows from Proposition 1 that G is stochastically dissipative with respect to the supply rate r(u, y) = y T y + 2u T y.
Conversely, if G is stochastically dissipative with respect to the supply rate r(u, y) = y T y + 2u T y and has a two-times continuously differentiable, positive-definite storage function V ∈ C 1 p (R n ), then, with h(x) = −ϕ(x), J(x) ≡ 0, Q = I, R = 0, and S = 2I, it follows from the stochastic version of Theorem 5.6 of [16] that there exists a function ℓ : R n → R p such that ϕ(x) = −(1/2)G T (x)V ′T (x) and, for all x ∈ R n , 0 = V ′ (x) f (x) + (1/2)tr D T (x)V ′′ (x)D(x) − (1/4)V ′ (x)G(x)G T (x)V ′T (x) + ℓ T (x)ℓ(x). Hence, with L 1 (x) = ℓ T (x)ℓ(x), the conditions of Theorem 16 are satisfied, and optimality follows.
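To make the dissipativity-optimality equivalence concrete, consider a hypothetical scalar system dx = (ax + bu)dt + sx dw with V(x) = px² (all values below are illustrative, not from the paper). The HJB condition (149) then fixes L 1 (x) = (p²b² − 2pa − s²p)x², and the dissipation inequality 𝓛V ≤ r(u, y) for r(u, y) = y T y + 2u T y reduces to exactly L 1 (x) ≥ 0:

```python
import itertools

# Illustrative scalar example (assumed, not from the paper):
# dx = (a x + b u) dt + s x dw, V(x) = p x^2, phi(x) = -(1/2) G V'^T = -p b x,
# output y = -phi(x) = p b x.
a, b, s, p = 1.0, 1.0, 0.5, 3.0
L1_coeff = p * p * b * b - 2.0 * p * a - s * s * p   # >= 0 gives a meaningful cost
assert L1_coeff >= 0.0

def generator_V(x, u):
    # Infinitesimal generator of V along the controlled diffusion:
    # LV = V'(x)(a x + b u) + (1/2) s^2 x^2 V''(x)
    return 2.0 * p * x * (a * x + b * u) + s * s * p * x * x

def supply_rate(u, y):
    # Supply rate r(u, y) = y^T y + 2 u^T y from Theorem 17
    return y * y + 2.0 * u * y

for x, u in itertools.product([-2.0, -0.5, 0.0, 1.0, 3.0], repeat=2):
    y = p * b * x                                    # y = -phi(x)
    surplus = supply_rate(u, y) - generator_V(x, u)
    # The dissipation surplus equals L1(x) = L1_coeff * x^2, independent of u
    assert abs(surplus - L1_coeff * x * x) < 1e-9
    assert surplus >= -1e-9
print("dissipation surplus equals L1(x); L1 coefficient =", L1_coeff)
```

The u-dependent terms cancel identically, so stochastic dissipativity with this supply rate holds precisely when the inverse optimal cost integrand L 1 is nonnegative, mirroring Theorem 17.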
The next result gives disk and structured disk margins for the nonlinear stochastic dynamical system G given by (114) and (115).

Conclusions
In this paper, we merged stochastic Lyapunov theory with stochastic Hamilton-Jacobi-Bellman theory to provide explicit connections between stability and optimality of nonlinear stochastic regulators. The proposed approach uses a steady-state stochastic Hamilton-Jacobi-Bellman framework to characterize optimal nonlinear feedback controllers, wherein the notion of optimality is directly linked to a specified Lyapunov function guaranteeing stability in probability for the closed-loop system. The derived results are then employed to establish inverse optimal feedback controllers for both affine nonlinear stochastic systems and linear stochastic systems.
Moreover, leveraging the concepts of stochastic stability and stochastic dissipativity theory, we developed sufficient conditions for gain, sector, and disk margin guarantees. These conditions apply to nonlinear stochastic dynamical systems controlled by both nonlinear optimal and inverse optimal regulators minimizing a nonlinear-nonquadratic performance criterion. Furthermore, we established connections between stochastic dissipativity and optimality for nonlinear stochastic systems. The proposed framework provides the foundation for extending linear-quadratic control for stochastic dynamical systems to nonlinear-nonquadratic problems.

Figure 1. Controlled system states versus time. The bold lines show the average states over 1000 sample paths, whereas the shaded area shows one standard deviation about the average.