Article

Nonlinear Optimal Control for Stochastic Dynamical Systems

School of Aerospace Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0150, USA
*
Author to whom correspondence should be addressed.
Mathematics 2024, 12(5), 647; https://doi.org/10.3390/math12050647
Submission received: 22 January 2024 / Revised: 15 February 2024 / Accepted: 20 February 2024 / Published: 22 February 2024
(This article belongs to the Special Issue Dynamics and Control Theory with Applications)

Abstract
This paper presents a comprehensive framework addressing optimal nonlinear analysis and feedback control synthesis for nonlinear stochastic dynamical systems. The focus lies on establishing connections between stochastic Lyapunov theory and stochastic Hamilton–Jacobi–Bellman theory within a unified perspective. We demonstrate that the closed-loop nonlinear system’s asymptotic stability in probability is ensured through a Lyapunov function, identified as the solution to the steady-state form of the stochastic Hamilton–Jacobi–Bellman equation. This dual assurance guarantees both stochastic stability and optimality. Additionally, optimal feedback controllers for affine nonlinear systems are developed using an inverse optimality framework tailored to the stochastic stabilization problem. Furthermore, the paper derives stability margins for optimal and inverse optimal stochastic feedback regulators. Gain, sector, and disk margin guarantees are established for nonlinear stochastic dynamical systems controlled by nonlinear optimal and inverse optimal Hamilton–Jacobi–Bellman controllers.

1. Introduction

Under specific circumstances, nonlinear controllers offer notable advantages compared to linear controllers. This is particularly evident when dealing with nonlinear plant dynamics and/or system measurements [1,2,3,4], nonadditive or non-Gaussian plant/measurement disturbances, nonquadratic performance measures [5,6,7,8,9], uncertain plant models [10,11,12], or constrained control signals/state amplitudes [13,14]. The work in [15] presents the current status of deterministic continuous-time nonlinear-nonquadratic optimal control problems, emphasizing the use of Lyapunov functions for stability and optimality (see [15,16,17] and the references therein).
Expanding on the findings of [15,16,18,19], this paper introduces a framework for analyzing and designing feedback controllers for nonlinear stochastic dynamical systems. Specifically, it addresses a feedback stochastic optimal control problem with a nonlinear-nonquadratic performance measure over an infinite horizon. The key is the connection between the performance measure and a Lyapunov function, ensuring asymptotic stability in probability for the nonlinear closed-loop system. The framework establishes the groundwork for extending linear-quadratic control to nonlinear-nonquadratic problems in stochastic dynamical systems.
The focus lies on the role of the Lyapunov function in ensuring stochastic stability and its seamless connection to the steady-state solution of the stochastic Hamilton–Jacobi–Bellman equation, characterizing the optimal nonlinear feedback controller. To simplify the solution of the stochastic steady-state Hamilton–Jacobi–Bellman equation, the paper adopts an approach of parameterizing a family of stochastically stabilizing controllers. This corresponds to addressing an inverse optimal stochastic control problem [20,21,22,23,24,25,26].
The inverse optimal control design approach constructs the Lyapunov function for the closed-loop system, serving as an optimal value function. It achieves desired stability margins, particularly for nonlinear inverse optimal controllers minimizing a meaningful nonlinear-nonquadratic performance criterion. The paper derives stability margins for optimal and inverse optimal nonlinear stochastic feedback regulators, considering gain, sector, and disk margin guarantees. These guarantees are obtained for nonlinear stochastic dynamical systems controlled by nonlinear optimal and inverse optimal Hamilton–Jacobi–Bellman controllers.
Furthermore, the paper establishes connections between stochastic stability margins, stochastic meaningful inverse optimality, and stochastic dissipativity [27,28], showcasing the equivalence between stochastic dissipativity and optimality for stochastic dynamical systems. Specifically, utilizing extended Kalman–Yakubovich–Popov conditions characterizing stochastic dissipativity, our optimal feedback control law satisfies a return difference inequality predicated on the infinitesimal generator of a controlled Markov diffusion process, connecting optimality to stochastic dissipativity with a specific quadratic supply rate. This integrated framework provides a comprehensive understanding of optimal nonlinear control strategies for stochastic dynamical systems, encompassing stability, optimality, and dissipativity considerations.

2. Mathematical Preliminaries

We start by reviewing some basic results on nonlinear stochastic dynamical systems [29,30,31,32]. First, however, we require some definitions. A sample space $\Omega$ is the set of possible outcomes of an experiment. Given a sample space $\Omega$, a $\sigma$-algebra $\mathcal{F}$ on $\Omega$ is a collection of subsets of $\Omega$ such that $\emptyset \in \mathcal{F}$; if $F \in \mathcal{F}$, then $\Omega \setminus F \in \mathcal{F}$; and if $F_i \in \mathcal{F}$, $i \in \mathbb{N}$, then $\bigcup_{i=1}^{\infty} F_i \in \mathcal{F}$. The pair $(\Omega, \mathcal{F})$ is called a measurable space, and a probability measure $\mathbb{P}$ defined on $(\Omega, \mathcal{F})$ is a function $\mathbb{P} : \mathcal{F} \to [0,1]$ such that $\mathbb{P}(\Omega) = 1$ and, if $F_1, F_2, \ldots \in \mathcal{F}$ and $F_i \cap F_j = \emptyset$, $i \neq j$, then $\mathbb{P}\big(\bigcup_{i=1}^{\infty} F_i\big) = \sum_{i=1}^{\infty} \mathbb{P}(F_i)$. The triple $(\Omega, \mathcal{F}, \mathbb{P})$ is called a probability space. The subsets of $\Omega$ belonging to $\mathcal{F}$ are called $\mathcal{F}$-measurable sets. A probability space is complete if every subset of every null set is measurable.
The $\sigma$-algebra generated by the open sets in $\mathbb{R}^n$, denoted by $\mathcal{B}^n$, is called the Borel $\sigma$-algebra, and the elements $B$ of $\mathcal{B}^n$ are called Borel sets. Given the probability space $(\Omega, \mathcal{F}, \mathbb{P})$, a random variable is a mapping $x : \Omega \to \mathbb{R}^n$ such that $\{\omega \in \Omega : x(\omega) \in B\} \in \mathcal{F}$ for all Borel sets $B \subseteq \mathbb{R}^n$; that is, $x$ is $\mathcal{F}$-measurable. A stochastic process $\{x(t) : t \in \overline{\mathbb{R}}_+\}$ is a collection of random variables defined on the complete probability space $(\Omega, \mathcal{F}, \mathbb{P})$, indexed by the set $\overline{\mathbb{R}}_+$, that take values in a common measurable space $(S, \Sigma)$. Since $t \in \overline{\mathbb{R}}_+$, we say that $\{x(t) : t \in \overline{\mathbb{R}}_+\}$ is a continuous-time stochastic process.
Occasionally, we write $x(t, \omega)$ for $x(t)$ to denote the explicit dependence of the random variable $x(t)$ on the outcome $\omega \in \Omega$. For every fixed time $t \in \overline{\mathbb{R}}_+$, the random variable $\omega \mapsto x(t, \omega)$ assigns a vector to every outcome $\omega \in \Omega$, and for every fixed $\omega \in \Omega$, the mapping $t \mapsto x(t, \omega)$ generates a sample path of the stochastic process $x(\cdot)$, where for convenience we write $x(\cdot)$ to denote the stochastic process $\{x(t) : t \in \overline{\mathbb{R}}_+\}$. In this paper, $S = \mathbb{R}^n$ and $\Sigma = \mathcal{B}^n$.
A filtration $\{\mathcal{F}_t : t \geq 0\}$ on $(\Omega, \mathcal{F}, \mathbb{P})$ is a collection of sub-$\sigma$-fields of $\mathcal{F}$, indexed by $\overline{\mathbb{R}}_+$, such that $\mathcal{F}_s \subseteq \mathcal{F}_t$, $0 \leq s \leq t$. A filtration is complete if $\mathcal{F}_0$ contains the $(\mathcal{F}, \mathbb{P})$-negligible sets. The stochastic process $x(\cdot)$ is progressively measurable with respect to $\{\mathcal{F}_t : t \geq 0\}$ if, for every $t \geq 0$, the map $(s, \omega) \mapsto x(s, \omega)$ defined on $[0, t] \times \Omega$ is $\mathcal{B}([0, t]) \times \mathcal{F}_t$-measurable, where $\mathcal{B}(A)$ denotes the Borel $\sigma$-algebra on $A$. The stochastic process $x(\cdot)$ is said to be adapted with respect to $\{\mathcal{F}_t : t \geq 0\}$, or simply $\mathcal{F}_t$-adapted, if $x(t)$ is $\mathcal{F}_t$-measurable for every $t \geq 0$. An adapted stochastic process with right-continuous (or left-continuous) sample paths is progressively measurable [33]. We say that a stochastic process satisfies the Markov property if the conditional probability distribution of the future states of the process depends only on the present state.
The stochastic process $x(\cdot)$ is a martingale with respect to the filtration $\{\mathcal{F}_t : t \geq 0\}$ if it is $\mathcal{F}_t$-adapted, $\mathbb{E}[|x(t)|] < \infty$, $t \geq 0$, and $\mathbb{E}[x(t) \mid \mathcal{F}_s] = x(s)$, $0 \leq s < t$, where $\mathbb{E}[\,\cdot\,]$ denotes expectation and $\mathbb{E}[\,\cdot \mid \cdot\,]$ denotes conditional expectation. If we replace the equality in $\mathbb{E}[x(t) \mid \mathcal{F}_s] = x(s)$ with "$\leq$" (respectively, "$\geq$"), then $x(\cdot)$ is a supermartingale (respectively, submartingale). For an additional discussion on stochastic processes, filtrations, and martingales, see [33].
In this paper, we consider controlled stochastic dynamical systems G of the form
$$ dx(t) = F(x(t), u(t))\,dt + D(x(t), u(t))\,dw(t), \qquad x(0) = x_0, \quad u(\cdot) \in \mathcal{U}, \quad t \geq 0, \tag{1} $$
$$ y(t) = H(x(t), u(t)), \tag{2} $$
where (1) is a stochastic differential equation and (2) is an output equation. The stochastic processes $x(\cdot)$, $u(\cdot)$, and $y(\cdot)$ represent the system state, input, and output, respectively. Here, $\mathcal{U}$ is a set of admissible inputs that contains the input processes $u(\cdot)$ that can be applied to the system, $x_0$ is a random system initial condition vector, and $w(\cdot)$ is a $d$-dimensional Brownian motion process. For every $t \geq 0$, the random variables $x(t)$, $u(t)$, and $y(t)$ take values in the state space $\mathbb{R}^n$, the control space $\mathbb{R}^m$, and the output space $\mathbb{R}^l$, respectively. The measurable mappings $F : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^n$, $D : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^{n \times d}$, and $H : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^l$ are known as the system drift, diffusion, and output functions.
The stochastic differential Equation (1) is interpreted as a way of expressing the integral equation
$$ x(t) = x(0) + \int_0^t F(x(s), u(s))\,ds + \int_0^t D(x(s), u(s))\,dw(s), \qquad x(0) = x_0, \quad t \geq 0, \tag{3} $$
where the first integral in (3) is a Lebesgue integral and the second integral is an Itô integral [34]. When considering processes whose initial condition is a fixed deterministic point rather than a distribution, we will find it convenient to introduce the notation $x_{s, x_0}(t)$ to denote the solution process at time $t$ when the initial condition at time $s$ is the fixed point $x_0 \in \mathbb{R}^n$ almost surely. Similarly, $\mathbb{P}^{x_0}[\,\cdot\,]$ and $\mathbb{E}^{x_0}[\,\cdot\,]$ denote probability and expected value, respectively, given that the initial condition $x(0)$ is the fixed point $x_0 \in \mathbb{R}^n$ almost surely.
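Although the development in this paper is analytical, the integral interpretation (3) immediately suggests the standard Euler–Maruyama approximation for generating sample paths numerically. The following Python sketch is purely illustrative and not part of the paper's framework; the drift, diffusion, and parameter choices are hypothetical assumptions made for the example.

```python
import numpy as np

def euler_maruyama(F, D, x0, T=10.0, dt=1e-3, rng=None):
    """Simulate one sample path of dx = F(x) dt + D(x) dw (Euler-Maruyama).

    F : R^n -> R^n is the (closed-loop) drift and D : R^n -> R^{n x d} the
    diffusion; the Brownian increments are drawn as dw ~ N(0, dt I_d).
    """
    rng = rng or np.random.default_rng()
    x = np.asarray(x0, dtype=float)
    d = D(x).shape[1]
    n_steps = int(T / dt)
    path = np.empty((n_steps + 1, x.size))
    path[0] = x
    for k in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt), size=d)  # Brownian increment
        x = x + F(x) * dt + D(x) @ dw              # Ito update over [t, t+dt]
        path[k + 1] = x
    return path

# Hypothetical linear system with multiplicative noise:
A = np.array([[-1.0, 1.0], [0.0, -2.0]])
sigma = 0.3
path = euler_maruyama(lambda x: A @ x,
                      lambda x: (sigma * x).reshape(-1, 1),
                      x0=[1.0, -1.0])
print(path[-1])  # near the origin for this stable closed-loop system
```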
Let $(\Omega, \mathcal{F}, \{\mathcal{F}_t : t \geq 0\}, \mathbb{P})$ be a fixed complete filtered probability space, let $w(\cdot)$ be an $\mathcal{F}_t$-adapted Brownian motion, let $u(\cdot)$ be an $\mathbb{R}^m$-valued, $\mathcal{F}_t$-progressively measurable input process, and let $x_0$ be an $\mathcal{F}_0$-measurable initial condition. A solution to (1) with input $u(\cdot)$ is an $\mathbb{R}^n$-valued, $\mathcal{F}_t$-adapted process $x(\cdot)$ with continuous sample paths such that the integrals in (3) exist and (3) holds almost surely (a.s.) for all $t \geq 0$. For a Brownian motion disturbance, input process, and initial condition given in a prescribed probability space, the solution to (3) is known as a strong solution [35]. In this paper, we focus on strong solutions, and we will simply use the term "solution" to refer to a strong solution. A solution to (1) is unique if, for any two solutions $x_1(\cdot)$ and $x_2(\cdot)$ that satisfy (1), $x_1(t) = x_2(t)$ for all $t \geq 0$ almost surely.
We assume that every $u(\cdot) \in \mathcal{U}$ is an $\mathbb{R}^m$-valued Markov control process. An input process $u(\cdot)$ is a Markov control process if there exists a function $\phi : \overline{\mathbb{R}}_+ \times \mathbb{R}^n \to \mathbb{R}^m$ such that $u(t) = \phi(t, x(t))$, $t \geq 0$. Note that the class of Markov controls encompasses both time-varying inputs (i.e., possibly open-loop control input processes) as well as state-dependent inputs (i.e., possibly a state feedback control input $u(t) = \phi(x(t))$, where $\phi : \mathbb{R}^n \to \mathbb{R}^m$ is a feedback control law). If $u(\cdot)$ is a Markov control process, then the stochastic differential Equation (1) is an Itô diffusion, and if its solution is unique, then the solution is a Markov process.
For an Itô diffusion system with solution $x(\cdot)$, the (infinitesimal) generator $\mathcal{A}$ of $x(\cdot)$ is an operator acting on continuous functions $V : \mathbb{R}^n \to \mathbb{R}$ and is defined as ([31])
$$ \mathcal{A}V(x, u) \triangleq \lim_{t \to 0^+} \frac{\mathbb{E}[V(x(t)) \mid x(0) = x,\ u(t) = u] - V(x)}{t}, \qquad x \in \mathbb{R}^n, \quad u \in \mathbb{R}^m. \tag{4} $$
The set of functions $V : \mathbb{R}^n \to \mathbb{R}$ for which the limit in (4) exists is denoted by $\mathcal{D}_{\mathcal{A}}$. If $V \in C^2(\mathbb{R}^n)$ has compact support, where $C^2(\mathbb{R}^n)$ denotes the space of two-times continuously differentiable functions on $\mathbb{R}^n$, then $V \in \mathcal{D}_{\mathcal{A}}$ and $\mathcal{A}V(x, u) = \mathcal{L}V(x, u)$, where
$$ \mathcal{L}V(x, u) \triangleq V'(x) F(x, u) + \tfrac{1}{2}\,\mathrm{tr}\,D^{\mathrm{T}}(x, u) V''(x) D(x, u), \qquad x \in \mathbb{R}^n, \quad u \in \mathbb{R}^m, \tag{5} $$
and where we write $V'(x)$ for the gradient of $V$ at $x$ and $V''(x)$ for the Hessian of $V$ at $x$.
Note that the differential operator $\mathcal{L}$ introduced in (5) is defined for every $V \in C^2(\mathbb{R}^n)$ and is characterized by the system drift and diffusion functions. We will refer to the differential operator $\mathcal{L}$ as the (infinitesimal) generator of the system G. However, if control inputs that are discontinuous in the state variables are considered, then the concept of the extended generator [36] should be used.
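Since the generator (5) is determined entirely by the drift and diffusion functions, it can be formed mechanically for a given Lyapunov function candidate. The following symbolic sketch is illustrative only; the drift $f$, diffusion $D$, and candidate $V$ below are hypothetical choices, not taken from the paper.

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2', real=True)
x = sp.Matrix([x1, x2])

# Hypothetical closed-loop drift, diffusion (scalar noise), and candidate V.
f = sp.Matrix([-x1 + x2, -x1 - x2])
D = sp.Matrix([sp.Rational(1, 5) * x1, sp.Rational(1, 5) * x2])
V = x1**2 + x2**2

grad_V = sp.Matrix([[sp.diff(V, v) for v in x]])   # row vector V'(x)
hess_V = sp.hessian(V, list(x))                    # V''(x)

# LV(x) = V'(x) f(x) + (1/2) tr[D^T(x) V''(x) D(x)], as in (5).
LV = (grad_V * f)[0, 0] + sp.Rational(1, 2) * (D.T * hess_V * D).trace()
print(sp.simplify(LV))   # -> -49*x1**2/25 - 49*x2**2/25, negative definite
```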
If $V \in C^2(\mathbb{R}^n)$, then it follows from Itô's formula [35] that the stochastic process $\{V(x(t)) : t \geq 0\}$ satisfies
$$ V(x(t)) = V(x(0)) + \int_0^t \mathcal{L}V(x(s), u(s))\,ds + \int_0^t V'(x(s)) D(x(s), u(s))\,dw(s), \qquad t \geq 0. \tag{6} $$
If the terms appearing in (6) are integrable and the Itô integral in (6) is a martingale, then it follows from (6) that
$$ \mathbb{E}[V(x(t))] = \mathbb{E}[V(x(0))] + \mathbb{E}\left[\int_0^t \mathcal{L}V(x(s), u(s))\,ds\right], \qquad t \geq 0. \tag{7} $$
The next result is standard and establishes existence and uniqueness of solutions for the controlled Itô diffusion system (1).
Theorem 1
([32]). Consider the stochastic dynamical system (1) with initial condition $x_0$ such that $\mathbb{E}[\|x_0\|^p] < \infty$, $p \in \mathbb{N}$. Let $u(\cdot) \in \mathcal{U}$ be a Markov control process given by $u(t) = \phi(t, x(t))$, $t \geq 0$, such that the following conditions hold:
(i)
Local Lipschitz continuity. For every $a \geq 0$, there exists a constant $K_a > 0$ such that
$$ \|F(x, \phi(t, x)) - F(y, \phi(t, y))\| + \|D(x, \phi(t, x)) - D(y, \phi(t, y))\| \leq K_a \|x - y\| \tag{8} $$
for every $x, y \in \mathbb{R}^n$ with $\|x\| + \|y\| \leq a$, and every $t \geq 0$.
(ii)
Linear growth. There exists a constant $K > 0$ such that, for all $x \in \mathbb{R}^n$ and $t \geq 0$,
$$ \|F(x, \phi(t, x))\| + \|D(x, \phi(t, x))\| \leq K(1 + \|x\|). \tag{9} $$
Then, there exists a unique solution to (1) with input $u(\cdot)$. Furthermore,
$$ \mathbb{E}\left[\sup_{0 \leq s \leq t} \|x(s)\|^p\right] < \infty, \qquad t \geq 0, \quad p \in \mathbb{N}. \tag{10} $$
Assumption 1.
For the remainder of the paper we assume that the conditions for existence and uniqueness given in Theorem 1 are satisfied for the system (1) and (2).

3. Stability Theory for Stochastic Dynamical Systems

Given a feedback control law $\phi$, the closed-loop system (1) takes the form
$$ dx(t) = F(x(t), \phi(x(t)))\,dt + D(x(t), \phi(x(t)))\,dw(t) \triangleq f(x(t))\,dt + D(x(t))\,dw(t), \qquad x(0) = x_0, \quad t \geq 0, \tag{11} $$
where, for convenience, we have defined the closed-loop drift function $f(x) \triangleq F(x, \phi(x))$ and we have omitted the dependence of $D$ on its second argument, so that $D(x) \triangleq D(x, \phi(x))$. In this case, the infinitesimal generator of the closed-loop system (11) is given by
$$ \mathcal{L}V(x) = V'(x) f(x) + \tfrac{1}{2}\,\mathrm{tr}\,D^{\mathrm{T}}(x) V''(x) D(x), \qquad x \in \mathbb{R}^n. \tag{12} $$
Next, we define the notion of stochastic stability for the closed-loop system (11). An equilibrium point of (11) is a point $x_e \in \mathbb{R}^n$ such that $f(x_e) = 0$ and $D(x_e) = 0$. If $x_e$ is an equilibrium point of (11), then the constant stochastic process $x(\cdot) \equiv x_e$ is a solution of (11) with initial condition $x(0) = x_e$. The following definition introduces several notions of stability in probability for the equilibrium solution $x(\cdot) \equiv x_e$ of the stochastic dynamical system (11). Here, the initial condition $x_0$ is assumed to be a constant, and hence, whenever we write $x_0 \in \mathbb{R}^n$ we mean that $x_0$ is a constant vector. It is important to note that if we assume that $x_0$ is an $\mathcal{F}_0$-measurable random vector, then we replace $x_0 \in \mathcal{B}_\delta(x_e)$ with $x_0 \in \mathcal{B}_\delta(x_e)$ almost surely in Definition 1. As shown in ([32], p. 111), this is without loss of generality in addressing stochastic stability of an equilibrium point.
Definition 1
([29,32]). (i) The equilibrium solution $x(\cdot) \equiv x_e$ to (11) is Lyapunov stable in probability if, for every $\varepsilon > 0$,
$$ \lim_{x_0 \to x_e} \mathbb{P}^{x_0}\left[\sup_{t \geq 0} \|x(t) - x_e\| > \varepsilon\right] = 0. \tag{13} $$
Equivalently, the equilibrium solution $x(\cdot) \equiv x_e$ to (11) is Lyapunov stable in probability if, for every $\varepsilon > 0$ and $\rho \in (0, 1)$, there exists $\delta = \delta(\rho, \varepsilon) > 0$ such that, for all $x_0 \in \mathcal{B}_\delta(x_e)$,
$$ \mathbb{P}^{x_0}\left[\sup_{t \geq 0} \|x(t) - x_e\| > \varepsilon\right] \leq \rho. \tag{14} $$
(ii) The equilibrium solution $x(\cdot) \equiv x_e$ to (11) is asymptotically stable in probability if it is Lyapunov stable in probability and
$$ \lim_{x_0 \to x_e} \mathbb{P}^{x_0}\left[\lim_{t \to \infty} \|x(t) - x_e\| = 0\right] = 1. \tag{15} $$
Equivalently, the equilibrium solution $x(\cdot) \equiv x_e$ to (11) is asymptotically stable in probability if it is Lyapunov stable in probability and, for every $\rho \in (0, 1)$, there exists $\delta = \delta(\rho) > 0$ such that if $x_0 \in \mathcal{B}_\delta(x_e)$, then
$$ \mathbb{P}^{x_0}\left[\lim_{t \to \infty} \|x(t) - x_e\| = 0\right] \geq 1 - \rho. \tag{16} $$
(iii) The equilibrium solution $x(\cdot) \equiv x_e$ to (11) is globally asymptotically stable in probability if it is Lyapunov stable in probability and, for all $x_0 \in \mathbb{R}^n$,
$$ \mathbb{P}^{x_0}\left[\lim_{t \to \infty} \|x(t) - x_e\| = 0\right] = 1. \tag{17} $$
(iv) The equilibrium solution $x(\cdot) \equiv x_e$ to (11) is exponentially p-stable in probability if there exist scalars $\alpha > 0$, $\beta > 0$, $\delta > 0$, and $p \geq 1$ such that if $x_0 \in \mathcal{B}_\delta(x_e)$, then
$$ \mathbb{E}^{x_0}\left[\|x(t) - x_e\|^p\right] \leq \alpha \|x_0 - x_e\|^p e^{-\beta t}, \qquad t \geq 0. \tag{18} $$
If, in addition, (18) holds for all $x_0 \in \mathbb{R}^n$, then the equilibrium solution $x(\cdot) \equiv x_e$ to (11) is globally exponentially p-stable in probability. Finally, if $p = 2$, we say that the equilibrium solution $x(\cdot) \equiv x_e$ to (11) is globally exponentially mean-square stable in probability.
We now provide sufficient conditions for local and global asymptotic stability in probability for the nonlinear stochastic dynamical system (11).
Theorem 2
([29]). Let $\mathcal{D}$ be an open subset of $\mathbb{R}^n$ containing the point $x_e$. Consider the nonlinear stochastic dynamical system (11) and assume that there exists a two-times continuously differentiable function $V : \mathcal{D} \to \mathbb{R}$ such that
$$ V(x_e) = 0, \tag{19} $$
$$ V(x) > 0, \qquad x \in \mathcal{D}, \quad x \neq x_e, \tag{20} $$
$$ \mathcal{L}V(x) \leq 0, \qquad x \in \mathcal{D}. \tag{21} $$
The equilibrium solution $x(\cdot) \equiv x_e$ to (11) is then Lyapunov stable in probability. If, in addition,
$$ \mathcal{L}V(x) < 0, \qquad x \in \mathcal{D}, \quad x \neq x_e, \tag{22} $$
then the equilibrium solution $x(\cdot) \equiv x_e$ to (11) is asymptotically stable in probability. Finally, if, in addition, $\mathcal{D} = \mathbb{R}^n$ and $V$ is radially unbounded, then the equilibrium solution $x(\cdot) \equiv x_e$ to (11) is globally asymptotically stable in probability.
Finally, the next result gives a Lyapunov theorem for global exponential stability in probability.
Theorem 3
([29]). Consider the nonlinear stochastic dynamical system (11) and assume that there exist a two-times continuously differentiable function $V : \mathbb{R}^n \to \mathbb{R}$ and scalars $\alpha$, $\beta$, $\gamma > 0$, and $p \geq 1$ such that
$$ \alpha \|x - x_e\|^p \leq V(x) \leq \beta \|x - x_e\|^p, \qquad x \in \mathbb{R}^n, \tag{23} $$
$$ \mathcal{L}V(x) \leq -\gamma V(x), \qquad x \in \mathbb{R}^n. \tag{24} $$
Then the equilibrium solution $x(\cdot) \equiv x_e$ to (11) is globally exponentially p-stable in probability.
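As a quick illustration of Theorem 3, consider the scalar system $dx(t) = -a x(t)\,dt + \sigma x(t)\,dw(t)$ with $V(x) = x^2$, for which (23) holds with $\alpha = \beta = 1$ and $p = 2$, and $\mathcal{L}V(x) = (-2a + \sigma^2)x^2$, so (24) holds with $\gamma = 2a - \sigma^2$ whenever $\sigma^2 < 2a$. The following Monte Carlo sketch, with hypothetical parameter values chosen for illustration, checks the implied mean-square decay numerically.

```python
import numpy as np

a, sigma, dt, T, n_paths = 1.0, 0.5, 1e-3, 5.0, 2000
gamma = 2 * a - sigma**2        # decay rate in (24); positive since sigma^2 < 2a
rng = np.random.default_rng(0)

x = np.ones(n_paths)            # x(0) = 1 for every sample path
for _ in range(int(T / dt)):
    dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)
    x = x + (-a * x) * dt + sigma * x * dw

# Theorem 3 gives E[x(T)^2] <= V(x(0)) exp(-gamma T); here the bound is tight.
print(np.mean(x**2), np.exp(-gamma * T))
```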

4. Dissipativity Theory for Stochastic Dynamical Systems

In this section, we recall several key results from [28] on stochastic dissipativity that are necessary for several results of this paper. For the dynamical system G given by (1) and (2), a function $r : \mathbb{R}^m \times \mathbb{R}^l \to \mathbb{R}$ is called a supply rate if, for all $t \geq 0$ and $u(\cdot) \in \mathcal{U}$,
$$ \mathbb{E}\left[\int_0^t |r(u(s), y(s))|\,ds\right] < \infty. \tag{25} $$
Definition 2
([28]). A nonlinear stochastic dynamical system G given by (1) and (2) is stochastically dissipative with respect to the supply rate $r$ if there exists a nonnegative-definite measurable function $V_{\mathrm{s}} : \mathbb{R}^n \to \mathbb{R}$, called a storage function, such that the stochastic process $\{V_{\mathrm{s}}(x(t)) - \int_0^t r(u(s), y(s))\,ds : t \geq 0\}$ is a supermartingale, where $x(\cdot)$ is the solution to (1) with $u(\cdot) \in \mathcal{U}$. In this case, for all $t_1 \leq t_2$,
$$ \mathbb{E}\left[V_{\mathrm{s}}(x(t_2)) - \int_0^{t_2} r(u(s), y(s))\,ds \,\Big|\, \mathcal{F}_{t_1}\right] \leq V_{\mathrm{s}}(x(t_1)) - \int_0^{t_1} r(u(s), y(s))\,ds, \tag{26} $$
or, equivalently, since (25) holds,
$$ \mathbb{E}\left[V_{\mathrm{s}}(x(t_2)) \mid \mathcal{F}_{t_1}\right] \leq V_{\mathrm{s}}(x(t_1)) + \mathbb{E}\left[\int_{t_1}^{t_2} r(u(s), y(s))\,ds \,\Big|\, \mathcal{F}_{t_1}\right]. \tag{27} $$
The next result shows that if the system storage function $V_{\mathrm{s}}$ is two-times continuously differentiable, then, under certain regularity conditions, stochastic dissipativity given by the energetic dissipation inequality in expectation (27) can be characterized by the infinitesimal generator $\mathcal{L}V_{\mathrm{s}}$.
Theorem 4
([28]). Consider the nonlinear stochastic dynamical system G given by (1) and (2). Let $V_{\mathrm{s}} \in C^2(\mathbb{R}^n)$ be nonnegative definite and let $r$ be a supply rate for G. Assume that, for all $u(\cdot) \in \mathcal{U}$, the stochastic process $\{\int_0^t V_{\mathrm{s}}'(x(s)) D(x(s), u(s))\,dw(s) : t \geq 0\}$ is a martingale and $\mathbb{E}\left[\int_0^t |\mathcal{L}V_{\mathrm{s}}(x(s), u(s))|\,ds\right] < \infty$, $t \geq 0$. Furthermore, assume that, for every $x \in \mathbb{R}^n$ and $u \in \mathbb{R}^m$, there exists an input $u_u(\cdot) \in \mathcal{U}$, with $u_u(0) = u$, such that, with input $u_u(\cdot)$ and deterministic initial condition $x_0 = x$, the mappings $t \mapsto \mathbb{E}[\mathcal{L}V_{\mathrm{s}}(x(t), u_u(t))]$ and $t \mapsto \mathbb{E}[r(u_u(t), H(x(t), u_u(t)))]$ are continuous at $t = 0$. Then, G is stochastically dissipative with respect to the supply rate $r$ and with the storage function $V_{\mathrm{s}}$ if and only if $\mathbb{E}[V_{\mathrm{s}}(x(t))] < \infty$ for every $t \geq 0$ and $u(\cdot) \in \mathcal{U}$, and
$$ \mathcal{L}V_{\mathrm{s}}(x, u) \leq r(u, H(x, u)), \qquad x \in \mathbb{R}^n, \quad u \in \mathbb{R}^m. \tag{28} $$
The next theorem shows that the regularity conditions needed in Theorem 4 to characterize dissipativity using the power balance inequality (28) are satisfied for a broad class of stochastic dynamical systems. For the statement of this result, we say that a function $g : \mathbb{R}^n \to \mathbb{R}$ is of polynomial growth if there exist positive constants $C$ and $m$ such that
$$ |g(x)| \leq C(1 + \|x\|^m), \qquad x \in \mathbb{R}^n. \tag{29} $$
For $V \in C^r(\mathbb{R}^n)$, $r \in \mathbb{N}$, we write $V \in C^r_{\mathrm{p}}(\mathbb{R}^n)$ if $V$ and all its partial derivatives up to order $r$ are of polynomial growth.
Theorem 5
([28]). Consider the nonlinear stochastic dynamical system G given by (1) and (2). Let $r : \mathbb{R}^m \times \mathbb{R}^l \to \mathbb{R}$, let $V_{\mathrm{s}} \in C^2_{\mathrm{p}}(\mathbb{R}^n)$ be a nonnegative-definite function, and let the set of admissible inputs $\mathcal{U}$ be a set of Markov control processes such that, for every $u(\cdot) \in \mathcal{U}$ with $\phi : \overline{\mathbb{R}}_+ \times \mathbb{R}^n \to \mathbb{R}^m$, there exist a positive constant $m_1$ and a continuous function $C_1 : \overline{\mathbb{R}}_+ \to \mathbb{R}_+$ such that
$$ |r(\phi(t, x), H(x, \phi(t, x)))| \leq C_1(t)(1 + \|x\|^{m_1}), \qquad x \in \mathbb{R}^n, \quad t \geq 0. \tag{30} $$
Assume that, for every $u \in \mathbb{R}^m$, the constant input $u(t) \equiv u$ belongs to $\mathcal{U}$ and the mapping $x \mapsto r(u, H(x, u))$ is continuous on $\mathbb{R}^n$. Then, $r$ is a supply rate of G, and the stochastic process $\{V_{\mathrm{s}}(x(t)) : t \geq 0\}$ is integrable for every $u(\cdot) \in \mathcal{U}$. Furthermore, G is stochastically dissipative with respect to the supply rate $r$ and with the storage function $V_{\mathrm{s}}$ if and only if (28) holds.
Theorem 4 gives an equivalent characterization of stochastic dissipativity, as defined by the energetic (i.e., supermartingale) Definition 2, in terms of the power balance inequality (28). The supermartingale definition of dissipativity requires the verification of (26), which is sample-path dependent and can be difficult to verify in practice, whereas (28) is an algebraic condition for dissipativity involving a local power balance inequality in terms of the system drift and diffusion functions of the stochastic dynamical system. This equivalence holds under the regularity conditions stated in Theorem 4.
Assumption 2.
For the rest of the paper we assume that the regularity conditions for the equivalence between the supermartingale definition of dissipativity (27) and the power balance inequality (28) are satisfied. That is, we assume that (1) and (2) is dissipative if and only if (28) holds. Note that Theorem 5 gives sufficient conditions for the regularity conditions to hold by imposing polynomial growth constraints on the storage and supply rate functions.

5. Connections between Stability Analysis and Nonlinear-Nonquadratic Performance Evaluation

In this section, we provide connections between stochastic Lyapunov functions and nonlinear-nonquadratic performance evaluation. Specifically, we present sufficient conditions for stability and performance for a given nonlinear stochastic dynamical system with a nonlinear-nonquadratic performance measure. As in the deterministic theory [15,16], the cost functional can be explicitly evaluated as long as it is related to an underlying Lyapunov function. For the following result, let $f : \mathbb{R}^n \to \mathbb{R}^n$ and $D : \mathbb{R}^n \to \mathbb{R}^{n \times d}$ be such that $f(0) = 0$ and $D(0) = 0$.
Theorem 6.
Consider the nonlinear stochastic dynamical system given by (11) with nonlinear-nonquadratic performance measure
$$ J(x_0) = \mathbb{E}^{x_0}\left[\int_0^\infty L(x(t))\,dt\right], \tag{31} $$
where $x(\cdot)$ is the solution to (11). Furthermore, assume that there exists a two-times continuously differentiable, radially unbounded function $V \in C^1_{\mathrm{p}}(\mathbb{R}^n)$ such that
$$ V(0) = 0, \tag{32} $$
$$ V(x) > 0, \qquad x \in \mathbb{R}^n, \quad x \neq 0, \tag{33} $$
$$ \mathcal{L}V(x) < 0, \qquad x \in \mathbb{R}^n, \quad x \neq 0, \tag{34} $$
$$ L(x) + \mathcal{L}V(x) = 0, \qquad x \in \mathbb{R}^n. \tag{35} $$
Then the zero solution $x(\cdot) \equiv 0$ to (11) is globally asymptotically stable in probability and
$$ J(x_0) = V(x_0). \tag{36} $$
Proof. 
Conditions (32)–(34) are a restatement of (19)–(21). This, along with $V$ being radially unbounded, implies that the zero solution $x(\cdot) \equiv 0$ of (11) is globally asymptotically stable in probability by Theorem 2.
Next, we show that the stochastic process $\{\int_0^t V'(x(s)) D(x(s))\,dw(s) : t \geq 0\}$ is a martingale. To see this, first note that the process $(t, \omega) \mapsto V'(x(t)) D(x(t))$ is $\mathcal{B}([0, \infty)) \times \mathcal{F}$-measurable and $\mathcal{F}_t$-adapted because of the measurability of the mappings involved and the properties of the process $x(\cdot)$. Now, using Tonelli's theorem [37], it follows that, for all $t \geq 0$,
$$ \mathbb{E}\left[\int_0^t \|V'(x(s)) D(x(s))\|^2\,ds\right] = \int_0^t \mathbb{E}\left[\|V'(x(s)) D(x(s))\|^2\right] ds \leq \int_0^t \mathbb{E}\left[\alpha\big(1 + \|x(s)\|^\beta\big)\right] ds \leq \int_0^t \alpha\left(1 + \mathbb{E}\left[\sup_{0 \leq s \leq t} \|x(s)\|^\beta\right]\right) ds < \infty \tag{37} $$
for some positive constants $\alpha$ and $\beta$, and hence, the Itô integral
$$ \int_0^t V'(x(s)) D(x(s))\,dw(s), \qquad t \geq 0, $$
is a martingale. To arrive at (37), we used the fact that $V \in C^1_{\mathrm{p}}(\mathbb{R}^n)$, the linear growth condition (9), and the finiteness of the expected value of the supremum of the moments of the system state (10). Note that the supremum in (37) exists because of the continuity of the sample paths of $x(\cdot)$.
It follows from (35) and Itô's lemma [31] that, for all $t \geq 0$,
$$ \int_0^t L(x(s))\,ds = -\int_0^t \mathcal{L}V(x(s))\,ds = V(x(0)) - V(x(t)) + \int_0^t V'(x(s)) D(x(s))\,dw(s). \tag{38} $$
Taking the expected value on both sides of (38) and using the martingale property of the stochastic integral in (38) yields
$$ \mathbb{E}^{x_0}\left[\int_0^t L(x(s))\,ds\right] = V(x_0) - \mathbb{E}^{x_0}[V(x(t))], \tag{39} $$
where $\mathbb{E}^{x_0}[V(x(t))]$ exists since $V \in C^1_{\mathrm{p}}(\mathbb{R}^n)$ and (10) holds. Now, taking the limit as $t \to \infty$ yields
$$ \lim_{t \to \infty} \mathbb{E}^{x_0}\left[\int_0^t L(x(s))\,ds\right] = V(x_0) - \lim_{t \to \infty} \mathbb{E}^{x_0}[V(x(t))] = V(x_0), $$
where we used the fact that global asymptotic stability in probability implies that $V(x(t))$ is a nonnegative supermartingale and $\lim_{t \to \infty} V(x(t)) = 0$ almost surely [29], and, by Theorem 5.1 of [29],
$$ \lim_{t \to \infty} \mathbb{E}^{x_0}[V(x(t))] = \mathbb{E}^{x_0}\left[\lim_{t \to \infty} V(x(t))\right] = 0. $$
Finally, note that
$$ J(x_0) = \mathbb{E}^{x_0}\left[\lim_{t \to \infty} \int_0^t L(x(s))\,ds\right] = \lim_{t \to \infty} \mathbb{E}^{x_0}\left[\int_0^t L(x(s))\,ds\right] = V(x_0), \tag{40} $$
where the interchange of the limit and the expectation operator in (40) follows from the Lebesgue monotone convergence theorem [38] by noting that $\int_0^t L(x(s))\,ds$, $t \geq 0$, is monotonically increasing in $t$, and hence, converges pointwise to $\lim_{t \to \infty} \int_0^t L(x(s))\,ds$, and noting that, by (34) and (35), $L(x) \geq 0$, $x \in \mathbb{R}^n$. □
Next, we specialize Theorem 6 to linear stochastic systems. For this result, let $A \in \mathbb{R}^{n \times n}$, let $\sigma \in \mathbb{R}^d$, and let $R \in \mathbb{R}^{n \times n}$ be a positive-definite matrix.
Corollary 1.
Consider the linear stochastic dynamical system with multiplicative noise given by
$$ dx(t) = A x(t)\,dt + x(t) \sigma^{\mathrm{T}}\,dw(t), \qquad x(0) = x_0 \ \mathrm{a.s.}, \quad t \geq 0, \tag{41} $$
and with quadratic performance measure
$$ J(x_0) = \mathbb{E}^{x_0}\left[\int_0^\infty x^{\mathrm{T}}(t) R x(t)\,dt\right]. \tag{42} $$
Furthermore, assume that there exists a positive-definite matrix $P \in \mathbb{R}^{n \times n}$ such that
$$ 0 = \left[A + \tfrac{1}{2}\|\sigma\|^2 I_n\right]^{\mathrm{T}} P + P\left[A + \tfrac{1}{2}\|\sigma\|^2 I_n\right] + R. \tag{43} $$
Then, the zero solution $x(\cdot) \equiv 0$ to (41) is globally asymptotically stable in probability and
$$ J(x_0) = x_0^{\mathrm{T}} P x_0, \qquad x_0 \in \mathbb{R}^n. \tag{44} $$
Proof. 
The result is a direct consequence of Theorem 6 with $f(x) = Ax$, $D(x) = x\sigma^{\mathrm{T}}$, $L(x) = x^{\mathrm{T}}Rx$, and $V(x) = x^{\mathrm{T}}Px$. Specifically, conditions (32) and (33), along with $V$ being two-times continuously differentiable, radially unbounded, and of class $C^1_{\mathrm{p}}(\mathbb{R}^n)$, are trivially satisfied. Now,
$$ \begin{aligned} \mathcal{L}V(x) &= V'(x) f(x) + \tfrac{1}{2}\,\mathrm{tr}\,D^{\mathrm{T}}(x) V''(x) D(x) \\ &= x^{\mathrm{T}}(A^{\mathrm{T}}P + PA)x + \tfrac{1}{2}\,\mathrm{tr}\left[(x\sigma^{\mathrm{T}})^{\mathrm{T}}\, 2P\, (x\sigma^{\mathrm{T}})\right] \\ &= x^{\mathrm{T}}(A^{\mathrm{T}}P + PA)x + \tfrac{1}{2}\, x^{\mathrm{T}}\, 2P\, x\, \sigma^{\mathrm{T}}\sigma \\ &= x^{\mathrm{T}}(A^{\mathrm{T}}P + PA)x + \|\sigma\|^2 x^{\mathrm{T}}Px \\ &= x^{\mathrm{T}}\left[\left(A + \tfrac{1}{2}\|\sigma\|^2 I_n\right)^{\mathrm{T}} P + P\left(A + \tfrac{1}{2}\|\sigma\|^2 I_n\right)\right]x, \end{aligned} $$
and hence, it follows from (43) that conditions (34) and (35) hold. Thus, all the conditions of Theorem 6 are satisfied. □
Note that (43) is a Lyapunov equation, and hence, for every positive-definite matrix $R$ there exists a positive-definite matrix $P$ satisfying (43) as long as $A + \tfrac{1}{2}\|\sigma\|^2 I_n$ is Hurwitz, that is, as long as the eigenvalues of $A$ have real part less than $-\tfrac{1}{2}\|\sigma\|^2$. Thus, a continuous-time linear stochastic system driven by a multiplicative Wiener process is globally asymptotically stable in probability if the spectral abscissa of $A$ is less than $-\tfrac{1}{2}\|\sigma\|^2$.
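For linear systems, the preceding remark reduces stability analysis to a spectral-abscissa test on the shifted matrix and the solution of the Lyapunov Equation (43), both of which are routine numerically. The following sketch uses scipy with hypothetical data ($A$, $\sigma$, and $R$ are illustrative choices, not from the paper).

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

A = np.array([[-2.0, 1.0], [0.0, -3.0]])
sigma = np.array([0.5, 0.5])          # noise vector, d = 2
R = np.eye(2)

# Stability test: A + (1/2)||sigma||^2 I_n must be Hurwitz.
A_eff = A + 0.5 * (sigma @ sigma) * np.eye(2)
assert np.linalg.eigvals(A_eff).real.max() < 0

# (43) reads A_eff^T P + P A_eff + R = 0; solve_continuous_lyapunov(a, q)
# solves a X + X a^H = q, so pass a = A_eff^T and q = -R.
P = solve_continuous_lyapunov(A_eff.T, -R)
assert np.all(np.linalg.eigvalsh(P) > 0)   # P is positive definite

x0 = np.array([1.0, 2.0])
print(x0 @ P @ x0)                         # cost J(x0) = x0^T P x0 from (44)
```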

6. Optimal Nonlinear Feedback Control for Stochastic Systems

In this section, we consider a control problem involving a notion of optimality with respect to a nonlinear-nonquadratic cost functional. We use the results developed in Theorem 6 to characterize optimal feedback controllers that guarantee closed-loop global stabilization in probability. Specifically, sufficient conditions for optimality are given in a form that corresponds to a steady-state version of the stochastic Hamilton–Jacobi–Bellman equation. For the following result, let $F : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^n$ and $D : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^{n \times d}$ be such that $F(0, 0) = 0$ and $D(0, 0) = 0$.
Theorem 7.
Consider the nonlinear stochastic dynamical system given by (1) with nonlinear-nonquadratic performance measure
$$ J(x_0, u(\cdot)) \triangleq \mathbb{E}^{x_0}\left[\int_0^\infty L(x(t), u(t))\,dt\right], \tag{45} $$
where $x(\cdot)$ is the solution to (1) with control input $u(\cdot)$. Furthermore, assume that there exist a two-times continuously differentiable, radially unbounded function $V \in C^1_{\mathrm{p}}(\mathbb{R}^n)$ and a feedback control law $\phi : \mathbb{R}^n \to \mathbb{R}^m$ such that
$$ V(0) = 0, \tag{46} $$
$$ V(x) > 0, \qquad x \in \mathbb{R}^n, \quad x \neq 0, \tag{47} $$
$$ \phi(0) = 0, \tag{48} $$
$$ \mathcal{L}V(x, \phi(x)) < 0, \qquad x \in \mathbb{R}^n, \quad x \neq 0, \tag{49} $$
$$ H(x, \phi(x)) = 0, \qquad x \in \mathbb{R}^n, \tag{50} $$
$$ H(x, u) \geq 0, \qquad x \in \mathbb{R}^n, \quad u \in \mathbb{R}^m, \tag{51} $$
where
$$ H(x, u) \triangleq L(x, u) + \mathcal{L}V(x, u). \tag{52} $$
Then, with the feedback control $u(\cdot) = \phi(x(\cdot))$, the zero solution $x(\cdot) \equiv 0$ of the closed-loop system (11) is globally asymptotically stable in probability, and
$$ J(x_0, \phi(x(\cdot))) = V(x_0), \qquad x_0 \in \mathbb{R}^n. \tag{53} $$
In addition, the feedback control $u(\cdot) = \phi(x(\cdot))$ minimizes (45) in the sense that
$$ J(x_0, \phi(x(\cdot))) = \min_{u(\cdot) \in \mathcal{S}(x_0)} J(x_0, u(\cdot)), \tag{54} $$
where $\mathcal{S}(x_0)$ denotes the set of controllers given by
$$ \mathcal{S}(x_0) \triangleq \Big\{ u(\cdot) : u(\cdot)\ \text{is admissible and}\ x(\cdot)\ \text{given by (1) is such that}\ \mathbb{E}^{x_0}\Big[\int_0^\infty |L(x(t), u(t))|\,dt\Big] < \infty\ \text{and}\ \lim_{t \to \infty} \mathbb{E}^{x_0}[V(x(t))] = 0 \Big\}, \tag{55} $$
where $x_0 \in \mathbb{R}^n$ and $u(\cdot) = \phi(x(\cdot)) \in \mathcal{S}(x_0)$.
Proof. 
Global asymptotic stability in probability is a direct consequence of (46)–(49) by applying Theorem 6 to the closed-loop system (11). Furthermore, using (50), (53) is a restatement of (36) as applied to the closed-loop system.
To show that $u(\cdot) = \phi(x(\cdot)) \in \mathcal{S}(x_0)$, note that (49) and (50) imply that $L(x, \phi(x)) \geq 0$, $x \in \mathbb{R}^n$. Thus,
$$ J(x_0, \phi(x(\cdot))) = \mathbb{E}^{x_0}\left[\int_0^\infty L(x(t), \phi(x(t)))\,dt\right] = \mathbb{E}^{x_0}\left[\int_0^\infty |L(x(t), \phi(x(t)))|\,dt\right] = V(x_0) < \infty. \tag{56} $$
Now, using an analogous argument as in the proof of Theorem 6, it follows that
$$ \lim_{t \to \infty} \mathbb{E}^{x_0}[V(x(t))] = 0 $$
for $u(\cdot) = \phi(x(\cdot))$, and hence, $u(\cdot) = \phi(x(\cdot)) \in \mathcal{S}(x_0)$.
Next, let $u(\cdot) \in \mathcal{S}(x_0)$, and note that, by Itô's lemma [31],
$$ V(x(t)) = V(x(0)) + \int_0^t \mathcal{L}V(x(s), u(s))\,ds + \int_0^t V'(x(s)) D(x(s), u(s))\,dw(s). \tag{57} $$
Now, it can be shown that the stochastic integral in (57) is a martingale using a similar argument as the one given in the proof of Theorem 6. Hence,
$$ \mathbb{E}^{x_0}[V(x(t))] = V(x_0) + \mathbb{E}^{x_0}\left[\int_0^t \mathcal{L}V(x(s), u(s))\,ds\right], \tag{58} $$
where $\mathbb{E}^{x_0}[V(x(t))]$ exists since $V \in C^1_{\mathrm{p}}(\mathbb{R}^n)$ and (10) holds. Next, taking the limit as $t \to \infty$ yields
$$ \lim_{t \to \infty} \mathbb{E}^{x_0}[V(x(t))] = V(x_0) + \lim_{t \to \infty} \mathbb{E}^{x_0}\left[\int_0^t \mathcal{L}V(x(s), u(s))\,ds\right]. \tag{59} $$
Since $u(\cdot) \in \mathcal{S}(x_0)$, the control law satisfies $\lim_{t \to \infty} \mathbb{E}^{x_0}[V(x(t))] = 0$, and hence, it follows from (59) that
$$ -V(x_0) = \lim_{t \to \infty} \mathbb{E}^{x_0}\left[\int_0^t \mathcal{L}V(x(s), u(s))\,ds\right]. \tag{60} $$
Now, combining (51) and (60) yields
$$ V(x_0) \leq \lim_{t \to \infty} \mathbb{E}^{x_0}\left[\int_0^t L(x(s), u(s))\,ds\right]. \tag{61} $$
Next, note that, for every $t \geq 0$,
$$ \left|\int_0^t L(x(s), u(s))\,ds\right| \leq \int_0^t |L(x(s), u(s))|\,ds \leq \int_0^\infty |L(x(s), u(s))|\,ds, \tag{62} $$
and, since $u(\cdot) \in \mathcal{S}(x_0)$, $\mathbb{E}^{x_0}\left[\int_0^\infty |L(x(s), u(s))|\,ds\right] < \infty$. Thus, it follows from the dominated convergence theorem [39] that
$$ \lim_{t \to \infty} \mathbb{E}^{x_0}\left[\int_0^t L(x(s), u(s))\,ds\right] = \mathbb{E}^{x_0}\left[\int_0^\infty L(x(s), u(s))\,ds\right]. \tag{63} $$
Finally, combining (53), (61), and (63) yields
$$ V(x_0) = J(x_0, \phi(x(\cdot))) \leq J(x_0, u(\cdot)), \tag{64} $$
which proves (54). □
Note that (50) is the steady-state version of the stochastic Hamilton–Jacobi–Bellman equation. To see this, recall that the stochastic Hamilton–Jacobi–Bellman equation is given by
$$ \frac{\partial V}{\partial t}(t, x) + \min_{u \in \mathbb{R}^m}\left[L(t, x, u) + \mathcal{L}V(t, x, u)\right] = 0, \qquad t \geq 0, \quad x \in \mathbb{R}^n, \tag{65} $$
which characterizes the optimal control for stochastic time-varying systems over a finite or infinite time interval [30]. For infinite-horizon, time-invariant systems, $V(t, x) = V(x)$, and hence, (65) collapses to (50) and (51), which guarantee optimality with respect to the set of admissible controllers $\mathcal{S}(x_0)$. Note that an explicit characterization of the set $\mathcal{S}(x_0)$ is not required and the optimal stabilizing feedback control law $u = \phi(x)$ is independent of the initial condition $x_0$.
In order to ensure global asymptotic stability in probability of the closed-loop system (11), Theorem 7 requires that $V$ satisfy (46), (47), and (49), which implies that $V$ is a Lyapunov function for the closed-loop system (11). However, for optimality, $V$ need not satisfy (46), (47), and (49). Specifically, if $V$ is a two-times continuously differentiable function such that $V \in C^1_{\mathrm{p}}(\mathbb{R}^n)$ and $u(\cdot) = \phi(x(\cdot)) \in \mathcal{S}(x_0)$, then (50) and (51) imply (53) and (54). It is important to note here that, unlike the deterministic theory ([16], p. 857), to ascertain that a control is optimal we require the additional transversality condition $\lim_{t \to \infty} \mathbb{E}^{x_0}[V(x(t))] = 0$ appearing in (55); see ([40], p. 337), ([41], p. 125), and ([42], p. 139) for further details.
Even though the transversality condition in (55) is satisfied for an optimal controller $u(\cdot) = \phi(x(\cdot))$, this condition is sample-path dependent and can be difficult to verify for an arbitrary control input $u(\cdot) \in \mathcal{S}(x_0)$. The next theorem circumvents this problem by placing additional restrictions on the cost integrand $L$ and the Lyapunov function $V$.
Theorem 8.
Consider the nonlinear stochastic dynamical system given by (1) with the nonlinear-nonquadratic performance measure (45), where $L : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$ satisfies
$$ L(x, u) \geq \gamma \|x\|^p, \qquad x \in \mathbb{R}^n, \quad u \in \mathbb{R}^m, \tag{66} $$
for some positive constant $\gamma$ and some $p \geq 1$. Assume that there exist a two-times continuously differentiable function $V \in C^1_{\mathrm{p}}(\mathbb{R}^n)$ and a control law $\phi : \mathbb{R}^n \to \mathbb{R}^m$ such that (48), (50), and (51) hold and, for positive constants $\alpha$ and $\beta$,
$$ \alpha \|x\|^p \leq V(x) \leq \beta \|x\|^p, \qquad x \in \mathbb{R}^n. \tag{67} $$
Then, with the feedback control $u(\cdot) = \phi(x(\cdot))$, the zero solution $x(\cdot) \equiv 0$ of the closed-loop system (11) is globally exponentially p-stable in probability and (53) holds. In addition, the feedback control $u(\cdot) = \phi(x(\cdot))$ minimizes (45) in the sense that
$$ J(x_0, \phi(x(\cdot))) = \min_{u(\cdot) \in \hat{\mathcal{S}}(x_0)} J(x_0, u(\cdot)), \tag{68} $$
where $\hat{\mathcal{S}}(x_0)$ denotes the set of controllers given by
$$ \hat{\mathcal{S}}(x_0) \triangleq \Big\{ u(\cdot) : u(\cdot)\ \text{is admissible and}\ x(\cdot)\ \text{given by (1) is such that}\ \mathbb{E}^{x_0}\Big[\int_0^\infty |L(x(t), u(t))|\,dt\Big] < \infty \Big\}, \qquad x_0 \in \mathbb{R}^n, \tag{69} $$
and $u(\cdot) = \phi(x(\cdot)) \in \hat{\mathcal{S}}(x_0)$.
Proof. 
Global exponential p-stability in probability is a direct consequence of (66), (67), and (50) by applying Theorem 3 to the closed-loop system (11). To show (53), (68), and $u(\cdot) = \phi(x(\cdot)) \in \hat{\mathcal{S}}(x_0)$, first note that Theorem 7 holds. Therefore, we need only show that, with (66) and (67), $\mathcal{S}(x_0) = \hat{\mathcal{S}}(x_0)$. That is, any input $u(\cdot)$ with finite cost (and hence belonging to $\hat{\mathcal{S}}(x_0)$) automatically satisfies the transversality condition (and hence belongs to $\mathcal{S}(x_0)$).
Assume that (66) and (67) hold. Note that $\mathcal{S}(x_0) \subseteq \hat{\mathcal{S}}(x_0)$ is immediate. To show $\hat{\mathcal{S}}(x_0) \subseteq \mathcal{S}(x_0)$, let $u(\cdot) \in \hat{\mathcal{S}}(x_0)$ and $x_0 \in \mathbb{R}^n$. Since $u(\cdot) \in \hat{\mathcal{S}}(x_0)$,
$$ \mathbb{E}^{x_0}\left[\int_0^\infty |L(x(t), u(t))|\,dt\right] < \infty, \tag{70} $$
and hence, it follows from the dominated convergence theorem [39] that
$$ \mathbb{E}^{x_0}\left[\int_0^\infty L(x(t), u(t))\,dt\right] = \lim_{t \to \infty} \mathbb{E}^{x_0}\left[\int_0^t L(x(s), u(s))\,ds\right]. \tag{71} $$
Furthermore, by Tonelli's theorem [37],
$$ \lim_{t \to \infty} \mathbb{E}^{x_0}\left[\int_0^t L(x(s), u(s))\,ds\right] = \int_0^\infty \mathbb{E}^{x_0}[L(x(t), u(t))]\,dt. \tag{72} $$
Combining (66) and (70)–(72) we obtain
$$ \int_0^\infty \mathbb{E}^{x_0}\left[\|x(t)\|^p\right] dt \leq \frac{1}{\gamma} \int_0^\infty \mathbb{E}^{x_0}[L(x(t), u(t))]\,dt < \infty. \tag{73} $$
Now, using Lemma 5.7 of [29], it follows that
$$ \lim_{t \to \infty} \mathbb{E}^{x_0}\left[\|x(t)\|^p\right] = 0, \tag{74} $$
which, combined with (67), yields
$$ \lim_{t \to \infty} \mathbb{E}^{x_0}[V(x(t))] = 0. \tag{75} $$
Hence, $u(\cdot) \in \mathcal{S}(x_0)$, which implies $\hat{\mathcal{S}}(x_0) \subseteq \mathcal{S}(x_0)$. Thus, $\mathcal{S}(x_0) = \hat{\mathcal{S}}(x_0)$. □
Next, we specialize Theorem 8 to linear stochastic dynamical systems and provide connections to the stochastic optimal linear-quadratic regulator problem with multiplicative noise. For the following result, let $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times m}$, $\sigma \in \mathbb{R}^d$, and let $R_1 \in \mathbb{R}^{n \times n}$ and $R_2 \in \mathbb{R}^{m \times m}$ be given positive-definite matrices.
Corollary 2.
Consider the linear controlled stochastic dynamical system with multiplicative noise given by
$$ dx(t) = [A x(t) + B u(t)]\,dt + x(t) \sigma^{\mathrm{T}}\,dw(t), \qquad x(0) = x_0, \quad t \geq 0, \tag{76} $$
and with quadratic performance measure
$$ J(x_0, u(\cdot)) = \mathbb{E}^{x_0}\left[\int_0^\infty \left[x^{\mathrm{T}}(t) R_1 x(t) + u^{\mathrm{T}}(t) R_2 u(t)\right] dt\right]. \tag{77} $$
Furthermore, assume that there exists a positive-definite matrix $P \in \mathbb{R}^{n \times n}$ such that
$$ 0 = \left[A + \tfrac{1}{2}\|\sigma\|^2 I_n\right]^{\mathrm{T}} P + P\left[A + \tfrac{1}{2}\|\sigma\|^2 I_n\right] + R_1 - P B R_2^{-1} B^{\mathrm{T}} P. \tag{78} $$
Then, with the feedback control $u = \phi(x) = -R_2^{-1} B^{\mathrm{T}} P x$, the zero solution $x(\cdot) \equiv 0$ to (76) is globally exponentially mean-square stable in probability and
$$ J(x_0, \phi(x(\cdot))) = x_0^{\mathrm{T}} P x_0, \qquad x_0 \in \mathbb{R}^n. \tag{79} $$
Furthermore,
$$ J(x_0, \phi(x(\cdot))) = \min_{u(\cdot) \in \hat{\mathcal{S}}(x_0)} J(x_0, u(\cdot)), \tag{80} $$
where $\hat{\mathcal{S}}(x_0)$ is the set of controllers defined in (69) for (76) and $x_0 \in \mathbb{R}^n$.
Proof. 
The result is a direct consequence of Theorem 8 with $F(x, u) = Ax + Bu$, $D(x, u) = x\sigma^{\mathrm{T}}$, $L(x, u) = x^{\mathrm{T}}R_1x + u^{\mathrm{T}}R_2u$, and $V(x) = x^{\mathrm{T}}Px$. Specifically, (66) is satisfied with $\gamma = \lambda_{\min}(R_1)$ and $p = 2$. Moreover, $V$ is a two-times continuously differentiable function that satisfies $V \in C^1_{\mathrm{p}}(\mathbb{R}^n)$ and (67) with $\alpha = \lambda_{\min}(P)$, $\beta = \lambda_{\max}(P)$, and $p = 2$. Furthermore, condition (48) is trivially satisfied. Next, it follows from (78) that $H(x, \phi(x)) = 0$, showing that (50) holds. Finally, $H(x, u) = H(x, u) - H(x, \phi(x)) = [u - \phi(x)]^{\mathrm{T}} R_2 [u - \phi(x)] \geq 0$, so that all of the conditions of Theorem 8 are satisfied. □
The optimal feedback control law $\phi$ in Corollary 2 is derived using the properties of $H$ as defined in Theorem 7. Specifically, since $H(x, u) = x^{\mathrm{T}}R_1x + u^{\mathrm{T}}R_2u + x^{\mathrm{T}}(A^{\mathrm{T}}P + PA)x + 2x^{\mathrm{T}}PBu + \|\sigma\|^2 x^{\mathrm{T}}Px$, it follows that $\partial^2 H/\partial u^2 = 2R_2 > 0$. Now, $\partial H/\partial u = 2R_2u + 2B^{\mathrm{T}}Px = 0$ gives the unique global minimizer of $H$. Hence, since $\phi$ minimizes $H(x, u)$, it follows that $\phi$ satisfies $\partial H/\partial u = 0$ or, equivalently, $\phi(x) = -R_2^{-1}B^{\mathrm{T}}Px$.
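Since (78) is an algebraic Riccati equation in the shifted matrix $A + \tfrac{1}{2}\|\sigma\|^2 I_n$, the optimal gain of Corollary 2 can be computed with an off-the-shelf CARE solver. The following sketch is illustrative; $A$, $B$, $\sigma$, $R_1$, and $R_2$ are hypothetical choices, not from the paper.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [-1.0, 0.5]])
B = np.array([[0.0], [1.0]])
sigma = np.array([0.3])                  # scalar multiplicative noise
R1, R2 = np.eye(2), np.eye(1)

# (78) is the CARE 0 = A_eff^T P + P A_eff + R1 - P B R2^{-1} B^T P with
# A_eff = A + (1/2)||sigma||^2 I_n, so solve it for the shifted drift.
A_eff = A + 0.5 * (sigma @ sigma) * np.eye(2)
P = solve_continuous_are(A_eff, B, R1, R2)

K = np.linalg.solve(R2, B.T @ P)         # phi(x) = -R2^{-1} B^T P x = -K x
print(np.linalg.eigvals(A_eff - B @ K))  # shifted closed-loop drift is Hurwitz
```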

7. Inverse Optimal Stochastic Control

In this section, we specialize Theorem 7 to systems that are affine in the control. Specifically, we devise nonlinear feedback controllers within a stochastic optimal control framework, aiming to minimize a nonlinear-nonquadratic performance criterion. This is achieved by selecting the controller in such a way that the infinitesimal generator applied to the Lyapunov function is negative definite along the sample trajectories of the closed-loop system. We also establish sufficient conditions for the existence of asymptotically stabilizing solutions (in probability) to the stochastic Hamilton–Jacobi–Bellman equation. Consequently, these findings present a set of globally stabilizing controllers, parameterized by the minimized cost functional.
The controllers developed in this section are based on an inverse optimal stochastic control problem [20,21,22,23,24,25,26]. To simplify the solution of the stochastic steady-state Hamilton–Jacobi–Bellman equation, we do not attempt to minimize a given cost functional. Instead, we parameterize a family of stochastically stabilizing controllers that minimize a derived cost functional, offering flexibility in defining the control law. The performance integrand explicitly depends on the nonlinear system dynamics, the Lyapunov function for the closed-loop system, and the stabilizing feedback control law. This coupling is introduced through the stochastic Hamilton–Jacobi–Bellman equation. Therefore, by adjusting parameters in the Lyapunov function and the performance integrand, the proposed framework can characterize a class of globally stabilizing controllers in probability, meeting specified constraints on the closed-loop system response.
Consider the nonlinear stochastic dynamical system, affine in the control, given by
$$ dx(t) = \left[f(x(t)) + G(x(t)) u(t)\right] dt + D(x(t))\,dw(t), \qquad x(0) = x_0, \quad t \geq 0, \tag{81} $$
where $f : \mathbb{R}^n \to \mathbb{R}^n$ satisfies $f(0) = 0$, $G : \mathbb{R}^n \to \mathbb{R}^{n \times m}$, and $D : \mathbb{R}^n \to \mathbb{R}^{n \times d}$ satisfies $D(0) = 0$. Furthermore, we consider performance integrands $L$ of the form
$$ L(x, u) = L_1(x) + L_2(x) u + u^{\mathrm{T}} R_2(x) u, \tag{82} $$
where $L_1 : \mathbb{R}^n \to \mathbb{R}$, $L_2 : \mathbb{R}^n \to \mathbb{R}^{1 \times m}$, and $R_2 : \mathbb{R}^n \to \mathbb{P}^m$, where $\mathbb{P}^m$ denotes the set of $m \times m$ positive-definite matrices, so that (45) becomes
$$ J(x_0, u(\cdot)) = \mathbb{E}^{x_0}\left[\int_0^\infty \left[L_1(x(t)) + L_2(x(t)) u(t) + u^{\mathrm{T}}(t) R_2(x(t)) u(t)\right] dt\right]. \tag{83} $$
Theorem 9.
Consider the nonlinear controlled affine stochastic dynamical system (81) with performance measure (83). Assume that there exist a two-times continuously differentiable, radially unbounded function $V \in C^1_{\mathrm{p}}(\mathbb{R}^n)$ and a function $L_2 : \mathbb{R}^n \to \mathbb{R}^{1 \times m}$ such that
$$ V(0) = 0, \tag{84} $$
$$ V(x) > 0, \qquad x \in \mathbb{R}^n, \quad x \neq 0, \tag{85} $$
$$ L_2(0) = 0, \tag{86} $$
$$ V'(x)\left[f(x) - \tfrac{1}{2}G(x)R_2^{-1}(x)L_2^{\mathrm{T}}(x) - \tfrac{1}{2}G(x)R_2^{-1}(x)G^{\mathrm{T}}(x)V'^{\mathrm{T}}(x)\right] + \tfrac{1}{2}\,\mathrm{tr}\,D^{\mathrm{T}}(x)V''(x)D(x) < 0, \qquad x \in \mathbb{R}^n, \quad x \neq 0. \tag{88} $$
Then the zero solution $x(\cdot) \equiv 0$ of the closed-loop system
$$ dx(t) = \left[f(x(t)) + G(x(t)) \phi(x(t))\right] dt + D(x(t))\,dw(t), \qquad x(0) = x_0, \quad t \geq 0, \tag{89} $$
is globally asymptotically stable in probability with the feedback control law
$$ \phi(x) = -\tfrac{1}{2} R_2^{-1}(x) \left[V'(x) G(x) + L_2(x)\right]^{\mathrm{T}}, \tag{90} $$
and the performance measure (83), with
$$ L_1(x) = \phi^{\mathrm{T}}(x) R_2(x) \phi(x) - V'(x) f(x) - \tfrac{1}{2}\,\mathrm{tr}\,D^{\mathrm{T}}(x) V''(x) D(x), \tag{91} $$
is minimized in the sense that
$$ J(x_0, \phi(x(\cdot))) = \min_{u(\cdot) \in \mathcal{S}(x_0)} J(x_0, u(\cdot)), \qquad x_0 \in \mathbb{R}^n. \tag{92} $$
Finally,
$$ J(x_0, \phi(x(\cdot))) = V(x_0), \qquad x_0 \in \mathbb{R}^n. \tag{93} $$
Proof. 
The result is a direct consequence of Theorem 7 with $F(x, u) = f(x) + G(x)u$, $D(x, u) = D(x)$, and $L(x, u) = L_1(x) + L_2(x)u + u^{\mathrm{T}}R_2(x)u$. Specifically, with (82) the Hamiltonian has the form
$$ H(x, u) = L_1(x) + L_2(x)u + u^{\mathrm{T}}R_2(x)u + V'(x)\left[f(x) + G(x)u\right] + \tfrac{1}{2}\,\mathrm{tr}\,D^{\mathrm{T}}(x)V''(x)D(x). $$
Now, the feedback control law (90) is obtained by setting $\partial H/\partial u = 0$. With (90), it follows that (84), (85), and (88) imply (46), (47), and (49), respectively. Next, since $V$ is two-times continuously differentiable and $x = 0$ is a local minimum of $V$, it follows that $V'(0) = 0$, and hence, since by assumption $L_2(0) = 0$, it follows that $\phi(0) = 0$, which implies (48). Next, with $L_1$ given by (91) and $\phi$ given by (90), (50) holds. Finally, since $H(x, u) = H(x, u) - H(x, \phi(x)) = [u - \phi(x)]^{\mathrm{T}}R_2(x)[u - \phi(x)]$ and $R_2(x)$ is positive definite for all $x \in \mathbb{R}^n$, condition (51) holds. The result now follows as a direct consequence of Theorem 7. □
Note that (88) is equivalent to
$$ \mathcal{L}V(x) = V'(x)\left[f(x) + G(x)\phi(x)\right] + \tfrac{1}{2}\,\mathrm{tr}\,D^{\mathrm{T}}(x)V''(x)D(x) < 0, \qquad x \in \mathbb{R}^n, \quad x \neq 0, \tag{94} $$
with $\phi$ given by (90). Furthermore, conditions (84), (85), and (94) ensure that $V$ is a Lyapunov function for the closed-loop system (89). As outlined in [16], it is crucial to acknowledge that the function $L_2$ appearing in the integrand of the performance measure (82) is an arbitrary function of $x \in \mathbb{R}^n$ constrained only by conditions (86) and (88). Therefore, $L_2$ provides flexibility in the selection of the control law.
With $L_1$ given by (91) and $\phi$ given by (90), $L$ is given by
$$ \begin{aligned} L(x, u) ={}& u^{\mathrm{T}}R_2(x)u - \phi^{\mathrm{T}}(x)R_2(x)\phi(x) + L_2(x)(u - \phi(x)) - V'(x)\left[f(x) + G(x)\phi(x)\right] - \tfrac{1}{2}\,\mathrm{tr}\,D^{\mathrm{T}}(x)V''(x)D(x) \\ ={}& \left[u + \tfrac{1}{2}R_2^{-1}(x)L_2^{\mathrm{T}}(x)\right]^{\mathrm{T}} R_2(x) \left[u + \tfrac{1}{2}R_2^{-1}(x)L_2^{\mathrm{T}}(x)\right] - V'(x)\left[f(x) + G(x)\phi(x)\right] \\ & - \tfrac{1}{2}\,\mathrm{tr}\,D^{\mathrm{T}}(x)V''(x)D(x) - \tfrac{1}{4}V'(x)G(x)R_2^{-1}(x)G^{\mathrm{T}}(x)V'^{\mathrm{T}}(x). \end{aligned} \tag{95} $$
Since $R_2(x) > 0$, $x \in \mathbb{R}^n$, the first term on the right-hand side of (95) is nonnegative, whereas (94) implies that the sum of the second and third terms is nonnegative. Thus, it follows that
$$ L(x, u) \geq -\tfrac{1}{4}V'(x)G(x)R_2^{-1}(x)G^{\mathrm{T}}(x)V'^{\mathrm{T}}(x), \tag{96} $$
which shows that $L$ may be negative. As a result, there may exist a control input $u(\cdot)$ for which the performance measure $J(x_0, u(\cdot))$ is negative. However, if the control $u(\cdot)$ is a regulation controller, that is, $u(\cdot) \in \mathcal{S}(x_0)$, then it follows from (92) and (93) that
$$ J(x_0, u(\cdot)) \geq V(x_0) \geq 0, \qquad x_0 \in \mathbb{R}^n, \quad u(\cdot) \in \mathcal{S}(x_0). \tag{97} $$
Furthermore, in this case, substituting $u = \phi(x)$ into (95) yields
$$ L(x, \phi(x)) = -V'(x)\left[f(x) + G(x)\phi(x)\right] - \tfrac{1}{2}\,\mathrm{tr}\,D^{\mathrm{T}}(x)V''(x)D(x), \tag{98} $$
which, by (94), is positive.
Example 1.
To illustrate the utility of Theorem 9, we showcase an example involving global stabilization of a stochastic version of the Lorenz equations [43]. These equations model fluid convection and are known to exhibit chaotic behavior. To construct inverse optimal controllers for the controlled Lorenz stochastic dynamical system, consider the system
$$ dx_1(t) = \left[-\alpha x_1(t) + \alpha x_2(t)\right] dt + \sigma_1 x_1(t)\,dw(t), \qquad x_1(0) = x_{10}, \quad t \geq 0, \tag{99} $$
$$ dx_2(t) = \left[r x_1(t) - x_2(t) - x_1(t) x_3(t) + u(t)\right] dt + \sigma_2 x_2(t)\,dw(t), \qquad x_2(0) = x_{20}, \tag{100} $$
$$ dx_3(t) = \left[x_1(t) x_2(t) - b x_3(t)\right] dt + \sigma_3 x_3(t)\,dw(t), \qquad x_3(0) = x_{30}, \tag{101} $$
where $\alpha, r, b, \sigma_1, \sigma_2, \sigma_3 > 0$. Note that (99)–(101) can be written in the form of (81) with
$$ f(x) = \begin{bmatrix} -\alpha x_1 + \alpha x_2 \\ r x_1 - x_2 - x_1 x_3 \\ x_1 x_2 - b x_3 \end{bmatrix}, \qquad G(x) = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \qquad D(x) = \begin{bmatrix} \sigma_1 x_1 \\ \sigma_2 x_2 \\ \sigma_3 x_3 \end{bmatrix}. $$
In order to design an inverse optimal control law for the controlled Lorenz stochastic dynamical system (99)–(101), consider the quadratic Lyapunov function candidate given by
$$ V(x) = p_1 x_1^2 + p_2 x_2^2 + p_3 x_3^2, $$
where $x = [x_1, x_2, x_3]^{\mathrm{T}}$ and $p_1, p_2, p_3 > 0$. Now, letting $p_2 = p_3$, $\sigma_1^2 < \alpha$, $|\sigma_2| < 1$, $|\sigma_3| < 1$, and $L(x, u) = L_1(x) + L_2(x)u + R_2 u^2$, where $R_2 > 0$, it follows that
$$ L_2(x) = \frac{R_2}{p_2}\left(2 p_1 \alpha + 2 p_2 r\right) x_1 - 2 p_2 x_2 $$
satisfies (88); that is,
$$ \begin{aligned} \mathcal{L}V(x) &= V'(x)\left[f(x) - \tfrac{1}{2}G(x)R_2^{-1}L_2^{\mathrm{T}}(x) - \tfrac{1}{2}G(x)R_2^{-1}G^{\mathrm{T}}(x)V'^{\mathrm{T}}(x)\right] + \tfrac{1}{2}\,\mathrm{tr}\,D^{\mathrm{T}}(x)V''(x)D(x) \\ &= -2\left[(\alpha - \sigma_1^2) p_1 x_1^2 + (1 - \sigma_2^2) p_2 x_2^2 + (1 - \sigma_3^2) p_2 b x_3^2\right] < 0, \qquad x \in \mathbb{R}^3, \quad x \neq 0. \end{aligned} $$
Hence, the feedback control law $\phi(x) = -\left(\frac{p_1}{p_2}\alpha + r\right) x_1$ given by (90) globally stabilizes the controlled Lorenz dynamical system (99)–(101). Furthermore, the performance functional (83), with
$$ L_1(x) = \left[2 \alpha p_1 + R_2 \left(\tfrac{p_1}{p_2}\alpha + r\right)^2\right] x_1^2 - 2 (p_1 \alpha + p_2 r) x_1 x_2 + 2 p_2 x_2^2 + 2 p_2 b x_3^2, $$
is minimized in the sense of (92).
Figure 1 shows the mean along with the standard deviation of 1000 sample paths of the closed-loop system with parameters $\alpha = r = b = 1$, $\sigma_1 = \sigma_2 = \sigma_3 = 0.25$, and $p_1 = p_2 = p_3 = 1$.
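A simulation in the spirit of Figure 1 is straightforward to set up with an Euler–Maruyama scheme. The following sketch is not the authors' code; it simply simulates the closed-loop system (99)–(101) under the inverse optimal feedback $\phi(x) = -\left(\frac{p_1}{p_2}\alpha + r\right)x_1$ with the example's parameter values and a hypothetical initial condition.

```python
import numpy as np

alpha = r = b = 1.0
s1 = s2 = s3 = 0.25
p1 = p2 = 1.0
dt, T, n_paths = 1e-3, 10.0, 1000
rng = np.random.default_rng(1)

def drift(x):
    x1, x2, x3 = x
    u = -(p1 / p2 * alpha + r) * x1      # inverse optimal feedback phi(x)
    return np.array([-alpha * x1 + alpha * x2,
                     r * x1 - x2 - x1 * x3 + u,
                     x1 * x2 - b * x3])

def diffusion(x):
    return np.array([s1 * x[0], s2 * x[1], s3 * x[2]])  # one noise channel

norms = np.empty(n_paths)
for i in range(n_paths):
    x = np.array([1.0, 1.0, 1.0])        # hypothetical initial condition
    for _ in range(int(T / dt)):
        dw = rng.normal(0.0, np.sqrt(dt))
        x = x + drift(x) * dt + diffusion(x) * dw
    norms[i] = np.linalg.norm(x)

# Sample mean and standard deviation of ||x(T)|| over the 1000 paths;
# both should be near zero for the stabilized system.
print(norms.mean(), norms.std())
```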
The next theorem is similar to Theorem 9 and is included here as it provides the basis for our stability margin results given in the next sections.
Theorem 10.
Consider the nonlinear controlled affine stochastic dynamical system (81) with performance measure (83). Assume that there exist a two-times continuously differentiable, radially unbounded function $V \in C^1_{\mathrm{p}}(\mathbb{R}^n)$ and a function $L_2 : \mathbb{R}^n \to \mathbb{R}^{1 \times m}$ such that
$$ V(0) = 0, \tag{105} $$
$$ V(x) > 0, \qquad x \in \mathbb{R}^n, \quad x \neq 0, \tag{106} $$
$$ L_2(0) = 0, \tag{107} $$
$$ V'(x)\left[f(x) - \tfrac{1}{2}G(x)R_2^{-1}(x)L_2^{\mathrm{T}}(x) - \tfrac{1}{2}G(x)R_2^{-1}(x)G^{\mathrm{T}}(x)V'^{\mathrm{T}}(x)\right] + \tfrac{1}{2}\,\mathrm{tr}\,D^{\mathrm{T}}(x)V''(x)D(x) < 0, \qquad x \in \mathbb{R}^n, \quad x \neq 0, \tag{108} $$
$$ 0 = L_1(x) + V'(x)f(x) + \tfrac{1}{2}\,\mathrm{tr}\,D^{\mathrm{T}}(x)V''(x)D(x) - \tfrac{1}{4}\left[V'(x)G(x) + L_2(x)\right]R_2^{-1}(x)\left[V'(x)G(x) + L_2(x)\right]^{\mathrm{T}}, \qquad x \in \mathbb{R}^n. \tag{109} $$
Then the zero solution $x(\cdot) \equiv 0$ of the closed-loop system
$$ dx(t) = \left[f(x(t)) + G(x(t))\phi(x(t))\right] dt + D(x(t))\,dw(t), \qquad x(0) = x_0, \quad t \geq 0, \tag{110} $$
is globally asymptotically stable in probability with the feedback control law
$$ \phi(x) = -\tfrac{1}{2}R_2^{-1}(x)\left[V'(x)G(x) + L_2(x)\right]^{\mathrm{T}}, \tag{111} $$
and the performance functional (83) is minimized in the sense that
$$ J(x_0, \phi(x(\cdot))) = \min_{u(\cdot) \in \mathcal{S}(x_0)} J(x_0, u(\cdot)), \qquad x_0 \in \mathbb{R}^n. \tag{112} $$
Finally,
$$ J(x_0, \phi(x(\cdot))) = V(x_0), \qquad x_0 \in \mathbb{R}^n. \tag{113} $$
Proof. 
The proof is identical to the proof of Theorem 9 and, hence, is omitted. □

8. Relative Stability Margins for Optimal Nonlinear Stochastic Regulators

In this section, we establish relative stability margins for both optimal and inverse optimal nonlinear stochastic feedback regulators. Specifically, we derive sufficient conditions ensuring gain, sector, and disk margin guarantees for nonlinear stochastic dynamical systems under the control of nonlinear optimal and inverse optimal Hamilton–Jacobi–Bellman controllers. These controllers aim to minimize a nonlinear-nonquadratic performance criterion that includes cross-weighting terms. In the scenario where the cross-weighting term in the performance criterion is omitted, our findings align with the gain, sector, and disk margins derived for the deterministic optimal control problem outlined in [25].
Alternatively, by retaining the cross-terms in the performance criterion and specializing the optimal nonlinear-nonquadratic problem to a stochastic linear-quadratic problem featuring a multiplicative noise disturbance, our results recover the corresponding gain and phase margins for the deterministic linear-quadratic optimal control problem as presented in [44]. Despite the observed degradation of gain, sector, and disk margins due to the inclusion of cross-weighting terms, the added flexibility afforded by these terms enables the assurance of optimal and inverse optimal nonlinear controllers that can exhibit superior transient performance as compared to meaningful inverse optimal controllers.
To develop relative stability margins for nonlinear stochastic regulators, consider the nonlinear stochastic dynamical system G given by
$$ dx(t) = \left[f(x(t)) + G(x(t))u(t)\right] dt + D(x(t))\,dw(t), \qquad x(0) = x_0, \quad t \geq 0, \tag{114} $$
$$ y(t) = -\phi(x(t)), \tag{115} $$
where $f : \mathbb{R}^n \to \mathbb{R}^n$ satisfies $f(0) = 0$, $G : \mathbb{R}^n \to \mathbb{R}^{n \times m}$, $D : \mathbb{R}^n \to \mathbb{R}^{n \times d}$ satisfies $D(0) = 0$, and $\phi : \mathbb{R}^n \to \mathbb{R}^m$ is an admissible feedback controller such that G is globally asymptotically stable in probability with $u = -y$, with a nonlinear-nonquadratic performance criterion
$$ J(x_0, u(\cdot)) = \mathbb{E}^{x_0}\left[\int_0^\infty \left[L_1(x(t)) + L_2(x(t))u(t) + u^{\mathrm{T}}(t)R_2(x(t))u(t)\right] dt\right], \tag{116} $$
where $L_1 : \mathbb{R}^n \to \mathbb{R}$, $L_2 : \mathbb{R}^n \to \mathbb{R}^{1 \times m}$, and $R_2 : \mathbb{R}^n \to \mathbb{R}^{m \times m}$ are given such that $R_2(x) > 0$, $x \in \mathbb{R}^n$, and $L_2(0) = 0$.
Next, we define the relative stability margins for G given by (114) and (115). Specifically, let $u_{\mathrm{c}} = y$, $y_{\mathrm{c}} = u$, and consider the negative feedback interconnection $u = -\Delta(y)$ of G and $\Delta$ given in Figure 2, where $\Delta$ is either a linear operator $\Delta(u_{\mathrm{c}}) = \Delta u_{\mathrm{c}}$, a nonlinear static operator $\Delta(u_{\mathrm{c}}) = \sigma(u_{\mathrm{c}})$, or a nonlinear dynamic operator $\Delta$ with input $u_{\mathrm{c}}$ and output $y_{\mathrm{c}}$. Furthermore, we assume that in the nominal case $\Delta = I$, the nominal closed-loop system is globally asymptotically stable in probability.
Definition 3
([16]). Let $\alpha, \beta \in \mathbb{R}$ be such that $0 < \alpha \leq 1 \leq \beta < \infty$. Then the nonlinear stochastic dynamical system G given by (114) and (115) is said to have a gain margin $(\alpha, \beta)$ if the negative feedback interconnection of G and $\Delta(u_{\mathrm{c}}) = \Delta u_{\mathrm{c}}$ is globally asymptotically stable in probability for all $\Delta = \mathrm{diag}[k_1, \ldots, k_m]$, where $k_i \in (\alpha, \beta)$, $i = 1, \ldots, m$.
Definition 4
([16]). Let $\alpha, \beta \in \mathbb{R}$ be such that $0 < \alpha \leq 1 \leq \beta < \infty$. Then the nonlinear stochastic dynamical system G given by (114) and (115) is said to have a sector margin $(\alpha, \beta)$ if the negative feedback interconnection of G and $\Delta(u_{\mathrm{c}}) = \sigma(u_{\mathrm{c}})$ is globally asymptotically stable in probability for all nonlinearities $\sigma : \mathbb{R}^m \to \mathbb{R}^m$ such that $\sigma(0) = 0$, $\sigma(u_{\mathrm{c}}) = [\sigma_1(u_{\mathrm{c}1}), \ldots, \sigma_m(u_{\mathrm{c}m})]^{\mathrm{T}}$, and $\alpha u_{\mathrm{c}i}^2 < \sigma_i(u_{\mathrm{c}i}) u_{\mathrm{c}i} < \beta u_{\mathrm{c}i}^2$ for all $u_{\mathrm{c}i} \neq 0$, $i = 1, \ldots, m$.
Definition 5.
A nonlinear stochastic dynamical system G is asymptotically zero-state observable if $\lim_{t \to \infty} u(t) = 0$ and $\lim_{t \to \infty} y(t) = 0$ imply $\lim_{t \to \infty} x(t) = 0$.
For the next two definitions, we assume that the system G and the nonlinear operator Δ are asymptotically zero-state observable.
Definition 6
([16]). Let $\alpha, \beta \in \mathbb{R}$ be such that $0 < \alpha \leq 1 \leq \beta < \infty$. Then the nonlinear stochastic dynamical system G given by (114) and (115) is said to have a disk margin $(\alpha, \beta)$ if the negative feedback interconnection of G and $\Delta$ is globally asymptotically stable in probability for all dynamic operators $\Delta$ such that $\Delta$ is stochastically dissipative with respect to the supply rate $r_{\mathrm{c}}(u_{\mathrm{c}}, y_{\mathrm{c}}) = u_{\mathrm{c}}^{\mathrm{T}} y_{\mathrm{c}} - \frac{1}{\hat{\alpha} + \hat{\beta}} y_{\mathrm{c}}^{\mathrm{T}} y_{\mathrm{c}} - \frac{\hat{\alpha}\hat{\beta}}{\hat{\alpha} + \hat{\beta}} u_{\mathrm{c}}^{\mathrm{T}} u_{\mathrm{c}}$ and with a two-times continuously differentiable, positive-definite storage function, where $\hat{\alpha} = \alpha + \delta$, $\hat{\beta} = \beta - \delta$, and $\delta \in \mathbb{R}$ is such that $0 < 2\delta < \beta - \alpha$.
Definition 7
([16]). Let $\alpha, \beta \in \mathbb{R}$ be such that $0 < \alpha \leq 1 \leq \beta < \infty$. Then the nonlinear stochastic dynamical system G given by (114) and (115) is said to have a structured disk margin $(\alpha, \beta)$ if the negative feedback interconnection of G and $\Delta$ is globally asymptotically stable in probability for all dynamic operators $\Delta$ such that $\Delta(u_{\mathrm{c}}) = \mathrm{diag}[\delta_1(u_{\mathrm{c}1}), \ldots, \delta_m(u_{\mathrm{c}m})]$ and $\delta_i$, $i = 1, \ldots, m$, is stochastically dissipative with respect to the supply rate $r_{\mathrm{c}i}(u_{\mathrm{c}i}, y_{\mathrm{c}i}) = u_{\mathrm{c}i} y_{\mathrm{c}i} - \frac{1}{\hat{\alpha} + \hat{\beta}} y_{\mathrm{c}i}^2 - \frac{\hat{\alpha}\hat{\beta}}{\hat{\alpha} + \hat{\beta}} u_{\mathrm{c}i}^2$ and with a two-times continuously differentiable, positive-definite storage function, where $\hat{\alpha} = \alpha + \delta$, $\hat{\beta} = \beta - \delta$, and $\delta \in \mathbb{R}$ is such that $0 < 2\delta < \beta - \alpha$.
Note that if G has a disk margin $(\alpha, \beta)$, then G has gain and sector margins $(\alpha, \beta)$.
The following lemma is needed for developing the main results of this section.
Lemma 1.
Consider the nonlinear stochastic dynamical system G given by (114) and (115), where $\phi$ is a stochastically stabilizing feedback control law given by (111) and where $V$ satisfies
$$ 0 = V'(x)f(x) + L_1(x) - \tfrac{1}{4}\left[V'(x)G(x) + L_2(x)\right]R_2^{-1}(x)\left[V'(x)G(x) + L_2(x)\right]^{\mathrm{T}} + \tfrac{1}{2}\,\mathrm{tr}\,D^{\mathrm{T}}(x)V''(x)D(x), \qquad x \in \mathbb{R}^n. \tag{117} $$
Furthermore, suppose there exists $\theta \in \mathbb{R}$ such that $0 < \theta < 1$ and
$$ (1 - \theta^2)L_1(x) - \tfrac{1}{4}L_2(x)R_2^{-1}(x)L_2^{\mathrm{T}}(x) \geq 0, \qquad x \in \mathbb{R}^n. \tag{118} $$
Then,
$$ \mathcal{L}V(x) \leq [u + y]^{\mathrm{T}}R_2(x)[u + y] - \theta^2 u^{\mathrm{T}}R_2(x)u \tag{119} $$
for all $x \in \mathbb{R}^n$ and $u \in \mathbb{R}^m$.
Proof. 
Note that it follows from (117) and (118) that, for all $x \in \mathbb{R}^n$ and $u \in \mathbb{R}^m$,
$$ \begin{aligned} \theta^2 u^{\mathrm{T}}R_2(x)u \leq{}& \theta^2 u^{\mathrm{T}}R_2(x)u + \left[\tfrac{1}{2}(1 - \theta^2)^{-1/2}L_2(x)R_2^{-1}(x) + (1 - \theta^2)^{1/2}u^{\mathrm{T}}\right] R_2(x) \left[\tfrac{1}{2}(1 - \theta^2)^{-1/2}L_2(x)R_2^{-1}(x) + (1 - \theta^2)^{1/2}u^{\mathrm{T}}\right]^{\mathrm{T}} \\ ={}& u^{\mathrm{T}}R_2(x)u + \tfrac{1}{4}(1 - \theta^2)^{-1}L_2(x)R_2^{-1}(x)L_2^{\mathrm{T}}(x) + L_2(x)u \\ \leq{}& u^{\mathrm{T}}R_2(x)u + L_2(x)u + L_1(x) \\ ={}& u^{\mathrm{T}}R_2(x)u + L_2(x)u - V'(x)f(x) + \phi^{\mathrm{T}}(x)R_2(x)\phi(x) - \tfrac{1}{2}\,\mathrm{tr}\,D^{\mathrm{T}}(x)V''(x)D(x) \\ ={}& [u + y]^{\mathrm{T}}R_2(x)[u + y] - V'(x)\left[f(x) + G(x)u\right] - \tfrac{1}{2}\,\mathrm{tr}\,D^{\mathrm{T}}(x)V''(x)D(x), \end{aligned} $$
which implies that
$$ \theta^2 u^{\mathrm{T}}R_2(x)u \leq [u + y]^{\mathrm{T}}R_2(x)[u + y] - \mathcal{L}V(x). $$
This completes the proof. □
Next, we present disk margins for the nonlinear-nonquadratic optimal regulator given by Theorem 10. We consider the case in which $R_2(x)$, $x \in \mathbb{R}^n$, is a constant diagonal matrix and the case in which it is not.
Theorem 11.
Consider the nonlinear stochastic dynamical system G given by (114) and (115), where $\phi$ is the stochastically stabilizing feedback control law given by (111) and where $V \in \mathcal{C}_{\mathrm p}^1(\mathbb{R}^n)$ is a two-times continuously differentiable, radially unbounded function that satisfies (105)–(109). Assume that G is asymptotically zero-state observable. If $R_2(x) = \operatorname{diag}[r_1, \ldots, r_m]$, where $r_i > 0$, $i = 1, \ldots, m$, and there exists $\theta \in \mathbb{R}$ such that $0 < \theta < 1$ and (118) is satisfied, then the nonlinear stochastic dynamical system G has a structured disk margin $\big(\frac{1}{1+\theta}, \frac{1}{1-\theta}\big)$. If, in addition, $R_2(x) \equiv I$ and there exists $\theta \in \mathbb{R}$ such that $0 < \theta < 1$ and (118) is satisfied, then the nonlinear stochastic dynamical system G has a disk margin $\big(\frac{1}{1+\theta}, \frac{1}{1-\theta}\big)$.
Proof. 
Note that it follows from Lemma 1 that
$$\mathcal{L}V(x) \leq [u + y]^{\mathrm T} R_2\, [u + y] - \theta^2 u^{\mathrm T} R_2\, u.$$
Hence, with the storage function $V_{\mathrm s}(x) = \frac{1}{2} V(x)$, it follows that G is stochastically dissipative with respect to the supply rate $r(u, y) = u^{\mathrm T} R_2 y + \frac{1-\theta^2}{2}\, u^{\mathrm T} R_2 u + \frac{1}{2}\, y^{\mathrm T} R_2 y$. Now, the result is a direct consequence of Definitions 6 and 7 and the stochastic version of Corollary 6.2 given in [16] with $\alpha = \frac{1}{1+\theta}$ and $\beta = \frac{1}{1-\theta}$. □
For the next result, define
$$\bar{\gamma} = \sup_{x \in \mathbb{R}^n} \sigma_{\max}(R_2(x)), \qquad \underline{\gamma} = \inf_{x \in \mathbb{R}^n} \sigma_{\min}(R_2(x)),$$
where $R_2(x)$ is such that $\bar{\gamma} < \infty$ and $\underline{\gamma} > 0$.
Theorem 12.
Consider the nonlinear stochastic dynamical system G given by (114) and (115), where $\phi$ is the stochastically stabilizing feedback control law given by (111) and where $V \in \mathcal{C}_{\mathrm p}^1(\mathbb{R}^n)$ is a two-times continuously differentiable, radially unbounded function that satisfies (105)–(109). Assume that G is asymptotically zero-state observable. If there exists $\theta \in \mathbb{R}$ such that $0 < \theta < 1$ and (118) is satisfied, then the nonlinear stochastic system G has a disk margin $\big(\frac{1}{1+\eta\theta}, \frac{1}{1-\eta\theta}\big)$, where $\eta = \underline{\gamma}/\bar{\gamma}$.
Proof. 
It follows from Lemma 1 that
$$\mathcal{L}V(x) \leq [u + y]^{\mathrm T} R_2(x)\, [u + y] - \theta^2 u^{\mathrm T} R_2(x)\, u \leq \bar{\gamma}\, [u + y]^{\mathrm T} [u + y] - \underline{\gamma}\, \theta^2 u^{\mathrm T} u.$$
Thus, with the storage function $V_{\mathrm s}(x) = \frac{1}{2\bar{\gamma}} V(x)$, G is stochastically dissipative with respect to the supply rate $r(u, y) = u^{\mathrm T} y + \frac{1-\eta^2\theta^2}{2}\, u^{\mathrm T} u + \frac{1}{2}\, y^{\mathrm T} y$. The result now is a direct consequence of Definition 6 and the stochastic version of Corollary 6.2 given in [16] with $\alpha = \frac{1}{1+\eta\theta}$ and $\beta = \frac{1}{1-\eta\theta}$. □
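As a quick illustration of how the bounds (123) enter Theorem 12, the following sketch (our toy example, not from the paper) estimates $\bar{\gamma}$, $\underline{\gamma}$, and $\eta$ by sampling a hypothetical bounded diagonal weighting $R_2(x) = \operatorname{diag}[2 + \sin x_1, 2 + \cos x_2]$, whose singular values lie in $[1, 3]$.

```python
# Sampling estimate of gamma_bar and gamma_underbar in (123) for a hypothetical
# bounded weighting R2(x) = diag(2 + sin(x1), 2 + cos(x2)) (our illustration).
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(-np.pi, np.pi, size=(1_000_000, 2))
diag_entries = np.column_stack((2 + np.sin(X[:, 0]), 2 + np.cos(X[:, 1])))
gamma_bar = diag_entries.max()   # approximates sup_x sigma_max(R2(x)) = 3
gamma_ub = diag_entries.min()    # approximates inf_x sigma_min(R2(x)) = 1
eta = gamma_ub / gamma_bar       # -> 1/3
theta = 0.9                      # any theta in (0, 1) for which (118) holds
print(f"disk margin from Theorem 12: "
      f"({1/(1 + eta*theta):.3f}, {1/(1 - eta*theta):.3f})")
```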
Next, we provide a result that guarantees sector and gain margins for the case in which $R_2(x)$, $x \in \mathbb{R}^n$, is diagonal.
Theorem 13.
Consider the nonlinear stochastic dynamical system G given by (114) and (115), where $\phi$ is the stochastically stabilizing feedback control law given by (111) and where $V \in \mathcal{C}_{\mathrm p}^1(\mathbb{R}^n)$ is a two-times continuously differentiable, radially unbounded function that satisfies (105)–(109). Furthermore, let $R_2(x) = \operatorname{diag}[r_1(x), \ldots, r_m(x)]$, where $r_i: \mathbb{R}^n \to \mathbb{R}$, $r_i(x) > 0$, $i = 1, \ldots, m$. If there exists $\theta \in \mathbb{R}$ such that $0 < \theta < 1$ and
$$(1 - \theta^2)\, L_1(x) - \tfrac{1}{4}\, L_2(x) R_2^{-1}(x) L_2^{\mathrm T}(x) > 0, \quad x \in \mathbb{R}^n,$$
then the nonlinear stochastic dynamical system G has a sector (and, hence, gain) margin $\big(\frac{1}{1+\theta}, \frac{1}{1-\theta}\big)$.
Proof. 
Let $\Delta(y) = \sigma(y)$, where $\sigma: \mathbb{R}^m \to \mathbb{R}^m$ is a static nonlinearity such that $\sigma(0) = 0$, $\sigma(v) = [\sigma_1(v_1), \ldots, \sigma_m(v_m)]^{\mathrm T}$, and $\alpha v_i^2 < \sigma_i(v_i)\, v_i < \beta v_i^2$ for all $v_i \neq 0$, $i = 1, \ldots, m$, where $\alpha = \frac{1}{1+\theta}$ and $\beta = \frac{1}{1-\theta}$; or, equivalently, $(\sigma_i(v_i) - \alpha v_i)(\sigma_i(v_i) - \beta v_i) < 0$ for all $v_i \neq 0$, $i = 1, \ldots, m$. In this case, the closed-loop system (114) and (115) with $u = \sigma(y)$ is given by
$$\mathrm{d}x(t) = [f(x(t)) + G(x(t))\, \sigma(\phi(x(t)))]\, \mathrm{d}t + D(x(t))\, \mathrm{d}w(t), \quad x(0) = x_0, \quad t \geq 0.$$
Next, consider the two-times continuously differentiable, radially unbounded Lyapunov function candidate $V$ satisfying (105)–(109), and let $\mathcal{L}V$ denote the infinitesimal generator of $V$ along the closed-loop system (126). Now, it follows from (109) and (125) that
$$\begin{aligned}
\mathcal{L}V(x) &= V'(x) f(x) + V'(x) G(x)\,\sigma(\phi(x)) + \tfrac{1}{2}\operatorname{tr} D^{\mathrm T}(x) V''(x) D(x) \\
&< V'(x) f(x) + V'(x) G(x)\,\sigma(\phi(x)) + \tfrac{1}{2}\operatorname{tr} D^{\mathrm T}(x) V''(x) D(x) + L_1(x) - \tfrac{1}{4(1-\theta^2)}\, L_2(x) R_2^{-1}(x) L_2^{\mathrm T}(x) \\
&\quad + (1-\theta^2)\Big[\sigma(\phi(x)) + \tfrac{1}{2(1-\theta^2)}\, R_2^{-1}(x) L_2^{\mathrm T}(x)\Big]^{\mathrm T} R_2(x)\Big[\sigma(\phi(x)) + \tfrac{1}{2(1-\theta^2)}\, R_2^{-1}(x) L_2^{\mathrm T}(x)\Big] \\
&= V'(x) f(x) + L_1(x) + V'(x) G(x)\,\sigma(\phi(x)) + \tfrac{1}{2}\operatorname{tr} D^{\mathrm T}(x) V''(x) D(x) + (1-\theta^2)\,\sigma^{\mathrm T}(\phi(x)) R_2(x)\,\sigma(\phi(x)) + L_2(x)\,\sigma(\phi(x)) \\
&= \phi^{\mathrm T}(x) R_2(x)\,\phi(x) - 2\phi^{\mathrm T}(x) R_2(x)\,\sigma(\phi(x)) + (1-\theta^2)\,\sigma^{\mathrm T}(\phi(x)) R_2(x)\,\sigma(\phi(x)) \\
&= \sum_{i=1}^m r_i(x)\Big(\tfrac{1}{\beta}\,\sigma_i(y_i) - y_i\Big)\Big(\tfrac{1}{\alpha}\,\sigma_i(y_i) - y_i\Big) \\
&= \tfrac{1}{\alpha\beta}\sum_{i=1}^m r_i(x)\,\big(\sigma_i(y_i) - \alpha y_i\big)\big(\sigma_i(y_i) - \beta y_i\big) \\
&< 0, \quad x \in \mathbb{R}^n,
\end{aligned}$$
which, by Theorem 2, implies that the closed-loop system (126) is globally asymptotically stable in probability for all $\sigma$ such that $\alpha v_i^2 < \sigma_i(v_i)\, v_i < \beta v_i^2$, $v_i \neq 0$, $i = 1, \ldots, m$. Hence, G given by (114) and (115) has sector (and, hence, gain) margins $(\alpha, \beta)$. □
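The sector-margin guarantee of Theorem 13 is easy to probe by simulation. The sketch below reuses the hypothetical scalar example given after Lemma 1 (our construction, not from the paper) and closes the loop with a sector-bounded perturbation $u = \sigma(\phi(x))$, integrating the resulting SDE with an Euler–Maruyama scheme; the sample paths contract toward the origin for any $\sigma$ satisfying the sector condition.

```python
# Monte Carlo probe of the sector margin for the hypothetical scalar example:
# dx = (a*x + u) dt + s*x dw with nominal law phi(x) = -x, perturbed by a
# static nonlinearity inside the sector (alpha, beta) of Theorem 13.
import numpy as np

rng = np.random.default_rng(2)
a, s, theta = 0.1, 0.5, 0.9
alpha, beta = 1/(1 + theta), 1/(1 - theta)
kappa = 0.6*alpha + 0.4*beta                     # fixed gain inside the sector
sigma_nl = lambda v: kappa*v + 0.1*np.tanh(v)    # alpha*v**2 < sigma(v)*v < beta*v**2

dt, T, n_paths = 1e-3, 10.0, 200
x = np.full(n_paths, 2.0)                        # x(0) = 2 for every path
for _ in range(int(T/dt)):
    u = sigma_nl(-x)                             # sector-perturbed feedback
    dw = rng.normal(scale=np.sqrt(dt), size=n_paths)
    x += (a*x + u)*dt + s*x*dw                   # Euler-Maruyama step
print("mean |x(T)| over paths:", np.abs(x).mean())   # ~0: paths contract
```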
It is important to note that Theorem 13 also holds in the case where (125) is replaced with (118), under the additional assumption that the left-hand side of (118) is radially unbounded. To see this, note that (127) can be written as
$$\begin{aligned}
\mathcal{L}V(x) ={} & -\Big[L_1(x) - \tfrac{1}{4(1-\theta^2)}\, L_2(x) R_2^{-1}(x) L_2^{\mathrm T}(x)\Big] \\
& - (1-\theta^2)\Big[\sigma(\phi(x)) + \tfrac{1}{2(1-\theta^2)}\, R_2^{-1}(x) L_2^{\mathrm T}(x)\Big]^{\mathrm T} R_2(x)\Big[\sigma(\phi(x)) + \tfrac{1}{2(1-\theta^2)}\, R_2^{-1}(x) L_2^{\mathrm T}(x)\Big] \\
& + \tfrac{1}{\alpha\beta}\sum_{i=1}^m r_i(x)\,\big(\sigma_i(y_i) - \alpha y_i\big)\big(\sigma_i(y_i) - \beta y_i\big) \\
\leq{} & -W(x), \quad x \in \mathbb{R}^n,
\end{aligned}$$
where
$$W(x) \triangleq L_1(x) - \tfrac{1}{4(1-\theta^2)}\, L_2(x) R_2^{-1}(x) L_2^{\mathrm T}(x).$$
In this case, the result follows from Theorem 3.1 of [45]. Furthermore, note that in the case where $R_2(x)$, $x \in \mathbb{R}^n$, is diagonal, Theorem 13 guarantees larger gain and sector margins than the gain and sector margin guarantees provided by Theorem 12. However, Theorem 13 does not provide disk margin guarantees.

9. Nonlinear Stochastic Feedback Regulators with Relative Stability Margin Guarantees

In this section, we give sufficient conditions that guarantee that a given nonlinear feedback controller has prespecified disk, sector, and gain margins.
Proposition 1.
Let $\theta \in (0, 1)$ and let $R_2 \in \mathbb{R}^{m \times m}$ be a positive-definite matrix. Consider the nonlinear stochastic dynamical system G given by (114) and (115), where $\phi$ is a stochastically stabilizing feedback control law. Then there exist functions $V: \mathbb{R}^n \to \mathbb{R}$, $L_1: \mathbb{R}^n \to \mathbb{R}$, and $L_2: \mathbb{R}^n \to \mathbb{R}^{1 \times m}$ such that $\phi(x) = -\frac{1}{2} R_2^{-1}\, [V'(x) G(x) + L_2(x)]^{\mathrm T}$, $V \in \mathcal{C}_{\mathrm p}^1(\mathbb{R}^n)$ is a two-times continuously differentiable, radially unbounded function such that $V(0) = 0$, $V(x) > 0$, $x \in \mathbb{R}^n$, $x \neq 0$, and, for all $x \in \mathbb{R}^n$,
$$0 = V'(x) f(x) + L_1(x) - \tfrac{1}{4}\,[V'(x) G(x) + L_2(x)]\, R_2^{-1}\, [V'(x) G(x) + L_2(x)]^{\mathrm T} + \tfrac{1}{2}\operatorname{tr} D^{\mathrm T}(x) V''(x) D(x),$$
$$0 \leq (1 - \theta^2)\, L_1(x) - \tfrac{1}{4}\, L_2(x) R_2^{-1} L_2^{\mathrm T}(x),$$
if and only if there exists a two-times continuously differentiable, radially unbounded function $V \in \mathcal{C}_{\mathrm p}^1(\mathbb{R}^n)$ such that $V(0) = 0$, $V(x) > 0$, $x \in \mathbb{R}^n$, $x \neq 0$, and
$$\mathcal{L}V(x) \leq [u + y]^{\mathrm T} R_2\, [u + y] - \theta^2 u^{\mathrm T} R_2\, u, \quad x \in \mathbb{R}^n, \quad u \in \mathbb{R}^m.$$
Proof. 
If there exist functions $V: \mathbb{R}^n \to \mathbb{R}$, $L_1: \mathbb{R}^n \to \mathbb{R}$, and $L_2: \mathbb{R}^n \to \mathbb{R}^{1 \times m}$ such that $\phi(x) = -\frac{1}{2} R_2^{-1}\, [V'(x) G(x) + L_2(x)]^{\mathrm T}$ and (129) and (130) are satisfied, then it follows from Lemma 1 that (131) is satisfied.
Conversely, if (131) is satisfied, then, with $Q = R_2$, $S = R_2$, and $R = (1 - \theta^2) R_2$, it follows from the stochastic version of Theorem 5.6 of [16] that, for all $x \in \mathbb{R}^n$,
$$0 \geq V'(x) f(x) + \tfrac{1}{2}\operatorname{tr} D^{\mathrm T}(x) V''(x) D(x) - \phi^{\mathrm T}(x) R_2\, \phi(x) + \tfrac{1}{4(1-\theta^2)}\,[2\phi^{\mathrm T}(x) R_2 + V'(x) G(x)]\, R_2^{-1}\, [2\phi^{\mathrm T}(x) R_2 + V'(x) G(x)]^{\mathrm T}.$$
The result now follows with
$$L_1(x) = -V'(x) f(x) + \phi^{\mathrm T}(x) R_2\, \phi(x) - \tfrac{1}{2}\operatorname{tr} D^{\mathrm T}(x) V''(x) D(x)$$
and $L_2(x) = -[2\phi^{\mathrm T}(x) R_2 + V'(x) G(x)]$. □
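The constructions for $L_1$ and $L_2$ at the end of the proof can be verified symbolically. The sketch below does this for a hypothetical scalar example of our own choosing: for any smooth $V$ and $\phi$, the assigned $L_1$ and $L_2$ make (129) collapse to an identity and recover $\phi(x) = -\frac{1}{2} R_2^{-1}[V'(x)G(x) + L_2(x)]^{\mathrm T}$.

```python
# Symbolic check (scalar case, our hypothetical data) of the L1, L2 construction
# in the proof of Proposition 1.
import sympy as sp

x = sp.symbols('x', real=True)
R2 = sp.Rational(2)                        # constant positive weight (m = 1)
f, G, D = -x - x**3, sp.Integer(1), x/2    # hypothetical drift, input map, diffusion
V = x**2                                   # candidate Lyapunov/value function
phi = -2*x                                 # a given stabilizing feedback law
Vp, Vpp = sp.diff(V, x), sp.diff(V, x, 2)

L1 = -Vp*f + phi*R2*phi - sp.Rational(1, 2)*D**2*Vpp
L2 = -(2*phi*R2 + Vp*G)
lhs_129 = Vp*f + L1 - sp.Rational(1, 4)*(Vp*G + L2)**2/R2 + sp.Rational(1, 2)*D**2*Vpp
print(sp.simplify(lhs_129))                    # -> 0, so (129) holds identically
print(sp.simplify(phi + (Vp*G + L2)/(2*R2)))   # -> 0, recovering the form of phi
```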
Note that if (129) and (130) are satisfied, then it follows from Theorem 9 that the feedback control law $\phi(x) = -\frac{1}{2} R_2^{-1}\, [V'(x) G(x) + L_2(x)]^{\mathrm T}$ minimizes the cost functional (83). Hence, Proposition 1 provides necessary and sufficient conditions for optimality of a given stochastically stabilizing feedback control law with prespecified disk margin guarantees.
The following result presents specific disk margin guarantees for inverse optimal controllers.
Theorem 14.
Let $\theta \in (0, 1)$ be given. Consider the nonlinear stochastic dynamical system G given by (114) and (115), where $\phi$ is a stochastically stabilizing feedback control law. Assume that G is asymptotically zero-state observable and that there exist functions $V: \mathbb{R}^n \to \mathbb{R}$ and $R_2: \mathbb{R}^n \to \mathbb{R}^{m \times m}$ such that $V \in \mathcal{C}_{\mathrm p}^1(\mathbb{R}^n)$ is a two-times continuously differentiable, radially unbounded function, $R_2(x) > 0$, $x \in \mathbb{R}^n$, and
$$V(0) = 0,$$
$$V(x) > 0, \quad x \in \mathbb{R}^n, \quad x \neq 0,$$
$$V'(x)[f(x) + G(x)\phi(x)] + \tfrac{1}{2}\operatorname{tr} D^{\mathrm T}(x) V''(x) D(x) < 0, \quad x \in \mathbb{R}^n, \quad x \neq 0,$$
$$V'(x) f(x) + \tfrac{1}{2}\operatorname{tr} D^{\mathrm T}(x) V''(x) D(x) - \phi^{\mathrm T}(x) R_2(x)\, \phi(x) + \tfrac{1}{1-\theta^2}\Big[\phi^{\mathrm T}(x) + \tfrac{1}{2} V'(x) G(x) R_2^{-1}(x)\Big] R_2(x)\Big[\phi^{\mathrm T}(x) + \tfrac{1}{2} V'(x) G(x) R_2^{-1}(x)\Big]^{\mathrm T} \leq 0, \quad x \in \mathbb{R}^n.$$
Then the nonlinear stochastic dynamical system G has a disk margin $\big(\frac{1}{1+\eta\theta}, \frac{1}{1-\eta\theta}\big)$, where $\eta = \underline{\gamma}/\bar{\gamma}$ and $\underline{\gamma}$ and $\bar{\gamma}$ are given by (123). Furthermore, with the feedback control law $\phi$, the performance functional
$$J(x_0, u(\cdot)) = \mathbb{E}_{x_0}\bigg[\int_0^\infty \big[-V'(x(t))\,(f(x(t)) + G(x(t))\, u(t)) + (\phi(x(t)) - u(t))^{\mathrm T} R_2(x(t))\,(\phi(x(t)) - u(t))\big]\,\mathrm{d}t\bigg]$$
is minimized in the sense that
$$J(x_0, \phi(x(\cdot))) = \min_{u(\cdot) \in \mathcal{S}(x_0)} J(x_0, u(\cdot)), \quad x_0 \in \mathbb{R}^n.$$
Proof. 
The result is a direct consequence of Theorems 9 and 12 with $L_1(x) = -V'(x) f(x) + \phi^{\mathrm T}(x) R_2(x)\, \phi(x) - \tfrac{1}{2}\operatorname{tr} D^{\mathrm T}(x) V''(x) D(x)$ and $L_2(x) = -(2\phi^{\mathrm T}(x) R_2(x) + V'(x) G(x))$. Specifically, in this case, all the conditions of Theorem 9 are trivially satisfied. Furthermore, note that (135) is equivalent to (118). The result now follows from Theorems 9 and 12. □
The next result provides sufficient conditions that guarantee that a given nonlinear feedback controller has prespecified gain and sector margins.
Theorem 15.
Let $\theta \in (0, 1)$ be given. Consider the asymptotically zero-state observable nonlinear stochastic dynamical system G given by (114) and (115), where $\phi$ is a stochastically stabilizing feedback control law. Assume there exist functions $R_2(x) = \operatorname{diag}[r_1(x), \ldots, r_m(x)]$, where $r_i: \mathbb{R}^n \to \mathbb{R}$, $r_i(x) > 0$, $i = 1, \ldots, m$, and a two-times continuously differentiable, radially unbounded function $V \in \mathcal{C}_{\mathrm p}^1(\mathbb{R}^n)$ satisfying (132)–(135). Then the nonlinear stochastic dynamical system G has a sector (and, hence, gain) margin $\big(\frac{1}{1+\theta}, \frac{1}{1-\theta}\big)$. Furthermore, with the feedback control law $\phi$, the performance functional (136) is minimized in the sense that
$$J(x_0, \phi(x(\cdot))) = \min_{u(\cdot) \in \mathcal{S}(x_0)} J(x_0, u(\cdot)), \quad x_0 \in \mathbb{R}^n.$$
Proof. 
The result is a direct consequence of Theorems 9 and 13 with the proof being identical to the proof of Theorem 14. □

10. Optimal Linear-Quadratic Stochastic Control

In this section, we specialize Theorems 11 and 12 to the case of linear stochastic systems with multiplicative disturbance noise. Specifically, consider the stabilizable stochastic linear system given by
$$\mathrm{d}x(t) = [A x(t) + B u(t)]\,\mathrm{d}t + x(t)\, \sigma^{\mathrm T}\, \mathrm{d}w(t), \quad x(0) = x_0, \quad t \geq 0,$$
$$y(t) = K x(t),$$
where $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times m}$, $K \in \mathbb{R}^{m \times n}$, and $\sigma \in \mathbb{R}^d$, and assume that $(A, K)$ is detectable and the system (139) and (140) is asymptotically stable in probability with the feedback $u = y$ or, equivalently, $\tilde{A} + BK$ is Hurwitz, where $\tilde{A} = A + \frac{1}{2}\|\sigma\|^2 I_n$. Furthermore, assume that $K$ is an optimal regulator that minimizes the quadratic performance functional given by
$$J(x_0, u(\cdot)) = \mathbb{E}_{x_0} \int_0^\infty [x^{\mathrm T}(t) R_1 x(t) + 2 x^{\mathrm T}(t) R_{12} u(t) + u^{\mathrm T}(t) R_2 u(t)]\,\mathrm{d}t,$$
where $R_1 \in \mathbb{R}^{n \times n}$, $R_{12} \in \mathbb{R}^{n \times m}$, and $R_2 \in \mathbb{R}^{m \times m}$ are such that $R_2 > 0$, $R_1 - R_{12} R_2^{-1} R_{12}^{\mathrm T} \geq 0$, and $(A, R_1)$ is observable. In this case, it follows from Theorem 9 with $f(x) = Ax$, $G(x) = B$, $L_1(x) = x^{\mathrm T} R_1 x$, $L_2(x) = 2 x^{\mathrm T} R_{12}$, $R_2(x) = R_2$, $\phi(x) = Kx$, and $V(x) = x^{\mathrm T} P x$ that the optimal control law $K$ is given by $K = -R_2^{-1}(B^{\mathrm T} P + R_{12}^{\mathrm T})$, where $P > 0$ is the solution to the algebraic regulator Riccati equation given by
$$0 = (\tilde{A} - B R_2^{-1} R_{12}^{\mathrm T})^{\mathrm T} P + P (\tilde{A} - B R_2^{-1} R_{12}^{\mathrm T}) + R_1 - R_{12} R_2^{-1} R_{12}^{\mathrm T} - P B R_2^{-1} B^{\mathrm T} P.$$
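For completeness, a brief computational sketch (with toy data of our own choosing, not from the paper): SciPy's `solve_continuous_are` accepts the cross-weighting term directly through its `s` argument, so (142) can be solved by passing $\tilde{A}$, $B$, $R_1$, $R_2$, and $R_{12}$, after which the gain follows from $K = -R_2^{-1}(B^{\mathrm T}P + R_{12}^{\mathrm T})$.

```python
# Solving the cross-weighted Riccati Equation (142) for toy data (our example).
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [-1.0, 0.3]])
B = np.array([[0.0], [1.0]])
sigma = np.array([0.2, 0.1])               # multiplicative-noise vector
R1, R12, R2 = np.diag([2.0, 1.0]), np.array([[0.1], [0.05]]), np.array([[1.0]])

A_tilde = A + 0.5 * (sigma @ sigma) * np.eye(2)     # A + (1/2)||sigma||^2 I_n
P = solve_continuous_are(A_tilde, B, R1, R2, s=R12)
K = -np.linalg.solve(R2, B.T @ P + R12.T)           # optimal gain, u = K x
assert np.all(np.linalg.eigvals(A_tilde + B @ K).real < 0)  # A_tilde + B K Hurwitz
```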
The following results provide guarantees of disk, sector, and gain margins for the system (139) and (140). We assume that G is asymptotically zero-state observable.
Corollary 3.
Consider the stochastic dynamical system with multiplicative noise given by (139) and (140) and with performance functional (141), and let $\sigma_{\max}^2(R_{12}) < \sigma_{\min}(R_1)\, \sigma_{\min}(R_2)$. Then, with $K = -R_2^{-1}(B^{\mathrm T} P + R_{12}^{\mathrm T})$, where $P > 0$ satisfies (142), the system (139) and (140) has disk margin (and, hence, sector and gain margins) $\big(\frac{1}{1+\eta\theta}, \frac{1}{1-\eta\theta}\big)$, where
$$\eta = \frac{\sigma_{\min}(R_2)}{\sigma_{\max}(R_2)}, \qquad \theta = \bigg(1 - \frac{\sigma_{\max}^2(R_{12})}{\sigma_{\min}(R_1)\, \sigma_{\min}(R_2)}\bigg)^{1/2}.$$
Proof. 
The result is a direct consequence of Theorem 12 with $f(x) = Ax$, $G(x) = B$, $\phi(x) = Kx$, $V(x) = x^{\mathrm T} P x$, $L_1(x) = x^{\mathrm T} R_1 x$, and $L_2(x) = 2 x^{\mathrm T} R_{12}$. Specifically, note that (142) is equivalent to (109). Now, with $\theta$ given by (143), it follows that $(1 - \theta^2) R_1 - R_{12} R_2^{-1} R_{12}^{\mathrm T} \geq 0$, and hence, (118) is satisfied, so that all the conditions of Theorem 12 are satisfied. □
Corollary 4.
Consider the stochastic dynamical system with multiplicative noise given by (139) and (140) and with performance functional (141), and let $\sigma_{\max}^2(R_{12}) < \sigma_{\min}(R_1)\, \sigma_{\min}(R_2)$, where $R_2$ is diagonal. Then, with $K = -R_2^{-1}(B^{\mathrm T} P + R_{12}^{\mathrm T})$, where $P > 0$ satisfies (142), the system (139) and (140) has a structured disk margin (and, hence, gain and sector margins) $\big(\frac{1}{1+\theta}, \frac{1}{1-\theta}\big)$, where
$$\theta = \bigg(1 - \frac{\sigma_{\max}^2(R_{12})}{\sigma_{\min}(R_1)\, \sigma_{\min}(R_2)}\bigg)^{1/2}.$$
Proof. 
The result is a direct consequence of Theorem 11 with $f(x) = Ax$, $G(x) = B$, $\phi(x) = Kx$, $V(x) = x^{\mathrm T} P x$, $L_1(x) = x^{\mathrm T} R_1 x$, and $L_2(x) = 2 x^{\mathrm T} R_{12}$. Specifically, note that (142) is equivalent to (109). Now, with $\theta$ given by (144), it follows that $(1 - \theta^2) R_1 - R_{12} R_2^{-1} R_{12}^{\mathrm T} \geq 0$, and hence, (118) is satisfied, so that all the conditions of Theorem 11 are satisfied. □
The gain margins specified in Corollary 4 precisely match those presented in [44] for deterministic linear-quadratic optimal regulators incorporating cross-weighting terms in the performance criterion. Additionally, since Corollary 4 ensures structured disk margins of $\big(\frac{1}{1+\theta}, \frac{1}{1-\theta}\big)$, it implies that the system possesses a phase margin $\phi$ defined by
$$\cos(\phi) = 1 - \frac{\theta^2}{2},$$
or, equivalently,
$$\sin\Big(\frac{\phi}{2}\Big) = \frac{\theta}{2}.$$
In the scenario where $R_{12} = 0$, it follows from (144) that $\theta = 1$. Consequently, Corollary 4 ensures a phase margin of $\pm 60^\circ$ in each input–output channel. Additionally, stipulating $R_1 > 0$ leads to the conclusion, based on Corollary 4, that the system described by (139) and (140) possesses a gain and sector margin of $(\frac{1}{2}, \infty)$.
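Continuing the toy example above, the margin formulas of Corollaries 3 and 4 and the phase-margin relation (146) reduce to a few singular-value computations; the sketch below is our illustration with the same hypothetical weights.

```python
# Margin numbers for the toy weights used in the Riccati sketch above.
import numpy as np

R1, R12, R2 = np.diag([2.0, 1.0]), np.array([[0.1], [0.05]]), np.array([[1.0]])
s12 = np.linalg.svd(R12, compute_uv=False).max()
s1_min = np.linalg.svd(R1, compute_uv=False).min()
s2 = np.linalg.svd(R2, compute_uv=False)

theta = np.sqrt(1.0 - s12**2 / (s1_min * s2.min()))  # Equation (143)/(144)
eta = s2.min() / s2.max()                            # = 1 here: R2 is scalar
print(f"disk margin: ({1/(1 + eta*theta):.3f}, {1/(1 - eta*theta):.1f})")
print(f"phase margin: +/-{np.degrees(2*np.arcsin(theta/2)):.1f} deg "
      f"(-> 60 deg as R12 -> 0)")
```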

11. Stability Margins and Meaningful Inverse Optimality

In this section, we establish explicit links between stochastic stability margins, stochastic meaningful inverse optimality, and stochastic dissipativity, focusing on a specific quadratic supply rate. More precisely, we derive a stochastic counterpart to the classical return difference inequality for continuous-time systems with continuously differentiable flows [21,46] in the context of stochastic dynamical systems. Furthermore, we establish connections between stochastic dissipativity and optimality for stochastic nonlinear controllers. Notably, we demonstrate the equivalence between stochastic dissipativity and optimality in the realm of stochastic dynamical systems. Specifically, we illustrate that an optimal nonlinear feedback controller $\phi$, satisfying a return difference condition based on the infinitesimal generator of a controlled Markov diffusion process, is equivalent to the stochastic dynamical system (with input $u$ and output $y = \phi(x)$) being stochastically dissipative with respect to the supply rate $[u + y]^{\mathrm T}[u + y] - u^{\mathrm T} u$.
Here, we assume that $L(x, u)$ is nonnegative for all $(x, u) \in \mathbb{R}^n \times \mathbb{R}^m$, which, in the terminology of [25,47], corresponds to a meaningful cost functional. Furthermore, we assume that $L_2(x) \equiv 0$ and that $L_1(x) \geq 0$, $x \in \mathbb{R}^n$, is radially unbounded. In this case, we establish connections between stochastic dissipativity and optimality for nonlinear stochastic controllers. The first result specializes Theorem 10 to the case in which $L_2(x) \equiv 0$.
Theorem 16.
Consider the nonlinear stochastic dynamical system (114) with performance functional (83) with $L_2(x) \equiv 0$ and $L_1(x) \geq 0$, $x \in \mathbb{R}^n$. Assume that there exists a two-times continuously differentiable, radially unbounded function $V \in \mathcal{C}_{\mathrm p}^1(\mathbb{R}^n)$ such that
$$V(0) = 0,$$
$$V(x) > 0, \quad x \in \mathbb{R}^n, \quad x \neq 0,$$
$$0 = L_1(x) + V'(x) f(x) + \tfrac{1}{2}\operatorname{tr} D^{\mathrm T}(x) V''(x) D(x) - \tfrac{1}{4}\, V'(x) G(x) R_2^{-1}(x) G^{\mathrm T}(x) V'^{\mathrm T}(x), \quad x \in \mathbb{R}^n.$$
Then the zero solution $x(\cdot) \equiv 0$ of the closed-loop system
$$\mathrm{d}x(t) = [f(x(t)) + G(x(t))\, \phi(x(t))]\,\mathrm{d}t + D(x(t))\,\mathrm{d}w(t), \quad x(0) = x_0, \quad t \geq 0,$$
is globally asymptotically stable in probability with the feedback control law
$$\phi(x) = -\tfrac{1}{2} R_2^{-1}(x) G^{\mathrm T}(x) V'^{\mathrm T}(x),$$
and the performance functional (83) is minimized in the sense that
$$J(x_0, \phi(x(\cdot))) = \min_{u(\cdot) \in \mathcal{S}(x_0)} J(x_0, u(\cdot)), \quad x_0 \in \mathbb{R}^n.$$
Finally,
$$J(x_0, \phi(x(\cdot))) = V(x_0), \quad x_0 \in \mathbb{R}^n.$$
Proof. 
The proof is similar to the proof of Theorem 9, and hence, is omitted. □
Next, we show that for a given nonlinear stochastic dynamical system G given by (114) and (115), there exists an equivalence between optimality and stochastic dissipativity. For the following result, we assume that for a given nonlinear stochastic system (114), if there exists a feedback control law $\phi$ that minimizes the performance functional (83) with $R_2(x) \equiv I$, $L_2(x) \equiv 0$, and $L_1(x) \geq 0$, $x \in \mathbb{R}^n$, then there exists a two-times continuously differentiable, radially unbounded function $V \in \mathcal{C}_{\mathrm p}^1(\mathbb{R}^n)$ such that (149) is satisfied.
Theorem 17.
Consider the nonlinear stochastic dynamical system G given by (114) and (115). The feedback control law $u = \phi(x)$ is optimal with respect to a performance functional (82) with $R_2(x) \equiv I$, $L_2(x) \equiv 0$, and $L_1(x) \geq 0$, $x \in \mathbb{R}^n$, if and only if the nonlinear stochastic system G is stochastically dissipative with respect to the supply rate $r(u, y) = y^{\mathrm T} y + 2 u^{\mathrm T} y$ and has a two-times continuously differentiable, positive-definite, radially unbounded storage function $V \in \mathcal{C}_{\mathrm p}^1(\mathbb{R}^n)$.
Proof. 
If the control law $\phi$ is optimal with respect to a performance functional (82) with $R_2(x) \equiv I$, $L_2(x) \equiv 0$, and $L_1(x) \geq 0$, $x \in \mathbb{R}^n$, then, by assumption, there exists a two-times continuously differentiable, radially unbounded function $V \in \mathcal{C}_{\mathrm p}^1(\mathbb{R}^n)$ such that (149) is satisfied. Hence, it follows from Proposition 1 that, for every $\theta \in (0, 1)$,
$$\mathcal{L}V(x) \leq [u + y]^{\mathrm T}[u + y] - \theta^2 u^{\mathrm T} u,$$
and hence, letting $\theta \to 1$, $\mathcal{L}V(x) \leq [u + y]^{\mathrm T}[u + y] - u^{\mathrm T} u$, which implies that G is stochastically dissipative with respect to the supply rate $r(u, y) = y^{\mathrm T} y + 2 u^{\mathrm T} y$.
Conversely, if G is stochastically dissipative with respect to the supply rate $r(u, y) = y^{\mathrm T} y + 2 u^{\mathrm T} y$ and has a two-times continuously differentiable, positive-definite storage function $V \in \mathcal{C}_{\mathrm p}^1(\mathbb{R}^n)$, then, with $h(x) = \phi(x)$, $J(x) \equiv 0$, $Q = I$, $R = 0$, and $S = 2I$, it follows from the stochastic version of Theorem 5.6 of [16] that there exists a function $\ell: \mathbb{R}^n \to \mathbb{R}^p$ such that $\phi(x) = -\frac{1}{2} G^{\mathrm T}(x) V'^{\mathrm T}(x)$ and, for all $x \in \mathbb{R}^n$,
$$0 = V'(x) f(x) + \tfrac{1}{2}\operatorname{tr} D^{\mathrm T}(x) V''(x) D(x) - \tfrac{1}{4}\, V'(x) G(x) G^{\mathrm T}(x) V'^{\mathrm T}(x) + \ell^{\mathrm T}(x)\, \ell(x).$$
Now, the result follows from Theorem 16 with $L_1(x) = \ell^{\mathrm T}(x)\, \ell(x)$. □
The next result gives disk and structured disk margins for the nonlinear stochastic dynamical system G given by (114) and (115).
Corollary 5.
Consider the nonlinear stochastic dynamical system G given by (114) and (115), where $\phi$ is the stochastically stabilizing feedback control law given by (111) with $L_2(x) \equiv 0$ and where $V \in \mathcal{C}_{\mathrm p}^1(\mathbb{R}^n)$ is a two-times continuously differentiable, radially unbounded function that satisfies (105)–(109). Assume that G is asymptotically zero-state observable. Furthermore, assume that $R_2(x) = \operatorname{diag}[r_1, \ldots, r_m]$, where $r_i > 0$, $i = 1, \ldots, m$, and $L_1(x) \geq 0$, $x \in \mathbb{R}^n$. Then the nonlinear stochastic dynamical system G has a structured disk margin $(\frac{1}{2}, \infty)$. If, in addition, $R_2(x) \equiv I_m$, then the nonlinear stochastic system G has a disk margin $(\frac{1}{2}, \infty)$.
Proof. 
The result is a direct consequence of Theorem 11. Specifically, if $L_1(x) \geq 0$, $x \in \mathbb{R}^n$, and $L_2(x) \equiv 0$, then (118) is trivially satisfied for all $\theta \in (0, 1)$. Now, the result follows immediately by letting $\theta \to 1$. □
Finally, we provide sector and gain margins for the nonlinear stochastic dynamical system G given by (114) and (115).
Corollary 6.
Consider the nonlinear stochastic dynamical system G given by (114) and (115), where $\phi$ is a stochastically stabilizing feedback control law given by (111) with $L_2(x) \equiv 0$ and where $V \in \mathcal{C}_{\mathrm p}^1(\mathbb{R}^n)$ is a two-times continuously differentiable, radially unbounded function that satisfies (105)–(109). Furthermore, assume that $R_2(x) = \operatorname{diag}[r_1(x), \ldots, r_m(x)]$, where $r_i: \mathbb{R}^n \to \mathbb{R}$, $r_i(x) > 0$, $i = 1, \ldots, m$, and $L_1(x) > 0$, $x \in \mathbb{R}^n$. Then the nonlinear stochastic dynamical system G has a sector (and, hence, gain) margin $(\frac{1}{2}, \infty)$.
Proof. 
The result is a direct consequence of Theorem 13. Specifically, if $L_1(x) > 0$, $x \in \mathbb{R}^n$, and $L_2(x) \equiv 0$, then (125) is trivially satisfied for all $\theta \in (0, 1)$. Now, the result follows immediately by letting $\theta \to 1$. □

12. Conclusions

In this paper, we merged stochastic Lyapunov theory with stochastic Hamilton–Jacobi–Bellman theory to provide explicit connections between stability and optimality of nonlinear stochastic regulators. The proposed approach involves utilizing a steady-state stochastic Hamilton–Jacobi–Bellman framework to characterize optimal nonlinear feedback controllers wherein the notion of optimality is directly linked to a specified Lyapunov function, guaranteeing stability in probability for the closed-loop system. The derived results are then employed to establish inverse optimal feedback controllers for both affine nonlinear stochastic systems and linear stochastic systems.
Moreover, leveraging the concepts of stochastic stability and stochastic dissipativity theory, we developed sufficient conditions for gain, sector, and disk margin guarantees. These conditions apply to nonlinear stochastic dynamical systems controlled by both nonlinear optimal and inverse optimal regulators minimizing a nonlinear-nonquadratic performance criterion. Furthermore, we established connections between stochastic dissipativity and optimality for nonlinear stochastic systems. The proposed framework provides the foundation for extending linear-quadratic control for stochastic dynamical systems to nonlinear-nonquadratic problems.

Author Contributions

M.L.: Conceptualization, Formal analysis, Software, Visualization, Writing—original draft. W.M.H.: Conceptualization, Formal analysis, Writing—review and editing, Supervision, Funding Acquisition. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the Air Force Office of Scientific Research under Grant FA9550-20-1-0038.

Data Availability Statement

No data were used for the research described in the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Baumann, W.T.; Rugh, W.J. Feedback control of nonlinear systems by extended linearization. IEEE Trans. Autom. Control 1986, 31, 40–46.
2. Wang, J.; Rugh, W.J. Feedback linearization families for nonlinear systems. IEEE Trans. Autom. Control 1987, 32, 935–941.
3. Blueschke, D.; Blueschke-Nikolaeva, V.; Neck, R. Approximately optimal control of nonlinear dynamic stochastic problems with learning: The OPTCON algorithm. Algorithms 2021, 14, 181.
4. Zhang, Y.; Li, S.; Liao, L. Near-optimal control of nonlinear dynamical systems: A brief survey. Annu. Rev. Control 2019, 47, 71–80.
5. Rekasius, Z.V. Suboptimal design of intentionally nonlinear controllers. IEEE Trans. Autom. Control 1964, 9, 380–386.
6. Bass, R.; Webber, R. Optimal nonlinear feedback control derived from quartic and higher-order performance criteria. IEEE Trans. Autom. Control 1966, 11, 448–454.
7. Speyer, J. A nonlinear control law for a stochastic infinite time problem. IEEE Trans. Autom. Control 1976, 21, 560–564.
8. Shaw, L. Nonlinear control of linear multivariable systems via state-dependent feedback gains. IEEE Trans. Autom. Control 1979, 24, 108–112.
9. Salehi, S.V.; Ryan, E. On optimal nonlinear feedback regulation of linear plants. IEEE Trans. Autom. Control 1982, 27, 1260–1264.
10. Leitmann, G. On the efficacy of nonlinear control in uncertain linear systems. ASME J. Dyn. Syst. Meas. Control 1981, 102, 95–102.
11. Petersen, I.R. Nonlinear versus linear control in the direct output feedback stabilization of linear systems. IEEE Trans. Autom. Control 1985, 30, 799–802.
12. Barmish, B.R.; Galimidi, A.R. Robustness of Luenberger observers: Linear systems stabilized via non-linear control. Automatica 1986, 22, 413–423.
13. Ryan, E.P. Optimal feedback control of saturating systems. Int. J. Control 1982, 35, 531–534.
14. Blanchini, F. Feedback control for linear time-invariant systems with state and control bounds in the presence of disturbances. IEEE Trans. Autom. Control 1990, 35, 1231–1234.
15. Bernstein, D.S. Nonquadratic cost and nonlinear feedback control. Int. J. Robust Nonlinear Control 1993, 3, 211–229.
16. Haddad, W.M.; Chellaboina, V. Nonlinear Dynamical Systems and Control: A Lyapunov-Based Approach; Princeton University Press: Princeton, NJ, USA, 2008.
17. Kushner, H.J. A partial history of the early development of continuous-time nonlinear stochastic systems theory. Automatica 2014, 50, 303–334.
18. Lanchares, M.; Haddad, W.M. Nonlinear–nonquadratic optimal and inverse optimal control for discrete-time stochastic dynamical systems. Int. J. Robust Nonlinear Control 2022, 32, 1487–1509.
19. Haddad, W.M.; Lanchares, M. Dissipativity, inverse optimal control, and stability margins for nonlinear discrete-time stochastic feedback regulators. Int. J. Control 2023, 96, 2133–2145.
20. Molinari, B. The stable regulator problem and its inverse. IEEE Trans. Autom. Control 1973, 18, 454–459.
21. Moylan, P.J.; Anderson, B.D.O. Nonlinear regulator theory and an inverse optimal control problem. IEEE Trans. Autom. Control 1973, 18, 460–465.
22. Jacobson, D.H. Extensions of Linear-Quadratic Control Optimization and Matrix Theory; Academic Press: New York, NY, USA, 1977.
23. Jacobson, D.H.; Martin, D.H.; Pachter, M.; Geveci, T. Extensions of Linear-Quadratic Control Theory; Springer: Berlin/Heidelberg, Germany, 1980.
24. Freeman, R.A.; Kokotović, P.V. Inverse optimality in robust stabilization. SIAM J. Control Optim. 1996, 34, 1365–1391.
25. Sepulchre, R.; Jankovic, M.; Kokotovic, P. Constructive Nonlinear Control; Springer: London, UK, 1997.
26. Deng, H.; Krstic, M. Stochastic nonlinear stabilization–Part II: Inverse optimality. Syst. Control Lett. 1997, 32, 151–159.
27. Rajpurohit, T.; Haddad, W.M. Dissipativity theory for nonlinear stochastic dynamical systems. IEEE Trans. Autom. Control 2017, 62, 1684–1699.
28. Lanchares, M.; Haddad, W.M. Dissipative stochastic dynamical systems. Syst. Control Lett. 2023, 172, 105451.
29. Khasminskii, R. Stochastic Stability of Differential Equations, 2nd ed.; Springer: Berlin/Heidelberg, Germany, 2012.
30. Arnold, L. Stochastic Differential Equations: Theory and Applications; Wiley-Interscience: New York, NY, USA, 1974.
31. Øksendal, B. Stochastic Differential Equations: An Introduction with Applications, 6th ed.; Springer: Berlin/Heidelberg, Germany, 2013.
32. Mao, X. Stochastic Differential Equations and Applications, 2nd ed.; Woodhead Publishing: Cambridge, UK, 2007.
33. Le Gall, J.-F. Brownian Motion, Martingales, and Stochastic Calculus; Springer: Berlin/Heidelberg, Germany, 2016.
34. Gard, T.C. Introduction to Stochastic Differential Equations; Marcel Dekker: New York, NY, USA, 1988.
35. Klebaner, F.C. Introduction to Stochastic Calculus with Applications, 3rd ed.; Imperial College Press: London, UK, 2012.
36. Stroock, D.W.; Varadhan, S.R.S. Multidimensional Diffusion Processes; Springer: Berlin/Heidelberg, Germany, 2014.
37. Billingsley, P. Probability and Measure, Anniversary ed.; John Wiley & Sons: Hoboken, NJ, USA, 2012.
38. Apostol, T.M. Mathematical Analysis; Addison-Wesley: Reading, MA, USA, 1974.
39. Shreve, S. Stochastic Calculus for Finance II: Continuous-Time Models; Springer: New York, NY, USA, 2004.
40. Kushner, H. Introduction to Stochastic Control; Holt, Rinehart and Winston: New York, NY, USA, 1971.
41. Chang, F.-R. Stochastic Optimization in Continuous Time; Cambridge University Press: Cambridge, UK, 2004.
42. Fleming, W.H.; Soner, H.M. Controlled Markov Processes and Viscosity Solutions, 2nd ed.; Springer: New York, NY, USA, 2006.
43. Wan, C.-J.; Bernstein, D.S. Nonlinear feedback control with global stabilization. Dyn. Control 1995, 5, 321–346.
44. Chung, D.; Kang, T.; Lee, J. Stability robustness of LQ optimal regulators for the performance index with cross-product terms. IEEE Trans. Autom. Control 1994, 39, 1698–1702.
45. Mao, X. Stochastic versions of the LaSalle theorem. J. Differ. Equations 1999, 153, 175–195.
46. Chellaboina, V.; Haddad, W.M. Stability margins of nonlinear optimal regulators with nonquadratic performance criteria involving cross-weighting terms. Syst. Control Lett. 2000, 39, 71–78.
47. Freeman, R.; Kokotović, P. Robust Nonlinear Control Design: State-Space and Lyapunov Techniques; Birkhauser: Boston, MA, USA, 1996.
Figure 1. Controlled system states versus time. The bold lines show the average states over 1000 sample paths, whereas the shaded area shows one standard deviation from the average.
Figure 2. Multiplicative input uncertainty of G and input operator Δ.

