Singularly Perturbed Forward-Backward Stochastic Differential Equations: Application to the Optimal Control of Bilinear Systems

Kebiri, Omar; Neureither, Lara; Hartmann, Carsten

doi:10.3390/computation6030041

by Omar Kebiri ^1,2, Lara Neureither ^2 and Carsten Hartmann ^2,*

^1 Laboratory of Statistics and Random Modeling, University of Abou Bekr Belkaid, Tlemcen 13000, Algeria

^2 Institute of Mathematics, Brandenburgische Technische Universität Cottbus-Senftenberg, 03046 Cottbus, Germany

^* Author to whom correspondence should be addressed.

Computation 2018, 6(3), 41; https://doi.org/10.3390/computation6030041

Submission received: 12 February 2018 / Revised: 22 June 2018 / Accepted: 27 June 2018 / Published: 28 June 2018

(This article belongs to the Special Issue Computation in Molecular Modeling)

Abstract:

We study linear-quadratic stochastic optimal control problems with bilinear state dependence where the underlying stochastic differential equation (SDE) has multiscale features. We show that, in the same way in which the underlying dynamics can be well approximated by a reduced-order dynamics in the scale separation limit (using classical homogenization results), the associated optimal expected cost converges to an effective optimal cost in the scale separation limit. This entails that we can approximate the stochastic optimal control for the whole system by a reduced-order stochastic optimal control, which is easier to compute because of the lower dimensionality of the problem. The approach uses an equivalent formulation of the Hamilton-Jacobi-Bellman (HJB) equation, in terms of forward-backward SDEs (FBSDEs). We exploit the efficient solvability of FBSDEs via a least squares Monte Carlo algorithm and show its applicability by a suitable numerical example.

Keywords:

linear quadratic stochastic control; bilinear systems; slow-fast dynamics; model reduction; forward-backward stochastic differential equations; least squares Monte Carlo

1. Introduction

Stochastic optimal control is an important field of mathematics that has attracted the attention of both pure and applied mathematicians [1,2]. Stochastic control problems also appear in a variety of applications, such as statistics [3,4], financial mathematics [5,6], molecular dynamics [7,8] or materials science [9,10], to mention just a few. For some applications in science and engineering, such as molecular dynamics [8,11], the high dimensionality of the state space is an important challenge when solving optimal control problems. A further difficulty when solving optimal control problems by discretising the corresponding dynamic programming equations in space and time is posed by multiscale effects, which come into play when the state space dynamics exhibits slow and fast motions.

Here we consider systems that have slow and fast scales and that are possibly high-dimensional. Several techniques have been developed to reduce the spatial dimension of control systems (see e.g., [12,13] and the references therein), but these techniques treat the control as a possibly time-dependent parameter (“open loop control”) and do not take into account that the control may be a feedback control that depends on the state variables (“closed loop control”). Homogenisation techniques for stochastic control systems have been extensively studied by applied analysts using a variety of mathematical tools, including viscosity solutions of the Hamilton-Jacobi-Bellman equation [14,15], backward stochastic differential equations [16,17], or occupation measures [18,19]. However, the convergence analysis of multiscale stochastic control systems is quite involved and non-constructive, in that the limiting equations of motion are not given in explicit or closed form, which makes these results of limited practical use; see [20,21] for notable exceptions, dealing mainly with the case when the dynamics is linear.

In general, the elimination of variables and solving control problems do not commute, so one of the key questions in control engineering is under which conditions it is possible to eliminate variables before solving an optimal control problem. We call this the model reduction problem. In this paper, we identify a class of stochastic feedback control problems with bilinear state dependence that admit the elimination of variables (i.e., model reduction) before solving the control problem. These systems turn out to be relevant in the control of high-dimensional transport PDEs, such as Fokker-Planck equations or the evolution equations of open quantum systems [22,23]. The possibility of applying model reduction before solving the corresponding optimal control problem means that it is possible to treat the control in the original equation simply as a parameter. This is in accordance with the general model reduction strategy in control engineering that is motivated by the fact that solving a dimension-reduced control problem, rather than the original one, is numerically much less demanding; see e.g., [12,13] and the references therein. We will show that this strategy, under certain assumptions, yields a good approximation of the high-dimensional optimal control, which implies that the reduced-order optimal control can be used to control the full system dynamics almost optimally.

Our approach is based on a Donsker-Varadhan type duality principle between a linear Feynman-Kac PDE and the semi-linear dynamic programming PDE associated with a stochastic control problem [24]. Here we exploit the fact that the dynamic programming PDE can be recast as an uncoupled forward-backward stochastic differential equation (see e.g., [25,26]) that can be treated by model reduction techniques, such as averaging or homogenization. The relation between semilinear PDEs of Hamilton-Jacobi-Bellman type and forward-backward stochastic differential equations (FBSDEs) is a classical subject that was first studied by Pardoux and Peng [27] and has since received a lot of attention from various sides, e.g., [28,29,30,31,32,33]. The solution theory for FBSDEs has its roots in the work of Antonelli [34] and has been extended in various directions since then; see e.g., [35,36,37,38]. From a theoretical point of view, this paper goes beyond our previous works [24,39] in that we prove strong convergence of the value function and the control without relying on compactness or periodicity assumptions for the fast variables, even though we focus on bilinear systems only, which is the weakest form of nonlinearity. (Many nonlinear systems, however, can be represented as bilinear systems by a so-called Carleman linearisation.) It also goes beyond the classical works [20,21] that treat systems that are either fully linear or linear in the fast variables. We stress that we are mainly aiming at the model reduction problem, but alongside the theoretical results we discuss some ideas to discretise the corresponding FBSDE [40,41,42,43,44], since one of the main motivations for doing model reduction is to reduce the numerical complexity of solving optimal control problems.

1.1. Set-Up and Problem Statement

We briefly describe the technical set-up of the control problem considered in this paper. We consider linear-quadratic (LQ) stochastic control problems of the following form: minimise the expected cost

J (u; t, x) = E [\int_{t}^{τ} (q_{0} (X_{s}^{u}) + {| u_{s} |}^{2}) d s + q_{1} (X_{τ}^{u}) | X_{t}^{u} = x]

(1)

over all admissible controls

u \in U

and subject to:

d X_{s}^{u} = (a (X_{s}^{u}) + b (X_{s}^{u}) u_{s}) d s + σ (X_{s}^{u}) d W_{s}, 0 ⩽ t ⩽ s ⩽ τ .

(2)

Here

τ < \infty

is a bounded stopping time (specified below), and the set of admissible controls

U

is chosen such that (2) has a unique strong solution. The denomination linear-quadratic for (1)–(2) is due to the specific dependence of the system on the control variable u. The state vector

x \in R^{n}

is assumed to be high-dimensional, which is why we seek a low-dimensional approximation of (1)–(2).

Specifically, we consider the case that

q_{0}

and

q_{1}

are quadratic in x, a is linear and

σ

is constant, and the control term is an affine function of x, i.e.,

b (x) u = (N x + B) u .

In this case, the system is called bilinear (including linear systems as a special case), and the aim is to replace (2) by a lower dimensional bilinear system

d {\bar{X}}_{s}^{v} = \bar{A} {\bar{X}}_{s}^{v} d s + (\bar{N} {\bar{X}}_{s}^{v} + \bar{B}) v_{s} d s + \bar{C} d W_{s}, 0 ⩽ t ⩽ s ⩽ τ,

with states

\bar{x} \in R^{n_{s}}

,

n_{s} ≪ n

and an associated reduced cost functional

\bar{J} (v; t, \bar{x}) = E [\int_{t}^{τ} ({\bar{q}}_{0} ({\bar{X}}_{s}^{v}) + {| v_{s} |}^{2}) d s + {\bar{q}}_{1} ({\bar{X}}_{τ}^{v}) | {\bar{X}}_{t}^{v} = \bar{x}],

that is solved instead of (1)–(2). Letting

v^{*}

denote the minimizer of

\bar{J}

, we require that

v^{*}

is a good approximation of the minimiser

u^{*}

of the original problem where “good approximation” is understood in the sense that

J (v^{*}; \cdot, \cdot) \approx J (u^{*}; \cdot, \cdot) .

It is a priori not clear how the symbol “≈” in the last equation must be interpreted, e.g., pointwise for all initial data

(x, t) \in R^{n} \times [0, T)

for some

T < \infty

, or uniformly on all compact subsets of

R^{n} \times [0, T)

.

One situation in which the above approximation property holds is when

u^{*} \approx v^{*}

uniformly in t and the cost is continuous in the control, but it turns out that this requirement is overly restrictive in general. We will discuss alternative criteria in the course of this paper.

1.2. Outline

The paper is organised as follows: In Section 2 we introduce the bilinear stochastic control problem studied in this paper and derive the corresponding forward-backward stochastic differential equation (FBSDE). Section 3 contains the main result, a convergence result for the value function of a singularly perturbed control problem with bilinear state dependence, based on an FBSDE formulation. In Section 4 we present a numerical example to illustrate the theoretical findings and discuss the numerical discretisation of the FBSDE. The article concludes in Section 5 with a short summary and a discussion of future work. The proof of the main result and some technical lemmas are recorded in Appendix A.

2. Singularly Perturbed Bilinear Control Systems

We now specify the system dynamics (2) and the corresponding cost functional (1). Let

(x_{1}, x_{2}) \in R^{n_{s}} \times R^{n_{f}}

with

n_{s} + n_{f} = n

denote a decomposition of the state vector

x \in R^{n}

into relevant (slow) and irrelevant (fast) components. Further let

W = {(W_{t})}_{t \geq 0}

denote an

R^{m}

-valued Brownian motion on a probability space

(Ω, F, P)

that is endowed with the filtration

{(F_{t})}_{t \geq 0}

generated by W. For any initial condition

x \in R^{n}

and any

A

-valued admissible control

u \in U

, with

A \subset R

, we consider the following system of Itô stochastic differential equations

d X_{s}^{ϵ} = A X_{s}^{ϵ} d s + (N X_{s}^{ϵ} + B) u_{s} d s + C d W_{s}, X_{t}^{ϵ} = x,

(3)

that depends parametrically on a parameter

ϵ > 0

via the coefficients

A = A^{ϵ} \in R^{n \times n}, N = N^{ϵ} \in R^{n \times n}, B = B^{ϵ} \in R^{n}, and C = C^{ϵ} \in R^{n \times m},

where for brevity we also drop the dependence of the process on the control u, i.e.,

X_{s}^{ϵ} = X_{s}^{u, ϵ}

. The stiffness matrix A in (3) is assumed to be of the form

A = \begin{pmatrix} A_{11} & ϵ^{- 1 / 2} A_{12} \\ ϵ^{- 1 / 2} A_{21} & ϵ^{- 1} A_{22} \end{pmatrix} \in R^{(n_{s} + n_{f}) \times (n_{s} + n_{f})},

(4)

with

n = n_{s} + n_{f}

. Control and noise coefficients are given by

N = \begin{pmatrix} N_{11} & N_{12} \\ ϵ^{- 1 / 2} N_{21} & ϵ^{- 1 / 2} N_{22} \end{pmatrix} \in R^{(n_{s} + n_{f}) \times (n_{s} + n_{f})}

(5)

and

B = \begin{pmatrix} B_{1} \\ ϵ^{- 1 / 2} B_{2} \end{pmatrix} \in R^{(n_{s} + n_{f}) \times 1}, C = \begin{pmatrix} C_{1} \\ ϵ^{- 1 / 2} C_{2} \end{pmatrix} \in R^{(n_{s} + n_{f}) \times m},

(6)

where

N x + B \in range (C)

for all

x \in R^{n}

; often we will consider either the case

m = 1

with

C_{i} = \sqrt{ρ} B_{i}

,

ρ > 0

, or

m = n

, with C being a multiple of the identity when

ϵ = 1

. All block matrices

A_{i j}, N_{i j}

,

B_{i}

and

C_{j}

are assumed to be order 1 and independent of

ϵ

.
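As a minimal sketch of how the scaled coefficients (4)–(6) can be assembled from the order-one blocks, consider the following NumPy helper. The function name `scaled_coefficients` and its argument layout are illustrative assumptions and not part of the original paper; the blocks are passed as 2-D arrays, except for B_1 and B_2, which are 1-D.

```python
import numpy as np

def scaled_coefficients(A11, A12, A21, A22, N11, N12, N21, N22,
                        B1, B2, C1, C2, eps):
    """Assemble A, N, B, C as in (4)-(6) for a given eps > 0.

    Hypothetical helper: all blocks are order-one arrays that do not
    depend on eps; the eps-scaling is applied here.
    """
    s = eps ** -0.5  # the factor eps^{-1/2}
    A = np.block([[A11, s * A12], [s * A21, A22 / eps]])
    N = np.block([[N11, N12], [s * N21, s * N22]])
    B = np.concatenate([B1, s * B2])   # shape (n_s + n_f,)
    C = np.vstack([C1, s * C2])        # shape (n_s + n_f, m)
    return A, N, B, C
```

Keeping the blocks separate from the scaling makes it easy to rerun the same experiment for a sequence of decreasing values of ϵ.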

The above

ϵ

-scaling of the coefficients is natural for a system with

n_{s}

slow and

n_{f}

fast degrees of freedom and arises, for example, as a result of a balancing transformation applied to a large-scale system of equations; see e.g., [22,45]. A special case of (3) is the linear system

d X_{s}^{ϵ} = (A X_{s}^{ϵ} + B u_{s}) d s + C d W_{s} .

(7)

Our goal is to control the stochastic dynamics (3)—or (7) as a special variant—so that a given cost criterion is optimised. Specifically, given two symmetric positive semidefinite matrices

Q_{0}, Q_{1} \in R^{n_{s} \times n_{s}}

, we consider the quadratic cost functional

J (u; t, x) = E [\frac{1}{2} \int_{t}^{τ} ({(X_{1, s}^{ϵ})}^{⊤} Q_{0} X_{1, s}^{ϵ} + {| u_{s} |}^{2}) d s + \frac{1}{2} {(X_{1, τ}^{ϵ})}^{⊤} Q_{1} X_{1, τ}^{ϵ}],

(8)

that we seek to minimise subject to the dynamics (3). Here the expectation is understood as an expectation over all realisations of

{(X_{s}^{ϵ})}_{s \in [t, τ]}

starting at

X_{t}^{ϵ} = x

, and as a consequence J is a function of the initial data

(t, x)

. The stopping time is defined as the minimum of some time

T < \infty

and the first exit time of a domain

D = D_{s} \times R^{n_{f}} \subset R^{n_{s}} \times R^{n_{f}}

where

D_{s}

is an open and bounded set with smooth boundary. Specifically, we set

τ = min {τ_{D}, T}

, with

τ_{D} = inf {r \geq t : X_{r}^{ϵ} \notin D} .

In other words,

τ

is the stopping time that is defined by the event that either

r = T

or

X_{r}^{ϵ}

leaves the set

D = D_{s} \times R^{n_{f}}

, whichever comes first. Please note that the cost function does not explicitly depend on the fast variables

x_{2}

. We define the corresponding value function by

V^{ϵ} (t, x) = inf_{u \in U} J (u; t, x) .

(9)
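For a given feedback control, the cost (8) can be estimated by plain Monte Carlo. The following is a rough sketch under several assumptions that are not taken from the paper: an Euler-Maruyama discretisation of (3) with start time t = 0, a scalar control supplied as a hypothetical callable `control(s, X)`, and a hypothetical indicator `in_Ds` that tests membership of the slow component in D_s. Paths are frozen once they leave D_s, which mimics the stopping time τ.

```python
import numpy as np

def estimate_cost(A, N, B, C, Q0, Q1, control, x0, T, n_slow, in_Ds,
                  dt=1e-3, n_paths=2000, seed=0):
    """Monte Carlo estimate of J(u; 0, x0) in (8) for a scalar feedback control."""
    rng = np.random.default_rng(seed)
    m = C.shape[1]
    X = np.tile(np.asarray(x0, dtype=float), (n_paths, 1))
    cost = np.zeros(n_paths)
    alive = np.ones(n_paths, dtype=bool)      # paths that have not been stopped
    for k in range(int(round(T / dt))):
        alive &= in_Ds(X[:, :n_slow])         # stop paths whose slow part left D_s
        if not alive.any():
            break
        Xa = X[alive]
        X1 = Xa[:, :n_slow]
        u = control(k * dt, Xa)               # shape (number of alive paths,)
        # running cost 1/2 (x1' Q0 x1 + |u|^2) dt
        cost[alive] += 0.5 * dt * (np.einsum('ij,jk,ik->i', X1, Q0, X1) + u**2)
        drift = Xa @ A.T + (Xa @ N.T + B) * u[:, None]
        dW = np.sqrt(dt) * rng.standard_normal((Xa.shape[0], m))
        X[alive] = Xa + dt * drift + dW @ C.T
    X1 = X[:, :n_slow]
    cost += 0.5 * np.einsum('ij,jk,ik->i', X1, Q1, X1)   # terminal cost at tau
    return cost.mean()
```

Such an estimate is only useful for comparing given controls; it does not by itself yield the infimum in (9).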

Remark 1.

As a consequence of the boundedness of

D_{s} \subset R^{n_{s}}

, we may assume that all coefficients in our control problem are bounded or Lipschitz continuous, which makes some of the proofs in the paper more transparent.

We further note that all of the following considerations trivially carry over to the case

N = 0

and a multi-dimensional control variable, i.e.,

u \in R^{k}

and

B \in R^{n \times k}

.

From Stochastic Control to Forward-Backward Stochastic Differential Equations

We suppose that the matrix pair

(A, C)

satisfies the Kalman rank condition

rank (C | A C | A^{2} C | \dots | A^{n - 1} C) = n .

(10)
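The rank condition (10) is straightforward to check numerically. A minimal sketch follows; the function name `kalman_rank` is an illustrative choice, and `numpy.linalg.matrix_rank` uses a default tolerance that may need adjusting for badly scaled matrices, for example when ϵ is very small.

```python
import numpy as np

def kalman_rank(A, C):
    """Rank of the controllability matrix (C | AC | ... | A^{n-1} C)."""
    n = A.shape[0]
    blocks = [C]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])   # next block A^k C
    return np.linalg.matrix_rank(np.hstack(blocks))

# Example: a double integrator driven through the second component.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
C = np.array([[0.0], [1.0]])
is_controllable = kalman_rank(A, C) == A.shape[0]
```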

A necessary—and in this case sufficient—condition for optimality of our control problem is that the value function (9) solves a semilinear parabolic partial differential equation of Hamilton-Jacobi-Bellman type (a.k.a. dynamic programming equation) [46]

- \frac{\partial V^{ϵ}}{\partial t} = L^{ϵ} V^{ϵ} + f (x, V^{ϵ}, C^{⊤} \nabla V^{ϵ}), V^{ϵ} |_{E^{+}} = q_{1},

(11)

where

q_{1} (x) = \frac{1}{2} x_{1}^{⊤} Q_{1} x_{1}

and

E^{+}

is the terminal set of the augmented process

(s, X_{s}^{ϵ})

, precisely

E^{+} = ([0, T) \times \partial D) \cup ({T} \times D)

. Here

L^{ϵ}

is the infinitesimal generator of the control-free process,

L^{ϵ} = \frac{1}{2} C C^{⊤} : \nabla^{2} + (A x) \cdot \nabla,

(12)

and the nonlinearity f is given by

f (x, y, z) = \frac{1}{2} x_{1}^{⊤} Q_{0} x_{1} - \frac{1}{2} {|(x^{⊤} N^{⊤} + B^{⊤}) {(C^{⊤})}^{♯} z|}^{2} .

(13)

Please note that f is independent of y. Here

{(C^{⊤})}^{♯}

denotes the Moore-Penrose pseudoinverse. The expression is unambiguously defined since

z = C^{⊤} \nabla V^{ϵ}

and

(N x + B) \in range (C)

, which by noting that

{(C^{⊤})}^{♯} C^{⊤}

is the orthogonal projection onto

range (C)

implies that

| (x^{⊤} N^{⊤} + B^{⊤}) \nabla V^{ϵ} |^{2} = | (x^{⊤} N^{⊤} + B^{⊤}) {(C^{⊤})}^{♯} z |^{2} .
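The projection argument can be checked numerically on random data. In the sketch below, the dimensions, the random matrix `C`, and the vector `b` (which plays the role of N x + B and is constructed to lie in range(C)) are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 4, 2
C = rng.standard_normal((n, m))
b = C @ rng.standard_normal(m)        # stands in for N x + B, lies in range(C)
grad_V = rng.standard_normal(n)       # stands in for the gradient of V
z = C.T @ grad_V                      # z = C^T grad V

pinv_Ct = np.linalg.pinv(C.T)         # Moore-Penrose pseudoinverse of C^T
P = pinv_Ct @ C.T                     # orthogonal projector onto range(C)

lhs = b @ grad_V                      # (N x + B)^T grad V
rhs = b @ (pinv_Ct @ z)               # (N x + B)^T (C^T)^# z
```

Both sides agree because P b = b whenever b lies in range(C); without the range condition the two expressions differ in general.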

The specific semilinear form of the equation is a consequence of the control problem being linear-quadratic. As a consequence, the dynamic programming Equation (11) admits a representation in form of an uncoupled forward-backward stochastic differential equation (FBSDE). To appreciate this point, consider the control-free process

X_{s}^{ϵ} = X_{s}^{ϵ, u = 0}

with infinitesimal generator

L^{ϵ}

and define an adapted process

Y_{s}^{ϵ} = Y_{s}^{ϵ, x, t}

by

Y_{s}^{ϵ} = V^{ϵ} (s, X_{s}^{ϵ}) .

(14)

(We abuse notation and denote both the controlled and the uncontrolled process by

X_{s}^{ϵ}

.) Then, by definition,

Y_{t}^{ϵ} = V^{ϵ} (t, x)

. Moreover, by Itô’s formula and the dynamic programming Equation (11), the pair

{(X_{s}^{ϵ}, Y_{s}^{ϵ})}_{s \in [t, τ]}

can be shown to solve the system of equations

\begin{matrix} d X_{s}^{ϵ} & = A X_{s}^{ϵ} d s + C d W_{s}, X_{t}^{ϵ} = x \\ d Y_{s}^{ϵ} & = - f (X_{s}^{ϵ}, Y_{s}^{ϵ}, Z_{s}^{ϵ}) d s + Z_{s}^{ϵ} d W_{s}, Y_{τ}^{ϵ} = q_{1} (X_{τ}^{ϵ}), \end{matrix}

(15)

with

Z_{s}^{ϵ} = C^{⊤} \nabla V^{ϵ} (s, X_{s}^{ϵ})

being the control variable. Here, the second equation is only meaningful if interpreted as a backward equation, since only in this case

Z_{s}^{ϵ}

is uniquely defined. To see this, let

f = 0

and

q_{1} (x) = x

and note that the ansatz (14) implies that

Y_{s}^{ϵ}

is adapted to the filtration generated by the forward process

X_{s}^{ϵ}

. If the second equation were just a time-reversed SDE, then

(Y_{s}^{ϵ}, Z_{s}^{ϵ}) \equiv (X_{τ}^{ϵ}, 0)

would be the unique solution to the SDE

d Y_{s}^{ϵ} = Z_{s}^{ϵ} d W_{s}

with terminal condition

Y_{τ}^{ϵ} = X_{τ}^{ϵ}

. However, such a solution would not be adapted, because

Y_{s}^{ϵ}

for

s < τ

would depend on the future value

X_{τ}^{ϵ}

of the forward process.

Remark 2.

Equation (15) is called an uncoupled FBSDE because the forward equation for

X_{s}^{ϵ}

is independent of

Y_{s}^{ϵ}

or

Z_{s}^{ϵ}

. The fact that the FBSDE is uncoupled furnishes a well-known duality relation between the value function of an LQ optimal control problem and the cumulant generating function of the cost [47,48]; specifically, in the case that

N = 0

,

B = C

and the pair

(A, B)

being completely controllable, it holds that

V^{ϵ} (t, x) = - log E [exp (- \int_{t}^{τ} q_{0} (X_{s}^{ϵ}) d s - q_{1} (X_{τ}^{ϵ}))],

(16)

with

q_{0} (x) = \frac{1}{2} x_{1}^{⊤} Q_{0} x_{1} .

Here the expectation on the right hand side is taken over all realisations of the control-free process

X_{s}^{ϵ} = X_{s}^{ϵ, u = 0}

, starting at

X_{t}^{ϵ} = x

. By the Feynman-Kac theorem, the function

ψ^{ϵ} = exp (- V^{ϵ})

solves the linear parabolic boundary value problem

(\frac{\partial}{\partial t} + L^{ϵ}) ψ^{ϵ} = q_{0} (x) ψ^{ϵ}, ψ^{ϵ} |_{E^{+}} = exp (- q_{1}),

(17)

which is equivalent to the corresponding dynamic programming Equation (11).
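The equivalence can be verified by a short computation. The following worked derivation is a sketch under the assumptions of this remark (N = 0 and B = C), in which case the nonlinearity (13) reduces to f = q_0 - \frac{1}{2} |C^{⊤} \nabla V^{ϵ}|^{2}:

```latex
\begin{align*}
\partial_t \psi^\epsilon &= -\psi^\epsilon\,\partial_t V^\epsilon, \qquad
\nabla \psi^\epsilon = -\psi^\epsilon\,\nabla V^\epsilon, \qquad
\nabla^2 \psi^\epsilon
  = \psi^\epsilon \bigl(\nabla V^\epsilon (\nabla V^\epsilon)^\top - \nabla^2 V^\epsilon\bigr), \\
L^\epsilon \psi^\epsilon
  &= \psi^\epsilon \Bigl(\tfrac{1}{2}\,|C^\top \nabla V^\epsilon|^2 - L^\epsilon V^\epsilon\Bigr), \\
\Bigl(\frac{\partial}{\partial t} + L^\epsilon\Bigr) \psi^\epsilon
  &= \psi^\epsilon \Bigl(-\partial_t V^\epsilon - L^\epsilon V^\epsilon
     + \tfrac{1}{2}\,|C^\top \nabla V^\epsilon|^2\Bigr)
   = q_0\,\psi^\epsilon .
\end{align*}
```

The last equality uses (11) in the form -\partial_t V^{ϵ} - L^{ϵ} V^{ϵ} = q_0 - \frac{1}{2} |C^{⊤} \nabla V^{ϵ}|^{2}, so that the quadratic gradient terms cancel.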

3. Model Reduction

The idea is to exploit the fact that (15) is uncoupled, which allows us to derive an FBSDE for the slow variables

{\bar{X}}_{s}^{ϵ} = X_{1, s}^{ϵ}

only, by standard singular perturbation methods. The reduced FBSDE as

ϵ \to 0

will then be of the form

\begin{matrix} d {\bar{X}}_{s} & = \bar{A} {\bar{X}}_{s} d s + \bar{C} d W_{s}, {\bar{X}}_{t} = x_{1} \\ d {\bar{Y}}_{s} & = - \bar{f} ({\bar{X}}_{s}, {\bar{Y}}_{s}, {\bar{Z}}_{s}) d s + {\bar{Z}}_{s} d W_{s}, {\bar{Y}}_{τ} = {\bar{q}}_{1} ({\bar{X}}_{τ}), \end{matrix}

(18)

where the limiting form of the backward SDE follows from the corresponding properties of the forward SDE. Specifically, assuming that the solution of the associated SDE

d ξ_{s} = A_{22} ξ_{s} d s + C_{2} d W_{s},

(19)

that is governing the fast dynamics as

ϵ \to 0

, is ergodic with unique Gaussian invariant measure

π = N (0, Σ)

, where

Σ = Σ^{⊤} > 0

is the unique solution to the Lyapunov equation

A_{22} Σ + Σ A_{22}^{⊤} = - C_{2} C_{2}^{⊤},

(20)

we obtain that, asymptotically as

ϵ \to 0

,

X_{2, s}^{ϵ} \sim ξ_{s / ϵ}, s > 0 .

(21)
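The covariance Σ in (20) can be computed by vectorising the Lyapunov equation. The sketch below uses only NumPy and a Kronecker-product formulation, which is adequate for small n_f; the function name `stationary_covariance` is an illustrative assumption, and for larger systems a dedicated Lyapunov solver would be preferable.

```python
import numpy as np

def stationary_covariance(A22, C2):
    """Solve A22 S + S A22^T = -C2 C2^T for S, cf. (20)."""
    if np.linalg.eigvals(A22).real.max() >= 0.0:
        raise ValueError("A22 must be Hurwitz for an invariant measure to exist")
    nf = A22.shape[0]
    I = np.eye(nf)
    # Row-major vectorisation: vec(A S) = (A kron I) vec(S),
    # vec(S A^T) = (I kron A) vec(S).
    lhs = np.kron(A22, I) + np.kron(I, A22)
    rhs = -(C2 @ C2.T).reshape(-1)
    S = np.linalg.solve(lhs, rhs).reshape(nf, nf)
    return 0.5 * (S + S.T)   # symmetrise against round-off
```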

As a consequence, the limiting SDE governing the evolution of the slow process

X_{1, s}^{ϵ}

—in other words: the forward part of (18)—has the coefficients

\bar{A} = A_{11} - A_{12} A_{22}^{- 1} A_{21}, \bar{C} = C_{1} - A_{12} A_{22}^{- 1} C_{2},

(22)

which follows from standard homogenisation arguments [49]; a formal derivation is given in Appendix A. By a similar reasoning we find that the driver of the limiting backward SDE reads

\bar{f} (x_{1}, y, z_{1}) = \int_{R^{n_{f}}} f ((x_{1}, x_{2}), y, (z_{1}, 0)) π (d x_{2}),

(23)

specifically,

\bar{f} (x_{1}, y, z_{1}) = \frac{1}{2} x_{1}^{⊤} {\bar{Q}}_{0} x_{1} - \frac{1}{2} {| (x_{1}^{⊤} {\bar{N}}^{⊤} + {\bar{B}}^{⊤}) z_{1} |}^{2} + K_{0},

(24)

with

{\bar{Q}}_{0} = Q_{0}, \bar{N} = C_{1}^{♯} N_{11}, \bar{B} = C_{1}^{♯} (B_{1} + N_{12} Σ^{1 / 2}) .

(25)

The limiting backward SDE is equipped with a terminal condition

{\bar{q}}_{1}

that equals

q_{1}

, namely,

{\bar{q}}_{1} (x_{1}) = \frac{1}{2} x_{1}^{⊤} Q_{1} x_{1} .

(26)
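The forward coefficients in (22) are Schur-complement type expressions and can be computed without forming the inverse of A_22 explicitly. A minimal sketch, with the function name `reduced_forward_coefficients` chosen for illustration only:

```python
import numpy as np

def reduced_forward_coefficients(A11, A12, A21, A22, C1, C2):
    """Return (A_bar, C_bar) from (22) using a single linear solve with A22."""
    ns = A21.shape[1]
    X = np.linalg.solve(A22, np.hstack([A21, C2]))   # A22^{-1} [A21 | C2]
    A_bar = A11 - A12 @ X[:, :ns]
    C_bar = C1 - A12 @ X[:, ns:]
    return A_bar, C_bar
```

For instance, with scalar blocks A_11 = -1, A_12 = 1, A_21 = 1, A_22 = -2, C_1 = 1 and C_2 = 2, formula (22) gives \bar{A} = -1/2 and \bar{C} = 2.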

3.1. Interpretation as an Optimal Control Problem

It is possible to interpret the reduced FBSDE again as the probabilistic version of a dynamic programming equation. To this end, note that (10) implies that the matrix pair

(\bar{A}, \bar{C})

satisfies the Kalman rank condition [50]

rank (\bar{C} | \bar{A} \bar{C} | {\bar{A}}^{2} \bar{C} | \dots | {\bar{A}}^{n_{s} - 1} \bar{C}) = n_{s} .

As a consequence, the semilinear partial differential equation

- \frac{\partial V}{\partial t} = \bar{L} V + \bar{f} (x_{1}, V, {\bar{C}}^{⊤} \nabla V), V |_{E_{s}^{+}} = {\bar{q}}_{1},

(27)

with

E_{s}^{+} = ([0, T) \times \partial D_{s}) \cup ({T} \times D_{s})

and

\bar{L} = \frac{1}{2} \bar{C} {\bar{C}}^{⊤} : \nabla^{2} + (\bar{A} x_{1}) \cdot \nabla

(28)

has a classical solution

V \in C^{1, 2} ([0, T) \times D_{s}) \cap C^{0, 1} (E_{s}^{+})

. Letting

{\bar{Y}}_{s} : = V (s, {\bar{X}}_{s})

,

0 ⩽ t ⩽ s ⩽ τ

, with initial data

{\bar{X}}_{t} = x_{1}

and

{\bar{Z}}_{s} = {\bar{C}}^{⊤} \nabla V (s, {\bar{X}}_{s})

, the limiting FBSDE (18) can be readily seen to be equivalent to (27). The latter is the dynamic programming equation of the following LQ optimal control problem: minimise the cost functional

\bar{J} (v; t, x_{1}) = E [\frac{1}{2} \int_{t}^{τ} ({\bar{X}}_{s}^{⊤} {\bar{Q}}_{0} {\bar{X}}_{s} + {| v_{s} |}^{2}) d s + \frac{1}{2} {\bar{X}}_{τ}^{⊤} {\bar{Q}}_{1} {\bar{X}}_{τ}],

(29)

subject to

d {\bar{X}}_{s} = \bar{A} {\bar{X}}_{s} d s + (\bar{M} {\bar{X}}_{s} + \bar{D}) v_{s} d s + \bar{C} d w_{s}, {\bar{X}}_{t} = x_{1},

(30)

where

{(w_{s})}_{s \geq 0}

denotes standard Brownian motion in

R^{n_{s}}

and we have introduced the new control coefficients

\bar{M} = \bar{C} \bar{N}

and

\bar{D} = \bar{C} \bar{B}

.

3.2. Convergence of the Control Value

Before we state our main result and discuss its implications for the model reduction of linear and bilinear systems, we recall the basic assumptions that we impose on the system dynamics. Specifically, we say that the dynamics (3) and the corresponding cost functional (8) satisfy Condition LQ if the following holds:

1. $(A, C)$ is controllable, and the range of $b (x) = N x + B$ is a subspace of $range (C)$.
2. The matrix $A_{22}$ is Hurwitz (i.e., its spectrum lies entirely in the open left complex half-plane) and the matrix pair $(A_{22}, C_{2})$ is controllable.
3. The driver of the FBSDE (15) is continuous and quadratically growing in Z.
4. The terminal condition in (15) is bounded; for simplicity we set $Q_{1} = 0$ in (8).

Assumption 2 implies that the fast subsystem (19) has a unique Gaussian invariant measure

π = N (0, Σ)

with full topological support, i.e., we have

Σ = Σ^{⊤} > 0

. According to ([51], Prop. 3.1) and [33], existence and uniqueness of (15) is guaranteed by Assumptions 3 and 4 and the controllability of

(A, C)

and the range condition, which imply that the transition probability densities of the (controlled or uncontrolled) forward process

X_{s}^{ϵ}

are smooth and strictly positive. As a consequence of the complete controllability of the original system, the reduced system (30) is completely controllable too, which guarantees existence and uniqueness of a classical solution of the limiting dynamic programming Equation (27); see, e.g., [52].

Uniform convergence of the value function

V^{ϵ} \to V

is now entailed by the strong convergence of the solution to the corresponding FBSDE, as expressed by the following theorem.

Theorem 1.

Let the assumptions of Condition LQ hold. Further let

V^{ϵ}

be the classical solution of the dynamic programming Equation (11) and V be the solution of (27). Then

V^{ϵ} \to V,

uniformly on all compact subsets of

[0, T] \times D

.

The proof of the Theorem is given in Appendix A.2. For the reader’s convenience, we present a formal derivation of the limit equation in the next subsection.

3.3. Formal Derivation of the Limiting FBSDE

Our derivation of the limit FBSDE follows standard homogenisation arguments (see [49,53,54]), taking advantage of the fact that the FBSDE is uncoupled. To this end we consider the following linear evolution equation

(\frac{\partial}{\partial t} - L^{ϵ}) ϕ^{ϵ} = 0, ϕ^{ϵ} (x_{1}, x_{2}, 0) = g (x_{1})

(31)

for a function

ϕ^{ϵ} : {\bar{D}}_{s} \times R^{n_{f}} \times [0, T] \to R

where

L^{ϵ} = \frac{1}{ϵ} L_{0} + \frac{1}{\sqrt{ϵ}} L_{1} + L_{2},

(32)

with

\begin{matrix} L_{0} & = \frac{1}{2} C_{2} C_{2}^{⊤} : \nabla_{x_{2}}^{2} + (A_{22} x_{2}) \cdot \nabla_{x_{2}} \end{matrix}

(33)

\begin{matrix} L_{1} & = \frac{1}{2} C_{1} C_{2}^{⊤} : \nabla_{x_{2} x_{1}}^{2} + \frac{1}{2} C_{2} C_{1}^{⊤} : \nabla_{x_{1} x_{2}}^{2} + (A_{12} x_{2}) \cdot \nabla_{x_{1}} + (A_{21} x_{1}) \cdot \nabla_{x_{2}} \end{matrix}

(34)

\begin{matrix} L_{2} & = \frac{1}{2} C_{1} C_{1}^{⊤} : \nabla_{x_{1}}^{2} + (A_{11} x_{1}) \cdot \nabla_{x_{1}} \end{matrix}

(35)

is the generator associated with the control-free forward process

X_{s}^{ϵ}

in (15). We follow the standard procedure of [49] and consider the perturbative expansion

ϕ^{ϵ} = ϕ_{0} + \sqrt{ϵ} ϕ_{1} + ϵ ϕ_{2} + \dots

that we insert into the Kolmogorov Equation (31). Equating different powers of

ϵ

we find a hierarchy of equations, the first three of which read

L_{0} ϕ_{0} = 0, L_{0} ϕ_{1} = - L_{1} ϕ_{0}, L_{0} ϕ_{2} = \frac{\partial ϕ_{0}}{\partial t} - L_{1} ϕ_{1} - L_{2} ϕ_{0} .

(36)
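For completeness, the hierarchy can be read off by inserting the expansion into (31) with the generator (32) and collecting powers of ϵ; the following display is a worked version of that step and introduces nothing beyond (31)–(32):

```latex
\begin{align*}
0 &= \Bigl(\frac{\partial}{\partial t} - \frac{1}{\epsilon} L_0
      - \frac{1}{\sqrt{\epsilon}} L_1 - L_2\Bigr)
     \bigl(\phi_0 + \sqrt{\epsilon}\,\phi_1 + \epsilon\,\phi_2 + \dots\bigr) \\
  &= -\frac{1}{\epsilon}\, L_0 \phi_0
     - \frac{1}{\sqrt{\epsilon}}\, \bigl(L_0 \phi_1 + L_1 \phi_0\bigr)
     + \Bigl(\frac{\partial \phi_0}{\partial t} - L_0 \phi_2 - L_1 \phi_1 - L_2 \phi_0\Bigr)
     + O(\sqrt{\epsilon}) .
\end{align*}
```

Requiring the coefficients of ϵ^{-1}, ϵ^{-1/2} and ϵ^{0} to vanish separately gives the three equations in (36).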

Assumption 2 of Condition LQ implies that

L_{0}

has a one-dimensional nullspace that is spanned by functions that are constant in

x_{2}

, and thus the first of the three equations implies that

ϕ_{0}

is independent of

x_{2}

. Hence the second equation—the cell problem—reads

L_{0} ϕ_{1} = - (A_{12} x_{2}) \cdot \nabla ϕ_{0} (x_{1}, t) .

(37)

Equation (37) has a solution by the Fredholm alternative, since the right hand side averages to zero under the invariant measure

π

of the fast dynamics that is generated by the operator

L_{0}

, in other words, the right hand side of the linear equation is orthogonal to the nullspace of

L_{0}^{*}

spanned by the density of

π

. Here

L_{0}^{*}

is the formal

L^{2}

adjoint of the operator

L_{0}

, defined on a suitable dense subspace of

L^{2}

. The form of the equation suggests the general ansatz

ϕ_{1} = ψ (x_{2}) \cdot \nabla ϕ_{0} (x_{1}, t) + R (x_{1}, t)

where the function R plays no role in what follows, so we set it equal to zero. Since

L_{0} ψ = - {(A_{12} x_{2})}^{⊤}

, the function

ψ

must be of the form

ψ = Q x_{2}

with a matrix

Q \in R^{n_{s} \times n_{f}}

. Hence

Q = - A_{12} A_{22}^{- 1} .

Now, solvability of the last of the three equations of (36) requires again that the right hand side averages to zero under

π

, i.e.,

\int_{R^{n_{f}}} (\frac{\partial ϕ}{\partial t} + L_{1} [(A_{12} A_{22}^{- 1} x_{2}) \cdot \nabla ϕ] - L_{2} ϕ) π (d x_{2}) = 0,

(38)

which formally yields the limiting equation for

ϕ = ϕ_{0} (x_{1}, t)

. Since

π

is a Gaussian measure with mean 0 and covariance

Σ

given by (20), the integral (38) can be explicitly computed:

(\frac{\partial}{\partial t} - \bar{L}) ϕ = 0, ϕ (x_{1}, 0) = g (x_{1}),

(39)

where

\bar{L}

is given by (28) and the initial condition

ϕ (\cdot, 0) = g

is a consequence of the fact that the initial condition in (31) is independent of

ϵ

. By the controllability of the pair

(\bar{A}, \bar{C})

, the limiting Equation (39) has a unique classical solution and uniform convergence

ϕ^{ϵ} \to ϕ

is guaranteed by standard results, e.g., ([49], Thm. 20.1).

Since the backward part of (15) is uniformly bounded in

ϵ

, the final form of the homogenised FBSDE (18) is found by averaging over

x_{2}

, with the unique solution of the corresponding backward SDE satisfying

Z_{2, s} = 0

as the averaged backward process is independent of

x_{2}

.

4. Numerical Studies

In this section, we present numerical results for linear and bilinear control systems and discuss the numerical discretisation of uncoupled FBSDE associated with LQ stochastic control problems. We begin with the latter.

4.1. Numerical FBSDE Discretisation

The fact that (15) and (18) are uncoupled entails that they can be discretised by an explicit time-stepping algorithm. Here we utilise a variant of the least-squares Monte Carlo algorithm proposed in [41]; see also [55]. The convergence of numerical schemes for FBSDEs with quadratic nonlinearities in the driver has been analysed in [56].

The least-squares Monte Carlo scheme is based on the Euler discretisation of (15):

\begin{matrix} {\hat{X}}_{n + 1} & = {\hat{X}}_{n} + Δ t A {\hat{X}}_{n} + \sqrt{Δ t} C ξ_{n + 1} \\ {\hat{Y}}_{n + 1} & = {\hat{Y}}_{n} - Δ t f ({\hat{X}}_{n}, {\hat{Y}}_{n}, {\hat{Z}}_{n}) + \sqrt{Δ t} {\hat{Z}}_{n} \cdot ξ_{n + 1} \end{matrix}

(40)

where

({\hat{X}}_{n}, {\hat{Y}}_{n})

denotes the numerical discretisation of the joint process

(X_{s}^{ϵ}, Y_{s}^{ϵ})

, where we set

X_{s}^{ϵ} = X_{τ_{D}}^{ϵ}

for

s \in (τ_{D}, T]

when

τ_{D} < T

, and

{(ξ_{k})}_{k \geq 1}

is an i.i.d. sequence of normalised Gaussian random variables. Now let

F_{n} = σ (\{{\hat{W}}_{k} : 0 ⩽ k ⩽ n\})

be the

σ

-algebra generated by the discrete Brownian motion

{\hat{W}}_{n} : = \sqrt{Δ t} \sum_{i ⩽ n} ξ_{i}

. By definition the joint process

(X_{s}^{ϵ}, Y_{s}^{ϵ})

is adapted to the filtration generated by

{(W_{r})}_{0 ⩽ r ⩽ s}

, therefore

{\hat{Y}}_{n} = E [{\hat{Y}}_{n} | F_{n}] = E [{\hat{Y}}_{n + 1} + Δ t f ({\hat{X}}_{n}, {\hat{Y}}_{n}, {\hat{Z}}_{n}) | F_{n}],

(41)

where we have used that

{\hat{Z}}_{n}

is independent of

ξ_{n + 1}

. To compute

{\hat{Y}}_{n}

from

{\hat{Y}}_{n + 1}

we use the identification of

Z_{s}^{ϵ}

with

C^{⊤} \nabla V^{ϵ} (s, X_{s}^{ϵ})

and replace

{\hat{Z}}_{n}

in (41) by

{\hat{Z}}_{n} = C^{⊤} \nabla V^{ϵ} (t_{n}, {\hat{X}}_{n}),

(42)

and the parametric ansatz (44) for

V^{ϵ}

makes the overall scheme explicit in

{\hat{X}}_{n}

and

{\hat{Y}}_{n}

.

4.2. Least-Squares Solution of the Backward SDE

To evaluate the conditional expectation

{\hat{Y}}_{n} = E [\cdot | F_{n}]

we recall that a conditional expectation can be characterised as the solution to the following quadratic minimisation problem:

E [S | F_{n}] = \underset{Y \in L^{2}, F_{n}-measurable}{argmin} E [{| Y - S |}^{2}] .
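The characterisation of the conditional expectation as a least-squares minimiser can be illustrated on synthetic data. In the sketch below, the toy model S = X^2 + noise and the monomial basis are illustrative assumptions; by construction E[S | X] = X^2, so the fitted coefficients should be close to (0, 0, 1) up to Monte Carlo error.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 50_000
X = rng.standard_normal(M)
S = X**2 + rng.standard_normal(M)     # E[S | X] = X^2 by construction

# Restrict the minimisation to functions of X spanned by 1, x, x^2.
Phi = np.stack([np.ones(M), X, X**2], axis=1)
coef, *_ = np.linalg.lstsq(Phi, S, rcond=None)

cond_exp = Phi @ coef                 # sample-wise estimate of E[S | X]
```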

Given M independent realisations

{\hat{X}}_{n}^{(i)}

,

i = 1, \dots, M

of the forward process

{\hat{X}}_{n}

, this suggests the approximation scheme

{\hat{Y}}_{n} \approx \underset{Y = Y ({\hat{X}}_{n})}{argmin} \frac{1}{M} \sum_{i = 1}^{M} | Y - {\hat{Y}}_{n + 1}^{(i)} - Δ t f ({\hat{X}}_{n}^{(i)}, {\hat{Y}}_{n + 1}^{(i)}, C^{⊤} \nabla {\hat{Y}}_{n + 1}^{(i)}) |^{2},

(43)

where

{\hat{Y}}^{(i)}

is defined by

{\hat{Y}}^{(i)} = Y ({\hat{X}}^{(i)})

with terminal values

{\hat{Y}}_{N}^{(i)} = q_{1} ({\hat{X}}_{N}^{(i)}), τ = N Δ t .

(Please note that

N = N_{D}

is random.) For simplicity, we assume in what follows that the terminal value is zero, i.e., we set

q_{1} = 0

. (Recall that the existence and uniqueness result from [33] requires

q_{1}

to be bounded.) To represent

{\hat{Y}}_{n}

as a function

Y ({\hat{X}}_{n})

we use the ansatz

Y ({\hat{X}}_{n}) = \sum_{k = 1}^{K} α_{k} (n) φ_{k} ({\hat{X}}_{n}),

(44)

with coefficients

α_{1} (\cdot), \dots, α_{K} (\cdot) \in R

and suitable basis functions

φ_{1}, \dots, φ_{K} : R^{n} \to R

(e.g., Gaussians). Please note that the coefficients

α_{k}

are the unknowns in the least-squares problem (43) and thus are independent of the realisation. Now the least-squares problem that has to be solved in the n-th step of the backward iteration is of the form

\hat{α} (n) = \underset{α \in R^{K}}{argmin} {∥A_{n} α - b_{n}∥}^{2},

(45)

with coefficients

A_{n} = {(φ_{k} ({\hat{X}}_{n}^{(i)}))}_{i = 1, \dots, M; k = 1, \dots, K}

(46)

and data

b_{n} = {({\hat{Y}}_{n + 1}^{(i)} + Δ t f ({\hat{X}}_{n}^{(i)}, {\hat{Y}}_{n + 1}^{(i)}, C^{⊤} \nabla {\hat{Y}}_{n + 1}^{(i)}))}_{i = 1, \dots, M} .

(47)

If the coefficient matrix

A_{n} \in R^{M \times K}

,

K ⩽ M

defined by (46) has maximum rank K, then the solution to the least-squares problem (45) is given by

\hat{α} (n) = {(A_{n}^{⊤} A_{n})}^{- 1} A_{n}^{⊤} b_{n} .

(48)

The scheme thus defined is strongly convergent of order 1/2 as

Δ t \to 0

and

M, K \to \infty

as has been analysed by [41]. Controlling the approximation quality for finite values

Δ t, M, K

, however, requires a careful adjustment of the simulation parameters and appropriate basis functions, especially with regard to the condition number of the matrix

A_{n}

.
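One step of the backward iteration (45)–(48) might look as follows. This is a sketch under stated assumptions: the helper names `gaussian_basis` and `regression_step` are hypothetical, the data vector is synthetic, and `numpy.linalg.lstsq` is used in place of the explicit normal equations (48) because an SVD-based solver is less sensitive to the condition number of the coefficient matrix.

```python
import numpy as np

def gaussian_basis(x, centres, delta):
    """Matrix A_n of (46): entry (i, k) is exp(-|mu_k - x_i|^2 / (2 delta))."""
    sq_dist = ((x[:, None, :] - centres[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dist / (2.0 * delta))

def regression_step(A_n, b_n):
    """Solve (45) for alpha(n) and return the fitted values of Y_n."""
    alpha, *_ = np.linalg.lstsq(A_n, b_n, rcond=None)
    return alpha, A_n @ alpha

rng = np.random.default_rng(2)
x = rng.standard_normal((400, 1))              # M = 400 toy samples of X_n
centres = np.linspace(-2.0, 2.0, 9)[:, None]   # K = 9 centres mu_k
A_n = gaussian_basis(x, centres, delta=0.5)
alpha_true = rng.standard_normal(9)
b_n = A_n @ alpha_true                         # synthetic data in the span of the basis
alpha, y_n = regression_step(A_n, b_n)
cond = np.linalg.cond(A_n)                     # diagnostic for ill-conditioning
```

Monitoring `cond` at every time step is one simple way to detect when the basis functions become nearly linearly dependent on the sampled points.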

4.3. Numerical Example

To illustrate the theoretical findings of Theorem 1, we consider a linear system of the form (7), where the matrices

A, B

and C are given by

A = (\begin{matrix} 0 & ϵ^{- 1 / 2} I_{n \times n} \\ - ϵ^{- 1 / 2} I_{n \times n} & - γ ϵ^{- 1} I_{n \times n} \end{matrix}) \in R^{2 n \times 2 n},

and

B = C = (\begin{matrix} 0 \\ σ ϵ^{- 1 / 2} I_{n \times n} \end{matrix}) \in R^{2 n \times n} .

This is an instance of a controlled Langevin equation with friction and noise coefficients

γ, σ > 0

which are assumed to fulfil the fluctuation-dissipation relation

2 γ = σ^{2} .

In the example we let

γ = 1 / 2

and

σ = 1

. The quadratic cost functional (8) is determined by the running cost via

Q_{0} = I_{n \times n} \in R^{n \times n}

and we apply no terminal cost, i.e.,

Q_{1} = 0 .

The associated effective equations are given by (29)–(30), where

\bar{A} = - γ^{- 1} I_{n \times n}, \bar{D} = \bar{C} = σ γ^{- 1}, \bar{M} = 0, {\bar{Q}}_{0} = I_{n \times n}, {\bar{Q}}_{1} = 0 \in R^{n \times n} .
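The coefficients of the example can be assembled and cross-checked numerically. In the sketch below the function name `langevin_matrices` is hypothetical, and the Schur-complement expressions for the effective drift and noise coefficients are an assumption about the general form of the homogenised coefficients (Equations (29)–(30) are not reproduced here); the check only confirms that this assumed form yields the values stated above.

```python
import numpy as np

def langevin_matrices(n, eps, gamma, sigma):
    """Full-system matrices A (2n x 2n) and B = C (2n x n) of the example."""
    I, Z = np.eye(n), np.zeros((n, n))
    A = np.block([[Z, I / np.sqrt(eps)],
                  [-I / np.sqrt(eps), -gamma * I / eps]])
    C = np.vstack([Z, sigma * I / np.sqrt(eps)])
    return A, C

n, gamma, sigma, eps = 3, 0.5, 1.0, 1e-2
A, C = langevin_matrices(n, eps, gamma, sigma)

# Unscaled (epsilon-independent) blocks, read off from A and C.
A11, A12 = A[:n, :n], np.sqrt(eps) * A[:n, n:]
A21, A22 = np.sqrt(eps) * A[n:, :n], eps * A[n:, n:]
C2 = np.sqrt(eps) * C[n:, :]

# Assumed Schur-complement form of the effective coefficients.
A_bar = A11 - A12 @ np.linalg.solve(A22, A21)
C_bar = -A12 @ np.linalg.solve(A22, C2)
```

For γ = 1/2 and σ = 1 this gives an effective drift matrix of −2 I and an effective noise coefficient of 2 I, in agreement with the expressions for the reduced system.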

We apply the previously described FBSDE scheme (40), (44)–(48), which was shown to yield good results in [57], to both the full and the reduced system, and we choose

n = 3

, i.e., the full system is six dimensional. To this end we choose the basis functions

ϕ_{k, n}^{μ_{k}, δ} (x) = exp (- \frac{{(μ_{k} - x)}^{2}}{2 δ})

where

δ = 0.1

is fixed but

μ_{k} = μ_{k} (n)

changes in each timestep such that the basis follows the forward process. For this, we simulate K additional forward trajectories

X^{(k)}, k = 1, \dots, K

and set

μ_{k} (n) = X_{n}^{(k)}

.
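The construction of a basis that follows the forward process can be sketched as below. The code is illustrative only: the helper names `euler_paths` and `design_matrix`, the initial condition at the origin, the shortened time horizon and the seed are assumptions; only the reduced coefficients, δ = 0.1, M = 400 and K = 9 are taken from the example.

```python
import numpy as np

rng = np.random.default_rng(3)
dt, n_steps, d = 1e-3, 100, 3
M, K, delta = 400, 9, 0.1
A_bar = -2.0 * np.eye(d)   # reduced drift for gamma = 1/2
C_bar = 2.0 * np.eye(d)    # reduced noise coefficient for sigma = 1, gamma = 1/2

def euler_paths(n_paths):
    """Euler-Maruyama paths of the reduced forward SDE, started at the origin."""
    X = np.zeros((n_steps + 1, n_paths, d))
    for n in range(n_steps):
        xi = rng.standard_normal((n_paths, d))
        X[n + 1] = X[n] + dt * X[n] @ A_bar.T + np.sqrt(dt) * xi @ C_bar.T
    return X

X = euler_paths(M)    # M realisations used in the regression
mu = euler_paths(K)   # K additional trajectories providing the centres mu_k(n)

def design_matrix(n):
    """A_n with Gaussian basis functions centred at mu_k(n) = X_n^{(k)}."""
    sq = ((X[n][:, None, :] - mu[n][None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq / (2.0 * delta))

A_50 = design_matrix(50)
```

Because the centres are themselves samples of the forward process, they concentrate where the regression points lie, which is the motivation for letting the basis move with time.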

We choose the parameters for the numerics as follows. The number of basis functions K is given by

K = 9

for the reduced system and

K^{ϵ} = 40

for the full system. We choose these values because the maximally observed rank of the coefficient matrices

A_{n}

defined in (46) is 9 for the reduced system, and the matrices should not be rank deficient. For the full system we could have used a greater value of K, but we want to keep the computational effort reasonable. Further, we choose

Δ t = 5 \times 10^{- 5}

, the final time

T = 0.5

and the number of realisations

M = 400 .

We let the whole algorithm run five times and compute the distance between the value functions of the full and reduced systems

E (ϵ) : = | V^{ϵ} (0, x) - V (0, x) |

for which convergence of order

1 / 2

was found in the proof of Theorem 1. Indeed, we observe convergence of order

1 / 2

in our numerical example as can be seen in Figure 1 where we depict the mean and standard deviation of

E (ϵ)

.
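An observed convergence order can be read off as the slope of a straight-line fit on a doubly logarithmic scale. The following sketch shows the computation on synthetic placeholder data; the numbers are not results from the experiments, and the function name `empirical_order` is hypothetical.

```python
import numpy as np

def empirical_order(eps, err):
    """Slope of log(err) against log(eps), i.e. the observed convergence order."""
    slope, _ = np.polyfit(np.log(eps), np.log(err), 1)
    return slope

# Synthetic placeholder data, NOT results from the paper: err = 0.3 * sqrt(eps).
eps = np.array([1e-1, 1e-2, 1e-3, 1e-4])
err = 0.3 * np.sqrt(eps)
order = empirical_order(eps, err)
```

With measured errors in place of the synthetic ones, a slope near 1/2 would be consistent with the rate predicted by Theorem 1.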

4.4. Discussion

We now discuss the implications of the above simple example for more complicated dynamical systems. As a general remark, the results show that it is possible to apply model reduction before solving the corresponding optimal control problem, where the control variable in the original equation can simply be treated as a parameter. This is in accordance with the general model reduction strategy in control engineering; see, e.g., [12,13] and the references therein. Our results not only guarantee convergence of the value function via convergence of

Y^{ϵ}

, but they also imply strong convergence of the optimal control, by the convergence of the control process

Z^{ϵ}

in

L^{2}

. (See Appendix A for details.) This means that, for a system with time scale separation, we can resort to the reduced system to find the optimal control, which can then be applied to the full system's dynamics.

We stress that our results carry over to fully nonlinear stochastic control problems which have a similar LQ structure [24]. Clearly, for realistic (i.e., high-dimensional or nonlinear) systems the identification of a small parameter

ϵ

remains challenging, and one has to resort to e.g., semi-empirical approaches, such as [58,59].

If the dynamics is linear, as is the case here, small parameters may be identified using system theoretic arguments based on balancing transformations (see, e.g., [22,45]). These approaches require that the dynamics is either linear or bilinear in the state variables, but the aforementioned duality for the quasi-linear dynamic programming equation can be used here as well in order to change the drift of the forward SDE from some nonlinear vector field b to a linear vector field, say,

b_{0} = A x

. Assuming that the noise coefficient C is square and invertible and ignoring

ϵ

and the boundary condition for the moment, it is easy to see that the dynamic programming PDE (11) can be recast as

\begin{matrix} - \frac{\partial V^{ϵ}}{\partial t} = \tilde{L} V^{ϵ} + \tilde{f} (x, V^{ϵ}, C^{⊤} \nabla_{x} V^{ϵ}) . \end{matrix}

Here

\tilde{L} = \frac{1}{2} C C^{⊤} : \nabla^{2} + b (x) \cdot \nabla

is the generator of a forward SDE with nonlinear drift b, and

\tilde{f} (x, y, z) = f (x, y, z) + C^{- 1} (A x - b (x)) \cdot z

is the driver of the corresponding backward SDE. Even though the change of drift is somewhat arbitrary, it shows that by changing the driver in the backward SDE it is possible to reduce the control problem to one with linear drift that falls within the category that is considered in this paper, at the expense of having a possibly non-quadratic cost functional.
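The change of drift can be verified in a few lines. The derivation below is a sketch that assumes, with ϵ and the boundary condition ignored, that the generator in (11) has the form of a second-order operator with linear drift A x, denoted L here, and that C is invertible; the only identity used is (A x − b) · ∇V = C^{-1}(A x − b) · C^{⊤}∇V.

```latex
\begin{aligned}
L V + f(x, V, C^{\top}\nabla V)
  &= \tfrac{1}{2} C C^{\top} : \nabla^{2} V + A x \cdot \nabla V
     + f(x, V, C^{\top}\nabla V) \\
  &= \tfrac{1}{2} C C^{\top} : \nabla^{2} V + b(x) \cdot \nabla V
     + (A x - b(x)) \cdot \nabla V + f(x, V, C^{\top}\nabla V) \\
  &= \tilde{L} V + C^{-1}(A x - b(x)) \cdot C^{\top}\nabla V
     + f(x, V, C^{\top}\nabla V) \\
  &= \tilde{L} V + \tilde{f}(x, V, C^{\top}\nabla V).
\end{aligned}
```

The second-order part is untouched, so the two formulations share the same noise coefficient and differ only in how the first-order term is split between the generator and the driver.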

Remark 3.

Changing the drift may be advantageous in connection with the numerical FBSDE solver. In the martingale basis approach of Bender and Steiner [41], the authors suggest using basis functions that are defined as conditional expectations of certain linearly independent candidate functions over the forward process, which makes the basis functions martingales. Computing the martingale basis, however, comes with a large computational overhead, which is why the authors consider only cases in which the conditional expectations can be computed analytically. Changing the drift of the forward SDE may thus be used to simplify the forward dynamics so that its distribution becomes analytically tractable.

5. Conclusions and Outlook

We have given a proof of concept that model reduction methods for singularly perturbed bilinear control systems can be applied to the dynamics before solving the corresponding optimal control problem. The key idea here was to exploit the equivalence between the semi-linear dynamic programming PDE corresponding to our stochastic optimal control problem and a singularly perturbed forward-backward SDE which is decoupled. Using this equivalence, we could derive a reduced-order FBSDE which was then interpreted as the representation of a reduced-order stochastic control problem. We have proved uniform convergence of the corresponding value function and, as an auxiliary result, obtained a strong convergence result for the optimal control. As we have argued the latter implies that the optimal control computed from the reduced system can be used to control the original dynamics.

We have presented a numerical example to illustrate our findings and discussed the numerical discretisation of uncoupled FBSDEs, based on the computation of conditional expectations. For the latter, the choice of the basis functions played an essential role, and how to cleverly choose the ansatz functions, possibly exploiting that the forward SDE has an explicit solution (see e.g., [41]) is an important aspect that future research ought to address, especially with regard to high dimensional problems.

Another class of important problems not considered in this article are slow-fast systems with vanishing noise. The natural question here is how the limit equation depends on the order in which noise and time scale parameters go to zero. This question has important consequences for the associated deterministic control problem and its regularisation by noise. We leave this topic for future work.

Author Contributions

All authors have contributed equally to this work.

Funding

This research has been partially funded by Deutsche Forschungsgemeinschaft (DFG) through the Grant CRC 1114 “Scaling Cascades in Complex Systems”, Project A05 “Probing scales in equilibrated systems by optimal nonequilibrium forcing”. Omar Kebiri acknowledges funding from the EU-METALIC II Programme.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proofs and Technical Lemmas

The idea of the proof of Theorem 1 closely follows the work [60], with the main differences being (a) that we consider slow-fast systems exhibiting three time scales, in particular the slow equation contains singular

O (ϵ^{- 1 / 2})

terms, and (b) that the coefficients of the fast dynamics are not periodic, with the fast process being asymptotically Gaussian as

ϵ \to 0

; in particular the

n_{f}

-dimensional fast process lives on the unbounded domain

R^{n_{f}}

.

Appendix A.1. Poisson Equation Lemma

Theorem 1 rests on the following Lemma that is similar to a result in [61].

Lemma A1.

Suppose that the assumptions of Condition LQ on page 7 hold and define

h : [0, T] \times R^{n_{s}} \times R^{n_{f}} \to R

to be a function of the class

C_{b}^{1, 2, 2}

. Further assume that h is centred with respect to the invariant measure π of the fast process. Then for every

t \in [0, T]

and initial conditions

(X_{1, u}^{ϵ}, X_{2, u}^{ϵ}) = (x_{1}, x_{2}) \in R^{n_{s}} \times R^{n_{f}}

,

0 ⩽ u < t

, we have

lim_{ϵ \to 0} E [{(\int_{u}^{v} h (s, X_{1, s}^{ϵ}, X_{2, s}^{ϵ}) d s)}^{2}] = 0, 0 ⩽ u < v ⩽ t .

(A1)

Proof.

We remind the reader of the definition (33)–(35) of the differential operators

L_{0}, L_{1}

and

L_{2}

, and consider the Poisson equation

L_{0} ψ = - h

(A2)

on the domain

R^{n_{f}}

. (The variables

x_{1} \in R^{n_{s}}

and

t \in [0, T]

are considered as parameters.) Since h is centred with respect to

π

, Equation (A2) has a solution by the Fredholm alternative. By Assumption 2

L_{0}

is a hypoelliptic operator in

x_{2}

and thus by ([62], Thm. 2), the Poisson Equation (A2) has a unique solution that is smooth and bounded. Applying Itô’s formula to

ψ

and introducing the shorthand

δ ψ (u, v) = ψ (v, X_{1, v}^{ϵ}, X_{2, v}^{ϵ}) - ψ (u, x_{1}, x_{2})

yields

\begin{matrix} δ ψ (u, v) = & \int_{u}^{v} (\partial_{t} ψ + L_{2} ψ) (s, X_{1, s}^{ϵ}, X_{2, s}^{ϵ}) d s + \frac{1}{\sqrt{ϵ}} \int_{u}^{v} L_{1} ψ (s, X_{1, s}^{ϵ}, X_{2, s}^{ϵ}) d s \\ + \frac{1}{ϵ} \int_{u}^{v} L_{0} ψ (s, X_{1, s}^{ϵ}, X_{2, s}^{ϵ}) d s + M_{1} (u, v) + \frac{1}{\sqrt{ϵ}} M_{2} (u, v), \end{matrix}

(A3)

where

M_{1}

and

M_{2}

are square integrable martingales with respect to the natural filtration generated by the Brownian motion

W_{s}

:

\begin{matrix} M_{1} (u, v) & = \int_{u}^{v} (\nabla_{x_{1}} ψ) (s, X_{1, s}^{ϵ}, X_{2, s}^{ϵ}) \cdot C_{1} d W_{s} \\ M_{2} (u, v) & = \int_{u}^{v} (\nabla_{x_{2}} ψ) (s, X_{1, s}^{ϵ}, X_{2, s}^{ϵ}) \cdot C_{2} d W_{s} . \end{matrix}

(A4)

By the properties of the solution to (A2), the first three integrals on the right-hand side of (A3) are uniformly bounded in u and v, and thus, after substituting (A2) and multiplying by ϵ,

\begin{matrix} \int_{u}^{v} h (s, X_{1, s}^{ϵ}, X_{2, s}^{ϵ}) d s = & - ϵ δ ψ (u, v) + ϵ \int_{u}^{v} (\partial_{t} ψ + L_{2} ψ) (s, X_{1, s}^{ϵ}, X_{2, s}^{ϵ}) d s \\ + \sqrt{ϵ} \int_{u}^{v} L_{1} ψ (s, X_{1, s}^{ϵ}, X_{2, s}^{ϵ}) d s + ϵ M_{1} (u, v) + \sqrt{ϵ} M_{2} (u, v) . \end{matrix}

By the Itô isometry and the boundedness of the derivatives

\nabla_{x_{1}} ψ

and

\nabla_{x_{2}} ψ

, the martingale term can be bounded by

E [{(M_{i} (u, v))}^{2}] ⩽ C_{i} (v - u), 0 < C_{i} < \infty .

Hence

E [{(\int_{u}^{v} h (s, X_{1, s}^{ϵ}, X_{2, s}^{ϵ}) d s)}^{2}] ⩽ C ϵ,

with a generic constant

0 < C < \infty

that is independent of

u, v

and

ϵ

. ☐
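The last step of the proof can be spelled out as follows. This is a sketch that takes for granted the boundedness of ψ and of the time integrals stated above, uses the elementary inequality (a_1 + … + a_5)² ≤ 5 (a_1² + … + a_5²), and restricts to ϵ ≤ 1; the constant K is a hypothetical common bound.

```latex
\begin{aligned}
\mathbf{E}\Big[\Big(\int_u^v h \, ds\Big)^{2}\Big]
 &\le 5\Big( \epsilon^{2}\,\mathbf{E}\big[\delta\psi(u,v)^{2}\big]
   + \epsilon^{2}\,\mathbf{E}\Big[\Big(\int_u^v (\partial_t\psi + L_2\psi)\,ds\Big)^{2}\Big]
   + \epsilon\,\mathbf{E}\Big[\Big(\int_u^v L_1\psi\,ds\Big)^{2}\Big] \\
 &\qquad\; + \epsilon^{2}\,\mathbf{E}\big[M_1(u,v)^{2}\big]
   + \epsilon\,\mathbf{E}\big[M_2(u,v)^{2}\big] \Big) \\
 &\le 5\big( 3\epsilon^{2} K + 2\epsilon K \big) \;\le\; 25\,K\,\epsilon .
\end{aligned}
```

The rate is therefore dictated by the two terms that carry a prefactor of order √ϵ before squaring, namely the integral of L_1 ψ and the martingale M_2.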

Appendix A.2. Convergence of the Value Function

Lemma A2.

Suppose that Condition LQ from page 7 holds. Then

| V^{ϵ} (t, x) - V (t, x_{1}) | \leq C \sqrt{ϵ},

with

x = (x_{1}, x_{2}) \in D = D_{s} \times R^{n_{f}}

, where

V^{ϵ}

is the solution of the original dynamic programming Equation (11) and V is the solution of the limiting dynamic programming Equation (27). The constant C depends on x and t, but is finite on every compact subset of

D \times [0, T]

.

Proof.

The idea of the proof is to apply Itô’s formula to

| y_{s}^{ϵ} |^{2}

, where

y_{s}^{ϵ} = Y_{s}^{ϵ} - V (s, X_{1, s}^{ϵ})

satisfies the backward SDE

d y_{s}^{ϵ} = - G^{ϵ} (s, X_{1, s}^{ϵ}, X_{2, s}^{ϵ}, y_{s}^{ϵ}, z_{s}^{ϵ}) d s + z_{s}^{ϵ} \cdot d W_{s}

(A5)

where

z_{s}^{ϵ} = Z_{s}^{ϵ} - {({\bar{C}}^{⊤} \nabla V (s, X_{1, s}^{ϵ}), 0)}^{⊤} (\nabla V = \nabla_{x_{1}} V)

and

G^{ϵ} (t, x_{1}, x_{2}, y, z) = G_{1} (t, x_{1}, x_{2}, y, z) + G_{2}^{ϵ} (t, x_{1}, x_{2}, y, z),

with

\begin{matrix} G_{1} & = f (t, x, y + V (t, x_{1}), z + ({\bar{C}}^{⊤} \nabla V (t, x_{1}), 0)) - \bar{f} (t, x_{1}, V (t, x_{1}), {\bar{C}}^{⊤} \nabla V (t, x_{1})) \\ G_{2}^{ϵ} & = ((A_{11} - \bar{A}) x_{1} + \frac{1}{\sqrt{ϵ}} A_{12} x_{2}) \cdot \nabla V (t, x_{1}) + \frac{1}{2} (C_{1} C_{1}^{⊤} - \bar{C} {\bar{C}}^{⊤}) : \nabla^{2} V (t, x_{1}) . \end{matrix}

We set

X_{s}^{ϵ} = X_{τ_{D}}^{ϵ}

for

s \in (τ_{D}, T]

when

τ_{D} < T

. Then, by construction,

G_{1} (t, x, 0, 0)

,

x = (x_{1}, x_{2}) \in D_{s} \times R^{n_{f}}

is centred with respect to

π

and bounded (since the running cost is independent of

x_{2}

), therefore Lemma A1 implies that

sup_{t \in [0, T]} E [{(\int_{t}^{T} G_{1} (s, X_{1, s}^{ϵ}, X_{2, s}^{ϵ}, 0, 0) d s)}^{2}] ⩽ C_{1} ϵ,

(A6)

The second contribution to the driver can be recast as

G_{2}^{ϵ} = (L - \bar{L}) V

, with

L

and

\bar{L}

as given by (12) and (28) and thus, as

ϵ \to 0

,

sup_{t \in [0, T]} E [{(\int_{t}^{T} G_{2}^{ϵ} (s, X_{1, s}^{ϵ}, X_{2, s}^{ϵ}, 0, 0) d s)}^{2}] ⩽ C_{2} ϵ

(A7)

by the functional central limit theorem for diffusions with Lipschitz coefficients [53]; cf. also Section 3.3. As a consequence of (A6) and (A7), we have

G^{ϵ} \to 0

in

L^{2}

, which, since

E [| y_{T}^{ϵ} |^{2}] ⩽ C_{3} ϵ

, implies strong convergence of the solution of the corresponding backward SDE in

L^{2}

.

Specifically, since

\nabla V

is bounded on

{\bar{D}}_{s}

, Itô’s formula applied to

| y_{s}^{ϵ} |^{2}

, yields after an application of Gronwall’s Lemma:

\begin{matrix} E [sup_{t \leq s \leq T} | y_{s}^{ϵ} |^{2} + \int_{t}^{T} {| z_{s}^{ϵ} |}^{2} d s] ⩽ & ℓ_{D} E [{(\int_{t}^{T} G^{ϵ} (s, X_{1, s}^{ϵ}, X_{2, s}^{ϵ}, 0, 0) d s)}^{2}] + ℓ_{D} E [| y_{T}^{ϵ} |^{2}] \end{matrix}

where the Lipschitz constant

ℓ_{D}

is independent of

ϵ

and finite for every compact subset

{\bar{D}}_{s} \subset R^{n_{s}}

by the boundedness of

\nabla V

(since V is a classical solution and

D_{s}

is bounded). Hence

E [| y_{s}^{ϵ} |^{2}] \leq C_{3} ϵ

uniformly for

s \in [t, T]

, and by setting

s = t

, we obtain

| y_{t}^{ϵ} | = | V^{ϵ} (t, x) - V (t, x_{1}) | \leq C \sqrt{ϵ}

for a constant

C \in (0, \infty)

. ☐

This proves Theorem 1.

References

1. Fleming, W.H.; Mete Soner, H. Controlled Markov Processes and Viscosity Solutions, 2nd ed.; Springer: New York, NY, USA, 2006.
2. Stengel, R.F. Optimal Control and Estimation; Dover Books on Advanced Mathematics; Dover Publications: New York, NY, USA, 1994.
3. Dupuis, P.; Spiliopoulos, K.; Wang, H. Importance sampling for multiscale diffusions. Multiscale Model. Simul. 2012, 10, 1–27.
4. Dupuis, P.; Wang, H. Importance sampling, large deviations, and differential games. Stoch. Rep. 2004, 76, 481–508.
5. Davis, M.H.; Norman, A.R. Portfolio selection with transaction costs. Math. Oper. Res. 1990, 15, 676–713.
6. Pham, H. Continuous-Time Stochastic Control and Optimization with Financial Applications; Springer: Berlin/Heidelberg, Germany, 2009.
7. Hartmann, C.; Schütte, C. Efficient rare event simulation by optimal nonequilibrium forcing. J. Stat. Mech. Theor. Exp. 2012, 2012, 11004.
8. Schütte, C.; Winkelmann, S.; Hartmann, C. Optimal control of molecular dynamics using Markov state models. Math. Program. Ser. B 2012, 134, 259–282.
9. Asplund, E.; Klüner, T. Optimal control of open quantum systems applied to the photochemistry of surfaces. Phys. Rev. Lett. 2011, 106, 140404.
10. Steinbrecher, A. Optimal Control of Robot Guided Laser Material Treatment. In Progress in Industrial Mathematics at ECMI 2008; Fitt, A.D., Norbury, J., Ockendon, H., Wilson, E., Eds.; Springer: Berlin/Heidelberg, Germany, 2010; pp. 501–511.
11. Zhang, W.; Wang, H.; Hartmann, C.; Weber, M.; Schütte, C. Applications of the cross-entropy method to importance sampling and optimal control of diffusions. SIAM J. Sci. Comput. 2014, 36, A2654–A2672.
12. Antoulas, A.C. Approximation of Large-Scale Dynamical Systems; SIAM: Philadelphia, PA, USA, 2005.
13. Baur, U.; Benner, P.; Feng, L. Model order reduction for linear and nonlinear systems: A system-theoretic perspective. Arch. Comput. Meth. Eng. 2014, 21, 331–358.
14. Bensoussan, A.; Blankenship, G. Singular perturbations in stochastic control. In Singular Perturbations and Asymptotic Analysis in Control Systems; Lecture Notes in Control and Information Sciences; Kokotovic, P.V., Bensoussan, A., Blankenship, G.L., Eds.; Springer: Berlin/Heidelberg, Germany, 1987; Volume 90, pp. 171–260.
15. Evans, L.C. The perturbed test function method for viscosity solutions of nonlinear PDE. Proc. R. Soc. Edinb. A 1989, 111, 359–375.
16. Buckdahn, R.; Hu, Y. Probabilistic approach to homogenizations of systems of quasilinear parabolic PDEs with periodic structures. Nonlinear Anal. 1998, 32, 609–619.
17. Ichihara, N. A stochastic representation for fully nonlinear PDEs and its application to homogenization. J. Math. Sci. Univ. Tokyo 2005, 12, 467–492.
18. Kushner, H.J. Weak Convergence Methods and Singularly Perturbed Stochastic Control and Filtering Problems; Birkhäuser: Boston, MA, USA, 1990.
19. Kurtz, T.; Stockbridge, R.H. Stationary solutions and forward equations for controlled and singular martingale problems. Electron. J. Probab. 2001, 6, 5.
20. Kabanov, Y.; Pergamenshchikov, S. Two-Scale Stochastic Systems: Asymptotic Analysis and Control; Springer: Berlin/Heidelberg, Germany; Paris, France, 2003.
21. Kokotovic, P.V. Applications of singular perturbation techniques to control problems. SIAM Rev. 1984, 26, 501–550.
22. Hartmann, C.; Schäfer-Bung, B.; Zueva, A. Balanced averaging of bilinear systems with applications to stochastic control. SIAM J. Control Optim. 2013, 51, 2356–2378.
23. Pardalos, P.M.; Yatsenko, V.A. Optimization and Control of Bilinear Systems: Theory, Algorithms, and Applications; Springer: New York, NY, USA, 2010.
24. Hartmann, C.; Latorre, J.; Pavliotis, G.A.; Zhang, W. Optimal control of multiscale systems using reduced-order models. J. Comput. Dyn. 2014, 1, 279–306.
25. Peng, S. Backward Stochastic Differential Equations and Applications to Optimal Control. Appl. Math. Optim. 1993, 27, 125–144.
26. Touzi, N. Optimal Stochastic Control, Stochastic Target Problem, and Backward Differential Equation; Springer: Berlin, Germany, 2013.
27. Pardoux, E.; Peng, S. Adapted solution of a backward stochastic differential equation. Syst. Control Lett. 1990, 14, 55–61.
28. Bahlali, K.; Kebiri, O.; Khelfallah, N.; Moussaoui, H. One dimensional BSDEs with logarithmic growth application to PDEs. Stochastics 2017, 89, 1061–1081.
29. Duffie, D.; Epstein, L.G. Stochastic differential utility. Econometrica 1992, 60, 353–394.
30. El Karoui, N.; Peng, S.; Quenez, M.C. Backward stochastic differential equations in finance. Math. Financ. 1997, 7, 1–71.
31. Hu, Y.; Imkeller, P.; Müller, M. Utility maximization in incomplete markets. Ann. Appl. Probab. 2005, 15, 1691–1712.
32. Hu, Y.; Peng, S. A stability theorem of backward stochastic differential equations and its application. Acad. Sci. Math. 1997, 324, 1059–1064.
33. Kobylanski, M. Backward stochastic differential equations and partial differential equations with quadratic growth. Ann. Probab. 2000, 28, 558–602.
34. Antonelli, F. Backward-forward stochastic differential equations. Ann. Appl. Probab. 1993, 3, 777–793.
35. Bahlali, K.; Gherbal, B.; Mezerdi, B. Existence of optimal controls for systems driven by FBSDEs. Syst. Control Lett. 1995, 60, 344–349.
36. Bahlali, K.; Kebiri, O.; Mtiraoui, A. Existence of an optimal Control for a system driven by a degenerate coupled Forward-Backward Stochastic Differential Equations. Comptes Rendus Math. 2017, 355, 84–89.
37. Ma, J.; Protter, P.; Yong, J. Solving Forward-Backward Stochastic Differential Equations Explicitly—A Four Step Scheme. Probab. Theory Relat. Fields 1994, 98, 339–359.
38. Zhen, W. Forward-backward stochastic differential equations, linear quadratic stochastic optimal control and nonzero sum differential games. J. Syst. Sci. Complex. 2005, 18, 179–192.
39. Hartmann, C.; Schütte, C.; Weber, M.; Zhang, W. Importance sampling in path space for diffusion processes with slow-fast variables. Probab. Theory Relat. Fields 2017, 170, 177–228.
40. Bally, V. Approximation scheme for solutions of BSDE. In Backward Stochastic Differential Equations; El Karoui, N., Mazliak, L., Eds.; Addison Wesley Longman: Boston, MA, USA, 1997; pp. 177–191.
41. Bender, C.; Steiner, J. Least-Squares Monte Carlo for BSDEs. In Numerical Methods in Finance; Springer: Berlin, Germany, 2012; pp. 257–289.
42. Bouchard, B.; Elie, R.; Touzi, N. Discrete-time approximation of BSDEs and probabilistic schemes for fully nonlinear PDEs. Comput. Appl. Math. 2009, 8, 91–124.
43. Chevance, D. Numerical methods for backward stochastic differential equations. In Numerical Methods in Finance; Publications of the Newton Institute, Cambridge University Press: Cambridge, UK, 1997; pp. 232–244.
44. Hyndman, C.B.; Ngou, P.O. A Convolution Method for Numerical Solution of Backward Stochastic Differential Equations. Methodol. Comput. Appl. Probab. 2017, 19, 1–29.
45. Hartmann, C. Balanced model reduction of partially-observed Langevin equations: An averaging principle. Math. Comput. Model. Dyn. Syst. 2011, 17, 463–490.
46. Fleming, W.H. Optimal investment models with minimum consumption criteria. Aust. Econ. Pap. 2005, 44, 307–321.
47. Budhiraja, A.; Dupuis, P. A variational representation for positive functionals of infinite dimensional Brownian motion. Probab. Math. Stat. 2000, 20, 39–61.
48. Dai Pra, P.; Meneghini, L.; Runggaldier, J.W. Connections between stochastic control and dynamic games. Math. Control Signal Syst. 1996, 9, 303–326.
49. Pavliotis, G.A.; Stuart, A.M. Multiscale Methods: Averaging and Homogenization; Springer: Berlin/Heidelberg, Germany, 2008.
50. Anderson, B.D.O.; Liu, Y. Controller reduction: Concepts and approaches. IEEE Trans. Autom. Control 1989, 34, 802–812.
51. Bensoussan, A.; Boccardo, L.; Murat, F. Homogenization of elliptic equations with principal part not in divergence form and hamiltonian with quadratic growth. Commun. Pure Appl. Math. 1986, 39, 769–805.
52. Pardoux, E.; Peng, S. Backward stochastic differential equations and quasilinear parabolic partial differential equations. In Stochastic Partial Differential Equations and Their Applications; Lecture Notes in Control and Information Sciences 176; Rozovskii, B.L., Sowers, R.B., Eds.; Springer: Berlin, Germany, 1992.
53. Freidlin, M.; Wentzell, A. Random Perturbations of Dynamical Systems; Springer: Berlin/Heidelberg, Germany, 2012.
54. Khasminskii, R. Principle of averaging for parabolic and elliptic differential equations and for Markov processes with small diffusion. Theory Probab. Appl. 1963, 8, 1–21.
55. Gobet, E.; Turkedjiev, P. Adaptive importance sampling in least-squares Monte Carlo algorithms for backward stochastic differential equations. Stoch. Proc. Appl. 2005, 127, 1171–1203.
56. Turkedjiev, P. Numerical Methods for Backward Stochastic Differential Equations of Quadratic and Locally Lipschitz Type. Ph.D. Thesis, Humboldt-Universität zu Berlin, Berlin, Germany, 2013.
57. Kebiri, O.; Neureither, L.; Hartmann, C. Adaptive importance sampling with forward-backward stochastic differential equations. arXiv, 2018; arXiv:1802.04981.
58. Franzke, C.; Majda, A.J.; Vanden-Eijnden, E. Low-order stochastic mode reduction for a realistic barotropic model climate. J. Atmos. Sci. 2005, 62, 1722–1745.
59. Lall, S.; Marsden, J.; Glavaški, S. A subspace approach to balanced truncation for model reduction of nonlinear control systems. Int. J. Robust. Nonlinear Control 2002, 12, 519–535.
60. Briand, P.; Hu, Y. Probabilistic approach to singular perturbations of semilinear and quasilinear parabolic. Nonlinear Anal. 1999, 35, 815–831.
61. Bensoussan, A.; Lions, J.L.; Papanicolaou, G. Asymptotic Analysis for Periodic Structures; American Mathematical Society: Washington, DC, USA, 1978; pp. 769–805.
62. Pardoux, E.; Veretennikov, A.Yu. On the Poisson equation and diffusion approximation 3. Ann. Probab. 2005, 33, 1111–1133.

Figure 1. Plot of the mean of

E (ϵ) \pm

its standard deviation

σ (E (ϵ))

and for comparison we plot

\sqrt{ϵ}

against

ϵ

on a doubly logarithmic scale: we observe convergence of order

1 / 2

as predicted by the theory.

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
