1. Introduction
The linear quadratic Gaussian (LQG) control problem is one of the most productive results in control theory. It concerns the synthesis of a control law for the state of a linear Gaussian stochastic system that minimizes the expected value of a quadratic cost functional [1]. The application area of this result is extremely broad: Aerospace applications were and still are important and include, for example, re-entry vehicle tracking and orbit determination [1]; other uses include navigation and guidance systems, radar tracking, and sonar, i.e., the same tasks that became the first applications of the Kalman filter [2]. The management of non-stationary technological plants has also become a significant source of applications; such industrial applications arise in the paper and mineral processing industries, biochemistry, energy, and many other domains [3]. They are often represented by very specific examples, such as a dry-process rotary cement kiln, cement raw material mixing, and thickness control of a plastic film [4,5]. Of particular interest are the tasks connected with the mechanics of drives and manipulators, e.g., two-link manipulator stabilization [6] or output feedback stabilization for robot control [7]. Strictly speaking, these models contain no stochastic elements, but such applications served as a source of motivation for the present research.
One of the cornerstones of the contemporary LQG technique is the separation theorem, which is inherent in control problems with incomplete information [8,9]. This result justifies a transparent engineering approach, according to which the initial control problem splits into an unobservable state estimation stage and a control synthesis stage based on the obtained state estimate. In the general nonlinear/non-Gaussian case, the separation procedure does not deliver an absolute optimum in the class of admissible controls. Nevertheless, enforcing the separation of state estimation and control optimization allows synthesizing high-performance controls for various applied problems. This approach is called the separation principle. Its applications are not limited to the aforementioned traditional tasks of Kalman filtering and LQG control. Recent applications cover quite modern areas, e.g., telecommunications: In [10,11], LQG and Risk-Sensitive LQG controls are used under Denial-of-Service (DoS) attack conditions, and [12] uses similar models and tasks for control applications in lossy data networks. In [
13], user activity measurements are used to manage the computational resources of a data center for deep learning. In [
14], the ideas of estimation and control separation are adapted for neural network training tasks. We can also note how the meaning of the term “separation” has evolved. In [15], for hybrid systems containing real-time control loops and logic-based decision makers, the control loops are separated. In [16], separation is interpreted in the sense that the output feedback controller is based on separate properties related to state feedback on the one hand and output-to-state stability on the other.
Control problems with incomplete information, especially in stochastic settings, are usually of much greater practical relevance than those with complete information. When applied to a general control problem, the separation principle makes it possible to use known theoretical results and the corresponding numerical procedures developed for optimal stochastic control with complete information [
17,
18,
19]. The separation principle implies the necessity to calculate the most precise state estimate given the available indirect noisy observations, which leads to the optimal filtering problem. Its general solution, in turn, is characterized by either the Kushner–Stratonovich [
20,
21] or Duncan–Mortensen–Zakai [
22,
23,
24] equations, which describe posterior state distribution evolution in time. An approach developed in [
25,
26,
27] results in easily calculated state estimates only in a limited number of special cases. Some examples can be found in [
28] for a controlled Markov jump process (MJP).
For the quadratic criterion, the most general model where the separation theorem is valid was considered in [
29,
30,
31]. The main goal of our research is to apply these results to a typical linear model of a mechanical actuator augmented by an MJP that models changes in the actuator operating modes or external drift. The difference is that, in the mentioned papers, there was no quadratic term involving the observed state; the criterion in [
31] is more general but still does not include a term with an unobservable MJP. In this case, the obtained results provide an optimal control in the form of a linear dependency on the observed state and on the optimal estimate of the unobservable component. The latter is interpreted as an uncontrollable disturbance and is not included in the criterion. In the target application of our work, the unobservable MJP represents the state of the controlled system. This state can be called an operating mode, and the purpose of control is to monitor this mode and to stabilize the system under the conditions determined by the mode switches. For the analysis of such a system, we do not need general models of the unobserved component as in [
29,
31] and the corresponding optimal filtering equations, which are difficult to implement in practice. For our purposes, an MJP model as in [
30] and the optimal Wonham filter [
32] are sufficient. Thus, the main theoretical goal of our research is to extend the result of [30] to the case of a general quadratic criterion containing a term with the unobservable MJP. The practical goal is to apply this result in order to stabilize a mechanical actuator under changing operating modes.
The paper is organized as follows.
Section 2 contains a thorough statement of the optimal control problem. The controlled observable output is a solution to a linear stochastic differential system governed by an unobservable finite-state Markov jump process. The control appears on the right-hand side (RHS) as an additive term with a nonrandom matrix gain. The optimality criterion is a generalization of the traditional quadratic one. It allows investigating various optimal control problems (MJP state tracking in terms of an integral performance index, control assistance in systems with noisy controls, damping of statistically uncertain exogenous step-wise disturbances, etc.) using a unified approach.
On the one hand, a slight modification of the Wonham filter admits the calculation of the conditional expectation of the unobservable MJP state given the output observations. On the other hand, the investigated dynamic system can be rewritten in a form that can be treated as a controlled system with complete information. In turn, the optimality criterion can be transformed into a form where the dependency on the unobserved MJP enters only through its estimate. These manipulations, performed in
Section 3, legitimize the use of the separation principle in the investigated control problem (see Proposition 1).
Generally, in order to synthesize the optimal control, one has to solve the dynamic programming equation. However, the similarity of the considered control problem to the “classic” LQG case suggests the form of the Bellman function: it is a quadratic function of the controlled state. Hence, the problem reduces to the characterization of the corresponding coefficients.
Section 4 and
Section 5 contain their derivation.
Section 6 presents an illustrative example of the application of the obtained theoretical results. The object of control is a mechanical actuator (e.g., the cart of a gantry crane). This actuator performs step-wise movements governed by an external unobservable controlling signal, which is an MJP corrupted by noise. The goal is to assist the actuator in carrying out external commands more accurately.
Section 7 contains concluding remarks.
2. Problem Statement
On the canonical probability space with filtration
,
, we consider a partially observable stochastic dynamic system:
where
is an observable system state governed by a statistically uncertain (i.e., unobservable random) drift
. We suppose that
is a finite-state Markov jump process (MJP) with values in the set
formed by unit coordinate vectors of the Euclidean space
. The initial value
Y has a known distribution
,
is a transition intensity matrix, and
is a
-adapted martingale with quadratic characteristic [
33].
We denote the natural filtration generated by the system state
z as
and assume that
. Furthermore,
is a standard vector Wiener process.
is a random vector with a finite variance.
Z,
, and
are mutually independent;
is an admissible control. We suppose that
; thus, the admissible control is a feedback control:
[
19]. We also assume that state
is uniformly non-degenerate, i.e.,
, where
is some positive value, and
stands for a unit matrix of appropriate dimensionality.
The quality of the control law
is defined by the following objective:
where
,
, and
are known bounded matrix functions;
is the Euclidean norm of a vector
x; and below we also use the Euclidean norm
of a matrix
x. In order to eliminate the possibility of zero penalties for individual components of the control vector
, which would make objective (
3) physically incorrect, we assume that the common non-degeneracy assumption
is valid.
Note that the stochastic system models (
1) and (
2) are the same as in [
30] and are special cases of models in [
29,
31]. In contrast, objective (
3) includes additional terms. Namely, the objective from [
29,
30] is obtained at
, and the one from [
31] is obtained at
. This reflects the key conceptual difference of the problem under consideration: We treat
as an element of the observation system, namely an external drift, and not as a complex measurement error. Thus, we introduce objective (
3), which may find application in tracking the drift with the state
or tracking the drift with control
problems, which take into account control
.
The matrix functions
,
,
, and
are bounded:
for all
. Thus, the existence of a strong solution to (
1) is guaranteed for any admissible control
. Moreover, we assume that all used time functions
,
,
,
,
,
,
,
, and
are piecewise continuous, so that the standard conditions for the existence of solutions to the ordinary differential equations (ODEs) obtained below are satisfied.
Our goal is to find an admissible control
, which minimizes the quadratic criterion
:
under the assumption of its existence.
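For readers who prefer a computational view, the following is a minimal simulation sketch of a partially observable system of this class. The concrete drift structure (a linear term in the state, an additive control term, and an MJP-driven drift), the matrices, and the time grid are illustrative assumptions only, since Equations (1)–(3) are not reproduced verbatim here.

```python
import numpy as np

# Illustrative parameters only; the actual matrices of (1)-(2) are defined in the paper.
rng = np.random.default_rng(0)
T, dt = 10.0, 1e-3
n, K = 2, 3                                  # dim(z), number of MJP states
Lam = np.array([[-1.0, 0.5, 0.5],            # assumed transition intensity matrix
                [0.6, -1.2, 0.6],
                [0.4, 0.4, -0.8]])
A = np.array([[0.0, 1.0], [0.0, -1.0]])      # assumed drift matrix
B = np.array([[0.0], [1.0]])                 # assumed control gain
F = np.array([[0.0, 0.0, 0.0],               # assumed MJP-to-drift gain
              [-1.0, 0.0, 1.0]])
Sig = np.diag([0.05, 0.2])                   # assumed diffusion matrix

def mjp_step(e, dt):
    """One Euler step of the finite-state MJP written as a unit coordinate vector."""
    k = int(np.argmax(e))
    p = Lam[k] * dt
    p[k] = 1.0 + Lam[k, k] * dt              # rows of Lam sum to zero, so p sums to one
    p = np.clip(p, 0.0, None)
    p = p / p.sum()
    e_new = np.zeros(K)
    e_new[rng.choice(K, p=p)] = 1.0
    return e_new

z, Y = np.zeros(n), np.eye(K)[0]
for _ in range(int(T / dt)):
    u = np.zeros(1)                          # placeholder; the optimal law is derived later in (17)
    dW = rng.normal(size=n) * np.sqrt(dt)
    z = z + (A @ z + B @ u + F @ Y) * dt + Sig @ dW   # Euler-Maruyama step for the output
    Y = mjp_step(Y, dt)                      # unobservable drift, cf. (2)
```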
3. Separation of Filtering and Control
In order to solve the problem formulated in
Section 2, we need to transform it into a form that permits revealing two necessary properties. We note that for system (
1) and (
2) an optimal filtering problem solution is known: The expectation
is given by the Wonham filter [
33]. First, for the separation, we need a representation of this filter showing that the quality of the optimal filtering estimate does not depend on the chosen control law. The second goal is to decompose objective (
3) into independent additive terms, which determine separately the quality of control and the quality of estimation.
To that end, we propose a change of variables in (
1) to eliminate terms
and
. Denote by
a solution to equation
, which is the matrix exponential
. Moreover, denote by
the following linear output transform:
By differentiating, we obtain the following:
or
with additional notation
,
.
Taking into account
, for what follows, note the following:
The transformations we made do not affect the filtering problem solution, i.e.,
. That is because the change of variables used to obtain
is a linear non-degenerate transformation of
. Hence, the optimal filtering estimate does not depend on
, and we can use
as observations and obtain the same drift estimate no matter which admissible control law is chosen. Consequently, in place of the estimation problem given the control-dependent observations
, we pass to an equivalent problem of
estimation given the transformed observations
, which are described by Equation (
5) and do not depend on
.
Denoting the optimal estimate
, we write the equation for it. The uniform non-degeneracy of
along with that of the matrix exponential
[
34] legitimizes the use of the Wonham filter [
33]:
Replacing variables
,
, and
, we obtain the following:
Taking into account
, we have the final representation for the Wonham filter as follows:
where
is a differential of the
-adapted standard vector Wiener process. This provides us with the final observation equation:
where
and the final new state equation is as follows:
Furthermore, by denoting the following:
we obtain a control system with complete information:
where both variables
and
are observable.
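To make the filtering step concrete, here is a minimal discretization sketch of a Wonham-type filter consistent with the structure described above; the observation matrix, noise intensity, and step size are placeholders, and the exact right-hand side should be taken from the filter equation derived above.

```python
import numpy as np

def wonham_step(p, dxi, H, Lam, noise_cov_inv, dt):
    """One Euler step of a Wonham-type filter for the MJP estimate p = E[Y_t | observations]:
        dp = Lam^T p dt + (diag(p) - p p^T) H^T (sigma sigma^T)^{-1} (dxi - H p dt).
    H, noise_cov_inv, and dt are illustrative placeholders; dxi is the increment of the
    transformed (control-independent) observation process from (5)."""
    innov = dxi - H @ p * dt                       # innovation increment
    gain = (np.diag(p) - np.outer(p, p)) @ H.T @ noise_cov_inv
    p_new = p + Lam.T @ p * dt + gain @ innov
    p_new = np.clip(p_new, 0.0, None)              # numerical safeguard: project back onto
    return p_new / p_new.sum()                     # the probability simplex
```

The output of this recursion is the estimate that enters the control law derived below, while the innovation process plays the role of the Wiener process in the equivalent complete-information system (9).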
For the final formulation of the equivalent control problem, one needs to perform the change of variables in objective (
3). This can be done using the law of total expectation [
35] as follows:
Since the last additive term in the expression for
does not depend on
but merely characterizes the quality of the optimal filter estimate
, it can be excluded from the criterion. Hence, the new objective takes the following form:
The final statement is formulated as follows.
Proposition 1 (Separation theorem).
The optimal (-adapted) feedback control for the system with complete information (9) and objective (10) is at the same time an optimal (-adapted) solution for the optimal control problem in the original system (1) and (2) with objective (3). Note that control
minimizes both functionals (
10) and (
3), and the difference
between
from (
10) and
from (
3) does not depend on
; thus, there is no need for another notation for the objective.
4. Solution to Control Problem
In order to find the control for the system (
9), which is optimal with respect to the criterion (
3), we use the classical dynamic programming method [
18,
19].
Denote the following Bellman function:
where
. Then, with
the dynamic programming equation is as follows:
where admissible feedback control
is expressed in the notation of system (
9); thus,
. Note that this is the same control that appeared in the original statement, since
.
The existence of a solution to (
11) is a sufficient optimality condition. Moreover, the optimal control is the one that minimizes the corresponding additive term. Hence, under condition
and under the assumption of (
11) solution existence, this minimum is delivered by the following control:
By substituting
in (
11) and regrouping the additive terms, we obtain the following:
The linear observations (
9) and quadratic objective (
10) suggest that a solution of the dynamic programming Equation (
13) might be presented as follows:
which reduces the search for the optimal solution to the problem of finding a symmetric matrix function
, vector function
, and a scalar
. Moreover, the explicit representation for
is not necessary for optimal control. One only needs a derivative
; nevertheless,
will be useful in determining the quality of control.
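As an illustration of the ansatz just described (the paper's exact Equations (14) and (15) are not reproduced here, so the symbols below are generic rather than the authors' notation), the Bellman function is sought in the quadratic form
\[
V(t,z,y) \;=\; z^{\top}\alpha(t)\,z \;+\; 2\,\beta^{\top}(t,y)\,z \;+\; \gamma(t,y),
\qquad
\beta(t,y) \;=\; \beta_{1}(t)\,y + \beta_{2}(t),
\]
with a symmetric matrix function \(\alpha\), a vector function \(\beta\) affine in the filter estimate, and a scalar \(\gamma\); the terminal values follow from the boundary condition (16).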
The Bellman function representation (
14) can be simplified further by using the fact that the term with derivative
in (
13) contains only the multiplier
. This suggests that
is an affine transform of
y:
where matrix
. The boundary condition at the same time takes the following form:
In other words, we have the following:
and the optimal control (
12) can finally be represented as
Thus,
contains two terms: the first term is linear in the observations (output variable)
z; the second one is linear in the state (the estimate given by the Wonham filter)
y. In order to justify the assumptions made, it is necessary to derive
,
, and
such that under assumptions (
14) and (
15) made about the Bellman function
and the affinity of
,
from (
14) would in fact be a solution of (
13). To that end, we substitute (
14) into Equation (
13) and take into account that the derivatives of
according to (
15) are equal to
and also that the following is the case:
After this substitution and minor transformations, grouping the terms in
and
z, in
and in
, we obtain equations for
,
and
, respectively:
where the following is the case:
Finally, from (
19), we obtain the equation for
:
The obtained equations are solved with boundary conditions (
16). In this case, Equation (
18) is a matrix Riccati equation for a square symmetric matrix
. The above assumptions regarding the piecewise continuity of the coefficients of this equation and condition
are sufficient for the existence and uniqueness of a positive semidefinite solution for every
. Indeed, such an equation occurs in the classical linear-quadratic problem, which admits a unique optimal feedback control that is linear with respect to the output
and whose gain is described by this Riccati equation [
36].
Equation (
21) is a system of linear ODEs with respect to variable
with piecewise continuous coefficients; hence, the existence and uniqueness of a solution are ensured by the standard theorem for linear differential equations [
37].
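A minimal numerical sketch of this backward integration is given below. It uses an explicit Euler step and the right-hand side of the classical LQ Riccati equation as a stand-in, since the exact coefficients of (18) and (21) are not reproduced here; in the example of Section 6, an implicit Euler step is used instead.

```python
import numpy as np

def riccati_backward(A, B, Q, Rinv, alpha_T, T, dt):
    """Backward explicit-Euler integration of a matrix Riccati equation of the classical LQ type,
        -d(alpha)/dt = A^T alpha + alpha A - alpha B Rinv B^T alpha + Q,   alpha(T) = alpha_T,
    used here only as an illustration of solving (18); (21) is a linear ODE and can be
    integrated backward in the same loop once alpha(t) is known."""
    N = int(round(T / dt))
    alpha = alpha_T.copy()
    traj = [alpha.copy()]
    for _ in range(N):
        rhs = A.T @ alpha + alpha @ A - alpha @ B @ Rinv @ B.T @ alpha + Q
        alpha = alpha + dt * rhs              # stepping from t down to t - dt
        alpha = 0.5 * (alpha + alpha.T)       # enforce symmetry against round-off
        traj.append(alpha.copy())
    return traj[::-1]                         # traj[k] approximates alpha(k * dt)
```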
Equation (
20) for
is a linear partial differential equation of parabolic type, but unlike the equation for
, it cannot be simplified. We will assume that this equation has a solution and interpret this assumption as a sufficient condition, in the same way as for the dynamic programming Equation (
11).
The following statement summarizes the reasoning in the section.
Proposition 2 (Output control under complete information).
If a solution to the dynamic programming Equation (11) exists, then this solution admits representation (14), (15) with coefficients , , and given by (18), (21), and (20), respectively, and the optimal control is given by (17).
5. Stochastic Representation of Gamma Functional
In order to implement control (
12), one needs functions
and
. These can be calculated by means of any efficient numerical ODE solver; hence, the practical solution of Equations (
18) and (
21) presents no difficulties. At the same time, the quality of optimal control is given by a solution of the dynamic programming equation:
Therefore, functional
is also required. An approximate solution to (
20) can also be obtained numerically by traditional grid methods, especially since such methods are well studied for equations of parabolic type. Nevertheless, the growth of the dimensionality of
Y may cause certain difficulties. More importantly, for Equation (
20), there is only an initial condition
defined by (
16), more precisely, a terminal condition, and there is no boundary condition such as
, which is necessary for numerical treatments.
That is why, in our opinion, an alternative approach based on the known relation between solutions of partial differential equations of parabolic type and stochastic differential equations is more promising. An example of such a relation is the Kolmogorov equation [
38], which determines the connection between the solution of the Cauchy problem with the terminal condition for the parabolic equation, on the one hand, and the solution of the related stochastic equation, on the other hand. Moreover, here, we need a simpler integral representation of the solution of a parabolic equation, known as the Feynman–Kac formula [
39]. The application of such tools for the numerical solution of parabolic equations has been well studied in a more general setting than we need [
40].
For differential Equation (
20), the Feynman–Kac formula has the following form:
Here, we deliberately use the same notation for two different random processes:
This emphasizes the fact that both these processes are described by the same equation and have the same probabilistic characteristics.
For practical applications of (
22), we rewrite this relation by adding an auxiliary variable
in a differential form:
First, (
23) is a stochastic representation of
, which explains the meaning of this coefficient. Secondly, (
23) provides a basis for an approximate calculation of this coefficient. More precisely, the proposed method for the numerical calculation of coefficient
consists in applying the Monte Carlo method to system (
23). One needs a series of sample solutions of this system for different initial conditions
in order to obtain an approximate value of
as a statistical estimate of the terminal condition in (
23).
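A minimal Monte Carlo sketch of this procedure is shown below. It assumes that the filter-estimate process follows an Ito equation dy = mu(t, y) dt + g(t, y) dW, as suggested by (8) and (23), with a diagonal diffusion for simplicity, and that the running and terminal cost terms entering (20) and (23) are available as callables psi and Psi; all names are placeholders rather than the paper's notation.

```python
import numpy as np

def gamma_monte_carlo(t0, y0, T, dt, mu, g, psi, Psi, n_paths=10_000, seed=0):
    """Estimate gamma(t0, y0) via the Feynman-Kac representation (22)/(23):
        gamma(t0, y0) ~ E[ Psi(y_T) + int_{t0}^{T} psi(s, y_s) ds | y_{t0} = y0 ],
    with y simulated by the Euler-Maruyama scheme for dy = mu dt + g dW.
    mu, g, psi, Psi are user-supplied callables; g(t, y) is treated as componentwise
    (diagonal) diffusion for simplicity."""
    rng = np.random.default_rng(seed)
    N = int(round((T - t0) / dt))
    total = 0.0
    for _ in range(n_paths):
        y, acc, t = np.array(y0, dtype=float), 0.0, t0
        for _ in range(N):
            acc += psi(t, y) * dt                      # accumulate the running cost
            dW = rng.normal(size=y.shape) * np.sqrt(dt)
            y = y + mu(t, y) * dt + g(t, y) * dW       # Euler-Maruyama step
            t += dt
        total += acc + Psi(y)                          # add the terminal cost
    return total / n_paths                             # Monte Carlo average
```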
6. Assistance of Mechanical Actuator Control
We investigate a controllable mechanical actuator designed for step-wise movement. Its model is described by the following dynamic system:
Here, we have the following:
is an observable actuator position;
is an observable actuator velocity;
is an unobservable external control action, assumed to be an MJP with known transition intensity matrix and initial distribution vector ;
is an external control noise, assumed to be a standard Wiener process;
is an admissible assisting control.
Thus, in the original notation, we have a three-dimensional external unobservable control
corrupted by the one-dimensional Wiener noise
, a two-dimensional observable state vector
, and a one-dimensional control
. The other notations are as follows:
It is easy to verify that, in the case of a noiseless (i.e., ) constant external control action , the actuator coordinate tends to its steady-state value as . Without loss of generality, we suppose that all components of the vector are different.
In the case of piecewise constant control action
and constant
, the mechanical actuator cannot implement the expected “ideal” step-wise trajectory
due to the inertia and noise accompanying the external control. Thus, the aim of the assisting control
is to improve actuator performance, and the corresponding criterion has the following form:
where
is a dimensionless unit cost of the assisting control.
We perform the calculations for the following parameter values:
Stochastic system (
24) is solved by the Euler–Maruyama numerical scheme with step
; meanwhile, Equations (
18) and (
21) are solved using the implicit Euler scheme with the same step.
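For completeness, the following is a minimal sketch of one Euler-Maruyama pass over an actuator model of the kind described above. The second-order structure (position integrating velocity, velocity driven by the noisy external command and the assisting control), the parameter values, and the command levels are illustrative assumptions; the actual values in (24), (25) and the optimal law (17) are those used in the paper.

```python
import numpy as np

# Illustrative actuator simulation; parameters and command levels are assumptions.
rng = np.random.default_rng(1)
T, dt = 20.0, 1e-3
a, c, sig = 1.0, 1.5, 0.3            # assumed stiffness, damping, command-noise intensity
levels = np.array([-1.0, 0.0, 1.0])  # assumed command levels attached to the MJP states

x = np.array([0.0, 0.0])             # [position, velocity]
k = 1                                # current MJP state index (switching according to its
                                     # intensity matrix is omitted; see the sketch in Section 2)
for _ in range(int(T / dt)):
    r = levels[k]                    # current noiseless external command
    u = 0.0                          # assisting control; the optimal one is given by (17)
    dW = rng.normal() * np.sqrt(dt)
    drift = np.array([x[1], -a * x[0] - c * x[1] + r + u])
    x = x + drift * dt + np.array([0.0, sig]) * dW   # Euler-Maruyama step (assumed model form)
```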
Note that the violation of condition
in (
24) does not prevent the use of the obtained result, since coordinate
in the state
is a simple integral of the velocity
; thus, in the filtering problem, one can use only observation
.
It is easy to verify that the actuator is stable if and . In the example, we investigate only the stable case.
Figure 1 contains plots of the evolution in time for coefficients
,
, and
;
Figure 2 stands for coefficients
,
and
;
Figure 3 stands for coefficients
,
and
. Here, we denote by
the element of matrix
x with indices
.
Figure 4 presents the comparison of the unobserved external control term
and its estimate
calculated by the Wonham filter.
Figure 5 contains plots of the actuator coordinate governed by various types of assisting control:
obtained without assisting control, i.e., for ;
obtained under the trivial assisting control, i.e., for
where control law
is defined in (
17),
,
;
the optimal trajectory calculated for the optimal control ;
the “ideal” trajectory
calculated for control
i.e., optimal control law
defined in (
17) with full information
,
.
A comparison with the step-wise “target” trajectory is conducted.
Figure 6 contains plots of the actuator velocity governed by various types of the assisting control:
obtained under control ;
obtained under control ;
The optimal trajectory calculated for the optimal control ;
The “ideal” trajectory calculated for the control .
Finally,
Figure 7 contains the following:
We also calculate the objective values for the various control types: . Meanwhile, and , i.e., the optimal assistance improves the actuator performance significantly. Additionally, we calculate the performance index for the “ideal” control: , i.e., it does not demonstrate superiority in comparison with the optimal control .
7. Conclusions
The principal results of the paper are as follows: the derived explicit form (
17) of the optimal control in the problem at hand and the fact that this optimal solution is indeed a feedback control, in which the key role is played by the solution of the auxiliary optimal filtering problem, i.e., conditional expectation
.
This provided grounds to formulate the main result, which is the separation theorem, generalizing the results of [
29,
30,
31] to a general criterion (
3). The fundamental difference between the classical LQG setting, the alternatives proposed in the referenced papers, and the present research is that the transformed problem is not similar to the original one: the martingale representation of the MJP (
2) is replaced by a completely different object, namely the stochastic Ito equation with a Wiener process (
9). Thus, the key to the solution turned out to be a unique property of the Wonham filter, which represents the MJP estimate given the indirect noisy observations in the form of a classical Ito equation with a Wiener process.
The obtained results admit the following interpretation and extensions. First, in the initial partially observable system, process
can be treated as an unobservable system state; meanwhile, process
plays the role of the controlled observable output. This point of view can be useful for various applications. Second, the obtained results remain valid in general if one complements the initial optimality criterion (
3) by the additional terms
for optimal output tracking and by
and
for the optimal output/cost control. Third, the MJP
can be replaced by any random process with a finite second moment described by some nonlinear stochastic differential equation. In this case, the separation theorem still holds, but coefficient
in the optimal control is defined by some partial differential equation, and the optimal filtering estimate
is described via the solution to the Kushner–Stratonovich or Duncan–Mortensen–Zakai equation. Fourth, if
is an MJP, the diffusion coefficient
in (
1) can be replaced by function
. In this manner, we extend the class of dynamics to systems with both statistically uncertain piecewise constant drift and diffusion. All separation theorems and optimal control formulae remain valid. However, the solution to the optimal filtering problem is expressed not by the original Wonham filter but via its generalization [
41,
42].