1. Introduction
Optimal control synthesis plays a critical role in managing the dynamics of controlled objects and processes, particularly when the dynamic behavior is described by Ito stochastic differential equations (SDEs). Ito SDEs are chosen because of their prevalence as applied models in biology, chemistry, telecommunications, and other fields. The main type of random perturbation in stochastic differential equations is the symmetric Wiener process, whose trajectories possess the reflection property, which characterizes the symmetry of the process well. This symmetry of the solution is also visible in the example of the present work, where the solution trajectories are symmetric with respect to the so-called averaged trajectory, and the optimal control does not violate this property. The paper [1] considers the optimal control of a linear SDE of general type perturbed by a random process with independent increments under a quadratic quality functional. It is shown that under certain conditions the optimal control is linear and can be determined by solving an auxiliary vector quadratic problem; the optimal feedback solution of the deterministic problem is also obtained. In paper [2], a theory of optimal control is developed for stochastic systems whose performance is measured by the exponent of an integral form. In paper [3], the problem of the existence of optimal control for stochastic systems with a nonlinear quality functional is solved. The main model of [4] is the linear autonomous SDE (1). Notably, the results of that paper concern the quadratic quality functional and necessary and sufficient conditions for the stability of system (1), with the stability conditions formulated in terms of the properties of a Riccati-type jump operator. Note that the linear case (1) is most often considered as a first approximation of the dynamics of a real phenomenon, since the optimal control can then be found in closed form and approximations of the optimal control can be compared with the exact values.
In article [5], the main attention is focused on SDEs with external switches. That paper makes a number of important remarks on the calculation of the infinitesimal (generating) operator of a random process given by an SDE with Markov switches.
In article [6], SDEs with external switches and Poisson perturbations are discussed. Using the Bellman equation, sufficient conditions for the existence of optimal control for a general quality functional are found, and a closed form of the optimal control is derived for the linear case with a quadratic quality functional.
The transition from difference equations to differential equations gives rise to a number of complications in the synthesis of optimal control, since it requires solving more complex dynamical systems, such as the analog of the Lyapunov equation for nonautonomous systems. Furthermore, the presence of random variables of different structures leads to the use of the infinitesimal operator [7,8], whose calculation depends on the nature of the random variables involved.
In this paper, we will focus on the use of semi-Markov processes [8] as the main source of external random disturbances. Notably, the use of semi-Markov processes significantly extends the range of application of the theoretical results for many applied problems, since the condition that the time spent in a state be exponentially distributed, inherent to continuous-time Markov processes, is too strict in many applications. A description based on a Markov process will be incorrect if, for example, a minimum time spent in a particular state is assumed. In this case, the use of semi-Markov processes is more effective, since it allows us to control the properties of the residence time, the size of the jump, and the interdependence between them through the semi-Markov kernel [8]. The use of Markov processes greatly simplifies the study of the system, since it requires estimation only of the intensities of the Markov process, and when studying systems with external Markov disturbances additional conditions are usually imposed on these intensities.
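To make the distinction concrete, the following sketch samples a two-state semi-Markov process whose residence times are bounded below by a positive constant, something no exponential (Markov) model can reproduce. The transition matrix, the value of tau_min, and the uniform residence-time law are illustrative assumptions, not data from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-state example: transition matrix of the nested Markov
# chain and a residence-time sampler with a guaranteed minimum time
# tau_min, which an exponential (Markov) model cannot provide.
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])   # nested chain: always switch to the other state
tau_min = 0.5                # minimum time spent in a state

def sample_residence(state, rng):
    # shifted uniform law: residence time >= tau_min with probability 1
    return tau_min + rng.uniform(0.0, 1.0)

def simulate_semi_markov(T, state=0, rng=rng):
    """Return the switching moments and visited states on [0, T]."""
    t, times, states = 0.0, [0.0], [state]
    while t < T:
        t += sample_residence(state, rng)
        state = rng.choice(len(P), p=P[state])
        times.append(min(t, T))
        states.append(state)
    return np.array(times), np.array(states)

times, states = simulate_semi_markov(T=10.0)
```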
In this study, we will not focus on the asymptotic properties of the random perturbation process but will consider the problem of synthesizing optimal control on a finite interval, so the article does not impose additional conditions on the semi-Markov process that would ensure its ergodicity [7,9,10].
Note that the study of optimal control with semi-Markov processes has received attention since the last quarter of the 20th century, when it became clear that the exponential distribution cannot serve as the basic distribution for the transition time in many cases. For example, the work [11] considers the optimization of a system in the presence of random shocks, which in turn are described by semi-Markov processes; it gives an example of optimal replacement of the levels of the system and constructs a replacement algorithm for a countable set of states of the semi-Markov process. Models for finding the optimal general repair policy are considered in [12], where the semi-Markov process describes construction errors and the policy change is based on minimizing the budget on the one hand and fulfilling all construction requirements on the other. In [13], the authors considered the use of semi-Markov models to solve the output-feedback control problem for general semi-Markov jump systems with a memory-dynamic event-triggered protocol. They proposed a description of the system by a differential equation with a semi-Markov component in the derivative of the process, so that the dynamics of the process depends integrally on the dynamics of the semi-Markov process. The main statements of that work provide sufficient conditions for the existence of optimal control for systems with semi-Markov parameters, and an example of finding the optimal control is given. A somewhat similar result for the case of asynchronous control was investigated in [14], which also provides several sufficient conditions for stochastic controllability of the system and considers a number of model examples with realizations of the semi-Markov process.
As noted in many works, the use of semi-Markov processes is of great practical importance, since these models allow a more accurate description of real processes. For example, in [12] the authors used a semi-Markov process to find an optimal policy in construction, and in [15] semi-Markov processes were applied to the problem of optimizing a portfolio of securities. In practice, the Markov model with exponentially distributed residence times is a good theoretical model, but it has a number of limitations associated with the moments of switching. As noted in [12], the policy change occurs at discrete moments of time and has a uniform distribution over a finite horizon, so the use of the Markov model in that case is inappropriate.
Instead, the main attention will be paid to an elementary study of the dynamics of the main process at a fixed value of the external perturbation, which allows us to study the optimal control more effectively using the methods proposed for Markov processes.
Section 2 of the article describes the problem statement and justifies the existence of a solution that is unique up to stochastic equivalence. In Section 3, sufficient conditions for the existence of a solution to the optimal control problem are established, and in Section 4, a method for finding it in the general case is described. Section 5 is devoted to the synthesis of optimal control for a linear system with a quadratic performance index, and in Section 6, a system of Riccati differential equations is presented for finding the matrix functions that form the optimal control. The last section, Section 7, contains a model example that illustrates the stages of optimal control synthesis under the condition that the time spent in a state is discrete and, with probability 1, exceeds a positive lower bound.
2. Problem Statement
Consider a stochastic dynamical system defined on a probability basis [7,16]. The system is governed by the stochastic differential equation (SDE) (2) with the initial condition (3). Here, the external perturbation is a semi-Markov process with values in some state space, characterized by a generator [9] in which one factor specifies the distribution of jumps of the nested Markov chain [8] and the other is the conditional distribution of the time spent in the state y. The representation of the generator based on the splitting (4) greatly simplifies the main calculations in the proofs of the theoretical results of this paper, yet it allows generalization to the case of a general semi-Markov kernel. The diffusion term is driven by a standard Wiener process w; the control is a measurable function from the set of admissible controls U [17]; and the process w and the semi-Markov perturbation are independent [7,16].
As in the works [16,18], we assume that the drift and diffusion functions, measurable in the set of their variables, satisfy the boundedness condition (5) and the Lipschitz condition (6).
The semi-Markov process affects the trajectories of the process x as follows. Suppose that on a given inter-switching interval the perturbation process takes a fixed value. Then, the motion occurs according to system (7). According to [10,16], if conditions (5) and (6) are met, system (7) has on this interval a solution, unique up to stochastic equivalence, with a finite second moment.
Then, at the next switching time, the value of the perturbation process changes, and on the following interval the motion is governed by system (8). According to [10,16], if conditions (5) and (6) are met, system (8) has a unique solution with a finite second moment on this interval.
Thus, conditions (5) and (6) guarantee the existence of a unique solution to the Cauchy problem (2), (3) on the whole interval, with a finite second moment. For the solution to exist on the entire horizon, we assume that the semi-Markov process is defined on it as well.
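Although the existence result above is purely analytical, the piecewise construction via systems (7) and (8) translates directly into a simulation scheme: between switching moments the regime is frozen and the SDE is integrated by any standard method. Below is a minimal Euler–Maruyama sketch for a scalar equation; the regime-dependent coefficients and all numerical values are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative regime-dependent coefficients of a scalar SDE
# dx = a(xi) x dt + b(xi) x dw; the values are assumptions.
a = {0: -1.0, 1: 0.5}
b = {0: 0.3, 1: 0.2}

def euler_maruyama_switched(x0, times, states, dt=1e-3):
    """Integrate the SDE piecewise: between consecutive switching moments
    the regime xi is frozen, exactly as in systems (7) and (8).
    states[i] is the regime on [times[i], times[i+1])."""
    ts, xs = [times[0]], [x0]
    x = x0
    for t0, t1, xi in zip(times[:-1], times[1:], states):
        n = max(1, int(round((t1 - t0) / dt)))
        h = (t1 - t0) / n
        for _ in range(n):
            dw = rng.normal(0.0, np.sqrt(h))
            x = x + a[xi] * x * h + b[xi] * x * dw
            ts.append(ts[-1] + h)
            xs.append(x)
    return np.array(ts), np.array(xs)

# usage with switching data from any semi-Markov sampler
times = np.array([0.0, 1.2, 2.7, 4.0])
states = np.array([0, 1, 0])
ts, xs = euler_maruyama_switched(1.0, times, states)
```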
3. Sufficient Conditions for Optimality
We introduce a sequence of functions belonging to a class V. On these functions, we define the weak infinitesimal operator (WIO) (9), evaluated on the strong solution of (2) on the corresponding interval under a given control. Notice that the WIO is a natural extension of the right-hand side of an ordinary differential equation to the case of stochastic systems.
The problem of optimal control is to find a control from the set U that minimizes the scalar quality functional (10) [17] for given fixed initial data and horizon.
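In computations, a quality functional of this integral-plus-terminal type is typically approximated by Monte Carlo averaging over simulated trajectories. The sketch below assumes a generic running cost F and terminal cost Phi (the quadratic choices shown are placeholders in the spirit of the later functional (32)); it is not the paper's formula (10), only a numerical reading of such a cost.

```python
import numpy as np

def quality_functional(t_grid, paths, controls, F, Phi):
    """Monte Carlo estimate of J = E[ integral_t^T F(s, x, u) ds + Phi(x(T)) ].
    t_grid: (n_steps,) time grid; paths, controls: (n_paths, n_steps) arrays.
    F and Phi are user-supplied costs (assumptions in this sketch)."""
    costs = F(t_grid, paths, controls)
    # trapezoid rule along the time axis for each simulated path
    running = np.sum(0.5 * (costs[:, 1:] + costs[:, :-1]) * np.diff(t_grid), axis=1)
    terminal = Phi(paths[:, -1])
    return float(np.mean(running + terminal))

# illustrative quadratic costs in the spirit of the functional (32)
F = lambda t, x, u: x**2 + u**2
Phi = lambda xT: xT**2
```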
To obtain sufficient conditions for optimality, we need to prove several auxiliary statements.
Lemma 1. Let:
- (1) There exist a unique solution to the Cauchy problem (2), (3) whose second moment is finite for each t;
- (2) There exist a sequence of functions from the class V;
- (3) The WIO be defined on the solutions of (2) and (3).
Then, the equality (11) holds.
Proof. For the Markov process under consideration, with respect to the σ-algebra constructed on the corresponding interval, the Dynkin formula [7] holds. If the function belongs to the class V, then for the solution of the problems (2) and (3) we obtain the equality (12). Similarly, we write the Dynkin formula on the adjacent interval and, subtracting it from (12), obtain (11). Lemma 1 is proved. □
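Lemma 1 is, in essence, the Dynkin formula specialized to our setting, and it can be sanity-checked numerically: the expected increment of a smooth test function v along the solution must equal the expected time integral of the WIO applied to v. A minimal sketch, in which the scalar test SDE dx = a x dt + b x dw, the test function v(x) = x², and all constants are assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

# Monte Carlo check of E[v(x_T)] - v(x_0) = E[ integral_0^T (Lv)(x_s) ds ]
# for dx = a x dt + b x dw and v(x) = x^2, where the infinitesimal
# operator gives Lv = (2a + b^2) x^2.
a, b, x0, T, dt, n_paths = -0.4, 0.3, 1.0, 2.0, 1e-3, 20000
n = int(T / dt)

x = np.full(n_paths, x0)
integral = np.zeros(n_paths)
for _ in range(n):
    integral += (2*a + b**2) * x**2 * dt                 # accumulate int Lv ds
    x += a*x*dt + b*x*rng.normal(0, np.sqrt(dt), n_paths)  # Euler step

lhs = np.mean(x**2) - x0**2
rhs = np.mean(integral)
print(lhs, rhs)   # the two estimates agree up to Monte Carlo error
```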
Lemma 2. Let:
- (1) Conditions (1) and (2) of Lemma 1 be fulfilled;
- (2) The function satisfy Equation (13) with the boundary condition (14), where the WIO is defined by (9).
Then, the function can be written in the form (15).
Proof. Consider the solution of the problems (2) and (3) constructed according to the corresponding initial condition. We integrate (13) with respect to s from t to T and calculate the mathematical expectation, obtaining (16). According to Lemma 1, the first term of (16) equals the increment (11), in which the terminal value is given by the boundary condition (14). Thus, we obtain (17). Substituting (17) into (16), we obtain the statement of Lemma 2. □
Theorem 1. Let:
- (1) There exist a unique solution to the Cauchy problem (2), (3) whose second moment is finite for each t;
- (2) There exist a sequence of functions and an optimal control that satisfy Equation (18) with the boundary condition (19);
- (3) For every admissible control, the inequality (20) hold, where the WIO (9) is evaluated on the solutions of (2), (3).
Then, the control is optimal in the sense of minimizing the quality functional (10). The sequence of functions is called the control cost, or Bellman function, and Equation (18) can be written as the Bellman equation (22).
Proof. An optimal control is also an admissible control. Therefore, there exists a solution for which (18) takes the form (23), where the infimum is attained at the optimal control. We integrate (23) from t to T, calculate the mathematical expectation, and, taking (19) into account, obtain the value of the functional on the optimal control. Now, let an arbitrary control from the class of admissible controls U be chosen. Then, according to condition (3) of the theorem, the inequality (24) holds. We integrate (24) over the interval and calculate the mathematical expectation at a fixed initial value x. Accounting for Lemmas 1 and 2, we conclude that the cost of any admissible control is not smaller than that of the optimal one, and this, in fact, is the definition of optimal control in the sense of minimizing the quality functional. Theorem 1 is proved. □
5. Synthesis of Optimal Control for a Linear Stochastic System
When studying complex systems, the main approach is to decompose them into simpler components, thus considering simple idealized objects that reveal many of their properties [20]. The unique and exceptional importance of the linear case for stochastic systems, as such an “idealization”, is explained by a number of reasons.
First, in this case, as already mentioned, the control function can always be obtained in closed form, and various methods for studying the properties of the solution, such as stability, can be applied.
Secondly, in theoretical articles such as ours, the most important aspect is the analysis and justification of the stochastic components. These are most clearly observed in linear systems, whose solutions are sufficiently smooth, so the reader can see that all fluctuations of the solution are due exclusively to the stochastic components.
Thirdly, linear systems usually serve as systems of first approximation, which under certain conditions reflect the behavior (stability or instability) of the solution of the general system.
Consider the problem of optimal control for a linear stochastic dynamical system given by the stochastic differential equation (30) with the initial conditions (31). Here, the coefficients are piecewise continuous matrix functions of appropriate dimensions.
The problem of optimal control for the systems (30) and (31) is to find a control from the set of admissible controls U that minimizes the quadratic quality functional (32), in which the weight matrix of the control is uniformly positive definite and the state and terminal weight matrices are non-negative definite. To simplify, we introduce additional notation.
Theorem 3. The optimal control for the problem (30)–(32) has the form (33), where the non-negative definite matrix defines the Bellman functional (34). Here, g is a non-negative scalar function.
Proof. Bellman's equation for (30)–(32) has the form (35), in which the minimizing control is expressed by (36). Substituting (36) into (35) yields (37). The form of the optimal control is obtained by differentiating (37), since the optimal control minimizes the left-hand side of (37); this gives the feedback entering (33). Theorem 3 is proved. □
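For readers who want to evaluate the feedback numerically: if formula (33) has the classical linear-quadratic structure u*(t, x) = -C⁻¹(t) Bᵀ(t) P(t) x (an assumption here, since only the structure of Theorem 3 is stated above), the gain is a single linear solve. A minimal sketch with illustrative matrices:

```python
import numpy as np

def lq_feedback(B, C, P):
    """Linear feedback gain in the classical LQ form
    u*(t, x) = -C^{-1}(t) B^T(t) P(t) x  (assumed form of (33))."""
    K = -np.linalg.solve(C, B.T @ P)
    return K  # optimal control: u = K @ x

# toy example with constant matrices (illustrative values only)
B = np.array([[0.0], [1.0]])
C = np.array([[1.0]])
P = np.array([[2.0, 0.5],
              [0.5, 1.0]])
K = lq_feedback(B, C, P)   # shape (1, 2); u = K @ x is scalar
```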
6. Construction of the Bellman Equation
Substituting (33) and (34) into Equation (35), we obtain an equation for the matrix entering the Bellman functional. Equating to zero the quadratic form in x and the expressions that do not depend on x, and taking into account the corresponding matrix identity, we obtain a system of differential equations (39) and (40) for finding the matrices, with the boundary condition (41).
Thus, we can formulate the following theorem.
Theorem 4. If the quality functional for the systems (30) and (31) is (32), and the control cost is (34), then the system of differential equations for finding the matrices has the form (39)–(41).
Next, we prove the solvability of the system (39)–(41) using the Bellman iteration method [21]. We consider one subinterval of the partition and omit the index k for the approximations and for P. We define the zero approximation (42), in which the feedback gain is a bounded piecewise continuous matrix. We substitute (42) into (27) and, from the resulting equation, find the value of the functional corresponding to the control (42).
Next, we substitute this value into the Bellman equation (22) and find the control that minimizes (22).
Continuing this process, we obtain a sequence of controls and functionals of the form (43), where the matrices are the solutions of the boundary-value problem (39)–(41) at each iteration.
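The iteration just described is, in modern terminology, a policy iteration: fixing a gain makes the Bellman equation linear in the cost, and minimizing over the control updates the gain. A minimal sketch for a scalar problem follows; the dynamics dx = (a x + b u) dt + σ x dw, the quadratic costs, and all numerical values are assumptions chosen only to make the scheme runnable.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Bellman successive-approximation sketch for a scalar problem
#   dx = (a x + b u) dt + sigma x dw,
#   J = E[ int_0^T (q x^2 + c u^2) dt + f x(T)^2 ]
# All numerical values are illustrative assumptions.
a, b, sigma, q, c, f, T = -0.5, 1.0, 0.2, 1.0, 1.0, 1.0, 5.0
grid = np.linspace(0.0, T, 201)

def evaluate_policy(k_of_t):
    """For a fixed time-varying gain u = k(t) x, the cost coefficient P(t)
    solves a *linear* Lyapunov-type ODE backwards from P(T) = f."""
    rhs = lambda t, P: -((2.0 * (a + b * k_of_t(t)) + sigma**2) * P
                         + q + c * k_of_t(t)**2)
    sol = solve_ivp(rhs, [T, 0.0], [f], dense_output=True, max_step=0.05)
    return lambda t: sol.sol(t)[0]

k = lambda t: 0.0                      # zero approximation: any bounded gain
for _ in range(10):                    # Bellman iterations
    P = evaluate_policy(k)
    Pg = np.array([P(t) for t in grid])
    # policy improvement: minimize the Bellman equation over u
    k = lambda t, Pg=Pg: -b * np.interp(t, grid, Pg) / c
print("P(0) after iterations:", evaluate_policy(k)(0.0))
```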
For each approximation, the estimate (44) holds.
The convergence of the functionals to the Bellman functional, of the controls to the optimal control, and of the sequence of matrices to the solution of the boundary-value problem can be proved using (44) [17].
The error of the n-th approximation admits the estimate (46).
Thus, we state the following theorem.
Theorem 5. The approximate solution of the optimal control synthesis problem for the problem (30)–(32) is carried out using successive Bellman approximations, where the n-th approximation of the optimal control and the Bellman functional on each interval is given by the formula (43). In this case, the error is estimated by the inequality (46).

7. Model Example
Consider the system (46) with the initial condition (47). Here, the perturbation is a semi-Markov process with two states, with prescribed transition probabilities for the nested Markov chain and a discrete residence time in each state that is bounded below by a positive constant.
The matrices from the quality functional (32) are taken to be given constant matrices.
The Bellman functional is sought in the form (34).
In this case, the system (39)–(41) takes an explicit form with the corresponding terminal boundary condition, which, after introducing suitable notation, can be rewritten in a reduced form. The optimal control is then given by the feedback formula (33) with the computed matrices.
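A rough numerical reading of this example is sketched below. Since the displayed matrices and distributions are specific to the paper, the sketch substitutes hypothetical scalar data: a two-regime SDE with discrete residence times of at least one time unit, and a single averaged Riccati equation in place of the coupled per-state system (39)–(41). It is meant only to reproduce the qualitative comparison of Figure 1, not the paper's exact numbers.

```python
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(7)

# Hypothetical stand-in for the example: scalar regime-switching SDE
# dx = (a[xi] x + b u) dt + s[xi] x dw with discrete residence times >= 1.
a, b, s = {0: 0.4, 1: -0.2}, 1.0, {0: 0.25, 1: 0.25}
q, c, f, T, dt = 1.0, 1.0, 1.0, 10.0, 1e-3

# Riccati equation (cf. (39)-(41)); for brevity a single averaged regime
# is used instead of the coupled per-state system.
abar, sbar = np.mean(list(a.values())), 0.25
riccati = lambda t, P: -(2*abar*P + sbar**2*P + q - b**2*P**2/c)
Psol = solve_ivp(riccati, [T, 0.0], [f], dense_output=True).sol

def path(controlled):
    t, x, xi, out = 0.0, 1.0, 0, [(0.0, 1.0)]
    next_switch = 1.0 + rng.integers(0, 3)        # discrete residence time >= 1
    while t < T:
        if t >= next_switch:
            xi = 1 - xi
            next_switch = t + 1.0 + rng.integers(0, 3)
        u = -b * Psol(t)[0] * x / c if controlled else 0.0
        x += (a[xi]*x + b*u)*dt + s[xi]*x*rng.normal(0, np.sqrt(dt))
        t += dt
        out.append((t, x))
    return np.array(out)

uncontrolled, controlled = path(False), path(True)
```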
Simulated trajectories of the solution of (46) and (47), both without control and under the optimal control, are shown in Figure 1.
8. Discussion
The main focus of this paper is on the theoretical derivation of the optimal control for stochastic differential equations in the presence of external perturbations described by semi-Markov processes. This generalization allows us to describe the dynamics of real processes more accurately under various restrictions on the time spent in states, which is impossible in the case of a Markov process. In Theorem 2, we find an explicit form of the infinitesimal operator, determined by the coefficients of the original equation and the characteristics of the semi-Markov process. This representation allows us to synthesize the optimal control based on the Bellman equation (18) with the boundary condition (19). For the linear system (30), the search for the optimal control reduces to solving the Riccati equation (39), which also arises in the presence of a Markovian external perturbation.
As we can see from Figure 1, the controlled solution visually resembles an asymptotically stable one, but Figure 2 shows that the solution is not asymptotically stable. Thus, asymptotic stability may not be achieved by constructing an optimal control. The problems of stability and stabilization of the solution will be the subject of subsequent articles.
In subsequent works on dynamical systems with semi-Markov external perturbations, the main focus will be on taking into account the ergodic properties of the semi-Markov process when analyzing the asymptotic behavior of the system. In contrast to systems with Markov external switches, where the ergodic properties were described on the basis of the transition intensities, in the semi-Markov case conditions on the residence times and on the jumps will play an important role. Consequently, parameter estimation for model (2) will involve not only an estimate of the coefficients but also an estimate of the distribution of the residence time in the states. Therefore, the following algorithm can be proposed for system analysis and parameter estimation (a sketch of the first step is given after the list):
1. Estimation of the switching moments; this can be realized using a generalized unit-root test developed for time series [22];
2. Estimation of the state space of the semi-Markov process;
3. Estimation of the coefficients of the SDE (2).
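A minimal sketch of step 1 follows. The paper's recommended tool is a generalized unit-root test [22]; since that test is not reproduced here, the sketch uses a simple rolling-window mean-shift heuristic on the increments as an illustrative stand-in, with the window length and threshold as assumed tuning parameters.

```python
import numpy as np

def detect_switches(x, window=200, thresh=3.0):
    """Crude stand-in for step 1: flag candidate switching moments where
    the rolling mean of the increments shifts by more than `thresh`
    standard errors.  (A CUSUM-style heuristic, not the generalized
    unit-root test [22].)  Adjacent flags should be merged in practice."""
    dx = np.diff(x)
    switches = []
    for i in range(window, len(dx) - window):
        left, right = dx[i - window:i], dx[i:i + window]
        se = np.sqrt(left.var() / window + right.var() / window)
        if se > 0 and abs(right.mean() - left.mean()) > thresh * se:
            switches.append(i)   # index in the increment series
    return switches
```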
The presented framework of stochastic dynamic systems with semi-Markov parameters offers a promising tool for systems biology and systems medicine. In systems biology, it can help model complex molecular interactions, such as oxidative stress and mitochondrial dysfunction, influenced by stochastic perturbations. In systems medicine, this approach supports personalized treatment strategies by capturing patient-specific dynamics. For instance, it can predict disease progression and optimize therapies in conditions like Parkinson’s Disease. By integrating theoretical modeling with clinical data, this framework bridges the gap between understanding disease mechanisms and advancing precision medicine.