Information Geometry Control under the Laplace Assumption

Guel-Cortez, Adrian-Josue; Kim, Eun-jin

doi:10.3390/psf2022005025


by Adrian-Josue Guel-Cortez * and Eun-jin Kim

Centre for Fluid and Complex Systems, Coventry University, Coventry CV1 2TT, UK

* Author to whom correspondence should be addressed.

† Presented at the 41st International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering, Paris, France, 18–22 July 2022.

Phys. Sci. Forum 2022, 5(1), 25; https://doi.org/10.3390/psf2022005025

Published: 12 December 2022

(This article belongs to the Proceedings of The 41st International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering)


Abstract:

By combining information science and differential geometry, information geometry provides a geometric method to measure the differences in the time evolution of the statistical states in a stochastic process. Specifically, the so-called information length (the time integral of the information rate) describes the total amount of statistical changes that a time-varying probability distribution takes through time. In this work, we outline how the application of information geometry may permit us to create energetically efficient and organised behaviour artificially. Specifically, we demonstrate how nonlinear stochastic systems can be analysed by utilising the Laplace assumption to speed up the numerical computation of the information rate of stochastic dynamics. Then, we explore a modern control engineering protocol to obtain the minimum statistical variability while analysing its effects on the closed-loop system’s stochastic thermodynamics.

Keywords:

information geometry; non-linear stochastic systems; information length; stochastic thermodynamics

1. Introduction

Stochastic systems are ubiquitous and include a large set of complex systems, such as the time evolution of molecular motors [1], the stock market [2], decision making [3], population dynamics [4] or engineering systems with parameter uncertainties [5]. The description of stochastic dynamics commonly involves the calculation of time-varying probability density functions (PDFs) governed by a Fokker–Planck equation and its corresponding stochastic differential equation [1]. Since a time-varying PDF describes all the possible trajectories the stochastic system can take in time, such a formalism has been advantageously applied in emerging fields such as stochastic thermodynamics [6] or inference control [7].

As time-varying PDFs contain enormous dynamical information, defining a metric of the path’s length that a stochastic system takes through time can bring benefits when, for example, designing “efficient” systems (for instance, see [8]). In this regard, the field of information geometry has brought to light a true metric of the differences in the time evolution of the statistical states in a stochastic process [9,10]. Specifically, the concept of information length (IL) [11,12], given by the time integral of the information rate of the stochastic dynamics, describes the total amount of statistical changes that a time-varying probability distribution takes through time. Previous works showed that IL provides a link between stochastic processes, complexity and geometry [12]. Additionally, IL has been applied to the quantification of hysteresis in forward–backward processes [8,13], correlation and self-regulation among different players [13], phase transitions [14], and prediction of sudden events [15]. It is worth noting that in nonlinear stochastic systems, IL is generally difficult to obtain because the analytical/numerical solution of the Fokker–Planck equation and the execution of stochastic simulations are usually complicated and computationally costly. Hence, analytical simplifications are advantageous to ease IL’s calculation while increasing the possibility of applying IL to broader practical scenarios [16].

Even though the information rate may seem a purely statistical quantity, its meaning can be understood in relation to thermodynamics [11,12]. This result is of great advantage, as we may use it to quantify, for example, the effects that a minimum statistical variability (constant information rate) control could have on the system’s energetic behaviour. Note that the idea of thermodynamics-informed control systems is not new. For instance, since the beginning of the 21st century, various works have proposed entropy-informed control protocols to generate “intelligent/efficient” systems (for further details, see [17] and the references therein). Yet, describing the effects of an information geometry-informed control protocol on the system’s stochastic thermodynamics remains to be carried out.

In this regard, we consider the application of the so-called Laplace assumption (Gaussian approximation of the system’s time-varying PDF) to the computation of IL and the information rate for a set of nonlinear stochastic differential equations. By using this assumption, we derive the values of the entropy rate, entropy production and entropy flow and their relation to the information rate. Then, we formulate an optimisation problem for the minimum information variability control and study the closed-loop stochastic dynamics and thermodynamics in a numerical example, thereby creating a connection between information geometry, stochastic thermodynamics and control engineering (Figure 1).

To help readers, in the following, we summarise our notations.

R is the set of real numbers; x \in R^{n} represents a column vector x of real numbers of dimension n; A \in R^{n \times n} represents a real matrix of dimension n \times n (bold-face letters are used to represent vectors and matrices); Tr (A) corresponds to the trace of the matrix A; | A |, vec (A), A^{⊤} and A^{- 1} are the determinant, vectorisation, transpose and inverse of matrix A, respectively. The value I_{n} denotes the identity matrix of order n. Newton’s notation is used for the partial derivative with respect to the variable t (i.e., \frac{\partial y}{\partial t} = \dot{y}). Finally, the average of a random vector x is denoted by μ : = 〈 x 〉, the angular brackets representing the average.

2. Model

Throughout this work, the following set of nonlinear Langevin equations is considered

\dot{x} = f (x, u) + ξ .

(1)

Here, f : R^{n} \times R \to R^{n} is a function taking as input a vector x \in R^{n} and a bounded smooth time-dependent deterministic function u (t) \in R to output a vector f (x, u) \in R^{n} with elements f_{i} (i = 1, 2, \dots, n); ξ \in R^{n} is a Gaussian stochastic noise given by an n-dimensional vector of δ-correlated Gaussian noises ξ_{i} (i = 1, 2, \dots, n), with the following statistical property

〈 ξ_{i} (t) 〉 = 0, 〈 ξ_{i} (t) ξ_{j} (t_{1}) 〉 = 2 D_{i j} (t) δ (t - t_{1}), D_{i j} (t) = D_{j i} (t), \forall i, j = 1, \dots, n .

(2)

The Fokker–Planck equation of (1) is

\dot{p} = - \nabla \cdot J = - \nabla \cdot f p + \nabla \cdot D \nabla p = - \sum_{i = 1}^{n} \partial_{x_{i}} (f_{i} p) + \sum_{i, j = 1}^{n} (\partial_{x_{i}} D_{i j} \partial_{x_{j}}) p,

(3)

where J = [f_{1} p - D_{1} \nabla p, \dots, f_{n} p - D_{n} \nabla p] is the probability current, with D_{i} denoting the i-th row of D.

2.1. The Laplace Assumption

The Laplace assumption allows us to describe the solution of (3) through a fixed multivariable Gaussian distribution given by [4]

p (x; t) = \frac{1}{\sqrt{| 2 π Σ |}} e^{Q (x; t)},

(4)

where

Q (x; t) = - \frac{1}{2} {(x - μ (t))}^{⊤} Σ^{- 1} (t) (x - μ (t))

, and

μ (t) \in R^{n}

and

Σ (t) \in R^{n \times n}

are the mean and covariance value of the random variable

x

. The value of the mean

μ (t)

and covariance matrix

Σ (t)

can be obtained from the following result.

Proposition 1

(The Laplace assumption). Under the Laplace assumption, the dynamics of the mean μ and covariance Σ at any time t of a nonlinear stochastic differential system (1) are governed by the following differential equations

\begin{matrix} \dot{μ} & = & {[f_{1} (μ, u) + \frac{1}{2} Tr (Σ H_{f_{1}}), f_{2} (μ, u) + \frac{1}{2} Tr (Σ H_{f_{2}}), \dots, f_{n} (μ, u) + \frac{1}{2} Tr (Σ H_{f_{n}})]}^{⊤} \end{matrix}

(5)

\begin{matrix} \dot{Σ} & = & J_{f} Σ + Σ J_{f}^{⊤} + D + D^{⊤} \end{matrix}

(6)

where

H_{f_{i}}

is the Hessian matrix of the function

f_{i} (x, u)

, and

J_{f}

is the Jacobian of the function

f (x, u)

.

Proof.

See Appendix A. □

Note that when

f_{i} (x; t)

in (1) is a linear function defined as

f_{i} (x; t) : = A_{i} x (t) + B_{i} u (t) = \sum_{j = 1}^{n} a_{i j} x_{j} (t) + \sum_{j}^{p} b_{i j} u_{j} (t),

(7)

i.e., we consider a set of particles driven by a harmonic potential (A_{i} x (t)) and a deterministic force (B_{i} u (t)), the value H_{f_{i}} = 0, meaning that the mean value μ is not affected by the covariance matrix Σ.

Limits of the Laplace Assumption

Since the Laplace assumption does not always hold, we first examine its limitations by considering the following cubic stochastic differential equation

\dot{x} (t) = - γ x {(t)}^{3} + u (t) + ξ (t),

(8)

where

〈 ξ 〉 = 0

,

〈 ξ (t) ξ (t^{'}) 〉 = 2 D δ (t - t^{'})

and

γ \in R^{+}

. We denote by

q (x, t)

the approximation based on the Laplace assumption, which takes the following Gaussian form

q (x, t) = \frac{1}{\sqrt{2 π Σ}} e^{- \frac{{(x - μ)}^{2}}{2 Σ}},

(9)

where

μ

and

Σ

are determined by the solution of

\begin{matrix} \dot{μ} & = & - γ μ^{3} + u - 3 γ μ Σ, \end{matrix}

(10)

\begin{matrix} \dot{Σ} & = & - 6 γ Σ μ^{2} + 2 D . \end{matrix}

(11)
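The moment equations (10) and (11) are ordinary differential equations, so they can be integrated with any standard scheme. The following is a minimal sketch, assuming a classical fourth-order Runge–Kutta integrator; the function names, the default step size and the callable `u` are illustrative choices and are not taken from the paper.

```python
def laplace_cubic_rhs(mu, sigma, u, gamma, D):
    """Right-hand sides of Equations (10) and (11) for the cubic system (8)."""
    dmu = -gamma * mu**3 + u - 3.0 * gamma * mu * sigma
    dsigma = -6.0 * gamma * sigma * mu**2 + 2.0 * D
    return dmu, dsigma


def integrate_moments(mu0, sigma0, gamma, D, u=lambda t: 0.0, dt=1e-3, steps=1000):
    """Integrate (10)-(11) with classical RK4 (an assumed scheme).

    Returns three lists: times, means and variances.
    """
    t, mu, sigma = 0.0, mu0, sigma0
    ts, mus, sigmas = [t], [mu], [sigma]
    for _ in range(steps):
        k1m, k1s = laplace_cubic_rhs(mu, sigma, u(t), gamma, D)
        k2m, k2s = laplace_cubic_rhs(mu + 0.5 * dt * k1m, sigma + 0.5 * dt * k1s,
                                     u(t + 0.5 * dt), gamma, D)
        k3m, k3s = laplace_cubic_rhs(mu + 0.5 * dt * k2m, sigma + 0.5 * dt * k2s,
                                     u(t + 0.5 * dt), gamma, D)
        k4m, k4s = laplace_cubic_rhs(mu + dt * k3m, sigma + dt * k3s,
                                     u(t + dt), gamma, D)
        mu += dt * (k1m + 2.0 * k2m + 2.0 * k3m + k4m) / 6.0
        sigma += dt * (k1s + 2.0 * k2s + 2.0 * k3s + k4s) / 6.0
        t += dt
        ts.append(t)
        mus.append(mu)
        sigmas.append(sigma)
    return ts, mus, sigmas
```

A simple sanity check is the limit γ = 0 with u = 0, where the mean stays constant and the variance grows linearly as Σ(t) = Σ(0) + 2Dt.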

To obtain the actual PDF \tilde{p} (x, t) of system (8), we use stochastic simulations and kernel density estimators (for further details, see [18]). To highlight the limits of the Gaussian approximation q (x, t), we apply the Kullback–Leibler (KL) divergence D_{K L}, or relative entropy, between the estimated PDF \tilde{p} and the Gaussian approximation q of the time-varying PDF of system (8), defined as

D_{K L} (\tilde{p} | | q) = \int_{R} \tilde{p} (x; t) log (\frac{\tilde{p} (x; t)}{q (x; t)}) d x .

(12)
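Equation (12) can be approximated numerically once both densities are sampled on a common grid. The sketch below assumes a uniform grid and a plain Riemann sum; in practice the first density would come from a kernel density estimate of simulated trajectories, whereas here a Gaussian stands in for it so that the example is self-contained. The function names are hypothetical.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """1-D Gaussian density with mean mu and variance sigma, as in Equation (9)."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma)) / math.sqrt(2.0 * math.pi * sigma)


def kl_on_grid(p_vals, q_vals, dx, eps=1e-300):
    """Riemann-sum estimate of D_KL(p || q) for densities sampled on a uniform grid.

    Grid points where p is (numerically) zero contribute nothing to the sum.
    """
    total = 0.0
    for p, q in zip(p_vals, q_vals):
        if p > eps:
            total += p * math.log(p / max(q, eps)) * dx
    return total


# Stand-in densities: p = N(0, 1) and q = N(1, 2), sampled on [-10, 10].
dx = 0.01
grid = [-10.0 + dx * k for k in range(2001)]
p_vals = [gaussian_pdf(x, 0.0, 1.0) for x in grid]
q_vals = [gaussian_pdf(x, 1.0, 2.0) for x in grid]
kl_estimate = kl_on_grid(p_vals, q_vals, dx)
```

For two Gaussians the result can be compared with the closed-form value, which for this choice of parameters equals ln(2)/2.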

Figure 2 shows the KL divergence through time between \tilde{p} and q when changing the parameters γ and D in Equation (8). The result shows that a valid Laplace assumption (LA) requires a small damping (slow behaviour) and a wider noise amplitude in comparison with the initial value of Σ.

3. Stochastic Thermodynamics

Stochastic thermodynamics uses stochastic calculus to draw a connection between the “micro/mesoscopic stochastic dynamics” and the “macroscopic thermodynamics” [1,6]. In physical terms, this means that stochastic thermodynamics describes the interaction of a micro/mesoscopic system with one or multiple reservoirs (for instance, the dynamics of a Brownian particle suspended in a fluid in thermodynamic equilibrium, described by a Langevin/Fokker–Planck equation).

3.1. Entropy Rate

Given a time-varying multivariable PDF

p (x; t)

, we can calculate the entropy rate, a fundamental concept of stochastic thermodynamics, as follows [20]

\dot{S} (t) = \frac{d}{d t} S (t) = - \int_{R^{n}} \dot{p} (x; t) \ln (p (x; t)) d x = Π - Φ .

(13)

Proposition 2.

Under the Laplace assumption, the values of the entropy rate \dot{S}, entropy production Π and entropy flow Φ are given by

\begin{matrix} \dot{S} & = & Tr (Σ^{- 1} D) + Tr (J_{f}) = \frac{1}{2} Tr (Σ^{- 1} \dot{Σ}), \end{matrix}

(14)

\begin{matrix} Π & = & Tr (Σ^{- 1} D) + Tr (f {(μ, u)}^{⊤} D^{- 1} f (μ, u)) + Tr (J_{f} D^{- 1} J_{f}^{⊤} Σ) + 2 Tr (J_{f}), \end{matrix}

(15)

\begin{matrix} Φ & = & Tr (f {(μ, u)}^{⊤} D^{- 1} f (μ, u)) + Tr (J_{f} D^{- 1} J_{f}^{⊤} Σ) + Tr (J_{f}) . \end{matrix}

(16)

Proof.

See Appendix B. □
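For a one-dimensional system the traces in Equations (14)–(16) reduce to scalars, which makes the proposition easy to evaluate. The following is a minimal sketch under that assumption (n = 1, D > 0); the function name and argument names are illustrative and not part of the paper.

```python
def laplace_thermodynamics_1d(f_mu, jac_mu, sigma, D):
    """Scalar form of Equations (14)-(16).

    f_mu   : drift f evaluated at the mean
    jac_mu : derivative df/dx evaluated at the mean
    sigma  : variance (must be positive)
    D      : noise amplitude (must be positive)
    Returns the tuple (entropy rate, entropy production, entropy flow).
    """
    if sigma <= 0.0 or D <= 0.0:
        raise ValueError("sigma and D must be positive")
    s_dot = D / sigma + jac_mu
    pi = D / sigma + f_mu**2 / D + jac_mu**2 * sigma / D + 2.0 * jac_mu
    phi = f_mu**2 / D + jac_mu**2 * sigma / D + jac_mu
    return s_dot, pi, phi


# Cubic drift of Equation (8) with u = 0, using the parameter values quoted for Figure 2.
gamma, mu, sigma, D = 0.01, 5.0, 0.01, 0.01
s_dot, pi, phi = laplace_thermodynamics_1d(-gamma * mu**3, -3.0 * gamma * mu**2, sigma, D)
```

By construction the three outputs satisfy the balance of Equation (13), i.e., the entropy rate equals the entropy production minus the entropy flow.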

3.2. Example

To illustrate the application of Proposition 2, consider the following Langevin form of the Duffing equation

\begin{matrix} \dot{x} (t) & = v (t) + ξ_{1} (t) \\ \dot{v} (t) & = - δ v (t) - α x (t) - β x {(t)}^{3} + γ cos (ω t) + ξ_{2} (t) \end{matrix},

(17)

where

x (t)

is the displacement at time t,

v (t) = \dot{x} (t)

is the first derivative of x with respect to time, i.e., velocity,

ξ

is a delta correlated noise, and the values

δ, α, β, γ

and

ω

are given constants.
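As a worked instance of Proposition 1, the moment equations for the Duffing system (17) can be written out explicitly. The block below is a sketch obtained by inserting the Jacobian and Hessians of (17) into Equations (5) and (6); the entries D_{11}, D_{12} and D_{22} of the noise matrix are written generically, since the text does not fix them for this example, and the subscripts x and v on the mean are notation introduced here.

```latex
% State [x, v]^T with f_1 = v and f_2 = -\delta v - \alpha x - \beta x^3 + \gamma\cos(\omega t)
J_f = \begin{bmatrix} 0 & 1 \\ -\alpha - 3\beta\mu_x^{2} & -\delta \end{bmatrix},
\qquad H_{f_1} = 0,
\qquad H_{f_2} = \begin{bmatrix} -6\beta\mu_x & 0 \\ 0 & 0 \end{bmatrix},

\begin{aligned}
\dot{\mu}_x &= \mu_v, \\
\dot{\mu}_v &= -\delta\mu_v - \alpha\mu_x - \beta\mu_x^{3} + \gamma\cos(\omega t) - 3\beta\mu_x\Sigma_{11}, \\
\dot{\Sigma}_{11} &= 2\Sigma_{12} + 2D_{11}, \\
\dot{\Sigma}_{12} &= \Sigma_{22} - (\alpha + 3\beta\mu_x^{2})\Sigma_{11} - \delta\Sigma_{12} + 2D_{12}, \\
\dot{\Sigma}_{22} &= -2(\alpha + 3\beta\mu_x^{2})\Sigma_{12} - 2\delta\Sigma_{22} + 2D_{22}.
\end{aligned}
```

The only coupling from the covariance back to the mean is the term proportional to β, which vanishes for the linear oscillator, in agreement with the remark following Equation (7).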

Figure 3 shows a simulation of (17) using the deterministic equations of the mean vector

μ = {[〈 x 〉, 〈 v 〉]}^{⊤}

and covariance matrix

Σ

as described by Proposition 1. Specifically, Figure 3a includes the time evolution of the random variables x and v with its phase portrait, and the time evolution of

Σ_{11}, Σ_{12}

and

Σ_{22}

. Figure 3b shows the time evolution of the system’s stochastic thermodynamics including the entropy rate

\dot{S}

, the entropy production

Π

and the entropy flow

Φ

. In all subplots, time is scaled by the factor

T = 2 π / ω

.

More importantly, Figure 3 shows that, via Propositions 1 and 2, it is possible to describe the thermodynamics of a nonlinear stochastic system at every instant of time, provided that the Laplace assumption holds. Hence, as will be discussed in Section 5, Propositions 1 and 2 allow us to perceive the effects of a control protocol on the closed-loop system thermodynamics.

4. Information Length and Information Rate

For a time-varying multivariable PDF

p (x; t)

, we define its IL

L

as [15,16]

L (t) = \int_{0}^{t} (\sqrt{\int_{R^{n}} p (x; τ) {[\partial_{τ} ln p (x; τ)]}^{2} d x}) d τ = \int_{0}^{t} Γ (τ) d τ,

(18)

where

Γ

is called the information rate. The value of

Γ^{2}

can be understood as the Fisher information where the time is the control parameter [12]. Since

Γ

gives the rate of change of

p (x; t)

, its time integral

L

quantifies the amount of statistical changes that the system goes through in time from the initial PDF

p (x; 0)

to a final PDF

p (x; t)

[16].

Under the Laplace assumption, i.e., when

p (x; t)

is a Gaussian PDF, the value of information rate

Γ

of the joint PDF takes the compact form [11,16]

Γ^{2} = {\dot{μ}}^{⊤} Σ^{- 1} \dot{μ} + \frac{1}{2} Tr ({(Σ^{- 1} \dot{Σ})}^{2}),

(19)

where the time derivatives \dot{μ} and \dot{Σ} are given by Equations (5) and (6), respectively.
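In one dimension, Equation (19) reduces to a sum of two scalar terms, and Equation (18) can then be approximated by numerical quadrature. The sketch below assumes the scalar case and a trapezoidal rule; both function names are illustrative.

```python
import math


def information_rate_sq(mu_dot, sigma, sigma_dot):
    """Scalar form of Equation (19): mu_dot^2 / Sigma + (Sigma_dot / Sigma)^2 / 2."""
    return mu_dot**2 / sigma + 0.5 * (sigma_dot / sigma) ** 2


def information_length(gamma_sq_series, dt):
    """Trapezoidal approximation of Equation (18) from uniformly spaced samples of Gamma^2."""
    rates = [math.sqrt(g) for g in gamma_sq_series]
    return sum(0.5 * (a + b) * dt for a, b in zip(rates, rates[1:]))


# Pure diffusion (zero drift): Sigma(t) = Sigma0 + 2 D t, mu_dot = 0, Sigma_dot = 2 D.
sigma0, D, dt = 1.0, 0.5, 1e-3
series = [information_rate_sq(0.0, sigma0 + 2.0 * D * k * dt, 2.0 * D) for k in range(1001)]
length = information_length(series, dt)
```

For this pure-diffusion case the integral has the closed form ln(Σ(t)/Σ(0))/√2, which gives a convenient reference value for checking the quadrature.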

Relation with Stochastic Thermodynamics

Considering a fully decoupled nonlinear stochastic system and using the Laplace assumption, the value of the information rate Γ^{2} is related to the entropy production Π and the entropy rate \dot{S} as follows

Γ^{2} = \sum_{i = 1}^{n} \frac{D_{i i}}{Σ_{i i}} Π_{i} + \sum_{i = 1}^{n} {\dot{S}}_{i}^{2} + \frac{1}{2} \sum_{i = 1}^{n} H_{f_{i}} ({\dot{μ}}_{i} + f_{i} (μ_{i}, u)),

(20)

where Π_{i} and {\dot{S}}_{i} are the entropy production and entropy rate from the marginal PDF p (x_{i}, t) of x_{i}, and H_{f_{i}} = \frac{\partial^{2}}{\partial x_{i}^{2}} f_{i} (μ_{i}, t). If f_{i} describes a harmonic potential (7), then H_{f_{i}} = 0 and (20) reduces to the expression

Γ^{2} = \sum_{i = 1}^{n} \frac{D_{i i}}{Σ_{i i}} Π_{i} + \sum_{i = 1}^{n} {\dot{S}}_{i}^{2} .

(21)

Note that Equation (21) gives us a case where a minimum information length

L

would produce both a minimum entropy production/rate and a minimum statistical variability behaviour.
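Relation (20) can be checked numerically in the scalar case by evaluating both sides from the expressions of Propositions 1 and 2. The following sketch does this for the cubic drift of Equation (8); the function name and the test values are arbitrary illustrative choices.

```python
def relation_20_sides(mu, sigma, u, gamma, D):
    """Return (lhs, rhs) of Equation (20) for n = 1 and the cubic drift of Equation (8)."""
    f = -gamma * mu**3 + u          # drift at the mean
    jac = -3.0 * gamma * mu**2      # first derivative of the drift at the mean
    hess = -6.0 * gamma * mu        # second derivative of the drift at the mean

    # Laplace moment dynamics, Equations (5) and (6) in one dimension.
    mu_dot = f + 0.5 * sigma * hess
    sigma_dot = 2.0 * jac * sigma + 2.0 * D

    # Left-hand side: information rate squared, Equation (19).
    lhs = mu_dot**2 / sigma + 0.5 * (sigma_dot / sigma) ** 2

    # Right-hand side: entropy rate (14) and entropy production (15) inserted in (20).
    s_dot = D / sigma + jac
    pi = D / sigma + f**2 / D + jac**2 * sigma / D + 2.0 * jac
    rhs = (D / sigma) * pi + s_dot**2 + 0.5 * hess * (mu_dot + f)
    return lhs, rhs


lhs, rhs = relation_20_sides(mu=2.0, sigma=0.3, u=0.5, gamma=0.1, D=0.2)
```

Both sides agree up to floating-point rounding, and setting gamma to zero removes the Hessian term so that the same function also illustrates the reduced form (21).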

5. Minimum Variability Control

To impose a minimum statistical variability when going from an initial to a desired state (for instance, see [21]), we propose the optimisation problem with the following cost function

c = arg min_{\tilde{c}} (J = \int_{0}^{t_{f}} ({(Γ (t) - Γ (0))}^{2} + {(Y (t) - Y_{d})}^{⊤} Q (Y (t) - Y_{d}) + \tilde{c} {(t)}^{⊤} R \tilde{c} (t)) d t)

(22)

where

Y (t) : = {[μ (t), vec (Σ (t))]}^{⊤}, Y_{d} : = {[μ_{d}, vec (Σ_{d})]}^{⊤}, \tilde{c} (t) : = {[u (t), vec (D (t))]}^{⊤},

and Q and R are square weighting matrices of dimension n + n^{2} and 1 + n^{2}, respectively, matching the lengths of Y and \tilde{c}. The solution c (t) corresponds to the control vector that allows us to obtain the minimum statistical variability. In Equation (22), the term {(Γ (t) - Γ (0))}^{2} penalises deviations of the information rate from its initial value, keeping the information variability constant. The term involving Q drives the system towards a given PDF defined by Y_{d}. The term containing R on the right-hand side of (22) regularises the control action c to avoid abrupt changes in the inputs. Note that the values of Σ, μ and Γ can be easily computed for any nonlinear stochastic process through Proposition 1, i.e., under the Laplace assumption. A control obtained by solving (22) is called an information length quadratic regulator (IL-QR).
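The integrand of the cost function (22) is straightforward to evaluate once Γ, Y and the control vector are known at a given time. The sketch below assumes plain Python lists for vectors and matrices and a left Riemann sum for the time integral; the helper names are hypothetical, and the weights in the usage lines are the values quoted in Section 5.2.

```python
def quad_form(v, M):
    """Return v^T M v for a vector v (list) and a square matrix M (list of lists)."""
    n = len(v)
    return sum(v[i] * M[i][j] * v[j] for i in range(n) for j in range(n))


def stage_cost(info_rate, info_rate0, Y, Y_d, c, Q, R):
    """Integrand of Equation (22) at a single time instant."""
    err = [y - yd for y, yd in zip(Y, Y_d)]
    return (info_rate - info_rate0) ** 2 + quad_form(err, Q) + quad_form(c, R)


def total_cost(stage_costs, dt):
    """Left Riemann sum approximating the time integral in Equation (22)."""
    return dt * sum(stage_costs)


# Scalar system (n = 1): Y = [mu, Sigma], c = [u, D].
Q = [[1e2, 0.0], [0.0, 8e2]]
R = [[1e-5, 0.0], [0.0, 1e-5]]
example = stage_cost(1.5, 1.0, [3.0, 0.5], [2.0, 0.5], [1.0, 2.0], Q, R)
```

With these weights the tracking term dominates the stage cost, while the small R keeps the penalty on the inputs light.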

5.1. Model Predictive Control

A solution to the proposed optimisation problem (22) can be obtained by one of the most popular optimisation-based control techniques currently available—the so-called model-predictive-control (MPC) scheme [22]. Generally, MPC is an online optimisation algorithm for constrained control problems whose benefits have been recognized in applications to robotics [23], solar energy [24] or bioengineering [25]. Furthermore, MPC has the advantage of being easily implemented owing to packages such as CasADi [26] or the Hybrid Toolbox [27].

Figure 4 briefly details the working principle of the MPC’s optimiser in the form of a block diagram. The MPC method consists of utilising a prediction model to solve the optimisation problem in a finite horizon. Then, the optimal solution is applied to the system in real-time. Finally, the system’s output is fed back to the MPC algorithm to start the optimisation procedure again. In this work, we ease the prediction and simulation of the stochastic process by employing the Laplace assumption.
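The receding-horizon loop described above can be sketched in a few lines for the cubic system (8). This is a deliberately simplified illustration and not the controller used in the paper: the nonlinear solver is replaced by an exhaustive search over a coarse grid of constant inputs, the cost keeps only an unweighted tracking term (the information-rate and R terms of (22) are omitted), the Laplace model is used both as prediction model and as plant, and the input grids are hypothetical bounds.

```python
GAMMA = 0.1  # cubic coefficient gamma, value taken from Section 5.2


def step(mu, sigma, u, D, dt):
    """One explicit-Euler step of the Laplace moment equations (10)-(11)."""
    dmu = -GAMMA * mu**3 + u - 3.0 * GAMMA * mu * sigma
    dsigma = -6.0 * GAMMA * sigma * mu**2 + 2.0 * D
    return mu + dt * dmu, sigma + dt * dsigma


def horizon_cost(mu, sigma, u, D, target, dt, horizon):
    """Predicted tracking cost when (u, D) is held constant over the horizon."""
    cost = 0.0
    for _ in range(horizon):
        mu, sigma = step(mu, sigma, u, D, dt)
        cost += (mu - target[0]) ** 2 + (sigma - target[1]) ** 2
    return cost


def mpc_control(mu, sigma, target, u_grid, d_grid, dt, horizon):
    """Pick the grid point (u, D) with the lowest predicted cost (grid search, not a solver)."""
    _, u_best, d_best = min(
        (horizon_cost(mu, sigma, u, D, target, dt, horizon), u, D)
        for u in u_grid
        for D in d_grid
    )
    return u_best, d_best


def run_closed_loop(mu, sigma, target, n_steps=300, dt=1e-2, horizon=5):
    """Receding-horizon loop: optimise, apply the first input, feed the state back."""
    u_grid = [-5.0 + 0.5 * k for k in range(21)]  # hypothetical input bounds
    d_grid = [0.1 * k for k in range(11)]         # hypothetical noise amplitudes, D >= 0
    history = [(mu, sigma)]
    for _ in range(n_steps):
        u, D = mpc_control(mu, sigma, target, u_grid, d_grid, dt, horizon)
        mu, sigma = step(mu, sigma, u, D, dt)     # the plant is the same Laplace model here
        history.append((mu, sigma))
    return history
```

Calling `run_closed_loop(2 + 5 / 6, 1 / 60, (2 + 1 / 30, 1 / 6))` uses the initial and desired states of Section 5.2; because the inputs are restricted to a grid, the controls switch between neighbouring grid values once the target is reached.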

5.2. Example

We now present an example of the application of the MPC method to obtain the minimum variability behaviour of a stochastic system. Figure 5 shows the IL-QR applied to the cubic stochastic process given by Equation (8), where the control vector and the state vector are given by

c = {[u, D]}^{⊤}

and

Y = {[μ, Σ]}^{⊤}

, respectively. In the simulation, the initial state is

Y (0) = {[2 + 5 / 6, 1 / (2 \times 30)]}^{⊤}

, while the desired state is

Y_{d} (t) = {[2 + 1 / 30, 1 / (2 \times 3)]}^{⊤}

. Additionally, we consider the parameters

γ = 0.1

,

T_{s} = 1 \times 10^{- 3}

,

N = 5

,

I_{L} = 1 \times 10^{3}

,

R = 1 \times 10^{- 5} I_{2}

,

Q_{12} = Q_{21} = 0, Q_{11} = 1 \times 10^{2}

and

Q_{22} = 8 \times 10^{2}

. Here,

T_{s}

is the integration time step, and N is the number of future time steps considered in the prediction model. The value of

Γ (0)

is imposed via the initial conditions and Equation (19).

In Figure 5a, we show the time evolution of the mean

μ

, the inverse temperature

β = \frac{1}{2 Σ}

, the input force u, the noise amplitude D, the information rate

Γ^{2}

and the information length

L

. We also show the PDF time evolution of the simulation model computed via the Laplace approximation (q) or via stochastic simulations (

\tilde{p}

) and the corresponding

K L

-divergence (12) between them. In the subplot of

μ

and

β

, the labels LA and SS stand for the Laplace assumption and stochastic simulations, respectively. Interestingly, the Laplace approximation works well when used as a prediction model in the MPC method. The controls exhibit a chattering effect (oscillations having a finite frequency and amplitude) similar to the one encountered when implementing other control methods, such as sliding mode control [28], when trying to keep the system in the desired state

Y_{d}

.

Figure 5b demonstrates the effects of controls (22) on the closed-loop system stochastic thermodynamics. The results show that at the desired state

Y_{d}

the value of

\dot{S}

oscillates around zero with a small amplitude. Since \dot{S} = Π - Φ by Equation (13), this means

Φ = Π

holds at some instants of time when

Y

reaches

Y_{d}

. In other words, all the energy is exchanged with the system’s environment when the control keeps

Y

on the nonequilibrium state

Y_{d}

.

6. Conclusions

In this work, we developed a new MPC-based control method to derive the evolution of a system with minimum information variability in systems governed by nonlinear stochastic differential equations. Specifically, we identified the limitations of the Laplace assumption and utilised it to reduce the computational cost of calculating the time-varying PDFs and to develop a prediction model in the MPC algorithm. We also derived the relations that permit us to analyse the controller effects on the closed-loop system’s thermodynamics.

In future work, we aim to apply our results for maximising the free-energy (minimum entropy production) [12] and for the analysis of the closed-loop stochastic thermodynamics in higher-order systems.

Author Contributions

Conceptualization, A.-J.G.-C. and E.-j.K.; methodology, A.-J.G.-C.; software, A.-J.G.-C.; validation, A.-J.G.-C. and E.-j.K.; formal analysis, A.-J.G.-C.; investigation, A.-J.G.-C.; resources, A.-J.G.-C. and E.-j.K.; data curation, A.-J.G.-C.; writing—original draft preparation, A.-J.G.-C.; writing—review and editing, A.-J.G.-C. and E.-j.K.; visualization, A.-J.G.-C.; supervision, E.-j.K.; project administration, E.-j.K.; funding acquisition, E.-j.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proof of Proposition 1

To prove Proposition 1, we start by defining the first two moments of the ensemble density

p (x)

. This is given as follows

\begin{matrix} {\dot{μ}}_{i} & = & \int_{R^{n}} x_{i} \dot{p} (x; t) d^{n} x, \end{matrix}

(A1)

\begin{matrix} {\dot{Σ}}_{i j} & = & \int_{R^{n}} {\bar{x}}_{i} {\bar{x}}_{j} \dot{p} (x; t) d^{n} x . \end{matrix}

(A2)

Here,

{\bar{x}}_{i} = x_{i} - μ_{i}

. Using (A1)–(A2) and (3), while avoiding the arguments for simplicity, we have

\begin{matrix} {\dot{μ}}_{i} & = & \int_{R^{n}} x_{i} [- \sum_{k = 1}^{n} \partial_{x_{k}} (f_{k} p) + \sum_{k, j = 1}^{n} (\partial_{x_{k}} D_{k j} \partial_{x_{j}}) p] d^{n} x, \\ & = & - \int_{R^{n}} x_{i} \partial_{x_{i}} (f_{i} p) d^{n} x + \int_{R^{n}} x_{i} \partial_{x_{i}} (\sum_{j = 1}^{n} D_{i j} \partial_{x_{j}} p) d^{n} x, \\ & = & \int_{R^{n}} f_{i} p d^{n} x = 〈 f_{i} 〉 . \end{matrix}

(A3)

\begin{matrix} {\dot{Σ}}_{i j} & = & \int_{R^{n}} {\bar{x}}_{i} {\bar{x}}_{j} [- \sum_{k = 1}^{n} \partial_{x_{k}} (f_{k} p) + \sum_{k, l = 1}^{n} (\partial_{x_{k}} D_{k l} \partial_{x_{l}}) p] d^{n} x, \\ & = & \int_{R^{n}} ({\bar{x}}_{j} f_{i} + {\bar{x}}_{i} f_{j}) p d^{n} x - \int_{R^{n}} {\bar{x}}_{j} (\sum_{l = 1}^{n} D_{i l} \partial_{x_{l}} p) d^{n} x - \int_{R^{n}} {\bar{x}}_{i} (\sum_{l = 1}^{n} D_{j l} \partial_{x_{l}} p) d^{n} x, \\ & = & 〈 {\bar{x}}_{j} f_{i} + {\bar{x}}_{i} f_{j} 〉 + D_{i j} + D_{j i} . \end{matrix}

(A4)

A closed-form solution to (A3)–(A4) can be obtained by exploiting the Laplace assumption; i.e., we recover the sufficient statistics (A1)–(A2) of system (1) through the first three terms of the nonlinear flow

f_{i} (x, u)

Taylor expansion around the expected state

μ

. This is given as follows

f_{i} (x, u) = f_{i} (μ, u) + \sum_{j = 1}^{n} \frac{\partial f_{i} (μ, u)}{\partial x_{j}} {\bar{x}}_{j} + \frac{1}{2} \sum_{j, k = 1}^{n} \frac{\partial^{2} f_{i} (μ, u)}{\partial_{x_{j}} \partial_{x_{k}}} {\bar{x}}_{j} {\bar{x}}_{k} + \dots

(A5)

Under Gaussian assumptions

〈 {\bar{x}}_{i} 〉 = 0

and

〈 {\bar{x}}_{i} {\bar{x}}_{j} 〉 = Σ_{i j}

and applying (A5) to (A3)–(A4), we have

\begin{matrix} {\dot{μ}}_{i} & = & 〈f_{i} (μ, u) + \sum_{j = 1}^{n} \frac{\partial f_{i} (μ, u)}{\partial x_{j}} {\bar{x}}_{j} + \frac{1}{2} \sum_{j, k = 1}^{n} \frac{\partial^{2} f_{i} (μ, u)}{\partial_{x_{j}} \partial_{x_{k}}} {\bar{x}}_{j} {\bar{x}}_{k}〉, \\ & = & f_{i} (μ, u) + \frac{1}{2} \sum_{j, k = 1}^{n} \frac{\partial^{2} f_{i} (μ, u)}{\partial_{x_{j}} \partial_{x_{k}}} Σ_{j k} . \end{matrix}

(A6)

\begin{matrix} {\dot{Σ}}_{i j} & = & 〈{\bar{x}}_{j} (f_{i} (μ, u) + \sum_{k = 1}^{n} \frac{\partial f_{i} (μ, u)}{\partial x_{k}} {\bar{x}}_{k}) + {\bar{x}}_{i} (f_{j} (μ, u) + \sum_{k = 1}^{n} \frac{\partial f_{j} (μ, u)}{\partial x_{k}} {\bar{x}}_{k})〉 + D_{i j} + D_{j i}, \\ & = & \sum_{k = 1}^{n} \frac{\partial f_{i} (μ, u)}{\partial x_{k}} Σ_{j k} + \sum_{k = 1}^{n} \frac{\partial f_{j} (μ, u)}{\partial x_{k}} Σ_{i k} + D_{i j} + D_{j i} . \end{matrix}

(A7)

Equations (A6)–(A7) are the expansion of the equations shown in Proposition 1. This finishes the proof.

Appendix B. Proof of Proposition 2

By substituting (3) in (13), we obtain

\frac{d}{d t} S (t) = \int_{R^{n}} (\sum_{i} \frac{\partial}{\partial x_{i}} J_{i} (x; t)) \ln (p (x; t)) d^{n} x = - \int_{R^{n}} \sum_{i} J_{i} (x; t) (\frac{\partial}{\partial x_{i}} \ln (p (x; t))) d^{n} x .

(A8)

Now, after substituting the i-th component of J in (A8), we have

\frac{d}{d t} S (t) = - \int_{R^{n}} \sum_{i} J_{i} (x; t) (\frac{f_{i} (x; t)}{D_{i i}} - \frac{J_{i} (x; t)}{D_{i i} p (x; t)} - \frac{\sum_{j \neq i} D_{i j} \frac{\partial}{\partial x_{j}} p (x; t)}{D_{i i} p (x; t)}) d^{n} x .

(A9)

From (A9), the entropy production rate of the system corresponds to the positive definite part

Π = \sum_{i} Π_{i} = \int_{R^{n}} \sum_{i} \frac{J_{i} {(x; t)}^{2}}{D_{i i} p (x; t)} d^{n} x,

(A10)

while the entropy flux (entropy from the system to the environment) is

Φ = \int_{R^{n}} \sum_{i} (\frac{J_{i} (x; t) f_{i} (x; t)}{D_{i i}} - \frac{\sum_{j \neq i} D_{i j} J_{i} (x; t) \frac{\partial}{\partial x_{j}} p (x; t)}{D_{i i} p (x; t)}) d^{n} x .

(A11)

In this paper, we focus on the case when

D_{i j} = 0

if

i \neq j

to simplify (A11) as

Φ = \sum_{i} Φ_{i} = \int_{R^{n}} \sum_{i} (\frac{J_{i} (x; t) f_{i} (x, t)}{D_{i i}}) d^{n} x .

(A12)

Notice that (A10)–(A11) require that

D_{i i} > 0

. If

D_{i i} = 0

, we have

Π_{i} = 0

and

Φ_{i} = 〈\frac{\partial f_{i} (x, t)}{\partial x_{i}}〉 .

(A13)

We start by applying the definitions of entropy production (A10) and entropy flux (A12), giving us

\begin{matrix} Π_{i} & = & \frac{1}{D_{i i}} 〈f_{i} {(x, t)}^{2}〉 + D_{i i} 〈{(\frac{\partial Q (x)}{\partial x_{i}})}^{2}〉 + 2 〈\frac{\partial f_{i} (x, t)}{\partial x_{i}}〉, \end{matrix}

(A14)

\begin{matrix} Φ_{i} & = & \frac{1}{D_{i i}} 〈 f_{i} {(x, t)}^{2} 〉 + 〈\frac{\partial f_{i} (x, t)}{\partial x_{i}}〉 . \end{matrix}

(A15)

Before continuing, it is useful to note that [29]

\begin{matrix} \frac{\partial Q}{\partial x_{k}} = - \frac{1}{2} [\sum_{i} {\bar{x}}_{i} Σ_{k i}^{- 1} + \sum_{j} {\bar{x}}_{j} Σ_{j k}^{- 1}] = - \sum_{i} {\bar{x}}_{i} Σ_{k i}^{- 1} = - {\bar{x}}^{⊤} Σ_{k}^{- 1} \end{matrix}

(A16)

where

{\bar{x}}_{i} = x_{i} - μ_{i}

,

\bar{x} : = x - μ = {[{\bar{x}}_{1}, \dots, {\bar{x}}_{n}]}^{⊤}

and

Σ_{k}^{- 1}

is the k-th column of the inverse matrix

Σ^{- 1}

of

Σ

. Therefore, [29]

\begin{matrix} 〈D_{i i} {(\frac{\partial Q (x)}{\partial x_{i}})}^{2}〉 = D_{i i} 〈{\bar{x}}^{⊤} Σ_{i}^{- 1} {(Σ_{i}^{- 1})}^{⊤} \bar{x}〉 = D_{i i} Tr (Δ_{i} Σ), \quad Δ_{i} : = Σ_{i}^{- 1} {(Σ_{i}^{- 1})}^{⊤}, \end{matrix}

(A17)

and

\begin{matrix} \frac{〈f_{i} {(x)}^{2}〉}{D_{i i}} = \frac{1}{D_{i i}} 〈(f_{i} (μ, u) + \sum_{j = 1}^{n} \frac{\partial f_{i} (μ, u)}{\partial x_{j}} {\bar{x}}_{j}) (f_{i} (μ, u) + \sum_{k = 1}^{n} \frac{\partial f_{i} (μ, u)}{\partial x_{k}} {\bar{x}}_{k})〉 \\ = \frac{1}{D_{i i}} (f_{i} {(μ, u)}^{2} + \sum_{j, k = 1}^{n} \frac{\partial f_{i} (μ, u)}{\partial x_{j}} \frac{\partial f_{i} (μ, u)}{\partial x_{k}} Σ_{j k}) = \frac{1}{D_{i i}} (f_{i} {(μ, u)}^{2} + \nabla^{⊤} f_{i} (μ, u) Σ \nabla f_{i} (μ, u)), \end{matrix}

(A18)

〈\frac{\partial f_{i} (x, t)}{\partial x_{i}}〉 = 〈\frac{\partial}{\partial x_{i}} (f_{i} (μ, u) + \sum_{j = 1}^{n} \frac{\partial f_{i} (μ, u)}{\partial x_{j}} {\bar{x}}_{j} + \frac{1}{2} \sum_{j, k = 1}^{n} \frac{\partial^{2} f_{i} (μ, u)}{\partial_{x_{j}} \partial_{x_{k}}} {\bar{x}}_{j} {\bar{x}}_{k})〉 = \frac{\partial f_{i} (μ, u)}{\partial x_{i}} .

(A19)

Hence,

\begin{matrix} Π & = & \sum_{i = 1}^{n} Π_{i} = Tr (Σ^{- 1} D) + Tr (D^{- 1} f (μ, u) f {(μ, u)}^{⊤}) + Tr (J_{f} D^{- 1} J_{f}^{⊤} Σ) + 2 Tr (J_{f}) \end{matrix}

(A20)

\begin{matrix} Φ & = & \sum_{i = 1}^{n} Φ_{i} = Tr (D^{- 1} f (μ, u) f {(μ, u)}^{⊤}) + Tr (J_{f} D^{- 1} J_{f}^{⊤} Σ) + Tr (J_{f}) \end{matrix}

(A21)

\begin{matrix} \dot{S} & = & Tr (Σ^{- 1} D) + Tr (J_{f}) \end{matrix}

(A22)

References

1. Seifert, U. Stochastic thermodynamics, fluctuation theorems and molecular machines. Rep. Prog. Phys. 2012, 75, 126001.
2. Gontis, V.; Kononovicius, A. Consentaneous agent-based and stochastic model of the financial markets. PLoS ONE 2014, 9, e102201.
3. de Freitas, R.A.; Vogel, E.P.; Korzenowski, A.L.; Rocha, L.A.O. Stochastic model to aid decision making on investments in renewable energy generation: Portfolio diffusion and investor risk aversion. Renew. Energy 2020, 162, 1161–1176.
4. Marreiros, A.C.; Kiebel, S.J.; Daunizeau, J.; Harrison, L.M.; Friston, K.J. Population dynamics under the Laplace assumption. Neuroimage 2009, 44, 701–714.
5. Maybeck, P.S. Stochastic Models, Estimation, and Control; Academic Press: Cambridge, MA, USA, 1982.
6. Peliti, L.; Pigolotti, S. Stochastic Thermodynamics: An Introduction; Princeton University Press: Princeton, NJ, USA, 2021.
7. Baltieri, M.; Buckley, C.L. PID control as a process of active inference with linear generative models. Entropy 2019, 21, 257.
8. Kim, E.; Hollerbach, R. Geometric structure and information change in phase transitions. Phys. Rev. E 2017, 95, 062107.
9. Nielsen, F. An elementary introduction to information geometry. Entropy 2020, 22, 1100.
10. Amari, S.I. Information Geometry and Its Applications; Springer: Berlin/Heidelberg, Germany, 2016; Volume 194.
11. Kim, E. Information geometry and nonequilibrium thermodynamic relations in the over-damped stochastic processes. J. Stat. Mech. Theory Exp. 2021, 2021, 093406.
12. Kim, E. Information Geometry, Fluctuations, Non-Equilibrium Thermodynamics, and Geodesics in Complex Systems. Entropy 2021, 23, 1393.
13. Hollerbach, R.; Kim, E.; Schmitz, L. Time-dependent probability density functions and information diagnostics in forward and backward processes in a stochastic prey–predator model of fusion plasmas. Phys. Plasmas 2020, 27, 102301.
14. Kim, E.; Heseltine, J.; Liu, H. Information length as a useful index to understand variability in the global circulation. Mathematics 2020, 8, 299.
15. Guel-Cortez, A.J.; Kim, E. Information Geometric Theory in the Prediction of Abrupt Changes in System Dynamics. Entropy 2021, 23, 694.
16. Guel-Cortez, A.J.; Kim, E. Information length analysis of linear autonomous stochastic processes. Entropy 2020, 22, 1265.
17. Saridis, G.N. Entropy in Control Engineering; World Scientific: Singapore, 2001; Volume 12.
18. Fan, J.; Marron, J.S. Fast implementations of nonparametric curve estimators. J. Comput. Graph. Stat. 1994, 3, 35–56.
19. Stochastic Simulation Versus Laplace Assumption in a Cubic System. Available online: https://github.com/AdrianGuel/StochasticProcesses/blob/main/CubicvsLA.ipynb (accessed on 6 June 2022).
20. Tomé, T. Entropy production in nonequilibrium systems described by a Fokker-Planck equation. Braz. J. Phys. 2006, 36, 1285–1289.
21. Soto, F.; Wang, J.; Ahmed, R.; Demirci, U. Medical micro/nanorobots in precision medicine. Adv. Sci. 2020, 7, 2002203.
22. Lee, J.H. Model predictive control: Review of the three decades of development. Int. J. Control. Autom. Syst. 2011, 9, 415–424.
23. Mehrez, M.W.; Worthmann, K.; Cenerini, J.P.; Osman, M.; Melek, W.W.; Jeon, S. Model predictive control without terminal constraints or costs for holonomic mobile robots. Robot. Auton. Syst. 2020, 127, 103468.
24. Kristiansen, B.A.; Gravdahl, J.T.; Johansen, T.A. Energy optimal attitude control for a solar-powered spacecraft. Eur. J. Control 2021, 62, 192–197.
25. Salesch, T.; Gesenhues, J.; Habigt, M.; Mechelinck, M.; Hein, M.; Abel, D. Model based optimization of a novel ventricular assist device. at-Automatisierungstechnik 2021, 69, 619–631.
26. Andersson, J.A.E.; Gillis, J.; Horn, G.; Rawlings, J.B.; Diehl, M. CasADi: A software framework for nonlinear optimization and optimal control. Math. Program. Comput. 2019, 11, 1–36.
27. Bemporad, A. Hybrid Toolbox - User’s Guide. 2004. Available online: http://cse.lab.imtlucca.it/~bemporad/hybrid/toolbox (accessed on 1 June 2022).
28. Utkin, V.; Lee, H. Chattering problem in sliding mode control systems. In Proceedings of the International Workshop on Variable Structure Systems, Alghero, Sardinia, 5–7 June 2006; pp. 346–350.
29. Petersen, K.B.; Pedersen, M.S. The matrix cookbook. Tech. Univ. Den. 2008, 7, 510.

Figure 1. The combination of methods from information geometry, stochastic thermodynamics and control engineering may lead to the creation of energetically efficient and organised behaviour.

Figure 2. KL divergence between the value $\tilde{p}(x; t)$ and the value $q(x, y)$ varying the values $\gamma$ and $D$ of Equation (8). When $\gamma$ changes, $D = 0.01$; when $D$ changes, $\gamma = 0.01$. The initial condition is a Gaussian distribution defined by $\mu(0) = 5$ and $\Sigma(0) = 0.01$. See code at [19].

Figure 3. Simulation of dynamics and stochastic thermodynamics of the Duffing equation under the Laplace assumption.

Figure 4. Model predictive control block diagram.

Figure 5. IL-QR under LA applied to system (8) with $Y(0) = [2 + 5/6, 1/(2 \times 30)]^{\top}$ and $Y_{d}(t) = [2 + 1/30, 1/(2 \times 3)]^{\top}$. The control is applied through $u(t)$ and $D(t)$. Moreover, $\gamma = 0.1$, $T_{s} = 1 \times 10^{-3}$, $N = 5$, $I_{L} = 1 \times 10^{3}$, $R = 1 \times 10^{-5} I_{2}$, $Q_{12} = Q_{21} = 0$, $Q_{11} = 1 \times 10^{2}$ and $Q_{22} = 8 \times 10^{2}$.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Guel-Cortez, A.-J.; Kim, E.-j. Information Geometry Control under the Laplace Assumption. Phys. Sci. Forum 2022, 5, 25. https://doi.org/10.3390/psf2022005025
