Convergence of Relative Entropy for Euler–Maruyama Scheme to Stochastic Differential Equations with Additive Noise

Yuan Yu

doi:10.3390/e26030232

School of Statistics and Mathematics, Shandong University of Finance and Economics, Jinan 250014, China

Entropy 2024, 26(3), 232; https://doi.org/10.3390/e26030232

This article belongs to the Special Issue Concepts of Entropy and Their Applications III

Abstract

For a family of stochastic differential equations driven by additive Gaussian noise, we study the asymptotic behavior of the corresponding Euler–Maruyama scheme by deriving its convergence rate in terms of relative entropy. These results complement the conventional ones in the strong and weak sense and yield further properties of the Euler–Maruyama scheme. For example, convergence in the total variation distance follows directly from Pinsker’s inequality. Moreover, when the drift is $β$-Hölder continuous ($0 < β < 1$) in the spatial variable, the convergence rate in terms of the weighted variation distance is also established. Neither of these convergence results seems to follow directly from other known convergence results for the Euler–Maruyama scheme. The main tool this paper relies on is the Girsanov transform.

Keywords: relative entropy; Euler–Maruyama scheme; Girsanov’s transform; Hölder continuity; weighted variation distance

1. Introduction

Consider the following d-dimensional stochastic differential equation (SDE)

d X_{t} = b (X_{t}) d t + σ (X_{t}) d W_{t}, t \geq 0, X_{0} = x \in R^{d},

(1)

where $b : R^{d} \to R^{d}$, $σ : R^{d} \to R^{d} \otimes R^{m}$, and ${(W_{t})}_{t \geq 0}$ is an m-dimensional Brownian motion on some complete filtered probability space $(Ω, F, {(F_{t})}_{t \geq 0}, P)$.

Under reasonable conditions, (1) can usually be shown to be strongly well-posed, whereas an explicit representation of its solution is unknown. Instead, one may develop various numerical schemes to approximate (1); see [] and references therein for an introduction. When the coefficients are regular, strong and weak convergence of numerical schemes for SDEs has been investigated extensively; see, for instance, the monographs [].

One of the most popular numerical schemes for SDEs is the Euler–Maruyama (EM) scheme, an introduction to which can be found in [] and references therein. SDEs involve both deterministic and stochastic (random) components and are applied in various fields, including physics, finance, and biology. The EM scheme extends the Euler method for ordinary differential equations to handle the stochastic part of the equation, and it is popular because it is simple and computationally efficient. Besides its role as a basic tool for the numerical solution of SDEs, its convergence rate has also received much attention, and there are several related works in the literature.

For strong convergence of the EM scheme for SDEs, there are some basic results under irregular coefficients. Yan (2002) [] used the Meyer–Tanaka formula and estimates of local times to derive a strong convergence rate for the EM scheme for one-dimensional SDEs whose drift is Lipschitz continuous and whose diffusion is Hölder continuous. Gyöngy and Rásonyi (2011) [] adopted a Yamada–Watanabe approximation approach to derive strong convergence of the EM scheme for one-dimensional SDEs with Hölder-continuous diffusions, where the drifts need not be Lipschitz continuous. Halidias and Kloeden (2008) [] established strong convergence of the EM scheme for SDEs with monotone drift, which may be discontinuous. Leobacher and Szölgyenyi (2016) and Müller-Gronbach and Yaroslavtseva (2020) [,] investigated strong convergence of the EM scheme for one-dimensional SDEs with piecewise Lipschitz-continuous drifts.

Moreover, other authors have used transformation techniques to obtain strong convergence of the EM scheme in more complex cases. Leobacher and Szölgyenyi (2017) [] studied the EM scheme in the multi-dimensional case, which was extended by Leobacher and Szölgyenyi (2018) [] to the multi-dimensional and degenerate case. The proofs are based on a transformation that changes piecewise Lipschitz-continuous drifts into globally Lipschitz-continuous ones. Besides this transformation, the Zvonkin transform is an alternative tool for studying the convergence of the EM scheme for SDEs with irregular coefficients. Bao, Huang and Yuan (2019) and Pamen and Taguchi (2017) [,] studied SDEs with Hölder- or Hölder–Dini-continuous drifts. Bao, Huang and Zhang (2022) [] focused on the integrability condition; see the references in [,,] for more results.

For weak convergence, one can refer to [,], wherein the drift satisfies the integrable condition and the main tool is the Girsanov transform.

We should remark that all of the references mentioned above study weak or strong convergence of the EM scheme. As far as we know, there are no results on the convergence in the sense of relative entropy.

In this paper, we further characterize the asymptotic behavior of the EM scheme for SDEs by studying its convergence rate in terms of relative entropy. Our main results show that the distribution of the EM approximation of an SDE driven by additive Gaussian noise converges to that of the true solution in relative entropy, which describes the asymptotic behavior of the EM scheme.

Indeed, while relative entropy is commonly perceived as a measure of the dissimilarity between two probability distributions, it falls short of being considered a metric. This is primarily attributed to its asymmetry concerning the order of its parameters and its inability to satisfy the triangle inequality. Despite not meeting the criteria of a metric, relative entropy maintains strong connections with various other metrics. Some notable relationships include: total variation distance, Fisher information divergence, Wasserstein distance and so on; see, e.g., [] and references therein. These relationships highlight the versatility of relative entropy and its role in connecting with various other measures of dissimilarity and divergence between probability distributions. While it may not possess all the properties of a metric, its specific characteristics make it a valuable tool in information theory and related fields.

Relative entropy’s broad utilization across a spectrum of disciplines, spanning probability theory, statistics, statistical physics, machine learning, neural science, and information theory, can be attributed to its possession of numerous advantageous properties; see, e.g., [] and references therein. These qualities make it a versatile and valuable tool in diverse applications and fields of study. In addition to conventional convergence analysis in both the strong and weak senses for EM schemes, the relative entropy convergence in our findings could unveil previously unexplored facets of SDEs. These discoveries hold the potential to reveal new properties of SDEs in uncharted research territories, presenting exciting prospects for further exploration and investigation.

As an example of the direct corollary of our main results, the convergence in total variation of an EM scheme can be implied by the well-known Pinsker’s inequality. Pinsker’s inequality states that the relative entropy between two probability measures provides an upper bound for their total variation distance.

Moreover, when the drift of SDE (4) is $β$-Hölder continuous ($0 < β < 1$) in the spatial variable, the convergence rate in the weighted variation distance is also obtained.

The paper is organized as follows. In Section 2, we review the concepts and definitions used in this paper. In Section 3, we state our assumptions and the main results. In Section 4, we give the proofs of all results. Conclusions and discussions are provided in Section 5.

2. Preliminaries

In this section, we first state some definitions and some related concepts for the main results of this paper.

2.1. Euler–Maruyama Scheme

The Euler–Maruyama scheme is a numerical method commonly used for approximating solutions to SDEs. SDEs involve both deterministic and stochastic components, making their solution more challenging than that of ordinary differential equations. The basic idea is similar to the traditional Euler method for ordinary differential equations but adapted to handle the stochastic terms. The Euler–Maruyama scheme is one of the simplest time-discrete approximations of an Itô process, and it is sometimes called the Euler–Maruyama approximation or Euler approximation.

We consider an Itô process satisfying the stochastic differential Equation (1):

d X_{t} = b (X_{t}) d t + σ (X_{t}) d W_{t}, t \geq 0, X_{0} = x \in R^{d},

on the time interval $[0, T]$ with initial value x.

For a given discretization $0 = τ_{0} < τ_{1} < \dots < τ_{N} = T$ of the time interval $[0, T]$, the discrete EM scheme is defined by the iteration

Y_{n + 1} = Y_{n} + b (Y_{n}) (τ_{n + 1} - τ_{n}) + σ (Y_{n}) (W_{τ_{n + 1}} - W_{τ_{n}}),

(2)

for $n = 0, 1, 2, \dots, N - 1$ with initial value $Y_{0} = X_{0}$.

We shall also write $Δ_{n} = τ_{n + 1} - τ_{n}$ for the nth time increment and call $δ = \max_{n} Δ_{n}$ the maximum time step.

In this paper, we consider equidistant discretization times $τ_{n} = n δ$ with $δ = T / N$ for some integer N large enough so that $δ \in (0, 1)$.

When the diffusion coefficient is identically zero, that is, when $σ = 0$, the stochastic iterative scheme reduces to the deterministic Euler scheme for the ordinary differential equation $d X_{t} = b (X_{t}) d t$. The main difference in the stochastic case is that we need to generate the random increments $Δ W_{n} = W_{τ_{n + 1}} - W_{τ_{n}}$ for $n = 0, 1, \dots, N - 1$ of the Wiener process $W = \{W_{t}, t \geq 0\}$. From the properties of Wiener processes, we know that these increments are independent Gaussian random variables with mean $E (Δ W_{n}) = 0$ and variance $E ({(Δ W_{n})}^{2}) = Δ_{n}$. We can use a sequence of independent Gaussian pseudo-random numbers generated by a random number generator for the increments of the Wiener process.
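The recursion (2) with pseudo-random Gaussian increments can be sketched in a few lines. The following is a minimal illustration for a scalar SDE on an equidistant grid, not a reference implementation; the function name `euler_maruyama`, its parameters and the example drift `-y` are assumptions introduced here.

```python
import math
import random


def euler_maruyama(b, sigma, x0, T, N, rng):
    """Discrete EM scheme (2) for a scalar SDE on the grid tau_n = n * T / N."""
    delta = T / N                      # constant step size
    path = [x0]                        # Y_0 = X_0
    for _ in range(N):
        y = path[-1]
        # Brownian increment W_{tau_{n+1}} - W_{tau_n}, distributed as N(0, delta)
        dW = rng.gauss(0.0, math.sqrt(delta))
        path.append(y + b(y) * delta + sigma(y) * dW)
    return path


# Hypothetical example: dX = -X dt + dW started at x = 1.
rng = random.Random(0)
path = euler_maruyama(lambda y: -y, lambda y: 1.0, 1.0, T=1.0, N=100, rng=rng)

# With sigma = 0 the recursion is the deterministic Euler method for x' = -x.
ode_path = euler_maruyama(lambda y: -y, lambda y: 0.0, 1.0, T=1.0, N=100, rng=rng)
```

Passing the random number generator explicitly keeps a run reproducible, and setting the diffusion to zero recovers the deterministic Euler scheme described above.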

The recursive structure of the discrete EM scheme, which evaluates approximate values of the Itô process at the discretization times only, is the key to its successful implementation for the numerical approximation of SDEs. We also need the continuous EM scheme, i.e.,

d X_{t}^{δ} = b (X_{t_{δ}}^{δ}) d t + σ (X_{t_{δ}}^{δ}) d W_{t}, t \geq 0, X_{0}^{δ} = x \in R^{d},

where $t_{δ} := ⌊ t / δ ⌋ δ$, and $⌊ t / δ ⌋$ is the integer part of $t / δ$. More details can be found in [] and references therein.

2.2. Relative Entropy

Kullback and Leibler (1951) [] first introduced relative entropy, which is also called the Kullback–Leibler divergence (K-L divergence for short). It is defined as follows.

Definition 1

(Relative entropy). The relative entropy of two probability measures ν and μ on $R^{d}$ is defined as

Ent (ν | μ) = \begin{cases} \int_{R^{d}} \log (\frac{d ν}{d μ}) d ν, & ν ≪ μ; \\ \infty, & \text{otherwise}, \end{cases}

where $\frac{d ν}{d μ}$ is the Radon–Nikodym derivative of ν with respect to μ.

Relative entropy is a concept from information theory and probability theory that measures how one probability distribution diverges from another. Roughly speaking, it quantifies how “close” two probability distributions are, although it is not a distance in the metric sense. In Chapter 4 of reference [], the authors list many properties of relative entropy (K-L divergence), relate it to cross entropy and conventional differential entropy, and compute it for several examples, including exponential, normal and Poisson distributions.
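To make Definition 1 concrete, the sketch below evaluates the relative entropy of two measures supported on the same finite set of points, where the Radon–Nikodym derivative reduces to a ratio of weights. The function name and the two weight vectors are assumptions chosen for illustration; the example also shows the asymmetry of relative entropy.

```python
import math


def relative_entropy(nu, mu):
    """Ent(nu | mu) for two probability vectors on the same finite set of points."""
    total = 0.0
    for p, q in zip(nu, mu):
        if p == 0.0:
            continue                   # the term 0 * log 0 is read as 0
        if q == 0.0:
            return math.inf            # nu is not absolutely continuous w.r.t. mu
        total += p * math.log(p / q)
    return total


nu = [0.5, 0.5, 0.0]
mu = [0.25, 0.25, 0.5]
forward = relative_entropy(nu, mu)     # finite: nu << mu
backward = relative_entropy(mu, nu)    # infinite: mu charges a point that nu does not
```

Here `forward` equals log 2 while `backward` is infinite, so swapping the arguments changes the value.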

2.3. Total Variation and Weighted Variation Distance

Definition 2

(Total variation). For two probability measures $γ, \tilde{γ}$ on $R^{d}$, the total variation distance is defined as

∥ γ - \tilde{γ} ∥_{var} = \sup_{∥ f ∥_{\infty} \leq 1} | γ (f) - \tilde{γ} (f) | .

Definition 3

(Pinsker’s inequality).

∥ γ - \tilde{γ} ∥_{var}^{2} \leq 2 Ent (γ | \tilde{γ}) .

(3)

Remark 1.

In view of (3), convergence in relative entropy directly implies convergence in the total variation distance.
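Pinsker’s inequality (3) can be checked numerically on a small example. The sketch below assumes two measures on a common three-point set, for which the supremum in Definition 2 equals the sum of the absolute differences of the weights; the function names and the weights are illustrative assumptions.

```python
import math


def total_variation(gamma, gamma_tilde):
    """Sup over |f| <= 1 of |gamma(f) - gamma_tilde(f)|, i.e. the l1 distance of the weights."""
    return sum(abs(p - q) for p, q in zip(gamma, gamma_tilde))


def relative_entropy(nu, mu):
    """Ent(nu | mu) for strictly positive probability vectors."""
    return sum(p * math.log(p / q) for p, q in zip(nu, mu))


gamma = [0.5, 0.3, 0.2]
gamma_tilde = [0.4, 0.4, 0.2]
lhs = total_variation(gamma, gamma_tilde) ** 2      # left-hand side of (3)
rhs = 2.0 * relative_entropy(gamma, gamma_tilde)    # right-hand side of (3)
```

For these weights the left-hand side is 0.04 and does not exceed the right-hand side, as (3) requires.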

Definition 4

(Weighted variation distance). For any $k \geq 1$, the weighted variation distance between two probability measures $γ, \tilde{γ}$ on $R^{d}$ is defined as

∥ γ - \tilde{γ} ∥_{k, var} = \sup_{| f | < 1 + | \cdot |^{k}} | γ (f) - \tilde{γ} (f) | .

Remark 2.

Convergence in the weighted variation distance is not implied by convergence in relative entropy, so it requires separate investigation.
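The effect of the weight in Definition 4 can be seen on finitely supported measures, where the supremum equals a weighted sum of the absolute differences of the weights. The sketch below is only an illustration under that assumption; the support points, weights and function names are hypothetical.

```python
def total_variation(gamma, gamma_tilde):
    """Unweighted case: sup over |f| <= 1 for measures on a common finite set."""
    return sum(abs(p - q) for p, q in zip(gamma, gamma_tilde))


def weighted_variation(points, gamma, gamma_tilde, k):
    """Sup over |f| < 1 + |x|^k of |gamma(f) - gamma_tilde(f)| on a finite support."""
    return sum((1.0 + abs(x) ** k) * abs(p - q)
               for x, p, q in zip(points, gamma, gamma_tilde))


points = [0.0, 1.0, 10.0]
gamma = [0.5, 0.499, 0.001]            # a little mass moved far from the origin
gamma_tilde = [0.5, 0.5, 0.0]
plain = total_variation(gamma, gamma_tilde)
weighted = weighted_variation(points, gamma, gamma_tilde, k=2)
```

Moving a mass of 0.001 to the point 10 changes the total variation distance by only 0.002, while the weighted distance with k = 2 is 0.103, which shows how strongly the weight penalizes differences far from the origin.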

2.4. Stochastic Differential Equation Description

Definition 5

(Stochastic differential equations with additive noise). In this paper, we consider the following SDE

d X_{t} = b (X_{t}) d t + d W_{t}, t \geq 0, X_{0} = x \in R^{d},

(4)

where $b : R^{d} \to R^{d}$, and ${(W_{t})}_{t \geq 0}$ is a d-dimensional Brownian motion on some complete filtered probability space $(Ω, F, {(F_{t})}_{t \geq 0}, P)$.

Remark 3.

For any $δ \in (0, 1)$, the continuous Euler–Maruyama (EM) scheme for (4) is defined as

d X_{t}^{δ} = b (X_{t_{δ}}^{δ}) d t + d W_{t}, t \geq 0, X_{0}^{δ} = x \in R^{d},

(5)

with $t_{δ} := ⌊ t / δ ⌋ δ$, where $⌊ t / δ ⌋$ is the integer part of $t / δ$.

3. Assumptions and Main Results

3.1. Assumptions

Throughout the paper, we impose the following assumptions on the drift term b of the SDE (4).

(A): There exist constants $β \in (0, 1]$ and $K > 0$ such that

$| b (x) - b (y) | \leq K | x - y |^{β}, x, y \in R^{d} .$

(6)
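As an illustration of assumption (A), the scalar drift $b(x) = |x|^{β}$ satisfies (6) with $K = 1$ but is not Lipschitz near the origin. The sketch below checks this on a grid for $β = 1/2$; the drift, the grid and the helper name `holder_ratio` are assumptions made for this example only.

```python
def holder_ratio(b, x, y, beta):
    """The quotient |b(x) - b(y)| / |x - y|^beta appearing in condition (6)."""
    return abs(b(x) - b(y)) / abs(x - y) ** beta


beta = 0.5


def b(x):
    """Hypothetical drift b(x) = |x|^beta."""
    return abs(x) ** beta


grid = [i / 10.0 for i in range(-20, 21)]
# Largest Hoelder quotient over all distinct grid pairs; (6) with K = 1 bounds it by 1.
worst = max(holder_ratio(b, x, y, beta) for x in grid for y in grid if x != y)
# The Lipschitz quotient (beta = 1) near the origin is large, so b is not Lipschitz.
lipschitz_ratio = holder_ratio(b, 1e-6, 0.0, 1.0)
```

On this grid the Hölder quotient never exceeds 1, whereas the Lipschitz quotient at distance 1e-6 from the origin is about 1000.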

Remark 4.

By Zvonkin’s transform introduced in [], under assumption (A), (4) has a unique strong solution ${(X_{t})}_{t \geq 0}$ (see, for instance, []).

3.2. Main Results

Let $L_{ξ}$ denote the distribution of a random variable $ξ$. The main result is the following theorem.

Theorem 1.

Assume that the drift term of the SDE (4) satisfies assumption (A). Then there exists a constant $C_{T, x, d}$ such that

Ent (L_{X_{t}^{δ}} | L_{X_{t}}) \leq K C_{T, x, d} t δ^{β}, t \in [0, T] .

(7)

Consequently, we have

\lim_{δ \to 0} Ent (L_{X_{t}^{δ}} | L_{X_{t}}) = 0 .

Remark 5.

Theorem 1 gives the convergence rate of the EM scheme in the sense of relative entropy for the SDE (4) with additive noise, which establishes its asymptotic behavior. The proof relies mainly on the Girsanov transform and can be found in Section 4.
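A case where both laws in (7) are explicit is the scalar equation $d X_{t} = - X_{t} d t + d W_{t}$, whose drift satisfies (A) with $β = 1$ and $K = 1$. Both $X_{t}$ and the EM value at a grid time are Gaussian, so the relative entropy has a closed form. The sketch below is a hypothetical worked example, not part of the paper’s argument; all function names are assumptions.

```python
import math


def gaussian_relative_entropy(m1, v1, m0, v0):
    """Ent(N(m1, v1) | N(m0, v0)) for scalar Gaussians with positive variances."""
    return 0.5 * (math.log(v0 / v1) + (v1 + (m1 - m0) ** 2) / v0 - 1.0)


def ou_entropy(x, t, n):
    """Relative entropy of the EM law w.r.t. the exact law at time t, using n EM steps."""
    delta = t / n
    a = 1.0 - delta                                      # one EM step maps y to a * y + dW
    m_em = a ** n * x
    v_em = delta * (1.0 - a ** (2 * n)) / (1.0 - a * a)  # geometric sum of noise variances
    m_exact = math.exp(-t) * x
    v_exact = 0.5 * (1.0 - math.exp(-2.0 * t))
    return gaussian_relative_entropy(m_em, v_em, m_exact, v_exact)


# Step sizes delta = 1/10, 1/20, 1/40, 1/80 at time t = 1, started from x = 1.
entropies = [ou_entropy(x=1.0, t=1.0, n=n) for n in (10, 20, 40, 80)]
```

The computed values are positive and decrease as the step size is halved, in line with the limit stated in Theorem 1.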

Corollary 1.

When assumption (A) is satisfied, we have

∥ L_{X_{t}^{δ}} - L_{X_{t}} ∥_{var} \leq \sqrt{2 K C_{T, x, d} t} δ^{\frac{β}{2}}, t \in [0, T] .

Remark 6.

Corollary 1 gives the convergence of the EM scheme in the total variation distance. It follows directly from Pinsker’s inequality (3).

Theorem 2.

If assumption (A) holds for some $0 < β < 1$, then for any $k \geq 1$ there exists a constant $c_{k, T, x, d}$ such that

∥ L_{X_{t}^{δ}} - L_{X_{t}} ∥_{k, var} = \sup_{| f | < 1 + | \cdot |^{k}} | L_{X_{t}^{δ}} (f) - L_{X_{t}} (f) | \leq c_{k, T, x, d} \sqrt{t} δ^{\frac{β}{2}}, t \in [0, T] .

Remark 7.

Theorem 2 gives the convergence of the EM scheme in the weighted variation distance. This convergence is not a direct consequence of the relative entropy convergence in Theorem 1. The details of the proof can be found in Section 4.

4. Proofs

4.1. Proof of Theorem 1

Before proving Theorem 1, we prepare some auxiliary lemmas. The first lemma below plays a crucial role in the proof of Theorem 1.

Lemma 1.

Assume (A). Then for any $k \geq 1$, there exists a constant $C_{T, d, k} > 0$ such that

E \sup_{t \in [0, T]} | X_{t}^{δ} |^{k} + E \sup_{t \in [0, T]} | X_{t} |^{k} \leq C_{T, d, k} (1 + | x |^{k}) .

(8)

Proof.

We only prove the inequality for $X_{t}^{δ}$, since the argument for $X_{t}$ is similar.

For any $n \geq 1$, let $ζ_{n} := \inf \{t \geq 0 : | X_{t}^{δ} | \geq n\}$. Firstly, since $X_{t}^{δ} = x + \int_{0}^{t} b (X_{s_{δ}}^{δ}) d s + W_{t}$, we have

| X_{t \land ζ_{n}}^{δ} |^{k} \leq 3^{k - 1} | x |^{k} + 3^{k - 1} |\int_{0}^{t \land ζ_{n}} b (X_{s_{δ}}^{δ}) d s|^{k} + 3^{k - 1} | W_{t \land ζ_{n}} |^{k}, t \in [0, T] .

(9)

By (A), it is easy to see that

| b (x) | \leq | b (x) - b (0) | + | b (0) | \leq K | x |^{β} + | b (0) | \leq K (1 + | x |) + | b (0) |, x \in R^{d} .

(10)

Combining this with (9), we can find a constant $c_{0} > 0$ such that

E \sup_{t \in [0, r]} | X_{t \land ζ_{n}}^{δ} |^{k} \leq 3^{k - 1} | x |^{k} + c_{0} T E \int_{0}^{r} (1 + \sup_{t \in [0, s]} | X_{t \land ζ_{n}}^{δ} |^{k}) d s + 3^{k - 1} E \sup_{t \in [0, r]} | W_{t \land ζ_{n}} |^{k}, r \in [0, T] .

(11)

By the Burkholder–Davis–Gundy inequality, there exists a constant $c_{1} > 0$ such that

E \sup_{t \in [0, r]} | W_{t \land ζ_{n}} |^{k} \leq c_{1} (d r)^{\frac{k}{2}}, r \in [0, T] .

Putting this into (11) and applying Gronwall’s inequality, we find a constant $C_{T, d, k} > 0$ such that

E \sup_{t \in [0, T]} | X_{t \land ζ_{n}}^{δ} |^{k} \leq C_{T, d, k} (1 + | x |^{k}) .

Note that

P (ζ_{n} < T) \leq P (| X_{T \land ζ_{n}}^{δ} | \geq n) \leq \frac{E \sup_{t \in [0, T]} | X_{t \land ζ_{n}}^{δ} |^{k}}{n^{k}} \leq \frac{C_{T, d, k} (1 + | x |^{k})}{n^{k}} .

This yields that $P$-a.s. $\lim_{n \to \infty} ζ_{n} = \infty$, which, combined with Fatou’s lemma, yields that

E \sup_{t \in [0, T]} | X_{t}^{δ} |^{k} \leq \liminf_{n \to \infty} E \sup_{t \in [0, T]} | X_{t \land ζ_{n}}^{δ} |^{k} \leq C_{T, d, k} (1 + | x |^{k}) .

So we complete the proof. □

Lemma 2.

Under (A), there exists a constant $C_{T, x, d} > 0$ such that

\sup_{t \in [0, T]} E | X_{t}^{δ} - X_{t_{δ}}^{δ} |^{4} \leq C_{T, x, d} δ^{2} .

(12)

Proof.

Note that

X_{t}^{δ} - X_{t_{δ}}^{δ} = \int_{t_{δ}}^{t} b (X_{s_{δ}}^{δ}) d s + W_{t} - W_{t_{δ}} .

This together with (10), Lemma 1 and the fact

E | W_{t} - W_{t_{δ}} |^{4} = (d^{2} + 2 d) (t - t_{δ})^{2} \leq (d^{2} + 2 d) δ^{2}

implies that

\begin{aligned} E | X_{t}^{δ} - X_{t_{δ}}^{δ} |^{4} & \leq 8 δ^{3} \int_{t_{δ}}^{t} E | b (X_{s_{δ}}^{δ}) |^{4} d s + 8 (d^{2} + 2 d) δ^{2} \\ & \leq 8 δ^{3} C_{0} \int_{t_{δ}}^{t} E (1 + \sup_{r \in [0, T]} | X_{r}^{δ} |^{4}) d s + 8 (d^{2} + 2 d) δ^{2} \\ & \leq C_{T, x, d} δ^{2} . \end{aligned}

Therefore, the proof is completed. □
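The fourth-moment identity for the Brownian increment used in this proof can be verified by a short computation. In the sketch below, $Z = W_{t} - W_{t_{δ}}$ and $s = t - t_{δ}$ are symbols introduced only for this illustration; the components $Z_{i}$ are independent $N(0, s)$ random variables, so $E Z_{i}^{2} = s$ and $E Z_{i}^{4} = 3 s^{2}$.

```latex
\begin{aligned}
E |Z|^{4} &= E \Big( \sum_{i=1}^{d} Z_{i}^{2} \Big)^{2}
  = \sum_{i=1}^{d} E Z_{i}^{4} + \sum_{i \neq j} E Z_{i}^{2} \, E Z_{j}^{2} \\
 &= 3 d s^{2} + d (d - 1) s^{2} = (d^{2} + 2 d) s^{2} .
\end{aligned}
```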

Lemma 3.

Assume (A). Then there exist constants $κ_{1} (T), κ_{2} (T, x, d) > 0$ such that

E \exp \{κ_{1} (T) \sup_{t \in [0, r]} | X_{t} |^{2}\} \leq κ_{2} (T, x, d), r \in [0, T] .

Proof.

By Itô’s formula, we can find a constant $c_{0} > 0$ such that

\begin{aligned} | X_{t} |^{2} & \leq | x |^{2} + \int_{0}^{t} ⟨ b (X_{s}), 2 X_{s} ⟩ d s + \int_{0}^{t} ⟨ 2 X_{s}, d W_{s} ⟩ + d t \\ & \leq | x |^{2} + (c_{0} + d) t + c_{0} \int_{0}^{t} | X_{s} |^{2} d s + \int_{0}^{t} ⟨ 2 X_{s}, d W_{s} ⟩ . \end{aligned}

Gronwall’s inequality yields that

\sup_{t \in [0, r]} | X_{t} |^{2} \leq e^{c_{0} r} [| x |^{2} + (c_{0} + d) r + \sup_{t \in [0, r]} \int_{0}^{t} ⟨ 2 X_{s}, d W_{s} ⟩], r \in [0, T] .

(13)

Note that for any $ε > 0$,

\begin{aligned} E \exp \{ε \sup_{t \in [0, r]} \int_{0}^{t} ⟨ 2 X_{s}, d W_{s} ⟩\} & \leq e E \exp \{ε \int_{0}^{r} ⟨ 2 X_{s}, d W_{s} ⟩\} \\ & \leq e \{E e^{8 ε^{2} \int_{0}^{r} | X_{s} |^{2} d s}\}^{\frac{1}{2}} \\ & \leq e \{E e^{8 ε^{2} r \sup_{s \in [0, r]} | X_{s} |^{2}}\}^{\frac{1}{2}}, r \in [0, T] . \end{aligned}

(14)

Taking $ε = \frac{1}{8 T} e^{- c_{0} T}$, we have $ε e^{- c_{0} T} = 8 T ε^{2}$. Combining this with (13) and (14), we derive

\begin{aligned} E \exp \{ε e^{- c_{0} T} \sup_{t \in [0, T]} | X_{t} |^{2}\} & \leq \exp \{ε [| x |^{2} + (c_{0} + d) T]\} E \exp \{ε \sup_{t \in [0, T]} \int_{0}^{t} ⟨ 2 X_{s}, d W_{s} ⟩\} \\ & \leq \exp \{ε [| x |^{2} + (c_{0} + d) T]\} e \{E e^{8 ε^{2} T \sup_{s \in [0, T]} | X_{s} |^{2}}\}^{\frac{1}{2}} . \end{aligned}

By a stopping time technique, we may and do assume that

E \exp \{ε e^{- c_{0} T} \sup_{t \in [0, T]} | X_{t} |^{2}\} < \infty .

Then we get

E \exp \{\frac{1}{8 T} e^{- 2 c_{0} T} \sup_{t \in [0, T]} | X_{t} |^{2}\} \leq \exp \{2 + \frac{1}{4 T} e^{- c_{0} T} [| x |^{2} + (c_{0} + d) T]\} .

Letting $κ_{1} (T) = \frac{1}{8 T} e^{- 2 c_{0} T}$ and $κ_{2} (T, x, d) = \exp \{2 + \frac{1}{4 T} e^{- c_{0} T} [| x |^{2} + (c_{0} + d) T]\}$, we complete the proof. □
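The last step of this proof absorbs the expectation on the right-hand side into the left-hand side. A sketch of that step, with the shorthand $Y$ and $A$ introduced here only for illustration: since $ε e^{- c_{0} T} = 8 T ε^{2} = κ_{1} (T)$, the same expectation $Y$ appears on both sides, and finiteness of $Y$ allows division by $Y^{1/2}$.

```latex
\begin{aligned}
Y &:= E \exp \{ κ_{1}(T) \sup_{t \in [0, T]} |X_{t}|^{2} \}, \qquad
A := \exp \{ ε [ |x|^{2} + (c_{0} + d) T ] \}, \\
Y &\leq A \, e \, Y^{\frac{1}{2}}
 \;\Longrightarrow\; Y^{\frac{1}{2}} \leq A \, e
 \;\Longrightarrow\; Y \leq A^{2} e^{2}
 = \exp \{ 2 + 2 ε [ |x|^{2} + (c_{0} + d) T ] \},
\end{aligned}
```

and $2 ε = \frac{1}{4 T} e^{- c_{0} T}$ gives the stated constant $κ_{2} (T, x, d)$.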

Proof of Theorem 1.

For any $n \geq 1$, let $τ_{n} := \inf \{t \geq 0 : | X_{t} | \lor | X_{t}^{δ} | \geq n\}$. By Lemma 1, it holds that $P$-a.s. $\lim_{n \to \infty} τ_{n} = \infty$. Recall that

\begin{matrix} d X_{t} = b (X_{t}) d t + d W_{t}, t \in [0, T] . \end{matrix}

(15)

Let

\begin{matrix} W_{t}^{δ} = W_{t} - \int_{0}^{t} [b (X_{s_{δ}}) - b (X_{s})] d s, t \in [0, T] . \end{matrix}

(16)

Then (15) can be rewritten as

d X_{t} = b (X_{t_{δ}}) d t + d W_{t}^{δ}, t \in [0, T] .

Set

R_{t} = exp \{\int_{0}^{t} ⟨ [b (X_{s_{δ}}) - b (X_{s})], d W_{s} ⟩ - \frac{1}{2} \int_{0}^{t} {| b (X_{s_{δ}}) - b (X_{s}) |}^{2} d s\}, t \in [0, T] .

Fix $t_{0} \in (0, T]$. By (A) and Girsanov’s theorem, we conclude that $\{R_{t \land τ_{n}}\}_{t \in [0, t_{0}]}$ is a martingale and $W_{t}^{δ}$ is a d-dimensional Brownian motion up to $t_{0} \land τ_{n}$ under the probability measure $Q^{n} = R_{t_{0} \land τ_{n}} P$. This together with (A) and Lemma 2 implies that

\begin{aligned} E^{Q^{n}} \int_{0}^{t_{0} \land τ_{n}} | b (X_{s_{δ}}) - b (X_{s}) |^{2} d s & = E \int_{0}^{t_{0} \land τ_{n}} | b (X_{s_{δ}}^{δ}) - b (X_{s}^{δ}) |^{2} d s \\ & \leq E \int_{0}^{t_{0}} K^{2} | X_{s_{δ}}^{δ} - X_{s}^{δ} |^{2 β} d s \\ & \leq \int_{0}^{t_{0}} K^{2} (E | X_{s_{δ}}^{δ} - X_{s}^{δ} |^{2})^{β} d s \leq K^{2} C_{T, x, d} t_{0} δ^{β} . \end{aligned}

(17)

From this and (16), we derive that

\begin{aligned} E [R_{t_{0} \land τ_{n}} \log R_{t_{0} \land τ_{n}}] & = E^{Q^{n}} [\log R_{t_{0} \land τ_{n}}] \\ & = E^{Q^{n}} \int_{0}^{t_{0} \land τ_{n}} ⟨ b (X_{s_{δ}}) - b (X_{s}), d W_{s} ⟩ - \frac{1}{2} E^{Q^{n}} \int_{0}^{t_{0} \land τ_{n}} | b (X_{s_{δ}}) - b (X_{s}) |^{2} d s \\ & = E^{Q^{n}} \int_{0}^{t_{0} \land τ_{n}} ⟨ b (X_{s_{δ}}) - b (X_{s}), d W_{s}^{δ} ⟩ + \frac{1}{2} E^{Q^{n}} \int_{0}^{t_{0} \land τ_{n}} | b (X_{s_{δ}}) - b (X_{s}) |^{2} d s \\ & = \frac{1}{2} E^{Q^{n}} \int_{0}^{t_{0} \land τ_{n}} | b (X_{s_{δ}}) - b (X_{s}) |^{2} d s \\ & \leq \frac{1}{2} K^{2} C_{T, x, d} t_{0} δ^{β} . \end{aligned}

This combined with the martingale convergence theorem implies that $\{R_{t}\}_{t \in [0, t_{0}]}$ is a martingale, and it follows from Fatou’s lemma that

E [R_{t_{0}} \log R_{t_{0}}] \leq \liminf_{n \to \infty} E [R_{t_{0} \land τ_{n}} \log R_{t_{0} \land τ_{n}}] \leq \frac{1}{2} K^{2} C_{T, x, d} t_{0} δ^{β} .

Applying Girsanov’s theorem again, we conclude that $\{R_{t}\}_{t \in [0, t_{0}]}$ is a martingale and $W_{t}^{δ}$ is a d-dimensional Brownian motion up to $t_{0}$ under the probability measure $Q = R_{t_{0}} P$, and hence, the distribution of $\{X_{t}\}_{t \in [0, t_{0}]}$ under $Q$ is equal to that of $\{X_{t}^{δ}\}_{t \in [0, t_{0}]}$ under $P$. As a result, it holds that

\begin{matrix} E f (X_{t_{0}}^{δ}) = E^{Q} f (X_{t_{0}}) = E [R_{t_{0}} f (X_{t_{0}})], | f | \leq 1 + | \cdot |^{k}, k \geq 1 . \end{matrix}

(18)

By Young’s inequality, we derive that

Ent (L_{X_{t_{0}}^{δ}} | L_{X_{t_{0}}}) \leq C_{T, x, d} δ^{β}, t_{0} \in (0, T] .

(19)

The proof is completed. □
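The passage from (18) to (19) via Young’s inequality can be spelled out as follows. This is a sketch under the assumption that $g$ ranges over bounded measurable functions (to which (18) applies) and that the relative entropy is expressed through its standard variational formula; the symbol $g$ is introduced here for illustration.

```latex
\begin{aligned}
& r s \leq r \log r - r + e^{s}, \qquad r \geq 0,\ s \in \mathbb{R}
  && \text{(Young's inequality)} \\
& E [ R_{t_{0}} \, g (X_{t_{0}}) ] \leq E [ R_{t_{0}} \log R_{t_{0}} ] + \log E \, e^{g (X_{t_{0}})}
  && \text{(take } r = R_{t_{0}},\ s = g (X_{t_{0}}) - \log E \, e^{g (X_{t_{0}})} \text{)} \\
& Ent (L_{X_{t_{0}}^{δ}} | L_{X_{t_{0}}})
  = \sup_{g} \big\{ E \, g (X_{t_{0}}^{δ}) - \log E \, e^{g (X_{t_{0}})} \big\}
  \leq E [ R_{t_{0}} \log R_{t_{0}} ] .
\end{aligned}
```

The second line uses $E R_{t_{0}} = 1$, and the last line uses $E g (X_{t_{0}}^{δ}) = E [R_{t_{0}} g (X_{t_{0}})]$ from (18). The bound on $E [R_{t_{0}} \log R_{t_{0}}]$ obtained from Fatou’s lemma then yields an estimate of order $δ^{β}$.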

4.2. Proof of Corollary 1

Corollary 1 is the direct result of Theorem 1 and Pinsker’s inequality (3).
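Written out, the argument is a single chain of inequalities, combining (3) with $γ = L_{X_{t}^{δ}}$, $\tilde{γ} = L_{X_{t}}$ and the bound (7):

```latex
\| L_{X_{t}^{δ}} - L_{X_{t}} \|_{var}^{2}
  \leq 2 \, Ent (L_{X_{t}^{δ}} | L_{X_{t}})
  \leq 2 K C_{T, x, d} \, t \, δ^{β}
\quad \Longrightarrow \quad
\| L_{X_{t}^{δ}} - L_{X_{t}} \|_{var}
  \leq \sqrt{2 K C_{T, x, d} \, t} \; δ^{\frac{β}{2}} .
```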

4.3. Proof of Theorem 2

Proof.

Firstly, we have

\begin{aligned} E [R_{t_{0}} - 1]^{2} & = E [R_{t_{0}}^{2}] - 1 \\ & = E \Big[ \exp \{2 \int_{0}^{t_{0}} ⟨ b (X_{s_{δ}}) - b (X_{s}), d W_{s} ⟩ - 4 \int_{0}^{t_{0}} | b (X_{s_{δ}}) - b (X_{s}) |^{2} d s\} \exp \{3 \int_{0}^{t_{0}} | b (X_{s_{δ}}) - b (X_{s}) |^{2} d s\} \Big] - 1 \\ & \leq \{E \exp \{6 \int_{0}^{t_{0}} | b (X_{s_{δ}}) - b (X_{s}) |^{2} d s\}\}^{\frac{1}{2}} - 1 \\ & \leq E \exp \{6 \int_{0}^{t_{0}} | b (X_{s_{δ}}) - b (X_{s}) |^{2} d s\} - 1 \\ & \leq E [\exp \{6 \int_{0}^{t_{0}} | b (X_{s_{δ}}) - b (X_{s}) |^{2} d s\} 6 \int_{0}^{t_{0}} | b (X_{s_{δ}}) - b (X_{s}) |^{2} d s] \\ & \leq 6 \{E \exp \{12 \int_{0}^{t_{0}} | b (X_{s_{δ}}) - b (X_{s}) |^{2} d s\}\}^{\frac{1}{2}} [E (\int_{0}^{t_{0}} | b (X_{s_{δ}}) - b (X_{s}) |^{2} d s)^{2}]^{\frac{1}{2}} \\ & \leq 6 \{E \exp \{48 K^{2} t_{0} \sup_{s \in [0, t_{0}]} | X_{s} |^{2 β}\}\}^{\frac{1}{2}} [E (\int_{0}^{t_{0}} | b (X_{s_{δ}}) - b (X_{s}) |^{2} d s)^{2}]^{\frac{1}{2}} . \end{aligned}

By Lemma 3 and since $β < 1$, we derive

6 \{E \exp \{48 K^{2} t_{0} \sup_{s \in [0, t_{0}]} | X_{s} |^{2 β}\}\}^{\frac{1}{2}} < C (T, x, d)

for some constant $C (T, x, d) > 0$. Moreover, Hölder’s inequality and Lemma 2 imply that

\begin{matrix} {[E {(\int_{0}^{t_{0}} {| b (X_{s_{δ}}) - b (X_{s}) |}^{2} d s)}^{2}]}^{\frac{1}{2}} \leq \sqrt{t_{0}} {[\int_{0}^{t_{0}} E {| b (X_{s_{δ}}) - b (X_{s}) |}^{4} d s]}^{\frac{1}{2}} \leq c_{1} t_{0} δ^{β} . \end{matrix}

So we conclude that

E [R_{t_{0}} - 1]^{2} \leq c t_{0} δ^{β}

for some constant $c > 0$. From this together with (18) and Lemma 1, we derive

\begin{aligned} ∥ L_{X_{t}^{δ}} - L_{X_{t}} ∥_{k, var} & = \sup_{| f | < 1 + | \cdot |^{k}} | L_{X_{t}^{δ}} (f) - L_{X_{t}} (f) | \\ & = \sup_{| f | < 1 + | \cdot |^{k}} | E [(R_{t} - 1) f (X_{t})] | \\ & \leq [E (R_{t} - 1)^{2}]^{\frac{1}{2}} [E (1 + | X_{t} |^{k})^{2}]^{\frac{1}{2}} \\ & \leq c_{k, T, x, d} \sqrt{t} δ^{\frac{β}{2}} . \end{aligned}

The proof is completed. □

5. Conclusions and Discussion

In this paper, we studied the convergence rate in relative entropy of the Euler–Maruyama scheme for stochastic differential equations driven by additive Gaussian noise, and we obtained its asymptotic behavior. Our results for the convergence rate in terms of relative entropy complement the conventional ones in the strong and weak sense and yield further properties of the Euler–Maruyama scheme. Convergence in the total variation distance follows directly from Pinsker’s inequality, and the convergence rate in the weighted variation distance is also established.

Finally, we discuss some more complicated processes.

(1) SDEs with multiplicative noise: The SDE becomes

\begin{matrix} d X_{t} = b (X_{t}) d t + σ (X_{t}) d W_{t}, X_{0} = x \end{matrix}

(20)

The EM scheme satisfies

\begin{matrix} d X_{t}^{δ} = b (X_{t_{δ}}^{δ}) d t + σ (X_{t_{δ}}^{δ}) d W_{t}, X_{0}^{δ} = x . \end{matrix}

(21)

When $σ$ is invertible, we can still rewrite (20) as

d X_{t} = b (X_{t_{δ}}) d t + σ (X_{t}) d \tilde{W}_{t}

(22)

with

d \tilde{W}_{t} = d W_{t} - σ^{- 1} (X_{t}) (b (X_{t_{δ}}) - b (X_{t})) d t .

Different from the additive noise case, (21) and (22) solve different SDEs and, hence, the Girsanov transform is unavailable. We need to develop new approaches to deal with the multiplicative noise case.

(2) Geometric Brownian motion (GBM): In reference [], the authors investigated time-averaging and nonergodicity for GBM in the presence of drift and with resetting. Although GBM in the presence of drift has an explicit representation and follows a log-normal distribution, it solves an SDE with linear multiplicative noise:

d X_{t} = μ X_{t} d t + σ X_{t} d W_{t} .

The difficulty appearing in the multiplicative noise case will still exist.
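Because GBM has an explicit solution, $X_{t} = x \exp \{(μ - σ^{2} / 2) t + σ W_{t}\}$, the EM approximation can be compared with the exact value along the same Brownian path. The sketch below is only an illustration of the scheme with multiplicative noise and makes no claim about relative entropy; the function name and parameter values are assumptions.

```python
import math
import random


def gbm_em_and_exact(x0, mu, sigma, T, N, rng):
    """EM value and exact GBM value at time T, driven by the same Brownian increments."""
    delta = T / N
    y, w = x0, 0.0
    for _ in range(N):
        dW = rng.gauss(0.0, math.sqrt(delta))
        y += mu * y * delta + sigma * y * dW    # EM step with state-dependent noise
        w += dW                                 # running sum gives W_T
    exact = x0 * math.exp((mu - 0.5 * sigma ** 2) * T + sigma * w)
    return y, exact


# Hypothetical parameters: x = 1, mu = 0.05, sigma = 0.2, T = 1, 1000 steps.
em_value, exact_value = gbm_em_and_exact(1.0, 0.05, 0.2, T=1.0, N=1000, rng=random.Random(1))
```

The exact value is always positive, whereas an EM step can in principle change sign when the noise term is large, which is one visible difference from the additive noise case.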

(3) Anomalous-diffusion processes: In reference [], the authors studied the nonergodicity, non-Gaussianity and aging of scaled fractional Brownian motion. We believe that our present method for estimating the relative entropy between the EM scheme and the true solution is also applicable to SDEs driven by additive fractional Brownian motion, since the Girsanov transform can also be used, and the Girsanov transform is relatively simple for the case of $H < \frac{1}{2}$.

We will leave these processes mentioned above for a future study.

Funding

This research was funded by the Natural Science Foundation of the Shandong Province of China (ZR202111290596).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

The author thanks the Editor, the Associate Editor and the referees for their time and comments, which have greatly improved the paper. We also thank Xing Huang from Tianjin University for his discussion and suggestions for this article.

Conflicts of Interest

The author declares no conflicts of interest.

References

Kloeden, P.E.; Platen, E. Numerical Solutions of Stochastic Differential Equations; Springer: Berlin/Heidelberg, Germany, 1992.
Yan, L. The Euler scheme with irregular coefficients. Ann. Probab. 2002, 30, 1172–1194.
Gyöngy, I.; Rásonyi, M. A note on Euler approximations for SDEs with Hölder continuous diffusion coefficients. Stoch. Process. Appl. 2011, 121, 2189–2200.
Halidias, N.; Kloeden, P.E. A note on the Euler–Maruyama scheme for stochastic differential equations with a discontinuous monotone drift coefficient. BIT Numer. Math. 2008, 48, 51–59.
Leobacher, G.; Szölgyenyi, M. A numerical method for SDEs with discontinuous drift. BIT Numer. Math. 2016, 56, 151–162.
Müller-Gronbach, T.; Yaroslavtseva, L. On the performance of the Euler–Maruyama scheme for SDEs with discontinuous drift coefficient. Ann. L’Institut Henri Poincaré Probab. Stat. 2020, 56, 1162–1178.
Leobacher, G.; Szölgyenyi, M. A strong order 1/2 method for multidimensional SDEs with discontinuous drift. Ann. Appl. Probab. 2017, 27, 2383–2418.
Leobacher, G.; Szölgyenyi, M. Convergence of the Euler–Maruyama method for multidimensional SDEs with discontinuous drift and degenerate diffusion coefficient. Numer. Math. 2018, 138, 219–239.
Bao, J.; Huang, X.; Yuan, C. Convergence rate of Euler–Maruyama Scheme for SDEs with Hölder–Dini continuous drifts. J. Theor. Probab. 2019, 32, 848–871.
Pamen, O.M.; Taguchi, D. Strong rate of convergence for the Euler–Maruyama approximation of SDEs with Hölder continuous drift coefficient. Stoch. Process. Appl. 2017, 127, 2542–2559.
Bao, J.; Huang, X.; Zhang, S. Convergence rate of EM algorithm for SDEs under integrability condition. arXiv 2022, arXiv:2009.04781.
Shao, J. Weak convergence of Euler–Maruyama’s approximation for SDEs under integrability condition. arXiv 2018, arXiv:1808.07250.
Suo, Y.; Yuan, C.; Zhang, S.-Q. Weak convergence of Euler scheme for SDEs with singular drift. arXiv 2020, arXiv:2005.04631.
Villani, C. Optimal Transport: Old and New; Springer: Berlin/Heidelberg, Germany, 2009.
Sommaruga, G. Formal Theories of Information: From Shannon to Semantic Information Theory and General Concepts of Information; Springer: Berlin/Heidelberg, Germany, 2009.
Kullback, S.; Leibler, R.A. On information and sufficiency. Ann. Math. Stat. 1951, 22, 79–86.
Calin, O.; Udrişte, C. Geometric Modeling in Probability and Statistics; Springer: Berlin/Heidelberg, Germany, 2014.
Zvonkin, A.K. A transformation of the phase space of a diffusion process that removes the drift. Math. USSR Sb. 1974, 93, 129–149.
Flandoli, F.; Gubinelli, M.; Priola, E. Flow of diffeomorphisms for SDEs with unbounded Hölder continuous drift. Bull. Sci. Math. 2010, 134, 405–422.
Vinod, D.; Cherstvy, A.G.; Metzler, R.; Sokolov, I.M. Time-averaging and nonergodicity of reset geometric Brownian motion with drift. Phys. Rev. E 2022, 106, 034137.
Liang, Y.; Wang, W.; Metzler, R.; Cherstvy, A.G. Anomalous diffusion, nonergodicity, non-Gaussianity, and aging of fractional Brownian motion with nonlinear clocks. Phys. Rev. E 2023, 108, 034113.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
