Stationary Wong–Zakai Approximation of Fractional Brownian Motion and Stochastic Differential Equations with Noise Perturbations

In this article, we introduce a Wong–Zakai type stationary approximation to fractional Brownian motion and provide a sharp rate of convergence in $L^p(\Omega)$. Our stationary approximation is suitable for all values of H ∈ (0, 1). As an application, we consider stochastic differential equations driven by a fractional Brownian motion with H > 1/2. We provide a sharp rate of convergence of the approximation in a certain fractional-type Sobolev space, which in turn provides a rate of convergence for the solution of the approximated equation. This generalises some existing results in the literature concerning approximation of the noise and the convergence of the corresponding solutions.


Introduction
The philosophy of using differential equations with more regular drivers to approximate stochastic differential equations dates back to the pioneering work of Wong and Zakai [1,2], in which both continuous piecewise linear approximations and piecewise smooth approximations for one-dimensional Brownian motion were proposed. In particular, the convergence of solutions to the approximating equations was proved in the mean sense and almost surely, respectively, under suitable conditions on the coefficients. However, these results are not applicable to high-dimensional cases. For instance, McShane [3] provided a counterexample showing that the Wong-Zakai result does not hold for a two-dimensional Brownian motion approximated by smooth functions. Meanwhile, the Wong-Zakai piecewise linear (polygonal) approximation for high-dimensional Brownian motion was introduced by Stroock and Varadhan in [4], where convergence in law was also proved and used to determine the support of diffusion processes. Later on, the shift operator was incorporated into approximations of high-dimensional Brownian motion in two different ways by Ikeda et al. [5,6]. In these articles, the convergence was established in the mean square sense, uniformly over finite time intervals. More recently, Kelly and Melbourne [7] introduced a class of smooth approximations in integral form involving a $C^2$ uniformly hyperbolic flow on a compact manifold. In this case, the authors established weak convergence towards the limit equation by the methods of rough path theory. However, none of the three articles [5-7] studied the limit equations understood in the Stratonovich sense.
To the best of our knowledge, one of the first stationary smooth approximations is that by Lu and Wang [8,9]. They approximated a one-dimensional white noise by a stationary Gaussian process, and they applied the approximation to the study of the chaotic behaviour of randomly forced ordinary differential equations with a homoclinic orbit to a saddle fixed point. The stationary Wong-Zakai approximation of white noise by Lu and Wang was extended to high-dimensional situations by Shen and Lu in [10], where it was proved that the solutions of Wong-Zakai approximations converge in the mean square to the solutions approximation converge to the solution to the original stochastic differential equation driven by the fBm itself. Our result on the convergence of solutions can be seen as a slight generalisation of similar results derived in [29,30]; see Remark 4. We will focus on the case of fBm, in which we can apply some fine, known properties to obtain the exact rate of convergence. However, we stress that if one is simply interested in the convergence but not in the exact rate, our method and many of our results can be easily extended to cover a very large class of stochastic processes; see the discussion in Section 5. This class includes all Hölder continuous Gaussian processes and also Hölder continuous processes living in some fixed chaos. This provides, among other things, information on stochastic differential equations driven by Hermite processes, a topic that has not received much attention in the literature. In fact, our approximation converges under very mild conditions on the process, and the approximation for the process is stationary as long as the process has stationary increments. In the context of stochastic differential equations, we prove that as long as the true noises are approximated properly in a certain Besov-type space, the corresponding solutions converge as well. This justifies the claim presented in the abstract.
The rest of the article is organised as follows. In Section 2, we recall some preliminaries needed for our analysis. In particular, we recall the concept of generalised Lebesgue-Stieltjes integrals together with some useful inequalities. In Section 3, we introduce a stationary Wong-Zakai approximation of the fractional noise and study its convergence on the full range H ∈ (0, 1). In Section 4, we study the effects of noise perturbations on the solutions of differential equations driven by Hölder signals of order α > 1/2; there, we also apply the results of Section 3, on the range H ∈ (1/2, 1), to study Wong-Zakai approximations of stochastic differential equations driven by fractional Brownian motions. We end the paper with a short discussion on the generality of our results.

Preliminaries
In this section, we briefly discuss some preliminaries that we need for our analysis. In particular, we recall the notion of generalised Lebesgue-Stieltjes integrals and some a priori estimates. For more details, we refer to the article [31] and the monograph [32].
Let (a, b) be a nonempty bounded interval. For given p ∈ [1, ∞], we use the usual notation $L^p = L^p(a, b)$ to denote the p-integrable functions, or the essentially bounded functions in the case p = ∞. The fractional left and right Riemann-Liouville integrals of order α > 0 of a function $f \in L^1$ are given by (see [32] Definition 2.1) and It is known that the above integrals converge for almost every t ∈ (a, b), and $I^\alpha_{a+} f$ and $I^\alpha_{b-} f$ may be considered as functions in $L^1$. By convention, $I^0_{a+}$ and $I^0_{b-}$ are defined as the identity operator. Moreover, the integral operators $I^\alpha_{a+}, I^\alpha_{b-} : L^1 \to L^1$ are linear and injective. The Riemann-Liouville derivatives are defined through identities D α [32] (Definition 2.2), and they can be viewed as the (left-)inverse operators $I^{-\alpha}_{a+} = (I^\alpha_{a+})^{-1}$ and $I^{-\alpha}_{b-} = (I^\alpha_{b-})^{-1}$, respectively, of the Riemann-Liouville fractional integrals; see [32] (Theorem 2.4). That is, we have, e.g., $D^\alpha_{a+} I^\alpha_{a+} f = f$. For any α ∈ (0, 1) and for any $f \in I^\alpha_{a+}(L^1)$ and $g \in I^\alpha_{b-}(L^1)$, the Weyl-Marchaud derivatives are defined by formulas (see [32] Equations (13.2) and (13.5)) and These are well-defined and, by [32] (Theorem 13.1 and Corollary on p. 228), they coincide with the Riemann-Liouville derivatives for almost every t ∈ (a, b).
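As a numerical illustration of the fractional integrals recalled above, the following sketch evaluates the left-sided integral under the assumption that it takes the standard form $I^\alpha_{a+} f(t) = \frac{1}{\Gamma(\alpha)}\int_a^t (t-s)^{\alpha-1} f(s)\,ds$. The function name `rl_left_integral` and the quadrature scheme are illustrative choices only. The result can be compared with the classical identity $I^\alpha_{a+}(\cdot - a)^\mu(t) = \frac{\Gamma(\mu+1)}{\Gamma(\mu+\alpha+1)}(t-a)^{\mu+\alpha}$.

```python
import math


def rl_left_integral(f, a, t, alpha, n=4000):
    """Approximate (1/Gamma(alpha)) * int_a^t (t - s)^(alpha - 1) f(s) ds.

    Assumes the standard left-sided Riemann-Liouville definition.
    The substitution u = (t - s)^alpha removes the endpoint singularity:
    the integral equals (1/alpha) * int_0^{(t-a)^alpha} f(t - u^(1/alpha)) du,
    which is evaluated here with the midpoint rule on n cells.
    """
    upper = (t - a) ** alpha
    h = upper / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h
        total += f(t - u ** (1.0 / alpha))
    return total * h / (alpha * math.gamma(alpha))


# Compare with the closed form for f(s) = s on (0, 2) and alpha = 0.3.
alpha, t = 0.3, 2.0
numeric = rl_left_integral(lambda s: s, 0.0, t, alpha)
closed_form = math.gamma(2.0) / math.gamma(2.0 + alpha) * t ** (1.0 + alpha)
print(numeric, closed_form)
```

The substitution is used because a naive quadrature of the kernel $(t-s)^{\alpha-1}$ converges slowly near s = t when α < 1.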
We are now ready to recall the concept of generalised Lebesgue-Stieltjes integrals. For this, let f and g be functions such that the limits f(a+), g(a+), g(b−) exist in $\mathbb{R}$, and denote is well-defined. Moreover, it can be proved that the right side does not depend on α.
We follow closely the approach of [31]; for this, we need to introduce some necessary spaces and norms. For β ∈ (0, 1/2), we denote by $W^{1,1-\beta}(0, T; \mathbb{R}^m)$ the space of measurable functions with values in $\mathbb{R}^m$ equipped with a norm Here and below, | · | denotes the Euclidean norm. Similarly, we use a norm We also consider the function space $W^{\beta,\infty}(0, T; \mathbb{R}^m)$, the space of measurable functions with values in $\mathbb{R}^m$ with a norm Clearly, we have which we will use in the sequel. Similarly, we denote by W β, We also denote by $C^\beta$ the space of β-Hölder continuous functions. Finally, we recall that if $f \in W^{\beta,\infty}(0, T; \mathbb{R}^{n\times m})$ and $g \in W^{1,1-\beta}(0, T; \mathbb{R}^m)$, then the integral $\int_0^t f_s\,dg_s$ exists for all t ∈ [0, T]. Moreover, the integral belongs to $C^{1-\beta}(0, T; \mathbb{R}^n) \subset W^{\beta,\infty}(0, T; \mathbb{R}^n)$. Actually, by Proposition 4.1 of [31], we have an estimate on $\int_0^{\cdot} f_s\,dg_s$. This yields immediately a bound $\left|\int_s^t f_u\,dg_u\right| \le C|t-s|^{1-\beta}\,\|g\|_{1,1-\beta}\,\|f\|_{\beta,\infty}$.
Moreover, the integral also satisfies the bound $\left|\int_0^t f_s\,dg_s\right| \le C\,\|g\|_{1,1-\beta}\,\|f\|_{2,\beta}$ (7), which we will use throughout the paper. We also recall the following Gronwall-type lemma that is needed for the proof of Theorem 2.
be a continuous function such that for each t we have Then, where c α and d α are positive constants depending only on α.

Wong-Zakai Approximations for the Fractional Noise
Let $W^H(t)$ be an m-dimensional fractional Brownian motion with Hurst index H ∈ (0, 1) on a complete probability space $(\Omega, \mathcal{F}, P_H)$, i.e., the components $W^{H,j}(t)$, j = 1, 2, . . . , m, of $W^H(t)$ are independent centred Gaussian processes with the covariance function $\mathbb{E}[W^{H,j}(t)W^{H,j}(s)] = \frac{1}{2}\left(|t|^{2H} + |s|^{2H} - |t-s|^{2H}\right)$ (8). Throughout the article, we denote by $L^p(\Omega)$ the space of random variables with finite pth moment. In the sequel, we will work with the canonical version of the fBm. In fact, let $\Omega = C_0(\mathbb{R}; \mathbb{R}^m)$ be the space of continuous paths with value zero at zero, equipped with the compact open topology. Let $\mathcal{F}$ be the Borel σ-algebra and let $P_H$ be the distribution of $W^H$. We consider the Wiener shift given by $\theta_t\omega(\cdot) = \omega(\cdot + t) - \omega(t)$, where $t \in \mathbb{R}$ and $\omega \in C_0(\mathbb{R}; \mathbb{R}^m)$. It follows from ([33] Theorem 1) that the quadruple $(\Omega, \mathcal{F}, P_H, \theta)$ is an ergodic metric dynamical system. In the sequel, we will identify $W^H(\cdot, \omega)$ with the continuous path ω(·). For each δ > 0, we define the random variable $G_\delta : \Omega \to \mathbb{R}^m$ by Then, and it follows from stationarity of the increments of fBm (see, e.g., [20,21]) that $G_\delta(\theta_t\omega)$ is a stationary Gaussian process. Let Since $\lim_{s\to\pm\infty} |\omega(s)|/|s| = 0$, we also obtain $|\omega(s)| \le C_\omega(|s| + 1)$.
Together with $\theta_t\omega(s) = \omega(s + t) - \omega(t)$, this gives In particular, we have which will be used later on. Next, we show that $G_\delta(\theta_t\omega)$ can be viewed as an approximation of fractional white noise in the Wong-Zakai sense. The following result provides us with a first estimate on the convergence rate in the mean sense. In Proposition 1, we apply the properties of the fBm to provide the exact rate of convergence. However, we wish to present the following simple proof that can be adapted easily to more general cases than fBm (see also Remark 1 and the discussion in Section 5).
Moreover, for any p ≥ 1, there exists a constant C > 0 such that for any t ∈ [0, T], we have Consequently, Hence, by using the uniform continuity of ω on the interval [0, that proves the first claim. For the second claim, basic inequality together with Minkowski inequality implies the claim once we have proved the claim for m = 1, i.e., in the one-dimensional case. This now follows easily. Indeed, by Gaussianity we have This concludes the proof.
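The pathwise part of the argument above can be illustrated numerically. The sketch below assumes that the approximation is the difference quotient $G_\delta(\theta_s\omega) = (\omega(s+\delta) - \omega(s))/\delta$; this is an assumption made for illustration, since the defining display is not reproduced here. Under it, $\int_0^t G_\delta(\theta_s\omega)\,ds = \frac{1}{\delta}\int_t^{t+\delta}\omega(s)\,ds - \frac{1}{\delta}\int_0^\delta \omega(s)\,ds$, which tends to ω(t) by continuity. The deterministic path `omega`, Hölder continuous of order 0.7 with ω(0) = 0, is a hypothetical stand-in for a sample path.

```python
import numpy as np


def omega(s):
    """Deterministic test path, Hölder of order 0.7, with omega(0) = 0."""
    return np.abs(s) ** 0.7 * np.cos(3.0 * s)


def integrated_noise(path, t, delta, n=20000):
    """Midpoint-rule value of int_0^t (path(s + delta) - path(s)) / delta ds."""
    s = (np.arange(n) + 0.5) * (t / n)
    return np.sum(path(s + delta) - path(s)) * (t / n) / delta


def sup_error(path, delta, T=1.0, grid=50):
    """Largest |int_0^t G_delta ds - path(t)| over a grid of t in (0, T]."""
    ts = np.linspace(T / grid, T, grid)
    return max(abs(integrated_noise(path, t, delta) - path(t)) for t in ts)


# The uniform error shrinks as delta decreases.
errors = {delta: sup_error(omega, delta) for delta in (0.2, 0.1, 0.05, 0.025)}
for delta, err in errors.items():
    print(f"delta = {delta:5.3f}   sup-error = {err:.4f}")
```

A deterministic path keeps the example reproducible; replacing `omega` by a simulated fBm path would illustrate the same convergence for a random driver.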

Remark 1.
We remark that the above statement remains valid for a much larger class of processes. Actually, we have only used continuity and hypercontractivity $\mathbb{E}|X_t|^p \le C_p[\mathbb{E}|X_t|]^p$, valid for Gaussian processes X, together with $\mathbb{E}|X_t - X_s| \le C|t-s|^H$. Thus, the above result easily extends to arbitrary Gaussian processes with Hölder continuous paths (see, e.g., [34]) and beyond. On the other hand, in particular cases such as fBm, one can refine the arguments and obtain the exact rate of convergence; see Proposition 1 and Theorem 1 below. For a detailed discussion on the generalisations, see Section 5.
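For a centred Gaussian random variable, the hypercontractivity constant in Remark 1 can be written down explicitly, because $\mathbb{E}|X|^p = \sigma^p\, 2^{p/2}\,\Gamma\!\left(\frac{p+1}{2}\right)/\sqrt{\pi}$ for $X \sim N(0, \sigma^2)$. The following sketch computes the resulting constant, for which equality $\mathbb{E}|X|^p = C_p(\mathbb{E}|X|)^p$ holds, and checks the moment formula against a plain quadrature. The function names are illustrative.

```python
import math


def gaussian_abs_moment(p, sigma=1.0):
    """E|X|^p for X ~ N(0, sigma^2), from the Gamma-function formula."""
    return sigma ** p * 2.0 ** (p / 2.0) * math.gamma((p + 1.0) / 2.0) / math.sqrt(math.pi)


def hypercontractivity_constant(p):
    """C_p such that E|X|^p = C_p * (E|X|)^p for every centred Gaussian X."""
    return gaussian_abs_moment(p) / gaussian_abs_moment(1.0) ** p


def abs_moment_quadrature(p, sigma=1.0, n=20000, cutoff=12.0):
    """Midpoint-rule value of E|X|^p computed from the Gaussian density."""
    h = 2.0 * cutoff * sigma / n
    total = 0.0
    for k in range(n):
        x = -cutoff * sigma + (k + 0.5) * h
        total += abs(x) ** p * math.exp(-0.5 * (x / sigma) ** 2)
    return total * h / (sigma * math.sqrt(2.0 * math.pi))


for p in (1.0, 2.0, 3.0, 4.0):
    print(p, hypercontractivity_constant(p))
```

The constant does not depend on the variance, which is why a single bound on the first absolute moment controls all higher moments.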
In order to establish convergence of the solutions of the approximating stochastic differential equations, we need convergence in the space $W^{1,1-\beta}$. This, in turn, can be achieved by proving convergence in the space $L^p(\Omega)$. We start with a result stating the exact error in $L^p(\Omega)$ in the one-dimensional case, which may be of independent interest.
It turns out that Θ(x) plays a crucial role in the rate of convergence. We also note that, while here, the integral could be computed explicitly, the form (11) is the most convenient for our proofs. We begin with the following lemma that studies the properties of Θ(x).
we have
Proof. We begin with the first claim concerning asymptotic behaviour as x → ∞. For any x > 1, we have Moreover, by Taylor approximation, we have, for any a ∈ [−1, 1], where the remainder satisfies proving the first claim. For the second claim concerning asymptotic behaviour as x → 0, let x < 1. It suffices to prove that Indeed, combining (12) and (13) leads to We begin by showing (12). We compute Hence, by applying Taylor's theorem and L'Hôpital's rule, we can conclude and, in particular, that (12) holds. For (13), we can compute as above to obtain Direct calculations show that Hence, by using Taylor's approximation again, we obtain
This shows (13) and concludes the whole proof.

Proposition 1.
Let H ∈ (0, 1) be fixed and let the number of dimensions m = 1. Then, the error process has stationary increments, and for any p ≥ 1 and any t ≥ 0 we have where Γ(z) denotes the Gamma function and Θ(x) is given by (11).
Proof. Denote the error process by $[\Delta_\delta\omega]$, i.e., For any t > r, we have and since fBm has stationary increments, it follows that where $\overset{\text{law}}{=}$ denotes equality in distribution. Thus, the error process $x_t$ has stationary increments. It remains to prove (14). First, we observe that since $x_t$ is a Gaussian process and for Z ∼ N(0, 1), it suffices to consider the case p = 2. We compute After taking expectation, using Fubini's theorem, and plugging in (8), we can use some elementary manipulations to collect similar terms together and obtain Here, we obtain by interchanging the roles of s and u that Treating the other terms similarly, we obtain that Now, the claim follows directly from the change of variables u → δu and s → δs.
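The scaling behind the final change of variables can be checked numerically. The sketch below assumes the difference-quotient form $G_\delta(\theta_s\omega) = (\omega(s+\delta) - \omega(s))/\delta$ (an assumption, as the defining display is not reproduced here), so that the error at time t equals $\frac{1}{\delta}\int_0^\delta \left[\omega(t+u) - \omega(u) - \omega(t)\right]du$. Its variance is then a double integral of fBm covariances, and self-similarity suggests that it equals $\delta^{2H}$ times a function of t/δ only. The function names are illustrative.

```python
import numpy as np


def fbm_cov(a, b, H):
    """Covariance of a one-dimensional fBm with Hurst index H."""
    return 0.5 * (np.abs(a) ** (2 * H) + np.abs(b) ** (2 * H) - np.abs(a - b) ** (2 * H))


def error_variance(t, delta, H, n=400):
    """Variance of (1/delta) * int_0^delta [w(t+u) - w(u) - w(t)] du.

    The double integral over [0, delta]^2 is approximated by the midpoint
    rule; the integrand is the covariance of the bracket at u and at v.
    """
    u = (np.arange(n) + 0.5) * (delta / n)
    U, V = np.meshgrid(u, u, indexing="ij")
    left = [(1.0, t + U), (-1.0, U), (-1.0, np.full_like(U, t))]
    right = [(1.0, t + V), (-1.0, V), (-1.0, np.full_like(V, t))]
    cov = np.zeros_like(U)
    for ci, pi in left:
        for cj, pj in right:
            cov += ci * cj * fbm_cov(pi, pj, H)
    return float(cov.mean())


H = 0.7
v_small = error_variance(2.0, 0.5, H)
v_unit = error_variance(4.0, 1.0, H)   # same ratio t / delta = 4
print(v_small, 0.5 ** (2 * H) * v_unit)
```

For H = 1/2 and t ≥ δ the two averaged pieces are independent, each with variance δ/3, which gives the value 2δ/3 as a simple cross-check.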
We are now ready to formulate the main result of the article that provides the rate of convergence in the space $W^{1,1-\beta}$. Up to multiplicative constants, our result is sharp.

Theorem 1.
Let H ∈ (1/2, 1) and β ∈ (1 − H, H). Then, for any p ≥ 1, there exists a constant C = C(H, β, m, p), depending only on the parameters H, β, p and the number of dimensions m, such that for any δ < 1 we have
Proof. Using the elementary inequality we obtain that it suffices to consider the one-dimensional case. Throughout the proof, we denote by C any generic unimportant constant (depending on the parameters H, β, p, and m) that may vary from line to line. As in the proof of Proposition 1, we use the short notation $[\Delta_\delta\omega]$ for the error process. We begin with the lower bound, which is rather easy. Indeed, for any 0 < s < t < T, we have Consequently, Proposition 1 yields where $v = (t-s)/\delta$. This proves the lower bound. Consider next the upper bound. By Minkowski's inequality, we can study the two terms separately. For the first one, a simple application of the Borell-TIS inequality (see, e.g., [35] Theorem 2.1.1) implies that, for any topological space T and a centred Gaussian family Together with Proposition 1, this leads to an upper bound Consider next the term (15). By Proposition 1, $[\Delta_\delta\omega]$ has stationary increments. Thus, where $\overset{\text{law}}{=}$ denotes equality in distribution. Now, Minkowski's integral inequality together with Proposition 1 yields Together with (16), this concludes the proof.

Noise Perturbations for Differential Equations Driven by Hölder Functions
In this section, we apply Theorem 1 to study approximations of solutions to stochastic differential equations driven by fBm. However, in view of Remark 1, we present the main result of this section, Theorem 2, in a general form. The detailed analysis of the case of the fBm is postponed to Section 4.1.
Consider the following differential equation, driven by $\omega \in C^\alpha$ for some α > 1/2. To ensure the existence and uniqueness of the solution, we adopt the following slightly modified (cf. Remark 2) assumptions from [31] for the coefficient functions $f : \Omega \times [0, T] \times \mathbb{R}^n \to \mathbb{R}^n$ and $\sigma : \Omega \times [0, T] \times \mathbb{R}^n \to \mathbb{R}^{n\times m}$:
(i) There exists a constant M > 0 such that for all $x, y \in \mathbb{R}^n$ and all t ∈ [0, T]
(ii) For any N > 0, there exists a constant $M_N > 0$ such that for all |x|, |y| ≤ N and all t ∈ [0, T]
(iii) There exist a constant $K_0 > 0$ and ζ ∈ [0, 1] such that for all $x \in \mathbb{R}^n$ and all t ∈ [0, T] $|\sigma(t, x)| \le K_0(1 + |x|^\zeta)$. (21)
(iv) For any N > 0, there exists a constant $L_N > 0$ such that for all |x|, |y| ≤ N and all t ∈ [0, T]
(v) There exist a constant $L_0 > 0$ and a function $b_0 \in L^\rho(0, T; \mathbb{R}^n)$, where ρ > 2, such that

Remark 2.
The existence and uniqueness result of [31] holds under slightly more general assumptions compared to ours. That is, we have assumed Lipschitz continuity in (19) and (20), while in [31] these are replaced with Hölder continuity of suitable order. For the sake of simplicity and in order to avoid adding extra layers of parameters and complexity to the proof of Theorem 2, which is already technical and rather lengthy, we work with the above set of assumptions. Note also that in condition (v), we do not allow ρ = 2, which at first glance might look slightly different compared to (H2) of [31]. However, existence and uniqueness in the space $W^{\beta,\infty}$ follows from ([31] Theorem 2.1) provided that β ∈ (1 − α, 1/2) and $\rho \ge 1/\beta$. These conditions cannot be satisfied simultaneously if ρ = 2, since β < 1/2 forces 1/β > 2.
Remark 3.
Note that for differential equations driven by $\omega \in C^\alpha$, one could define integrals in the sense of Young as well, and consider p-variations of the solution (with a suitable parameter p). Here, we have considered integrals defined through fractional calculus instead, as the (weighted) fractional-order Sobolev spaces introduced in Section 2 are naturally linked to Hölder continuous functions, and even our stationary approximation can be viewed as a smoothing of the process through integration. In this sense, integration based on fractional calculus seems a more natural approach than the analysis of p-variation and Young integration. On the other hand, functions of bounded p-variation are intimately connected to Hölder functions as well, and it is already well understood that in the setting of the present paper, both approaches lead to solutions that coincide.
Since ω is, in general, not smooth, Equation (17) is understood in integral form Note that, here, we have used u(t) for the sake of notational simplicity, although u also depends on the path of ω, i.e., the solution is a flow u(t, ω). The integral in (24) is understood in the generalised Lebesgue-Stieltjes sense. Let β ∈ (1 − α, 1/2) be such that $\rho \ge 1/\beta$. Then, by ([31] Theorem 2.1), Equation (17) has a unique solution in the space where $C_1$ and $C_2$ depend on T, β, α, and the constants appearing in (18)-(23). Finally, κ depends solely on ζ and β. More precisely (cf. [31], Proposition 5.1), Now, let $G_\delta \in C^\alpha$ be an arbitrary approximation of the noise ω (for example, a smooth approximation). Then, we approximate (24) with Since $G_\delta \in C^\alpha$ as well, Equation (27) admits a unique solution $u_\delta \in W^{\beta,\infty}$. It turns out that the solutions $u_\delta$ of (27) converge towards the solution u of (24) as long as $G_\delta$ converges towards the noise ω of (24) in the space $W^{1,1-\beta}$.

Remark 4.
In comparison with the existing literature, our result is very similar to the main results of [29,30] and does not provide any surprises to an educated reader. Indeed, in [29], the authors studied a differential equation of form with Hölder continuous (of sufficient order) driver y; in this sense, our result simply extends the main results of [29] to cover the more complicated coefficient σ and the case of a nontrivial drift coefficient f in (17). Similarly, in [30], the authors studied the same general equation given by (17). However, in [30], the authors only provided convergence in the uniform norm $\|\cdot\|_\infty = \sup_{t\in[0,T]} |\cdot|$, while here, we provide convergence in the fractional Sobolev norm $\|\cdot\|_{\beta,\infty}$. It is also worth noting that our method differs slightly from the one presented in [30]. As such, we present the result and its proof for the reader's convenience, despite the fact that our result can be seen only as a slight technical generalisation of the results presented in [29,30]. For related literature, see [36,37] concerning the continuity of the Itô map under various norms and settings.
Proof. Our aim is to apply Lemma 1. For this, we denote It follows from (25) that there exists a large N such that $\sup_\delta \|u_\delta\|_{\beta,\infty} \le N$ and $\|u\|_{\beta,\infty} \le N$. This allows us to use a localisation argument in order to apply conditions (18)-(23). Throughout the proof, we denote by K a generic constant that may vary from line to line. We stress that K may depend on N and the constants appearing in conditions (18)-(23). K may also depend on ω, T, and β. However, K is independent of δ, although from time to time, we also apply the bound $\|u_\delta\|_{\beta,\infty} \le N$ and absorb it into the constant K, whenever confusion cannot arise.
We begin by computing Note also that (I 1 (t) + I 2 (t)).
Let s = r − (t − r)z. For the first term, we have This yields For the second term, we have For the third term, we obtain Finally, for the fourth term, we obtain Consequently, by combining the estimates (31)-(36), we obtain an estimate With (30), this leads to and thus, we may apply Lemma 1 to obtain This shows (28) and completes the proof.

Wong-Zakai Approximations for SDEs Driven by fBm
Consider the following stochastic differential equations (SDEs) driven by fBm where H > 1/2, and the corresponding Wong-Zakai approximation where $G_\delta(\theta_t\omega)$ is given by (9). We denote Then, Equation (39) can be rewritten as and Equations (39) and (38) can be written in integral forms as and where we have identified $W^H$ with ω. Together with (10), conditions (18)-(23) give the existence and uniqueness of the solution of the random ordinary differential Equation (40) by standard arguments. On the other hand, the solution to (40) is the solution to (41), where the integral can also be understood in the generalised Lebesgue-Stieltjes sense. Thus, again by [31] (Theorem 2.1), Equation (41) has a unique solution in the space The following theorem shows that one can approximate solutions to SDEs driven by fBm by studying equations driven by our smooth stationary approximation. The claim follows essentially by repeating arguments presented in the proof of Theorem 1 and in [31]. For this reason, we only sketch the main ideas and omit the details.

Theorem 3.
Suppose that f and σ satisfy conditions (18)-(23) and fix β ∈ (1 − H, 1/2) such that $\rho \ge 1/\beta$. Let $u_\delta$ and u be the (unique in $C^{1-\beta}$) solutions to (41) and (42), respectively. Then, as δ → 0,

2. If, in addition, $\alpha < (2-\zeta)/4$ and the constants and the function $b_0$ in (18)-(23) are deterministic and can be chosen independently of N, there exists a constant K, independent of δ, such that
Proof. The first claim follows immediately by combining the statements of Theorem 1 and Theorem 2 together with the fact that convergence in $L^p(\Omega)$ implies convergence in probability. Similarly, following the proof of Theorem 2 and applying (25) gives us where κ is given by (26). The claim is then proved by estimating the norm $\|\cdot\|_{1,1-\beta}$ from above by the Hölder norm, observing that the given assumptions imply κ < 2, repeating the arguments of [31] or [34] to estimate exponential moments of the Hölder norm, and using the Hölder inequality together with Theorem 1. The repetition of the technical details is left to the reader.
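The convergence of solutions can be illustrated on a simple pathwise example. The sketch below makes several assumptions for illustration only: the approximation is taken to be the difference quotient $G_\delta(\theta_t\omega) = (\omega(t+\delta) - \omega(t))/\delta$, the drift is zero, the diffusion coefficient is the linear choice σ(u) = u, and a deterministic path `omega`, Hölder continuous of order 0.7, stands in for a sample path of fBm with H > 1/2. For such a driver the chain rule for integrals with respect to Hölder functions of order greater than 1/2 gives the closed-form limit $u(t) = u_0 e^{\omega(t)}$, against which the solution of the random ordinary differential equation is compared. The solver is a classical fourth-order Runge-Kutta scheme, not a method taken from the text.

```python
import math


def omega(s):
    """Deterministic test path, Hölder of order 0.7, with omega(0) = 0."""
    return abs(s) ** 0.7 * math.cos(3.0 * s)


def noise(t, delta):
    """Assumed stationary approximation: difference quotient of the path."""
    return (omega(t + delta) - omega(t)) / delta


def solve_wong_zakai(delta, sigma, u0=1.0, T=1.0, steps=4000):
    """Classical RK4 for u'(t) = sigma(u(t)) * noise(t, delta) on [0, T]."""
    h = T / steps
    u = u0
    for k in range(steps):
        t = k * h
        g_left = noise(t, delta)
        g_mid = noise(t + 0.5 * h, delta)
        g_right = noise(t + h, delta)
        k1 = sigma(u) * g_left
        k2 = sigma(u + 0.5 * h * k1) * g_mid
        k3 = sigma(u + 0.5 * h * k2) * g_mid
        k4 = sigma(u + h * k3) * g_right
        u += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return u


def linear_sigma(u):
    """Linear diffusion coefficient used for the closed-form comparison."""
    return u


exact = math.exp(omega(1.0))   # limit solution at T = 1 with u0 = 1
deltas = (0.2, 0.1, 0.05, 0.025)
terminal_errors = [abs(solve_wong_zakai(d, linear_sigma) - exact) for d in deltas]
for d, err in zip(deltas, terminal_errors):
    print(f"delta = {d:5.3f}   |u_delta(1) - u(1)| = {err:.4f}")
```

The linear coefficient is chosen only because it admits an explicit limit; other coefficients satisfying the standing assumptions can be passed as `sigma`, in which case a reference solution has to be computed numerically.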

Remark 5.
As expected, the sharp rate of convergence in Theorem 1 translates into a similar rate of convergence for the solutions. While here we have only provided an upper bound, the obtained rate is the best one can hope for in the general setting that covers all the possible choices of the coefficients f and σ.

Discussion
In this article, we have introduced Wong-Zakai approximations of the fractional noise and studied their convergence properties. Our approximation is valid on the full range H ∈ (0, 1) of the Hurst parameter. Moreover, our main result, Theorem 1, provides a sharp rate of convergence for our approximation. As an application, we proved that, for the case H > 1/2, the solutions of the approximating differential equations converge, in the norm $\|\cdot\|_{\beta,\infty}$, towards the original solution. The only feature needed for the convergence of approximating solutions to differential equations is that the approximation of the noise converges in $W^{1,1-\beta}$. Indeed, on top of technical computations, the only additional ingredient in the proof of Theorem 2 was the deterministic Gronwall-type Lemma 1. Our Theorem 2 can be seen as a slight generalisation of similar results provided in the literature.
In the particular case of the fractional Brownian motion, convergence of the approximation together with a sharp rate of convergence can be seen from Lemma 3 and Proposition 1, whose proof uses the fact that the underlying process is the fBm. This, in turn, translates into a sharp result, provided in Theorem 1, on the rate of convergence in the space $W^{1,1-\beta}$. In addition, the proof of Theorem 1 also applies Gaussianity and stationarity of the increments. However, as pointed out already in Remark 1, the convergence in $L^p(\Omega)$ follows immediately if The convergence in $W^{1,1-\beta}$, on the other hand, could be based on the Garsia-Rodemich-Rumsey lemma [38], which provides an inequality valid for all continuous functions f on [0, T], all p ≥ 1, and $\alpha > 1/p$. Using this, one can follow the proof of ([31] Lemma 7.4) or ([34] Theorem 1) and, assuming hypercontractivity, eventually obtain convergence of the moments of $\|[\Delta_\delta\omega]\|_{1,1-\beta}$. Unfortunately, however, while providing the desired convergence in the space $W^{1,1-\beta}$, this approach does not yield (in any straightforward manner) good bounds for the rate of convergence. Indeed, even in the case of fBm, one obtains that the pth moment is proportional to $\delta^\gamma$, where γ ∈ (0, 1) can be chosen arbitrarily and p has to be chosen large enough. Thus, the obtained rate of convergence in $L^p(\Omega)$ is proportional to $\delta^{\gamma/p}$, which, in turn, means that one obtains a worse rate for higher moments. This is significantly worse than the sharp rate $\delta^{H+\beta-1}$, valid for all p ≥ 1, provided in Theorem 1.
On the positive side, the Garsia-Rodemich-Rumsey inequality still provides the required convergence (despite poor rate of convergence) with very modest assumptions, namely, hypercontractivity and Hölder continuity, on the underlying process ω. This fact extends our results to cover a rather rich class of stochastic processes. For example, all Hölder continuous Gaussian processes fall into this category, see ( [34] Theorem 1). Interesting non-Gaussian examples include kth-order Hermite processes. They share many properties with the fBm including covariance structure, Hölder continuity, and self-similarity, though they are not Gaussian processes and instead belong to the kth chaos (see, e.g., [39] for definition and details). In this case, (43) and (44) are both valid, and consequently, the results of this paper provide stationary approximations to Hermite processes such that the corresponding solutions of stochastic differential equations converge as well. It is worth noting that, to the best of our knowledge, stochastic differential equations driven by Hermite processes have not been extensively studied in the literature.
Finally, we emphasise that the stationary Wong-Zakai approximation and the exact $L^p(\Omega)$-error provided in Proposition 1 are valid for all H ∈ (0, 1), while the convergence of solutions to SDEs, studied in Section 4 (cf. Theorem 1), requires H > 1/2. This raises a natural question for future research on the convergence of solutions in the region H ∈ (0, 1/2), in which case rough path theory or some other method of integration has to be considered.