Article

Moderate Averaged Deviations for a Multi-Scale System with Jumps and Memory

1 Mathematics Department, Universidade Estadual de Campinas, Campinas 13081-970, SP, Brazil
2 ParisTech Applied Mathematics Department, ENSTA, 828 Boulevard des Maréchaux, 91120 Palaiseau, France
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Dynamics 2023, 3(1), 171-201; https://doi.org/10.3390/dynamics3010011
Submission received: 31 January 2023 / Revised: 28 February 2023 / Accepted: 7 March 2023 / Published: 14 March 2023
(This article belongs to the Topic Advances in Nonlinear Dynamics: Methods and Applications)

Abstract

This work studies a two-time-scale functional system given by two jump diffusions under scale separation by a small parameter $\varepsilon \to 0$. The coefficients of the equations that govern the dynamics of the system depend on the segment process of the slow variable (responsible for capturing delay effects on the slow component) and on the state of the fast variable. We derive a moderate deviation principle for the slow component of the system in the small noise limit using the weak convergence approach. The rate function is written in terms of the averaged dynamics associated with the multi-scale system. The core of the proof of the moderate deviation principle is the establishment of an averaging principle for the auxiliary controlled processes associated with the slow variable in the framework of the weak convergence approach. The controlled version of the averaging principle for the jump multi-scale diffusion relies on a discretization method inspired by Khasminskii's classical averaging principle.

1. Introduction

With a fixed terminal time $T > 0$ and a fixed delay $\tau > 0$, we consider, in the small noise limit $\varepsilon \to 0$, the two-time-scale stochastic system given for any $t \in [0, T]$ by
$$
\begin{aligned}
dX^{\varepsilon}(t) &= a(X_t^{\varepsilon}, Y^{\varepsilon}(t))\,dt + \sqrt{\varepsilon}\,\sigma(X_t^{\varepsilon})\,dB_1(t) + \varepsilon \int_{\mathbb{X}} c(X_t^{\varepsilon}, z)\,\tilde{N}^{\frac{1}{\varepsilon}}(dt, dz);\\
dY^{\varepsilon}(t) &= \frac{1}{\varepsilon} f(X_t^{\varepsilon}, Y^{\varepsilon}(t))\,dt + \frac{1}{\sqrt{\varepsilon}}\, g(X_t^{\varepsilon}, Y^{\varepsilon}(t))\,dB_2(t) + \int_{\mathbb{X}} h(X_t^{\varepsilon}, Y^{\varepsilon}(t), z)\,\tilde{N}^{\frac{1}{\varepsilon}}(dt, dz).
\end{aligned}
\tag{1}
$$
For every $\varepsilon > 0$, the stochastic process $(X^{\varepsilon}(t), Y^{\varepsilon}(t))_{t \in [0,T]}$ takes values in $\mathbb{R}^n := \mathbb{R}^d \times \mathbb{R}^k$. The initial datum is $(X_0^{\varepsilon}, Y^{\varepsilon}(0)) = (\chi, y)$, where $\chi$ is a given continuous function from $[-\tau, 0]$ to $\mathbb{R}^d$ (the initial delay segment) and $y \in \mathbb{R}^k$. The processes $X^{\varepsilon}$ and $Y^{\varepsilon}$ are usually designated in the literature as the slow variable and the fast variable, respectively, of the multi-scale stochastic system (1). We draw the reader's attention to the notation $X_t^{\varepsilon}$ for the segment process, i.e., $X_t^{\varepsilon} := \{ X^{\varepsilon}(t + \theta) \mid \theta \in [-\tau, 0] \}$ for any $t \geq 0$. We refer the reader to Chapters 5 and 6 of the book [1] for an introduction to the subject of stochastic functional differential equations with Brownian noise and to [2] for the study of stochastic functional differential equations with jumps. The space of the jump increments $\mathbb{X}$ is Euclidean, and the process $B = (B_1, B_2)$ is a standard Brownian motion (BM for short) with values in $\mathbb{R}^n$, whose first component $B_1$ is a standard BM with values in $\mathbb{R}^d$ and whose second component $B_2$ is an independent $\mathbb{R}^k$-valued standard BM. For every $\varepsilon > 0$, the random measure $\tilde{N}^{\frac{1}{\varepsilon}}$ is an independent compensated Poisson random measure with intensity given by $ds \otimes \frac{1}{\varepsilon} \nu(dz)$, where $ds$ stands for the Lebesgue measure on the real line and $\nu$ is a Lévy measure on $\mathbb{X}$. In this work, we allow $\nu$ to have infinite total mass but require an exponential integrability condition, which says that the big jumps of the underlying Lévy process have exponential moments of order 2. The assumptions on the coefficients of (1) and on the measure $\nu$ will be specified with full rigour in the following section.
Multi-scale stochastic systems such as (1) are nowadays very popular in applied mathematical and physical disciplines since they are successful models for phenomena exhibiting different levels of heterogeneity/homogeneity that can be asymptotically categorized by scaling. This technique exploits the decomposition of the phase space of the model into two sets of variables, those with slow degrees of freedom and those with fast degrees of freedom, through a separation scale given by an intensity parameter measuring this degree of heterogeneity/homogeneity. We refer the reader to [3] and the monograph [4] for an introduction to the subject. Typical examples are multi-factor stochastic volatility models in finance [5,6] and the dynamics of proxy-data in climatology [7], where climatic transitions are understood within the distinction between slow and fast variables that encode different factors used to build statistical parametrizations. In the description of those climatic models, short and large time scales must be taken into consideration (e.g., daily weather forecast vs. climatic prediction) in order to see interesting phenomena such as metastability of the slow variable from an equilibrium state of the deterministic dynamics (cf. Appendix in [7,8,9]). Often in these multi-scale climatic models, the slow variable quantifies data related to large time scales (e.g., climatic data). Multi-scale stochastic systems of the type (1) offer the mathematical formalism necessary to capture more realistic attributes of the underlying stochastic climate model. The paradigmatic example in climate dynamics is the coupling of ocean temperature models (slow variable) with the atmospheric Lorenz equations (fast variable). We refer the reader to [10] for more details. The presence of an underlying Lévy process driving the stochastic dynamics of (1) in the small noise regime models abrupt climate transitions. A typical example is given by the Dansgaard–Oeschger events, which show statistical evidence of underlying jump noise signals (cf. Chapter 10 in [10,11,12,13]). The dependence of the coefficients of (1) on the segment process of the slow variable models the memory effects exhibited by energy balance models such as the ones constructed in [14].
This type of multi-scale system is highly complex and difficult to analyze or simulate. It is desirable to approximate, in a suitable sense, the dynamics of the slow variable by some simpler dynamical system. The idea of the averaging principle, first carried out by Khasminskii in [15], is the following. Assume strong dissipativity conditions on the coefficients of the fast variable that ensure the existence of a unique invariant measure $\mu^{\zeta}$ for the fast variable process with frozen slow variable $\zeta$, and such that a certain ergodic property holds for the mixing coefficient $a$ with respect to (w.r.t.) its average against $\mu^{\zeta}$ (cf. Proposition 3),
$$
\bar{a}(\zeta) := \int_{\mathbb{R}^k} a(\zeta, y)\, \mu^{\zeta}(dy).
\tag{2}
$$
Then the (strong) averaging principle states that for any $T > 0$ and $\delta > 0$, one has
$$
\lim_{\varepsilon \to 0} \mathbb{P}\Big( \sup_{t \in [0,T]} |X^{\varepsilon}(t) - \bar{X}^{0}(t)| > \delta \Big) = 0,
\tag{3}
$$
where $\bar{X}^{0}$ is the unique solution of the functional averaged differential equation
$$
\frac{d}{dt} \bar{X}^{0}(t) = \bar{a}(\bar{X}^{0}_t), \quad t \in [0, T]; \qquad \bar{X}^{0}_0 = \chi.
\tag{4}
$$
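The averaged coefficient in (2) can be illustrated numerically on a toy model. The sketch below is not taken from the article and does not attempt to satisfy all of its hypotheses: it assumes a scalar frozen fast equation $dY = -(Y - \zeta(0))\,dt + dB_2$ without jumps, whose invariant law is the Gaussian $N(\zeta(0), 1/2)$, and a hypothetical slow drift $a(\zeta, y) = -\zeta(-\tau) + \cos(y)$. The function names and the quadrature parameters are illustrative choices.

```python
import math

def gaussian_average(func, mean, var, half_width=8.0, n=4000):
    """Approximate E[func(Y)] for Y ~ N(mean, var) with the trapezoidal rule."""
    std = math.sqrt(var)
    lo, hi = mean - half_width * std, mean + half_width * std
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        y = lo + i * h
        weight = 0.5 if i in (0, n) else 1.0  # trapezoid end-point weights
        density = math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
        total += weight * func(y) * density
    return total * h

def a(zeta_now, zeta_delayed, y):
    """Hypothetical slow drift a(zeta, y) = -zeta(-tau) + cos(y)."""
    return -zeta_delayed + math.cos(y)

def a_bar(zeta_now, zeta_delayed):
    """Averaged drift: integrate a(zeta, .) against mu^zeta = N(zeta(0), 1/2)."""
    return gaussian_average(lambda y: a(zeta_now, zeta_delayed, y), zeta_now, 0.5)

def a_bar_closed_form(zeta_now, zeta_delayed):
    """Closed form using E[cos(Y)] = cos(m) * exp(-s^2 / 2) for Y ~ N(m, s^2)."""
    return -zeta_delayed + math.cos(zeta_now) * math.exp(-0.25)
```

In this toy case the quadrature and the closed form agree, which shows how the fast variable disappears from the drift of the averaged equation (4) while the dependence on the segment (here through $\zeta(0)$ and $\zeta(-\tau)$) remains.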
The averaging principle has applications to problems in celestial (stochastic) mechanics (cf. Chapter 7 in [16]) and climatic energy balance models (cf. [9]), among others, and has a rich and diverse history in the literature. Khasminskii's technique was introduced in [15] and later implemented by Freidlin [17] and Veretennikov [18] in different contexts, finding wide applicability in a diverse range of problems. We refer the reader to the following exemplary but not exhaustive works on weak and strong averaging principles: refs. [19,20,21] concerning multi-scale systems constituted by stochastic partial differential equations (SPDEs for short) driven by space-time white noise; refs. [22,23,24,25] for multi-scale (finite and infinite dimensional) systems constituted by jump diffusions; and refs. [26,27] for stochastic dynamical systems with coefficients functionally dependent with delay. Although the averaging principle (3) yields, for small $\varepsilon > 0$, an approximation of the slow variable process by the averaged dynamics $\bar{X}^{0}$, it says nothing about the rate of convergence. Large and moderate deviation statements provide sharper estimates by identifying a rate of convergence for the limit (3) on an exponentially small scale as $\varepsilon \to 0$, in terms of a deterministic quantity designated the good rate function. We refer the reader to [28,29,30,31] for stochastic averaging under the large deviations regime and to [32,33,34] for averaging under moderate deviation regimes.
The aim of this article is to derive a moderate deviation principle (MDP for short) for $(X^{\varepsilon})_{\varepsilon > 0}$ as $\varepsilon \to 0$. More precisely, we will study deviations of $X^{\varepsilon}$ from the averaged dynamical system $\bar{X}^{0}$; that is,
$$
Z^{\varepsilon} := \frac{X^{\varepsilon} - \bar{X}^{0}}{d(\varepsilon)} \quad \text{as } \varepsilon \to 0,
$$
for certain families of magnitude scales $d(\varepsilon)$ such that $d(\varepsilon) \to 0$ and $b(\varepsilon) := \frac{\varepsilon}{d^2(\varepsilon)} \to 0$ as $\varepsilon \to 0$. We fix $\theta \in \left(\frac{1}{2}, 1\right)$ and let $b(\varepsilon) := \varepsilon^{\theta}$, $\varepsilon > 0$. The restrictions on the range of $\theta$ are due to parametric choices made in the course of the proof of the technical but crucial Lemma 2 in Appendix A. Although we impose restrictions on the magnitudes $d(\varepsilon)$ as stated above, the free parameter $\theta \in \left(\frac{1}{2}, 1\right)$ still covers a wide range of intermediate moderate deviation regimes. Assuming specific hypotheses on the coefficients that guarantee that $\bar{a}$ defined in (2) exists and is Fréchet differentiable with Lipschitz derivative, and that the Lévy measure $\nu$ satisfies a certain exponential integrability property, we prove that the family $(Z^{\varepsilon})_{\varepsilon > 0}$ satisfies a moderate deviation principle with speed $b(\varepsilon) \to 0$ in $D([0,T]; \mathbb{R}^d)$, the space of càdlàg functions endowed with the Skorokhod topology, and with the good rate function $I : D([0,T]; \mathbb{R}^d) \to [0, \infty]$ given by
$$
I(\eta) := \inf_{(f,h) \in L^2([0,T]) \times L^2(\nu \otimes ds)} \frac{1}{2} \left[ \int_0^T |f(s)|^2\, ds + \int_0^T \int_{\mathbb{X}} |h(s,z)|^2\, \nu(dz)\, ds \right],
$$
where for every $(f,h) \in L^2([0,T]) \times L^2(\nu \otimes ds)$ the function $\eta \in C([-\tau, T]; \mathbb{R}^d)$ solves uniquely the skeleton equation
$$
\eta(t) = \int_0^t D\bar{a}(\bar{X}^{0}_s)\, \eta_s\, ds + \int_0^t \sigma(\bar{X}^{0}_s) f(s)\, ds + \int_0^t \int_{\mathbb{X}} c(\bar{X}^{0}_s, z)\, h(s,z)\, \nu(dz)\, ds, \quad t \in [0,T]; \qquad \eta_0 = 0,
\tag{5}
$$
and the function $\bar{X}^{0} \in C([-\tau, T]; \mathbb{R}^d)$ is the unique solution of (4). Here, $a$, $\sigma$ and $c$ are the coefficients of the stochastic Equation (1).
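As a sanity check on the variational form of $I$, consider a hypothetical scalar special case that is not taken from the article: $d = 1$, no jumps in the slow variable ($c \equiv 0$), constant diffusion $\sigma \equiv \sigma_0 > 0$, and an averaged drift without memory whose derivative acts as $D\bar{a}(\bar{X}^{0}_s)\eta_s = -\kappa\, \eta(s)$ for some constant $\kappa$. Under these assumptions the skeleton equation can be solved for the control, and the rate function reduces to a quadratic action functional:

```latex
% Hypothetical special case: d = 1, c = 0, sigma = sigma_0 > 0,
% D\bar{a}(\bar{X}^0_s)\eta_s = -\kappa \eta(s).
% The skeleton equation becomes an ODE driven by the control f:
\[
\dot{\eta}(t) = -\kappa\,\eta(t) + \sigma_0 f(t), \qquad \eta(0) = 0
\quad\Longrightarrow\quad
f(t) = \frac{\dot{\eta}(t) + \kappa\,\eta(t)}{\sigma_0}.
\]
% Since c = 0, the control h does not enter the dynamics,
% so the infimum over h is attained at h = 0:
\[
I(\eta) = \frac{1}{2\sigma_0^{2}} \int_0^T
\bigl|\dot{\eta}(s) + \kappa\,\eta(s)\bigr|^{2}\,ds
\]
% for absolutely continuous \eta with \eta(0) = 0 and square-integrable derivative,
% and I(\eta) = +\infty otherwise (no admissible control produces \eta).
```

This is the kind of quadratic functional that makes moderate deviation rate functions convenient in applications.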
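This means that the functional $I$ has compact sublevel sets $\{ I \leq c \}$ in the Skorokhod topology for any $c \geq 0$ and that for any open set $G \in \mathcal{B}(D([0,T]; \mathbb{R}^d))$ and any closed set $F \in \mathcal{B}(D([0,T]; \mathbb{R}^d))$ the following holds:
$$
\liminf_{\varepsilon \to 0} \varepsilon^{\theta} \ln \mathbb{P}(Z^{\varepsilon} \in G) \geq - \inf_{\eta \in G} I(\eta)
\quad \text{and} \quad
\limsup_{\varepsilon \to 0} \varepsilon^{\theta} \ln \mathbb{P}(Z^{\varepsilon} \in F) \leq - \inf_{\eta \in F} I(\eta).
$$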
We stress that the moderate deviation regime of speed $b(\varepsilon) = \varepsilon^{\theta}$, $\theta \in \left(\frac{1}{2}, 1\right)$, is an intermediate regime between the central limit approximation $d(\varepsilon) = \sqrt{\varepsilon}$ and the large deviation regime $d(\varepsilon) = 1$. The moderate deviation regime is very desirable for applications since the rate function involves a quadratic functional, which is often easier to use in applied problems than the more involved rate functions that appear in large deviation statements. As examples, we refer to refs. [35,36] for the application of moderate deviation principles in finance, to ref. [37] in statistics and to ref. [38], where the moderate deviation regime is used to study the asymptotics of exit times for discrete random dynamical systems.
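The relation between the speed and the magnitude scale can be made explicit. The short computation below assumes that the speed is defined as $b(\varepsilon) = \varepsilon / d^2(\varepsilon)$ and that $b(\varepsilon) = \varepsilon^{\theta}$, as in the Introduction; the numerical values of $\theta$ and $\varepsilon$ are illustrative only.

```latex
% Assumption: b(\varepsilon) = \varepsilon / d^{2}(\varepsilon) and b(\varepsilon) = \varepsilon^{\theta}.
\[
\frac{\varepsilon}{d^{2}(\varepsilon)} = \varepsilon^{\theta}
\quad\Longleftrightarrow\quad
d(\varepsilon) = \varepsilon^{\frac{1-\theta}{2}} .
\]
% d(\varepsilon) \to 0 requires \theta < 1, and b(\varepsilon) \to 0 requires \theta > 0.
% The endpoint \theta = 1 gives d(\varepsilon) = 1 (large deviation scale);
% the endpoint \theta = 0 gives d(\varepsilon) = \sqrt{\varepsilon} (central limit scale).
% Illustrative numbers: \theta = 3/4, \varepsilon = 10^{-4}.
\[
d(\varepsilon) = (10^{-4})^{1/8} = 10^{-1/2} \approx 0.316,
\qquad
b(\varepsilon) = (10^{-4})^{3/4} = 10^{-3}
= \frac{10^{-4}}{(10^{-1/2})^{2}} .
\]
```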
In order to prove our result, we use the weak convergence approach of Dupuis, Ellis, Budhiraja and collaborators, which relies on the equivalence in Polish spaces between the definition of the large deviation principle and the variational principle nowadays known in the literature as the Laplace–Varadhan principle. Initially, Fleming applied methods of stochastic control to large deviation problems in [39,40]. The control-theoretical approach was later used to derive variational formulas for Laplace functionals of Markov processes in different contexts (cf. [41]). In [42], the authors derive a sufficient condition for large deviation principles (LDPs for short) for Brownian diffusions, and later for jump diffusions in [43,44], through the establishment of variational formulas for Laplace functionals of Markov processes. We refer the reader to the recent book [45] for an up-to-date introduction to the subject. In [46], Budhiraja, Dupuis and Ganguly derive a sufficient condition for an MDP that was successfully applied in [47,48] to the study of MDPs for SPDEs. The literature on large and moderate deviation principles for stochastic differential equations with delay is not as extensive as in other domains of application. We refer the reader to the works [49,50], where the authors apply Freidlin–Wentzell types of LDPs to the study of the first exit time problem in the small noise limit for Gaussian diffusions with delay. For the application of the weak convergence approach to the establishment of MDPs for stochastic differential delay equations, we mention the works [51,52].
  • Strategy of the proof.
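The proof of the main result of this work follows from an abstract sufficient condition for moderate deviation principles stated as Theorem 9.9 in [45]. In our case, the application of this abstract condition is not straightforward due to the coupling between the slow variable $X^{\varepsilon}$ and the fast variable $Y^{\varepsilon}$ in (1) with different scaling orders as $\varepsilon \to 0$.
More precisely, the difficult part is to prove directly the following. Fix $\beta \in (0,1)$, $M \geq 0$ and two families of random variables $(\xi^{\varepsilon})_{\varepsilon > 0}$ and $\big(\psi^{\varepsilon} := \frac{\varphi^{\varepsilon} - 1}{d(\varepsilon)}\big)_{\varepsilon > 0}$ such that for any $\varepsilon > 0$ one has $\int_0^T |\xi^{\varepsilon}(s)|^2\, ds \leq M d^2(\varepsilon)$, where $\varphi^{\varepsilon} \geq 0$ satisfies $\int_0^T \int_{\mathbb{X}} \big( \varphi^{\varepsilon}(s,z) \ln \varphi^{\varepsilon}(s,z) - \varphi^{\varepsilon}(s,z) + 1 \big)\, \nu(dz)\, ds \leq M$ $\mathbb{P}$-a.s., and which obey the following convergences in law: $\xi^{\varepsilon} \Rightarrow \xi$ in the $L^2$-weak topology and $\psi^{\varepsilon} \mathbf{1}_{\{ |\psi^{\varepsilon}| \leq \beta / d(\varepsilon) \}} \Rightarrow \psi$ in some ball of $L^2(\nu \otimes ds)$ equipped with the respective $L^2$-weak topology. Consider the family $Z^{\varepsilon} := \frac{X^{\varepsilon} - \bar{X}^{0}}{d(\varepsilon)}$, $\varepsilon > 0$, where $(X^{\varepsilon})_{\varepsilon > 0}$ is defined for every $\varepsilon > 0$ and $t \in [0,T]$ by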
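$$
\begin{aligned}
X^{\varepsilon}(t) ={}& \chi(0) + \int_0^t \Big[ a(X_s^{\varepsilon}, Y^{\varepsilon}(s)) + \sigma(X_s^{\varepsilon})\, \xi_1^{\varepsilon}(s) + \int_{\mathbb{X}} c(X_s^{\varepsilon}, z) \big( \varphi^{\varepsilon}(s,z) - 1 \big)\, \nu(dz) \Big]\, ds \\
& + \sqrt{\varepsilon} \int_0^t \sigma(X_s^{\varepsilon})\, dB_1(s) + \varepsilon \int_0^t \int_{\mathbb{X}} c(X_s^{\varepsilon}, z)\, \tilde{N}^{\frac{1}{\varepsilon} \varphi^{\varepsilon}}(ds, dz); \qquad X_0^{\varepsilon} = \chi,
\end{aligned}
$$
and
$$
\begin{aligned}
Y^{\varepsilon}(t) ={}& y + \frac{1}{\varepsilon} \int_0^t \Big[ f(X_s^{\varepsilon}, Y^{\varepsilon}(s)) + g(X_s^{\varepsilon}, Y^{\varepsilon}(s))\, \xi_2^{\varepsilon}(s) + \int_{\mathbb{X}} h(X_s^{\varepsilon}, Y^{\varepsilon}(s), z) \big( \varphi^{\varepsilon}(s,z) - 1 \big)\, \nu(dz) \Big]\, ds \\
& + \frac{1}{\sqrt{\varepsilon}} \int_0^t g(X_s^{\varepsilon}, Y^{\varepsilon}(s))\, dB_2(s) + \int_0^t \int_{\mathbb{X}} h(X_s^{\varepsilon}, Y^{\varepsilon}(s), z)\, \tilde{N}^{\frac{1}{\varepsilon} \varphi^{\varepsilon}}(ds, dz); \qquad Y^{\varepsilon}(0) = y,
\end{aligned}
$$
where for any $\varepsilon > 0$ the random measure $\tilde{N}^{\frac{1}{\varepsilon} \varphi^{\varepsilon}}$ is a controlled random measure that, under a change of probability measure, has the same law as $\tilde{N}^{\frac{1}{\varepsilon}}$ under the original probability measure. This will be rigorously stated in Section 3.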
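Under this setting, the main task in the derivation of the MDP is to prove that $Z^{\varepsilon} \Rightarrow \bar{Z}$, where $\bar{Z}$ solves (5) uniquely in $C([-\tau, T]; \mathbb{R}^d)$ for the control $(f, h) = (\xi, \psi) \in L^2([0,T]) \times L^2(\nu \otimes ds)$. In order to prove this convergence in law, we show that the family $(X^{\varepsilon})_{\varepsilon > 0}$ satisfies a tightened averaging principle, i.e., for every $\delta > 0$ the following holds:
$$
\limsup_{\varepsilon \to 0} \mathbb{P}\Big( \sup_{t \in [0,T]} |X^{\varepsilon}(t) - \bar{X}^{\varepsilon}(t)| > \delta\, d(\varepsilon) \Big) = 0,
\tag{8}
$$
where $(\bar{X}^{\varepsilon})_{\varepsilon > 0}$ is defined for every $\varepsilon > 0$ and $t \in [0,T]$ by
$$
\begin{aligned}
\bar{X}^{\varepsilon}(t) ={}& \chi(0) + \int_0^t \Big[ \bar{a}(\bar{X}_s^{\varepsilon}) + \sigma(\bar{X}_s^{\varepsilon})\, \xi_1^{\varepsilon}(s) + \int_{\mathbb{X}} c(\bar{X}_s^{\varepsilon}, z) \big( \varphi^{\varepsilon}(s,z) - 1 \big)\, \nu(dz) \Big]\, ds \\
& + \sqrt{\varepsilon} \int_0^t \sigma(\bar{X}_s^{\varepsilon})\, dB_1(s) + \varepsilon \int_0^t \int_{\mathbb{X}} c(\bar{X}_s^{\varepsilon}, z)\, \tilde{N}^{\frac{1}{\varepsilon} \varphi^{\varepsilon}}(ds, dz); \qquad \bar{X}_0^{\varepsilon} = \chi.
\end{aligned}
\tag{9}
$$
This will imply by Slutsky's theorem (Theorem 4.1 in [53]) that $(Z^{\varepsilon})_{\varepsilon > 0}$ has the same weak limit as $(\bar{Z}^{\varepsilon})_{\varepsilon > 0}$, where $\bar{Z}^{\varepsilon} := \frac{\bar{X}^{\varepsilon} - \bar{X}^{0}}{d(\varepsilon)}$, $\varepsilon > 0$. Therefore, we are led to the (easier) task of showing that $\bar{Z}^{\varepsilon} \Rightarrow \bar{Z}$ (since the dynamics of (9) is decoupled from the dynamics of the fast variable of the original stochastic system (1)).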
The proof that $\bar{Z}^{\varepsilon} \Rightarrow \bar{Z}$ as $\varepsilon \to 0$ relies on classical arguments of weak convergence. We use localization techniques in order to obtain good estimates for the second moment of the processes, in combination with Bernstein's inequality for càdlàg local martingales in the form of Theorem 3.3 of [54], which imply the tightness of the respective laws. Hence, the relative compactness of the laws follows and yields the desired conclusion, thanks to Skorokhod's representation theorem together with the well-posedness of the skeleton Equation (5).
The proof of the tightened controlled averaging principle (8) is inspired by the classical Khasminskii technique introduced in [15]. In a nutshell, the procedure relies on a discretization of the time interval $[0,T]$ and of the initial delay interval $[-\tau, 0]$ into a finite number of intervals of the same length $\Delta(\varepsilon) \to 0$ as $\varepsilon \to 0$, satisfying some growth conditions that interplay with the ergodic properties of the averaged dynamics via the construction of auxiliary processes $(\hat{X}^{\varepsilon})_{\varepsilon > 0}$ and $(\hat{Y}^{\varepsilon})_{\varepsilon > 0}$. The construction of the auxiliary processes is not a straightforward generalization of the Khasminskii type of discretization used to prove the usual strong averaging principle. In our setting, we need to build stable, non-trivial discretizations $(\hat{X}^{\varepsilon})_{\varepsilon > 0}$ and $(\hat{Y}^{\varepsilon})_{\varepsilon > 0}$ in order to deal with the nonlocal integral terms that appear in the equations of $(X^{\varepsilon})_{\varepsilon > 0}$ and $(Y^{\varepsilon})_{\varepsilon > 0}$. The proof of (8) builds heavily on the derivation of stable estimates for the deviations of the segment process $(\hat{X}_t^{\varepsilon})_{t \in [0,T]}$ from the slow variable's segment $(X_t^{\varepsilon})_{t \in [0,T]}$ and for the deviations of the approximation $(\hat{Y}^{\varepsilon}(t))_{t \in [0,T]}$ from the controlled fast variable process $(Y^{\varepsilon}(t))_{t \in [0,T]}$. We derive asymptotic bounds in $\varepsilon > 0$ for the second moment of the deviations of the fast variable from its discretization, whereas the deviations of the slow segment from its approximation are estimated differently. Due to the dependence on the segment process in the dynamics of $(X^{\varepsilon})_{\varepsilon > 0}$, it turns out to be better to control the probability of the slow component deviations in order to obtain (8). This is a technical but major distinction between the technique used to obtain the strong controlled averaging principle (8) and the usual techniques available in the literature.
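The classical device behind Khasminskii-type discretizations is to freeze a path on each cell of a uniform time grid of mesh $\Delta$. The sketch below illustrates only this elementary freezing step on sampled data; it is not the article's construction of $\hat{X}^{\varepsilon}$ and $\hat{Y}^{\varepsilon}$, which is described above as more involved. The function names and the choice of mesh are hypothetical.

```python
import math

def cell_index(s, delta):
    """Index k of the grid cell [k*delta, (k+1)*delta) that contains time s."""
    return math.floor(s / delta)

def freeze_on_cells(times, values, delta):
    """Piecewise-constant version of a sampled path.

    On each cell of length delta the path is frozen at the value of the
    first sample whose time lies in that cell.
    """
    frozen = []
    current_cell, current_value = None, None
    for s, v in zip(times, values):
        k = cell_index(s, delta)
        if k != current_cell:  # a new cell starts: refresh the frozen value
            current_cell, current_value = k, v
        frozen.append(current_value)
    return frozen

# Usage: a path sampled every 0.25 time units, frozen on cells of length 0.5.
sample_times = [0.0, 0.25, 0.5, 0.75, 1.0, 1.25]
sample_values = [10, 11, 12, 13, 14, 15]
frozen_values = freeze_on_cells(sample_times, sample_values, 0.5)
```

In an averaging argument the frozen slow path is fed into the fast equation on each cell, so that the fast variable has time to equilibrate while the slow input stays constant.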
Our main result shows in particular that $(X^{\varepsilon})_{\varepsilon > 0}$ obeys the same moderate deviation principle as $(\bar{X}^{\varepsilon})_{\varepsilon > 0}$, where we define the averaged process $\bar{X}^{\varepsilon}$ for every $\varepsilon > 0$ and $t \in [0,T]$ by
$$
\bar{X}^{\varepsilon}(t) = \chi(0) + \int_0^t \bar{a}(\bar{X}_s^{\varepsilon})\, ds + \sqrt{\varepsilon} \int_0^t \sigma(\bar{X}_s^{\varepsilon})\, dB_1(s) + \varepsilon \int_0^t \int_{\mathbb{X}} c(\bar{X}_s^{\varepsilon}, z)\, \tilde{N}^{\frac{1}{\varepsilon}}(ds, dz).
$$
One could first derive the moderate deviation principle for $(\bar{X}^{\varepsilon})_{\varepsilon > 0}$ and then show that the families $(X^{\varepsilon})_{\varepsilon > 0}$ and $(\bar{X}^{\varepsilon})_{\varepsilon > 0}$ are exponentially equivalent, i.e., for every $\delta > 0$ we have
$$
\lim_{\varepsilon \to 0} \frac{\varepsilon}{d^2(\varepsilon)} \ln \mathbb{P}\Big( \sup_{0 \leq t \leq T} \Big| \frac{X^{\varepsilon}(t) - \bar{X}^{\varepsilon}(t)}{d(\varepsilon)} \Big| > \delta \Big) = -\infty.
\tag{10}
$$
This would imply that $(X^{\varepsilon})_{\varepsilon > 0}$ obeys the same MDP as $(\bar{X}^{\varepsilon})_{\varepsilon > 0}$ as $\varepsilon \to 0$. However, verifying the exponential equivalence of those families is in general hard. The reasoning employed in this work illustrates the robustness of the weak convergence approach, which reduces the proof of the MDP to the verification of continuity and tightness properties of certain auxiliary processes associated with $(X^{\varepsilon})_{\varepsilon > 0}$. This reduction of complexity can be appreciated immediately by contrasting the convergence to $0$ required in the limit (8) with the exponential negligibility demanded by the limit (10).
  • Notation.
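The arrow $\Rightarrow$ means convergence in distribution. Throughout the article, we use when convenient the shorthand notation $A(\varepsilon) \lesssim_{\varepsilon} B(\varepsilon)$ to mean that there exist a constant $c > 0$ independent of $\varepsilon > 0$ and some $\varepsilon_0 > 0$ such that $A(\varepsilon) \leq c\, B(\varepsilon)$ for every $\varepsilon < \varepsilon_0$. We write $A(\varepsilon) \simeq_{\varepsilon} B(\varepsilon)$ as $\varepsilon \to 0$ to mean that $A(\varepsilon) \lesssim_{\varepsilon} B(\varepsilon)$ and $B(\varepsilon) \lesssim_{\varepsilon} A(\varepsilon)$ as $\varepsilon \to 0$.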
  • Outline of the paper.
In Section 2, we state in full detail the probabilistic framework and the hypotheses concerning the coefficients of (1) in order to state with full rigour the already announced MDP for the family $(Z^{\varepsilon})_{\varepsilon > 0}$. We finish that section with some examples. Section 3 contains the detailed proof of the main result following the strategy outlined above. For the reader's convenience, Appendix A contains technical auxiliary results that can be skipped on a first reading.

2. Preliminaries and Statement of the Main Theorem

2.1. The Probabilistic and Functional Setup—The Averaged Dynamics

2.1.1. The Probabilistic Setup and Notation

We follow extensively the probabilistic setup and the notation introduced by Budhiraja, Dupuis, Maroulas and collaborators in [43,44,46] and systematized in [45]. For any topological space $S$, we denote by $\mathcal{B}(S)$ its Borel $\sigma$-algebra. Fix $T > 0$, $n = d + k$ with $d, k \in \mathbb{N}$ and let $\mathbb{W} = C([0,T]; \mathbb{R}^n)$ be endowed with the topology of uniform convergence, which makes it a Polish space. Let $\mathbb{X} = \mathbb{R}^d \setminus \{0\}$ and let $\mathbb{M}$ be the space of locally finite measures defined on $(\mathbb{X}, \mathcal{B}(\mathbb{X}))$. We endow $\mathbb{M}$ with the weakest topology such that for every $f \in C_c(\mathbb{X})$ (the space of compactly supported continuous functions) the function $\nu \mapsto \langle \nu, f \rangle := \int_{\mathbb{X}} f(u)\, \nu(du)$, $\nu \in \mathbb{M}$, is continuous. This topology is known as the vague topology and can be metrized such that $\mathbb{M}$ is a Polish space. We refer the reader to [43].
Fix a measure $\nu \in \mathbb{M}$ and let $\nu_T = ds \otimes \nu$, where $ds$ is the Lebesgue measure on $[0,T]$. Consider the product space $\mathbb{V} = \mathbb{W} \times \mathbb{M}$ and denote by $\mathbb{P}$ the unique probability measure on $(\mathbb{V}, \mathcal{B}(\mathbb{V}))$ under which the first projection $B : \mathbb{V} \to \mathbb{W}$, $B(\beta, m) = \beta$, is a standard Brownian motion with values in $\mathbb{R}^n$ and $N : \mathbb{V} \to \mathbb{M}$, $N(\beta, m) := m$, is a Poisson random measure with intensity measure $\nu_T$. The corresponding expectation operator will be denoted by $\mathbb{E}$. We refer the reader to Theorem I.9.1 in [55].
Let $\mathbb{Y} := \mathbb{X} \times [0, \infty)$ and $\mathbb{Y}_T := [0,T] \times \mathbb{Y}$; write $\bar{\mathbb{M}}$ for the space of locally finite measures defined on $\mathbb{Y}_T$ equipped with its Borel $\sigma$-algebra, and $\bar{\mathbb{V}} := \mathbb{W} \times \bar{\mathbb{M}}$. In a slight abuse of notation and analogously to what was said for $\mathbb{M}$, the space $\bar{\mathbb{M}}$ is also a Polish space and there exists a unique probability measure $\bar{\mathbb{P}}$ defined on $(\bar{\mathbb{V}}, \mathcal{B}(\bar{\mathbb{V}}))$ under which the map $B : \bar{\mathbb{V}} \to \mathbb{W}$, $B(\beta, \bar{m}) := \beta$, is a standard Brownian motion with values in $\mathbb{R}^n$ and $\bar{N} : \bar{\mathbb{V}} \to \bar{\mathbb{M}}$, $\bar{N}(\beta, \bar{m}) := \bar{m}$, is a Poisson random measure with values on $\mathcal{B}(\mathbb{R}^d \times \mathbb{R}^d \setminus \{0\} \times [0, \infty))$ and intensity measure given by $ds \otimes \nu \otimes dr$, where $dr$ stands for the Lebesgue measure on $([0, \infty); \mathcal{B}([0, \infty)))$.
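For every $\varepsilon > 0$, we consider the Poisson random measure $N^{\frac{1}{\varepsilon}}$ defined on the probability space $(\mathbb{V}, \mathcal{B}(\mathbb{V}))$ with intensity measure given by $\frac{1}{\varepsilon}\, ds \otimes \nu \otimes dr$, and write $\tilde{N}^{\frac{1}{\varepsilon}}$ for its compensated counterpart. When necessary, we also regard $N^{\frac{1}{\varepsilon}}$ as a controlled random measure on $(\bar{\mathbb{V}}, \mathcal{B}(\bar{\mathbb{V}}))$ (and therefore $\mathcal{B}(\bar{\mathbb{V}})$-measurable) under $\bar{\mathbb{P}}$ via the identity
$$
N^{\frac{1}{\varepsilon}}((0,t] \times U) := \int_0^t \int_U \int_0^{\infty} \mathbf{1}_{[0, \frac{1}{\varepsilon}]}(r)\, \bar{N}(ds, dx, dr), \quad t \in [0,T],\ U \in \mathcal{B}(\mathbb{X}).
$$
We remark that the space $\mathbb{Y} := \mathbb{X} \times [0, \infty)$ takes into account the jumps and the frequencies of the underlying Poisson random measure $N$ and refer the reader to [43] for more details.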
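The identity defining $N^{\frac{1}{\varepsilon}}$ from $\bar{N}$ is a thinning: an atom $(s, x, r)$ of $\bar{N}$ is kept exactly when its third coordinate satisfies $r \leq 1/\varepsilon$. The sketch below evaluates that identity on a fixed, hand-written list of atoms; it does not sample a Poisson random measure, and the atom list, the set $U$ and the function name are hypothetical.

```python
def count_thinned(points, t, in_U, eps):
    """Evaluate N^{1/eps}((0, t] x U) on a realization of N-bar.

    points : iterable of atoms (s, x, r) of N-bar, with time s, jump x, level r
    in_U   : predicate deciding whether the jump x belongs to the set U
    An atom is counted when 0 < s <= t, x is in U and 0 <= r <= 1/eps.
    """
    level = 1.0 / eps
    return sum(
        1
        for (s, x, r) in points
        if 0.0 < s <= t and in_U(x) and 0.0 <= r <= level
    )

# Usage with four hypothetical atoms and U = (0, infinity).
atoms = [(0.1, 0.5, 0.3), (0.2, -2.0, 1.5), (0.4, 0.7, 3.0), (0.9, 0.2, 0.1)]
positive_jumps = lambda x: x > 0
count_eps_1 = count_thinned(atoms, 0.5, positive_jumps, 1.0)
count_eps_quarter = count_thinned(atoms, 0.5, positive_jumps, 0.25)
```

Decreasing $\varepsilon$ raises the level $1/\varepsilon$ and therefore keeps more atoms, which is how the single measure $\bar{N}$ encodes the accelerated jump intensity $\frac{1}{\varepsilon}\nu$.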
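For any $t \in [0,T]$, define
$$
\mathcal{F}_t := \sigma\{ \bar{N}((0,s] \times A);\ B(s) \mid 0 \leq s \leq t,\ A \in \mathcal{B}(\mathbb{Y}) \}
$$
and denote by $\bar{\mathbb{F}} := \{ \bar{\mathcal{F}}_t \}_{t \in [0,T]}$ the completion of $\mathbb{F} := \{ \mathcal{F}_t \}_{t \in [0,T]}$ under $\bar{\mathbb{P}}$. Let $\bar{\mathcal{P}}$ denote the predictable $\sigma$-field on $[0,T] \times \bar{\mathbb{V}}$ with the filtration $\bar{\mathbb{F}}$ on $(\bar{\mathbb{V}}, \mathcal{B}(\bar{\mathbb{V}}))$.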
We make the following assumption on ν M .
Hypothesis 1. 
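The measure $\nu \in \mathbb{M}$ is a Lévy measure on $(\mathbb{R}^d \setminus \{0\}, \mathcal{B}(\mathbb{R}^d \setminus \{0\}))$, i.e., such that $\int_{0 < |z| < 1} |z|^2\, \nu(dz) < \infty$, and satisfies
$$
\int_{|z| \geq 1} e^{\alpha |z|^2}\, \nu(dz) < \infty, \quad \text{for some } \alpha > 1.
\tag{12}
$$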
Remark 1. 
We remark that the assumption of Gaussian tails (12) is paradigmatic in the use of weak convergence arguments for the derivation of moderate deviation principles for jump processes. It is used in the pioneering work [46] and in the extensive follow-up works that exist in the literature. The assumption of exponential tails for laws that obey large deviation principles is a classical ansatz in the literature on large deviation principles. We cite as reference the Donsker–Varadhan theorem stated as Theorem 3.34 in the monograph [56]. The assumption of Gaussian tails (12) for $\nu$ is sufficient for the proof of Lemma A1 in Appendix A, a technical intermediate result that is fundamental in the derivation of the moderate deviation principle for $(X^{\varepsilon})_{\varepsilon > 0}$. This restriction still captures a rich class of Lévy measures $\nu$, allowing the occurrence of infinitely many small jumps, as exhibited in Section 2.3. We refer the reader to [57] for a discussion of the large deviation principle for symmetric stable processes that uses a very different approach from ours.
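To see that Hypothesis 1 is compatible with infinite total mass, one can check a concrete measure. The example below is a hypothetical illustration and is not claimed to be one of the examples of Section 2.3: a stable-like density near the origin tempered by a Gaussian factor, with $c_d$ denoting the surface area of the unit sphere in $\mathbb{R}^d$.

```latex
% Hypothetical example (not from the article): Gaussian-tempered stable-like measure.
\[
\nu(dz) = \frac{e^{-2|z|^{2}}}{|z|^{d+\gamma}}\,dz, \qquad \gamma \in (0,2).
\]
% Small jumps, in polar coordinates:
\[
\int_{0<|z|<1} |z|^{2}\,\nu(dz)
= c_d \int_0^1 e^{-2r^{2}} r^{1-\gamma}\,dr
\le \frac{c_d}{2-\gamma} < \infty .
\]
% Gaussian tails with \alpha = 3/2 > 1:
\[
\int_{|z|\ge 1} e^{\frac{3}{2}|z|^{2}}\,\nu(dz)
= c_d \int_1^{\infty} e^{-\frac{1}{2}r^{2}}\, r^{-1-\gamma}\,dr < \infty .
\]
% Infinite total mass, caused by the small jumps:
\[
\nu(\{0<|z|<1\})
= c_d \int_0^1 e^{-2r^{2}} r^{-1-\gamma}\,dr
\ge c_d\, e^{-2} \int_0^1 r^{-1-\gamma}\,dr = \infty .
\]
```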
  • The Space of the Delays and the Segment Function
Fix now $\tau > 0$. Given a path $x : [-\tau, T] \to \mathbb{R}^d$ and $t \geq 0$, we use the notation $x_t$ for the segment path defined as $x_t(\theta) := x(t + \theta)$, $\theta \in [-\tau, 0]$. Denote by $C([-\tau, T]; \mathbb{R}^d)$ the space of continuous paths equipped with the uniform norm. We write $\mathcal{C} := C([-\tau, 0]; \mathbb{R}^d)$. Let $D([-\tau, T]; \mathbb{R}^d)$ be the space of càdlàg functions equipped with the topology induced by the $J_1$-metric, known as the Skorokhod topology (cf. Chapter 3, p. 111 in [53]). We write $\mathcal{D} := D([-\tau, 0]; \mathbb{R}^d)$. The space $D([-\tau, T]; \mathbb{R}^d)$ is Polish under this metric. We refer the reader to Theorems 12.1 and 12.2 in [53] for more details. For any $x \in D([-\tau, T]; \mathbb{R}^d)$, we write $\|x_t\| := \sup_{-\tau \leq s \leq t} |x(s)|$, $t \geq 0$.
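On a uniform time grid the segment notation becomes a sliding window over the stored path. The sketch below shows this, together with an explicit Euler scheme for a deterministic delay equation of the form $\frac{d}{dt}x(t) = \bar{a}(x_t)$. The grid convention, the function names and the toy drift $\bar{a}(\zeta) = -\zeta(-\tau)$ are assumptions made for illustration only.

```python
def segment(path, i, lag_steps):
    """Discrete segment (x(t_i + theta)) for theta in [-tau, 0].

    path[j] stores x(-tau + j*dt), so the window ends at index i and
    starts lag_steps = tau/dt samples earlier; requires i >= lag_steps.
    """
    return path[i - lag_steps : i + 1]

def euler_delay(abar, chi, dt, n_steps):
    """Explicit Euler scheme for x'(t) = abar(x_t) with initial segment chi.

    chi lists the initial segment on the grid of [-tau, 0]; the returned
    list extends it by n_steps further samples.
    """
    lag_steps = len(chi) - 1
    path = list(chi)
    for _ in range(n_steps):
        i = len(path) - 1
        seg = segment(path, i, lag_steps)
        path.append(path[-1] + dt * abar(seg))
    return path

# Toy usage: tau = 1, dt = 0.01, chi = 1 on [-1, 0], abar(zeta) = -zeta(-tau).
dt = 0.01
chi = [1.0] * 101
delayed_drift = lambda seg: -seg[0]  # seg[0] is the value at theta = -tau
path = euler_delay(delayed_drift, chi, dt, 100)
```

With the constant initial segment $\chi \equiv 1$ the exact solution is $x(t) = 1 - t$ on $[0, \tau]$, so the last computed value approximates $x(1) = 0$.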

2.1.2. The Multiscale System

For every $T > 0$, $\tau > 0$ and $\varepsilon > 0$, we consider the following system of stochastic differential equations,
$$
\begin{aligned}
X^{\varepsilon}(t) &= X^{\varepsilon}(0) + \int_0^t a(X_s^{\varepsilon}, Y^{\varepsilon}(s))\, ds + \sqrt{\varepsilon} \int_0^t \sigma(X_s^{\varepsilon})\, dB_1(s) + \varepsilon \int_0^t \int_{\mathbb{X}} c(X_s^{\varepsilon}, z)\, \tilde{N}^{\frac{1}{\varepsilon}}(ds, dz);\\
Y^{\varepsilon}(t) &= y + \frac{1}{\varepsilon} \int_0^t f(X_s^{\varepsilon}, Y^{\varepsilon}(s))\, ds + \frac{1}{\sqrt{\varepsilon}} \int_0^t g(X_s^{\varepsilon}, Y^{\varepsilon}(s))\, dB_2(s) + \int_0^t \int_{\mathbb{X}} h(X_s^{\varepsilon}, Y^{\varepsilon}(s), z)\, \tilde{N}^{\frac{1}{\varepsilon}}(ds, dz), \quad t \in [0,T];
\end{aligned}
\tag{13}
$$
subject to the initial data
$$
X_0^{\varepsilon} = \chi \in \mathcal{C}, \qquad Y^{\varepsilon}(0) = y \in \mathbb{R}^k,
\tag{14}
$$
where we write $(B(t))_{t \in [0,T]} = (B_1(t), B_2(t))_{t \in [0,T]}$ with $(B_1(t))_{t \in [0,T]}$ and $(B_2(t))_{t \in [0,T]}$ two independent standard Brownian motions with values in $\mathbb{R}^d$ and $\mathbb{R}^k$, respectively. We stress that the slow and fast components of the multi-scale system (13) are affected by different Brownian signals of small intensity $\varepsilon$ and by the same jump noise signal, also of small intensity $\varepsilon > 0$ but accelerated in inverse proportion. While the process $(B_1, B_2)$ is also a BM in the space $\mathbb{R}^{d+k}$ due to the independence of its components, the same does not hold for Poisson random measures in the respective product space of measures. For this reason, it is not clear how to use the weak convergence approach developed in [46], which builds on the variational formula for functionals of Poisson random measures established in [43]. In order to guarantee the existence and uniqueness of the solution of (13), we assume that its coefficients are deterministic measurable functions $a : \mathcal{D} \times \mathbb{R}^n \to \mathbb{R}^d$, $\sigma : \mathcal{D} \to \mathbb{R}^{d \times d}$, $c : \mathcal{D} \times \mathbb{X} \to \mathbb{R}^d$, $f : \mathcal{D} \times \mathbb{R}^n \to \mathbb{R}^{n \times n}$, $g : \mathcal{D} \times \mathbb{R}^k \to \mathbb{R}^{n \times n}$ and $h : \mathcal{D} \times \mathbb{R}^n \times \mathbb{X} \to \mathbb{R}^n$ satisfying the following.
Hypothesis 2. 
1. 
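There exists $L > 0$ such that for every $\varphi, \tilde{\varphi} \in \mathcal{D}$ and $y, \tilde{y} \in \mathbb{R}^n$ the following holds:
$$
\begin{aligned}
|a(\varphi, y) - a(\tilde{\varphi}, \tilde{y})| &\leq L \Big( \sup_{t \in [-\tau, 0]} |\varphi(t) - \tilde{\varphi}(t)| + |y - \tilde{y}| \Big),\\
|\sigma(\varphi) - \sigma(\tilde{\varphi})| &\leq L \sup_{t \in [-\tau, 0]} |\varphi(t) - \tilde{\varphi}(t)|,\\
\int_{\mathbb{X}} |c(\varphi, z) - c(\tilde{\varphi}, z)|\, \nu(dz) &\leq L \sup_{t \in [-\tau, 0]} |\varphi(t) - \tilde{\varphi}(t)|,\\
|f(\varphi, y) - f(\tilde{\varphi}, \tilde{y})| &\leq L \Big( \sup_{t \in [-\tau, 0]} |\varphi(t) - \tilde{\varphi}(t)| + |y - \tilde{y}| \Big),\\
|g(\varphi, y) - g(\tilde{\varphi}, \tilde{y})| &\leq L \Big( \sup_{t \in [-\tau, 0]} |\varphi(t) - \tilde{\varphi}(t)| + |y - \tilde{y}| \Big),\\
\int_{\mathbb{X}} |h(\varphi, y, z) - h(\tilde{\varphi}, \tilde{y}, z)|\, \nu(dz) &\leq L \Big( \sup_{t \in [-\tau, 0]} |\varphi(t) - \tilde{\varphi}(t)| + |y - \tilde{y}| \Big).
\end{aligned}
$$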
2. 
The functions $c(0, z)$ and $h(0, 0, z)$ are in $L^1(\nu)$.
Remark 2. 
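Hypothesis 2 implies that the coefficients have at most linear growth; i.e., there exists $L_1 > 0$ such that, for any $\varphi \in \mathcal{D}$ and $y \in \mathbb{R}^n$,
$$
\begin{aligned}
|a(\varphi, y)| &\leq L_1 \Big( 1 + \sup_{t \in [-\tau, 0]} |\varphi(t)| + |y| \Big),\\
|\sigma(\varphi)| &\leq L_1 \Big( 1 + \sup_{t \in [-\tau, 0]} |\varphi(t)| \Big),\\
\int_{\mathbb{X}} |c(\varphi, z)|\, \nu(dz) &\leq L_1 \Big( 1 + \sup_{t \in [-\tau, 0]} |\varphi(t)| \Big),\\
|f(\varphi, y)| &\leq L_1 \Big( 1 + \sup_{t \in [-\tau, 0]} |\varphi(t)| + |y| \Big),\\
|g(\varphi, y)| &\leq L_1 \Big( 1 + \sup_{t \in [-\tau, 0]} |\varphi(t)| + |y| \Big),\\
\int_{\mathbb{X}} |h(\varphi, y, z)|\, \nu(dz) &\leq L_1 \Big( 1 + \sup_{t \in [-\tau, 0]} |\varphi(t)| + |y| \Big).
\end{aligned}
$$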
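The growth bounds of Remark 2 follow from the Lipschitz conditions by comparing with the zero argument. The derivation below spells this out for $a$ and for $c$; the constants $L_a$ and $L_c$ are names introduced here for illustration, and the remaining coefficients are handled in the same way.

```latex
% Growth of a: triangle inequality, then the Lipschitz bound with (\tilde\varphi, \tilde y) = (0, 0).
\[
|a(\varphi,y)| \le |a(0,0)| + |a(\varphi,y) - a(0,0)|
\le |a(0,0)| + L\Bigl(\sup_{t\in[-\tau,0]}|\varphi(t)| + |y|\Bigr)
\le L_a\Bigl(1 + \sup_{t\in[-\tau,0]}|\varphi(t)| + |y|\Bigr),
\]
\[
L_a := \max\{|a(0,0)|,\, L\}.
\]
% Growth of c: the same argument in L^1(\nu), where item 2 of Hypothesis 2
% guarantees that the integral of |c(0,z)| is finite.
\[
\int_{\mathbb{X}} |c(\varphi,z)|\,\nu(dz)
\le \int_{\mathbb{X}} |c(0,z)|\,\nu(dz) + L \sup_{t\in[-\tau,0]}|\varphi(t)|
\le L_c\Bigl(1 + \sup_{t\in[-\tau,0]}|\varphi(t)|\Bigr),
\]
\[
L_c := \max\Bigl\{\int_{\mathbb{X}} |c(0,z)|\,\nu(dz),\, L\Bigr\}.
\]
% Taking L_1 as the maximum of the constants obtained for the six coefficients gives the stated bounds.
```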
The following assumption on the initial delay segment $\chi$ given in (14) is of great importance in establishing the stable estimates from which we derive (8).
Hypothesis 3. 
The function $\chi \in \mathcal{C}$ is Lipschitz continuous with Lipschitz constant $\lambda > 0$, i.e.,
$$
|\chi(\theta_1) - \chi(\theta_2)| \leq \lambda |\theta_1 - \theta_2|, \quad \text{for every } \theta_1, \theta_2 \in [-\tau, 0].
$$
Definition 1. 
Given $T > 0$, $\tau > 0$, $\varepsilon > 0$, $\chi \in \mathcal{C}$ and $y \in \mathbb{R}^k$, we consider the stochastic basis $(\bar{\mathbb{V}}, \mathcal{B}(\bar{\mathbb{V}}), \bar{\mathbb{F}}, \bar{\mathbb{P}})$. A strong solution of (13) with initial datum (14) is a stochastic process $(X^{\varepsilon}, Y^{\varepsilon}) := \{ (X^{\varepsilon}(t), Y^{\varepsilon}(t)) \}_{t \in [-\tau, T]}$ such that $X_0^{\varepsilon} = \chi$, $Y^{\varepsilon}(0) = y$, $X^{\varepsilon}(t)$ is $\mathcal{F}_0$-measurable for any $t \in [-\tau, 0]$, and $(X^{\varepsilon}(t), Y^{\varepsilon}(t))_{t \in [0,T]}$ is $\bar{\mathbb{F}}$-adapted and solves (13) $\bar{\mathbb{P}}$-a.s.
We write $\mathcal{F}_t = \mathcal{F}_0$ for any $t \in [-\tau, 0]$. For any $t \in [0,T]$ and $\varepsilon > 0$, the random variables $X^{\varepsilon}(t) \in \mathbb{R}^d$ and $Y^{\varepsilon}(t) \in \mathbb{R}^k$ are called slow and fast variables, respectively, under the scale separation by the parameter $\varepsilon > 0$ in the vanishing limit $\varepsilon \to 0$. We underline that the stochastic differential equation for the slow variable $X^{\varepsilon}$ lifts the problem to an infinite-dimensional setting due to the dependence of the coefficients on the segment path process.
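Given $T, \tau > 0$, $m \in \mathbb{N}$ and $\bar{\mathbb{F}} := \{ \bar{\mathcal{F}}_t \}_{t \in [0,T]}$, we define the space
$$
\mathcal{S}^2_{\bar{\mathbb{F}}}([-\tau, T]; \mathbb{R}^k) := \Big\{ \varphi : \Omega \times [-\tau, T] \to \mathbb{R}^k \ \Big|\ \varphi \text{ is } \bar{\mathbb{F}}\text{-adapted with càdlàg paths such that } \mathbb{E}\Big[ \sup_{-\tau \leq u \leq T} |\varphi(u)|^2 \Big] < \infty \Big\}.
$$
The existence and uniqueness of the solution process $(X^{\varepsilon}(t), Y^{\varepsilon}(t))_{t \in [-\tau, T]} \in \mathcal{S}^2_{\bar{\mathbb{F}}}([-\tau, T]; \mathbb{R}^d) \times \mathcal{S}^2_{\bar{\mathbb{F}}}([-\tau, T]; \mathbb{R}^n)$ of (13) with initial data (14) follows from Lemma V.2 and Theorem V.7 of [58], using the convention that $Y^{\varepsilon}(t) = y$ for all $t \in [-\tau, 0]$. This is the content of the following result.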
Theorem 1. 
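Fix $T, \tau, \varepsilon > 0$ and $y \in \mathbb{R}^k$. Let us assume that Hypotheses 1, 2 and 3 hold for some $\nu \in \mathbb{M}$ and $\chi \in \mathcal{C}$. Then, there exists a stochastic process
$$
(X^{\varepsilon}(t), Y^{\varepsilon}(t))_{t \in [-\tau, T]} \in \mathcal{S}^2_{\bar{\mathbb{F}}}([-\tau, T]; \mathbb{R}^d) \times \mathcal{S}^2_{\bar{\mathbb{F}}}([-\tau, T]; \mathbb{R}^n)
$$
that solves (13) uniquely in the sense of Definition 1.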

2.1.3. The Averaged Dynamics

We make the further dissipativity and boundedness assumptions on the coefficients of (13) that yield the existence and uniqueness of solution for the averaged dynamics given by (4) and some stable a priori estimates that will be crucial in the derivation of the result announced in the Introduction.
Hypothesis 4. 
1. 
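The function $a$ satisfies $a(0, y) = 0$ for any $y \in \mathbb{R}^k$ and there exists $\Lambda > 0$ such that
$$
|g(\zeta, y)| \leq \Lambda, \qquad |h(\zeta, y, z)| \leq \Lambda |z|, \quad \text{for every } \zeta \in \mathcal{D},\ y \in \mathbb{R}^k,\ z \in \mathbb{X}.
$$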
2. 
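There exist constants $\beta_1, \beta_2 > 0$ such that, for any $\zeta, \zeta_1 \in \mathcal{D}$ and $y, \tilde{y} \in \mathbb{R}^k$, one has
$$
2 \langle y, f(\zeta, y) \rangle + |g(\zeta, y)|^2 + \int_{\mathbb{X}} |h(\zeta, y, z)|^2\, \nu(dz) \leq -\beta_1 |y|^2 + \beta_2 \|\zeta\|^2;
$$
$$
2 \langle y - \tilde{y}, f(\zeta, y) - f(\zeta, \tilde{y}) \rangle + |g(\zeta, y) - g(\zeta, \tilde{y})|^2 + \int_{\mathbb{X}} |h(\zeta, y, z) - h(\zeta, \tilde{y}, z)|^2\, \nu(dz) \leq -\beta_1 |y - \tilde{y}|^2 + \beta_2 \|\zeta\|^2
$$
and
$$
2 \langle y - \tilde{y}, f(\zeta, y) - f(\zeta_1, \tilde{y}) \rangle \leq -\beta_1 |y - \tilde{y}|^2 + \beta_2 \|\zeta - \zeta_1\|^2.
$$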
Remark 3. 
We do not consider a more general framework than Hypotheses 1–4 to derive the moderate deviation principle for the family of slow variables $(X^\varepsilon)_{\varepsilon > 0}$ from (13), although it would be possible to derive the same result for locally Lipschitz coefficients under the usual weaker local versions of the dissipativity conditions stated in Hypothesis 4. The reason is that the weak convergence approach bypasses the usual verification of exponential tightness, replacing it by the verification of tightness for controlled modifications of the processes $X^\varepsilon$, for which the usual probabilistic localization techniques work well. Attaining such a degree of generality at the expense of a more technical text is beyond the scope of our work.
The following a priori estimates are straightforward and we omit their proofs.
Proposition 1. 
Fix $T, \tau > 0$ and $y \in \mathbb{R}^k$. Let Hypotheses 1–4 hold for some $\nu \in \mathcal{M}$ and $\chi \in \mathcal{C}$. There exists a constant $C_1 > 0$ independent of $\varepsilon > 0$ such that for all $0 < \varepsilon < 1$ we have
$$\bar{\mathbb{E}} \sup_{-\tau \le t \le T} |X^\varepsilon(t)|^2 + \sup_{0 \le t \le T} \bar{\mathbb{E}} |Y^\varepsilon(t)|^2 \le C_1.$$
We consider the equation for the fast variable of (13) whenever the slow component is frozen and given by $\zeta \in \mathcal{D}$ in the regime $\varepsilon = 1$, i.e., fix $y \in \mathbb{R}^k$; for every $t \ge 0$, let
Y ζ , y ( t ) = y + 0 t f ( ζ , Y ζ , y ( s ) ) d s + 0 t g ( ζ , Y ζ , y ( s ) ) d B 2 ( s ) + 0 t X h ( ζ , Y ζ , y ( s ) , z ) N ˜ 1 1 ( d s , d z ) .
We assume that Hypotheses 1–4 hold. We follow [19,20] closely in the argumentation below.
With $\zeta \in \mathcal{D}$ fixed, we define the transition semigroup on the space $\mathcal{B}_b(\mathbb{R}^k)$ of bounded measurable functions associated with the jump diffusion defined by the strong solution of (23) by
$$P_t^\zeta f(y) := \bar{\mathbb{E}}[f(Y^{\zeta, y}(t))], \quad t \ge 0,\ y \in \mathbb{R}^k.$$
In what follows, we discuss the existence and uniqueness of an invariant measure for the family of linear operators $(P_t^\zeta)_{t \ge 0}$, i.e., a probability measure $\mu^\zeta \in \mathcal{P}(\mathbb{R}^k, \mathcal{B}(\mathbb{R}^k))$ such that
$$\int_{\mathbb{R}^k} P_t^\zeta f(y) \, \mu^\zeta(dy) = \int_{\mathbb{R}^k} f(y) \, \mu^\zeta(dy), \quad t \ge 0,\ f \in \mathcal{B}_b(\mathbb{R}^k).$$
The dissipativity assumption given in (20) yields some C > 0 such that, for any T 0 0 , the following bound holds:
sup T T 0 E ¯ [ | Y ζ , y ( T ) | 2 ] C e 2 β 1 T ( 1 + | | ζ | | 2 + | y | 2 ) .
The estimate (26) implies that the family of the laws of the process { L ( Y ζ , y ( T ) ) } T T 0 is tight in P ( R k ; B ( R k ) ) when T 0 . Prokhorov’s theorem implies the existence of a weak limit μ ζ as T 0 and an indirect use of the Krylov–Bogoliubov theorem (Theorem 7.1 in [59]) asserts that μ ζ is an invariant measure of ( P t ζ ) t 0 , in the sense of (25). The setting of assumptions made in Hypotheses 1–4 implies that the semigroup ( P t ζ ) t 0 is irreducible. We refer the reader to Proposition 2.4 in [60]. Proposition 7.5 in [59] implies that μ ζ is the unique invariant measure. Due to the estimate (26) and the definition of μ ζ in (25), a simple application of monotone convergence shows, as in Lemma 3.4 in [20], that there exists C > 0 such that
R k | y | 2 μ ζ ( d y ) C ( 1 + | | ζ | | 2 + | y | 2 ) .
For any $\zeta \in \mathcal{D}$, we can define the averaged mixing coefficient
$$\bar{a}(\zeta) := \int_{\mathbb{R}^k} a(\zeta, y) \, \mu^\zeta(dy).$$
The proof of the following result concerning the Lipschitz continuity of $\bar{a}$ is straightforward. It follows in the same way as inequality (3.4) in [61].
Proposition 2. 
Fix $T, \tau > 0$ and $y \in \mathbb{R}^k$. Let Hypotheses 1–4 hold for some $\nu \in \mathcal{M}$ and $\chi \in \mathcal{C}$. Then, the function $\bar{a}$ defined by (28) is Lipschitz continuous.
Proposition 2 ensures that the averaged differential equation with initial delay data $\chi \in \mathcal{C}$,
$$\frac{d}{dt} \bar{X}^{0, \chi}(t) = \bar{a}(\bar{X}_t^{0, \chi}), \quad \bar{X}_0^{0, \chi} = \chi,$$
has a unique solution $\bar{X}^{0, \chi} \in C([-\tau, T]; \mathbb{R}^d)$.
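To make the role of the segment $\bar{X}_t^{0,\chi}$ concrete, the following minimal sketch integrates a scalar delay equation of the form above with an explicit Euler scheme. The function names, the grid resolution and the example coefficient `a_bar_example` are hypothetical choices made for illustration only; they are not taken from the article.

```python
import numpy as np

def solve_averaged_delay_ode(a_bar, chi, tau, T, n_steps_per_tau=200):
    """Explicit Euler scheme for x'(t) = a_bar(x_t), x_0 = chi, where
    x_t(s) = x(t + s), s in [-tau, 0], denotes the segment of the path."""
    m = n_steps_per_tau                  # grid intervals per delay window
    h = tau / m                          # step size
    n = int(round(T / h))                # number of Euler steps on [0, n*h]
    grid = np.linspace(-tau, n * h, m + n + 1)
    x = np.empty(m + n + 1)
    x[: m + 1] = chi(grid[: m + 1])      # initial segment on [-tau, 0]
    for i in range(m, m + n):
        segment = x[i - m : i + 1]       # samples of the segment x_t
        x[i + 1] = x[i] + h * a_bar(segment)
    return grid, x

# Hypothetical Lipschitz coefficient: a_bar(zeta) = -zeta(0) + 0.5 * zeta(-tau).
def a_bar_example(segment):
    return -segment[-1] + 0.5 * segment[0]

grid, x = solve_averaged_delay_ode(
    a_bar_example, lambda s: np.ones_like(s), tau=1.0, T=5.0
)
print(grid[-1], x[-1])
```

The coefficient only ever sees the last `m + 1` samples of the path, which is the discrete counterpart of the dependence on the segment process.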
The following proposition, which reads as a strong mixing property of the averaged coefficient a ¯ given by (28), plays a crucial role in the establishment of the moderate deviation principle for the family ( X ε ) ε > 0 since it is a fundamental ingredient in the proof of the controlled averaging principle (8). The derivation of this ergodic property follows Lemma 5.2 of [25].
Proposition 3. 
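Fix $T, \tau > 0$ and $y \in \mathbb{R}^k$. Let Hypotheses 1–4 hold for some $\nu \in \mathcal{M}$ and $\chi \in \mathcal{C}$. Then, there exists some function $\alpha : [0, \infty) \to [0, \infty)$ such that $\alpha(T) \to 0$ as $T \to \infty$ and satisfying for any $t \in [0, T]$
$$\bar{\mathbb{E}} \left| \frac{1}{T} \int_t^{t+T} a(\zeta, Y^{\zeta, y}(s)) \, ds - \bar{a}(\zeta) \right|^2 \le \alpha(T) \left( 1 + \|\zeta\|^2 + |y|^2 \right),$$
where the averaged coefficient $\bar{a}$ is defined by (28).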
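As an illustration of this mixing estimate, consider a toy case that is an assumption of this note and not a setting treated in the article: a one-dimensional frozen fast equation without jumps, $dY(t) = -\tfrac{1}{2} Y(t)\,dt + dB_2(t)$ with $Y(0) = y$, and the coefficient $a(\zeta, y) = y$, for which $\bar{a} \equiv 0$. In this case a rate $\alpha(T)$ can be computed by hand.

```latex
% Toy case (assumed): dY = -(1/2) Y dt + dB_2, Y(0) = y, a(\zeta, y) = y, \bar a = 0.
\begin{align*}
\bar{\mathbb{E}}\, Y(s) &= y\, e^{-s/2}, \qquad
\operatorname{Cov}(Y(s), Y(u)) = e^{-|s-u|/2} - e^{-(s+u)/2} \le e^{-|s-u|/2}, \\
\bar{\mathbb{E}} \left| \frac{1}{T} \int_t^{t+T} Y(s)\, ds \right|^2
 &= \frac{1}{T^2} \left( \int_t^{t+T} y\, e^{-s/2}\, ds \right)^2
  + \frac{1}{T^2} \int_t^{t+T}\!\!\int_t^{t+T} \operatorname{Cov}(Y(s), Y(u))\, du\, ds \\
 &\le \frac{4 |y|^2}{T^2} + \frac{1}{T^2} \cdot T \int_{\mathbb{R}} e^{-|r|/2}\, dr
  = \frac{4 |y|^2}{T^2} + \frac{4}{T}
  \le \frac{4}{T} \left( 1 + |y|^2 \right) \quad \text{for } T \ge 1 .
\end{align*}
```

In this toy case one may therefore take $\alpha(T) = 4/T$ for $T \ge 1$, which shows the typical $1/T$ decay produced by exponentially decaying correlations.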

2.2. The Main Theorem

We make a further assumption on the averaged coefficient $\bar{a}$ defined by (28).
Hypothesis 5. 
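The function $\bar{a} : \mathcal{D} \to \mathbb{R}^d$ is Fréchet differentiable and its Fréchet derivative is a Lipschitz function, i.e., there exists some constant $L_2 > 0$ such that
$$|D\bar{a}(\zeta) - D\bar{a}(\bar{\zeta})| \le L_2 \sup_{-\tau \le t \le 0} |\zeta(t) - \bar{\zeta}(t)|, \quad \zeta, \bar{\zeta} \in \mathcal{D}.$$
We define $L^2(\nu_T) := \left\{ g : [0, T] \times \mathbb{X} \to [0, \infty) \;\middle|\; \int_0^T \int_{\mathbb{X}} |g(s, z)|^2 \nu(dz) \, ds < \infty \right\}$.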
The main result of this work is the content of the next theorem and the reader can find its proof in the next section.
Theorem 2. 
Fix $T, \tau > 0$ and $y \in \mathbb{R}^k$. Let Hypotheses 1–5 hold for some $\nu \in \mathcal{M}$ and $\chi \in \mathcal{C}$. Let
$$\mathcal{G}^0 : L^2([0, T]; \mathbb{R}^d) \times L^2(\nu_T) \to C([-\tau, T]; \mathbb{R}^d)$$
be such that
$$\mathcal{G}^0(f, g) = \eta,$$
where for every $(f, g) \in L^2([0, T]; \mathbb{R}^d) \times L^2(\nu_T)$ the function $\eta \in C([-\tau, T]; \mathbb{R}^d)$ solves uniquely the skeleton equation
$$\eta(t) = \int_0^t D\bar{a}(\bar{X}_s^{0, \chi}) \eta_s \, ds + \int_0^t \sigma(\bar{X}_s^{0, \chi}) f(s) \, ds + \int_0^t \int_{\mathbb{X}} c(\bar{X}_s^{0, \chi}, z) g(s, z) \, \nu(dz) \, ds, \quad t \in [0, T], \qquad \eta_0 = 0,$$
and the function $\bar{X}^{0, \chi} \in C([-\tau, T]; \mathbb{R}^d)$ is the unique solution of (29).
For any $\eta \in C([-\tau, T]; \mathbb{R}^d)$ we denote
$$\mathcal{G}_\eta^0 := \left\{ (f, g) \in L^2([0, T]; \mathbb{R}^d) \times L^2(\nu_T) \;\middle|\; \mathcal{G}^0(f, g) = \eta \right\}.$$
For any $\varepsilon > 0$, let $d(\varepsilon) = \varepsilon^{\frac{1 - \theta}{2}}$, for some $\theta \in \left( \frac{1}{2}, 1 \right)$.
For every $\varepsilon > 0$, let $(X^{\varepsilon, \chi, y}(t), Y^{\varepsilon, \chi, y}(t))_{t \in [-\tau, T]}$ be the unique strong solution of (13) with initial condition given by (14) and
$$Z^{\varepsilon, \chi, y} := \frac{X^{\varepsilon, \chi, y} - \bar{X}^{0, \chi}}{d(\varepsilon)}.$$
Then, the family $(Z^{\varepsilon, \chi, y})_{\varepsilon > 0}$ defined by (33) satisfies a large deviation principle with speed $b(\varepsilon) = \varepsilon^\theta \to 0$ as $\varepsilon \to 0$ and the good rate function
$$I(\eta) = \inf_{(f, g) \in \mathcal{G}_\eta^0} \left\{ \frac{1}{2} \int_0^T |f(s)|^2 \, ds + \int_0^T \int_{\mathbb{X}} |g(s, z)|^2 \, \nu(dz) \, ds \right\},$$
with the convention that $\inf \emptyset = \infty$.
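To see what the rate function looks like in the simplest situation, consider the following toy case, which is an assumption of this note and is not claimed to satisfy all hypotheses of the theorem: $d = 1$, no delay dependence, $\bar{a}(x) = -x$ (so that $D\bar{a} \equiv -1$), $\sigma \equiv 1$ and $c \equiv 0$. The skeleton equation then determines the control $f$ from $\eta$, and the infimum is attained at $g = 0$.

```latex
% Toy case (assumed): D\bar a \equiv -1, \sigma \equiv 1, c \equiv 0, d = 1, no delay.
\begin{align*}
\eta(t) &= -\int_0^t \eta(s)\, ds + \int_0^t f(s)\, ds
 \quad \Longleftrightarrow \quad
 f(s) = \dot{\eta}(s) + \eta(s) \ \text{ for a.e. } s \in [0, T], \ \eta(0) = 0, \\
I(\eta) &=
 \begin{cases}
 \dfrac{1}{2} \displaystyle\int_0^T \left| \dot{\eta}(s) + \eta(s) \right|^2 ds,
 & \eta \text{ absolutely continuous},\ \dot{\eta} \in L^2([0, T]),\ \eta(0) = 0, \\[2ex]
 \infty, & \text{otherwise}.
 \end{cases}
\end{align*}
```

This is the familiar Freidlin–Wentzell form of the rate function for the linearization of the averaged dynamics around $\bar{X}^{0,\chi}$; with a nonvanishing jump coefficient $c$, the infimum additionally trades off the two controls $f$ and $g$.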

2.3. Examples

  • Strongly tempered exponentially light Lévy measures.
Hypothesis 1 covers a wide class of Lévy measures and we point out the following special benchmark cases.
1.
Our setting covers the simplest case of finite intensity super-exponentially light jump measures given by $\nu(dz) = e^{-\alpha |z|^2} \, dz$ for some $\alpha > 1$. For every $\varepsilon > 0$, the corresponding stochastic process $L_t^\varepsilon := \int_0^t \int_{\mathbb{X}} z \, \tilde{N}^{\frac{1}{\varepsilon}}(ds, dz)$, $t \ge 0$, is a compensated compound Poisson process.
2.
More generally, Hypothesis 1 covers a class of Lévy measures that mimics the class of strongly tempered exponentially light measures introduced by Rosiński in [62], however, with a Gaussian damping in order to satisfy (12). For the polar coordinate $r = |z|$ and any $A \in \mathcal{B}(\mathbb{X})$, we define
$$\nu(A) = \int_{\mathbb{R}^d \setminus \{0\}} \int_0^\infty \mathbf{1}_A(r z) \frac{e^{-r^2}}{r^{\alpha + 1}} \, dr \, R(dz), \quad \alpha \in (0, 2),$$
for some measure $R \in \mathcal{M}$ such that $\int_{\mathbb{R}^d \setminus \{0\}} |z|^\alpha R(dz) < \infty$. We point out that, for every $\varepsilon > 0$, the corresponding Lévy process $(L_t^\varepsilon)_{t \ge 0}$ differs from the compound Poisson process of the previous item in two respects. First, the corresponding jump measure has infinite total mass. Second, although a compound Poisson process with positive jumps has almost surely nondecreasing paths, it does not have paths that are almost surely strictly increasing. Such measures and their corresponding processes were introduced in [63] for the study of dynamical features of stochastic equations perturbed by jump accelerated noises obeying the large deviations regime.
  • Invariant measures for the Markov semigroup associated with the fast variable.
1.
For every $\varepsilon > 0$ and $t \in [0, T]$, let us consider the multiscale system
$$\begin{aligned} dX^\varepsilon(t) &= a(X_t^\varepsilon, Y^\varepsilon(t)) \, dt + \sqrt{\varepsilon} \, \sigma(X_t^\varepsilon) \, dB_1(t), & X_0^\varepsilon &= \zeta \in \mathcal{C}, \\ dY^\varepsilon(t) &= -\frac{1}{2 \varepsilon} Y^\varepsilon(t) \, dt + \frac{1}{\sqrt{\varepsilon}} \, dB_2(t), & Y^\varepsilon(0) &= y \in \mathbb{R}, \end{aligned}$$
where $B_1$ and $B_2$ are two independent standard Brownian motions with values in $\mathbb{R}$. We assume that the coefficients $a$ and $\sigma$ satisfy Hypotheses 2 and 4. For any $\chi \in \mathcal{C}$ satisfying Hypothesis 3, the invariant measure of the fast variable (decoupled from the slow variable in this case)
$$dY(t) = -\frac{1}{2} Y(t) \, dt + dB_2(t), \quad t \ge 0,$$
is given by $\mu(dy) = \frac{1}{\sqrt{2 \pi}} e^{-\frac{y^2}{2}} \, dy$. Hence, the averaged coefficient $\bar{a}$ is given for any $\zeta \in \mathcal{D}$ by
$$\bar{a}(\zeta) = \frac{1}{\sqrt{2 \pi}} \int_{\mathbb{R}} a(\zeta, y) e^{-\frac{y^2}{2}} \, dy.$$
The function $\bar{a}$ satisfies Hypothesis 5 if $a$ is $C^1$-Fréchet differentiable with respect to the first variable $\zeta$.
2.
Fix $T, \tau > 0$ and $y \in \mathbb{R}^k$. Let Hypotheses 1–5 hold for some $\nu \in \mathcal{M}$ and $\chi \in \mathcal{C}$. For every $\varepsilon > 0$ and $t \ge 0$, let us consider the multiscale system (13) with $d = k = 1$. We take $f(\zeta, y) = -f_1(\zeta) y$ and $g(\zeta, y) = g(\zeta)$ for every $\zeta \in \mathcal{D}$ and $y \in \mathbb{R}^k$, with $f_1(\zeta) > 0$ and $g(\zeta) > 0$ for any $\zeta \in \mathcal{D}$. Fix the Lévy measure $\nu(dz) = e^{-|z|^2} \, dz$; since this is a finite measure, we consider the non-compensated Poisson random measure $N^{\frac{1}{\varepsilon}}$ instead of $\tilde{N}^{\frac{1}{\varepsilon}}$. For fixed $\zeta \in \mathcal{D}$, the Markov semigroup of the fast variable governed by the dynamics
d Y ζ , y ( t ) = f 1 ( ζ ) Y ζ , y ( t ) + g ( ζ ) d B 2 ( t ) + R \ { 0 } g ( ζ ) f 1 ( ζ ) z Y ζ , y ( t ) d N 1 ( d s , d z ) , t 0 ,
has a unique invariant distribution given by
$$\mu^\zeta(dy) = \sqrt{\frac{f_1(\zeta)}{\pi g^2(\zeta)}} \, e^{-\frac{f_1(\zeta) y^2}{g^2(\zeta)}} \, dy.$$
The averaged coefficient $\bar{a}$, given for any $\zeta \in \mathcal{D}$ by
$$\bar{a}(\zeta) = \int_{\mathbb{R} \setminus \{0\}} a(\zeta, y) \, \mu^\zeta(dy),$$
satisfies Hypothesis 5 if $a$, $f$ and $g$ are $C^1$-Fréchet differentiable with respect to $\zeta$. This example was inspired by the examples in [30] and illustrates that the class of assumptions we make on the coefficients of (13) is not empty.
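Both examples above average the coefficient $a$ against a centred Gaussian law: variance $1$ in the first example and variance $g^2(\zeta) / (2 f_1(\zeta))$ in the second. The sketch below approximates such an average by Gauss–Hermite quadrature. The function name `averaged_coefficient`, the scalar stand-in for the path argument $\zeta$ and the test coefficient are hypothetical and serve only as an illustration.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def averaged_coefficient(a, zeta, variance=1.0, n_nodes=40):
    """Approximate a_bar(zeta) = int a(zeta, y) mu(dy) for mu = N(0, variance).

    `a` must accept a NumPy array in its second argument.
    """
    nodes, weights = hermegauss(n_nodes)   # quadrature for weight exp(-y^2 / 2)
    y = np.sqrt(variance) * nodes          # rescale nodes to N(0, variance)
    return np.sum(weights * a(zeta, y)) / np.sqrt(2.0 * np.pi)

# Hypothetical coefficient a(zeta, y) = -zeta + y^2, whose average is -zeta + variance.
a_example = lambda zeta, y: -zeta + y ** 2

print(averaged_coefficient(a_example, 0.3))                  # first example, N(0, 1)
f1, g = 2.0, 1.0                                             # assumed values of f_1(zeta), g(zeta)
print(averaged_coefficient(a_example, 0.3, g ** 2 / (2 * f1)))  # second example
```

Quadrature avoids Monte Carlo noise here because the invariant law is known in closed form; for a general fast equation one would instead have to sample long trajectories of the frozen process.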

3. Proof of the Main Theorem

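Throughout this section, let the standing assumptions made in Theorem 2 hold. Let
$$d(\varepsilon) = \varepsilon^{\frac{1 - \theta}{2}}, \quad \varepsilon > 0, \quad \text{for some } \theta \in \left( \tfrac{1}{2}, 1 \right).$$
The speed of the MDP is given by $b(\varepsilon) := \frac{\varepsilon}{d^2(\varepsilon)} = \varepsilon^\theta \to 0$, as $\varepsilon \to 0$.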
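The following short sketch tabulates these scales numerically. It checks that $b(\varepsilon) = \varepsilon^\theta$ and that the moderate deviation scale sits strictly between the central limit scale $\sqrt{\varepsilon}$ and the large deviation scale $1$. The function names and the value of `theta` are illustrative assumptions.

```python
def d(eps: float, theta: float) -> float:
    """Moderate deviation scale d(eps) = eps ** ((1 - theta) / 2)."""
    return eps ** ((1.0 - theta) / 2.0)

def b(eps: float, theta: float) -> float:
    """Speed b(eps) = eps / d(eps) ** 2, which equals eps ** theta."""
    return eps / d(eps, theta) ** 2

theta = 0.75  # any value in (1/2, 1); chosen only for illustration
for eps in (1e-2, 1e-4, 1e-8):
    # columns: eps, d(eps), b(eps), sqrt(eps) / d(eps)
    print(eps, d(eps, theta), b(eps, theta), eps ** 0.5 / d(eps, theta))
```

Since $(1 - \theta)/2 < 1/4$, the ratio $\sqrt{\varepsilon} / d(\varepsilon)$ in the last column tends to zero, while $d(\varepsilon)$ itself tends to zero more slowly than any central limit normalization.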

3.1. The Setup of the Weak Convergence Approach

  • Notation.
We follow extensively the notation introduced by Budhiraja, Dupuis and Ganguly in [46].
Let A ¯ + (resp. A ¯ ) be the class of all ( B ( X ) P ¯ ) / B ( [ 0 , ) ) (resp. ( B ( X ) P ¯ ) / B ( R ) )-measurable maps from [ 0 , T ] × X × V ¯ to [ 0 , ) (resp. R ). For φ A ¯ + , let us define a counting process N φ on X T by
N φ ( U × ( 0 , t ] ) ( ω ¯ ) : = U 0 t 0 1 [ 0 , φ ( x , s ) ( ω ¯ ) ] ( r ) N ¯ ( d x , d r , d s ) , t [ 0 , T ] , U B ( X ) .
One can think of N φ as a controlled random measure with φ selecting the intensity for the points at location x and time s in a possibly random but non-anticipating way. When φ ( x , s , m ¯ ) = θ ( 0 , ) , we write N φ = N θ . For more details, we refer the reader to [43].
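The thinning construction behind $N^\varphi$ can be sketched as follows. The sketch ignores the mark space $\mathbb{X}$ (a single jump location), truncates the auxiliary coordinate $r$ at a level `r_max`, and includes the factor $1/\varepsilon$ used later for the controlled measures; the function names and these simplifications are assumptions made for illustration.

```python
import numpy as np

def sample_base_points(T, r_max, rng):
    """Points (s, r) of a unit-rate Poisson random measure on [0, T] x [0, r_max]."""
    n = rng.poisson(T * r_max)
    return rng.uniform(0.0, T, n), rng.uniform(0.0, r_max, n)

def controlled_jump_times(s, r, phi, eps):
    """Keep the base point (s_i, r_i) iff r_i <= phi(s_i) / eps.

    The kept times form a point process with intensity phi(s) / eps,
    provided phi(s) / eps <= r_max for all s.
    """
    return np.sort(s[r <= phi(s) / eps])

rng = np.random.default_rng(0)
s, r = sample_base_points(T=2.0, r_max=5.0, rng=rng)
phi = lambda t: 1.0 + 0.5 * np.sin(t)        # hypothetical bounded control
print(controlled_jump_times(s, r, phi, eps=0.5))
```

All controlled measures are built from the same base points, so a larger control keeps a superset of the jump times kept by a smaller one; this coupling is what makes the control non-anticipating once $\varphi$ is predictable.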
Define $\ell : [0, \infty) \to [0, \infty)$ by
$$\ell(r) = r \ln r - r + 1, \quad r \in [0, \infty).$$
For any $\varphi \in \bar{\mathcal{A}}_+$ and $t \in [0, T]$, define the quantity
$$L_t(\varphi)(\bar{\omega}) := \int_0^t \int_{\mathbb{X}} \ell(\varphi(s, z, \bar{\omega})) \, \nu(dz) \, ds.$$
This is a well-defined quantity as a $[0, \infty]$-valued random variable.
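The cost function $\ell(r) = r \ln r - r + 1$ vanishes at $r = 1$ and behaves like $(r - 1)^2 / 2$ near $1$. This quadratic behaviour is what links an entropy-type cost of a control close to $1$ to an $L^2$-type cost of its rescaled deviation from $1$. The sketch below checks both facts numerically; the function name `ell` and the sample points are illustrative choices.

```python
import math

def ell(r: float) -> float:
    """Cost ell(r) = r ln r - r + 1 on [0, inf), with the convention 0 ln 0 = 0."""
    if r < 0.0:
        raise ValueError("ell is only defined on [0, infinity)")
    if r == 0.0:
        return 1.0
    return r * math.log(r) - r + 1.0

for x in (1e-1, 1e-2, 1e-3):
    # the ratio ell(1 + x) / (x^2 / 2) approaches 1 as x -> 0
    print(x, ell(1.0 + x) / (0.5 * x * x))
```

Writing a control as $\varphi = 1 + d(\varepsilon) \psi$, the expansion suggests $\ell(\varphi) \approx d^2(\varepsilon) \psi^2 / 2$ wherever $d(\varepsilon) |\psi|$ is small, which is a heuristic reading of why energy constraints of order $d^2(\varepsilon)$ appear in the moderate deviation regime.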
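Let $\{K_n\}_{n \in \mathbb{N}} \subset \mathbb{X}$ be an increasing sequence of compact sets such that $\bigcup_{n=1}^\infty K_n = \mathbb{X}$. For each $n \in \mathbb{N}$, let
$$\bar{\mathcal{A}}_{b, n} := \left\{ \varphi \in \bar{\mathcal{A}}_+ \;\middle|\; \text{for all } (t, \bar{\omega}) \in [0, T] \times \bar{\mathbb{V}},\ \varphi(t, x, \bar{\omega}) \in \left[ \tfrac{1}{n}, n \right] \text{ if } x \in K_n \text{ and } \varphi(t, x, \bar{\omega}) = 1 \text{ if } x \in K_n^c \right\}$$
and let $\bar{\mathcal{A}}_b := \bigcup_{n \in \mathbb{N}} \bar{\mathcal{A}}_{b, n}$. Considering $\varphi$ as a control that perturbs jump rates away from $1$ when $\varphi \ne 1$, we see that the controls in $\bar{\mathcal{A}}_b$ are bounded and perturb only off a compact set where the bounds of the set can depend on $\varphi$.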
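Consider now the space of random variables
$$\mathcal{P}_2 := \left\{ \xi : [0, T] \times \bar{\mathbb{V}} \to \mathbb{R}^n \;\middle|\; \xi \text{ is } \bar{\mathcal{P}} / \mathcal{B}(\mathbb{R}^n)\text{-measurable such that } \int_0^T |\xi(s, \omega)|^2 \, ds < \infty \ \ \bar{\mathbb{P}}\text{-a.s.} \right\}$$
and set $\mathcal{U} = \mathcal{P}_2 \times \bar{\mathcal{A}}_+$.
For $\xi \in \mathcal{P}_2$ define
$$\tilde{L}_T(\xi)(\bar{\omega}) := \frac{1}{2} \int_0^T |\xi(\bar{\omega}, s)|^2 \, ds, \quad \bar{\omega} \in \bar{\mathbb{V}}.$$
For a given random control $u = (\xi, \varphi) \in \mathcal{U}$, define the energy $\bar{L}_T(u) := \tilde{L}_T(\xi) + L_T(\varphi)$.
For any $M > 0$, let
$$\tilde{S}^M := \{ f \in L^2([0, T]; \mathbb{R}^n) \mid \tilde{L}_T(f) \le M \}.$$
Under the $L^2$-weak topology, $\tilde{S}^M$ is a compact subset of $L^2([0, T]; \mathbb{R}^n)$. Throughout the rest of this work we consider $\tilde{S}^M$ to be endowed with this topology. Let
$$S^M := \{ g : [0, T] \times \mathbb{X} \to [0, \infty) \mid L_T(g) \le M \}.$$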
For any $M > 0$ and under the identification
$$S^M \ni g \mapsto \nu_T^g \in \mathcal{M}, \quad \nu_T^g(A) := \int_A g(s, z) \, \nu(dz) \, ds, \quad A \in \mathcal{B}([0, T] \times \mathbb{X}),$$
the space $S^M$ turns out to be compact when $\mathcal{M}$ is endowed with the vague topology. For more details, we refer the reader to Lemma 5.1 in [44].
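For any $\varepsilon > 0$ and $M > 0$, let us consider the following tightened sublevel sets
$$S_{+, \varepsilon}^M := \left\{ g : [0, T] \times \mathbb{X} \to [0, \infty) \;\middle|\; L_T(g) \le M d^2(\varepsilon) \right\}, \quad S_\varepsilon^M := \left\{ h : [0, T] \times \mathbb{X} \to \mathbb{R} \;\middle|\; h = \frac{g - 1}{d(\varepsilon)},\ g \in S_{+, \varepsilon}^M \right\}$$
and
$$\tilde{S}_\varepsilon^M := \left\{ f : [0, T] \to \mathbb{R}^n \;\middle|\; \tilde{L}_T(f) \le M d^2(\varepsilon) \right\}.$$
Define also the random sublevel sets
$$\mathcal{U}_{+, \varepsilon}^M := \left\{ \varphi \in \bar{\mathcal{A}}_b \;\middle|\; \varphi(\cdot, \cdot, \omega) \in S_{+, \varepsilon}^M \ \ \bar{\mathbb{P}}\text{-a.s.} \right\}, \quad \mathcal{U}_\varepsilon^M := \left\{ \psi \in \bar{\mathcal{A}} \;\middle|\; \psi(\cdot, \cdot, \omega) \in S_\varepsilon^M \ \ \bar{\mathbb{P}}\text{-a.s.} \right\}$$
and
$$\tilde{\mathcal{U}}_\varepsilon^M := \left\{ \xi \in \mathcal{P}_2 \;\middle|\; \xi(\cdot, \omega) \in \tilde{S}_\varepsilon^M \ \ \bar{\mathbb{P}}\text{-a.s.} \right\}.$$
We reserve the notation $B_2(R)$ for the closed ball of radius $R > 0$ in $L^2(\nu_T)$ and $\tilde{B}_2(R)$ for the closed ball in $L^2([0, T]; \mathbb{R}^n)$.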
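Fix a given Polish space $\mathbb{U}$. Given a measurable map $\mathcal{G}^0 : \mathbb{W} \times L^2(\nu_T) \to \mathbb{U}$, let us write the preimage of $\eta$ under $\mathcal{G}^0$,
$$S[\eta] := \left\{ (f, g) \in \mathbb{W} \times L^2(\nu_T) \;\middle|\; \eta = \mathcal{G}^0(f, g) \right\}$$
and define the quadratic form
$$I(\eta) := \inf_{(f, g) \in S[\eta]} \left\{ \frac{1}{2} \int_0^T |f(s)|^2 \, ds + \int_0^T \int_{\mathbb{X}} |g(s, z)|^2 \, \nu_T(ds, dz) \right\}, \quad \eta \in \mathbb{U}.$$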
Remark 4. 
We note that a collection { ψ ε } ε > 0 A ¯ with the property that sup ε > 0 | | ψ ε | | 2 M P -a.s. for some M < is regarded as a collection of B 2 ( M ) -valued random variables where B 2 ( M ) is equipped with the weak topology on the Hilbert space L 2 ( ν T ) . Since B 2 ( M ) is weakly compact, such a collection of random variables is automatically tight. Suppose φ S + , ε M , which, we recall, means that L T ( φ ) M d 2 ( ε ) . Due to Lemma 3.2. in [46] there exists κ 2 ( 1 ) ( 0 , ) independent of ε > 0 and such that ψ 1 | ψ | 1 d ( ε ) B 2 ( M κ 2 ( 1 ) ) , where ψ : = φ 1 d ( ε ) .
The following set of conditions implies the moderate deviation regime.
Hypothesis 6. 
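Let $\mathbb{U}$ be a Polish space. For any $\varepsilon > 0$, let $\mathcal{G}^\varepsilon : \mathbb{V} \to \mathbb{U}$ and $\mathcal{G}^0 : \mathbb{W} \times L^2(\nu_T) \to \mathbb{U}$ be measurable maps satisfying the following two conditions.
1.
Continuity of the limiting map on the controls.  Suppose $(f_n, g_n), (f, g) \in \tilde{S}^M \times B_2(M)$ such that $(f_n, g_n) \to (f, g)$ as $n \to \infty$. Then
$$\mathcal{G}^0(f_n, g_n) \to \mathcal{G}^0(f, g) \quad \text{as } n \to \infty.$$
2.
Weak law for the map under shifts by random tightened controls.  For every $M < \infty$, let $u_\varepsilon := (\xi_\varepsilon, \varphi_\varepsilon) \in \tilde{\mathcal{U}}_\varepsilon^M \times \mathcal{U}_{+, \varepsilon}^M$. For some $\beta \in (0, 1)$, let us assume that $\psi_\varepsilon \mathbf{1}_{\{ |\psi_\varepsilon| < \beta / d(\varepsilon) \}} \Rightarrow \psi$ in $B_2(M \kappa_2(1))$, where $\psi_\varepsilon := \frac{\varphi_\varepsilon - 1}{d(\varepsilon)}$, and $\frac{1}{d(\varepsilon)} \xi_\varepsilon \Rightarrow \xi$ as $\varepsilon \to 0$ in the weak topology of $L^2([0, T]; \mathbb{R}^n)$. Then,
$$\mathcal{G}^\varepsilon \left( \sqrt{\varepsilon} B + \int_0^{\cdot} \xi_\varepsilon(s) \, ds,\ \varepsilon N^{\frac{1}{\varepsilon} \varphi_\varepsilon} \right) \Rightarrow \mathcal{G}^0(\xi, \psi), \quad \text{as } \varepsilon \to 0.$$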
The following theorem is the moderate deviation principle stated in an abstract manner and that will be applied to prove our main result.
Theorem 3. 
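Suppose that for every $\varepsilon > 0$ the maps $\mathcal{G}^\varepsilon : \mathbb{V} \to \mathbb{U}$ and $\mathcal{G}^0 : \mathbb{W} \times L^2(\nu_T) \to \mathbb{U}$ satisfy the conditions of Hypothesis 6. Then, the family $\{Z^\varepsilon\}_{\varepsilon > 0}$ defined by
$$Z^\varepsilon := \mathcal{G}^\varepsilon \left( \sqrt{\varepsilon} B,\ \varepsilon N^{\frac{1}{\varepsilon}} \right), \quad \varepsilon > 0,$$
satisfies a large deviation principle with speed $b(\varepsilon) \to 0$ in $\mathbb{U}$ with good rate function $I$ given by (38).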
Theorem 3 is a particular case of Theorem 9.9 in [45]. In what follows, we apply Theorem 3 to our setting.
Let us fix T > 0 , τ > 0 , ( ζ , y ) C × R k and for every ε > 0 let ( X ε , ζ , y ( t ) , Y ε , ζ , y ( t ) ) t [ τ , T ] be the unique strong solution of (13) with initial datum (14). For every ε > 0 , consider ( Z ε ) ε > 0 given by (33). Under the standing assumptions made at the beginning of this section, for any ε > 0 , Yamada–Watanabe’s theorem ensures the existence of a measurable map G ε : V D ( [ τ , T ] ; R d ) such that
Z ε : = G ε ( ε B , ε N 1 ε ) .
We recall that B = ( B 1 , B 2 ) is a Brownian motion in R d × k due to the independence of B 1 and B 2 and for any ε > 0 the Poisson random measure N 1 ε is independent of B 1 and B 2 and hence of B, which justifies the existence of the Ito map G ε . The proof of Theorem 2 consists in checking the conditions (1) and (2) of Hypothesis 6 for ( G ε ) ε > 0 and G 0 : W × L 2 ( ν T ) C ( [ τ , T ] ; R d ) , G 0 ( f , g ) = η , with η C ( [ τ , T ] ; R d ) defined by the skeleton Equation (32). Hence, Theorem 3 allows us to conclude.

3.2. The Skeleton Equations and the Compactness Condition

For any $\chi \in \mathcal{C}$ and $u = (f, g) \in L^2([0, T]; \mathbb{R}^d) \times L^2(\nu_T)$, let us denote by $\bar{Z}^u \in C([-\tau, T]; \mathbb{R}^d)$ the unique solution of (32). By definition, we have
$$\mathcal{G}^0(f, g) = \bar{Z}^u.$$
Proposition 4. 
For every $M < \infty$, the set
$$K_M := \left\{ \mathcal{G}^0(f, g) \;\middle|\; (f, g) \in \tilde{B}_2(M) \times B_2(M) \right\}$$
is compact in $C([-\tau, T]; \mathbb{R}^d)$.
Remark 5. 
Proposition 4 is implied by the following statement. Fix $0 \le M < \infty$. Let $(f_n, g_n)_{n \in \mathbb{N}} \subset \tilde{B}_2(M) \times B_2(M)$ be such that $(f_n, g_n) \to (f, g)$ weakly as $n \to \infty$. Then
$$\mathcal{G}^0(f_n, g_n) \to \mathcal{G}^0(f, g) \quad \text{as } n \to \infty.$$
The proof of the statement in Remark 5, which implies Proposition 4, is standard. We refer the reader to Lemma 4.1 in the seminal work [46].

3.3. The Weak Limit of the Controlled Auxiliary Processes

3.3.1. The Equations for the Controlled Auxiliary Processes

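This section serves the purpose of verifying the second condition in Hypothesis 6 for $\mathcal{G}^0$ and the family $\{ \mathcal{G}^\varepsilon : \mathbb{V} \to D([-\tau, T]; \mathbb{R}^d) \}_{\varepsilon > 0}$. For every $\varepsilon > 0$, recall the random sublevel sets $\tilde{\mathcal{U}}_\varepsilon^M$ and $\mathcal{U}_{+, \varepsilon}^M$ given by (37) and let $u_\varepsilon := (\xi_\varepsilon, \varphi_\varepsilon) \in \tilde{\mathcal{U}}_\varepsilon^M \times \mathcal{U}_{+, \varepsilon}^M$. Set $\tilde{\varphi}_\varepsilon = \frac{1}{\varphi_\varepsilon}$. The definition of $\tilde{\varphi}_\varepsilon$ makes sense since one has $\varphi_\varepsilon \in \bar{\mathcal{A}}_b$ $\bar{\mathbb{P}}$-a.s. For any $t \in [0, T]$, we define the $\bar{\mathbb{F}}$-martingales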
E ( ξ ε ) ( t ) : = exp 0 t ξ ε ( s ) d B ( s ) 1 2 0 t | ξ ε ( s ) | 2 d s and E ( φ ˜ ε ) ( t ) : = exp 0 t X 0 1 ε ln φ ˜ ε ( s , z ) N ¯ ( d s , d z , d r ) + 0 t X 0 1 ε ( φ ˜ ε ( s , z ) + 1 ) d s ν ( d z ) d r ) .
For every t [ 0 , T ] , let E ¯ ( u ε ) ( t ) : = E ˜ ( ξ ε ) ( t ) E ( φ ˜ ε ) ( t ) . Girsanov’s theorem stated in the form of Theorem III.3.24 of [64] ensures that ( E ¯ ( u ε ) ( t ) ) t [ 0 , T ] is an F ¯ -martingale. Hence, the probability measures defined on ( V ¯ , B ( V ¯ ) ) by
Q T ε ( G ) : = G E ¯ ( u ε ) ( T ) d P ¯ , for all G B ( V ¯ )
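are absolutely continuous with respect to $\bar{\mathbb{P}}$. Under $\mathbb{Q}_T^\varepsilon$, the stochastic process
$$\tilde{B}^\varepsilon(t) := B(t) - \int_0^t \xi_\varepsilon(s) \, ds, \quad t \in [0, T],$$
is a standard Brownian motion and $\varepsilon N^{\frac{1}{\varepsilon} \varphi_\varepsilon}$ is an independent random measure with the same law as $\varepsilon N^{\frac{1}{\varepsilon}}$ under $\bar{\mathbb{P}}$. We recall that
$$N^{\frac{1}{\varepsilon} \varphi_\varepsilon}((0, t] \times U) := \int_0^t \int_U \int_0^\infty \mathbf{1}_{[0, \frac{1}{\varepsilon} \varphi_\varepsilon]}(r) \, \bar{N}(ds, dz, dr).$$
For every $\varepsilon > 0$ and $t \in [0, T]$, we write $\xi_\varepsilon(t) = (\xi_1^\varepsilon, \xi_2^\varepsilon)(t) \in \mathbb{R}^d \times \mathbb{R}^k$. For any $(\chi, y) \in \mathcal{C} \times \mathbb{R}^k$, we define the slow controlled process $(X^\varepsilon(t))_{t \in [0, T]}$ and the fast controlled process $(Y^\varepsilon(t))_{t \in [0, T]}$ given as the strong solutions of (6) and (7), respectively, with respect to $\bar{\mathbb{P}}$ (since $\mathbb{Q}_T^\varepsilon \ll \bar{\mathbb{P}}$).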
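For every $\varepsilon > 0$, we define the fast averaged controlled process $(\bar{X}^\varepsilon(t))_{t \in [0, T]}$ as the strong solution under $\bar{\mathbb{P}}$ of the controlled stochastic differential Equation (9).
For every $\varepsilon > 0$, let
$$Z^\varepsilon := \frac{X^\varepsilon - \bar{X}^0}{d(\varepsilon)} = \mathcal{G}^\varepsilon \left( \sqrt{\varepsilon} B + \int_0^{\cdot} \xi_\varepsilon(s) \, ds,\ \varepsilon N^{\frac{1}{\varepsilon} \varphi_\varepsilon} \right)$$
and, respectively,
$$\bar{Z}^\varepsilon := \frac{\bar{X}^\varepsilon - \bar{X}^0}{d(\varepsilon)}.$$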
The weak limit for the maps under shifts by random tightened controls.
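Let $M < \infty$ and $\beta \in (0, 1)$. Let $(\xi_\varepsilon, \varphi_\varepsilon) \in \tilde{\mathcal{U}}_\varepsilon^M \times \mathcal{U}_{+, \varepsilon}^M$ be such that $\psi_\varepsilon \mathbf{1}_{\{ |\psi_\varepsilon| < \beta / d(\varepsilon) \}} \Rightarrow \psi$ in $B_2(M \kappa_2(1))$, where $\psi_\varepsilon := \frac{\varphi_\varepsilon - 1}{d(\varepsilon)}$, and $\frac{1}{d(\varepsilon)} \xi_\varepsilon \Rightarrow \xi$ in $\tilde{B}_2(M)$. The conclusion in the second statement in Hypothesis 6 for $(\mathcal{G}^\varepsilon)_{\varepsilon > 0}$ and $\mathcal{G}^0$ reads as $Z^\varepsilon \Rightarrow \bar{Z}$, as $\varepsilon \to 0$, where $\bar{Z} \in C([-\tau, T]; \mathbb{R}^d)$ solves uniquely
$$\bar{Z}(t) = \int_0^t D\bar{a}(\bar{X}_s^{0, \chi}) \bar{Z}(s) \, ds + \int_0^t \sigma(\bar{X}_s^{0, \chi}) \xi(s) \, ds + \int_0^t \int_{\mathbb{X}} c(\bar{X}_s^{0, \chi}, z) \psi(s, z) \, \nu(dz) \, ds, \quad t \in [0, T]; \qquad \bar{Z}(t) = 0, \quad t \in [-\tau, 0].$$
In order to prove that $Z^\varepsilon \Rightarrow \bar{Z}$, as $\varepsilon \to 0$, we proceed as follows.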
1.
This step passes through two intermediary tasks. Firstly, one shows that the laws of $(\bar{Z}^\varepsilon)_{\varepsilon > 0}$ are tight in $\mathcal{P}(C([-\tau, T]; \mathbb{R}^d))$ (since compact sets in the topology generated by the uniform convergence are also compact sets in the Skorokhod topology). Then, it follows that there exists $\tilde{Z} \in C([-\tau, T]; \mathbb{R}^d)$ such that $\bar{Z}^\varepsilon \Rightarrow \tilde{Z}$ as $\varepsilon \to 0$. Passing to the pointwise limit in the equation satisfied by $\bar{Z}^\varepsilon$ and due to the uniqueness of the solution of (43), we conclude that $\tilde{Z} = \bar{Z}$.
2.
We prove the following strong (controlled) averaging principle:
$$\lim_{\varepsilon \to 0} \bar{\mathbb{P}} \left( \sup_{t \in [0, T]} |Z^\varepsilon(t) - \bar{Z}^\varepsilon(t)| > \delta \right) = 0, \quad \text{for any } \delta > 0.$$
From the limit above and Theorem 4.1 in [53], commonly known as Slutsky's theorem, we can identify $\bar{Z}$ as the weak limit of $(Z^\varepsilon)_{\varepsilon}$ as $\varepsilon \to 0$.

3.3.2. A Priori Estimates and a Localization Procedure

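For every $\varepsilon > 0$, let $R(\varepsilon) > 0$ be such that $R(\varepsilon) \to \infty$ and $d(\varepsilon) R^2(\varepsilon) \to 0$ as $\varepsilon \to 0$. For example, $R(\varepsilon) := d(\varepsilon)^{-\frac{1}{4}}$, $\varepsilon > 0$, does the job. Consequently, since $\sqrt{\varepsilon} \le d(\varepsilon)$ for $\varepsilon \in (0, 1)$, we have $\sqrt{\varepsilon} R^2(\varepsilon) \to 0$ and therefore $\varepsilon R^2(\varepsilon) \to 0$ as $\varepsilon \to 0$. For every $\varepsilon > 0$ and this choice of $R(\varepsilon)$, we define the $\bar{\mathbb{F}}$-stopping times
$$\tilde{\tau}_{R(\varepsilon)}^\varepsilon := \inf \{ t \in [0, T] \mid X^\varepsilon(t) \notin B_{R(\varepsilon)}(0) \}$$
and
$$\bar{\tau}_{R(\varepsilon)}^\varepsilon := \inf \{ t \in [0, T] \mid \bar{X}^\varepsilon(t) \notin B_{R(\varepsilon)}(0) \}.$$
The following propositions and lemmas provide the fundamental estimates used in the strategy described above to conclude that $Z^\varepsilon \Rightarrow \bar{Z}$ as $\varepsilon \to 0$.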
Proposition 5. 
Let the standing assumptions of Theorem 2 hold. For any 0 < M < , ( ξ ε , φ ε ) ε > 0 U ˜ + , ε M × U + , ε M , R : ( 0 , 1 ] ( 0 , ) such that R ( ε ) and d ( ε ) R 2 ( ε ) 0 as ε 0 and T , τ > 0 , we have the following. Given ( X ε ( s ) ) s [ τ , T ] defined by (6) and ( X ¯ ε ( s ) ) s [ τ , T ] by (9), there exists 0 < ε 0 < 1 and a constant C > 0 such that for every 0 < ε < ε 0 the following estimates hold:
P ¯ sup 0 t T | X ε ( s ) | > R ( ε ) 2 e 1 2 R ( ε ) + C ε R ( ε )
and
P ¯ sup 0 t T | X ¯ ε ( s ) | > R ( ε ) 2 e 1 2 R ( ε ) + C ε R ( ε )
The proof follows the same reasoning employed in Lemma 2.1 of [63].
Proposition 6. 
Let M > 0 . Fix a function R : ( 0 , ) ( 0 , ) satisfying the assumptions of Proposition 5 and for every ε > 0 let τ ˜ R ( ε ) ε defined by (44). Under the assumptions of Hypotheses 1–5, there exists some ε 0 > 0 such that the following bound holds:
Γ 1 ( M ) : = sup 0 < ε < ε 0 sup ( ξ , φ ) U ˜ ε M × U + , ε M E ¯ sup τ t τ ˜ R ( ε ) ε | X ε ( t ) | 2 + sup τ t T E ¯ | Y ε ( t ) | 2 1 { τ R ( ε ) ε > T } < .
The proof follows by successively applying Itô's formula, the BDG inequalities and Lemma A1 presented in Section A.1.1 of Appendix A.
Proposition 7. 
Fix M > 0 , R : ( 0 , ) ( 0 , ) satisfying the hypotheses of Proposition 5 and for every ε > 0 let τ ¯ R ( ε ) ε be defined by (45). Under Hypotheses 1–5, there exists some ε 0 > 0 such that the following holds:
Γ 2 ( M ) : = sup 0 < ε < ε 0 sup ( ξ , φ ) U ˜ ε M × U + , ε M E ¯ sup τ t τ ¯ R ( ε ) ε | X ¯ ε ( t ) | 2 < .
The proof of Proposition 7 is analogous to the proof of (48). For this reason, we omit it.
Lemma 1. 
Fix M > 0 , R : ( 0 , ) ( 0 , ) under the hypotheses of Proposition 5 and for every ε > 0 let τ ¯ R ( ε ) ε defined by (45). Under the assumptions of Hypotheses 1–5, there exists some ε 0 > 0 such that the following holds:
Γ 3 ( M ) : = sup 0 < ε < ε 0 sup ( ξ , φ ) U ˜ ε M × U + , ε M E ¯ sup τ t τ ¯ R ( ε ) ε | Z ¯ ε ( t ) | 2 < .
The proof of (50) is straightforward and we omit it.

3.3.3. Identification of the Weak Limit

Given M < and ε > 0 , let ξ ε U ˜ ε M , φ ε U + , ε M and write ψ ε : = φ ε 1 d ( ε ) . Assume that for some β ( 0 , 1 ) the following convergences (in law) are satisfied
ψ ε 1 { | ψ ε | β d ( ε ) } ψ and 1 d ( ε ) ξ ε ξ , as ε 0 .
Then, the following result holds.
Proposition 8. 
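Let the standing assumptions of Theorem 2 hold for some $\nu \in \mathcal{M}$ and $\chi \in \mathcal{C}$. For every $\varepsilon > 0$, let $(\bar{Z}^\varepsilon(t))_{t \in [-\tau, T]}$ be defined by (42). Then, the family $\left( \bar{Z}^\varepsilon, \frac{1}{d(\varepsilon)} \xi_\varepsilon, \psi_\varepsilon \mathbf{1}_{\{ |\psi_\varepsilon| \le \beta / d(\varepsilon) \}} \right)_{\varepsilon > 0}$ is tight in $D([-\tau, T]; \mathbb{R}^d) \times \tilde{B}_2(M) \times B_2(M \kappa_2(1))$ for some $\beta \in (0, 1)$ and $\kappa_2(1)$ given in Remark 4. Furthermore, any limit point in law $(\bar{Z}, \xi, \psi)$ satisfies (43).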
The proof follows with standard arguments used by the weak convergence approach to moderate deviation principles for stochastic differential equations with jumps. We refer the reader to Lemma 4.9 in the seminal work [46].

3.4. The Controlled Averaging Principle

The main result of this section allows us to identify the weak limit of ( Z ε ) ε > 0 with the weak limit of the family ( Z ¯ ε ) ε > 0 as ε 0 .
Theorem 4. 
Let the hypotheses of Theorem 2 hold. Then, given the families ( Z ε ) ε > 0 and ( Z ¯ ε ) ε > 0 , defined, respectively, by (41) and (42), we have for any δ > 0
lim ε 0 P sup 0 t τ ˜ R ( ε ) ε | Z ε ( t ) Z ¯ ε ( t ) | > δ = 0 .
The reader can find the proof in Section 3.4.4.

3.4.1. Khasminskii’s Auxiliary Processes

We follow the technique introduced in [15] with the required modifications to our settings in order to deal with the nonlocal components of the auxiliary processes ( X ε ) ε > 0 and ( Y ε ) ε > 0 given, respectively, by (6) and (7).
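Let $[-\tau, T]$ be divided into intervals of the same length, parametrized for every $\varepsilon > 0$ by
$$\Delta = \Delta(\varepsilon) := \varepsilon^\gamma d^2(\varepsilon) |\ln \varepsilon|^p, \quad \text{for some } \gamma \in \left( 0, \theta - \tfrac{1}{2} \right) \text{ and } p > 0,$$
where the scale $d(\varepsilon)$ is given by (35).
We note the following convergences that follow directly from the choice of $\Delta = \Delta(\varepsilon)$ in (52):
$$\Delta(\varepsilon) \to 0; \quad \frac{\Delta(\varepsilon)}{d^2(\varepsilon)} \to 0; \quad \text{and} \quad \frac{\Delta(\varepsilon)}{\varepsilon} \to \infty \quad \text{as } \varepsilon \to 0.$$
For any $t \in [-\tau, T]$, we denote $t_\Delta := \left\lfloor \frac{t}{\Delta} \right\rfloor \Delta$.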
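The sketch below illustrates the two ingredients of the time discretization: the block length and the map sending a time $t$ to the left endpoint of its block. It assumes the reading $\Delta(\varepsilon) = \varepsilon^\gamma d^2(\varepsilon) |\ln \varepsilon|^p$ with $d^2(\varepsilon) = \varepsilon^{1 - \theta}$ of the definition (52); the function names and parameter values are hypothetical.

```python
import math

def block_length(eps: float, theta: float, gamma: float, p: float) -> float:
    """Block length eps**gamma * d(eps)**2 * |ln eps|**p (assumed reading of (52))."""
    d_squared = eps ** (1.0 - theta)
    return eps ** gamma * d_squared * abs(math.log(eps)) ** p

def floor_to_grid(t: float, delta: float) -> float:
    """Left endpoint of the block containing t: floor(t / delta) * delta."""
    return math.floor(t / delta) * delta

theta, gamma, p = 0.8, 0.2, 1.0     # gamma must lie in (0, theta - 1/2)
for eps in (1e-4, 1e-8, 1e-12):
    delta = block_length(eps, theta, gamma, p)
    # columns: eps, Delta, Delta / d^2 (should decrease to 0), Delta / eps (should grow)
    print(eps, delta, delta / eps ** (1.0 - theta), delta / eps)

print(floor_to_grid(1.1, 0.25))     # the block [1.0, 1.25) contains t = 1.1
```

The two ratios encode the design of the method: blocks must be long on the fast time scale $\varepsilon$, so that the frozen fast process has time to mix, and short relative to $d^2(\varepsilon)$, so that freezing the slow variable costs less than the moderate deviation scale.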
We construct the auxiliary processes ( Y ^ ε ( t ) ) t [ 0 , T ] and ( X ^ ε ( t ) ) t [ 0 , T ] by means of the following equations: for any t [ 0 , T ] , let
Y ^ ε ( t ) = Y ε ( t Δ ) + 1 ε t Δ t f ( X t Δ ε , Y ^ ε ( s ) ) + g ( X t Δ ε , Y ^ ε ( s ) ) ξ 2 ε ( s ) d s + X h ( X t Δ ε , Y ^ ε ( s ) , z ) ( φ ε ( s , z ) 1 ) ν ( d z ) d s + 1 ε t Δ t g ( X t Δ ε , Y ^ ε ( s ) ) d B 2 ( s ) + t Δ t X h ( X t Δ ε , Y ^ ε ( s ) , z ) N ˜ 1 ε φ ε ( d s , d z )
and
X ^ ε ( t ) = ζ ( 0 ) + 0 t a ( X s Δ ε , Y ε ( s ) ) + σ ( X s ε ) ξ 1 ε ( s ) + X c ( X s ε , z ) ( φ ε ( s , z ) 1 ) ν ( d z ) d s + ε 0 t σ ( X s ε ) d B 1 ( s ) + ε 0 t X c ( X s ε , z ) N ˜ 1 ε φ ε ( d s , d z ) .

3.4.2. Auxiliary Estimates

For every ε > 0 , let us recall the F ¯ -stopping time τ ˜ R ( ε ) ε given by (44) for the fixed parametrization R given in Proposition 5. The following lemmas are essential a priori bounds that we use in the proof of the controlled averaging principle stated in Theorem 4.
Lemma 2. 
For every ε > 0 , let R ( ε ) > 0 , b ( ε ) : = ε d 2 ( ε ) and Δ ( ε ) > 0 be fixed as above. Then, for any ( g ε ) ε > 0 such that g ( ε ) ε d ( ε ) as ε 0 , the following asymptotic regime holds:
P ¯ sup 0 t τ ˜ R ( ε ) ε | | X t ε X t Δ ε | | > g ε ε Ξ ( ε ) 0 , a s ε 0 ,
where
Ξ ( ε ) : = b 2 ( ε ) ε | ln ε | 2 p + ε 2 θ 1 2 γ | ln ε | 2 p q + ε 2 θ 1 γ | ln ε | p q + ε γ ε 1 θ | ln ε | 2 p , q > 2 γ + 3 , ε > 0 .
The proof is given in Section A.2.2 of Appendix A.
Lemma 3. 
For every ε > 0 , let R ( ε ) be fixed as in Proposition 5 and Δ ( ε ) given by (52). Then, the following convergence holds,
sup 0 t T E ¯ | Y ε ( t ) Y ^ ε ( t ) | 1 { T < τ ˜ R ( ε ) ε } ε C ( ε ) Δ ( ε ) e 2 Δ ( ε ) 2 ε + 1 a s ε 0 .
for some C ( ε ) 0 as ε 0 uniformly in the initial condition ( χ , y ) C × R k .
The proof is given in Section A.2.2 of Appendix A.

3.4.3. Khasminskii’s Technique

Proposition 9. 
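For any $\delta > 0$, we have
$$\limsup_{\varepsilon \to 0} \bar{\mathbb{P}} \left( \sup_{0 \le t \le \tilde{\tau}_{R(\varepsilon)}^\varepsilon} |X^\varepsilon(t) - \hat{X}^\varepsilon(t)| > \frac{\delta \, d(\varepsilon)}{2} \right) = 0.$$
Proof. 
The definitions of $(Y^\varepsilon(t))_{t \in [0, T]}$ and $(\hat{Y}^\varepsilon(t))_{t \in [0, T]}$ given in (7) and (54), respectively, combined with Hypothesis 2 yield for every $\varepsilon > 0$ and $t \in [0, T]$ that
$$|\hat{X}^\varepsilon(t) - X^\varepsilon(t)| = \left| \int_0^t \left[ a(X_{s_\Delta}^\varepsilon, \hat{Y}^\varepsilon(s)) - a(X_s^\varepsilon, Y^\varepsilon(s)) \right] ds \right| \le L \int_0^t \| X_{s_\Delta}^\varepsilon - X_s^\varepsilon \| \, ds + L \int_0^t |\hat{Y}^\varepsilon(s) - Y^\varepsilon(s)| \, ds.$$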
The asymptotic behaviour (53) of Δ ( ε ) > 0 fixed in (52) combined with Lemma 2, (A14) and (A15) of Lemma 3 yield some C = C ( L , T ) > 0 such that
P ¯ sup 0 t τ ˜ R ( ε ) ε | X ^ ε ( t ) X ε ( t ) | > d ( ε ) 2 P ¯ 0 T τ ˜ R ( ε ) ε | a ( X s Δ ε , Y ^ ε ( s ) ) a ( X s ε , Y ε ( s ) ) | d s > d ( ε ) 2 P ¯ sup 0 t T τ ˜ R ( ε ) ε | | X t Δ ε X t ε | | > C d ( ε ) + P ¯ 0 T | Y ^ ε ( s ) Y ε ( s ) | 2 1 { T < τ ˜ ε ( R ( ε ) ) } d s > C d 2 ( ε ) ε Ξ ( ε ) + 1 d 2 ( ε ) 0 T E ¯ | Y ^ ε ( s ) Y ε ( s ) | 2 1 { T < τ ˜ R ( ε ) ε } d s ε Ξ ( ε ) + C 2 ( ε ) Δ ( ε ) d 2 ( ε ) e Δ ( ε ) 2 ε + 1 0 as ε 0 .
This finishes the proof of (59). □
Proposition 10. 
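For any $\delta > 0$, we have
$$\limsup_{\varepsilon \to 0} \bar{\mathbb{P}} \left( \sup_{0 \le t \le \tilde{\tau}_{R(\varepsilon)}^\varepsilon} |\hat{X}^\varepsilon(t) - \bar{X}^\varepsilon(t)| > \frac{\delta \, d(\varepsilon)}{2} \right) = 0.$$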
Proof. 
For every ε > 0 , t [ 0 , T ] , ζ D , ξ U ˜ + , ε M and φ ε U + , ε M , we define the function
b ε ( ζ ) ( t ) : = 0 t σ ( ζ ) ξ 1 ε ( s ) + X c ( ζ , z ) ( φ ε ( s , z ) 1 ) ν ( d z ) .
The definitions of ( X ε ( t ) ) t [ 0 , T ] and ( X ^ ε ( t ) ) t [ 0 , T ] given, respectively, in (6) and (55) combined with the definition of b ε given above imply for every t [ 0 , T ] and ε > 0 the following identity P ¯ -a.s. on the event { T < τ ˜ R ( ε ) ε } :
X ^ ε ( t ) X ¯ ε ( t ) = 0 t b ε ( X ^ s ε ) b ε ( X ¯ s ε ) d s + 0 t a ( X s Δ ε , Y ^ ε ( s ) ) a ¯ ( X s ε ) d s + 0 t a ¯ ( X s ε ) a ¯ ( X ^ s ε ) d s + 0 t a ¯ ( X ^ s ε ) a ¯ ( X ¯ s ε ) d s + ε 0 t σ ( X s ε ) σ ( X ¯ s ε ) d B 1 ( s ) + ε 0 t X c ( X s ε , z ) c ( X ¯ s ε , z ) N ˜ 1 ε φ ε ( d s , d z ) .
Hypothesis 2, Proposition 2 and (61) yield some constant C = C ( L , T ) > 0 such that on the event { T < τ ˜ R ( ε ) ε } we have P ¯ -a.s.
sup 0 s t | X ^ ε ( s ) X ¯ ε ( s ) | 2 C ( 0 t sup 0 u s | X ^ ε ( u ) X ¯ ε ( u ) | 2 d s + sup 0 s t | 0 s a ( X u Δ ε , Y u ε ) a ¯ ( X u ε ) d u | 2 + sup t [ 0 , T ] | J 1 ε ( t ) | 2 1 { T < τ ˜ R ( ε ) ε } + sup t [ 0 , T ] | J 2 ε ( t ) | 2 1 { T < τ ˜ R ( ε ) ε } ) ,
where for any ε > 0 we write
J 1 ε ( t ) : = ε 0 t σ ( X s ε ) σ ( X ¯ s ε ) d B 1 ( s ) and J 2 ε ( t ) : = ε 0 t X c ( X s ε , z ) c ( X ¯ s ε , z ) N ˜ 1 ε φ ε ( d s , d z ) .
Gronwall’s lemma implies for any ε > 0 that
sup τ t T | X ε ( t ) X ε ( t ) | 2 1 { T < τ ˜ R ( ε ) ε } e C T ( sup 0 s t | 0 s a ( X u Δ ε , Y u ε ) a ¯ ( X u ε ) d u | 2 1 { T < τ ˜ R ( ε ) ε } + sup 0 t T τ ˜ R ( ε ) ε | J 1 ε ( t ) | 2 + sup 0 t T τ ˜ R ( ε ) ε | J 2 ε ( t ) | 2 ) .
The estimate (62) yields for any δ > 0
P ¯ sup 0 t τ ˜ R ( ε ) ε | X ^ ε ( t ) X ¯ ε ( t ) | > d ( ε ) δ 2 P ¯ sup 0 s t | 0 s a ( X u Δ ε , Y ε ( u ) ) a ¯ ( X u ε ) d u | 2 1 { T < τ ˜ R ( ε ) ε } > δ 2 d 2 ( ε ) e 2 C T 12 + P ¯ sup 0 t T τ ˜ R ( ε ) ε | J 1 ε ( t ) | 2 > δ 2 d 2 ( ε ) e 2 C T 12 + P ¯ sup 0 t T τ ˜ R ( ε ) ε | J 2 ε ( t ) | 2 δ 2 d 2 ( ε ) e 2 C T 12 .
Burkholder–Davis–Gundy’s inequalities and the sublinear growth of σ given by (16) in Remark 2 yield some constant C 2 = C 2 ( δ , C 1 , L 1 , Γ 1 , Γ 2 ) > 0 , where Γ 1 , Γ 2 are given, respectively, by (48) in Proposition 6 and (49) in Proposition 7, such that
P ¯ sup 0 t T τ ˜ R ( ε ) ε | J 1 ε ( t ) | 2 > δ 2 d 2 ( ε ) e 2 C T 12 12 e 2 C T δ 2 d 2 ( ε ) E ¯ sup 0 t T τ ˜ R ( ε ) ε | J 1 ε ( t ) | 2 C 2 ε d 2 ( ε ) 0 , as ε 0 .
Analogously, due to Burkholder–Davis–Gundy’s inequalities and (A1) given in Lemma A1 of Appendix A, Section A.1, there exists some constant $C_3 = C_3(\delta, C_1, L_1, \Gamma_1, \Gamma_2, M) > 0$ that may change from line to line, such that
P ¯ sup 0 t T τ ˜ R ( ε ) ε | J 2 ε ( t ) | 2 > δ 2 d 2 ( ε ) e 2 C T 12 12 e 2 C T δ 2 d 2 ( ε ) E ¯ sup 0 t T τ ˜ R ( ε ) ε | J 2 ε ( t ) | 2 ε d 2 ( ε ) C 3 sup g S + , ε M 0 T X | z | 2 g ( s , z ) ν ( d z ) d s C 3 b ( ε ) ( T + d 2 ( ε ) ) 0 .
We estimate now the first term on the right-hand side of (63). For every ε > 0 and t [ 0 , T ] , we write P ¯ -a.s. on the event { T < τ ˜ R ( ε ) ε }
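$$\int_0^t\big(a(X^{\varepsilon}_{s_\Delta},\hat Y^{\varepsilon}(s))-\bar a(X^{\varepsilon}_s)\big)\,ds=\sum_{k=0}^{\lfloor t/\Delta\rfloor-1}\int_{k\Delta}^{(k+1)\Delta}\big(a(X^{\varepsilon}_{k\Delta},\hat Y^{\varepsilon}(s))-\bar a(X^{\varepsilon}_{k\Delta})\big)\,ds+\sum_{k=0}^{\lfloor t/\Delta\rfloor-1}\int_{k\Delta}^{(k+1)\Delta}\big(\bar a(X^{\varepsilon}_{k\Delta})-\bar a(X^{\varepsilon}_s)\big)\,ds+\int_{t_\Delta}^{t}\big(a(X^{\varepsilon}_{s_\Delta},\hat Y^{\varepsilon}(s))-\bar a(X^{\varepsilon}_s)\big)\,ds=:I_1^{\varepsilon}+I_2^{\varepsilon}+I_3^{\varepsilon}.\tag{66}$$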
It follows from (66) that
$$\bar{\mathbb P}\Big(\sup_{0\le s\le t}\Big|\int_0^s\big(a(X^{\varepsilon}_{u_\Delta},\hat Y^{\varepsilon}(u))-\bar a(X^{\varepsilon}_u)\big)\,du\Big|^{2}\,\mathbf 1\{T<\tilde\tau^{\varepsilon}_{R(\varepsilon)}\}>\frac{\delta^{2}d^{2}(\varepsilon)e^{-2CT}}{12}\Big)\le\sum_{i=1}^{3}\bar{\mathbb P}\Big(\sup_{0\le t\le T}|I_i^{\varepsilon}(t)|\,\mathbf 1\{T<\tilde\tau^{\varepsilon}_{R(\varepsilon)}\}>\frac{\delta\,d(\varepsilon)e^{-CT}}{6\sqrt3}\Big).$$
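The level $6\sqrt{3}$ in the denominators can be traced back to the previous threshold. A minimal check of the arithmetic, reading the thresholds as $\delta^{2}d^{2}(\varepsilon)e^{-2CT}/12$ and $\delta d(\varepsilon)e^{-CT}/(6\sqrt3)$:

```latex
% Taking square roots of the level used for the squared integral:
\[
  \sqrt{\frac{\delta^{2}d^{2}(\varepsilon)e^{-2CT}}{12}}
  \;=\;\frac{\delta\,d(\varepsilon)e^{-CT}}{2\sqrt{3}} .
\]
% If |I_1 + I_2 + I_3| > y, then |I_i| > y/3 for at least one i, hence each
% of the three terms is compared with
\[
  \frac13\cdot\frac{\delta\,d(\varepsilon)e^{-CT}}{2\sqrt{3}}
  \;=\;\frac{\delta\,d(\varepsilon)e^{-CT}}{6\sqrt{3}} .
\]
```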
We estimate I 2 ε .
We observe that for any ε > 0
$$I_2^{\varepsilon}=\int_0^{t_\Delta}\big(\bar a(X^{\varepsilon}_{s_\Delta})-\bar a(X^{\varepsilon}_s)\big)\,ds.$$
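Proposition 2 and Lemma 2 imply for some $C_4=C(T)>0$, any $\delta>0$ and $\varepsilon>0$ small enough that
$$\bar{\mathbb P}\Big(\sup_{t\in[0,T]}|I_2^{\varepsilon}(t)|\,\mathbf 1\{T<\tilde\tau^{\varepsilon}_{R(\varepsilon)}\}>\frac{\delta\,d(\varepsilon)e^{-CT}}{6\sqrt3}\Big)\le\bar{\mathbb P}\Big(\sup_{0\le t\le\tilde\tau^{\varepsilon}_{R(\varepsilon)}}\int_0^{t_\Delta}|X^{\varepsilon}_{s_\Delta}-X^{\varepsilon}_s|\,ds>C_4\,d(\varepsilon)\Big)\lesssim_{\varepsilon}\Xi(\varepsilon)\to0\quad\text{as }\varepsilon\to0.$$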
We estimate I 3 ε .
Hypothesis 2, Proposition 2 and Proposition 6 yield some constant C 5 = C 5 ( L , Γ 1 ( M ) ) > 0 that may change from line to line such that, for every ε > 0 small enough and any δ > 0 , one has
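$$\bar{\mathbb P}\Big(\sup_{t\in[0,T]}|I_3^{\varepsilon}(t)|\,\mathbf 1\{T<\tilde\tau^{\varepsilon}_{R(\varepsilon)}\}>\frac{\delta\,d(\varepsilon)e^{-CT}}{6\sqrt3}\Big)\le\frac{C_5}{d^{2}(\varepsilon)}\,\bar{\mathbb E}\Big[\sup_{0\le t\le\tilde\tau^{\varepsilon}_{R(\varepsilon)}}\Big|\int_{t_\Delta}^{t}\big(a(X^{\varepsilon}_{s_\Delta},\hat Y^{\varepsilon}(s))-\bar a(X^{\varepsilon}_s)\big)\,ds\Big|^{2}\Big]\le C_5\,\frac{\Delta(\varepsilon)}{d^{2}(\varepsilon)}\,\bar{\mathbb E}\Big[\int_0^T\big(1+\|X^{\varepsilon}_s\|^{2}+\|X^{\varepsilon}_{s_\Delta}\|^{2}+|Y^{\varepsilon}(s)|^{2}\big)\,\mathbf 1\{T<\tilde\tau^{\varepsilon}_{R(\varepsilon)}\}\,ds\Big]\lesssim_{\varepsilon}\frac{\Delta(\varepsilon)}{d^{2}(\varepsilon)}\to0,\quad\text{as }\varepsilon\to0,$$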
due to (53).
We estimate I 1 ε .
We construct a new process $Z:=Y^{\varepsilon,(X^{\varepsilon}_{k\Delta},\,Y^{\varepsilon}(k\Delta))}$, where the notation stresses that $Z$ is the fast variable process $Y^{\varepsilon}$ with frozen slow component $X^{\varepsilon}_{k\Delta}$ and initial condition $Y^{\varepsilon}(k\Delta)$. It is a classical fact in the Khasminskii technique employed in [15] for the proof of the strong averaging principle that for every $s\in[0,\Delta]$ we have
$$\big(X^{\varepsilon}_{k\Delta},\,Y^{\varepsilon}(s+k\Delta)\big)\overset{d}{=}\Big(X^{\varepsilon}_{k\Delta},\,Y^{\varepsilon,(X^{\varepsilon}_{k\Delta},\,Y^{\varepsilon}(k\Delta))}\big(\tfrac{s}{\varepsilon}\big)\Big).$$
We may assume in addition that the fabricated noises above are independent of X k Δ ε and Y ε ( k Δ ) . For the proof of the statements above, we refer the reader to Section 5 in [25]. Hence, Proposition 3 together with the Markov property of ( X t ε , Y ε ( t ) ) t [ 0 , T ] implies for every k = 0 , , t Δ the following:
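$$\bar{\mathbb E}\Big|\int_{k\Delta}^{(k+1)\Delta}\big(a(X^{\varepsilon}_{k\Delta},\hat Y^{\varepsilon}(s))-\bar a(X^{\varepsilon}_{k\Delta})\big)\,ds\Big|\le\Delta\,\bar{\mathbb E}\Big[\frac{\varepsilon}{\Delta}\Big|\int_0^{\Delta/\varepsilon}\big(a(X^{\varepsilon}_{k\Delta},Z(s))-\bar a(X^{\varepsilon}_{k\Delta})\big)\,ds\Big|\Big]=\Delta\,\bar{\mathbb E}\Big[\bar{\mathbb E}\Big[\Big|\frac{\varepsilon}{\Delta}\int_0^{\Delta/\varepsilon}\big(a(\zeta,Z^{\zeta,y}(s))-\bar a(\zeta)\big)\,ds\Big|\Big]\Big|_{(\zeta,y)=(X^{\varepsilon}_{k\Delta},\,Y^{\varepsilon}(k\Delta))}\Big]\le\Delta\,\alpha\Big(\frac{\Delta}{\varepsilon}\Big)\Big(1+\bar{\mathbb E}\|X^{\varepsilon}_{k\Delta}\|+\bar{\mathbb E}\big[|Y^{\varepsilon}(k\Delta)|\big]\Big).\tag{70}$$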
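The appearance of the window $\Delta/\varepsilon$ and of the normalization $\varepsilon/\Delta$ is a plain change of variables. A sketch, where $F$ is a placeholder for the map $y\mapsto a(x,y)-\bar a(x)$ with the slow argument $x$ frozen (this shorthand is an assumption made here, not notation from the article):

```latex
% Substitute s = k\Delta + \varepsilon u, so that ds = \varepsilon\,du:
\[
  \int_{k\Delta}^{(k+1)\Delta} F\big(Y^{\varepsilon}(s)\big)\,ds
  \;=\;\varepsilon\int_0^{\Delta/\varepsilon} F\big(Y^{\varepsilon}(k\Delta+\varepsilon u)\big)\,du
  \;=\;\Delta\cdot\frac{\varepsilon}{\Delta}\int_0^{\Delta/\varepsilon} F\big(Y^{\varepsilon}(k\Delta+\varepsilon u)\big)\,du .
\]
```

The last factor is a time average over a window of length $\Delta/\varepsilon$ of a process running on the natural time scale of the fast variable, which is the quantity controlled by an ergodic rate such as $\alpha$ when $\Delta/\varepsilon\to\infty$.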
Proposition 3, Proposition 6, (53) and (70) yield, for any δ > 0 and ε > 0 sufficiently small, that
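$$\bar{\mathbb P}\Big(\sup_{0\le t\le T}|I_1^{\varepsilon}(t)|\,\mathbf 1\{T<\tilde\tau^{\varepsilon}_{R(\varepsilon)}\}>\frac{\delta\,d(\varepsilon)e^{-CT}}{6\sqrt3}\Big)\lesssim_{\varepsilon}\frac{1}{d^{2}(\varepsilon)}\,\bar{\mathbb E}\Big[\sup_{0\le t\le T}|I_1^{\varepsilon}(t)|^{2}\,\mathbf 1\{T<\tilde\tau^{\varepsilon}_{R(\varepsilon)}\}\Big]\lesssim_{\varepsilon}\frac{1}{d^{2}(\varepsilon)}\sum_{k=0}^{\lfloor T/\Delta(\varepsilon)\rfloor}\bar{\mathbb E}\Big|\int_{k\Delta}^{(k+1)\Delta}\big(a(X^{\varepsilon}_{k\Delta},\hat Y^{\varepsilon}(s))-\bar a(X^{\varepsilon}_{k\Delta})\big)\,\mathbf 1\{T<\tilde\tau^{\varepsilon}_{R(\varepsilon)}\}\,ds\Big|^{2}\lesssim_{\varepsilon}\frac{\Delta(\varepsilon)}{d^{2}(\varepsilon)}\,\alpha\Big(\frac{\Delta}{\varepsilon}\Big)\to0\quad\text{as }\varepsilon\to0.$$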
The convergence above follows from the choice of the parametrization Δ = Δ ( ε ) fixed in (52) and α constructed in Proposition 3. □

3.4.4. Proof of Theorem 3.2

For any ε > 0 , fix R ( ε ) > 0 such as in Proposition 5 and recall the definition of τ ˜ R ( ε ) ε in (44).
For any δ > 0 , we have
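$$\limsup_{\varepsilon\to0}\bar{\mathbb P}\Big(\sup_{0\le t\le T}|Z^{\varepsilon}(t)-\bar Z^{\varepsilon}(t)|>\delta\Big)\le\limsup_{\varepsilon\to0}\bar{\mathbb P}\Big(\sup_{0\le t\le T}|X^{\varepsilon}(t)-\bar X^{\varepsilon}(t)|>\delta\,d(\varepsilon)\Big)\le\limsup_{\varepsilon\to0}\bar{\mathbb P}\Big(\sup_{0\le t\le\tilde\tau^{\varepsilon}_{R(\varepsilon)}}|X^{\varepsilon}(t)-\hat X^{\varepsilon}(t)|>\frac{\delta\,d(\varepsilon)}{2}\Big)+\limsup_{\varepsilon\to0}\bar{\mathbb P}\Big(\sup_{0\le t\le\tilde\tau^{\varepsilon}_{R(\varepsilon)}}|\hat X^{\varepsilon}(t)-\bar X^{\varepsilon}(t)|>\frac{\delta\,d(\varepsilon)}{2}\Big)+\limsup_{\varepsilon\to0}\bar{\mathbb P}\big(\tilde\tau^{\varepsilon}_{R(\varepsilon)}\le T\big)=0,$$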
due to Propositions 5, 9 and 10. □

3.5. Conclusions

  • Conclusion-Proof of Theorem 2
We recall the collection of measurable maps ( G ε ) ε > 0 introduced in (40) and G 0 defined by means of the skeleton Equation (32). We note that Proposition 4 reads as Condition 1 of Hypothesis 6 for ( G ε ) ε > 0 and G 0 . Proposition 8 combined with Theorem 4 yields, due to Slutsky’s theorem, that Condition 2 of Hypothesis 6 is verified for ( G ε ) ε > 0 and G 0 . Hence, the result follows from Theorem 3. □
  • Conclusion from the main result.
The work presented in this article shows how robust the use of the weak convergence approach is in the proof of a moderate deviation principle for a slow–fast system of stochastic equations given by (13). More precisely, the work presented here reduces the usual proof of exponential tightness for the family ( X ε ) ε > 0 to the proof of a controlled stochastic averaging principle (Theorem 4) that follows from the Markov property of the system and easier weak compactness arguments.

Author Contributions

The authors contributed equally to the work shown in this article. Conceptualization, A.d.O.G. and P.C.; methodology, A.d.O.G.; validation, A.d.O.G. and P.C.; formal analysis, A.d.O.G.; investigation, A.d.O.G. and P.C.; writing—original draft preparation, A.d.O.G.; writing—review and editing, A.d.O.G. and P.C.; supervision, P.C.; project administration, P.C.; funding acquisition, P.C. All authors have read and agreed to the published version of the manuscript.

Funding

The authors acknowledge and thank the financial support from the FAPESP grant number 2018/06531-1 at the University of Campinas (UNICAMP), SP-Brazil.

Data Availability Statement

Data is contained within the article.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Appendix A.1. Auxiliary Results for the Derivation of the Moderate Deviation Principle

Appendix A.1.1. Integrability Properties of the Controls

The following lemma is heavily used in the derivation of the moderate deviation principle stated in Theorem 2. We refer the reader to Section 3.1 for notation.
Lemma A1. 
Fix M > 0 and ν M a measure satisfying the Hypothesis 1. The following holds.
1. 
There exists τ > 0 such that for all ε > 0 we have
$$\sup_{g\in S^{M}_{+,\varepsilon}}\int_I\int_{\mathbb X}|z|^{2}\,g(s,z)\,\nu(dz)\,ds<\tau\big(d^{2}(\varepsilon)+|I|\big),\tag{A1}$$
$$\sup_{g\in S^{M}_{+,\varepsilon}}\int_I\int_{\mathbb X}|z|\,|g(s,z)-1|\,\nu(dz)\,ds<\tau\big(d(\varepsilon)+|I|\big)\tag{A2}$$
and there exists $\tilde\tau>0$ yielding for all $\varepsilon,\beta>0$ some $c(\beta)\to0$ as $\beta\to\infty$ and such that
$$\sup_{h\in S^{M}_{\varepsilon}}\int_I\int_{\mathbb X}|z|\,|h(s,z)|\,\nu(dz)\,ds<\tilde\tau\big(|I|+|I|+d(\varepsilon)+c(\beta)\big),$$
for any Borel measurable I [ 0 , T ] .
2. 
For every ε > 0 , let ψ ε U ε M . We assume that for some β ( 0 , 1 ) the following convergence in law holds, ψ ε 1 { | ψ ε | < β d ( ε ) } ψ in the compact ball B 2 ( M κ 2 ( 1 ) ) , where κ 2 ( 1 ) is given by Remark 4. Then, the following convergence in distribution holds, for every t [ 0 , T ] ,
0 t X | z | ψ ε ( s , z ) ν ( d z ) d s 0 t X | z | r ψ ( s , z ) ν ( d z ) d s .
For the proof of the first statement we refer the reader to Lemma 2.1 in [63]. The conclusion of the second statement is proved as in Lemma 4.8 of [46].

Appendix A.2. Auxiliary Estimates for the Controlled Averaging Principle

Appendix A.2.1. Proof of Lemma 3.2

For any ε > 0 we fix Δ : = Δ ( ε ) given by (52), d ( ε ) given in (35) and R ( ε ) > 0 such as in Proposition 5. We recall that due to Proposition 6 we have for any ε > 0 small enough that
$$\sup_{0<\varepsilon<\varepsilon_0}\bar{\mathbb E}\Big[\sup_{0\le t\le\tilde\tau^{\varepsilon}_{R(\varepsilon)}}\|X^{\varepsilon}_t\|\Big]<\infty,$$
where τ ˜ R ( ε ) ε is the F ¯ -stopping time defined by (44).
Let us work on the event $\{T<\tilde\tau^{\varepsilon}_{R(\varepsilon)}\}$. Fix $\varepsilon>0$, $t\in[0,T]$ and $t_\Delta:=\lfloor t/\Delta\rfloor\Delta$. For every $\varepsilon>0$, let $K_\varepsilon:=T/\Delta(\varepsilon)\in\mathbb N$ and $N_\varepsilon:=\tau/\Delta(\varepsilon)\in\mathbb N$. For any $k=0,\dots,K_\varepsilon-1$ and $m=0,\dots,N_\varepsilon-1$ we label $I_k^{\varepsilon}:=[k\Delta,(k+1)\Delta]$ and $J_m^{\varepsilon}:=[-(m+1)\Delta,-m\Delta]$.
Given $t\in[0,T]$ and $\theta\in[-\tau,0]$, let $k,m\ge0$ be such that $t\in[k\Delta,(k+1)\Delta]$ and $\theta\in[-(m+1)\Delta,-m\Delta]$. It is immediate that
$$t+\theta\in[(k-m-1)\Delta,(k+1-m)\Delta]\quad\text{and}\quad t_\Delta+\theta\in[(k-m-1)\Delta,(k-m)\Delta].$$
We have to distinguish three possible cases:
(i) $m\le k-1$;
(ii) $m\ge k+1$ and
(iii) $m=k$.
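The three cases correspond to where the shifted times fall relative to the origin, that is, to whether the solution or the initial segment $\chi$ is being evaluated. A short check, using only the grid intervals introduced above and taking $t_\Delta=k\Delta$:

```latex
% With t_\Delta = k\Delta, t \in [k\Delta,(k+1)\Delta] and
% \theta \in [-(m+1)\Delta, -m\Delta]:
\[
  (k-m-1)\Delta \le t_\Delta+\theta \le (k-m)\Delta,
  \qquad
  (k-m-1)\Delta \le t+\theta \le (k+1-m)\Delta .
\]
\begin{align*}
  \text{(i)}\;\; m\le k-1 &\;\Longrightarrow\; (k-m-1)\Delta\ge 0
      \;\Longrightarrow\; t+\theta\ge 0 \text{ and } t_\Delta+\theta\ge 0,\\
  \text{(ii)}\;\; m\ge k+1 &\;\Longrightarrow\; (k+1-m)\Delta\le 0
      \;\Longrightarrow\; t+\theta\le 0 \text{ and } t_\Delta+\theta\le 0,\\
  \text{(iii)}\;\; m=k &\;\Longrightarrow\; t+\theta\in[-\Delta,\Delta]
      \text{ and } t_\Delta+\theta\in[-\Delta,0].
\end{align*}
```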
It follows that
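$$\sup_{0\le t\le T}\|X^{\varepsilon}_t-X^{\varepsilon}_{t_\Delta}\|=\sup_{0\le t\le T}\,\sup_{-\tau\le\theta\le0}|X^{\varepsilon}(t+\theta)-X^{\varepsilon}(t_\Delta+\theta)|=\sup_{t\in\bigcup_{k=0}^{K_\varepsilon-1}I_k^{\varepsilon}}\,\sup_{\theta\in\bigcup_{m=0}^{N_\varepsilon-1}J_m^{\varepsilon}}|X^{\varepsilon}(t+\theta)-X^{\varepsilon}(t_\Delta+\theta)|.$$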
Let us fix ( g ε ) ε > 0 such that g ε ε d ( ε ) as ε 0 . It follows that
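$$\bar{\mathbb P}\Big(\sup_{0\le t\le T}\|X^{\varepsilon}_t-X^{\varepsilon}_{t_\Delta}\|>g_\varepsilon;\;T<\tilde\tau^{\varepsilon}_{R(\varepsilon)}\Big)\le N_\varepsilon K_\varepsilon\max_{\substack{k=0,\dots,K_\varepsilon-1\\ m=0,\dots,N_\varepsilon-1}}\bar{\mathbb P}\Big(\sup_{\substack{k\Delta\le t\le(k+1)\Delta\\ -(m+1)\Delta\le\theta\le-m\Delta}}|X^{\varepsilon}(t+\theta)-X^{\varepsilon}(t_\Delta+\theta)|>g_\varepsilon;\;T<\tilde\tau^{\varepsilon}_{R(\varepsilon)}\Big)=:K_\varepsilon N_\varepsilon\big(p_1^{\varepsilon}+p_2^{\varepsilon}+p_3^{\varepsilon}\big),$$
where
$$p_1^{\varepsilon}:=\bar{\mathbb P}\Big(\sup_{\substack{k\Delta\le t\le(k+1)\Delta\\ -(m+1)\Delta\le\theta\le-m\Delta}}|X^{\varepsilon}(t+\theta)-X^{\varepsilon}(t_\Delta+\theta)|>g_\varepsilon;\;m\le k-1;\;T<\tilde\tau^{\varepsilon}_{R(\varepsilon)}\Big),$$
$$p_2^{\varepsilon}:=\bar{\mathbb P}\Big(\sup_{\substack{k\Delta\le t\le(k+1)\Delta\\ -(m+1)\Delta\le\theta\le-m\Delta}}|X^{\varepsilon}(t+\theta)-X^{\varepsilon}(t_\Delta+\theta)|>g_\varepsilon;\;m\ge k+1;\;T<\tilde\tau^{\varepsilon}_{R(\varepsilon)}\Big)$$
and
$$p_3^{\varepsilon}:=\bar{\mathbb P}\Big(\sup_{\substack{k\Delta\le t\le(k+1)\Delta\\ -(m+1)\Delta\le\theta\le-m\Delta}}|X^{\varepsilon}(t+\theta)-X^{\varepsilon}(t_\Delta+\theta)|>g_\varepsilon;\;m=k;\;T<\tilde\tau^{\varepsilon}_{R(\varepsilon)}\Big).$$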
Case (i): $m\le k-1$.
In this case we have that t + θ > 0 and t Δ + θ > 0 . Then, we have that
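$$X^{\varepsilon}(t+\theta)-X^{\varepsilon}(t_\Delta+\theta)=\int_{t_\Delta+\theta}^{t+\theta}\Big(a(X^{\varepsilon}_s,Y^{\varepsilon}(s))+\sigma(X^{\varepsilon}_s)\,\xi_1^{\varepsilon}(s)+\int_{\mathbb X}c(X^{\varepsilon}_s,z)\big(\varphi^{\varepsilon}(s,z)-1\big)\,\nu(dz)\Big)\,ds+\sqrt{\varepsilon}\int_{t_\Delta+\theta}^{t+\theta}\sigma(X^{\varepsilon}_s)\,dB_1(s)+\varepsilon\int_{t_\Delta+\theta}^{t+\theta}\int_{\mathbb X}c(X^{\varepsilon}_s,z)\,\tilde N^{\frac1\varepsilon\varphi^{\varepsilon}}(ds,dz).\tag{A5}$$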
Let us fix the parametrization $L=L_\varepsilon>0$, given by
$$L_\varepsilon:=\frac{d^{2}(\varepsilon)}{|\ln\varepsilon|^{q}}\quad\text{for some }q>2\gamma+3,\ \varepsilon>0.\tag{A6}$$
The Bernstein inequality given in the form of Theorem 3.3. in [54] implies for every ε > 0 that
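$$p_1^{\varepsilon}\lesssim_{\varepsilon}e^{-\frac{g_\varepsilon^{2}}{L_\varepsilon}}+\bar{\mathbb P}\Big(\big[X^{\varepsilon}-X^{\varepsilon}(t_\Delta+\theta)\big]_{(k+1)\Delta-m\Delta}>L_\varepsilon;\;m\le k-1;\;T<\tilde\tau^{\varepsilon}_{R(\varepsilon)}\Big).$$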
Due to (A5) it follows for any ε > 0 on the event { T < τ ˜ R ( ε ) ε } that
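$$\big[X^{\varepsilon}-X^{\varepsilon}(t_\Delta+\theta)\big]_{(k+1)\Delta-m\Delta}\lesssim_{\varepsilon}\varepsilon\Delta\big(1+R^{2}(\varepsilon)\big)+\varepsilon^{2}\int_{t_\Delta-(m+1)\Delta}^{(k+1)\Delta-m\Delta}\int_{\mathbb X}|z|^{2}\,N^{\frac1\varepsilon\varphi^{\varepsilon}}(ds,dz)=:\varepsilon\big(1+R^{2}(\varepsilon)\big)\Delta+\varepsilon^{2}I^{\varepsilon}_{(k+1)\Delta-m\Delta}.$$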
Due to the choice of L ε in (A6) and Δ ( ε ) in (52) let ε 0 > 0 sufficiently small such that for any ε < ε 0 we have ε ( 1 + R 2 ( ε ) ) ε γ | ln ε | p q < 1 2 . Then, it follows that
ε ( 1 + R 2 ( ε ) ) Δ = ε ( 1 + R 2 ( ε ) ) ε γ | ln ε | p q d 2 ( ε ) | ln ε | q < L ε 2
for every ε < ε 0 .
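The estimate (A1) in Lemma A1 (Appendix A.1) implies, for any $\varepsilon>0$ small enough such that (A7) holds, that
$$\bar{\mathbb P}\Big(\big[X^{\varepsilon}-X^{\varepsilon}(t_\Delta+\theta)\big]_{(k+1)\Delta-m\Delta}>L_\varepsilon;\;m\le k-1;\;T<\tilde\tau^{\varepsilon}_{R(\varepsilon)}\Big)\le\bar{\mathbb P}\Big(\varepsilon^{2}I^{\varepsilon}_{(k+1)\Delta-m\Delta}>\frac{L_\varepsilon}{2}\Big)\lesssim_{\varepsilon}\frac{2\varepsilon^{2}}{L_\varepsilon}\,\bar{\mathbb E}\big[I^{\varepsilon}_{(k+1)\Delta-m\Delta}\big]\lesssim_{\varepsilon}\frac{\varepsilon}{L_\varepsilon}\int_{t_\Delta-(m+1)\Delta}^{(k+1)\Delta-m\Delta}\int_{\mathbb X}|z|^{2}\,\varphi^{\varepsilon}(s,z)\,\nu(dz)\,ds\lesssim_{\varepsilon}\frac{\varepsilon}{L_\varepsilon}\big(d^{2}(\varepsilon)+\Delta\big).$$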
Due to (53), (A6) and d ( ε ) 0 as ε 0 we conclude for every ε > 0 small enough that
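$$p_1^{\varepsilon}\lesssim_{\varepsilon}e^{-\frac{g_\varepsilon^{2}}{L_\varepsilon}}+\frac{\varepsilon}{L_\varepsilon}\big(d^{2}(\varepsilon)+\Delta(\varepsilon)\big)\to0\quad\text{as }\varepsilon\to0.\tag{A8}$$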
Case (ii): $m\ge k+1$.
In this case we have that t + θ < 0 and t Δ + θ < 0 . Since the initial delay χ is Lipschitz continuous (cf. (17)) it follows that
$$|X^{\varepsilon}(t+\theta)-X^{\varepsilon}(t_\Delta+\theta)|=|\chi(t+\theta)-\chi(t_\Delta+\theta)|\le\lambda\,|t-t_\Delta|.$$
Then, for any ε > 0 we have
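$$p_2^{\varepsilon}=\bar{\mathbb P}\Big(\sup_{\substack{k\Delta\le t\le(k+1)\Delta\\ -(m+1)\Delta\le\theta\le-m\Delta}}|X^{\varepsilon}(t+\theta)-X^{\varepsilon}(t_\Delta+\theta)|^{4}>(g_\varepsilon)^{4};\;m\ge k+1;\;T<\tilde\tau^{\varepsilon}_{R(\varepsilon)}\Big)\lesssim_{\varepsilon}\frac{1}{(g_\varepsilon)^{4}}\,\bar{\mathbb E}\Big[\sup_{\substack{k\Delta\le t\le(k+1)\Delta\\ -(m+1)\Delta\le\theta\le-m\Delta}}|X^{\varepsilon}(t+\theta)-X^{\varepsilon}(t_\Delta+\theta)|^{4}\Big]\lesssim_{\varepsilon}\frac{\Delta(\varepsilon)}{g_\varepsilon^{4}}\to0\quad\text{as }\varepsilon\to0,\tag{A9}$$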
due to the definition of Δ ( ε ) in (52) and g ε ε d ( ε ) as ε 0 .
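Since both shifted times fall in the initial segment in this case, the bound is deterministic. A sketch of the two steps, with $g>0$ a generic threshold standing in for $g_\varepsilon$ and under the extra assumption $\Delta\le1$:

```latex
% Lipschitz continuity of the initial segment \chi with constant \lambda,
% together with 0 \le t - t_\Delta \le \Delta:
\[
  |\chi(t+\theta)-\chi(t_\Delta+\theta)|\;\le\;\lambda\,|t-t_\Delta|\;\le\;\lambda\,\Delta .
\]
% Markov's inequality applied to the fourth power, for a threshold g > 0:
\[
  \mathbb P\Big(\sup_{t,\theta}|\chi(t+\theta)-\chi(t_\Delta+\theta)|^{4}>g^{4}\Big)
  \;\le\;\frac{\lambda^{4}\Delta^{4}}{g^{4}}
  \;\le\;\frac{\lambda^{4}\,\Delta}{g^{4}}
  \qquad\text{whenever }\Delta\le 1 .
\]
```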
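Case (iii): $m=k$.
In this case we have $t+\theta\in[-\Delta,\Delta]$ and $t_\Delta+\theta\in[-\Delta,0]$. It is immediate that
$$|X^{\varepsilon}(t+\theta)-X^{\varepsilon}(t_\Delta+\theta)|=|X^{\varepsilon}(t+\theta)-X^{\varepsilon}(t_\Delta+\theta)|\,\mathbf 1\{t+\theta>0\}+|X^{\varepsilon}(t+\theta)-X^{\varepsilon}(t_\Delta+\theta)|\,\mathbf 1\{t+\theta<0\}.$$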
Due to the two previous cases already analyzed we have, for any ε > 0 small enough, that
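$$p_3^{\varepsilon}\le\bar{\mathbb P}\Big(\sup_{\substack{k\Delta\le t\le(k+1)\Delta\\ -(k+1)\Delta\le\theta\le-k\Delta}}|X^{\varepsilon}(t+\theta)-X^{\varepsilon}(t_\Delta+\theta)|>g_\varepsilon;\;t+\theta>0;\;T<\tilde\tau^{\varepsilon}_{R(\varepsilon)}\Big)+\bar{\mathbb P}\Big(\sup_{\substack{k\Delta\le t\le(k+1)\Delta\\ -(k+1)\Delta\le\theta\le-k\Delta}}|X^{\varepsilon}(t+\theta)-X^{\varepsilon}(t_\Delta+\theta)|>g_\varepsilon;\;t+\theta<0;\;T<\tilde\tau^{\varepsilon}_{R(\varepsilon)}\Big)\lesssim_{\varepsilon}e^{-\frac{g_\varepsilon^{2}}{L_\varepsilon}}+\frac{\varepsilon}{L_\varepsilon}\big(d^{2}(\varepsilon)+\Delta\big)+\frac{\Delta}{g_\varepsilon^{4}}\to0\quad\text{as }\varepsilon\to0.\tag{A10}$$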
Combining (A8)–(A10) it follows, for Δ ( ε ) , L ε given by (52) and respectively (A6) and any ε > 0 small enough, that
P ¯ sup 0 t T | X t ε X t Δ ε | > g ε ; T < τ ˜ R ( ε ) ε ε N ε K ε e g ε 2 L ε + ε L ε d 2 ( ε ) + Δ + Δ g ε 4 ε 1 ( Δ ( ε ) ) 2 e | ln ε | q + ε | ln ε | q + b ( ε ) | ln ε | q Δ + Δ ( ε ) d ( ε ) 4 ε ε q ε 2 γ a 4 ( ε ) | ln ε | 2 p + ε ε 2 γ a 4 ( ε ) | ln ε | 2 p q + ε ε γ a 4 ( ε ) | ln ε | p q + d 2 ( ε ) ε 2 γ | ln ε | 2 p ε b 2 ( ε ) ε | ln ε | 2 p + ε 2 θ 1 2 γ | ln ε | 2 p q + ε 2 θ 1 γ | ln ε | p q + ε γ ε 1 θ | ln ε | 2 p = : Ξ ( ε ) .
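Since $\gamma\in\big(0,\theta-\tfrac12\big)$, $d(\varepsilon)=\varepsilon^{\frac{1-\theta}{2}}$, $\theta\in\big(\tfrac12,1\big)$ and $b(\varepsilon)=\frac{\varepsilon}{d^{2}(\varepsilon)}$, we conclude that $\Xi(\varepsilon)\to0$ as $\varepsilon\to0$. This finishes the proof. □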
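To make the role of the exponent $\theta$ concrete, the scales can be written as pure powers of $\varepsilon$. The sketch below assumes the reading $d(\varepsilon)=\varepsilon^{(1-\theta)/2}$ with $\theta\in(1/2,1)$ and $b(\varepsilon)=\varepsilon/d^{2}(\varepsilon)$; if the article's parametrization differs, the exponents change accordingly.

```latex
% Assumed scales: d(\varepsilon) = \varepsilon^{(1-\theta)/2}, \theta \in (1/2, 1).
\[
  d^{2}(\varepsilon)=\varepsilon^{1-\theta},
  \qquad
  b(\varepsilon)=\frac{\varepsilon}{d^{2}(\varepsilon)}=\varepsilon^{\theta},
  \qquad
  \frac{b(\varepsilon)}{d^{2}(\varepsilon)}=\varepsilon^{2\theta-1}.
\]
% Since 1/2 < \theta < 1, all three exponents 1-\theta, \theta and 2\theta-1
% are strictly positive, so each quantity tends to 0 as \varepsilon \to 0.
```

Under this reading the condition $\theta>1/2$ is exactly what makes $\varepsilon^{2\theta-1}$ vanish, and $\gamma<\theta-\tfrac12$ leaves room for an additional factor $\varepsilon^{-2\gamma}$.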

Appendix A.2.2. Proof of Lemma 3

Ito’s formula yields for any t [ t Δ , t Δ + 1 ] and P ¯ -a.s.
| Y ^ ε ( t ) Y ε ( t ) | 2 = 2 ε t Δ t f ( X t Δ ε , Y ^ ε ( s ) ) f ( X s ε , Y ε ( s ) ) , Y ^ ε ( s ) Y ε ( s ) d s + 2 ε t Δ t ( g ( X t Δ ε , Y ^ ε ( s ) ) g ( X s ε , Y ε ( s ) ) ) ξ 2 ε ( s ) , Y ^ ε ( s ) Y ε ( s ) d s + 2 ε t t Δ X h ( X t Δ ε , Y ^ ε ( s ) , z ) h ( X s ε , Y ε ( s ) , z ) , Y ^ ε ( s ) Y ε ( s ) ( φ ε ( s , z ) 1 ) ν ( d z ) d s + 2 ε t Δ t g ( X t Δ ε , Y ^ ε ( s ) ) g ( X s ε , Y ε ( s ) ) , ( Y ^ ε ( s ) Y ε ( s ) ) d B 2 ( s ) + 1 ε t Δ t | g ( X t Δ ε , Y ^ ε ( s ) ) g ( X s ε , Y ε ( s ) ) | 2 d s + t Δ t X 2 h ( X t Δ ε , Y ^ s ε , z ) h ( X s ε , Y s ε , z ) , Y s ε Y s ε N ˜ 1 ε φ ε ( d s , d z ) + t Δ t X | h ( X t Δ ε , y ^ Y s ε , z ) h ( X s ε , Y s ε , z ) , Y ^ s ε Y s ε | 2 N ˜ 1 ε φ ε ( d s , d z ) + 1 ε t Δ t X | h ( X t Δ ε , y ^ Y ε ( s ) ) h ( X s ε , Y ε ( s ) ) | 2 φ ε ( s , z ) ν ( d z ) d s = i = 1 8 I ε ( t ) .
Using (21) in Hypothesis 4 yields for any ε > 0 and t [ t Δ , t Δ + 1 ]
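$$I_1^{\varepsilon}(t)\le-\frac{2\beta_1}{\varepsilon}\int_{t_\Delta}^{t}|\hat Y^{\varepsilon}(s)-Y^{\varepsilon}(s)|^{2}\,ds+\frac{2\beta_2\Delta}{\varepsilon}\,\|X^{\varepsilon}_{t_\Delta}-X^{\varepsilon}_t\|^{2}.\tag{A11}$$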
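The boundedness of $g$ given by (18) in Hypothesis 4, the fact that $\xi^{\varepsilon}\in\tilde U^{M}_{\varepsilon}$ and the Cauchy–Schwarz inequality imply for any $\varepsilon>0$ and $t\in[t_\Delta,t_{\Delta+1}]$ that
$$I_2^{\varepsilon}(t)\le\frac{4\Lambda M\,d(\varepsilon)}{\varepsilon}\Big(1+\int_{t_\Delta}^{t}|\hat Y^{\varepsilon}(s)-Y^{\varepsilon}(s)|^{2}\,ds\Big).\tag{A12}$$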
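Analogously, (21) in Hypothesis 4 together with (A1) and (A2) given in Lemma A1 of Appendix A.1, combined with the elementary inequality $x\le\lambda x^{2}+\frac1\lambda$ for $x\ge0$ and $\lambda>0$, yield some $C_1=C_1(M,\Lambda)>0$ such that for any $\varepsilon>0$ and $t\in[t_\Delta,t_{\Delta+1}]$ we have
$$I_3^{\varepsilon}(t)\le\frac{C_1}{\varepsilon\lambda}\big(d(\varepsilon)+\Delta\big)+\frac{C_1\lambda}{\varepsilon}\int_{t_\Delta}^{t}|\hat Y^{\varepsilon}(s)-Y^{\varepsilon}(s)|^{2}\,\Theta^{\varepsilon}(s)\,ds\quad\text{and}\quad I_5^{\varepsilon}(t)+I_8^{\varepsilon}(t)\le\frac{C}{\varepsilon}\big(d^{2}(\varepsilon)+\Delta\big),\tag{A13}$$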
where Θ ε ( t ) : = 0 t | z | | φ ε ( s , z ) 1 | ν ( d z ) , t [ 0 , T ] .
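The elementary inequality invoked for $I_3^{\varepsilon}$ can be verified by completing the square. The following check is a standard argument added for convenience and uses no notation beyond $x$ and $\lambda$:

```latex
% For x \ge 0 and \lambda > 0:
\[
  0\;\le\;\Big(\sqrt{\lambda}\,x-\frac{1}{\sqrt{\lambda}}\Big)^{2}
  \;=\;\lambda x^{2}-2x+\frac{1}{\lambda}
  \quad\Longrightarrow\quad
  x\;\le\;2x\;\le\;\lambda x^{2}+\frac{1}{\lambda}.
\]
```

The free parameter $\lambda$ is what allows the linear term in $|\hat Y^{\varepsilon}-Y^{\varepsilon}|$ to be traded for a quadratic term, suitable for Gronwall's lemma, plus a remainder of size $1/\lambda$.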
The estimates (A11)–(A13) imply for t [ t Δ , t Δ + 1 ] , ε > 0 and λ = λ ( ε ) > 0 fixed below the following P ¯ -a.s. bound on the event { T < τ ˜ R ( ε ) ε } :
| Y ^ ε ( t ) Y ε ( t ) | 2 ε t Δ t 1 ε 1 + d ( ε ) + λ ( ε ) Θ ε ( s ) | Y ^ ε ( s ) Y ε ( s ) | 2 d s + C 2 ( ε ) + I 4 ε ( t ) + I 6 ε ( t ) + I 7 ε ( t ) ,
where
C 2 ( ε ) ε 1 ε Δ ( ε ) R 2 ( ε ) + d ( ε ) + d ( ε ) λ ( ε ) ( 1 + Δ ( ε ) ) + d 2 ( ε ) + Δ ( ε ) as ε 0 .
Due to Gronwall’s lemma, the estimate (A2) in Lemma A1 (Section A.1 of the Appendix A) and the fact that E ¯ [ I 4 ε ] = E ¯ [ I 6 ε ] = E ¯ [ I 7 ε ] = 0 it follows, for any ε > 0 , λ = λ ( ε ) = ε and t [ t Δ , t Δ + 1 ] that
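$$\bar{\mathbb E}\big[|\hat Y^{\varepsilon}(t)-Y^{\varepsilon}(t)|^{2}\,\mathbf 1\{T<\tilde\tau^{\varepsilon}_{R(\varepsilon)}\}\big]\lesssim_{\varepsilon}C_2(\varepsilon)\exp\Big(-\frac{\Delta(\varepsilon)}{\varepsilon}\big(1-d(\varepsilon)\big)+d(\varepsilon)+\Delta(\varepsilon)\Big).$$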
Let ε 0 > 0 small enough such that 1 d ( ε ) > 1 2 and d ( ε ) + Δ ( ε ) < 1 for any ε < ε 0 . Therefore we have for any ε > 0 small enough and t [ 0 , T ] that
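$$\bar{\mathbb E}\big[|\hat Y^{\varepsilon}(t)-Y^{\varepsilon}(t)|^{2}\,\mathbf 1\{T<\tilde\tau^{\varepsilon}_{R(\varepsilon)}\}\big]\lesssim_{\varepsilon}C_2(\varepsilon)\,\Delta(\varepsilon)\,e^{-\frac{\Delta(\varepsilon)}{2\varepsilon}+1}\to0\quad\text{as }\varepsilon\to0,$$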
due to the choice of Δ ( ε ) fixed in (52). □
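The passage from the exponent produced by Gronwall's lemma to the simpler exponent in the last display only uses the two smallness conditions imposed on $\varepsilon_0$. A sketch, reading the first exponent as $-\frac{\Delta(\varepsilon)}{\varepsilon}(1-d(\varepsilon))+d(\varepsilon)+\Delta(\varepsilon)$:

```latex
% Assume 1 - d(\varepsilon) > 1/2 and d(\varepsilon) + \Delta(\varepsilon) < 1. Then
\[
  -\frac{\Delta(\varepsilon)}{\varepsilon}\big(1-d(\varepsilon)\big)+d(\varepsilon)+\Delta(\varepsilon)
  \;\le\;-\frac{\Delta(\varepsilon)}{2\varepsilon}+1,
\]
% and therefore
\[
  \exp\Big(-\frac{\Delta(\varepsilon)}{\varepsilon}\big(1-d(\varepsilon)\big)+d(\varepsilon)+\Delta(\varepsilon)\Big)
  \;\le\;e\cdot e^{-\frac{\Delta(\varepsilon)}{2\varepsilon}} .
\]
```

The right-hand side is small precisely when $\Delta(\varepsilon)/\varepsilon\to\infty$, which is the kind of scale separation a choice of $\Delta(\varepsilon)$ as in (52) is designed to provide.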

References

  1. Mao, X. Stochastic Differential Equations and Applications, 2nd ed.; Horwood Publishing Limited: Chichester, UK, 2008.
  2. Baños, D.R.; Cordoni, F.; Di Nunno, G.; Di Persio, L.; Rose, E.E. Stochastic systems with memory and jumps. J. Diff. Eq. 2019, 226, 5772–5820.
  3. Weinan, E.; Liu, S.; Vanden-Eijnden, E. Analysis of multiscale methods of stochastic differential equations. Comm. Pure Appl. Math. 2005, LVIII, 1544–1585.
  4. Pavliotis, G.; Stuart, A. Multiscale Methods: Averaging and Homogenization; Texts in Applied Mathematics; Springer: Berlin/Heidelberg, Germany, 2008; Volume 53.
  5. Fouque, J.-P.; Papanicolaou, G.; Sircar, K.R. Derivatives in Financial Markets with Stochastic Volatility; Cambridge University Press: Cambridge, UK, 2000.
  6. Fouque, J.-P.; Papanicolaou, G.; Sircar, K.R.; Solna, K. Multiscale stochastic volatility asymptotics. Multiscale Model. Simul. 2003, 2, 22–42.
  7. Kifer, Y. Averaging and climate models. In Stochastic Climate Models; Imkeller, P., Storch, J.-S.G., Eds.; Progress in Probability; Birkhäuser Verlag: Basel, Switzerland, 2001; Volume 49.
  8. Debussche, A.; Högele, M.; Imkeller, P. The Dynamics of Nonlinear Reaction-Diffusion Equations with Small Lévy Noise; Lecture Notes in Mathematics; Springer: Berlin/Heidelberg, Germany, 2013; Volume 2085.
  9. Arnold, L. Hasselmann’s program revisited: The analysis of stochasticity in deterministic climate models. In Stochastic Climate Models; Imkeller, P., Storch, J.-S.G., Eds.; Progress in Probability; Birkhäuser Verlag: Basel, Switzerland, 2001; Volume 49.
  10. Dijkstra, H.A. Nonlinear Climate Dynamics; Cambridge University Press: New York, NY, USA, 2013.
  11. Ditlevsen, P.D. Observation of a stable noise induced millennial climate changes from an ice-core record. Geophys. Res. Lett. 1999, 26, 1441–1444.
  12. Gairing, J.; Högele, M.; Kosenkova, T.; Kulik, A. On the Calibration of Lévy Driven Time Series with Coupling Distances with an Application in Paleoclimate; Springer: Berlin/Heidelberg, Germany, 2016.
  13. Hein, C.; Imkeller, P.; Pavlyukevich, I. Limit theorems for p-variations of solutions of SDEs driven by additive Stable Levy noise and model selection for paleo-climatic data. Interdiscip. Math. Sci. 2009, 8, 137–150.
  14. Dijkstra, H.A. Derivation of delay climate models using the Mori-Zwanzig formalism. Proc. R. Soc. A 2019, 475, 2227.
  15. Khasminskii, R.Z. On the principle of averaging the Ito’s stochastic differential equations. Kybernetika 1968, 4, 260–279.
  16. Freidlin, M.I.; Wentzell, A.D. Random Perturbations of Dynamical Systems, 2nd ed.; Grundlehren der Mathematischen Wissenschaften 260; Springer: New York, NY, USA, 1998.
  17. Freidlin, M. The Averaging Principle and Theorems on Large Deviations. Russian Math. Surveys 1978, 33, 117–176.
  18. Veretennikov, A.Y. On the Averaging Principle for Systems of Stochastic Differential Equations. Math. USSR-Sbornik 1991, 69, 271–284.
  19. Cerrai, S. A Khasminskii type of Averaging Principle for Stochastic Reaction Diffusion Equations. Ann. Appl. Prob. 2009, 19, 899–948.
  20. Cerrai, S.; Freidlin, M. Averaging principle for a class of stochastic reaction-diffusion equations. Probab. Theory Relat. Fields 2009, 144, 137–177.
  21. Cerrai, S. Normal deviations from the averaged motion for some reaction-diffusion equations with fast oscillating perturbation. J. Math. Pures App. 2009, 91, 614–647.
  22. Givon, D. Strong Convergence Rate for Two-Time-Scale Jump Diffusion Stochastic Differential Systems. Multiscale Model Simul. 2007, 6, 577–594.
  23. Liu, D. Strong convergence rate of principle of averaging for jump diffusion processes. Front. Math. China 2012, 7, 305–320.
  24. Xu, J.; Miao, Y.; Liu, J. Strong averaging principle for slow-fast SPDEs with Poisson random measures. Discret. Contin. Dyn. Syst. Ser. B 2015, 20, 2233–2256.
  25. Xu, J. Lp-strong convergence of the averaging principle for slow-fast SPDEs with jumps. J. Math. Analysis Appl. 2017, 445, 342–373.
  26. Bao, J.; Song, Q.; Yin, G.; Yuan, C. Ergodicity and strong limit results for two-time-scale functional stochastic differential equations. Stoch. Anal. Appl. 2017, 35, 1030–1060.
  27. Mao, W.; You, S.; Wu, X.; Mao, X. On the averaging principle for stochastic delay differential equations with jumps. Adv. Differ. Eq. 2015, 2015, 70.
  28. Budhiraja, A.; Dupuis, P.; Ganguly, A. Large deviations for small noise diffusions in a fast Markovian environment. Electron. J. Probab. 2018, 23, 1–33.
  29. Duan, J.; Wang, W.; Roberts, A.J. Large deviations and approximations for slow–fast stochastic reaction–diffusion equations. J. Diff. Eqs. 2012, 253, 3501–3522.
  30. Kumar, R.; Popovic, L. Large deviations for multi-scale jump diffusion processes. Stoch. Proc. Their Appl. 2017, 127, 1297–1320.
  31. Veretennikov, A.Y. On large deviations for SDEs with small diffusion and averaging. Stoch. Process. Their Appl. 2000, 89, 69–79.
  32. Feng, J.; Fouque, J.-P.; Kumar, R. Small-Time Asymptotics for Fast Mean-Reverting Stochastic Volatility Models. Ann. Appl. Prob. 2012, 22, 1541–1575.
  33. Guillin, A. Moderate deviations of inhomogeneous functionals of Markov processes and application to averaging. Stoch. Proc. Appl. 2001, 92, 287–313.
  34. Guillin, A. Averaging principle of SDE with small diffusion: Moderate deviations. Ann. Prob. 2003, 31, 413–443.
  35. Friz, P.; Gerhold, S.; Pinter, A. Option Pricing in the Moderate Deviations Regime. Math. Fin. 2018, 28, 962–988.
  36. Jacquier, A.; Spiliopoulos, J.K. Pathwise Moderate Deviations in Option Pricing. Math. Financ. 2019, 30, 1–38.
  37. Djellout, H.; Guillin, A.; Wu, L. Large and Moderate Deviations for Estimators of Quadratic Variational Processes of Diffusions. Stat. Inference Stoch. Proc. 2000, 2, 195–225.
  38. Klebaner, F.C.; Liptser, R. Moderate deviations for randomly perturbed dynamical systems. Stoch. Proc. Their Appl. 1999, 180, 157–176.
  39. Fleming, W.H. A stochastic control approach to some large deviations problems. In Recent Mathematical Methods in Dynamic Programming; Dolcetta, C., Fleming, W.H., Zoletti, T., Eds.; Springer Lecture notes in Math; Springer: Berlin/Heidelberg, Germany, 1985; Volume 1119, pp. 52–66.
  40. Fleming, W.H. Stochastic control and large deviations. In Future Tendencies in Computer Science, Control and Applied Mathematics; Bensoussan, A., Verjus, J.P., Eds.; Springer: Berlin/Heidelberg, Germany, 1992; Volume 653.
  41. Dupuis, P.; Ellis, R.S. A Weak Convergence Approach to the Theory of Large Deviations; Wiley Series in Probability and Statistics; Wiley and Sons: New York, NY, USA, 1997.
  42. Budhiraja, A.; Dupuis, P. A variational representation for positive functionals of infinite Brownian motion. Probab. Math. Stat. 2000, 20, 39–61.
  43. Budhiraja, A.; Dupuis, P.; Maroulas, V. Variational representations for continuous time processes. Ann. de l’Inst. Henr. Poinc. B Probab. Stat. 2011, 47, 725–747.
  44. Budhiraja, A.; Chen, J.; Dupuis, P. Large deviations for stochastic partial differential equations driven by a Poisson random measure. Stochastic Process. Appl. 2013, 123, 523–560.
  45. Budhiraja, A.; Dupuis, P. Analysis and Approximation of Rare Events. Representations and Weak Convergence Methods; Series Prob. Theory and Stoch. Modelling; Springer: Berlin/Heidelberg, Germany, 2019; Volume 94.
  46. Budhiraja, A.; Dupuis, P.; Ganguly, A. Moderate deviation principles for stochastic differential equations with jumps. Ann. Probab. 2016, 44, 1723–1775.
  47. Budhiraja, A.; Wu, R. Moderate Deviation Principles for Weakly Interacting Particle Systems. Probab. Theory Relat. Fields 2017, 168, 721–771.
  48. Zheng, W.; Zhai, J.; Zhang, T. Moderate deviations for stochastic models of two-dimensional second-grade fluids driven by Lévy noises. Comm. Math. Stat. 2018, 6, 583–612.
  49. Azencott, R.; Geiger, B.; Ott, W. Large deviations for Gaussian diffusions with delay. J. Stat. Phys. 2018, 170, 254–285.
  50. Lipshutz, D. Exit time asymptotics for small noise stochastic delay differential equations. Discret. Contin. Dyn. Syst. A 2018, 38, 3099–3138.
  51. Ma, X.; Xi, F. Moderate deviations for neutral stochastic differential delay equations. Stat. Prob. Lett. 2016, 126, 97–107.
  52. Suo, Y.; Tao, J.; Zhang, W. Moderate deviations and central limit theorem for stochastic differential delay equations with polynomial growth. Front. Math. China 2018, 13, 913–933.
  53. Billingsley, P. Convergence of Probability Measures, 2nd ed.; Wiley-Interscience: Hoboken, NJ, USA, 1999.
  54. Dzhaparidze, K.; van Zanten, J.H. On Bernstein-type inequalities for martingales. Stoch. Proc. Appl. 2001, 93, 109–117.
  55. Ikeda, N.; Watanabe, S. Stochastic Differential Equations and Diffusion Processes; North-Holland Publishing Co.: Amsterdam, The Netherlands, 1981.
  56. Stroock, D. An Introduction to the Theory of Large Deviations; Springer: Berlin/Heidelberg, Germany, 1984.
  57. Nishimori, Y. Large deviations for symmetric stable processes with Feynman-Kac functionals and its applications to pinned polymers. Tohoku Math. J. 2013, 65, 467–494.
  58. Protter, P.E. Stochastic Integration and Differential Equations; Stochastic Modelling and Applied Probability; Springer: Berlin/Heidelberg, Germany, 2004; p. 21.
  59. DaPrato, G. An Introduction to Infinite Dimensional Analysis; Springer: Berlin/Heidelberg, Germany, 2006.
  60. Qiao, H. Exponential Ergodicity for SDEs with Jumps and non-Lipschitz coefficients. J. Theor. Prob. 2014, 27, 137–152.
  61. Xu, J.; Liu, J.; Miao, Y. Strong Averaging Principle for Two-Time Scale SDEs with nonLipschitz coefficients. J. Math. Anal. Appl. 2018, 468, 116–140.
  62. Rosiński, J. Tempering stable processes. Stoch. Proc. Appl. 2007, 177, 677–707.
  63. Oliveira Gomes, A.D.; Högele, M.A. The Kramers problem driven by small accelerated Lévy noise with exponentially light jumps. Stochastics Dyn. 2021, 32, 2150019.
  64. Jacod, J.; Shiryaev, A.N. Limit Theorems for Stochastic Processes; Springer: Berlin/Heidelberg, Germany, 1987.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

de Oliveira Gomes, A.; Catuogno, P. Moderate Averaged Deviations for a Multi-Scale System with Jumps and Memory. Dynamics 2023, 3, 171-201. https://doi.org/10.3390/dynamics3010011
