Article

State and Parameter Estimation from Observed Signal Increments

Institute of Mathematics, University of Potsdam, Karl-Liebknecht-Str. 24/25, D-14476 Potsdam, Germany
*
Author to whom correspondence should be addressed.
Entropy 2019, 21(5), 505; https://doi.org/10.3390/e21050505
Received: 26 March 2019 / Revised: 13 May 2019 / Accepted: 14 May 2019 / Published: 17 May 2019
(This article belongs to the Special Issue Information Theory and Stochastics for Multiscale Nonlinear Systems)

Abstract

The success of the ensemble Kalman filter has triggered a strong interest in expanding its scope beyond classical state estimation problems. In this paper, we focus on continuous-time data assimilation where the model and measurement errors are correlated and both states and parameters need to be identified. Such scenarios arise from noisy and partial observations of Lagrangian particles which move under a stochastic velocity field involving unknown parameters. We take an appropriate class of McKean–Vlasov equations as the starting point to derive ensemble Kalman–Bucy filter algorithms for combined state and parameter estimation. We demonstrate their performance through a series of increasingly complex multi-scale model systems.
Keywords: parameter estimation; continuous-time data assimilation; ensemble Kalman filter; correlated noise; multi-scale diffusion processes

1. Introduction

The research presented in this paper has been motivated by the state and parameter estimation problem for particles moving under a stochastic velocity field, with the measurements given by partial and noisy observations of their position increments. If the deterministic contributions to the velocity field are stationary, and the position increments of the moving particle are exactly observed, then one is led to a standard parameter estimation problem for stochastic differential equations (SDEs) [1,2]. In [3], this setting was extended to the case where the deterministic contributions to the velocity field themselves undergo a stochastic time evolution. Furthermore, while continuous-time observations of position increments are at the focus of the present study, the assimilation of discrete-time observations of particle positions has been investigated in [4,5] under a so-called Lagrangian data assimilation setting for atmospheric fluid dynamics.
The assumption of exactly and fully observed position increments is not always realistic, and the case of partial and noisy observations is at the center of the present study. Having access only to partial and noisy observations of position increments leads to correlations between the measurement and model errors. The theoretical impact of such correlations on state and parameter estimation problems has been discussed, for example, in [6] in the context of linear systems, and in [7] for nonlinear systems. In particular, one finds that the appropriately adjusted data likelihood involves the gradient of log-densities, which is nontrivial from a computational perspective, and which prevents a straightforward application of standard Markov chain Monte Carlo (MCMC) or sequential Monte Carlo (SMC) methods [8].
In this paper, we instead follow an alternative Monte Carlo approach based on appropriately adjusted McKean–Vlasov filtering equations, an approach pioneered in [9] in the context of the standard state estimation problem for diffusion processes. McKean–Vlasov equations, first studied in [10], are a class of SDEs in which the right-hand side depends on the law of the process itself. We rely on a particular formulation of McKean–Vlasov filtering equations, the so-called feedback particle filters [11], utilising stochastic innovation processes [12].
Our proposed Monte Carlo formulation avoids the need for estimating log-densities, and can be implemented in a numerically robust manner relying on a generalised ensemble Kalman–Bucy filter approximation applied to an extended state space formulation [13]. The ensemble Kalman–Bucy filter [14,15] has been introduced previously as an extension of the popular ensemble Kalman filter [13,16,17] to continuous-time data assimilation under the assumption of uncorrelated measurement and model errors.
While the McKean-Vlasov formulation is essentially mathematically equivalent to the more conventional one based on the Kushner-Stratonovitch equation [7], these two approaches differ significantly in structure, suggesting different tools for their analysis as well as numerical approximations. More broadly speaking, the McKean–Vlasov approach to filtering is appealing since its Monte Carlo implementations completely avoid the need for resampling characteristic of standard SMC methods. Furthermore, a wide range of approximations are possible within the McKean–Vlasov framework with some of them, such as the ensemble Kalman–Bucy filter, applicable to high-dimensional problems. The McKean–Vlasov approach also arises naturally when analysing sequential Monte Carlo methods [18].
In Section 6, we apply the proposed algorithms to a series of state and parameter estimation problems of increasing complexity. First, we study the state and parameter estimation problem for an Ornstein–Uhlenbeck process [2]. Two further experiments investigate the behaviour of the filters for reduced model equations, with the data being collected from underlying multi-scale models. There we distinguish between the averaging and homogenisation scenarios [19]. Finally, we look at examples of nonparametric drift estimation [3] and parameter estimation for the stochastic heat equation [20].
We finally mention that SMC methods for correlated noise terms in discrete-time have been discussed, for example, in [21] and in the context of the ensemble Kalman filter in [22]. Similar ideas have also been pursued in a more applied context in [23].

2. Mathematical Problem Formulation

We consider the time evolution of a random state variable $X_t \in \mathbb{R}^{N_x}$ in $N_x$-dimensional state space, $N_x \ge 1$, as prescribed by an SDE of the form
$$dX_t = f(X_t, a)\,dt + G\,dW_t, \tag{1}$$
for time $t \ge 0$, with the drift function $f: \mathbb{R}^{N_x} \times \mathbb{R}^{N_a} \to \mathbb{R}^{N_x}$ depending on $N_a \ge 0$ unknown parameters $a = (a_1, \ldots, a_{N_a})^{\rm T} \in \mathbb{R}^{N_a}$. Model errors are represented through standard $N_w$-dimensional Brownian motion $W_t$, $N_w \ge 1$, and a matrix $G \in \mathbb{R}^{N_x \times N_w}$. We also introduce the associated model error covariance matrix $Q = G G^{\rm T}$. We will generally assume that the initial condition $X_0$ is fixed, that is, $X_0 = x_0$ a.s. for given $x_0 \in \mathbb{R}^{N_x}$. In terms of a more specific example, one can think of $X_t$ denoting the position of a particle at time $t \ge 0$ moving in $N_x = 3$ dimensional space under the influence of a stochastic velocity field, with deterministic contributions given by $f$ and stochastic perturbations by $G W_t$. In the case $G = 0$, the SDE (1) reduces to an ordinary differential equation with given initial condition $x_0$.
We assume throughout this paper that (1) possesses unique, strong solutions for all parameter values $a$. See, for example, [2] (Section 3.3) for sufficient conditions on the drift function $f$. The distribution of $X_t$ is denoted by $\pi_t$, which we also abbreviate by $\pi_t = \mathrm{Law}(X_t)$. We use the same notation for measures and their Lebesgue densities, provided they exist.
Example 1.
A wide class of drift functions can be written in the form
$$f(x, a) = f_0(x) + B(x)\,a = f_0(x) + \sum_{i=1}^{N_a} b_i(x)\,a_i, \tag{2}$$
where $f_0: \mathbb{R}^{N_x} \to \mathbb{R}^{N_x}$ is a known drift function, the $b_i: \mathbb{R}^{N_x} \to \mathbb{R}^{N_x}$, $i = 1, \ldots, N_a$, denote appropriate basis functions, and the vector $a = (a_1, \ldots, a_{N_a})^{\rm T} \in \mathbb{R}^{N_a}$ contains the unknown parameters of the model. The family $\{b_i(x)\}$ of basis functions, which we collect in a matrix-valued function $B(x) = (b_1(x), b_2(x), \ldots, b_{N_a}(x)) \in \mathbb{R}^{N_x \times N_a}$, could arise from a finite-dimensional truncation of some appropriate Hilbert space $\mathcal{H}$. See, for example, [24] for computational approaches to nonparametric drift estimation using a Galerkin approximation in $\mathcal{H}$, where the $b_i(x)$ become finite element basis functions. Furthermore, the expansion coefficients $\{a_i\}$ could be made time-dependent by letting them evolve according to some system of differential equations arising, for example, from the discretisation of an underlying partial differential equation with solutions in $\mathcal{H}$. See [3] for specific examples of such a setting. While the present paper focuses on stationary drift functions, i.e., the parameters $\{a_i\}$ are time-independent, the results from Section 3 and Section 5, respectively, can easily be extended to the non-stationary case where the parameters themselves satisfy given evolution equations.
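The parametrisation (2) can be illustrated with a minimal sketch for a scalar state. The monomial basis and the zero known drift below are hypothetical choices made only for illustration; they do not come from the paper.

```python
def drift(x, a, f0, basis):
    """Evaluate f(x, a) = f0(x) + sum_i b_i(x) * a_i for a scalar state x."""
    return f0(x) + sum(b(x) * ai for b, ai in zip(basis, a))

# Hypothetical choices: no known drift and the basis b_1(x) = x, b_2(x) = x**3.
f0 = lambda x: 0.0
basis = [lambda x: x, lambda x: x ** 3]

# Parameters a = (-1, 0.5): f(2, a) = -1 * 2 + 0.5 * 8
value = drift(2.0, [-1.0, 0.5], f0, basis)
print(value)
```

The drift is linear in the parameters even though it is nonlinear in the state, which is the property exploited later by the Gaussian formulations.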
Data and an observation model are required in order to perform state and parameter estimation for SDEs of the form (1). In this paper, we assume that we observe partial and noisy increments $dY_t$ of the signal $X_t$, given by
$$dY_t = H\,dX_t + R^{1/2}\,dV_t = H f(X_t, a)\,dt + H G\,dW_t + R^{1/2}\,dV_t, \qquad Y_0 = X_0 = x_0, \tag{3}$$
for $t$ in the observation interval $[0, T]$, $T > 0$, where $H \in \mathbb{R}^{N_y \times N_x}$ is a given linear operator, $V_t$ denotes standard $N_y$-dimensional Brownian motion with $N_y \ge 1$, and $R \in \mathbb{R}^{N_y \times N_y}$ is a covariance matrix. We introduce the observation map
$$h(x, a) = H f(x, a) \tag{4}$$
for later use. Unless $H G = 0$, it is clear that the model error $E_t^{\rm m} := G W_t$ in (1) and the total observation error
$$E_t^{\rm o} := H G W_t + R^{1/2} V_t \tag{5}$$
in (3) are correlated. The impact of correlations between the model and measurement errors on the state estimation problem has been discussed by [6,7]. Furthermore, such correlations require adjustments to sequential estimation methods [16,17,25], which are the main focus of this paper. We assume throughout this paper that the covariance matrix
$$C = H G G^{\rm T} H^{\rm T} + R = H Q H^{\rm T} + R \tag{6}$$
of the observation error (5) is invertible.
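As an illustration of the signal and observation models (1) and (3), the following sketch generates a signal path and its observed increments with the Euler–Maruyama method. It assumes a scalar state with the linear drift $f(x,a) = -a x$; the function names, the step size, and the parameter values are assumptions made for this example only.

```python
import math
import random

def simulate_increments(a, x0, g, h, r, dt, n_steps, seed=0):
    """Euler-Maruyama for dX = -a X dt + g dW and dY = h dX + sqrt(r) dV (scalar)."""
    rng = random.Random(seed)
    x, xs, dys = x0, [x0], []
    for _ in range(n_steps):
        dw = math.sqrt(dt) * rng.gauss(0.0, 1.0)   # model noise increment
        dv = math.sqrt(dt) * rng.gauss(0.0, 1.0)   # measurement noise increment
        dx = -a * x * dt + g * dw
        dys.append(h * dx + math.sqrt(r) * dv)     # the same dw enters dX and dY
        x += dx
        xs.append(x)
    return xs, dys

def obs_error_variance(g, h, r):
    """Scalar version of C = H Q H^T + R with Q = G G^T."""
    return h * (g * g) * h + r

xs, dys = simulate_increments(a=1.0, x0=1.0, g=0.5, h=1.0, r=0.0, dt=0.01, n_steps=100)
print(len(xs), len(dys), obs_error_variance(0.5, 1.0, 0.1))
```

Because the same increment `dw` drives both the signal and the observation, the model and observation errors are correlated whenever `h * g` is nonzero. With `r = 0` and `h = 1` the observed increments coincide with the signal increments.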
The special case R = 0 and H = I leads to a pure parameter estimation problem which has been extensively studied in the literature in the settings of maximum likelihood and Bayesian estimators [1,2]. In Section 3, we provide a reformulation of the Bayesian approach as McKean–Vlasov equations for the parameters, based on the results in [9,11].
If $R \neq 0$, then (1) and (3) lead to a combined state and parameter estimation problem with correlated noise terms. We will first discuss the impact of this correlation on the pure state estimation problem in Section 4 assuming that the parameters of the problem are known. Again, we will derive appropriate McKean–Vlasov equations in the state variables. Our key contribution is a formulation that avoids the need for log-density estimates, and can be put into an appropriately generalised ensemble Kalman–Bucy filter approximation framework [14,15]. We also formally demonstrate that the McKean–Vlasov filter equation reduces to $dX_t = dY_t$ in the limit $R \to 0$ and $H = I$, a property that is less straightforward to demonstrate for filter formulations involving log-densities.
These McKean–Vlasov equations are generalised to the combined state and parameter estimation problem via an augmentation of state space [13] in Section 5. Given the results from Section 4, such an extension is rather straightforward.
The numerical experiments in Section 6 rely exclusively on the generalised ensemble Kalman–Bucy filter approximation to the McKean–Vlasov equations, which are easy to implement and yield robust and accurate numerical results.

3. Parameter Estimation from Noiseless Data

In this section, we treat the simpler Bayesian parameter estimation problem which arises from setting $R = 0$ and $H = I$ in (3), i.e., $N_y = N_x$. This leads to $dX_t = dY_t$ and, furthermore, $X_t = Y_t$ for all $t \in [0, T]$, provided $X_0 = Y_0 = x_0$, which we assume throughout this paper. The requirement that $C = Q$ is invertible requires that $G$ has rank $N_x$; that is, $N_w \ge N_x$ in (1). The data likelihood
$$l_t(a) = \exp\left( \int_0^t f(Y_s, a)^{\rm T} Q^{-1}\,dY_s - \frac{1}{2} \int_0^t f(Y_s, a)^{\rm T} Q^{-1} f(Y_s, a)\,ds \right) \tag{7}$$
thus follows from the observation model with additive Brownian noise in (3). Given a prior distribution $\Pi_0(a)$ for the parameters, the resulting posterior distribution at any time $t \in (0, T]$ is
$$\Pi_t(a) = \frac{l_t(a)\,\Pi_0(a)}{\Pi_0[l_t]} \tag{8}$$
according to Bayes’ theorem [7]. Here, we have introduced the shorthand
$$\Pi_0[l_t] = \int_{\mathbb{R}^{N_a}} l_t(a)\,\Pi_0(a)\,da \tag{9}$$
for the expectation of $l_t$ with respect to $\Pi_0$. It is well-known that the posterior distributions $\Pi_t$ satisfy the stochastic partial differential equation
$$d\Pi_t[\phi] = \left( \Pi_t[\phi h_t] - \Pi_t[\phi]\,\Pi_t[h_t] \right)^{\rm T} Q^{-1} \left( dY_t - \Pi_t[h_t]\,dt \right) \tag{10}$$
with the time-dependent observation map
$$h_t(a) = f(Y_t, a), \tag{11}$$
where $\phi: \mathbb{R}^{N_a} \to \mathbb{R}$ is a compactly supported smooth test function, and $\Pi_t[\phi]$ again denotes the expectation of $\phi$ with respect to $\Pi_t$. See [7] for a detailed discussion. Equation (10) is a special instance of the well-known Kushner–Stratonovitch equation from time-continuous filtering [7].
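To make the likelihood (7) concrete, the sketch below evaluates its logarithm on a time grid, replacing the stochastic integral by an Itô sum. It assumes the scalar drift $f(y,a) = -a y$, for which the log-likelihood is quadratic in $a$ and its maximiser is available in closed form. The test path is a hypothetical noise-free decay chosen only so that the result is easy to check by hand.

```python
import math

def log_likelihood(a, ys, dt, q):
    """Ito-sum approximation of log l_t(a) in (7) for the drift f(y, a) = -a * y."""
    n = len(ys) - 1
    s1 = sum(-a * ys[k] * (ys[k + 1] - ys[k]) for k in range(n))   # int f Q^-1 dY
    s2 = sum((a * ys[k]) ** 2 * dt for k in range(n))              # int f Q^-1 f ds
    return (s1 - 0.5 * s2) / q

def mle(ys, dt):
    """Maximiser of the quadratic log-likelihood above."""
    n = len(ys) - 1
    num = -sum(ys[k] * (ys[k + 1] - ys[k]) for k in range(n))
    den = sum(ys[k] ** 2 * dt for k in range(n))
    return num / den

dt = 0.01
ys = [math.exp(-k * dt) for k in range(501)]   # hypothetical path y(t) = exp(-t)
a_hat = mle(ys, dt)
print(a_hat, log_likelihood(a_hat, ys, dt, q=0.25))
```

For this path the estimate equals $(1 - e^{-\Delta t})/\Delta t$, which approaches the decay rate 1 as the step size shrinks. Multiplying the likelihood by a prior density and normalising as in (8) and (9) then gives the posterior on a grid of parameter values.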

3.1. Feedback Particle Filter

We now state a McKean–Vlasov reformulation of the Kushner–Stratonovitch Equation (10) as a special instance of the feedback particle filter of [11,12]. The key idea is to formulate a stochastic differential equation in the parameters in which they are treated as time-dependent random variables. We introduce the notation $\tilde A_t$ for these, and require that the law of $\tilde A_t$ coincide with (8) for $t \in [0, T]$, i.e., with the solution to (10).
Lemma 1 (Feedback particle filter).
Consider the McKean–Vlasov equations
$$d\tilde A_t = K_t(\tilde A_t)\,dI_t + \Omega_t(\tilde A_t)\,dt, \tag{12}$$
where the matrix-valued Kalman gain $K_t \in \mathbb{R}^{N_a \times N_y}$ satisfies
$$\nabla \cdot \left( \tilde\Pi_t K_t Q \right) = -\tilde\Pi_t \left( h_t - \tilde\Pi_t[h_t] \right)^{\rm T}, \qquad \tilde\Pi_t = \mathrm{Law}(\tilde A_t). \tag{13}$$
The innovation process $I_t$ can be chosen to be given by either
$$dI_t = dY_t - \frac{1}{2} \left( h_t(\tilde A_t) + \tilde\Pi_t[h_t] \right) dt, \tag{14}$$
or
$$dI_t = dY_t - \left( h_t(\tilde A_t)\,dt + G\,dW_t \right), \tag{15}$$
and
$$\Omega_t^i = \frac{1}{2} \sum_{j=1}^{N_a} \sum_{k,l=1}^{N_y} Q_{kl}\,K_t^{jl}\,\partial_j K_t^{ik}, \qquad i = 1, \ldots, N_a. \tag{16}$$
Then, the distribution $\tilde\Pi_t = \mathrm{Law}(\tilde A_t)$ coincides with the solution to (10), provided that the initial distributions agree. In other words, $\tilde\Pi_t = \Pi_t$ for all $t \in [0, T]$.
Throughout this paper, we write (12) in the more compact Stratonovitch form
$$d\tilde A_t = K_t(\tilde A_t) \circ dI_t, \tag{17}$$
where the Stratonovitch interpretation is to be applied only to $\tilde A_t$ in $K_t(\tilde A_t)$, while the explicit time-dependence of $K_t$ remains in its Itô interpretation. It should be noted that the matrix-valued function $K_t$ is not uniquely defined by the PDE (13). Indeed, provided $K_t$ solves (13), $K_t + \beta_t$ is also a solution whenever $\nabla \cdot (\tilde\Pi_t \beta_t) = 0$. As discussed in [15], the minimiser over all suitable $K_t$ with respect to a kinetic energy-type functional is of the form
$$K_t = \nabla \Psi_t\,Q^{-1} \tag{18}$$
for a vector of potential functions $\Psi_t = (\psi_t^1, \ldots, \psi_t^{N_x})$, $\psi_t^k: \mathbb{R}^{N_a} \to \mathbb{R}$. Inserting (18) into (13) leads to $N_x$ elliptic partial differential equations (often referred to as Poisson equations),
$$\nabla \cdot \left( \tilde\Pi_t \nabla \Psi_t \right) = -\tilde\Pi_t \left( h_t - \tilde\Pi_t[h_t] \right)^{\rm T}, \qquad \tilde\Pi_t[\Psi_t] = 0, \tag{19}$$
understood component-wise, where the centring condition $\tilde\Pi_t[\Psi_t] = 0$ makes the solution unique under mild assumptions on $\tilde\Pi_t$ (see [26]). The numerical approximation of (19) in the context of the feedback particle filter has been discussed in [27]. Finally, (15) yields a particularly appealing formulation, since it is based on a direct comparison of $dY_t$ with a random realisation of the right-hand side of the SDE (1), given a parameter value $a = \tilde A_t(\omega)$ and a realisation of the noise term $dW_t(\omega)$. This fact will be explored further in Section 4.
Remark 1.
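For clarity, let us repeat Equations (13) and (18) in their index forms:
$$\sum_{i=1}^{N_a} \sum_{j=1}^{N_y} \partial_i \left( \tilde\Pi_t\,K_t^{ij}\,Q_{jk} \right) = -\tilde\Pi_t \left( h_t^k - \tilde\Pi_t[h_t^k] \right), \qquad k = 1, \ldots, N_y, \tag{20}$$
$$\sum_{j=1}^{N_y} K_t^{ij}(a)\,Q_{jk} = \partial_i \psi_t^k(a), \qquad i = 1, \ldots, N_a, \quad k = 1, \ldots, N_y. \tag{21}$$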

3.2. Ensemble Kalman–Bucy Filter

Let us now assume that the initial distribution $\Pi_0$ is Gaussian, and that $f$ is linear in the unknown parameters such as in (2). Then, the distributions $\tilde\Pi_t$ remain Gaussian for all times with mean $\bar a_t$ and covariance matrix $P_t^{aa}$. The elliptic PDE (13) is solved by the parameter-independent Kalman gain matrix
$$K_t = P_t^{aa} B(Y_t)^{\rm T} Q^{-1} \tag{22}$$
and one obtains the McKean–Vlasov formulation
$$d\tilde A_t = P_t^{aa} B(Y_t)^{\rm T} Q^{-1}\,dI_t \tag{23}$$
of the Kalman–Bucy filter, with the innovation process $I_t$ defined by either
$$dI_t = dY_t - \left( f_0(Y_t) + \frac{1}{2} B(Y_t) (\tilde A_t + \bar a_t) \right) dt \tag{24}$$
or
$$dI_t = dY_t - \left( \left( f_0(Y_t) + B(Y_t) \tilde A_t \right) dt + G\,dW_t \right). \tag{25}$$
Please note that the Stratonovitch formulation (17) reduces to the standard Itô interpretation, since $K_t$ no longer depends explicitly on $\tilde A_t$.
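The claim that the gain (22) solves the PDE (13) can be checked in a few lines. The following is a sketch under the stated assumptions only: $\tilde\Pi_t$ is Gaussian with mean $\bar a_t$ and invertible covariance $P_t^{aa}$, and the drift has the form (2), so that $h_t(a) - \tilde\Pi_t[h_t] = B(Y_t)(a - \bar a_t)$.

```latex
% Gradient of a Gaussian density with mean \bar a_t and covariance P_t^{aa}:
\nabla_a \tilde\Pi_t(a) = -\tilde\Pi_t(a)\,(P_t^{aa})^{-1}(a - \bar a_t).
% Since K_t Q = P_t^{aa} B(Y_t)^{\rm T} does not depend on a:
\nabla\cdot\bigl(\tilde\Pi_t K_t Q\bigr)
  = (\nabla_a \tilde\Pi_t)^{\rm T} P_t^{aa} B(Y_t)^{\rm T}
  = -\tilde\Pi_t\,(a - \bar a_t)^{\rm T} B(Y_t)^{\rm T}
  = -\tilde\Pi_t\,\bigl(h_t - \tilde\Pi_t[h_t]\bigr)^{\rm T}.
```

The last expression is the right-hand side of (13), so the constant gain is an exact solution in the linear Gaussian setting.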
The McKean–Vlasov Equation (23) can be extended to nonlinear, non-Gaussian parameter estimation problems by generalising the parameter-independent Kalman gain matrix (22) to
$$K_t = P_t^{ah} Q^{-1}, \qquad P_t^{ah} = \tilde\Pi_t \left[ (a - \bar a_t) \left( h_t(a) - \tilde\Pi_t[h_t] \right)^{\rm T} \right] = \tilde\Pi_t \left[ a \left( h_t(a) - \tilde\Pi_t[h_t] \right)^{\rm T} \right]. \tag{26}$$
Clearly, the gain (26) provides only an approximation to the solution of (13). However, such approximations have become popular in nonlinear state estimation in the form of the ensemble Kalman filter [16,17], and we will test its suitability for parameter estimation in Section 6.
Numerical implementations of the proposed McKean–Vlasov approaches rely on Monte Carlo approximations. More specifically, given $M$ samples $\tilde A_0^i$, $i = 1, \ldots, M$, from the initial distribution $\Pi_0$, we introduce the interacting particle system
$$d\tilde A_t^i = K_t^M(\tilde A_t^i) \circ dI_t^i, \tag{27}$$
where the innovation processes $I_t^i$ are defined by either
$$dI_t^i = dY_t - \frac{1}{2} \left( h_t(\tilde A_t^i) + \bar h_t^M \right) dt, \qquad \bar h_t^M = \frac{1}{M} \sum_{i=1}^{M} h_t(\tilde A_t^i), \tag{28}$$
or, alternatively,
$$dI_t^i = dY_t - \left( h_t(\tilde A_t^i)\,dt + G\,dW_t^i \right), \tag{29}$$
and $W_t^i$, $i = 1, \ldots, M$, denote independent $N_w$-dimensional Brownian motions. For $K_t^M$, we will use the parameter-independent empirical Kalman gain approximation
$$K_t^M = \hat P_t^{ah} Q^{-1}, \qquad \hat P_t^{ah} = \frac{1}{M-1} \sum_{i=1}^{M} \tilde A_t^i \left( h_t(\tilde A_t^i) - \bar h_t^M \right)^{\rm T}, \tag{30}$$
in our numerical experiments, which leads to the so-called ensemble Kalman–Bucy filter [14,15]. Please note that $\hat P_t^{ah}$ provides an unbiased estimator of $P_t^{ah}$.
Finally, a robust and efficient time-stepping procedure for approximating $\tilde A_{t_n}$, $t_n = n \Delta t$, is provided in [28,29,30]. Denoting the approximations at time $t_n$ by $\tilde A_n^i$, $i = 1, \ldots, M$, we obtain
$$\tilde A_{n+1}^i = \tilde A_n^i + \hat P_n^{ah} \left( Q + \Delta t\,\hat P_n^{hh} \right)^{-1} \Delta I_n^i \tag{31}$$
with step size $\Delta t > 0$, empirical covariance matrices
$$\hat P_n^{ah} = \frac{1}{M-1} \sum_{i=1}^{M} \tilde A_n^i \left( h_n(\tilde A_n^i) - \bar h_n^M \right)^{\rm T}, \qquad \hat P_n^{hh} = \frac{1}{M-1} \sum_{i=1}^{M} h_n(\tilde A_n^i) \left( h_n(\tilde A_n^i) - \bar h_n^M \right)^{\rm T}, \tag{32}$$
and innovation increments $\Delta I_n^i$ given by either
$$\Delta I_n^i = \Delta Y_n - \frac{1}{2} \left( h_n(\tilde A_n^i) + \bar h_n^M \right) \Delta t, \qquad \bar h_n^M = \frac{1}{M} \sum_{i=1}^{M} h_n(\tilde A_n^i), \tag{33}$$
or
$$\Delta I_n^i = \Delta Y_n - \left( h_n(\tilde A_n^i)\,\Delta t + \Delta t^{1/2}\,G\,\Xi_n^i \right), \qquad \Xi_n^i \sim \mathrm{N}(0, I). \tag{34}$$
Here we have used the abbreviations $h_n(a) = f(Y_n, a)$, $Y_n = Y_{t_n}$, and $\Delta Y_n = Y_{t_{n+1}} - Y_{t_n}$.
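The time-stepping procedure can be sketched for a scalar example as follows. The sketch assumes the drift $f(y,a) = -a y$ with $H = I$ and $R = 0$, uses the deterministic innovation (33), and writes the gain as $\hat P_n^{ah}(Q + \Delta t\,\hat P_n^{hh})^{-1}$ acting on the increment $\Delta I_n^i$, which is the form that reduces to (27) with (30) as $\Delta t \to 0$. The prior N(2, 1), the ensemble size, and all numerical values are hypothetical.

```python
import math
import random

def enkbf_parameter_step(A, y, dY, dt, q):
    """One ensemble Kalman-Bucy step for the scalar drift f(y, a) = -a * y."""
    M = len(A)
    h = [-a * y for a in A]                          # h_n(A^i) = f(Y_n, A^i)
    h_bar = sum(h) / M
    p_ah = sum(a * (hi - h_bar) for a, hi in zip(A, h)) / (M - 1)
    p_hh = sum(hi * (hi - h_bar) for hi in h) / (M - 1)
    gain = p_ah / (q + dt * p_hh)
    # Innovation (33): dY minus the average of particle and ensemble-mean forecasts.
    return [a + gain * (dY - 0.5 * (hi + h_bar) * dt) for a, hi in zip(A, h)]

def variance(v):
    m = sum(v) / len(v)
    return sum((x - m) ** 2 for x in v) / (len(v) - 1)

rng = random.Random(1)
a_true, g, dt, n_steps, M = 1.0, 1.0, 0.01, 2000, 50
q = g * g
A = [rng.gauss(2.0, 1.0) for _ in range(M)]          # samples from the assumed prior
A0 = list(A)
y = 1.0                                              # Y_0 = x_0
for _ in range(n_steps):
    dY = -a_true * y * dt + g * math.sqrt(dt) * rng.gauss(0.0, 1.0)   # synthetic data
    A = enkbf_parameter_step(A, y, dY, dt, q)
    y += dY
print(sum(A) / M, variance(A))
```

With the deterministic innovation, each step multiplies every deviation from the ensemble mean by a factor between 1/2 and 1, so the ensemble spread contracts monotonically and no resampling is needed.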
While the feedback particle formulation (17) and its ensemble Kalman–Bucy filter approximation (31) are special cases of already available formulations, they provide the starting point for our novel McKean–Vlasov equations and their numerical approximation of the combined state and parameter estimation problem with correlated measurement and model errors, which we develop in the following two sections.

4. State Estimation for Noisy Data

We return to the observation model (3) with $R \neq 0$ and general $H$. The pure state estimation problem is considered first; that is, $f(x, a) = f(x)$ in (1).
Using $E_t^{\rm o}$, given by (5), and $E_t^{\rm c}$ defined by
$$E_t^{\rm c} = G \left( I - G^{\rm T} H^{\rm T} C^{-1} H G \right) W_t - Q H^{\rm T} C^{-1} R^{1/2} V_t \tag{35}$$
with the total measurement error covariance matrix $C$ given by (6), we find that
$$G W_t = E_t^{\rm c} + Q H^{\rm T} C^{-1} E_t^{\rm o}, \tag{36}$$
and the covariations [2] satisfy
$$\langle E^{\rm o}, E^{\rm c} \rangle_t = 0, \qquad \langle E^{\rm o}, E^{\rm o} \rangle_t = C\,t, \qquad \langle E^{\rm c}, E^{\rm c} \rangle_t = G \left( I - G^{\rm T} H^{\rm T} C^{-1} H G \right) G^{\rm T}\,t. \tag{37}$$
These errors naturally suggest linear combinations of $W_t$ and $V_t$ in (1) and (3) that shift the correlation between measurement and model errors to the signal dynamics, yielding
$$\begin{aligned} dX_t &= f(X_t)\,dt + G \left( I - G^{\rm T} H^{\rm T} C^{-1} H G \right)^{1/2} d\hat W_t + Q H^{\rm T} C^{-1/2}\,d\hat V_t, \\ dY_t &= H f(X_t)\,dt + C^{1/2}\,d\hat V_t, \end{aligned} \tag{38}$$
where $\hat W_t$ and $\hat V_t$ denote mutually independent standard Brownian motions of dimension $N_w$ and $N_y$, respectively. These equations correspond exactly to the correlated noise example from [7] (Section 3.8). Furthermore, $H = I$ and $R = 0$ lead to $E_t^{\rm c} = 0$, $Q H^{\rm T} C^{-1/2} = C^{1/2}$, and, hence, $dX_t = dY_t$.
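The decomposition of the model error into a part correlated with the observation error and an uncorrelated remainder can be checked numerically. The sketch below does so in the scalar case, where $E_t^{\rm c}$ and $E_t^{\rm o}$ are linear combinations of $W_t$ and $V_t$ and their covariation rates follow from the coefficients. The values of `g`, `h` and `r` are arbitrary assumptions.

```python
def decompose(g, h, r):
    """Scalar version of (35): E^c = cw * W + cv * V and E^o = ow * W + ov * V."""
    q = g * g
    c = h * q * h + r                        # C = H Q H^T + R
    cw = g * (1.0 - g * h * h * g / c)       # coefficient of W in E^c
    cv = -q * h * (r ** 0.5) / c             # coefficient of V in E^c
    ow, ov = h * g, r ** 0.5                 # coefficients of W and V in E^o
    return c, (cw, cv), (ow, ov)

g, h, r = 0.7, 1.3, 0.2
q = g * g
c, (cw, cv), (ow, ov) = decompose(g, h, r)

cross = cw * ow + cv * ov                    # rate of <E^o, E^c>, expected to vanish
var_c = cw ** 2 + cv ** 2                    # rate of <E^c, E^c>
var_o = ow ** 2 + ov ** 2                    # rate of <E^o, E^o>, expected to equal C
recombined_w = cw + q * h / c * ow           # W-coefficient of E^c + Q H^T C^-1 E^o
recombined_v = cv + q * h / c * ov           # V-coefficient of the same combination
print(cross, var_c, var_o, recombined_w, recombined_v)
```

The recombined coefficients reproduce $G W_t$, that is, coefficient `g` on $W_t$ and zero on $V_t$, while the variance of the uncorrelated part equals $Q - Q H^{\rm T} C^{-1} H Q$ and vanishes when `r = 0` and `h = 1`.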
A straightforward application of the results from [7] (Section 3.8) yields the following statement:
Lemma 2 (Generalised Kushner–Stratonovich equation).
The conditional expectations $\pi_t[\phi] = \mathbb{E}[\phi(X_t) \mid Y_{[0,t]}]$ satisfy
$$\pi_t[\phi] = \pi_0[\phi] + \int_0^t \pi_s[\mathcal{L}\phi]\,ds + \int_0^t \left( \pi_s \left[ \phi h + H Q \nabla\phi - \phi\,\pi_s[h] \right] \right)^{\rm T} C^{-1} \left( dY_s - \pi_s[h]\,ds \right), \tag{39}$$
where
$$\mathcal{L} = f \cdot \nabla + \frac{1}{2} Q : D^2 \tag{40}$$
is the generator of (1), $h(x) = H f(x)$ denotes the observation map, and $\phi$ is a compactly supported smooth function. Here we use the notation $Q : D^2\phi = \sum_{i,j=1}^{N_x} Q_{ij}\,\partial_i \partial_j \phi$.
For the convenience of the reader, we present an independent derivation in Appendix A. We note that (39) also arises as the Kushner–Stratonovitch equations for an SDE model (1) with observations $Y_t$ satisfying the observation model
$$dY_t = H \left( f(X_t) - Q \nabla \log \pi_t(X_t) \right) dt + C^{1/2}\,d\tilde V_t, \tag{41}$$
where $\tilde V_t$ denotes $N_y$-dimensional Brownian motion independent of the Brownian motion $W_t$ in (1). Here we have used that $\pi_t[H Q \nabla \log \pi_t] = 0$. This reinterpretation of our state estimation problem in terms of uncorrelated model and observation errors and modified observation map
$$\tilde h_t(x) = H \left( f(x) - Q \nabla \log \pi_t(x) \right) \tag{42}$$
allows one to apply available MCMC and SMC methods for continuous-time filtering and smoothing problems. See, for example, [16]. However, there are two major limitations of such an approach. First, it requires approximating the gradient of the log-density. Second, the modified observation model (41) is not well-defined in the limit $R \to 0$ and $H = I$, since the density $\pi_t$ collapses to a Dirac delta function under the given initial condition $X_0 = x_0$ a.s.
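The second limitation can be made explicit with a short worked example. It is a sketch under the assumption that $\pi_t$ is Gaussian with mean $\bar x_t$ and invertible covariance $P_t$, which is not required in the general theory.

```latex
% Gaussian filtering density: \pi_t = \mathrm{N}(\bar x_t, P_t)
\nabla \log \pi_t(x) = -P_t^{-1}(x - \bar x_t),
\qquad\text{so that}\qquad
\tilde h_t(x) = H\bigl(f(x) + Q\,P_t^{-1}(x - \bar x_t)\bigr).
```

In the limit $R \to 0$ with $H = I$ and a known initial condition, the filtering covariance $P_t$ tends to zero, so the term $Q P_t^{-1}(x - \bar x_t)$ is unbounded and the modified observation map ceases to be defined.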
In order to circumvent these complications, we develop an alternative approach based on an appropriately modified feedback particle filter formulation in the following subsection.

4.1. Generalised Feedback Particle Filter Formulation

While it is clearly possible to apply the standard feedback particle filter formulations using (41), the following alternative formulation avoids the need for approximating the gradient of the log-density.
Lemma 3 (Feedback particle filter with correlated innovation).
Consider the McKean–Vlasov equation
$$d\tilde X_t = f(\tilde X_t)\,dt + G\,dW_t + K_t(\tilde X_t) \circ dI_t + \Omega_t(\tilde X_t)\,dt, \tag{43}$$
where the gain $K_t \in \mathbb{R}^{N_x \times N_y}$ solves
$$\nabla \cdot \left( \tilde\pi_t \left( K_t C - Q H^{\rm T} \right) \right) = -\tilde\pi_t \left( h - \tilde\pi_t[h] \right)^{\rm T}, \qquad \tilde\pi_t = \mathrm{Law}(\tilde X_t), \tag{44}$$
with observation map $h(x) = H f(x)$. The function $\Omega_t$ is given by
$$\Omega_t^i = -\frac{1}{2} \sum_{l=1}^{N_x} \sum_{j=1}^{N_y} \partial_l K_t^{ij}\,(Q H^{\rm T})_{lj}, \qquad i = 1, \ldots, N_x, \tag{45}$$
and the innovation process $I_t$ by
$$dI_t = dY_t - \left( h(\tilde X_t)\,dt + H G\,dW_t + R^{1/2}\,dU_t \right). \tag{46}$$
Here, $W_t$ and $U_t$ denote mutually independent $N_w$-dimensional and $N_y$-dimensional Brownian motions, respectively. Then, $\tilde\pi_t = \mathrm{Law}(\tilde X_t)$ coincides with the solution to (39), provided that the initial distributions agree.
It should be stressed that $W_t$ in (43) and (46) denotes the same Brownian motion, resulting in correlations between the innovation process and model noise.
Proof. 
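In this proof the Einstein summation convention over repeated indices is employed, noting that (44) takes the form
$$\partial_i \left( \tilde\pi_t \left( K_t^{ij} C_{jk} - (Q H^{\rm T})_{ik} \right) \right) = -\tilde\pi_t \left( h^k - \tilde\pi_t[h^k] \right), \qquad k = 1, \ldots, N_y. \tag{47}$$
We begin by writing (43) in its Itô-form,
$$d\tilde X_t = f(\tilde X_t)\,dt + G\,dW_t + K_t(\tilde X_t)\,dI_t + \hat\Omega_t(\tilde X_t)\,dt, \tag{48}$$
where
$$\hat\Omega_t^i = \Omega_t^i + \frac{1}{2} \left( -(\partial_l K_t^{ij}) (Q H^{\rm T})_{lj} + 2 (\partial_l K_t^{ij}) K_t^{lk} C_{kj} \right) = (\partial_l K_t^{ij}) \left( K_t^{lk} C_{kj} - (Q H^{\rm T})_{lj} \right). \tag{49}$$
Here, we have used that the covariation between $K_t$ and $I_t$ satisfies
$$d\langle K^{ij}, I^j \rangle_t = (\partial_l K_t^{ij}) \left( G_{lk}\,d\langle W^k, I^j \rangle_t + K_t^{lk}\,d\langle I^k, I^j \rangle_t \right). \tag{50}$$
Furthermore, $\langle G W, I \rangle_t = -Q H^{\rm T}\,t$ and $\langle I, I \rangle_t = 2 C\,t$.
For a smooth compactly supported test function $\phi$, Itô’s formula implies
$$\phi(\tilde X_t) = \phi(\tilde X_0) + \int_0^t \partial_i \phi(\tilde X_s)\,d\tilde X_s^i + \frac{1}{2} \int_0^t \partial_i \partial_j \phi(\tilde X_s)\,d\langle \tilde X^i, \tilde X^j \rangle_s, \tag{51}$$
where the covariation process is given by
$$\langle \tilde X, \tilde X \rangle_t = t\,Q - \int_0^t \left( K_s H Q + Q H^{\rm T} K_s^{\rm T} \right) ds + 2 \int_0^t K_s C K_s^{\rm T}\,ds. \tag{52}$$
Our aim is to show that $\tilde\pi_t[\phi]$ coincides with $\pi_t[\phi]$ as defined by the Kushner–Stratonovich Equation (39). To this end, we insert (48) and (52) into (51) and take the conditional expectation, arriving at
$$\begin{aligned} \tilde\pi_t[\phi] = {} & \tilde\pi_0[\phi] + \int_0^t \tilde\pi_s[\mathcal{L}\phi]\,ds + \int_0^t \tilde\pi_s \left[ (\partial_i \phi) K_s^{ij} \right] dY_s^j - \int_0^t \tilde\pi_s \left[ (\partial_i \phi) K_s^{ij} h^j \right] ds \\ & + \int_0^t \tilde\pi_s \left[ (\partial_i \phi)\,\hat\Omega_s^i \right] ds + \int_0^t \tilde\pi_s \left[ \partial_i \partial_j \phi \left( K_s ( C K_s^{\rm T} - H Q ) \right)_{ij} \right] ds, \end{aligned} \tag{53}$$
recalling that the generator $\mathcal{L}$ has been defined in (40). Under the assumption that $K_t$ satisfies (44), the two Equations (39) and (53) coincide. Indeed,
$$\tilde\pi_s \left[ (\partial_i \phi) \left( K_s^{ik} C_{kj} - (Q H^{\rm T})_{ij} \right) \right] = \tilde\pi_s \left[ \phi \left( h^j - \tilde\pi_s[h^j] \right) \right] \tag{54}$$
implies
$$\tilde\pi_s[\nabla\phi \cdot K_s] = \left( \tilde\pi_s \left[ \phi h + H Q \nabla\phi - \phi\,\tilde\pi_s[h] \right] \right)^{\rm T} C^{-1}, \tag{55}$$
and the $dY_s$-contributions agree. To verify the same for the $ds$-contributions, we use (44) to obtain
$$\begin{aligned} \tilde\pi_s \left[ (\partial_i \phi) K_s^{ij} \left( h^j - \tilde\pi_s[h^j] \right) \right] &= -\int_{\mathbb{R}^{N_x}} (\partial_i \phi) K_s^{ij}\,\partial_l \left( \tilde\pi_s \left( K_s^{ln} C_{nj} - (Q H^{\rm T})_{lj} \right) \right) dx \\ &= \tilde\pi_s \left[ (\partial_i \phi)\,\hat\Omega_s^i \right] + \tilde\pi_s \left[ \partial_i \partial_j \phi \left( K_s ( C K_s^{\rm T} - H Q ) \right)_{ij} \right]. \end{aligned} \tag{56}$$
Finally, collecting terms in (53) and (56), and applying (55) to the remaining $ds$-contribution, i.e., $-\tilde\pi_s[\nabla\phi \cdot K_s]\,\tilde\pi_s[h]$, leads to the desired result. □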
We note that the correlation between the innovation process $I_t$ and the model error $W_t$ leads to a correction term $\Omega_t$ in (43) which cannot be subsumed into a Stratonovitch correction, in contrast to the standard feedback particle filter formulation (17).
Remark 2.
Assuming that there exist potential functions $\Psi_t = (\psi_t^1, \ldots, \psi_t^{N_y})$, $\psi_t^k: \mathbb{R}^{N_x} \to \mathbb{R}$, solving the Poisson equation(s) (19) (with $\tilde\Pi_t$ being replaced by $\tilde\pi_t$), (44) can be solved by requiring
$$K_t = \left( \nabla \Psi_t + Q H^{\rm T} \right) C^{-1}, \tag{57}$$
thus generalising (18).
Remark 3.
If we set $R = 0$, $H = I$, and $K_t = Q H^{\rm T} C^{-1} = I$ in (43), then one obtains
$$d\tilde X_t = dY_t \tag{58}$$
since $\Omega_t$ vanishes, and all other terms in (43) cancel each other out. If, furthermore, $Y_0 = \tilde X_0 = x_0$ a.s., then $\tilde X_t = Y_t$ for all $t \in [0, T]$, which in turn justifies our assumption that the gain $K_t$ is independent of the state variable. Hence, the McKean–Vlasov formulation (43) reproduces the exact reference trajectory $Y_t$ in the case of no measurement errors and perfectly known initial conditions.
We develop a simplified version of the feedback particle filter formulation (43) for linear SDEs and Gaussian distributions in the following subsection, which will form the basis of the generalised ensemble Kalman–Bucy filter put forward in the follow-up Section 4.3.

4.2. Generalised Kalman–Bucy Filter

Let us assume that $f(x) = F x$ with $F \in \mathbb{R}^{N_x \times N_x}$, i.e., Equations (1) and (3) take the form
$$\begin{aligned} dX_t &= F X_t\,dt + G\,dW_t, \\ dY_t &= H F X_t\,dt + H G\,dW_t + R^{1/2}\,dV_t, \end{aligned} \tag{59}$$
with initial conditions drawn from a Gaussian distribution. In this case $\pi_t$ stays Gaussian for all $t > 0$, i.e., $\pi_t \sim \mathrm{N}(\bar x_t, P_t)$ with $\bar x_t \in \mathbb{R}^{N_x}$, $P_t \in \mathbb{R}^{N_x \times N_x}$. Equation (19) can be solved uniquely by $\nabla_x \Psi = P_t F^{\rm T} H^{\rm T}$, and thus the McKean–Vlasov equations for the feedback particle filter (43) reduce to
$$d\tilde X_t = F \tilde X_t\,dt + G\,dW_t + \left( P_t F^{\rm T} H^{\rm T} + Q H^{\rm T} \right) C^{-1}\,dI_t, \tag{60}$$
with the innovation process (46) leading to
$$dI_t = dY_t - H F \tilde X_t\,dt - H G\,dW_t - R^{1/2}\,dU_t. \tag{61}$$
We take the expectation in (60) and (61) and end up with
$$d\bar x_t = F \bar x_t\,dt + \left( P_t F^{\rm T} + Q \right) H^{\rm T} C^{-1} \left( dY_t - H F \bar x_t\,dt \right). \tag{62}$$
Defining $u_t := \tilde X_t - \bar x_t$, we see that
$$du_t = F u_t\,dt + G\,dW_t - \left( P_t F^{\rm T} + Q \right) H^{\rm T} C^{-1} \left( H F u_t\,dt + H G\,dW_t + R^{1/2}\,dU_t \right). \tag{63}$$
Next we use
$$d(u_t u_t^{\rm T}) = du_t\,u_t^{\rm T} + u_t\,du_t^{\rm T} + d\langle u, u^{\rm T} \rangle_t \tag{64}$$
and $P_t = \mathbb{E}[u_t u_t^{\rm T}]$ to obtain, after some calculations,
$$dP_t = \left( F P_t + P_t F^{\rm T} \right) dt - \left( P_t F^{\rm T} + Q \right) H^{\rm T} C^{-1} H \left( F P_t + Q \right) dt + Q\,dt. \tag{65}$$
Hence we have shown that our McKean–Vlasov formulation (60) agrees with the standard Kalman–Bucy filter equations for the mean and the covariance matrix in the correlated noise case [6].
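The covariance equation of the generalised Kalman–Bucy filter can be explored numerically in the scalar case. The following sketch integrates it with the explicit Euler method; the parameter values and the choice of integrator are assumptions made for illustration.

```python
def riccati_rhs(p, F, q, h, r):
    """Right-hand side of the covariance equation (65) in the scalar case."""
    c = h * q * h + r                            # C = H Q H^T + R
    return 2.0 * F * p - (p * F + q) * h * (1.0 / c) * h * (F * p + q) + q

def integrate_variance(p0, F, q, h, r, dt, n_steps):
    """Explicit Euler integration of dP/dt = riccati_rhs(P)."""
    p = p0
    for _ in range(n_steps):
        p += dt * riccati_rhs(p, F, q, h, r)
    return p

# Exact observations of increments (r = 0, h = 1) and a known initial state (p0 = 0).
p_exact = integrate_variance(0.0, F=-1.0, q=0.5, h=1.0, r=0.0, dt=0.01, n_steps=1000)
# Noisy observations: the variance relaxes to a positive stationary value.
p_noisy = integrate_variance(0.0, F=-1.0, q=0.5, h=1.0, r=0.1, dt=0.01, n_steps=2000)
print(p_exact, p_noisy)
```

For `r = 0` and `h = 1` the right-hand side reduces to $-F^2 P^2/Q$, so a vanishing initial variance stays at zero, in line with Remark 3. For the noisy case above the stationary variance solves $P^2 + 0.2 P - 0.05 = 0$.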

4.3. Ensemble Kalman–Bucy Filter

The McKean–Vlasov Equation (60) for linear systems, along with Gaussian prior and posterior distributions, suggests approximating the feedback particle filter formulation (43) for nonlinear systems by
$$d\tilde X_t = f(\tilde X_t)\,dt + G\,dW_t + \left( P_t^{xh} + Q H^{\rm T} \right) C^{-1}\,dI_t, \tag{66}$$
where the innovation process $I_t$ is given by (46) as before. In other words, we approximate the gain matrix $K_t$ in (43) by the state-independent term $(P_t^{xh} + Q H^{\rm T}) C^{-1}$ with the covariance matrix $P_t^{xh}$ defined by
$$P_t^{xh} = \tilde\pi_t \left[ (x - \bar x_t) \left( h(x) - \tilde\pi_t[h] \right)^{\rm T} \right] = \tilde\pi_t \left[ x \left( h(x) - \tilde\pi_t[h] \right)^{\rm T} \right], \tag{67}$$
where $\tilde\pi_t$ denotes the law of $\tilde X_t$.
We can now generalise the ensemble Kalman–Bucy filter formulation (31) for the pure parameter estimation problem to the state estimation problem with correlated noise. We assume that $M$ initial state values $\tilde X_0^i$ have been sampled from an initial distribution $\pi_0$ or, alternatively, $\tilde X_0^i = x_0$ for all $i = 1, \ldots, M$ in case the initial condition is known exactly. These state values are then propagated under the time-stepping procedure
$$\tilde X_{n+1}^i = \tilde X_n^i + \Delta t\,f(\tilde X_n^i) + \Delta t^{1/2}\,G\,\Theta_n^i + \left( \hat P_n^{xh} + Q H^{\rm T} \right) \left( C + \Delta t\,\hat P_n^{hh} \right)^{-1} \Delta I_n^i \tag{68}$$
with $\Theta_n^i \sim \mathrm{N}(0, I)$, step size $\Delta t > 0$, empirical covariance matrices
$$\begin{aligned} \hat P_n^{xh} &= \frac{1}{M-1} \sum_{i=1}^{M} \tilde X_n^i \left( h(\tilde X_n^i) - \bar h_n^M \right)^{\rm T}, \qquad \bar h_n^M = \frac{1}{M} \sum_{i=1}^{M} h(\tilde X_n^i), \\ \hat P_n^{hh} &= \frac{1}{M-1} \sum_{i=1}^{M} h(\tilde X_n^i) \left( h(\tilde X_n^i) - \bar h_n^M \right)^{\rm T}, \end{aligned} \tag{69}$$
and innovation increments $\Delta I_n^i$ given by
$$\Delta I_n^i = \Delta Y_n - \Delta t\,h(\tilde X_n^i) - \Delta t^{1/2}\,H G\,\Theta_n^i - \Delta t^{1/2}\,R^{1/2}\,\Xi_n^i, \qquad \Xi_n^i \sim \mathrm{N}(0, I). \tag{70}$$
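A scalar sketch of this time-stepping procedure is given below. It assumes a one-dimensional state and observation, so that all matrices become numbers, and it reuses the same sample $\Theta_n^i$ in the model noise and in the innovation, which is the source of the correlation. The test case with `r = 0`, `h = 1` and a known initial condition is chosen because the ensemble should then follow the observed path, as stated in Remark 3. All function names and numerical values are hypothetical.

```python
import math
import random

def enkbf_state_step(X, dY, dt, f, g, h, r, rng):
    """One step of the scalar ensemble Kalman-Bucy filter with correlated noise."""
    M = len(X)
    q = g * g
    c = h * q * h + r                                  # C = H Q H^T + R
    hx = [h * f(x) for x in X]                         # observation map h(x) = H f(x)
    h_bar = sum(hx) / M
    p_xh = sum(x * (v - h_bar) for x, v in zip(X, hx)) / (M - 1)
    p_hh = sum(v * (v - h_bar) for v in hx) / (M - 1)
    gain = (p_xh + q * h) / (c + dt * p_hh)
    X_new = []
    for x, v in zip(X, hx):
        theta, xi = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # Innovation: theta is shared with the model noise term below.
        dI = dY - dt * v - math.sqrt(dt) * h * g * theta - math.sqrt(dt * r) * xi
        X_new.append(x + dt * f(x) + math.sqrt(dt) * g * theta + gain * dI)
    return X_new

rng = random.Random(0)
f = lambda x: -x                                       # assumed linear drift
g, dt = 0.5, 0.01
X = [1.0] * 20                                         # known initial condition x_0 = 1
y = 1.0
for _ in range(200):
    dY = f(y) * dt + g * math.sqrt(dt) * rng.gauss(0.0, 1.0)   # exact increments, R = 0
    X = enkbf_state_step(X, dY, dt, f, g, h=1.0, r=0.0, rng=rng)
    y += dY
print(max(abs(x - y) for x in X))
```

In this limiting case the gain equals one and the simulated drift and noise cancel against the innovation, so each particle is advanced by the observed increment alone.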
The McKean–Vlasov equations of this section form the basis for the methods proposed for the combined state and parameter estimation problem to be considered next.

5. Combined State and Parameter Estimation

We now return to the combined state and parameter estimation problem, and consider the augmented dynamics
$$\begin{aligned} dX_t &= f(X_t, A_t)\,dt + G\,dW_t, \\ dA_t &= 0, \end{aligned} \tag{71}$$
with observations (3) as before. The initial conditions satisfy $X_0 = x_0$ a.s., and $A_0 \sim \Pi_0$. Let us introduce the extended state space variable $Z_t = (X_t^{\rm T}, A_t^{\rm T})^{\rm T}$. In terms of $Z_t$, the Equations (3) and (71) take the form
$$\begin{aligned} dZ_t &= \bar f(Z_t)\,dt + \bar G\,dW_t, \\ dY_t &= \bar H\,dZ_t + R^{1/2}\,dV_t, \end{aligned} \tag{72}$$
with
$$\bar f(z) = \begin{pmatrix} f(x, a) \\ 0 \end{pmatrix}, \qquad \bar G = \begin{pmatrix} G & 0 \\ 0 & 0 \end{pmatrix}, \qquad \bar H = \begin{pmatrix} H & 0 \end{pmatrix}. \tag{73}$$
Thus we end up with an augmented state estimation problem of the general structure considered in detail in Section 4 already. Below we provide details on some of the necessary modifications.

5.1. Feedback Particle Filter Formulation

The appropriately extended feedback particle filter Equation (43) leads to
$$\begin{aligned} d\tilde X_t &= f(\tilde X_t, \tilde A_t)\,dt + G\,dW_t + \left( \nabla_x \Psi_t(\tilde X_t, \tilde A_t) + Q H^{\rm T} \right) C^{-1} \circ dI_t + \Omega_t(\tilde X_t, \tilde A_t)\,dt, \\ d\tilde A_t &= \nabla_a \Psi_t(\tilde X_t, \tilde A_t)\,C^{-1} \circ dI_t, \end{aligned} \tag{74}$$
where (46) takes the form
$$dI_t = dY_t - \left( h(\tilde X_t, \tilde A_t)\,dt + H G\,dW_t + R^{1/2}\,dU_t \right) \tag{75}$$
with observation map (4) and correction $\Omega_t$ given by (45), with $Q$ replaced by $\bar Q = \bar G \bar G^{\rm T}$ and $H$ by $\bar H$. In the Poisson equation(s) (19), $\tilde\Pi_t$ is replaced by $\tilde\pi_t$ denoting the joint density of $(\tilde X_t, \tilde A_t)$. We also stress that $\Psi_t$ becomes a function of $x$ and $a$, and we distinguish between gradients with respect to $x$ and $a$ using the notation $\nabla_x$ and $\nabla_a$, respectively.
Numerical implementations of the extended feedback particle filter are demanding due to the need for solving the Poisson equation(s) (19). Instead, we again rely on the ensemble Kalman–Bucy filter approximation, which we describe next.

5.2. Ensemble Kalman–Bucy Filter

We approximate the joint density $\tilde \pi_t$ of $\tilde Z_t$ by an ensemble of particles
$$\tilde Z_t^i = \begin{pmatrix} \tilde X_t^i \\ \tilde A_t^i \end{pmatrix},$$
that is,
$$\tilde \pi_t \approx \frac{1}{M} \sum_{i=1}^M \delta_{\tilde Z_t^i},$$
where $\delta_z$ denotes the Dirac delta function centred at $z$. The initial ensemble satisfies $X_0^i = x_0$ for all $i = 1, \ldots, M$, and the initial parameter values $A_0^i$ are independent draws from the prior distribution $\Pi_0$.
At the same time, we make the approximation $\tilde Z_t \sim \mathrm{N}(\bar z_t^M, \hat P_t^{zz})$ when dealing with the Kalman gain of the feedback particle filter. Here the empirical mean $\bar z_t^M$ has components
$$\bar x_t^M = \frac{1}{M} \sum_{i=1}^M \tilde X_t^i, \qquad \bar a_t^M = \frac{1}{M} \sum_{i=1}^M \tilde A_t^i,$$
and the joint empirical covariance matrix is given by
$$\hat P_t^{zz} = \frac{1}{M-1} \sum_{i=1}^M \tilde Z_t^i \left( \tilde Z_t^i - \bar z_t^M \right)^{\mathrm T} = \begin{pmatrix} \hat P_t^{xx} & \hat P_t^{xa} \\ (\hat P_t^{xa})^{\mathrm T} & \hat P_t^{aa} \end{pmatrix}.$$
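As in Section 4.3, the solution to (19) can be approximated by
$$\nabla_x \Psi_t = P_t^{xh}, \qquad \nabla_a \Psi_t = P_t^{ah},$$
where, finally, the covariance matrices $P_t^{xh}$ and $P_t^{ah}$ are estimated by their empirical counterparts
$$\begin{aligned} \hat P_t^{xh} &= \frac{1}{M-1} \sum_{i=1}^M \tilde X_t^i \left( h(\tilde X_t^i, \tilde A_t^i) - \bar h_t^M \right)^{\mathrm T},\\ \hat P_t^{ah} &= \frac{1}{M-1} \sum_{i=1}^M \tilde A_t^i \left( h(\tilde X_t^i, \tilde A_t^i) - \bar h_t^M \right)^{\mathrm T}, \end{aligned}$$
with $\bar h_t^M$ defined by
$$\bar h_t^M = \frac{1}{M} \sum_{i=1}^M h(\tilde X_t^i, \tilde A_t^i).$$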
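As a short check, added here as a worked identity and not taken from the original derivation: the first factor in the empirical estimators above is not centred, yet each estimator coincides with the usual centred sample covariance, because the deviations $h^i - \bar h_t^M$ sum to zero. Writing $h^i = h(\tilde X_t^i, \tilde A_t^i)$ as shorthand (an abbreviation introduced only for this remark),

$$\frac{1}{M-1} \sum_{i=1}^M \left( \tilde X_t^i - \bar x_t^M \right) \left( h^i - \bar h_t^M \right)^{\mathrm T} = \hat P_t^{xh} - \frac{1}{M-1}\, \bar x_t^M \Big( \underbrace{\sum_{i=1}^M \left( h^i - \bar h_t^M \right)}_{=\,0} \Big)^{\mathrm T} = \hat P_t^{xh}.$$

The same argument applies to $\hat P_t^{ah}$ with $\tilde A_t^i$ and $\bar a_t^M$ in place of $\tilde X_t^i$ and $\bar x_t^M$.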
Summing everything up, we obtain the following generalised ensemble Kalman–Bucy filter equations
$$\begin{aligned} d\tilde X_t^i &= f(\tilde X_t^i, \tilde A_t^i)\,dt + G\,dW_t^i + \left( \hat P_t^{xh} + Q H^{\mathrm T} \right) C^{-1}\, dI_t^i,\\ d\tilde A_t^i &= \hat P_t^{ah}\, C^{-1}\, dI_t^i, \end{aligned} \tag{83}$$
where the innovations are given by
$$dI_t^i = dY_t - \left( h(\tilde X_t^i, \tilde A_t^i)\,dt + H G\,dW_t^i + R^{1/2}\,dU_t^i \right), \tag{84}$$
and $W_t^i$ and $U_t^i$ denote independent $N_x$-dimensional and $N_y$-dimensional Brownian motions, respectively, for $i = 1, \ldots, M$.
The interacting particle equations (83) can be time-stepped along the lines discussed in Section 4.3 for the pure state estimation formulation of the ensemble Kalman–Bucy filter.
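To make the structure of one time step concrete, the following is a minimal sketch of an Euler–Maruyama step of (83) with innovation (84). It is written under simplifying assumptions that are not part of the general formulation: a scalar state and a scalar parameter, the linear drift $f(x, a) = a x$, and $H = 1$, so that $h = f$ and $C = Q + R$. The function name `enkbf_step` and its argument names are hypothetical.

```python
import math
import random


def enkbf_step(xs, a_s, dy, dt, Q, R, rng):
    """One Euler-Maruyama step of (83)-(84), scalar case.

    Assumptions of this sketch: f(x, a) = a * x, H = 1, Q > 0.
    xs, a_s : lists holding the state and parameter ensembles
    dy      : observed increment Delta Y_n
    """
    M = len(xs)
    C = Q + R  # C = H Q H^T + R with H = 1
    hs = [a * x for x, a in zip(xs, a_s)]  # h(x, a) = H f(x, a)
    h_bar = sum(hs) / M
    # Empirical covariances; the first factor is left uncentred as in the text.
    p_xh = sum(x * (h - h_bar) for x, h in zip(xs, hs)) / (M - 1)
    p_ah = sum(a * (h - h_bar) for a, h in zip(a_s, hs)) / (M - 1)
    sqrt_dt = math.sqrt(dt)
    new_xs, new_as = [], []
    for x, a, h in zip(xs, a_s, hs):
        # The same model noise enters the forecast and the innovation.
        model_noise = math.sqrt(Q) * sqrt_dt * rng.gauss(0.0, 1.0)
        obs_noise = math.sqrt(R) * sqrt_dt * rng.gauss(0.0, 1.0)
        innovation = dy - (h * dt + model_noise + obs_noise)
        new_xs.append(x + h * dt + model_noise + (p_xh + Q) / C * innovation)
        new_as.append(a + p_ah / C * innovation)
    return new_xs, new_as
```

Reusing `model_noise` in both the forecast and the innovation is what encodes the correlation between model and measurement errors; drawing two independent samples there would implement a different filter.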

6. Numerical Results

We now apply the generalised ensemble Kalman–Bucy filter formulation (83) with innovation (84) to five different model scenarios.

6.1. Parameter Estimation for the Ornstein–Uhlenbeck Process

Our first example is provided by the Ornstein–Uhlenbeck process
$$dX_t = a X_t\,dt + Q^{1/2}\,dW_t \tag{85}$$
with unknown parameter $a \in \mathbb{R}$ and known initial condition $X_0 = 1/2$. We assume an observation model of the form (3) with $H = 1$ and a measurement error taking values $R = 0.01$, $R = 0.0001$, and $R = 0$. The model error variance is set to either $Q = 0.5$ or $Q = 0.005$. Except for the case $R = 0$, a combined state and parameter estimation problem is to be solved. We implement the ensemble Kalman–Bucy filter (Section 5.2) with innovation (84), step size $\Delta t = 0.005$, and ensemble size $M = 1000$. The data is generated using the Euler–Maruyama method applied to (85), with $a = -1/2$, integrated over the time interval $[0, 500]$ with the same step size. The prior distribution $\Pi_0$ for the parameter is Gaussian with mean $\bar a = 1/2$ and variance $\sigma_a^2 = 2$. The results can be found in Figure 1. We find that the ensemble Kalman–Bucy filter successfully identifies the unknown parameter under all tested experimental settings, except for the largest measurement error, $R = 0.01$, where a small systematic offset of the estimated parameter value can be observed. One can also see that the variance in the parameter estimate decreases monotonically in time in all cases, while the variance in the state estimates approximately reaches a steady state.
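The pure parameter estimation case ($R = 0$, $H = 1$) can be sketched in a few lines. The code below is an illustration under stated assumptions, not the implementation used for Figure 1: it assumes that all state particles coincide with the observed path when $R = 0$ (consistent with the vanishing state variance reported for that case), so only the parameter ensemble is propagated. The horizon and ensemble size are reduced to keep the run short, and all function names are hypothetical.

```python
import math
import random


def simulate_ou(a, q, x0, dt, n_steps, rng):
    """Euler-Maruyama path of dX = a X dt + sqrt(q) dW."""
    xs = [x0]
    for _ in range(n_steps):
        x = xs[-1]
        xs.append(x + a * x * dt + math.sqrt(q * dt) * rng.gauss(0.0, 1.0))
    return xs


def estimate_parameter(xs, q, dt, m, prior_mean, prior_var, rng):
    """Parameter part of the ensemble Kalman-Bucy filter for R = 0, H = 1."""
    a_ens = [rng.gauss(prior_mean, math.sqrt(prior_var)) for _ in range(m)]
    for n in range(len(xs) - 1):
        x, dy = xs[n], xs[n + 1] - xs[n]
        a_bar = sum(a_ens) / m
        # P^{ah} for h(x, a) = a x with a common state x.
        p_ah = x * sum(a * (a - a_bar) for a in a_ens) / (m - 1)
        gain = p_ah / q  # C = Q because H = 1 and R = 0
        a_ens = [
            a + gain * (dy - a * x * dt - math.sqrt(q * dt) * rng.gauss(0.0, 1.0))
            for a in a_ens
        ]
    return a_ens


def run_demo(seed=1):
    """Illustrative settings only: shorter horizon and smaller ensemble than in the text."""
    rng = random.Random(seed)
    path = simulate_ou(a=-0.5, q=0.5, x0=0.5, dt=0.01, n_steps=5000, rng=rng)
    ens = estimate_parameter(path, q=0.5, dt=0.01, m=100,
                             prior_mean=0.5, prior_var=2.0, rng=rng)
    mean = sum(ens) / len(ens)
    var = sum((a - mean) ** 2 for a in ens) / (len(ens) - 1)
    return mean, var
```

Calling `run_demo()` returns the ensemble mean and variance of the parameter at the final time; the mean should lie close to the value $a = -1/2$ used to generate the data, and the variance well below the prior variance.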

6.2. Averaging

Consider the equations
$$\begin{aligned} dY_t &= \left( 1 - Z_t^2 \right) Y_t\,dt + Q^{1/2}\,dW_t^y,\\ dZ_t &= -\frac{\alpha}{\epsilon} Z_t\,dt + \sqrt{\frac{2\lambda}{\epsilon}}\,dW_t^z \end{aligned} \tag{86}$$
from [19] for $\lambda, \alpha, \gamma, \epsilon > 0$, and initial condition $Y_0 = 1/2$, $Z_0 = 0$. The reduced equations in the limit $\epsilon \to 0$ are given by (85), with parameter value
$$a = 1 - \frac{\lambda}{\alpha} \tag{87}$$
and initial condition $X_0 = 1/2$. The reduced dynamics corresponds to a (stable) Ornstein–Uhlenbeck process for $\lambda / \alpha > 1$. We wish to estimate the parameter $a$ from observed increments
$$\Delta Y_n = Y_{n+1} - Y_n + \Delta t^{1/2} R^{1/2}\, \Xi_n, \qquad \Xi_n \sim \mathrm{N}(0, 1), \tag{88}$$
where the sequence $\{Y_n\}_{n \ge 0}$ is obtained by time-stepping (86) using the Euler–Maruyama method with a step size $\Delta t$. We set $\lambda = 3$, $\alpha = 2$ (so that $a = -1/2$), $Q = 0.5$, and $\epsilon \in \{0.1, 0.01\}$ in our experiments. The measurement noise is set to $R = 0.01$ or $R = 0$ (pure parameter estimation).
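The value of the reduced parameter can be checked by a short calculation, sketched here using the standard stationary variance of an Ornstein–Uhlenbeck process. The fast variable in (86) has decay rate $\alpha / \epsilon$ and squared noise amplitude $2\lambda / \epsilon$, so its stationary distribution is $\mathrm{N}(0, \lambda / \alpha)$, independently of $\epsilon$. Averaging the drift of the slow equation over this distribution gives

$$\mathbb{E}\left[ Z^2 \right] = \frac{2\lambda / \epsilon}{2\, \alpha / \epsilon} = \frac{\lambda}{\alpha}, \qquad a = \mathbb{E}\left[ 1 - Z^2 \right] = 1 - \frac{\lambda}{\alpha}, \qquad \lambda = 3,\ \alpha = 2 \ \Rightarrow\ a = 1 - \frac{3}{2} = -\frac{1}{2}.$$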
We implement the ensemble Kalman–Bucy filter (83) with innovation (84), step size $\Delta t = \epsilon / 50$, and ensemble size $M = 1000$ for the reduced Equation (87). The data is generated from an Euler–Maruyama discretisation of (86) with the same step size. We also investigate the effect of subsampling the observations for $\epsilon = 0.01$ by solving (86) with step size $\Delta t = \epsilon / 50$ and storing only every tenth solution $Y_n$, while the reduced equations and the ensemble Kalman–Bucy filter equations are integrated with $\Delta t = \epsilon / 5$. The results are shown in Figure 2. Figure 3 shows the results for the same experiments repeated with a smaller ensemble size of $M = 10$. We find that the smaller ensemble size leads to noisier estimates for the variance in $\tilde X_n$ and a faster decay of the variance in $\tilde A_n$, but the estimated parameter values are equally well converged. Subsampling does not lead to significant changes in the estimated parameter values. This is in contrast to the example considered next.
Finally, we mention [31] for alternative approaches to sequential estimation in the context of averaging, which, however, rely on different assumptions on the data.

6.3. Homogenisation

In this example, the data is produced by integrating the multi-scale SDE
$$\begin{aligned} dY_t &= \left( \frac{\sqrt{\sigma / 2}}{\epsilon} Z_t + a Y_t \right) dt,\\ dZ_t &= -\frac{1}{\epsilon^2} Z_t\,dt + \frac{\sqrt{2}}{\epsilon}\,dW_t^z \end{aligned}$$
with parameter values $\epsilon = 0.1$, $a = -1/2$, $\sigma = 1/2$, and initial condition $Y_0 = 1/2$, $Z_0 = 0$. Here, $W_t^z$ denotes standard Brownian motion. The equations are discretised with step size $\Delta \tau = \epsilon^2 / 50 = 0.0002$, and the resulting increments (88) are stored over a time interval $[0, 500]$. See [32] for more details.
According to homogenisation theory, the reduced model is given by (85) with $Q = \sigma$, and we wish to estimate the parameter $a$ from the data $\{\Delta Y_n\}$ produced according to (88). It is known that a standard maximum likelihood estimator (MLE) given by
$$a_{\mathrm{ML}} = \frac{\sum_n Y_{t_n} \left( Y_{t_{n+1}} - Y_{t_n} \right)}{\sum_n Y_{t_n}^2\, \Delta \tau} \tag{90}$$
leads to $a_{\mathrm{ML}} = 0$ in the limit $\Delta \tau \to 0$ and observation interval $T \to \infty$. This MLE corresponds to $H = I$ and $R = 0$ in our extended state space formulation of the problem. Subsampling can be achieved by choosing an appropriate time step $\Delta t > \Delta \tau$ in the ensemble Kalman–Bucy filter equations and a corresponding subsampling of the data points $Y_n$ in (88). We used $\Delta t = 50 \Delta \tau = 0.01$ and $\Delta t = 500 \Delta \tau = 0.1$, respectively. The results can be found in Figure 4. It can be seen that only the larger subsampling leads to a correct estimate of the parameter $a$. This is in line with known results for the maximum likelihood estimator (90). See [32] and references therein.
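The estimator (90) and its subsampled variant are simple to write down. The sketch below is illustrative only; the function name `mle_drift` and the `stride` argument are hypothetical, with `stride = k` meaning that only every $k$-th data point is kept and the effective step becomes $k \Delta \tau$.

```python
def mle_drift(ys, dtau, stride=1):
    """Maximum likelihood drift estimate (90) from a path ys sampled with step dtau.

    stride > 1 subsamples the path, so increments span stride * dtau.
    """
    sub = ys[::stride]
    step = dtau * stride
    numerator = sum(y0 * (y1 - y0) for y0, y1 in zip(sub, sub[1:]))
    denominator = sum(y0 * y0 for y0 in sub[:-1]) * step
    return numerator / denominator
```

For a noise-free path generated by $Y_{n+1} = Y_n (1 + a \Delta \tau)$ the estimator with `stride=1` returns $a$ up to rounding, which is a convenient sanity check before applying it to multi-scale data.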

6.4. Nonparametric Drift and State Estimation

We consider nonparametric drift estimation for one-dimensional SDEs over a periodic domain $[0, 2\pi)$ in the setting considered from a theoretical perspective in [33]. There, a zero-mean Gaussian process prior $\mathrm{GP}(0, D^{-1})$ is placed on the unknown drift function, with inverse covariance operator
$$D := \eta \left[ (-\Delta)^p + \kappa I \right].$$
The integer parameter $p$ sets the regularity of the process, whereas $\eta, \kappa \in \mathbb{R}^+$ control its characteristic correlation length and stationary variance.
Spatial discretisation of the problem is carried out by first defining a grid of $N_d$ evenly spaced points on the domain, at locations $x_i = i \Delta x$, $\Delta x = 2\pi / N_d$. The drift function is projected onto compactly supported functions centred at these points, which are piecewise linear with
$$b_i(x_j) = \delta_{ij},$$
and linear interpolation is used to define a drift function $f(x, a)$ for all $x \in [0, 2\pi)$, that is, it is of the form (2) with $f_0(x) \equiv 0$. In this example, we set $N_d = 200$. Sample realisations, as well as the reference drift $f^*$, can be found in Figure 5a.
Data is generated by integrating the SDE (1) with drift $f^*$ forward in time from initial condition $X_0 = \pi$ and with noise level $Q = 0.1$, using the Euler–Maruyama discretisation with step size $\Delta t = 0.1$ over one million time steps. The spatial distribution of the solutions $X_n$ is plotted in Figure 5b. The data is then given by
$$\Delta Y_n = X_{n+1} - X_n + \Delta t^{1/2} R^{1/2}\, \Xi_n$$
with $R = 0.00001$. Data assimilation is performed using the time-discretised ensemble Kalman–Bucy filter Equation (83) with innovation (84), ensemble size $M = 200$, and step size $\Delta t = 0.1$.
The final estimate of the drift function (ensemble mean) and the ensemble of drift functions can be found in Figure 5c. Figure 5d displays the ensemble of state estimates and the value of the reference solution at the final time. We find that the ensemble Kalman–Bucy filter is able to successfully estimate the drift function and the model states. Further experiments reveal that the drift function can only be identified for sufficiently small measurement errors.
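The piecewise linear representation of the drift can be sketched as follows. This is a minimal illustration that assumes the grid index runs over $i = 0, \ldots, N_d - 1$ and that the coefficient vector `a` holds the nodal values of the drift; the function name `drift` is hypothetical.

```python
import math


def drift(x, a):
    """Evaluate f(x, a) = sum_i a_i b_i(x) with periodic hat functions on [0, 2*pi).

    a[i] is the value of the drift at the grid point x_i = i * dx.
    """
    n = len(a)
    dx = 2.0 * math.pi / n
    s = (x % (2.0 * math.pi)) / dx  # position measured in grid cells
    cell = math.floor(s)
    w = s - cell                    # weight of the right-hand node
    i = int(cell) % n
    return (1.0 - w) * a[i] + w * a[(i + 1) % n]
```

Because each hat function satisfies $b_i(x_j) = \delta_{ij}$, evaluating at a grid point returns the corresponding coefficient, and the index arithmetic modulo $N_d$ enforces periodicity across $x = 2\pi$.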

6.5. SPDE Parameter Estimation

Consider the stochastic heat equation on the periodic domain $x \in [0, 2\pi)$, given in conservative form by the stochastic partial differential equation (SPDE)
$$du(x, t) = \nabla \cdot \left( \theta(x) \nabla u(x, t) \right) dt + \sigma^{1/2}\, dW(x, t), \tag{94}$$
where $W(x, t)$ is space-time white noise. With constant $\theta(x) = \theta$, this SPDE reduces to
$$du(x, t) = \theta\, \Delta u(x, t)\, dt + \sigma^{1/2}\, dW(x, t). \tag{95}$$
In this example, we examine the estimation of $\theta$ from incremental measurements of a locally averaged quantity $q(x, t)$ that arises naturally in a standard finite volume discretisation of (95).
To discretise the system, one first defines $q_t^i = q(x_i, t)$ around $N_d = 200$ grid points $x_i$ on a regular grid, separated by distances $\Delta x$, as
$$q_t^i = \int_{x_i - \Delta x / 2}^{x_i + \Delta x / 2} u(x, t)\, dx.$$
The conservative (drift) term in (94) reduces to
$$\int_{x_i - \Delta x / 2}^{x_i + \Delta x / 2} \nabla \cdot \left( \theta(x) \nabla u(x, t) \right) dx = \theta^{i + 1/2}\, \nabla u_t^{i + 1/2} - \theta^{i - 1/2}\, \nabla u_t^{i - 1/2},$$
where $\theta^{i \pm 1/2} \equiv \theta(x_i \pm \Delta x / 2)$, etc. The following standard finite difference approximations
$$\nabla u_t^{i + 1/2} \approx \frac{u_t^{i+1} - u_t^i}{\Delta x}, \qquad u_t^i \approx \Delta x^{-1} q_t^i$$
yield the $N_d$-dimensional SDE
$$dq_t^i = \theta\, \frac{q_t^{i+1} - 2 q_t^i + q_t^{i-1}}{\Delta x^2}\, dt + \sigma^{1/2} \Delta x^{1/2}\, dW_t^i$$
for constant $\theta$, where $W_t^i$ are independent one-dimensional Brownian motions in time.
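A minimal sketch of the resulting scheme, for constant $\theta$ and periodic boundary conditions, is given below. The function names are hypothetical, and the bound returned by `max_stable_dt` is the standard explicit-Euler limit for the deterministic heat equation, stated here as an assumption about which stability condition is meant.

```python
import math
import random


def heat_drift(q, theta, dx):
    """Drift theta * (q[i+1] - 2 q[i] + q[i-1]) / dx^2 with periodic indexing."""
    n = len(q)
    return [theta * (q[(i + 1) % n] - 2.0 * q[i] + q[i - 1]) / dx ** 2
            for i in range(n)]


def euler_maruyama_step(q, theta, sigma, dx, dt, rng):
    """One Euler-Maruyama step of the finite volume SDE for the cell averages q."""
    amplitude = math.sqrt(sigma * dx * dt)  # sigma^{1/2} dx^{1/2} dt^{1/2}
    return [qi + di * dt + amplitude * rng.gauss(0.0, 1.0)
            for qi, di in zip(q, heat_drift(q, theta, dx))]


def max_stable_dt(theta, dx):
    """Explicit-Euler stability limit dx^2 / (2 theta) for the diffusion term."""
    return dx ** 2 / (2.0 * theta)
```

The drift terms sum to zero over the periodic grid, which reflects the conservative form of the equation: without noise, the total of the cell integrals $q_t^i$ is preserved by each step.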
Following recent results from [20], we consider the estimation of a constant value $a = \theta$ from measurements $dq_t^*$ at a fixed location/index $j^* \in \{1, \ldots, N_d\}$. The data trajectory is thus given by
$$dY_t = dq_t^* + R^{1/2}\, dV_t, \tag{100}$$
where $R^{1/2}$ is a scalar and $V_t$ is a standard Brownian motion in one dimension. We perform numerical experiments in which the initial state $q_0^i$ is set to zero for all indices $i$ and the prior on the unknown parameter $a = \theta$ is uniform over the interval $[0.2, 1.8]$.
The increment data is generated by first integrating (95) forward in time from the known initial condition $q^i(0) = 0$ for all $i$. The equation is discretised in time using the Euler–Maruyama method. It is known that $\Delta t < \Delta x^2 / (2\theta)$ is required for stability of the Euler–Maruyama discretisation; we use the much smaller time step $\Delta t = \Delta x^2 / 80$. The solution is sampled with this same time step, and increment measurements are approximated at time $t_n$ by setting the measurement noise level $R$ to zero in (100), resulting in
$$\Delta Y_n = q_{n+1}^* - q_n^*.$$
Note that the associated model error in (1) is given by $G = \sigma^{1/2} \Delta x^{1/2} I$, and the matrix $H$ in (3) projects the vector of state increments onto a single component with index $j^* = N_d / 2$. Simulations are performed over the time interval $[0, 20]$. The results can be found in Figure 6a. We also compute the model evidence for a sequence of parameter values $\theta \in \{0.2, 0.3, \ldots, 1.8\}$ based on a standard Kalman–Bucy filter [6] for the associated linear state estimation problem. See Figure 6b. Both approaches agree with the reference value $\theta = 1$.

6.6. Discussion

The results presented here demonstrate that the proposed methodology can be applied to a broad range of continuous-time state and parameter estimation problems with correlated measurement and model errors. Alternatively, one could have employed standard SMC or MCMC methods utilising the modified observation model (41) as implied by the Kushner–Stratonovich formulation (39) of the filtering problem. However, such implementations require the approximation of the additional $Q \nabla \log \pi_t$ term, which is nontrivial if only samples from $\pi_t$ are available. Furthermore, the behaviour of such implementations in the limit $R \to 0$ and $H = I$ (pure parameter estimation problem) is unclear, since $\pi_t$ degenerates into a Dirac delta distribution, potentially leading to numerical difficulties in this singular regime. The proposed generalised feedback particle filter formulation avoids these issues through the use of stochastic innovations which are correlated with the model noise. In other words, the distribution $\pi_t$ does not appear explicitly in the innovation process (46), and the correlated noise terms cancel each other out, as discussed in Remark 3 for $R = 0$ and $H = I$. The main computational challenge of the feedback particle filter approach is the need to find the Kalman gain matrix (57). However, the constant gain ensemble Kalman–Bucy approximation
$$K_t \approx \left( P_t^{xh} + Q H^{\mathrm T} \right) C^{-1}$$
is easy to implement. In fact, the only differences from the standard ensemble Kalman–Bucy filter formulation of [14] are the additional $Q H^{\mathrm T}$ term in the Kalman gain and a correlation between the stochastic innovation process and the model error. While the ensemble Kalman–Bucy filter gave rather satisfactory results for the numerical experiments displayed in Section 6, strongly non-Gaussian distributions might require more accurate approximations to the Kalman gain matrix (57). In that case, one could rely on the particle-based diffusion map approximation considered in [27].
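The cancellation of the correlated noise terms can be made explicit for the ensemble Kalman–Bucy filter (83) with innovation (84). The following is a formal sketch that assumes $R = 0$, $H = I$, and an invertible $Q$, so that $C = Q$ and $h = f$:

$$d\tilde X_t^i = f\,dt + G\,dW_t^i + \left( \hat P_t^{xh} + Q \right) Q^{-1} \left( dY_t - f\,dt - G\,dW_t^i \right) = dY_t + \hat P_t^{xh} Q^{-1} \left( dY_t - f\,dt - G\,dW_t^i \right),$$

where $f$ is evaluated at $(\tilde X_t^i, \tilde A_t^i)$. The model noise $G\,dW_t^i$ thus no longer appears in the leading term. If, in addition, all particles start from the same known state $x_0$, then $\hat P_t^{xh}$ vanishes initially (its first factor is common to all particles and the deviations of $h$ sum to zero) and stays zero, so that formally $\tilde X_t^i = x_0 + Y_t - Y_0$ for every $i$. This is consistent with the zero state variance reported for the $R = 0$ experiments.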

7. Conclusions

In this paper, we have derived McKean–Vlasov equations for combined state and parameter estimation from continuously observed state increments. An approximate and robust implementation of these McKean–Vlasov equations in the form of a generalised ensemble Kalman–Bucy filter has been provided and applied to a range of increasingly complex model systems. Future work will address the treatment of temporally-correlated measurement and model errors, as well as a rigorous analysis of these McKean–Vlasov equations in the contexts of multi-scale dynamics and nonparametric drift estimation.

Author Contributions

Methodology, N.N. and S.R.; software, S.R. and P.J.R.; validation, N.N., S.R. and P.J.R.; writing—original draft preparation, N.N., S.R.; writing—review and editing, N.N., S.R. and P.J.R.

Funding

This research has been partially funded by Deutsche Forschungsgemeinschaft (DFG) through grants CRC 1294 ‘Data Assimilation’ (project A06) and CRC 1114 ‘Scaling Cascades’ (project A02).

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Appendix A. The Filtering Equations for Correlated Noise

In this appendix, we outline a derivation of the Kushner–Stratonovich Equation (39) for the signal-observation dynamics given by (38). In fact, we only compute the evolution equation (termed modified Zakai equation) for the unnormalised filtering distribution $\rho_t[\phi] = \mathbb{E}\left[ l_t\, \phi(X_t) \mid Y_{[0,t]} \right]$, where the likelihood $l_t$ is given by
$$l_t \equiv l(Y_{[0,t]} \mid X_{[0,t]}) = \exp\left( \int_0^t f(X_s)^{\mathrm T} H^{\mathrm T} C^{-1}\, dY_s - \frac{1}{2} \int_0^t f(X_s)^{\mathrm T} H^{\mathrm T} C^{-1} H f(X_s)\, ds \right).$$
Obtaining the Kushner–Stratonovich formulation is then standard, applying Itô's formula to the Kallianpur–Striebel formula $\pi_t[\phi] = \rho_t[\phi] / \rho_t[1]$; see ([7], Chapter 3). The following result is in agreement with Corollaries 3.39 and 3.40 in [7].
Lemma A1.
The modified Zakai equation is given by
$$\rho_t[\phi] = \rho_0[\phi] + \int_0^t \rho_s[\mathcal{L} \phi]\, ds + \int_0^t \rho_s\!\left[ \phi\, f^{\mathrm T} H^{\mathrm T} C^{-1} \right] dY_s + \int_0^t \rho_s\!\left[ \nabla \phi^{\mathrm T} Q H^{\mathrm T} C^{-1} \right] dY_s,$$
where the generator $\mathcal{L}$ has been defined in (40).
Proof. 
For convenience, let us define the process
$$M_t = \int_0^t f(X_s)^{\mathrm T} H^{\mathrm T} C^{-1}\, dY_s,$$
where $Y_s$ satisfies (38b). From $\langle Y \rangle_t = C t$ we see that
$$\langle M \rangle_t = \int_0^t f(X_s)^{\mathrm T} H^{\mathrm T} C^{-1} H f(X_s)\, ds.$$
Hence, the likelihood takes the form
$$l_t = \exp\left( M_t - \frac{1}{2} \langle M \rangle_t \right),$$
satisfying the SDE
$$dl_t = l_t\, dM_t. \tag{A6}$$
For an arbitrary smooth compactly supported test function $\phi$, Itô's formula implies
$$\begin{aligned} l_t\, \phi(X_t) = {} & \phi(X_0) + \int_0^t \phi(X_s)\, dl_s + \int_0^t l_s\, \nabla \phi(X_s) \cdot dX_s\\ & + \frac{1}{2} \sum_{i,j=1}^{N_x} \int_0^t l_s\, \partial_i \partial_j \phi(X_s)\, d\langle X^i, X^j \rangle_s + \sum_{i=1}^{N_x} \int_0^t \partial_i \phi(X_s)\, d\langle l, X^i \rangle_s, \end{aligned} \tag{A7}$$
where $X_s$ satisfies (38a). For the covariation process $\langle l, X \rangle_t$ we obtain
$$d\langle l, X \rangle_t = l_t\, d\langle M, X \rangle_t = l_t\, f(X_t)^{\mathrm T} H^{\mathrm T} C^{-1} H Q\, dt,$$
using $\langle Y, X \rangle_t = H Q t$. Furthermore, $\langle X, X \rangle_t = Q t$, which follows from the definition of the stochastic contributions in (38a).
We now apply the conditional expectation to (A7). Noticing that
$$\int_0^t \phi(X_s)\, dl_s = \int_0^t l_s\, \phi(X_s)\, f(X_s)^{\mathrm T} H^{\mathrm T} C^{-1}\, dY_s,$$
the result follows from (A6). □

References

  1. Kutoyants, Y. Statistical Inference for Ergodic Diffusion Processes; Springer: New York, NY, USA, 2004.
  2. Pavliotis, G. Stochastic Processes and Applications; Springer: New York, NY, USA, 2014.
  3. Apte, A.; Hairer, M.; Stuart, A.; Voss, J. Sampling the posterior: An approach to non-Gaussian data assimilation. Phys. D Nonlinear Phenom. 2007, 230, 50–64.
  4. Salman, H.; Kuznetsov, L.; Jones, C.; Ide, K. A method for assimilating Lagrangian data into a shallow-water-equation ocean model. Mon. Weather Rev. 2006, 134, 1081–1101.
  5. Apte, A.; Jones, C.; Stuart, A. A Bayesian approach to Lagrangian data assimilation. Tellus A 2008, 60, 336–347.
  6. Simon, D. Optimal State Estimation; Wiley: Hoboken, NJ, USA, 2006.
  7. Bain, A.; Crisan, D. Fundamentals of Stochastic Filtering; Springer: New York, NY, USA, 2009.
  8. Liu, J. Monte Carlo Strategies in Scientific Computing; Springer: New York, NY, USA, 2001.
  9. Crisan, D.; Xiong, J. Approximate McKean-Vlasov representation for a class of SPDEs. Stochastics 2010, 82, 53–68.
  10. McKean, H. A class of Markov processes associated with nonlinear parabolic equations. Proc. Natl. Acad. Sci. USA 1966, 56, 1907–1911.
  11. Yang, T.; Mehta, P.; Meyn, S. Feedback particle filter. IEEE Trans. Autom. Control 2013, 58, 2465–2480.
  12. Reich, S. Data assimilation: The Schrödinger perspective. Acta Numer. 2019, 28, 635–710.
  13. Majda, A.; Harlim, J. Filtering Complex Turbulent Systems; Cambridge University Press: Cambridge, UK, 2012.
  14. Bergemann, K.; Reich, S. An ensemble Kalman–Bucy filter for continuous data assimilation. Meteorol. Z. 2012, 21, 213–219.
  15. Taghvaei, A.; de Wiljes, J.; Mehta, P.; Reich, S. Kalman filter and its modern extensions for the continuous-time nonlinear filtering problem. ASME. J. Dyn. Syst. Meas. Control 2017, 140.
  16. Law, K.; Stuart, A.; Zygalakis, K. Data Assimilation: A Mathematical Introduction; Springer: New York, NY, USA, 2015.
  17. Reich, S.; Cotter, C. Probabilistic Forecasting and Bayesian Data Assimilation; Cambridge University Press: Cambridge, UK, 2015.
  18. Moral, P.D. Mean Field Simulation for Monte Carlo Integration; Chapman and Hall/CRC: London, UK, 2013.
  19. Pavliotis, G.; Stuart, A. Multiscale Methods; Springer: New York, NY, USA, 2008.
  20. Altmeyer, R.; Reiß, M. Nonparametric Estimation for Linear SPDEs from Local Measurements; Technical Report; Humboldt University Berlin: Berlin, Germany, 2019.
  21. Saha, S.; Gustafsson, F. Particle filtering with dependent noise processes. IEEE Trans. Signal Process. 2012, 60, 4497–4508.
  22. Berry, T.; Sauer, T. Correlations between systems and observation errors in data assimilation. Mon. Weather Rev. 2018, 146, 2913–2931.
  23. Mitchell, H.L.; Daley, R. Discretization error and signal/error correlation in atmospheric data assimilation: (I). All scales resolved. Tellus A 1997, 49, 32–53.
  24. Papaspiliopoulos, O.; Pokern, Y.; Roberts, G.; Stuart, A. Nonparametric estimation of diffusion: A differential equation approach. Biometrika 2012, 99, 511–531.
  25. Särkkä, S. Bayesian Filtering and Smoothing; Cambridge University Press: Cambridge, UK, 2013.
  26. Laugesen, R.S.; Mehta, P.G.; Meyn, S.P.; Raginsky, M. Poisson’s equation in nonlinear filtering. SIAM J. Control Optim. 2015, 53, 501–525.
  27. Taghvaei, A.; Mehta, P.; Meyn, S. Gain Function Approximation in the Feedback Particle Filter; Technical Report; University of Illinois at Urbana-Champaign: Champaign, IL, USA, 2019.
  28. Amezcua, J.; Kalnay, E.; Ide, K.; Reich, S. Ensemble transform Kalman-Bucy filters. Q. J. R. Meteorol. Soc. 2014, 140, 995–1004.
  29. De Wiljes, J.; Reich, S.; Stannat, W. Long-time stability and accuracy of the ensemble Kalman–Bucy filter for fully observed processes and small measurement noise. SIAM J. Appl. Dyn. Syst. 2018, 17, 1152–1181.
  30. Blömker, D.; Schillings, C.; Wacker, P. A strongly convergent numerical scheme for ensemble Kalman inversion. SIAM J. Numer. Anal. 2018, 56, 2537–2562.
  31. Harlim, J. Model error in data assimilation. In Nonlinear and Stochastic Climate Dynamics; Franzke, C., Kane, T.O., Eds.; Cambridge University Press: Cambridge, UK, 2017; pp. 276–317.
  32. Krumscheid, S.; Pavliotis, G.; Kalliadasis, S. Semi-parametric drift and diffusion estimation for multiscale diffusions. SIAM J. Multiscale Model. Simul. 2011, 11, 442–473.
  33. Van Waaij, J.; van Zanten, H. Gaussian process methods for one-dimensional diffusion: Optimal rates and adaptation. Electron. J. Stat. 2016, 10, 628–645.
Figure 1. Results for the Ornstein–Uhlenbeck state and parameter estimation problem under different experimental settings: (a) $Q = 1/2$, $R = 0.01$; (b) $Q = 1/2$, $R = 0.0001$; (c) $Q = 1/2$, $R = 0$ (pure parameter estimation); (d) $Q = 0.005$, $R = 0.0001$. The ensemble size is set to $M = 1000$ in all cases. Displayed are the ensemble mean $\bar a_n$ and the ensemble variance in $\tilde A_n$ and $\tilde X_n$. The variance of $\tilde X_n$ is zero when $R = 0$ in case (c).
Figure 2. Results for the averaged Ornstein–Uhlenbeck state and parameter estimation problem under different experimental settings: (a) $Q = 1/2$, $R = 0.01$, $\epsilon = 0.1$; (b) $Q = 1/2$, $R = 0$, $\epsilon = 0.1$ (pure parameter estimation); (c) $Q = 1/2$, $R = 0.01$, $\epsilon = 0.01$; (d) $Q = 1/2$, $R = 0.01$, $\epsilon = 0.01$ and subsampling by a factor of ten. The ensemble size is set to $M = 1000$ in all cases. Displayed are the ensemble mean and the ensemble variance in $\tilde A_n$ and $\tilde X_n$. The variance of $\tilde X_n$ is zero when $R = 0$ in case (b).
Figure 3. Results for the averaged Ornstein–Uhlenbeck process, now with a smaller ensemble size $M = 10$. Otherwise, panels (a–d) correspond to the same experimental settings as in Figure 2.
Figure 4. Results for the homogenisation Ornstein–Uhlenbeck state and parameter estimation problem under different experimental settings: (a) $Q = 1/2$, $R = 0.01$, $\epsilon = 0.1$; (b) $Q = 1/2$, $R = 0$, $\epsilon = 0.1$ (pure parameter estimation); (c) $Q = 1/2$, $R = 0.01$, $\epsilon = 0.1$ and subsampling by a factor of fifty; (d) $Q = 1/2$, $R = 0.01$, $\epsilon = 0.1$ and subsampling by a factor of five hundred. The ensemble size is set to $M = 10$ in all cases. Displayed are the ensemble mean and the ensemble variance in $\tilde A_n$ and $\tilde X_n$. The variance of $\tilde X_n$ is zero under (c).
Figure 5. Results for the nonparametric drift and state estimation problem: (a) reference drift function (thick line) and ensemble of drift functions drawn from the prior distribution; (b) histogram of samples from the reference trajectory; (c) reference drift function and its estimate (top) and ensemble of drift functions (bottom) at final time; (d) ensemble of states and the true value at final time.
Figure 6. Results for SPDE parameter estimation: (a) estimate of $\theta$ as a function of time as obtained by the ensemble Kalman–Bucy filter; (b) evidence based on a Kalman–Bucy filter for state estimation applied to a sequence of parameter values $\theta \in \{0.2, 0.3, \ldots, 1.8\}$.