State and Parameter Estimation from Observed Signal Increments

Nüsken, Nikolas; Reich, Sebastian; Rozdeba, Paul J.

doi:10.3390/e21050505

by Nikolas Nüsken, Sebastian Reich * and Paul J. Rozdeba

Institute of Mathematics, University of Potsdam, Karl-Liebknecht-Str. 24/25, D-14476 Potsdam, Germany

* Author to whom correspondence should be addressed.

Entropy 2019, 21(5), 505; https://doi.org/10.3390/e21050505

Submission received: 26 March 2019 / Revised: 13 May 2019 / Accepted: 14 May 2019 / Published: 17 May 2019

(This article belongs to the Special Issue Information Theory and Stochastics for Multiscale Nonlinear Systems)

Abstract:

The success of the ensemble Kalman filter has triggered a strong interest in expanding its scope beyond classical state estimation problems. In this paper, we focus on continuous-time data assimilation where the model and measurement errors are correlated and both states and parameters need to be identified. Such scenarios arise from noisy and partial observations of Lagrangian particles which move under a stochastic velocity field involving unknown parameters. We take an appropriate class of McKean–Vlasov equations as the starting point to derive ensemble Kalman–Bucy filter algorithms for combined state and parameter estimation. We demonstrate their performance through a series of increasingly complex multi-scale model systems.

Keywords:

parameter estimation; continuous-time data assimilation; ensemble Kalman filter; correlated noise; multi-scale diffusion processes

1. Introduction

The research presented in this paper has been motivated by the state and parameter estimation problem for particles moving under a stochastic velocity field, with the measurements given by partial and noisy observations of their position increments. If the deterministic contributions to the velocity field are stationary, and the position increments of the moving particle are exactly observed, then one is led to a standard parameter estimation problem for stochastic differential equations (SDEs) [1,2]. In [3], this setting was extended to the case where the deterministic contributions to the velocity field themselves undergo a stochastic time evolution. Furthermore, while continuous-time observations of position increments are the focus of the present study, the assimilation of discrete-time observations of particle positions has been investigated in [4,5] under a so-called Lagrangian data assimilation setting for atmospheric fluid dynamics.

The assumption of exactly and fully observed position increments is not always realistic, and the case of partial and noisy observations is at the center of the present study. Having access only to partial and noisy observations of position increments leads to correlations between the measurement and model errors. The theoretical impact of such correlations on state and parameter estimation problems has been discussed, for example, in [6] in the context of linear systems, and in [7] for nonlinear systems. In particular, one finds that the appropriately adjusted data likelihood involves the gradient of log-densities, which is nontrivial from a computational perspective, and which prevents a straightforward application of standard Markov chain Monte Carlo (MCMC) or sequential Monte Carlo (SMC) methods [8].

In this paper, we instead follow an alternative Monte Carlo approach based on appropriately adjusted McKean–Vlasov filtering equations, an approach pioneered in [9] in the context of the standard state estimation problem for diffusion processes. McKean–Vlasov equations, first studied in [10], are a class of SDEs in which the right-hand side depends on the law of the process itself. We rely on a particular formulation of McKean–Vlasov filtering equations, the so-called feedback particle filters [11], utilising stochastic innovation processes [12].

Our proposed Monte Carlo formulation avoids the need for estimating log-densities, and can be implemented in a numerically robust manner relying on a generalised ensemble Kalman–Bucy filter approximation applied to an extended state space formulation [13]. The ensemble Kalman–Bucy filter [14,15] has been introduced previously as an extension of the popular ensemble Kalman filter [13,16,17] to continuous-time data assimilation under the assumption of uncorrelated measurement and model errors.

While the McKean–Vlasov formulation is essentially mathematically equivalent to the more conventional one based on the Kushner–Stratonovitch equation [7], these two approaches differ significantly in structure, suggesting different tools for their analysis as well as numerical approximations. More broadly speaking, the McKean–Vlasov approach to filtering is appealing since its Monte Carlo implementations completely avoid the need for resampling characteristic of standard SMC methods. Furthermore, a wide range of approximations are possible within the McKean–Vlasov framework with some of them, such as the ensemble Kalman–Bucy filter, applicable to high-dimensional problems. The McKean–Vlasov approach also arises naturally when analysing sequential Monte Carlo methods [18].

In Section 6, we apply the proposed algorithms to a series of state and parameter estimation problems of increasing complexity. First, we study the state and parameter estimation problem for an Ornstein–Uhlenbeck process [2]. Two further experiments investigate the behaviour of the filters for reduced model equations, with the data being collected from underlying multi-scale models. There we distinguish between the averaging and homogenisation scenarios [19]. Finally, we look at examples of nonparametric drift estimation [3] and parameter estimation for the stochastic heat equation [20].

We finally mention that SMC methods for correlated noise terms in discrete-time have been discussed, for example, in [21] and in the context of the ensemble Kalman filter in [22]. Similar ideas have also been pursued in a more applied context in [23].

2. Mathematical Problem Formulation

We consider the time evolution of a random state variable

X_{t} \in R^{N_{x}}

in

N_{x}

-dimensional state space,

N_{x} \geq 1

, as prescribed by an SDE of the form

d X_{t} = f (X_{t}, a) d t + G d W_{t},

(1)

for time

t \geq 0

, with the drift function

f : R^{N_{x}} \times R^{N_{a}} \to R^{N_{x}}

depending on

N_{a} \geq 0

unknown parameters

a = {(a^{1}, \dots, a^{N_{a}})}^{T} \in R^{N_{a}}

. Model errors are represented through standard

N_{w}

-dimensional Brownian motion

W_{t}

,

N_{w} \geq 1

, and a matrix

G \in R^{N_{x} \times N_{w}}

. We also introduce the associated model error covariance matrix

Q = G G^{T}

. We will generally assume that the initial condition

X_{0}

is fixed, that is,

X_{0} = x_{0}

a.s. for given

x_{0} \in R^{N_{x}}

. In terms of a more specific example, one can think of

X_{t}

denoting the position of a particle at time

t \geq 0

moving in

N_{x} = 3

-dimensional space under the influence of a stochastic velocity field, with deterministic contributions given by f and stochastic perturbations by

G W_{t}

. In the case

G = 0

, the SDE (1) reduces to an ordinary differential equation with given initial condition

x_{0}

.

We assume throughout this paper that (1) possesses unique, strong solutions for all parameter values a. See, for example, [2] (Section 3.3) for sufficient conditions on the drift function f. The distribution of

X_{t}

is denoted by

π_{t}

, which we also abbreviate by

π_{t} = Law (X_{t})

. We use the same notation for measures and their Lebesgue densities, provided they exist.

Example 1.

A wide class of drift functions can be written in the form

f (x, a) = f_{0} (x) + B (x) a = f_{0} (x) + \sum_{i = 1}^{N_{a}} b_{i} (x) a^{i},

(2)

where

f_{0} : R^{N_{x}} \to R^{N_{x}}

is a known drift function, the

b_{i} : R^{N_{x}} \to R^{N_{x}}

,

i = 1, \dots, N_{a}

, denote appropriate basis functions, and the vector

a = {(a^{1}, \dots, a^{N_{a}})}^{T} \in R^{N_{a}}

contains the unknown parameters of the model. The family

{b_{i} (x)}

of basis functions, which we collect in a matrix-valued function

B (x) = (b_{1} (x), b_{2} (x), \dots, b_{N_{a}} (x)) \in R^{N_{x} \times N_{a}}

, could arise from a finite-dimensional truncation of some appropriate Hilbert space

H

. See, for example, [24] for computational approaches to nonparametric drift estimation using a Galerkin approximation in

H

, where the

b_{i} (x)

become finite element basis functions. Furthermore, the expansion coefficients

{a^{i}}

could be made time-dependent by letting them evolve according to some system of differential equations arising, for example, from the discretisation of an underlying partial differential equation with solutions in

H

. See [3] for specific examples of such a setting. While the present paper focuses on stationary drift functions, i.e., the parameters

{a^{i}}

are time-independent, the results from Section 3 and Section 5, respectively, can easily be extended to the non-stationary case where the parameters themselves satisfy given evolution equations.
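To make the model class (1) with the drift (2) concrete, the following Python sketch simulates one sample path with the Euler–Maruyama scheme. The function name simulate_sde, the step size, and the scalar test case with f_0 = 0 and B(x) = -x are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def simulate_sde(f0, B, a, G, x0, dt, n_steps, rng):
    """Euler-Maruyama discretisation of dX = (f0(X) + B(X) a) dt + G dW."""
    x = np.array(x0, dtype=float)
    path = np.empty((n_steps + 1, x.size))
    path[0] = x
    n_w = G.shape[1]
    for n in range(n_steps):
        dW = np.sqrt(dt) * rng.standard_normal(n_w)   # Brownian increment
        x = x + (f0(x) + B(x) @ a) * dt + G @ dW
        path[n + 1] = x
    return path

# Hypothetical scalar example: f0 = 0 and B(x) = -x, so that f(x, a) = -a x.
rng = np.random.default_rng(0)
path = simulate_sde(
    f0=lambda x: np.zeros_like(x),
    B=lambda x: -x.reshape(1, 1),      # shape (N_x, N_a) = (1, 1)
    a=np.array([0.5]),
    G=np.array([[1.0]]),
    x0=[1.0],
    dt=0.01,
    n_steps=1000,
    rng=rng,
)
```

Setting G to the zero matrix in this sketch recovers an explicit Euler discretisation of the ordinary differential equation mentioned above.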

Data and an observation model are required in order to perform state and parameter estimation for SDEs of the form (1). In this paper, we assume that we observe partial and noisy increments

d Y_{t}

of the signal

X_{t}

, given by

d Y_{t} = H d X_{t} + R^{1 / 2} d V_{t} = H f (X_{t}, a) d t + H G d W_{t} + R^{1 / 2} d V_{t}, Y_{0} = X_{0} = x_{0},

(3)

for t in the observation interval

[0, T]

,

T > 0

, where

H \in R^{N_{y} \times N_{x}}

is a given linear operator,

V_{t}

denotes standard

N_{y}

-dimensional Brownian motion with

N_{y} \geq 1

and

R \in R^{N_{y} \times N_{y}}

is a covariance matrix. We introduce the observation map

h (x, a) = H f (x, a)

(4)

for later use. Unless

H G = 0

, it is clear that the model error

E_{t}^{m} : = G W_{t}

in (1) and the total observation error

E_{t}^{o} : = H G W_{t} + R^{1 / 2} V_{t}

(5)

in (3) are correlated. The impact of correlations between the model and measurement errors on the state estimation problem has been discussed in [6,7]. Furthermore, such correlations require adjustments to sequential estimation methods [16,17,25] which are the main focus of this paper. We assume throughout this paper that the covariance matrix

C = H G G^{T} H^{T} + R = H Q H^{T} + R

(6)

of the observation error (5) is invertible.
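The correlation structure can be checked numerically. The sketch below, with matrices G, H and R chosen purely for illustration, forms Q and C as in (6) together with the cross-covariance rate Q H^T between the model error and the observation error (5), and compares both with Monte Carlo estimates over a single time step. A Cholesky factor is used as one admissible choice of R^{1/2}.

```python
import numpy as np

# Illustrative matrices (assumptions, not taken from the paper).
G = np.array([[1.0, 0.0], [0.5, 0.2], [0.0, 0.3]])   # N_x = 3, N_w = 2
H = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])     # N_y = 2
R = 0.01 * np.eye(2)

Q = G @ G.T                  # model error covariance
C = H @ Q @ H.T + R          # observation error covariance, Equation (6)
cross = Q @ H.T              # covariance rate between G W_t and E^o_t

# Monte Carlo check over one time step of length dt.
rng = np.random.default_rng(1)
dt, n = 0.1, 200_000
dW = np.sqrt(dt) * rng.standard_normal((n, 2))
dV = np.sqrt(dt) * rng.standard_normal((n, 2))
R_half = np.linalg.cholesky(R)                        # R_half @ R_half.T == R
Em = dW @ G.T                                         # model error increments
Eo = dW @ (H @ G).T + dV @ R_half.T                   # observation error increments
cross_mc = Em.T @ Eo / (n * dt)
C_mc = Eo.T @ Eo / (n * dt)
```

The estimate cross_mc is nonzero because H G does not vanish here, which is exactly the situation in which the two error sources are correlated.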

The special case

R = 0

and

H = I

leads to a pure parameter estimation problem which has been extensively studied in the literature in the settings of maximum likelihood and Bayesian estimators [1,2]. In Section 3, we provide a reformulation of the Bayesian approach as McKean–Vlasov equations for the parameters, based on the results in [9,11].

If

R \neq 0

, then (1) and (3) lead to a combined state and parameter estimation problem with correlated noise terms. We will first discuss the impact of this correlation on the pure state estimation problem in Section 4 assuming that the parameters of the problem are known. Again, we will derive appropriate McKean–Vlasov equations in the state variables. Our key contribution is a formulation that avoids the need for log-density estimates, and can be put into an appropriately generalised ensemble Kalman–Bucy filter approximation framework [14,15]. We also formally demonstrate that the McKean–Vlasov filter equation reduces to

d X_{t} = d Y_{t}

in the limit

R \to 0

and

H = I

, a property that is less straightforward to demonstrate for filter formulations involving log-densities.

These McKean–Vlasov equations are generalised to the combined state and parameter estimation problem via an augmentation of state space [13] in Section 5. Given the results from Section 4, such an extension is rather straightforward.

The numerical experiments in Section 6 rely exclusively on the generalised ensemble Kalman–Bucy filter approximation to the McKean–Vlasov equations, which are easy to implement and yield robust and accurate numerical results.

3. Parameter Estimation from Noiseless Data

In this section, we treat the simpler Bayesian parameter estimation problem which arises from setting

R = 0

and

H = I

in (3), i.e.,

N_{y} = N_{x}

. This leads to

d X_{t} = d Y_{t}

and, furthermore,

X_{t} = Y_{t}

for all

t \in [0, T]

, provided

X_{0} = Y_{0} = x_{0}

which we assume throughout this paper. The requirement that

C = Q

is invertible implies that G has rank

N_{x}

; that is,

N_{w} \geq N_{x}

in (1). The data likelihood

l_{t} (a) = exp (\int_{0}^{t} f {(Y_{s}, a)}^{T} Q^{- 1} d Y_{s} - \frac{1}{2} \int_{0}^{t} f {(Y_{s}, a)}^{T} Q^{- 1} f (Y_{s}, a) d s)

(7)

thus follows from the observation model with additive Brownian noise in (3). Given a prior distribution

Π_{0} (a)

for the parameters, the resulting posterior distribution at any time

t \in (0, T]

is

Π_{t} (a) = \frac{l_{t} (a) Π_{0} (a)}{Π_{0} [l_{t}]}

(8)

according to Bayes’ theorem [7]. Here, we have introduced the shorthand

Π_{0} [l_{t}] = \int_{R^{N_{a}}} l_{t} (a) Π_{0} (a) d a

(9)

for the expectation of

l_{t}

with respect to

Π_{0}

. It is well-known that the posterior distributions

Π_{t}

satisfy the stochastic partial differential equation

d Π_{t} [ϕ] = {(Π_{t} [ϕ h_{t}] - Π_{t} [ϕ] Π_{t} [h_{t}])}^{T} Q^{- 1} (d Y_{t} - Π_{t} [h_{t}] d t)

(10)

with the time-dependent observation map

h_{t} (a) = f (Y_{t}, a),

(11)

where

ϕ : R^{N_{a}} \to R

is a compactly supported smooth test function, and

Π_{t} [ϕ]

again denotes the expectation of

ϕ

with respect to

Π_{t}

. See [7] for a detailed discussion. Equation (10) is a special instance of the well-known Kushner–Stratonovitch equation from time-continuous filtering [7].
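As a rough illustration of (7) and (8), the following sketch evaluates a discretised likelihood on a grid of parameter values for a scalar Ornstein–Uhlenbeck model with drift f(x, a) = -a x. The left-point (Itô-type) discretisation of the stochastic integral, the Gaussian prior N(0, 4), the grid, and all numerical values are assumptions made for this example only.

```python
import numpy as np

rng = np.random.default_rng(2)
a_true, q, dt, n_steps = 1.0, 0.5, 0.01, 10_000      # illustrative values

# Synthetic data Y_t = X_t from dX = -a X dt + sqrt(q) dW.
y = np.empty(n_steps + 1)
y[0] = 1.0
for n in range(n_steps):
    y[n + 1] = y[n] - a_true * y[n] * dt + np.sqrt(q * dt) * rng.standard_normal()

dy = np.diff(y)
a_grid = np.linspace(-1.0, 3.0, 201)
da = a_grid[1] - a_grid[0]

# Left-point discretisation of the exponent in Equation (7).
f = -np.outer(a_grid, y[:-1])                        # f(Y_n, a) for each grid value a
log_lik = (f @ dy) / q - 0.5 * np.sum(f**2, axis=1) * dt / q

# Bayes' rule (8) on the grid with an assumed Gaussian prior N(0, 4).
log_post = log_lik - 0.5 * a_grid**2 / 4.0
post = np.exp(log_post - log_post.max())             # subtract max for numerical stability
post /= post.sum() * da                              # normalise to a density on the grid
a_mean = np.sum(a_grid * post) * da                  # posterior mean estimate
```

Grid-based evaluation of this kind is only feasible for very few parameters, which is one motivation for the particle formulations discussed next.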

3.1. Feedback Particle Filter

We now state a McKean–Vlasov reformulation of the Kushner–Stratonovitch Equation (10) as a special instance of the feedback particle filter of [11,12]. The key idea is to formulate a stochastic differential equation in the parameters in which they are treated as time-dependent random variables. We introduce the notation

{\tilde{A}}_{t}

for these, and require that the law of

{\tilde{A}}_{t}

coincide with (8) for

t \in [0, T]

, i.e., with the solution to (10).

Lemma 1 (Feedback particle filter).

Consider the McKean–Vlasov equations

d {\tilde{A}}_{t} = K_{t} ({\tilde{A}}_{t}) d I_{t} + Ω_{t} ({\tilde{A}}_{t}) d t,

(12)

where the matrix-valued Kalman gain

K_{t} \in R^{N_{a} \times N_{y}}

satisfies

\nabla \cdot ({\tilde{Π}}_{t} (K_{t} Q)) = - {\tilde{Π}}_{t} {(h_{t} - {\tilde{Π}}_{t} [h_{t}])}^{T}, {\tilde{Π}}_{t} = Law ({\tilde{A}}_{t}) .

(13)

The innovation process

I_{t}

can be chosen to be given by either

d I_{t} = d Y_{t} - \frac{1}{2} (h_{t} ({\tilde{A}}_{t}) + {\tilde{Π}}_{t} [h_{t}]) d t,

(14)

or

d I_{t} = d Y_{t} - \{h_{t} ({\tilde{A}}_{t}) d t + G d W_{t}\},

(15)

and

Ω_{t}^{i} = \frac{1}{2} \sum_{j = 1}^{N_{a}} \sum_{k, l = 1}^{N_{y}} Q^{k l} K_{t}^{j l} (\partial_{j} K_{t}^{i k}), i = 1, \dots, N_{a} .

(16)

Then, the distribution

{\tilde{Π}}_{t} = Law ({\tilde{A}}_{t})

coincides with the solution to (10), provided that the initial distributions agree. In other words,

{\tilde{Π}}_{t} = Π_{t}

for all

t \in [0, T]

.

Throughout this paper, we write (12) in the more compact Stratonovitch form

d {\tilde{A}}_{t} = K_{t} ({\tilde{A}}_{t}) \circ d I_{t},

(17)

where the Stratonovitch interpretation is to be applied only to

{\tilde{A}}_{t}

in

K_{t} ({\tilde{A}}_{t})

, while the explicit time-dependence of

K_{t}

remains in its Itô interpretation. It should be noted that the matrix-valued function

K_{t}

is not uniquely defined by the PDE (13). Indeed, provided

K_{t}

solves (13),

K_{t} + β_{t}

is also a solution whenever

\nabla \cdot ({\tilde{Π}}_{t} β_{t}) = 0

. As discussed in [15], the minimiser over all suitable

K_{t}

with respect to a kinetic energy-type functional is of the form

K_{t} = \nabla Ψ_{t} Q^{- 1}

(18)

for a vector of potential functions

Ψ_{t} = (ψ_{t}^{1}, \dots, ψ_{t}^{N_{x}})

,

ψ_{t}^{k} : R^{N_{a}} \to R

. Inserting (18) into (13) leads to

N_{x}

elliptic partial differential equations (often referred to as Poisson equations),

\nabla \cdot ({\tilde{Π}}_{t} \nabla Ψ_{t}) = - {\tilde{Π}}_{t} {(h_{t} - {\tilde{Π}}_{t} [h_{t}])}^{T}, {\tilde{Π}}_{t} [Ψ_{t}] = 0,

(19)

understood component-wise, where the centring condition

{\tilde{Π}}_{t} [Ψ_{t}] = 0

makes the solution unique under mild assumptions on

{\tilde{Π}}_{t}

(see [26]). The numerical approximation of (19) in the context of the feedback particle filter has been discussed in [27]. Finally, (15) yields a particularly appealing formulation, since it is based on a direct comparison of

d Y_{t}

with a random realisation of the right hand side of the SDE (1), given a parameter value

a = {\tilde{A}}_{t} (ω)

and a realisation of the noise term

d W_{t} (ω)

. This fact will be explored further in Section 4.

Remark 1.

For clarity, let us repeat Equations (13) and (18) in their index forms:

\sum_{i = 1}^{N_{a}} \sum_{j = 1}^{N_{y}} \partial_{i} ({\tilde{Π}}_{t} (K_{t}^{i j} Q^{j k})) = - {\tilde{Π}}_{t} (h_{t}^{k} - {\tilde{Π}}_{t} [h_{t}^{k}]), k = 1, \dots, N_{y},

(20)

\sum_{j = 1}^{N_{y}} K_{t}^{i j} (a) Q^{j k} = \partial_{i} ψ_{t}^{k} (a), i = 1, \dots, N_{a}, k = 1, \dots, N_{y} .

(21)

3.2. Ensemble Kalman–Bucy Filter

Let us now assume that the initial distribution

Π_{0}

is Gaussian, and that f is linear in the unknown parameters such as in (2). Then, the distributions

{\tilde{Π}}_{t}

remain Gaussian for all times with mean

{\bar{a}}_{t}

and covariance matrix

P_{t}^{a a}

. The elliptic PDE (13) is solved by the parameter-independent Kalman gain matrix

K_{t} = P_{t}^{a a} B {(Y_{t})}^{T} Q^{- 1}

(22)

and one obtains the McKean–Vlasov formulation

d {\tilde{A}}_{t} = P_{t}^{a a} B {(Y_{t})}^{T} Q^{- 1} d I_{t}

(23)

of the Kalman–Bucy filter, with the innovation process

I_{t}

defined by either

d I_{t} = d Y_{t} - (f_{0} (Y_{t}) + \frac{1}{2} B (Y_{t}) ({\tilde{A}}_{t} + {\bar{a}}_{t})) d t

(24)

or

d I_{t} = d Y_{t} - \{(f_{0} (Y_{t}) + B (Y_{t}) {\tilde{A}}_{t}) d t + G d W_{t}\} .

(25)

Please note that the Stratonovitch formulation (17) reduces to the standard Itô interpretation, since

K_{t}

no longer depends explicitly on

{\tilde{A}}_{t}

.

The McKean–Vlasov Equation (23) can be extended to nonlinear, non-Gaussian parameter estimation problems by generalising the parameter-independent Kalman gain matrix (22) to

K_{t} = P_{t}^{a h} Q^{- 1}, P_{t}^{a h} = {\tilde{Π}}_{t} [(a - {\bar{a}}_{t}) {(h_{t} (a) - {\tilde{Π}}_{t} [h_{t}])}^{T}] = {\tilde{Π}}_{t} [a {(h_{t} (a) - {\tilde{Π}}_{t} [h_{t}])}^{T}]

(26)

Clearly, the gain (26) provides only an approximation to the solution of (13). However, such approximations have become popular in nonlinear state estimation in the form of the ensemble Kalman filter [16,17], and we will test its suitability for parameter estimation in Section 6.

Numerical implementations of the proposed McKean–Vlasov approaches rely on Monte Carlo approximations. More specifically, given M samples

{\tilde{A}}_{0}^{i}

,

i = 1, \dots, M

, from the initial distribution

Π_{0}

, we introduce the interacting particle system

d {\tilde{A}}_{t}^{i} = K_{t}^{M} ({\tilde{A}}_{t}^{i}) \circ d I_{t}^{i},

(27)

where the innovation processes

I_{t}^{i}

are defined by either

d I_{t}^{i} = d Y_{t} - \frac{1}{2} (h_{t} ({\tilde{A}}_{t}^{i}) + {\bar{h}}_{t}^{M}) d t, {\bar{h}}_{t}^{M} = \frac{1}{M} \sum_{i = 1}^{M} h_{t} ({\tilde{A}}_{t}^{i}),

(28)

or, alternatively,

d I_{t}^{i} = d Y_{t} - (h_{t} ({\tilde{A}}_{t}^{i}) d t + G d W_{t}^{i}),

(29)

and

W_{t}^{i}

,

i = 1, \dots, M

, denote independent

N_{w}

-dimensional Brownian motions. For

K_{t}^{M}

, we will use the parameter-independent empirical Kalman gain approximation

K_{t}^{M} = {\hat{P}}_{t}^{a h} Q^{- 1}, {\hat{P}}_{t}^{a h} = \frac{1}{M - 1} \sum_{i = 1}^{M} {\tilde{A}}_{t}^{i} {(h_{t} ({\tilde{A}}_{t}^{i}) - {\bar{h}}_{t}^{M})}^{T},

(30)

in our numerical experiments, which leads to the so-called ensemble Kalman–Bucy filter [14,15]. Please note that

{\hat{P}}_{t}^{a h}

provides an unbiased estimator of

P_{t}^{a h}

.

Finally, a robust and efficient time-stepping procedure for approximating

{\tilde{A}}_{t_{n}}

,

t_{n} = n Δ t

, is provided in [28,29,30]. Denoting the approximations at time

t_{n}

by

{\tilde{A}}_{n}^{i}

,

i = 1, \dots, M

, we obtain

{\tilde{A}}_{n + 1}^{i} = {\tilde{A}}_{n}^{i} + {\hat{P}}_{n}^{a h} {(Q + Δ t {\hat{P}}_{n}^{h h})}^{- 1} Δ I_{n}^{i}

(31)

with step size

Δ t > 0

, empirical covariance matrices

{\hat{P}}_{n}^{a h} = \frac{1}{M - 1} \sum_{i = 1}^{M} {\tilde{A}}_{n}^{i} {(h_{n} ({\tilde{A}}_{n}^{i}) - {\bar{h}}_{n}^{M})}^{T}, {\hat{P}}_{n}^{h h} = \frac{1}{M - 1} \sum_{i = 1}^{M} h_{n} ({\tilde{A}}_{n}^{i}) {(h_{n} ({\tilde{A}}_{n}^{i}) - {\bar{h}}_{n}^{M})}^{T},

(32)

and innovation increments

Δ I_{n}^{i}

given by either

Δ I_{n}^{i} = Δ Y_{n} - \frac{1}{2} (h_{n} ({\tilde{A}}_{n}^{i}) + {\bar{h}}_{n}^{M}) Δ t, {\bar{h}}_{n}^{M} = \frac{1}{M} \sum_{i = 1}^{M} h_{n} ({\tilde{A}}_{n}^{i}),

(33)

or

Δ I_{n}^{i} = Δ Y_{n} - (h_{n} ({\tilde{A}}_{n}^{i}) Δ t + Δ t^{1 / 2} G Ξ_{n}^{i}), Ξ_{n}^{i} \sim N (0, I) .

(34)

Here we have used the abbreviations

h_{n} (a) = f (Y_{n}, a)

,

Y_{n} = Y_{t_{n}}

, and

Δ Y_{n} = Y_{t_{n + 1}} - Y_{t_{n}}

.
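The time-stepping procedure can be sketched in a few lines of Python. The version below uses the innovation increments (33) and applies the discrete gain in the form of the empirical cross-covariance multiplied by the inverse of Q plus Δt times the empirical covariance of h, which reduces to the continuous-time gain (30) as the step size tends to zero. The function name enkbf_parameter_step, the requirement that the drift be vectorised over the ensemble, and the scalar Ornstein–Uhlenbeck test problem are assumptions made for illustration.

```python
import numpy as np

def enkbf_parameter_step(A, y_n, dy_n, f, Q, dt):
    """One ensemble Kalman-Bucy update of the parameter ensemble A.

    A: (M, N_a) ensemble; y_n, dy_n: (N_x,) observed state and increment;
    f(y, A) returns the drift for every member, shape (M, N_x).
    """
    M = A.shape[0]
    h = f(y_n, A)                                     # h_n(A^i) = f(Y_n, A^i)
    h_bar = h.mean(axis=0)
    P_ah = A.T @ (h - h_bar) / (M - 1)                # (N_a, N_x), as in (32)
    P_hh = h.T @ (h - h_bar) / (M - 1)                # (N_x, N_x), as in (32)
    dI = dy_n - 0.5 * (h + h_bar) * dt                # innovation increments (33)
    # gain = P_ah (Q + dt P_hh)^{-1}; the matrix being inverted is symmetric
    gain = np.linalg.solve(Q + dt * P_hh, P_ah.T).T
    return A + dI @ gain.T

# Hypothetical test problem: dX = -a X dt + sqrt(q) dW with exact observations.
rng = np.random.default_rng(3)
a_true, q, dt, n_steps, M = 1.0, 0.5, 0.01, 5000, 100
Q = np.array([[q]])
drift = lambda y, A: -A * y                           # shape (M, 1)

y = np.array([1.0])
A = 2.0 * rng.standard_normal((M, 1))                 # prior ensemble from N(0, 4)
for n in range(n_steps):
    dy = -a_true * y * dt + np.sqrt(q * dt) * rng.standard_normal(1)
    A = enkbf_parameter_step(A, y, dy, drift, Q, dt)
    y = y + dy
```

Switching to the stochastic innovation (34) would only change the line computing dI, where a perturbation drawn independently for each ensemble member is subtracted instead of averaging h with its ensemble mean.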

While the feedback particle formulation (17) and its ensemble Kalman–Bucy filter approximation (31) are special cases of already available formulations, they provide the starting point for our novel McKean–Vlasov equations and their numerical approximation of the combined state and parameter estimation problem with correlated measurement and model errors, which we develop in the following two sections.

4. State Estimation for Noisy Data

We return to the observation model (3) with

R \neq 0

and general H. The pure state estimation problem is considered first; that is,

f (x, a) = f (x)

in (1).

Using

E_{t}^{o}

, given by (5), and

E_{t}^{c}

defined by

E_{t}^{c} = G (I - G^{T} H^{T} C^{- 1} H G) W_{t} - Q H^{T} C^{- 1} R^{1 / 2} V_{t}

(35)

with the total measurement error covariance matrix C given by (6), we find that

G W_{t} = E_{t}^{c} + Q H^{T} C^{- 1} E_{t}^{o},

(36)

and the covariations [2] satisfy

{〈 E^{o}, E^{c} 〉}_{t} = 0, {〈 E^{o}, E^{o} 〉}_{t} = C t, {〈 E^{c}, E^{c} 〉}_{t} = G (I - G^{T} H^{T} C^{- 1} H G) G^{T} t .

(37)

These errors naturally suggest linear combinations of

W_{t}

and

V_{t}

in (1) and (3) that shift the correlation between measurement and model errors to the signal dynamics, yielding

\begin{matrix} d X_{t} & = f (X_{t}) d t + G {(I - G^{T} H^{T} C^{- 1} H G)}^{1 / 2} d {\hat{W}}_{t} + Q H^{T} C^{- 1 / 2} d {\hat{V}}_{t}, \end{matrix}

(38a)

\begin{matrix} d Y_{t} & = H f (X_{t}) d t + C^{1 / 2} d {\hat{V}}_{t}, \end{matrix}

(38b)

where

{\hat{W}}_{t}

and

{\hat{V}}_{t}

denote mutually independent standard Brownian motions of dimension

N_{w}

and

N_{y}

, respectively. These equations correspond exactly to the correlated noise example from [7] (Section 3.8). Furthermore,

H = I

and

R = 0

lead to

E_{t}^{c} = 0

,

Q H^{T} C^{- 1 / 2} = C^{1 / 2}

, and, hence,

d X_{t} = d Y_{t}

.
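The decomposition (36) and the covariations (37) are purely algebraic statements and can be verified numerically. In the sketch below, both error processes are written as linear maps of the stacked noise (W_t, V_t); the matrix names A_o and A_c, the dimensions, and the random test matrices are assumptions introduced only for this check.

```python
import numpy as np

rng = np.random.default_rng(4)
N_x, N_w, N_y = 3, 3, 2                               # illustrative sizes
G = rng.standard_normal((N_x, N_w))
H = rng.standard_normal((N_y, N_x))
R = 0.1 * np.eye(N_y)
R_half = np.sqrt(0.1) * np.eye(N_y)                   # R_half @ R_half.T == R

Q = G @ G.T
C = H @ Q @ H.T + R
C_inv = np.linalg.inv(C)

# E^o = A_o [W; V] from (5) and E^c = A_c [W; V] from (35).
A_o = np.hstack([H @ G, R_half])
A_c = np.hstack([G @ (np.eye(N_w) - G.T @ H.T @ C_inv @ H @ G),
                 -Q @ H.T @ C_inv @ R_half])

cov_oc = A_o @ A_c.T                                  # rate of <E^o, E^c>
cov_oo = A_o @ A_o.T                                  # rate of <E^o, E^o>
cov_cc = A_c @ A_c.T                                  # rate of <E^c, E^c>
GW = np.hstack([G, np.zeros((N_x, N_y))])             # G W as a map of [W; V]
```

Up to rounding, cov_oc vanishes, cov_oo equals C, and A_c + Q H^T C^{-1} A_o reproduces GW, in line with (36) and (37).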

A straightforward application of the results from [7] (Section 3.8) yields the following statement:

Lemma 2 (Generalised Kushner–Stratonovich equation).

The conditional expectations

π_{t} [ϕ] = E [ϕ (X_{t}) | Y_{[0, t]}]

satisfy

\begin{matrix} π_{t} [ϕ] & = π_{0} [ϕ] + \int_{0}^{t} π_{s} [L ϕ] d s + \int_{0}^{t} π_{s} {[ϕ h + H Q \nabla ϕ - ϕ π_{s} [h]]}^{T} C^{- 1} (d Y_{s} - π_{s} [h] d s), \end{matrix}

(39)

where

L = f \cdot \nabla + \frac{1}{2} Q : \nabla \nabla

(40)

is the generator of (1), with the notation

Q : \nabla \nabla ϕ = \sum_{i, j = 1}^{N_{x}} Q^{i j} \partial_{i} \partial_{j} ϕ

,

h (x) = H f (x)

denotes the observation map, and ϕ is a compactly supported smooth function.

For the convenience of the reader, we present an independent derivation in Appendix A. We note that (39) also arises as the Kushner–Stratonovitch equation for an SDE model (1) with observations

Y_{t}

satisfying the observation model

d Y_{t} = H (f (X_{t}) - Q \nabla log π_{t} (X_{t})) d t + C^{1 / 2} d {\tilde{V}}_{t},

(41)

where

{\tilde{V}}_{t}

denotes

N_{y}

-dimensional Brownian motion independent of the Brownian motion

W_{t}

in (1). Here we have used that

π_{t} [H Q \nabla log π_{t}] = 0

. This reinterpretation of our state estimation problem in terms of uncorrelated model and observation errors and modified observation map

{\tilde{h}}_{t} (x) = H (f (x) - Q \nabla log π_{t} (x))

(42)

allows one to apply available MCMC and SMC methods for continuous-time filtering and smoothing problems. See, for example, [16]. However, there are two major limitations of such an approach. First, it requires approximating the gradient of the log-density. Second, the modified observation model (41) is not well-defined in the limit

R \to 0

and

H = I

, since the density

π_{t}

collapses to a Dirac delta function under the given initial condition

X_{0} = x_{0}

a.s.

In order to circumvent these complications, we develop an alternative approach based on an appropriately modified feedback particle filter formulation in the following subsection.

4.1. Generalised Feedback Particle Filter Formulation

While it is clearly possible to apply the standard feedback particle filter formulations using (41), the following alternative formulation avoids the need for approximating the gradient of the log-density.

Lemma 3 (Feedback particle filter with correlated innovation).

Consider the McKean–Vlasov equation

d {\tilde{X}}_{t} = f ({\tilde{X}}_{t}) d t + G d W_{t} + K_{t} ({\tilde{X}}_{t}) \circ d I_{t} + Ω_{t} ({\tilde{X}}_{t}) d t,

(43)

where the gain

K_{t} \in R^{N_{x} \times N_{y}}

solves

\nabla \cdot ({\tilde{π}}_{t} (K_{t} C - Q H^{T})) = - {\tilde{π}}_{t} {(h - {\tilde{π}}_{t} [h])}^{T}, {\tilde{π}}_{t} = Law ({\tilde{X}}_{t}),

(44)

with observation map

h (x) = H f (x)

. The function

Ω_{t}

is given by

Ω_{t}^{i} = - \frac{1}{2} \sum_{l = 1}^{N_{x}} \sum_{j = 1}^{N_{y}} \partial_{l} K_{t}^{i j} {(Q H^{T})}^{l j}, i = 1, \dots, N_{x},

(45)

and the innovation process

I_{t}

by

d I_{t} = d Y_{t} - (h ({\tilde{X}}_{t}) d t + H G d W_{t} + R^{1 / 2} d U_{t}) .

(46)

Here,

W_{t}

and

U_{t}

denote mutually independent

N_{w}

-dimensional and

N_{y}

-dimensional Brownian motions, respectively. Then,

{\tilde{π}}_{t} = Law ({\tilde{X}}_{t})

coincides with the solution to (39), provided that the initial distributions agree.

It should be stressed that

W_{t}

in (43) and (46) denotes the same Brownian motion, resulting in correlations between the innovation process and model noise.

Proof.

In this proof the Einstein summation convention over repeated indices is employed, noting that (44) takes the form

\partial_{i} ({\tilde{π}}_{t} (K_{t}^{i j} C^{j k} - {(Q H^{T})}^{i k})) = - {\tilde{π}}_{t} (h^{k} - {\tilde{π}}_{t} [h^{k}]), k = 1, \dots, N_{y} .

(47)

We begin by writing (43) in its Itô-form,

d {\tilde{X}}_{t} = f ({\tilde{X}}_{t}) d t + G d W_{t} + K_{t} ({\tilde{X}}_{t}) d I_{t} + {\hat{Ω}}_{t} ({\tilde{X}}_{t}) d t,

(48)

where

\begin{matrix} {\hat{Ω}}_{t}^{i} & = Ω_{t}^{i} + \frac{1}{2} \{- (\partial_{l} K_{t}^{i j}) {(Q H^{T})}^{l j} + 2 (\partial_{l} K_{t}^{i j}) K_{t}^{l k} C^{k j}\} \\ = (\partial_{l} K_{t}^{i j}) \{K_{t}^{l k} C^{k j} - {(Q H^{T})}^{l j}\} \end{matrix}

(49)

Here, we have used that the covariation between

K_{t}

and

I_{t}

satisfies

d {〈K^{i j}, I^{j}〉}_{t} = \partial_{l} K_{t}^{i j} (G^{l k} d {〈W^{k}, I^{j}〉}_{t} + K_{t}^{l k} d {〈I^{k}, I^{j}〉}_{t}) .

(50)

Furthermore,

{〈 G W, I 〉}_{t} = - Q H^{T} t

and

{〈 I, I 〉}_{t} = 2 C t

.
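These two covariation identities follow directly from (46). As a brief sketch, write the data-generating noise in (3) as W^† and V^† (a notation introduced here only for this illustration) to distinguish it from the independent copies W and U used in (43) and (46), and let X_t denote the reference signal:

```latex
\begin{aligned}
d I_t &= H \bigl(f(X_t) - f(\tilde{X}_t)\bigr)\, d t
         + H G\, d W_t^{\dagger} + R^{1/2}\, d V_t^{\dagger}
         - H G\, d W_t - R^{1/2}\, d U_t, \\
d \langle G W, I \rangle_t &= - G G^{T} H^{T}\, d t = - Q H^{T}\, d t, \\
d \langle I, I \rangle_t &= (H Q H^{T} + R)\, d t + (H Q H^{T} + R)\, d t = 2 C\, d t .
\end{aligned}
```

The first line combines (3) with (46). The second line uses that only the term involving H G dW_t in the innovation is correlated with G dW_t. The third line uses that the pairs (W^†, V^†) and (W, U) are mutually independent, so that each pair contributes C dt.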

For a smooth compactly supported test function

ϕ

, Itô’s formula implies

ϕ ({\tilde{X}}_{t}) = ϕ ({\tilde{X}}_{0}) + \int_{0}^{t} \partial_{i} ϕ ({\tilde{X}}_{s}) d {\tilde{X}}_{s}^{i} + \frac{1}{2} \int_{0}^{t} \partial_{i} \partial_{j} ϕ ({\tilde{X}}_{s}) d {〈 {\tilde{X}}^{i}, {\tilde{X}}^{j} 〉}_{s},

(51)

where the covariation process is given by

{〈 \tilde{X}, \tilde{X} 〉}_{t} = t Q - \int_{0}^{t} (K_{s} H Q + Q H^{T} K_{s}^{T}) d s + 2 \int_{0}^{t} K_{s} C K_{s}^{T} d s .

(52)

Our aim is to show that

{\tilde{π}}_{t} [ϕ]

coincides with

π_{t} [ϕ]

as defined by the Kushner–Stratonovich Equation (39). To this end, we insert (48) and (52) into (51) and take the conditional expectation, arriving at

\begin{matrix} {\tilde{π}}_{t} [ϕ] & = {\tilde{π}}_{0} [ϕ] + \int_{0}^{t} {\tilde{π}}_{s} [L ϕ] d s + \int_{0}^{t} {\tilde{π}}_{s} [(\partial_{i} ϕ) K_{s}^{i j}] d Y_{s}^{j} - \int_{0}^{t} {\tilde{π}}_{s} [(\partial_{i} ϕ) K_{s}^{i j} h^{j}] d s \\ + \int_{0}^{t} {\tilde{π}}_{s} [(\partial_{i} ϕ) {\hat{Ω}}_{s}^{i}] d s + \int_{0}^{t} {\tilde{π}}_{s} [(\partial_{i} \partial_{j} ϕ) {(K_{s} (C K_{s}^{T} - H Q))}^{i j}] d s, \end{matrix}

(53)

recalling that the generator

L

has been defined in (40). Under the assumption that

K_{t}

satisfies (44), the two Equations (39) and (53) coincide. Indeed,

{\tilde{π}}_{s} [(\partial_{i} ϕ) (K_{s}^{i k} C^{k j} - {(Q H^{T})}^{i j})] = {\tilde{π}}_{s} [ϕ (h^{j} - {\tilde{π}}_{s} [h^{j}])]

(54)

implies

{\tilde{π}}_{s} [\nabla ϕ \cdot K_{s}] = {\tilde{π}}_{s} {[ϕ h + H Q \nabla ϕ - ϕ {\tilde{π}}_{s} [h]]}^{T} C^{- 1},

(55)

and the

d Y_{s}

-contributions agree. To verify the same for the

d s

-contributions, we use (44) to obtain

\begin{matrix} {\tilde{π}}_{s} [(\partial_{i} ϕ) K_{s}^{i j} (h^{j} - {\tilde{π}}_{s} [h^{j}])] & = - \int_{R^{N_{x}}} (\partial_{i} ϕ) K_{s}^{i j} \partial_{l} ({\tilde{π}}_{s} (K_{s}^{l n} C^{n j} - {(Q H^{T})}^{l j})) d x \\ = {\tilde{π}}_{s} [(\partial_{i} ϕ) {\hat{Ω}}_{s}^{i}] + {\tilde{π}}_{s} [(\partial_{i} \partial_{j} ϕ) {(K_{s} (C K_{s}^{T} - H Q))}^{i j}] . \end{matrix}

(56)

Finally, collecting terms in (53) and (56), and applying (55) to the remaining

d s

-contribution, i.e.,

- {\tilde{π}}_{s} [\nabla ϕ \cdot K_{s}] {\tilde{π}}_{s} [h]

, leads to the desired result. □

We note that the correlation between the innovation process

I_{t}

and the model error

W_{t}

leads to a correction term

Ω_{t}

in (43) which cannot be subsumed into a Stratonovitch correction, in contrast to the standard feedback particle filter formulation (17).

Remark 2.

Assuming that there exist potential functions

Ψ_{t} = (ψ_{t}^{1}, \dots, ψ_{t}^{N_{y}})

,

ψ_{t}^{k} : R^{N_{x}} \to R

, solving the Poisson equation(s) (19) (with

{\tilde{Π}}_{t}

being replaced by

{\tilde{π}}_{t}

), (44) can be solved by requiring

K_{t} = (\nabla Ψ_{t} + Q H^{T}) C^{- 1},

(57)

thus generalising (18).

Remark 3.

If we set

R = 0

,

H = I

, and

K_{t} = Q H^{T} C^{- 1} = I

in (43), then one obtains

d {\tilde{X}}_{t} = d Y_{t}

(58)

since

Ω_{t}

vanishes, and all other terms in (43) cancel each other out. If, furthermore,

Y_{0} = {\tilde{X}}_{0} = x_{0}

a.s., then

{\tilde{X}}_{t} = Y_{t}

for all

t \in [0, T]

, which in turn justifies our assumption that the gain

K_{t}

is independent of the state variable. Hence, the McKean–Vlasov formulation (43) reproduces the exact reference trajectory

Y_{t}

in the case of no measurement errors and perfectly known initial conditions.

We develop a simplified version of the feedback particle filter formulation (43) for linear SDEs and Gaussian distributions in the following subsection, which will form the basis of the generalised ensemble Kalman–Bucy filter put forward in the follow-up Section 4.3.

4.2. Generalised Kalman–Bucy Filter

Let us assume that

f (x) = F x

with

F \in R^{N_{x} \times N_{x}}

, i.e., Equations (1) and (3) take the form

\begin{matrix} d X_{t} & = F X_{t} d t + G d W_{t}, \end{matrix}

(59a)

\begin{matrix} d Y_{t} & = H F X_{t} d t + H G d W_{t} + R^{1 / 2} d V_{t}, \end{matrix}

(59b)

with initial conditions drawn from a Gaussian distribution. In this case

π_{t}

stays Gaussian for all

t > 0

, i.e.,

π_{t} \sim N ({\bar{x}}_{t}, P_{t})

with

{\bar{x}}_{t} \in R^{N_{x}}

,

P_{t} \in R^{N_{x} \times N_{x}}

. Equation (19) can be solved uniquely by

\nabla_{x} Ψ = P_{t} F^{T} H^{T}

, and thus the McKean–Vlasov equations for the feedback particle filter (43) reduce to

d {\tilde{X}}_{t} = F {\tilde{X}}_{t} d t + G d W_{t} + (P_{t} F^{T} H^{T} + Q H^{T}) C^{- 1} d I_{t},

(60)

with the innovation process (46) leading to

d I_{t} = d Y_{t} - H F {\tilde{X}}_{t} d t - H G d W_{t} - R^{1 / 2} d U_{t} .

(61)

We take the expectation in (60) and (61) and end up with

d {\bar{x}}_{t} = F {\bar{x}}_{t} d t + (P_{t} F^{T} + Q) H^{T} C^{- 1} (d Y_{t} - H F {\bar{x}}_{t} d t) .

(62)

Defining

u_{t} : = {\tilde{X}}_{t} - {\bar{x}}_{t}

, we see that

d u_{t} = F u_{t} d t + G d W_{t} - (P_{t} F^{T} + Q) H^{T} C^{- 1} (H F u_{t} d t + H G d W_{t} + R^{1 / 2} d U_{t}) .

(63)

Next we use

d (u_{t} u_{t}^{T}) = d u_{t} u_{t}^{T} + u_{t} d u_{t}^{T} + d {〈 u, u^{T} 〉}_{t}

(64)

and

P_{t} = E [u_{t} u_{t}^{T}]

to obtain, after some calculations,

d P_{t} = (F P_{t} + P_{t} F^{T}) d t - (P_{t} F^{T} + Q) H^{T} C^{- 1} H (F P_{t} + Q) d t + Q d t .

(65)
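The calculation behind (65) can be sketched as follows. The shorthand K_t for the gain is introduced here only for readability, and the sketch assumes C = H Q H^T + R with Q = G G^T, as implied by the quadratic variations ⟨Y⟩_t = C t and ⟨X, X⟩_t = Q t used in Appendix A.

```latex
% Shorthand (not in the original text): K_t := (P_t F^T + Q) H^T C^{-1}.
% Assumed: C = H Q H^T + R, Q = G G^T, and W_t, U_t independent Brownian motions.
\begin{align*}
d u_t &= (I - K_t H) F u_t \, dt + (I - K_t H) G \, dW_t - K_t R^{1/2} \, dU_t, \\
\frac{d P_t}{dt}
  &= (I - K_t H) F P_t + P_t F^T (I - K_t H)^T
     + (I - K_t H) Q (I - K_t H)^T + K_t R K_t^T \\
  &= F P_t + P_t F^T + Q - K_t H (F P_t + Q)
     - (P_t F^T + Q) H^T K_t^T + K_t C K_t^T \\
  &= F P_t + P_t F^T + Q - (P_t F^T + Q) H^T C^{-1} H (F P_t + Q).
\end{align*}
```

The last step uses that the three gain-dependent terms all equal (P_t F^T + Q) H^T C^{-1} H (F P_t + Q), so two negative copies and one positive copy leave a single negative copy.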

Hence we have shown that our McKean–Vlasov formulation (60) agrees with the standard Kalman–Bucy filter equations for the mean and the covariance matrix in the correlated noise case [6].
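As a minimal numerical sketch, the mean and covariance Equations (62) and (65) can be advanced with an explicit Euler step. The function name `kalman_bucy_step`, the use of NumPy, and the explicit Euler discretisation are assumptions made for illustration and are not taken from the paper; C = H Q H^T + R is assumed, as implied by ⟨Y⟩_t = C t.

```python
import numpy as np

def kalman_bucy_step(xbar, P, dY, F, H, Q, R, dt):
    """One explicit Euler step of the mean (62) and covariance (65) equations.

    xbar: (Nx,) mean, P: (Nx, Nx) covariance, dY: (Ny,) observed increment.
    """
    C = H @ Q @ H.T + R                          # assumed form of C
    K = (P @ F.T + Q) @ H.T @ np.linalg.inv(C)   # gain (P F^T + Q) H^T C^{-1}
    # Mean update, Equation (62).
    xbar_new = xbar + F @ xbar * dt + K @ (dY - H @ F @ xbar * dt)
    # Covariance update, Equation (65).
    P_new = P + (F @ P + P @ F.T + Q - K @ H @ (F @ P + Q)) * dt
    return xbar_new, P_new
```

With R = 0, H = I and P = 0 the gain equals the identity, so the step returns the mean shifted by the observed increment and leaves P at zero, which is the discrete counterpart of Remark 3.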

4.3. Ensemble Kalman–Bucy Filter

The McKean–Vlasov Equation (60) for linear systems, along with Gaussian prior and posterior distributions, suggests approximating the feedback particle filter formulation (43) for nonlinear systems by

d {\tilde{X}}_{t} = f ({\tilde{X}}_{t}) d t + G d W_{t} + (P_{t}^{x h} + Q H^{T}) C^{- 1} d I_{t},

(66)

where the innovation process

I_{t}

is given by (46) as before. In other words, we approximate the gain matrix

K_{t}

in (43) by the state-independent term

(P_{t}^{x h} + Q H^{T}) C^{- 1}

with the covariance matrix

P_{t}^{x h}

defined by

P_{t}^{x h} = {\tilde{π}}_{t} [(x - {\bar{x}}_{t}) {(h (x) - {\tilde{π}}_{t} [h])}^{T}] = {\tilde{π}}_{t} [x {(h (x) - {\tilde{π}}_{t} [h])}^{T}]

(67)

where

{\tilde{π}}_{t}

denotes the law of

{\tilde{X}}_{t}

.

We can now generalise the ensemble Kalman–Bucy filter formulation (31) for the pure parameter estimation problem to the state estimation problem with correlated noise. We assume that M initial state values

{\tilde{X}}_{0}^{i}

have been sampled from an initial distribution

π_{0}

or, alternatively,

{\tilde{X}}_{0}^{i} = x_{0}

for all

i = 1, \dots, M

in case the initial condition is known exactly. These state values are then propagated under the time-stepping procedure

{\tilde{X}}_{n + 1}^{i} = {\tilde{X}}_{n}^{i} + Δ t f ({\tilde{X}}_{n}^{i}) + Δ t^{1 / 2} G Θ_{n}^{i} + ({\hat{P}}_{n}^{x h} + Q H^{T}) {(C + Δ t {\hat{P}}_{n}^{h h})}^{- 1} Δ I_{n}^{i}

(68)

with

Θ_{n}^{i} \sim N (0, I)

, step size

Δ t > 0

, empirical covariance matrices

\begin{matrix} {\hat{P}}_{n}^{x h} & = \frac{1}{M - 1} \sum_{i = 1}^{M} {\tilde{X}}_{n}^{i} {(h ({\tilde{X}}_{n}^{i}) - {\bar{h}}_{n}^{M})}^{T}, {\bar{h}}_{n}^{M} = \frac{1}{M} \sum_{i = 1}^{M} h ({\tilde{X}}_{n}^{i}), \end{matrix}

(69a)

\begin{matrix} {\hat{P}}_{n}^{h h} & = \frac{1}{M - 1} \sum_{i = 1}^{M} h ({\tilde{X}}_{n}^{i}) {(h ({\tilde{X}}_{n}^{i}) - {\bar{h}}_{n}^{M})}^{T}, \end{matrix}

(69b)

and innovation increments

Δ I_{n}^{i}

given by

Δ I_{n}^{i} = Δ Y_{n} - Δ t h ({\tilde{X}}_{n}^{i}) - Δ t^{1 / 2} H G Θ_{n}^{i} - Δ t^{1 / 2} R^{1 / 2} Ξ_{n}^{i}, Ξ_{n}^{i} \sim N (0, I) .

(70)
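The time-stepping procedure (68) with covariances (69) and innovation increments (70) can be sketched in NumPy as below. This is an illustration under stated assumptions, not the authors' implementation: the function name `enkbf_state_step`, the array layout (one particle per row), passing the observation map `h` as a callable, and the symmetric square root of R via an eigendecomposition are all choices made here. C = H Q H^T + R with Q = G G^T is assumed.

```python
import numpy as np

def enkbf_state_step(X, dY, f, h, G, H, R, dt, rng):
    """One step of (68)-(70). X: (M, Nx) ensemble, dY: (Ny,) observed increment."""
    M = X.shape[0]
    Q = G @ G.T
    C = H @ Q @ H.T + R                        # assumed form of C
    fX = np.array([f(x) for x in X])           # drift per particle, (M, Nx)
    hX = np.array([h(x) for x in X])           # observation map per particle, (M, Ny)
    dh = hX - hX.mean(axis=0)                  # h(X^i) minus the ensemble mean of h
    P_xh = X.T @ dh / (M - 1)                  # (69a)
    P_hh = hX.T @ dh / (M - 1)                 # (69b)
    # Symmetric square root of R; also valid for R = 0.
    w, V = np.linalg.eigh(R)
    R_sqrt = (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T
    # Model noise dt^{1/2} G Theta and measurement perturbation dt^{1/2} R^{1/2} Xi.
    model_noise = np.sqrt(dt) * rng.standard_normal((M, G.shape[1])) @ G.T
    meas_noise = np.sqrt(dt) * rng.standard_normal((M, H.shape[0])) @ R_sqrt.T
    # Innovation increments (70); the same model noise enters state and innovation.
    dI = dY - dt * hX - model_noise @ H.T - meas_noise
    gain = (P_xh + Q @ H.T) @ np.linalg.inv(C + dt * P_hh)
    return X + dt * fX + model_noise + dI @ gain.T   # (68)
```

Reusing the same model-noise sample in the state update and in the innovation is what encodes the correlation between model and measurement errors. For R = 0, H = I, h = f and an ensemble started at a known x_0, the step reduces to adding the observed increment to every particle, in line with Remark 3.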

The McKean–Vlasov equations of this section form the basis for the methods proposed for the combined state and parameter estimation problem to be considered next.

5. Combined State and Parameter Estimation

We now return to the combined state and parameter estimation problem, and consider the augmented dynamics

\begin{matrix} d X_{t} & = f (X_{t}, A_{t}) d t + G d W_{t}, \end{matrix}

(71a)

\begin{matrix} d A_{t} & = 0, \end{matrix}

(71b)

with observations (3) as before. The initial conditions satisfy

X_{0} = x_{0}

a.s., and

A_{0} \sim Π_{0}

. Let us introduce the extended state space variable

Z_{t} = {(X_{t}^{T}, A_{t}^{T})}^{T}

. In terms of

Z_{t}

, the Equations (3) and (71) take the form

\begin{matrix} d Z_{t} = \bar{f} (Z_{t}) d t + \bar{G} d W_{t}, \end{matrix}

(72a)

\begin{matrix} d Y_{t} = \bar{H} d Z_{t} + R^{1 / 2} d V_{t}, \end{matrix}

(72b)

with

\bar{f} (z) = (\begin{matrix} f (x, a) \\ 0 \end{matrix}), \bar{G} = (\begin{matrix} G & 0 \\ 0 & 0 \end{matrix}), \bar{H} = (\begin{matrix} H & 0 \end{matrix}) .

(73)

Thus we end up with an augmented state estimation problem of the general structure considered in detail in Section 4 already. Below we provide details on some of the necessary modifications.

5.1. Feedback Particle Filter Formulation

The appropriately extended feedback particle filter Equation (43) leads to

\begin{matrix} d {\tilde{X}}_{t} & = f ({\tilde{X}}_{t}, {\tilde{A}}_{t}) d t + G d W_{t} + (\nabla_{x} Ψ_{t} ({\tilde{X}}_{t}, {\tilde{A}}_{t}) + Q H^{T}) C^{- 1} \circ d I_{t} + Ω_{t} ({\tilde{X}}_{t}, {\tilde{A}}_{t}) d t, \end{matrix}

(74a)

\begin{matrix} d {\tilde{A}}_{t} & = \nabla_{a} Ψ_{t} ({\tilde{X}}_{t}, {\tilde{A}}_{t}) C^{- 1} \circ d I_{t}, \end{matrix}

(74b)

where (46) takes the form

d I_{t} = d Y_{t} - (h ({\tilde{X}}_{t}, {\tilde{A}}_{t}) d t + H G d W_{t} + R^{1 / 2} d U_{t})

(75)

with observation map (4) and correction

Ω_{t}

given by (45), with Q replaced by

\bar{Q} = \bar{G} {\bar{G}}^{T}

and H by

\bar{H}

. In the Poisson equation(s) (19),

{\tilde{Π}}_{t}

is replaced by

{\tilde{π}}_{t}

denoting the joint density of

({\tilde{X}}_{t}, {\tilde{A}}_{t})

. We also stress that

Ψ_{t}

becomes a function of x and a, and we distinguish between gradients with respect to x and a using the notation

\nabla_{x}

and

\nabla_{a}

, respectively.

Numerical implementations of the extended feedback particle filter are demanding due to the need for solving the Poisson equation(s) (19). Instead, we again rely on the ensemble Kalman–Bucy filter approximation, which we describe next.

5.2. Ensemble Kalman–Bucy Filter

We approximate the joint density

{\tilde{π}}_{t}

of

{\tilde{Z}}_{t}

by an ensemble of particles

{\tilde{Z}}_{t}^{i} = (\begin{matrix} {\tilde{X}}_{t}^{i} \\ {\tilde{A}}_{t}^{i} \end{matrix}),

(76)

that is,

{\tilde{π}}_{t} \approx \frac{1}{M} \sum_{i = 1}^{M} δ_{{\tilde{Z}}_{t}^{i}},

(77)

where

δ_{z^{'}}

denotes the Dirac delta function centred at

z^{'}

. The initial ensemble satisfies

X_{0}^{i} = x_{0}

for all

i = 1, \dots, M

, and the initial parameter values

A_{0}^{i}

are independent draws from the prior distribution

Π_{0}

.

At the same time, we make the approximation

{\tilde{Z}}_{t} \sim N ({\bar{z}}_{t}^{M}, {\hat{P}}_{t}^{z z})

when dealing with the Kalman gain of the feedback particle filter. Here the empirical mean

{\bar{z}}_{t}^{M}

has components

{\bar{x}}_{t}^{M} = \frac{1}{M} \sum_{i = 1}^{M} {\tilde{X}}_{t}^{i}, {\bar{a}}_{t}^{M} = \frac{1}{M} \sum_{i = 1}^{M} {\tilde{A}}_{t}^{i},

(78)

and the joint empirical covariance matrix is given by

{\hat{P}}_{t}^{z z} = \frac{1}{M - 1} \sum_{i = 1}^{M} {\tilde{Z}}_{t}^{i} {({\tilde{Z}}_{t}^{i} - {\bar{z}}_{t}^{M})}^{T} = (\begin{matrix} {\hat{P}}_{t}^{x x} & {\hat{P}}_{t}^{x a} \\ {({\hat{P}}_{t}^{x a})}^{T} & {\hat{P}}_{t}^{a a} \end{matrix}) .

(79)

As in Section 4.3, the solution to (19) can be approximated by

\nabla_{x} Ψ_{t} = P_{t}^{x h}, \nabla_{a} Ψ_{t} = P_{t}^{a h},

(80)

where finally, the covariance matrices

P_{t}^{x h}

and

P_{t}^{a h}

are estimated by their empirical counterparts

\begin{matrix} {\hat{P}}_{t}^{x h} & = \frac{1}{M - 1} \sum_{i = 1}^{M} {\tilde{X}}_{t}^{i} {(h ({\tilde{X}}_{t}^{i}, {\tilde{A}}_{t}^{i}) - {\bar{h}}_{t}^{M})}^{T}, \end{matrix}

(81a)

\begin{matrix} {\hat{P}}_{t}^{a h} & = \frac{1}{M - 1} \sum_{i = 1}^{M} {\tilde{A}}_{t}^{i} {(h ({\tilde{X}}_{t}^{i}, {\tilde{A}}_{t}^{i}) - {\bar{h}}_{t}^{M})}^{T}, \end{matrix}

(81b)

with

{\bar{h}}_{t}^{M}

defined by

{\bar{h}}_{t}^{M} = \frac{1}{M} \sum_{i = 1}^{M} h ({\tilde{X}}_{t}^{i}, {\tilde{A}}_{t}^{i}) .

(82)

Summing everything up, we obtain the following generalised ensemble Kalman–Bucy filter equations

\begin{matrix} d {\tilde{X}}_{t}^{i} & = f ({\tilde{X}}_{t}^{i}, {\tilde{A}}_{t}^{i}) d t + G d W_{t}^{i} + ({\hat{P}}_{t}^{x h} + Q H^{T}) C^{- 1} d I_{t}^{i}, \end{matrix}

(83a)

\begin{matrix} d {\tilde{A}}_{t}^{i} & = {\hat{P}}_{t}^{a h} C^{- 1} d I_{t}^{i}, \end{matrix}

(83b)

where the innovations are given by

d I_{t}^{i} = d Y_{t} - (h ({\tilde{X}}_{t}^{i}, {\tilde{A}}_{t}^{i}) d t + H G d W_{t}^{i} + R^{1 / 2} d U_{t}^{i}),

(84)

and

W_{t}^{i}

and

U_{t}^{i}

denote independent

N_{x}

-dimensional and

N_{y}

-dimensional Brownian motions, respectively, for

i = 1, \dots, M

.

The interacting particle Equation (83) can be time-stepped along the lines discussed in Section 4.3 for the pure state estimation formulation of the ensemble Kalman–Bucy filter.
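A possible time discretisation of (83) and (84), following the pattern of (68), is sketched below. The function name `enkbf_joint_step`, the array layout, the choice h(x, a) = H f(x, a), and the regularised inverse (C + Δt P^{hh})^{-1} carried over from (68) are assumptions made for this illustration.

```python
import numpy as np

def enkbf_joint_step(X, A, dY, f, G, H, R, dt, rng):
    """One discrete step of (83)-(84).

    X: (M, Nx) states, A: (M, Na) parameters, dY: (Ny,) observed increment,
    f: callable (x, a) -> drift of shape (Nx,).
    """
    M = X.shape[0]
    Q = G @ G.T
    C = H @ Q @ H.T + R                               # assumed form of C
    fX = np.array([f(x, a) for x, a in zip(X, A)])    # drift per particle
    hX = fX @ H.T                                     # h(x, a) = H f(x, a), assumed
    dh = hX - hX.mean(axis=0)                         # uses the mean (82)
    P_xh = X.T @ dh / (M - 1)                         # (81a)
    P_ah = A.T @ dh / (M - 1)                         # (81b)
    P_hh = hX.T @ dh / (M - 1)
    w, V = np.linalg.eigh(R)                          # symmetric square root of R
    R_sqrt = (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T
    dW = np.sqrt(dt) * rng.standard_normal((M, G.shape[1])) @ G.T
    dU = np.sqrt(dt) * rng.standard_normal((M, H.shape[0])) @ R_sqrt.T
    dI = dY - dt * hX - dW @ H.T - dU                 # innovations (84)
    S_inv = np.linalg.inv(C + dt * P_hh)
    X_new = X + dt * fX + dW + dI @ ((P_xh + Q @ H.T) @ S_inv).T   # (83a)
    A_new = A + dI @ (P_ah @ S_inv).T                              # (83b)
    return X_new, A_new
```

The parameters are driven only through the innovations, so an ensemble with no spread in the parameter values leaves them unchanged; the prior spread in A is what allows the data to inform the estimate.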

6. Numerical Results

We now apply the generalised ensemble Kalman–Bucy filter formulation (83) with innovation (84) to five different model scenarios.

6.1. Parameter Estimation for the Ornstein–Uhlenbeck Process

Our first example is provided by the Ornstein–Uhlenbeck process

d X_{t} = a X_{t} d t + Q^{1 / 2} d W_{t}

(85)

with unknown parameter

a \in R

, and known initial condition

X_{0} = 1 / 2

. We assume an observation model of the form (3) with

H = 1

, and a measurement error taking values

R = 0.01

,

R = 0.0001

, and

R = 0

. The model error variance is set to either

Q = 0.5

or

Q = 0.005

. Except for the case

R = 0

, a combined state and parameter estimation problem is to be solved. We implement the ensemble Kalman–Bucy filter (Section 5.2) with innovation (84), step size

Δ t = 0.005

, and ensemble size

M = 1000

. The data is generated using the Euler–Maruyama method applied to (85), with

a = - 1 / 2

and integrated over a time-interval

[0, 500]

with the same step size. The prior distribution

Π_{0}

for the parameter is Gaussian with mean

\bar{a} = - 1 / 2

and variance

σ_{a}^{2} = 2

. The results can be found in Figure 1. We find that the ensemble Kalman–Bucy filter is able to successfully identify the unknown parameter under all tested experimental settings, except for the largest measurement error case where

R = 0.01

. There, a small systematic offset of the estimated parameter value can be observed. One can also see that the variance in the parameter estimate monotonically decreases in time in all cases, while the variance in the state estimates approximately reaches a steady state.
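The experiment of this subsection can be sketched in scalar form as follows. The function names `ou_increments` and `enkbf_ou` are hypothetical, the increments are formed as state differences plus scaled Gaussian noise, and the time stepping follows the pattern of (68); none of this code is taken from the paper.

```python
import numpy as np

def ou_increments(a, Q, R, x0, dt, n_steps, rng):
    """Euler-Maruyama path of (85) and noisy observed increments (H = 1)."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    xi = rng.standard_normal(n_steps)
    for n in range(n_steps):
        x[n + 1] = x[n] + dt * a * x[n] + np.sqrt(dt * Q) * xi[n]
    return np.diff(x) + np.sqrt(dt * R) * rng.standard_normal(n_steps)

def enkbf_ou(dY, Q, R, x0, dt, M, a_mean, a_var, rng):
    """Scalar ensemble Kalman-Bucy filter for state X and parameter a.

    Returns the ensemble mean of the parameter after every step.
    """
    X = np.full(M, x0)                                  # known initial state
    A = a_mean + np.sqrt(a_var) * rng.standard_normal(M)  # Gaussian prior draws
    C = Q + R                                           # H = 1
    a_hist = np.empty(len(dY))
    for n, dy in enumerate(dY):
        h = A * X                                       # h(x, a) = a x
        dh = h - h.mean()
        P_xh = X @ dh / (M - 1)
        P_ah = A @ dh / (M - 1)
        P_hh = h @ dh / (M - 1)
        dW = np.sqrt(dt * Q) * rng.standard_normal(M)   # model noise
        dU = np.sqrt(dt * R) * rng.standard_normal(M)   # measurement perturbation
        dI = dy - dt * h - dW - dU                      # innovation, cf. (84)
        s = C + dt * P_hh
        X = X + dt * h + dW + (P_xh + Q) / s * dI
        A = A + P_ah / s * dI
        a_hist[n] = A.mean()
    return a_hist
```

The settings quoted above would correspond to dt = 0.005, n_steps = 100000 (the interval [0, 500]), M = 1000, a_mean = -0.5 and a_var = 2; shorter runs are enough to check that the code executes.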

6.2. Averaging

Consider the equations

\begin{matrix} d Y_{t} & = (1 - Z_{t}^{2}) Y_{t} d t + Q^{1 / 2} d W_{t}^{y}, \end{matrix}

(86a)

\begin{matrix} d Z_{t} & = - \frac{α}{ϵ} Z_{t} d t + \sqrt{\frac{2 λ}{ϵ}} d W_{t}^{z} \end{matrix}

(86b)

from [19] for

λ, α, γ, ϵ > 0

, and initial condition

Y_{0} = 1 / 2

,

Z_{0} = 0

. The reduced equations in the limit

ϵ \to 0

are given by (85), with parameter value

a = 1 - \frac{λ}{α}

(87)

and initial condition

X_{0} = 1 / 2

. The reduced dynamics corresponds to a (stable) Ornstein–Uhlenbeck process for

λ / α > 1

. We wish to estimate the parameter a from observed increments

Δ Y_{n} = Y_{n + 1} - Y_{n} + Δ t^{1 / 2} R^{1 / 2} Ξ_{n}, Ξ_{n} \sim N (0, 1),

(88)

where the sequence

{Y_{n}}_{n \geq 0}

is obtained by time-stepping (86) using the Euler–Maruyama method with a step size

Δ t

. We set

λ = 3

,

α = 2

(so that

a = - 1 / 2

),

Q = 0.5

, and

ϵ \in {0.1, 0.01}

in our experiments. The measurement noise is set to

R = 0.01

or

R = 0

(pure parameter estimation).
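The data generation for this example can be sketched as below. The helper names `reduced_drift_parameter`, `simulate_fast_slow` and `observed_increments` are hypothetical, and the `stride` argument is one assumed way of implementing subsampling of the stored solution.

```python
import numpy as np

def reduced_drift_parameter(lam, alpha):
    """Parameter (87) of the averaged equation: a = 1 - lambda / alpha."""
    return 1.0 - lam / alpha

def simulate_fast_slow(lam, alpha, eps, Q, dt, n_steps, rng, y0=0.5, z0=0.0):
    """Euler-Maruyama discretisation of the fast-slow system (86)."""
    y = np.empty(n_steps + 1)
    z = np.empty(n_steps + 1)
    y[0], z[0] = y0, z0
    wy = rng.standard_normal(n_steps)
    wz = rng.standard_normal(n_steps)
    for n in range(n_steps):
        y[n + 1] = y[n] + dt * (1.0 - z[n] ** 2) * y[n] + np.sqrt(dt * Q) * wy[n]
        z[n + 1] = z[n] - dt * alpha / eps * z[n] + np.sqrt(2.0 * lam * dt / eps) * wz[n]
    return y, z

def observed_increments(y, R, dt, rng, stride=1):
    """Increments (88) of every stride-th stored value, with step stride * dt."""
    ys = y[::stride]
    noise = np.sqrt(stride * dt * R) * rng.standard_normal(len(ys) - 1)
    return np.diff(ys) + noise
```

The fast variable is an Ornstein–Uhlenbeck process with stationary variance λ/α, so the averaged value of 1 - Z^2 is 1 - λ/α, which is where (87) comes from.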

We implement the ensemble Kalman–Bucy filter (83) with innovation (84), step size

Δ t = ϵ / 50

, and ensemble size

M = 1000

for the reduced Equation (85) with parameter value (87). The data is generated from an Euler–Maruyama discretization of (86) with the same step size. We also investigate the effect of subsampling the observations for

ϵ = 0.01

by solving (86) with step size

Δ t = ϵ / 50

and storing only every tenth solution

Y_{n}

, while the reduced equations and the ensemble Kalman–Bucy filter equations are integrated with

Δ t = ϵ / 5

. The results are shown in Figure 2. Figure 3 shows the results for the same experiments repeated with a smaller ensemble size of

M = 10

. We find that the smaller ensemble size leads to more noisy estimates for the variance in

{\tilde{X}}_{n}

and a faster decay of the variance in

{\tilde{A}}_{n}

, but the estimated parameter values are equally well converged. Subsampling does not lead to significant changes in the estimated parameter values. This is in contrast to the example considered next.

Finally, we mention [31] for alternative approaches to sequential estimation in the context of averaging, which rely, however, on different assumptions on the data.

6.3. Homogenisation

In this example, the data is produced by integrating the multi-scale SDE

\begin{matrix} d Y_{t} & = (\frac{\sqrt{σ / 2}}{ϵ} Z_{t} + a Y_{t}) d t, \end{matrix}

(89a)

\begin{matrix} d Z_{t} & = - \frac{1}{ϵ^{2}} Z_{t} d t + \frac{\sqrt{2}}{ϵ} d W_{t}^{z} \end{matrix}

(89b)

with parameter values

ϵ = 0.1

,

a = - 1 / 2

,

σ = 1 / 2

, and initial condition

Y_{0} = 1 / 2

,

Z_{0} = 0

. Here,

W_{t}^{z}

denotes standard Brownian motion. The equations are discretised with step size

Δ τ = ϵ^{2} / 50 = 0.0002

, and the resulting increments (88) are stored over a time interval

[0, 500]

. See [32] for more details.

According to homogenisation theory, the reduced model is given by (85) with

Q = σ

, and we wish to estimate the parameter a from the data

{Δ Y_{n}}

produced according to (88). It is known that a standard maximum likelihood estimator (MLE) given by

a_{ML} = \frac{\sum_{n} Y_{t_{n}} (Y_{t_{n + 1}} - Y_{t_{n}})}{\sum_{n} Y_{t_{n}}^{2} Δ τ}

(90)

leads to

a_{ML} = 0

in the limit

Δ τ \to 0

and the observation interval

T \to \infty

. This MLE corresponds to

H = I

and

R = 0

in our extended state space formulation of the problem. Subsampling can be achieved by choosing an appropriate time-step

Δ t > Δ τ

in the ensemble Kalman–Bucy filter equations and a corresponding subsampling of the data points

Y_{n}

in (88). We used

Δ t = 50 Δ τ = 0.01

and

Δ t = 500 Δ τ = 0.1

, respectively. The results can be found in Figure 4. It can be seen that only the larger subsampling leads to a correct estimate of the parameter a. This is in line with known results for the maximum likelihood estimator (90). See [32] and references therein.
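The maximum likelihood estimator (90) and its subsampled variant can be sketched as follows. The function names are hypothetical, and the simulation routine is a plain Euler–Maruyama discretisation of (89) written for illustration only.

```python
import numpy as np

def simulate_multiscale(a, sigma, eps, dtau, n_steps, rng, y0=0.5, z0=0.0):
    """Euler-Maruyama discretisation of (89); returns the slow component Y."""
    y = np.empty(n_steps + 1)
    y[0] = y0
    z = z0
    w = rng.standard_normal(n_steps)
    for n in range(n_steps):
        y[n + 1] = y[n] + dtau * (np.sqrt(sigma / 2.0) / eps * z + a * y[n])
        z = z - dtau / eps ** 2 * z + np.sqrt(2.0 * dtau) / eps * w[n]
    return y

def drift_mle(y, dt):
    """Estimator (90): sum of Y_n (Y_{n+1} - Y_n) over sum of Y_n^2 dt."""
    dy = np.diff(y)
    return np.sum(y[:-1] * dy) / (np.sum(y[:-1] ** 2) * dt)

def subsampled_mle(y, dtau, stride):
    """Estimator (90) applied to every stride-th value, with step stride * dtau."""
    return drift_mle(y[::stride], stride * dtau)
```

With dtau = 0.0002, the two filter step sizes quoted above correspond to stride = 50 and stride = 500. On a noise-free Euler path of a linear drift equation, `drift_mle` recovers the drift coefficient exactly, which gives a simple check of the formula.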

6.4. Nonparametric Drift and State Estimation

We consider nonparametric drift estimation for one-dimensional SDEs over a periodic domain

[0, 2 π)

in the setting considered from a theoretical perspective in [33]. There, a zero-mean Gaussian process prior

GP (0, D^{- 1})

is placed on the unknown drift function, with inverse covariance operator

D : = η [{(- Δ)}^{p} + κ I] .

(91)

The integer parameter p sets the regularity of the process, whereas

η, κ \in R^{+}

control its characteristic correlation length and stationary variance.

Spatial discretization of the problem is carried out by first defining a grid of

N_{d}

evenly spaced points on the domain, at locations

x_{i} = i Δ x

,

Δ x = 2 π / N_{d}

. The drift function is projected onto compactly supported functions centred at these points, which are piecewise linear with

b_{i} (x_{j}) = δ_{i j}

(92)

and linear interpolation is used to define a drift function

f (x, a)

for all

x \in [0, 2 π)

, that is, it is of the form (2) with

f_{0} (x) \equiv 0

. In this example, we set

N_{d} = 200

. Sample realisations, as well as the reference drift

f^{*}

, can be found in Figure 5a.
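The piecewise linear drift representation can be sketched as follows, assuming that the drift is the linear combination of the hat functions (92) with coefficients a_i and f_0 = 0. The helper names `make_grid` and `drift` are hypothetical; periodic linear interpolation is delegated to `numpy.interp` with its `period` argument.

```python
import numpy as np

def make_grid(n_d):
    """Evenly spaced grid x_i = i * dx on [0, 2*pi) with dx = 2*pi / n_d."""
    dx = 2.0 * np.pi / n_d
    return dx * np.arange(n_d), dx

def drift(x, a, grid):
    """Evaluate f(x, a) = sum_i a_i b_i(x) for periodic piecewise linear hats b_i.

    Because b_i(x_j) = delta_ij, this equals periodic linear interpolation
    of the nodal values a_i.
    """
    return np.interp(x, grid, a, period=2.0 * np.pi)
```

At a grid point the drift equals the corresponding coefficient, and across the periodic boundary it interpolates between the last and the first coefficient.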

Data is generated by integrating the SDE (1) with drift

f^{*}

forward in time from initial condition

X_{0} = π

and with noise level

Q = 0.1

, using the Euler–Maruyama discretisation with step size

Δ t = 0.1

over one million time-steps. The spatial distribution of the solutions

X_{n}

is plotted in Figure 5b. The data is then given by

Δ Y_{n} = X_{n + 1} - X_{n} + Δ t^{1 / 2} R^{1 / 2} Ξ_{n}

(93)

with

R = 0.00001

. Data assimilation is performed using the time-discretised ensemble Kalman–Bucy filter Equation (83) with innovation (84), ensemble size

M = 200

, and step size

Δ t = 0.1

.

The final estimate of the drift function (ensemble mean) and the ensemble of drift functions can be found in Figure 5c. Figure 5d displays the ensemble of state estimates and the value of the reference solution at the final time. We find that the ensemble Kalman–Bucy filter is able to successfully estimate the drift function and the model states. Further experiments reveal that the drift function can only be identified for sufficiently small measurement errors.

6.5. SPDE Parameter Estimation

Consider the stochastic heat equation on the periodic domain

x \in [0, 2 π)

, given in conservative form by the stochastic partial differential equation (SPDE)

\begin{matrix} d u (x, t) = \nabla \cdot (θ (x) \nabla u (x, t)) d t + σ^{1 / 2} d W (x, t), \end{matrix}

(94)

where

W (x, t)

is space-time white noise. With constant

θ (x) = θ

, this SPDE reduces to

\begin{matrix} d u (x, t) = θ Δ u (x, t) d t + σ^{1 / 2} d W (x, t) . \end{matrix}

(95)

In this example, we examine the estimation of

θ

from incremental measurements of a locally averaged quantity

q (x, t)

that arises naturally in a standard finite volume discretisation of (95).

To discretise the system, one first defines

q_{t}^{i} = q (x_{i}, t)

around

N_{d} = 200

grid points

x_{i}

on a regular grid, separated by distances

Δ x

, as

\begin{matrix} q_{t}^{i} = \int_{x_{i} - Δ x / 2}^{x_{i} + Δ x / 2} u (x, t) d x . \end{matrix}

(96)

The conservative (drift) term in (94) reduces to

\begin{matrix} \int_{x_{i} - Δ x / 2}^{x_{i} + Δ x / 2} \nabla \cdot (θ (x) \nabla u (x, t)) d x = θ_{i + 1 / 2} \nabla u_{t}^{i + 1 / 2} - θ_{i - 1 / 2} \nabla u_{t}^{i - 1 / 2}, \end{matrix}

(97)

where

θ_{i \pm 1 / 2} \equiv θ (x_{i} \pm Δ x / 2)

, etc. The following standard finite difference approximations

\begin{matrix} \nabla u_{t}^{i + 1 / 2} ≃ \frac{u_{t}^{i + 1} - u_{t}^{i}}{Δ x}, u_{t}^{i} ≃ Δ x^{- 1} q_{t}^{i} \end{matrix}

(98)

yield the

N_{d}

-dimensional SDE

\begin{matrix} d q_{t}^{i} & = θ (\frac{q_{t}^{i + 1} - 2 q_{t}^{i} + q_{t}^{i - 1}}{Δ x^{2}}) d t + σ^{1 / 2} Δ x^{1 / 2} d W_{t}^{i} \end{matrix}

(99)

for constant

θ

, where

W_{t}^{i}

are independent one-dimensional Brownian motions in time.
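One Euler–Maruyama step of the discretised system (99) on the periodic grid can be sketched as below. The function names `heat_drift` and `heat_em_step` are hypothetical, and periodic neighbours are obtained with `numpy.roll`.

```python
import numpy as np

def heat_drift(q, theta, dx):
    """Drift of (99): theta * (q^{i+1} - 2 q^i + q^{i-1}) / dx^2, periodic in i."""
    return theta * (np.roll(q, -1) - 2.0 * q + np.roll(q, 1)) / dx ** 2

def heat_em_step(q, theta, sigma, dx, dt, rng):
    """One Euler-Maruyama step of (99) with independent noise in each cell."""
    noise = np.sqrt(sigma * dx * dt) * rng.standard_normal(q.shape)
    return q + dt * heat_drift(q, theta, dx) + noise
```

The drift sums to zero over the grid, which reflects the conservative form (94): without noise, the total of the cell integrals q^i is preserved by each step.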

Following recent results from [20] we consider the case of estimation of a constant

a = θ

value from measurements

d q_{t}^{*}

at a fixed location/index

j^{*} \in {1, \dots, N_{d}}

. The data trajectory is thus given by

\begin{matrix} d Y_{t} = d q_{t}^{*} + R^{1 / 2} d V_{t} \end{matrix}

(100)

where

R^{1 / 2}

is a scalar and

V_{t}

is a standard Brownian motion in one dimension. We perform numerical experiments in which the initial state

q_{0}^{i}

is set to zero for all indices i and the prior on the unknown parameter

a = θ

is uniform over the interval

[0.2, 1.8]

.

The increment data is generated by first integrating (99) forward in time from the known initial condition

q_{i} (0) = 0

for all i. The equation is discretised in time using the Euler–Maruyama method. It is known that

Δ t < Δ x^{2} / (2 θ)

is required for stability of the Euler–Maruyama discretisation; we use the much smaller time step

Δ t = Δ x^{2} / 80

. The solution is sampled with this same time step, and increment measurements are approximated at time

t_{n}

by setting the measurement noise level R to zero in (100), resulting in

\begin{matrix} Δ Y_{n} = q_{n + 1}^{*} - q_{n}^{*} . \end{matrix}

(101)

Please note that the associated model error in (1) is given by

G = σ^{1 / 2} Δ x^{1 / 2} I

and the matrix H in (3) projects the vector of state increments onto a single component with index

j^{*} = N_{d} / 2

. Simulations are performed over the time-interval

[0, 20]

. The results can be found in Figure 6a. We also compute the model evidence for a sequence of parameter values

θ \in {0.2, 0.3, \dots, 1.8}

based on a standard Kalman–Bucy filter [6] for the associated linear state estimation problem. See Figure 6b. Both approaches agree with the reference value

θ = 1

.

6.6. Discussion

The results presented here demonstrate that the proposed methodology can be applied to a broad range of continuous-time state and parameter estimation problems with correlated measurement and model errors. Alternatively, one could have employed standard SMC or MCMC methods utilising the modified observation model (41) as implied by the Kushner–Stratonovich formulation (39) of the filtering problem. However, such implementations require the approximation of the additional

Q \nabla log π_{t}

term which is nontrivial if only samples from

π_{t}

are available. Furthermore, the behaviour of such implementations in the limit

R \to 0

and

H = I

(pure parameter estimation problem) is unclear since

π_{t}

degenerates into a Dirac delta distribution, potentially leading to numerical difficulties in this singular regime. The proposed generalised feedback particle filter formulation avoids these issues through the use of stochastic innovations which are correlated with the model noise. In other words, the distribution

π_{t}

does not appear explicitly in the innovation process (46), and the correlated noise terms cancel each other out as discussed in Remark 3 for

R = 0

and

H = I

. The main computational challenge of the feedback particle filter approach is given by the need for finding the Kalman gain matrix (57). However, the constant gain ensemble Kalman–Bucy approximation

K_{t} \approx (P^{x h} + Q H^{T}) C^{- 1}

(102)

is easy to implement. In fact, the only differences with the standard ensemble Kalman–Bucy filter formulation of [14] are in the additional

Q H^{T}

term in the Kalman gain, and a correlation between the stochastic innovation process and the model error. While the ensemble Kalman–Bucy filter gave rather satisfactory results for the numerical experiments displayed in Section 6, strongly non-Gaussian distributions might require more accurate approximations to the Kalman gain matrix (57). In that case, one could rely on the particle-based diffusion map approximation considered in [27].

7. Conclusions

In this paper, we have derived McKean–Vlasov equations for combined state and parameter estimation from continuously observed state increments. An approximate and robust implementation of these McKean–Vlasov equations in the form of a generalised ensemble Kalman–Bucy filter has been provided and applied to a range of increasingly complex model systems. Future work will address the treatment of temporally-correlated measurement and model errors, as well as a rigorous analysis of these McKean–Vlasov equations in the contexts of multi-scale dynamics and nonparametric drift estimation.

Author Contributions

Methodology, N.N. and S.R.; software, S.R. and P.J.R.; validation, N.N., S.R. and P.J.R.; writing—original draft preparation, N.N., S.R.; writing—review and editing, N.N., S.R. and P.J.R.

Funding

This research has been partially funded by Deutsche Forschungsgemeinschaft (DFG) through grants CRC 1294 ‘Data Assimilation’ (project A06) and CRC 1114 ‘Scaling Cascades’ (project A02).

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Appendix A. The Filtering Equations for Correlated Noise

In this appendix we outline a derivation of the Kushner-Stratonovich Equation (39) for the signal-observation dynamics given by (38). In fact, we only compute the evolution equation (termed modified Zakai equation) for the unnormalised filtering distribution

ρ_{t} [ϕ] = E [l_{t} ϕ (X_{t}) | Y_{[0, t]}]

, where the likelihood

l_{t}

is given by

l_{t} \equiv l (Y_{[0, t]} | X_{[0, t]}) = exp (\int_{0}^{t} f {(X_{s})}^{T} H^{T} C^{- 1} d Y_{s} - \frac{1}{2} \int_{0}^{t} f {(X_{s})}^{T} H^{T} C^{- 1} H f (X_{s}) d s) .

(A1)

Obtaining the Kushner-Stratonovich formulation is then standard, applying Itô’s formula to the Kallianpur-Striebel formula

π_{t} [ϕ] = ρ_{t} [ϕ] / ρ_{t} [1]

, see ([7], Chapter 3). The following result is in agreement with Corollaries 3.39 and 3.40 in [7].

Lemma A1.

The modified Zakai equation is given by

ρ_{t} [ϕ] = ρ_{0} [ϕ] + \int_{0}^{t} ρ_{s} [L ϕ] d s + \int_{0}^{t} ρ_{s} [ϕ f^{T} H^{T} C^{- 1}] d Y_{s} + \int_{0}^{t} ρ_{s} [\nabla ϕ] Q H^{T} C^{- 1} d Y_{s},

(A2)

where the generator

L

has been defined in (40).

Proof.

For convenience, let us define the process

M_{t} = \int_{0}^{t} f {(X_{s})}^{T} H^{T} C^{- 1} d Y_{s},

(A3)

where

Y_{s}

satisfies (38b). From

{〈 Y 〉}_{t} = C t

we see that

{〈 M 〉}_{t} = \int_{0}^{t} f {(X_{s})}^{T} H^{T} C^{- 1} H f (X_{s}) d s .

(A4)

Hence, the likelihood takes the form

l_{t} = exp (M_{t} - \frac{1}{2} {〈 M 〉}_{t}),

(A5)

satisfying the SDE

d l_{t} = l_{t} d M_{t} .

(A6)

For an arbitrary smooth compactly supported test function

ϕ

, Itô’s formula implies

\begin{matrix} l_{t} ϕ (X_{t}) & = ϕ (X_{0}) + \int_{0}^{t} ϕ (X_{s}) d l_{s} + \int_{0}^{t} l_{s} \nabla ϕ (X_{s}) \cdot d X_{s} \end{matrix}

(A7a)

\begin{matrix} + \frac{1}{2} \sum_{i, j = 1}^{N_{x}} \int_{0}^{t} l_{s} \partial_{i} \partial_{j} ϕ (X_{s}) d {〈 X^{i}, X^{j} 〉}_{s} + \sum_{i = 1}^{N_{x}} \int_{0}^{t} \partial_{i} ϕ (X_{s}) d {〈 l, X^{i} 〉}_{s}, \end{matrix}

(A7b)

where

X_{s}

satisfies (38a). For the covariation process

{〈 l, X 〉}_{t}

we obtain

{〈 l, X 〉}_{t} = l_{t} {〈 M, X 〉}_{t} = l_{t} f {(X_{t})}^{T} H^{T} C^{- 1} H Q t,

(A8)

using

{〈 Y, X 〉}_{t} = H Q t

. Furthermore,

{〈 X, X 〉}_{t} = Q t

, which follows from the definition of the stochastic contributions in (38a).

We now apply the conditional expectation to (A7). Noticing that

\int_{0}^{t} ϕ (X_{s}) d l_{s} = \int_{0}^{t} l_{s} ϕ (X_{s}) f {(X_{s})}^{T} H^{T} C^{- 1} d Y_{s},

(A9)

the result follows from (A6). □

References

1. Kutoyants, Y. Statistical Inference for Ergodic Diffusion Processes; Springer: New York, NY, USA, 2004.
2. Pavliotis, G. Stochastic Processes and Applications; Springer: New York, NY, USA, 2014.
3. Apte, A.; Hairer, M.; Stuart, A.; Voss, J. Sampling the posterior: An approach to non-Gaussian data assimilation. Phys. D Nonlinear Phenom. 2007, 230, 50–64.
4. Salman, H.; Kuznetsov, L.; Jones, C.; Ide, K. A method for assimilating Lagrangian data into a shallow-water-equation ocean model. Mon. Weather Rev. 2006, 134, 1081–1101.
5. Apte, A.; Jones, C.; Stuart, A. A Bayesian approach to Lagrangian data assimilation. Tellus A 2008, 60, 336–347.
6. Simon, D. Optimal State Estimation; Wiley: Hoboken, NJ, USA, 2006.
7. Bain, A.; Crisan, D. Fundamentals of Stochastic Filtering; Springer: New York, NY, USA, 2009.
8. Liu, J. Monte Carlo Strategies in Scientific Computing; Springer: New York, NY, USA, 2001.
9. Crisan, D.; Xiong, J. Approximate McKean-Vlasov representation for a class of SPDEs. Stochastics 2010, 82, 53–68.
10. McKean, H. A class of Markov processes associated with nonlinear parabolic equations. Proc. Natl. Acad. Sci. USA 1966, 56, 1907–1911.
11. Yang, T.; Mehta, P.; Meyn, S. Feedback particle filter. IEEE Trans. Autom. Control 2013, 58, 2465–2480.
12. Reich, S. Data assimilation: The Schrödinger perspective. Acta Numer. 2019, 28, 635–710.
13. Majda, A.; Harlim, J. Filtering Complex Turbulent Systems; Cambridge University Press: Cambridge, UK, 2012.
14. Bergemann, K.; Reich, S. An ensemble Kalman–Bucy filter for continuous data assimilation. Meteorol. Z. 2012, 21, 213–219.
15. Taghvaei, A.; de Wiljes, J.; Mehta, P.; Reich, S. Kalman filter and its modern extensions for the continuous-time nonlinear filtering problem. ASME. J. Dyn. Syst. Meas. Control 2017, 140.
16. Law, K.; Stuart, A.; Zygalakis, K. Data Assimilation: A Mathematical Introduction; Springer: New York, NY, USA, 2015.
17. Reich, S.; Cotter, C. Probabilistic Forecasting and Bayesian Data Assimilation; Cambridge University Press: Cambridge, UK, 2015.
18. Moral, P.D. Mean Field Simulation for Monte Carlo Integration; Chapman and Hall/CRC: London, UK, 2013.
19. Pavliotis, G.; Stuart, A. Multiscale Methods; Springer: New York, NY, USA, 2008.
20. Altmeyer, R.; Reiß, M. Nonparametric Estimation for Linear SPDEs from Local Measurements; Technical Report; Humboldt University Berlin: Berlin, Germany, 2019.
21. Saha, S.; Gustafsson, F. Particle filtering with dependent noise processes. IEEE Trans. Signal Process. 2012, 60, 4497–4508.
22. Berry, T.; Sauer, T. Correlations between systems and observation errors in data assimilation. Mon. Weather Rev. 2018, 146, 2913–2931.
23. Mitchell, H.L.; Daley, R. Discretization error and signal/error correlation in atmospheric data assimilation: (I). All scales resolved. Tellus A 1997, 49, 32–53.
24. Papaspiliopoulos, O.; Pokern, Y.; Roberts, G.; Stuart, A. Nonparametric estimation of diffusion: A differential equation approach. Biometrika 2012, 99, 511–531.
25. Särkkä, S. Bayesian Filtering and Smoothing; Cambridge University Press: Cambridge, UK, 2013.
26. Laugesen, R.S.; Mehta, P.G.; Meyn, S.P.; Raginsky, M. Poisson’s equation in nonlinear filtering. SIAM J. Control Optim. 2015, 53, 501–525.
27. Taghvaei, A.; Mehta, P.; Meyn, S. Gain Function Approximation in the Feedback Particle Filter; Technical Report; University of Illinois at Urbana-Champaign: Champaign, IL, USA, 2019.
28. Amezcua, J.; Kalnay, E.; Ide, K.; Reich, S. Ensemble transform Kalman-Bucy filters. Q. J. R. Meteorol. Soc. 2014, 140, 995–1004.
29. De Wiljes, J.; Reich, S.; Stannat, W. Long-time stability and accuracy of the ensemble Kalman–Bucy filter for fully observed processes and small measurement noise. SIAM J. Appl. Dyn. Syst. 2018, 17, 1152–1181.
30. Blömker, D.; Schillings, C.; Wacker, P. A strongly convergent numerical scheme for ensemble Kalman inversion. SIAM J. Numer. Anal. 2018, 56, 2537–2562.
31. Harlim, J. Model error in data assimilation. In Nonlinear and Stochastic Climate Dynamics; Franzke, C., Kane, T.O., Eds.; Cambridge University Press: Cambridge, UK, 2017; pp. 276–317.
32. Krumscheid, S.; Pavliotis, G.; Kalliadasis, S. Semi-parametric drift and diffusion estimation for multiscale diffusions. SIAM J. Multiscale Model. Simul. 2011, 11, 442–473.
33. Van Waaij, J.; van Zanten, H. Gaussian process methods for one-dimensional diffusion: Optimal rates and adaptation. Electron. J. Stat. 2016, 10, 628–645.

Figure 1. Results for the Ornstein–Uhlenbeck state and parameter estimation problem under different experimental settings: (a) $Q = 1/2$, $R = 0.01$; (b) $Q = 1/2$, $R = 0.0001$; (c) $Q = 1/2$, $R = 0$ (pure parameter estimation); (d) $Q = 0.005$, $R = 0.0001$. The ensemble size is set to $M = 1000$ in all cases. Displayed are the ensemble mean $\bar{a}_n$ and the ensemble variances of $\tilde{A}_n$ and $\tilde{X}_n$. The variance of $\tilde{X}_n$ is zero when $R = 0$, that is, in case (c).

Figure 2. Results for the averaged Ornstein–Uhlenbeck state and parameter estimation problem under different experimental settings: (a) $Q = 1/2$, $R = 0.01$, $\epsilon = 0.1$; (b) $Q = 1/2$, $R = 0$, $\epsilon = 0.1$ (pure parameter estimation); (c) $Q = 1/2$, $R = 0.01$, $\epsilon = 0.01$; (d) $Q = 1/2$, $R = 0.01$, $\epsilon = 0.01$ and subsampling by a factor of ten. The ensemble size is set to $M = 1000$ in all cases. Displayed are the ensemble mean and the ensemble variances of $\tilde{A}_n$ and $\tilde{X}_n$. The variance of $\tilde{X}_n$ is zero when $R = 0$, that is, in case (b).

Figure 3. Results for the averaged Ornstein–Uhlenbeck process, now with a smaller ensemble size of $M = 10$. Otherwise, panels (a–d) correspond to the same experimental settings as in Figure 2.

Figure 4. Results for the homogenisation Ornstein–Uhlenbeck state and parameter estimation problem under different experimental settings: (a) $Q = 1/2$, $R = 0.01$, $\epsilon = 0.1$; (b) $Q = 1/2$, $R = 0$, $\epsilon = 0.1$ (pure parameter estimation); (c) $Q = 1/2$, $R = 0.01$, $\epsilon = 0.1$ and subsampling by a factor of fifty; (d) $Q = 1/2$, $R = 0.01$, $\epsilon = 0.1$ and subsampling by a factor of five hundred. The ensemble size is set to $M = 10$ in all cases. Displayed are the ensemble mean and the ensemble variances of $\tilde{A}_n$ and $\tilde{X}_n$. The variance of $\tilde{X}_n$ is zero when $R = 0$, that is, in case (b).

Figure 5. Results for the nonparametric drift and state estimation problem: (a) reference drift function (thick line) and ensemble of drift functions drawn from the prior distribution; (b) histogram of samples from the reference trajectory; (c) reference drift function and its estimate (top) and ensemble of drift functions (bottom) at final time; (d) ensemble of states and the true value at final time.

Figure 6. Results for SPDE parameter estimation: (a) estimate of $\theta$ as a function of time as obtained by the ensemble Kalman–Bucy filter; (b) evidence based on a Kalman–Bucy filter for state estimation applied to a sequence of parameter values $\theta \in \{0.2, 0.3, \dots, 1.8\}$.

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
