Mean-Square Estimation of Nonlinear Functionals via Kalman Filtering

This paper focuses on estimation of a nonlinear functional of the state vector (NFS) in discrete-time linear stochastic systems. The NFS represents a nonlinear multivariate functional of the state variables, which can carry useful information about a target system for control. The optimal mean-square estimator of a general NFS is a function of the Kalman estimate and its error covariance. The polynomial functional of the state vector is studied in detail; in this case the optimal estimation algorithm has a closed-form computational procedure. A novel mean-square quadratic estimator is derived. For a general NFS we propose to use the unscented transformation to calculate the optimal estimate. The results are demonstrated on theoretical and practical examples with different types of NFS, and a comparative analysis with suboptimal estimators for the NFS is presented. The application of the proposed estimators to linear discrete-time systems demonstrates their practical effectiveness.


Introduction
This paper is focused on the development of the optimal mean-square estimator for a nonlinear functional of state vector (NFS) in linear discrete-time stochastic systems. In this section, background material, a concise literature review, the formulation of the problem of interest for this study, the scope and the contributions of this investigation, and the organization of the paper are provided.

Scope and Contributions of this Investigation
The aim of this paper is to develop an optimal two-stage minimum mean square error (MMSE) estimator for an arbitrary NFS in a linear stochastic system, and to investigate special polynomial and quadratic estimators for which important mean square estimation results can be obtained. The main contributions of the paper are the following: (1) This paper studies the MMSE estimation problem of an NFS within the discrete-time Kalman filtering framework. Using the mean square estimation approach, an optimal two-stage nonlinear estimator is proposed. (2) The optimal MMSE estimator for polynomial functionals (such as quadratic, cubic and quartic) is derived. We establish that the polynomial estimator, a compact closed-form formula, depends only on the Kalman estimate and its error covariance. (3) For a general NFS, the unscented transformation is applied to approximate the optimal estimate. (4) Performance of the proposed estimators on real NFS examples illustrates their theoretical and practical usefulness.

Organization of the Paper
This paper is organized as follows. Section 2 presents two motivating examples. The main theoretical results are presented in Section 3, which is divided into four subsections: (i) main idea and general solution; (ii) optimal closed-form MMSE estimator for quadratic functional; (iii) optimal closed-form MMSE estimator for polynomial functional; and, (iv) application of the unscented transformation for estimation of general NFS. Simulation analysis and verification of the proposed estimators are given in Section 4, which is divided into four subsections: (i) estimation of distance between unknown and moving points; (ii) estimation of power and modulus of unknown signal; (iii) estimation of total kinetic energy of wind tunnel system; and, (iv) estimation of size of oil spread contour. In Section 5, the summary of this investigation, the conclusions drawn in this paper, and the future direction of research are provided.



Motivating Examples
Two examples illustrate an NFS and how it may be used in practice. The first motivating example of an NFS involves kinematic quantities of projectile motion. The state vector of an object x ∈ R^4, an angle (θ) and a range (r) are shown in Figure 1. The second example can be an arbitrary quadratic form, f(x_t) = x_t^T Ω x_t, representing an energy-like function of an object, or the squared Euclidean distance, f(x_t) = d^2(x_t, x_t^n), between the current state x_t and a nominal state x_t^n, respectively. Mechanical examples of a quadratic form as energy and work are shown in Section 3.2.1. Hence, estimation and prediction of quantities represented by an NFS can be helpful in different applications, such as system control and target tracking.


Main Idea and General Solution
The main idea of the proposed optimal estimator includes two stages: the optimal Kalman estimate of the state vector, computed at the first stage, is nonlinearly transformed at the second stage based on the NFS (2) and the MMSE criterion. First, according to the MMSE criterion, the optimal estimate of the state x_k represents the conditional mean, i.e., x̂_k = E(x_k | Y_k). The conditional mean and the corresponding error covariance P_k = Cov(e_k, e_k), e_k = x_k − x̂_k, are described by the recursive Kalman filter (KF) equations [2,8-10]. Next, the optimal mean square estimate of the NFS z_k = f(x_k), based on the overall sensor measurements Y_k, also represents a conditional mean, ẑ_k = E(f(x_k) | Y_k) = ∫ f(x) p_k(x | Y_k) dx, where p_k(x | Y_k) = N(x; x̂_k, P_k) is the multivariate conditional Gaussian probability density function. Thus, the best estimate in (5) represents the MMSE estimator, which depends on the Kalman estimate x̂_k and its error covariance P_k determined by the KF Equation (4). Remark: For a general NFS, z_k = f(x_k), calculation of the optimal estimate ẑ_k = E(z_k | Y_k) reduces to evaluation of the multivariate integral (5). The shortcoming of the proposed estimator is that this integral cannot be calculated analytically for an arbitrary class of nonlinear functionals. Analytical calculation is possible only in the special cases considered in Sections 3.2.2 and 3.3. In these cases, the conditional mean E(z_k | Y_k) can be explicitly calculated in terms of x̂_k and P_k, and the best MMSE estimator (5) takes a closed form.
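The two-stage procedure described above can be sketched numerically: stage 1 runs a Kalman filter, and stage 2 evaluates the conditional mean E(f(x_k) | Y_k) under N(x̂_k, P_k). The sketch below uses a scalar system with the parameter values of Section 4.2 (a = 0.9, q = 0.05, r = 1) and evaluates the stage-2 integral by Gauss-Hermite quadrature; the quadrature routine is an illustrative assumption, not part of the paper's algorithm.

```python
import numpy as np

# Stage 1: scalar Kalman filter for x_{k+1} = a x_k + v_k, y_k = x_k + w_k.
def kalman_step(x_hat, P, y, a=0.9, q=0.05, r=1.0):
    x_pred = a * x_hat                 # time update of the estimate
    P_pred = a * a * P + q             # time update of the error covariance
    K = P_pred / (P_pred + r)          # Kalman gain
    x_hat = x_pred + K * (y - x_pred)  # measurement update
    P = (1.0 - K) * P_pred
    return x_hat, P

# Stage 2: the MMSE estimate of z = f(x) is E[f(x) | Y] with x | Y ~ N(x_hat, P),
# evaluated here by Gauss-Hermite quadrature (probabilists' rule).
def nfs_estimate(f, x_hat, P, n_pts=50):
    nodes, weights = np.polynomial.hermite_e.hermegauss(n_pts)
    x = x_hat + np.sqrt(P) * nodes
    return np.dot(weights, f(x)) / np.sqrt(2.0 * np.pi)

rng = np.random.default_rng(0)
x_hat, P, x_true = 0.0, 4.0, 2.0
for _ in range(30):
    x_true = 0.9 * x_true + rng.normal(0, np.sqrt(0.05))
    y = x_true + rng.normal(0, 1.0)
    x_hat, P = kalman_step(x_hat, P, y)

# For the quadratic NFS f(x) = x^2 the quadrature reproduces the closed form x_hat^2 + P.
z_quad = nfs_estimate(lambda x: x * x, x_hat, P)
print(z_quad - (x_hat ** 2 + P))  # numerically zero
```

For the quadratic functional the quadrature result coincides with the closed-form estimate derived in Section 3.2, which is the consistency check performed on the last line.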

Optimal Closed-Form MMSE Estimator for Quadratic Functional
Consider an arbitrary quadratic functional (QF) of a state vector x_k ∈ R^n: z_k = f(x_k) = x_k^T Ω_k x_k + a_k^T x_k, where Ω_k = Ω_k^T ≥ 0 and a_k are an arbitrary matrix and vector, respectively.

Examples of Quadratic Functional as Energy and Work
Example 1 (Work done on a moving particle). Consider a particle with mass M and speed υ_k = υ(t_k), which is moving along the x-axis. Let the particle have velocity υ_k at displacement x_k = x(t_k). Then the current work done, z_k = W_k, represents a QF of the state (by the work-energy theorem, W_k equals the change in kinetic energy). The equation of motion of a harmonic oscillator in a random environment is ẋ_t = υ_t, υ̇_t = −ω^2 x_t + ξ_t, where υ_t = ẋ_t, ω^2 = k/M, M is the mass, and k is the force constant of a spring. White Gaussian noise ξ_t is added to represent the random environment.
Let the sampling period ∆t = t_k − t_{k−1} ≪ 1 be small, so that terms of order ∆t^2 can be ignored. Then the discretized model of the harmonic oscillator is approximately x_k = x_{k−1} + ∆t υ_{k−1}, υ_k = υ_{k−1} − ∆t ω^2 x_{k−1} + ξ_k, where ξ_k represents discrete white noise. Then the energy, z_k = E_k, is conserved: E_k = Mυ_k^2/2 + U_k, where the potential energy is U_k = kx_k^2/2.

Closed-Form Quadratic Estimator
In the case of the QF (7), the optimal nonlinear estimate (5) can be explicitly calculated in terms of the state estimate x̂_k and its error covariance P_k. Using the formula [30] E(x_k^T Ω_k x_k | Y_k) = x̂_k^T Ω_k x̂_k + tr(Ω_k P_k), we obtain the optimal mean-square estimate for the QF: ẑ_k^opt = x̂_k^T Ω_k x̂_k + tr(Ω_k P_k) + a_k^T x̂_k. In (12) and (13), tr(A) denotes the trace of a matrix A. In parallel to the optimal estimate (13), consider the suboptimal plug-in estimate of the QF, ẑ_k^sub = f(x̂_k) = x̂_k^T Ω_k x̂_k + a_k^T x̂_k.
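The closed-form quadratic estimate can be checked against a Monte-Carlo evaluation of the conditional mean; the matrices and vectors below are arbitrary illustrative values, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
x_hat = np.array([1.0, -2.0, 0.5])       # Kalman estimate (illustrative)
A = rng.normal(size=(n, n))
P = A @ A.T + n * np.eye(n)              # a valid error covariance
Omega = np.diag([2.0, 1.0, 3.0])         # symmetric weight matrix of the QF
a = np.array([0.3, -0.1, 0.2])

# Closed-form MMSE estimate of z = x^T Omega x + a^T x given x | Y ~ N(x_hat, P):
z_opt = np.trace(Omega @ (P + np.outer(x_hat, x_hat))) + a @ x_hat
z_sub = x_hat @ Omega @ x_hat + a @ x_hat   # plug-in (suboptimal) estimate

# Monte-Carlo evaluation of the same conditional mean
x = rng.multivariate_normal(x_hat, P, size=400_000)
z_mc = np.mean(np.einsum('ij,jk,ik->i', x, Omega, x) + x @ a)
print(z_opt, z_mc)  # the two agree; z_opt - z_sub equals tr(Omega P)
```

The difference between the optimal and plug-in estimates is exactly tr(ΩP), the bias-correction term in (13).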

True Mean Square Error for Quadratic Estimators
We compare the estimation accuracy of the optimal and suboptimal quadratic estimators (13) and (14), respectively.
The following theorem gives the true mean square errors (MSEs). Theorem 1. The mean square errors P_z,k^opt and P_z,k^sub of the optimal and suboptimal quadratic estimators (13) and (14) are given by (16) and (17), respectively. Here the mean µ_k = E(x_k) and covariance C_k = Cov(x_k, x_k) are determined by the Lyapunov equation (18). The proof of the theorem is given in Appendix A.

Corollary 1.
In the particular case z_k = ||x_k||^2 = x_k^T x_k, the optimal and suboptimal quadratic estimators and their MSEs take the form ẑ_k^opt = x̂_k^T x̂_k + tr(P_k), ẑ_k^sub = x̂_k^T x̂_k, P_z,k^opt = 4tr(P_k C_k) − 2tr(P_k P_k) + 4µ_k^T P_k µ_k, and P_z,k^sub = P_z,k^opt + tr^2(P_k). A generalization of a QF is a multivariate polynomial functional, which we consider in the next section.

Optimal Closed-Form MMSE Estimator for Polynomial Functional
Here we consider a special NFS which represents an arbitrary multivariate polynomial functional of degree N: z_k = f(x_k) = Σ A_{i1 i2 ... in} x_1^{i1} x_2^{i2} ⋯ x_n^{in}, i1 + i2 + ⋯ + in ≤ N, where the A_{i1 i2 ... in} are nonzero coefficients. Further, for simplicity, we omit the subscript k of x_k, x̂_k, x_{i,k}, x̂_{i,k} and P_k, P_{ij,k}. In the case of the polynomial functional (20), the MMSE estimator ẑ_k = E(f(x) | Y_k) has a closed form, because the best estimate (conditional expectation) depends on high-order conditional moments, m_{i1 i2 ... in} = E(x_1^{i1} x_2^{i2} ⋯ x_n^{in} | Y_k), of a multivariate Gaussian distribution, which can be explicitly calculated in terms of the first- and second-order conditional moments, namely the Kalman estimate and its error covariance (x̂, P) [31,32]. For example, the third-order conditional moments of the components x_i of a state vector x^T = [x_1 ... x_n] are E(x_i x_j x_l | Y_k) = x̂_i x̂_j x̂_l + x̂_i P_{jl} + x̂_j P_{il} + x̂_l P_{ij}; analogous formulas hold for the fourth-order moments (see Appendix A). For an arbitrary NFS, z_k = f(x_k), one can use a truncated Taylor series; then the optimal estimate of the NFS can be approximately evaluated using the analytical Equation (22) for the high-order moments. Also, to avoid complicated calculation of the multivariate integral in (5), it is useful to apply the unscented transformation.
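The third-order Gaussian moment formula quoted above can be verified by simulation; the mean vector and covariance below are arbitrary illustrative values.

```python
import numpy as np

rng = np.random.default_rng(2)
x_hat = np.array([0.5, -1.0, 2.0])   # conditional mean (illustrative)
A = rng.normal(size=(3, 3))
P = A @ A.T + np.eye(3)              # conditional covariance

# Closed-form third-order moment of a Gaussian vector (Isserlis-type identity):
i, j, l = 0, 1, 2
m3 = (x_hat[i] * x_hat[j] * x_hat[l]
      + x_hat[i] * P[j, l] + x_hat[j] * P[i, l] + x_hat[l] * P[i, j])

# Monte-Carlo evaluation of the same moment
x = rng.multivariate_normal(x_hat, P, size=500_000)
m3_mc = np.mean(x[:, i] * x[:, j] * x[:, l])
print(m3, m3_mc)  # agree up to Monte-Carlo noise
```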

Application of Unscented Transformation for Estimation of General NFS
The unscented transformation (UT) approximates the statistics of a transformed random variable, for example its mean and covariance [33]. Following this approach, the best estimate of an NFS, ẑ_k^opt = E(f(x_k) | Y_k), can be approximated by the UT as ẑ_k^UT = Σ_{s=0}^{2n} ω_k^(s) f(X_k^(s)), where the sigma points X_k^(s) and corresponding weights ω_k^(s) are determined by X_k^(0) = x̂_k, ω_k^(0) = κ/(n + κ); X_k^(s) = x̂_k + (√((n + κ)P_k))_s, X_k^(s+n) = x̂_k − (√((n + κ)P_k))_s, ω_k^(s) = ω_k^(s+n) = 1/(2(n + κ)), s = 1, ..., n. Here (√P_k)_s is the sth column of the matrix square root of P_k, and κ is the scaling parameter [34]. Thus, the UT estimate ẑ_k^UT is a known functional of the Kalman estimate x̂_k and error covariance P_k.
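A minimal sketch of this sigma-point rule follows, with κ as the scaling parameter (the symbol is assumed here, following [34]). A useful property for checking the code: for a quadratic functional the UT estimate is exact and reproduces the closed-form result x̂^T Ω x̂ + tr(ΩP) of Section 3.2.

```python
import numpy as np

def ut_estimate(f, x_hat, P, kappa=1.0):
    """Unscented-transform approximation of E[f(x)] for x ~ N(x_hat, P)."""
    n = len(x_hat)
    S = np.linalg.cholesky((n + kappa) * P)   # matrix square root; columns used below
    sigma = [x_hat] + [x_hat + S[:, s] for s in range(n)] \
                    + [x_hat - S[:, s] for s in range(n)]
    w = [kappa / (n + kappa)] + [1.0 / (2 * (n + kappa))] * (2 * n)
    return sum(wi * f(p) for wi, p in zip(w, sigma))

x_hat = np.array([1.0, 2.0])
P = np.array([[2.0, 0.3], [0.3, 1.0]])
Omega = np.array([[1.0, 0.0], [0.0, 3.0]])

z_ut = ut_estimate(lambda x: x @ Omega @ x, x_hat, P)
exact = x_hat @ Omega @ x_hat + np.trace(Omega @ P)
print(z_ut, exact)  # identical: the UT is exact for quadratic functionals
```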

Numerical Verification
In this section, demonstrative examples are used for testing the effectiveness of the MMSE estimators developed in this paper.

Estimation of Distance between Unknown and Moving Points
If θ is a scalar unknown which is measured in the presence of additive white noise, then the model is y_k = θ + w_k, w_k ∼ N(0, r). The KF Equation (4) becomes x̂_k = x̂_{k−1} + K_k(y_k − x̂_{k−1}), K_k = P_{k−1}/(P_{k−1} + r), P_k = P_{k−1}r/(P_{k−1} + r). Using step-by-step induction, we obtain an exact formula for the error covariance P_k = E(θ − x̂_k)^2: P_k = σ_0^2 r/(r + kσ_0^2), where σ_0^2 is the prior variance of θ. Further, we consider an NFS representing the distance between two points, z_k = |θ − a_k|. Using (5), the best estimate of the distance between an unknown point (location) θ and a moving point a_k, k = 1, 2, ..., is ẑ_k = √(2P_k/π) exp(−(x̂_k − a_k)^2/(2P_k)) + (x̂_k − a_k)[1 − 2Φ((a_k − x̂_k)/√P_k)], where x̂_k and P_k are determined by (26) and (27), respectively, and Φ(u) is the Gaussian cumulative distribution function, Φ(u) = (1/√(2π)) ∫_{−∞}^{u} e^{−t²/2} dt.
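The covariance recursion for the constant-parameter case can be checked numerically: the measurement update P_k = P_{k−1}r/(P_{k−1} + r) telescopes to the closed form P_k = σ_0^2 r/(r + kσ_0^2). The prior variance and noise level below are illustrative values.

```python
# Recursive covariance update for estimating a constant theta from y_k = theta + w_k,
# compared against the closed-form solution P_k = s0 * r / (r + k * s0).
P0, r = 4.0, 1.0      # prior variance of theta and measurement noise variance (illustrative)
P = P0
for k in range(1, 51):
    P = P * r / (P + r)                  # covariance measurement update
    P_exact = P0 * r / (r + k * P0)      # closed-form solution by induction
    assert abs(P - P_exact) < 1e-12
print(P)  # covariance after 50 measurements
```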

Estimation of Power and Modulus of Unknown Signal
If x_k is a scalar random signal measured in additive white noise, then the system model is x_{k+1} = a x_k + v_k, y_k = x_k + w_k, where v_k ∼ N(0, q) and w_k ∼ N(0, r) are uncorrelated white Gaussian noises, and a ∈ (0, 1). The KF Equation (4) gives the state estimate x̂_k and error covariance P_k. Further, we consider two specific nonlinear functionals of the scalar signal x_k: the power and the modulus of the signal.

Estimation of Power of Signal
In this case the NFS represents a quadratic functional, z_k = f(x_k) = x_k^2. Using Equation (13) we obtain the optimal MMSE estimate of the power of the signal, ẑ_k = x̂_k^2 + P_k. In addition to the optimal estimate, consider a simple suboptimal estimate of the power, z̄_k = f(x̂_k) = x̂_k^2. We can compare the estimation accuracy of the optimal and suboptimal estimates. Using Theorem 1 with Formulas (16)-(18), we derive precise formulas for the true MSEs of these estimates, P_k^opt = E(z_k − ẑ_k)^2 and P_k^sub = E(z_k − z̄_k)^2, respectively: P_k^opt = E[(x_k^2 − x̂_k^2 − P_k)^2] = 4P_kC_k − 2P_k^2 + 4µ_k^2P_k, P_k^sub = P_k^opt + P_k^2. Here the mean µ_k and covariance C_k of the signal x_k are determined by the Lyapunov Equation (18): µ_k = aµ_{k−1}, C_k = a^2C_{k−1} + q, µ_0 = m_0, C_0 = σ_0^2. The analytical solution of the Lyapunov equations is µ_k = a^k m_0, C_k = a^{2k}σ_0^2 + q Σ_{i=0}^{k−1} a^{2i}, k = 1, 2, .... Thus, the KF Equation (31) together with Formulas (32)-(34) completely establishes the true MSEs of the optimal and suboptimal estimates, ẑ_k = x̂_k^2 + P_k and z̄_k = x̂_k^2, respectively. The difference between P_k^opt and P_k^sub equals P_k^sub − P_k^opt = P_k^2. Figures 2 and 3 show the MSEs and the relative error ∆_k = (P_k^sub − P_k^opt)/P_k^opt · 100% for the values a = 0.9, q = 0.05, m_0 = 0, σ_0^2 = 4, and r = 1, respectively. From Figure 3 we observe that the relative error ∆_k (%) varies from 3% to 6% within the time zone k ≤ 11, and then increases. In the steady-state regime, k > 40, the relative error reaches ∆_∞ = 18.3%, while the absolute values of the MSEs are P_∞^opt = 0.1087 and P_∞^sub = 0.1286. Thus, the numerical results show that the suboptimal estimate z̄_k = x̂_k^2 may be significantly worse than the optimal one, ẑ_k = P_k + x̂_k^2.
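The Lyapunov recursion for (µ_k, C_k) and its analytical solution can be cross-checked numerically, using the paper's parameter values a = 0.9, q = 0.05, m_0 = 0, σ_0^2 = 4.

```python
# Lyapunov recursion for the signal mean and covariance, checked against the
# analytical solution mu_k = a^k m0, C_k = a^(2k) s0 + q * (1 - a^(2k)) / (1 - a^2).
a, q, m0, s0 = 0.9, 0.05, 0.0, 4.0
mu, C = m0, s0
for k in range(1, 101):
    mu, C = a * mu, a * a * C + q    # Lyapunov recursion (18)
    C_exact = a ** (2 * k) * s0 + q * (1 - a ** (2 * k)) / (1 - a * a)
    assert abs(C - C_exact) < 1e-12
print(C, q / (1 - a * a))  # C approaches the steady-state value q / (1 - a^2)
```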


Estimation of Modulus of Signal
Based on Equation (28), the best estimate of the unknown modulus z_k = f(x_k) = |x_k| takes the form ẑ_k = √(2P_k/π) exp(−x̂_k^2/(2P_k)) + x̂_k[1 − 2Φ(−x̂_k/√P_k)]. In contrast to the optimal estimate (35), we can consider a simple suboptimal estimate, z̄_k = |x̂_k|. To study the behavior of the true MSEs, P_k^opt = E(z_k − ẑ_k)^2 and P_k^sub = E(z_k − z̄_k)^2, set a = 0.99, q = 1, r = 0.5, and x_0 ∼ N(1; 4). To compare the MSEs, a Monte-Carlo simulation with 1000 runs was used. As shown in Figure 4, the optimal estimate ẑ_k demonstrates an improvement over the suboptimal estimate z̄_k.
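The folded-normal mean behind the modulus estimate, E|x| for x ~ N(m, P), can be checked against brute-force numerical integration; the test values m = 1, P = 4 are illustrative.

```python
import math

def abs_mean(m, P):
    """E|x| for x ~ N(m, P): the folded-normal mean used for the modulus NFS."""
    s = math.sqrt(P)
    Phi = lambda u: 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))
    return (s * math.sqrt(2.0 / math.pi) * math.exp(-m * m / (2.0 * P))
            + m * (1.0 - 2.0 * Phi(-m / s)))

def quad_mean(m, P, N=200_000, L=12.0):
    """Midpoint-rule integration of |x| * N(x; m, P) over m +/- L std devs."""
    s = math.sqrt(P)
    total = 0.0
    for i in range(N):
        x = m - L * s + 2 * L * s * (i + 0.5) / N
        total += abs(x) * math.exp(-(x - m) ** 2 / (2 * P))
    return total * (2 * L * s / N) / math.sqrt(2 * math.pi * P)

print(abs_mean(1.0, 4.0), quad_mean(1.0, 4.0))  # the two values agree
```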


Numerical Example of QF: Wind Tunnel System
Following is a comparative experimental analysis of the optimal and suboptimal quadratic estimators, considered for the total kinetic energy of a wind tunnel system. A discrete-time high-speed closed-air unit wind tunnel model is given in [35]. The state vector x_k ∈ R^3 consists of the state variables x_{1,k}, x_{2,k} and x_{3,k}, representing deviations from a chosen equilibrium point of the following quantities: x_1 = Mach number, x_2 = actuator position guide vane angle in a driving fan, and x_3 = actuator rate. The initial mean and covariance are m_0 = [3 28 10]^T and C_0 = diag[1 1 0], the sampling period is ∆t = 0.01, and v_k ∈ R^3 is white Gaussian noise, v_k ∼ N(0, Q), Q = diag[0.01^2 0.01^2 0.01^2]. A two-sensor measurement model is used. The total kinetic energy of the actuator, represented by the QF z_k = x_k^T Ω x_k, can be expressed as the sum of the translational kinetic energy of the center of mass, E_k^t = Mυ_k^2/2, and the rotational kinetic energy about the center of mass, E_k^r = Iω_k^2/2, where I is the rotational inertia, ω_k = ẋ_{2,k} is the angular velocity, M is the mass, and υ_k = x_{3,k} is the linear velocity.
Applying a finite-difference approximation for the angular velocity, ω_k ≈ (x_{2,k} − x_{2,k−1})/∆t, the total kinetic energy can be expressed in the following QF (see Section 3.2.2): z_k = X_k^T Ω X_k, where X_k ∈ R^4 is the extended state vector, and I = 0.136 kg·m^2, M = 7.39 kg.
Using the results of Equations (13) and (14), the optimal and suboptimal quadratic estimators take the form ẑ_k^opt = X̂_k^T Ω X̂_k + tr(ΩP_k) and ẑ_k^sub = X̂_k^T Ω X̂_k, respectively. The estimate of the state X̂_k ∈ R^4 and the error covariance P_k ∈ R^{4×4} are determined by the KF Equation (4). Our point of interest is the behavior of the MSEs of the estimators (40). Using Theorem 1, the MSEs are given by P_z,k^opt = 4tr(ΩP_kΩC_k) − 2tr(ΩP_kΩP_k) + 4µ_k^TΩP_kΩµ_k and P_z,k^sub = P_z,k^opt + tr^2(ΩP_k), respectively. Here the mean µ_k and covariance C_k of the state vector X_k satisfy the Lyapunov Equation (18). We observe in Figure 5 that the difference between the absolute values of the MSEs P_k^opt and P_k^sub is very small; for example, P_k^sub ≈ 0.938 and P_k^opt ≈ 0.884 at k > 10. However, the relative error exceeds 6% in the same time zone, ∆_k (%) ≈ 6.2%. For this reason we conclude that the suboptimal estimator is not suitable for evaluation of the kinetic energy of the wind tunnel system.
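The construction of the weight matrix Ω for the extended state can be sketched as follows. The ordering of the extended state X_k = [x_{1,k}, x_{2,k}, x_{3,k}, x_{2,k−1}]^T is an assumption made here for illustration; the physical constants are the paper's I = 0.136 kg·m^2, M = 7.39 kg, ∆t = 0.01.

```python
import numpy as np

M, I, dt = 7.39, 0.136, 0.01
# Extended state X = [x1_k, x2_k, x3_k, x2_{k-1}]^T (ordering assumed here).
Omega = np.zeros((4, 4))
Omega[2, 2] = M / 2.0                    # translational part M*v^2/2, v = x3
c = I / (2.0 * dt * dt)                  # rotational part I*w^2/2, w ~ (x2_k - x2_{k-1})/dt
Omega[1, 1] += c
Omega[3, 3] += c
Omega[1, 3] -= c
Omega[3, 1] -= c

X = np.array([0.1, 0.3, 2.0, 0.28])      # sample extended state (illustrative)
E_quad = X @ Omega @ X                   # kinetic energy as a quadratic form
E_direct = 0.5 * M * X[2] ** 2 + 0.5 * I * ((X[1] - X[3]) / dt) ** 2
print(E_quad, E_direct)  # identical by construction
```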

Numerical Example of General NFS: Motion of Unmanned Marine Probe
A comparative experimental analysis of the proposed estimators is considered for the motion of an unmanned marine probe (UMP). In a marine inspection environment, a UMP system is often used because of its benefits of convenience and human safety.
Assume a scenario in which the UMP has detected an oil-tanker accident, from which oil has spread on the surface of the water without the influence of wind. As an initial action, the UMP estimates the length of the contour of the oil spread (Figure 6).
To control the size of the surface, the UMP must compute the distance from the oil-tanker, d_k = d(t_k), at every time instant, represented by the NFS d_k = √(x_{1,k}^2 + x_{2,k}^2), where x_{1,k} and x_{2,k} are the coordinates of the UMP.
Here, we verify the proposed estimators using a discretization of a continuous-time model of the UMP [36] on t ∈ [0, 3], where v_{1,t} and v_{2,t} are uncorrelated white Gaussian noises with intensities q_1 = q_2 = 0.1, x_{1,0} ∼ N(20; 0.2), and x_{2,0} ∼ N(0; 0.2). The UMP measures the relative coordinates x_{1,k} and x_{2,k} from the oil-tanker. The measurement model for the UMP is y_{i,k} = x_{i,k} + w_{i,k}, i = 1, 2, where w_{1,k} ∼ N(0; 0.1) and w_{2,k} ∼ N(0; 0.1) are uncorrelated white Gaussian noises.
For a comparative experimental analysis of the optimal estimate d̂_k^opt, which is based on the UT (23), and the suboptimal estimate d̂_k^sub, a Monte-Carlo simulation with 1000 runs and a finite-difference approximation with step ∆t = 0.01 were performed. Figure 7 illustrates the time histories of the MSEs for both estimators. It shows that the optimal estimate d̂_k^opt has the best performance due to the lowest value of the MSE, P_k^opt < P_k^sub.


Conclusions
In some application problems, nonlinear functionals of the state variables are interpreted as a cost function that carries useful information about a target system for control. In order to estimate an arbitrary NFS, the optimal MMSE estimator is proposed. However, calculation of the optimal estimate ẑ_k^opt reduces to evaluation of the multivariate integral (Equation (5)) or to use of the unscented transformation, ẑ_k^UT. To avoid these numerical difficulties, two important classes of NFS (quadratic and polynomial) are studied in detail. Analytical calculation of the integral (Equation (5)) in terms of the Kalman statistics (x̂_k, P_k) is possible for these functionals and, as a result, effective closed-form quadratic and polynomial MMSE estimators are derived.
Special attention is given to quadratic functionals. In this case, the quadratic estimator is comprehensively investigated, and novel compact matrix forms of the optimal estimate (Equation (13)) and the true MSE (Theorem 1) are derived.
In view of the importance of an NFS in practice, the proposed estimation algorithms are illustrated on theoretical and practical examples with real NFS. The examples show that the optimal estimator yields reasonably good estimation accuracy, and we confirm that the proposed optimal MMSE estimator is suitable for data processing in practice, although, in some situations, the suboptimal estimators appear only slightly worse (in terms of absolute MSE) than the optimal estimator (Sections 4.2.1 and 4.3). In such situations, we propose an additional comparison of the estimators through the relative error ∆_k (%), which exceeds 6% and demonstrates the advantage of the optimal estimator over the suboptimal one.
The KF gain K k and error covariance P k may be pre-computed, since they do not depend on current measurements but only on the noise statistics and system matrices, which are part of the system model (Equation (1)). Thus, once the measurement schedule has been settled, real-time implementation of the MMSE estimator (Equation (6)) or (Equation (13)) requires only the computation of the state estimate and final estimate of the NFS.
The optimality and statistical accuracy of the proposed estimator of an NFS, as well as of the KF, depend on several factors, such as the quality of prior assumptions about the system model (Equation (1)), especially the process (Q_k) and measurement (R_k) noise covariances. The quality of prior assumptions is an important factor that leads to the optimality of estimates. Inadequacy of prior information could lead to unexpected results and estimator divergence. Determination of suitable values for the noise covariances and system matrices plays a crucial role in obtaining a convergent estimator. Adaptive filtering is a powerful approach to prevent filter divergence when precise knowledge of the prior assumptions is not available [37,38]. Therefore, an adaptive Kalman estimate x̂_k can be used in the proposed MMSE estimator (6).
For future work, we propose to extend the MMSE estimator to nonlinear functionals which depend not only on the current state x_k but also on the past states x_{k−1}, ..., x_1, for example, f = Σ_{t=1}^{k} x_t^T Q_t x_t.
Author Contributions: W.C. developed the theoretical formalism, performed the analytic calculations, I.Y.S. and V.S. designed and performed the numerical simulations, derived the models and analysed the data. All authors equally contributed in the elaboration of this work.

Conflicts of Interest:
The authors declare no conflict of interest. The founding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, and in the decision to publish the results.

Appendix A
The Proof of Theorem 1. The derivation of the optimal MSE (16) is based on the following lemma.
Lemma A1. Let X ∈ R^{3n} be a composite multivariate Gaussian vector, X^T = [U^T V^T W^T], i.e., X ∼ N(µ, S), with U, V, W ∈ R^n and

S = [ S_uu S_uv S_uw
      S_vu S_vv S_vw
      S_wu S_wv S_ww ].   (A1)

Then the third- and fourth-order vector moments of the composite random vector X are given by:
(i) E(U^T V W) = µ_u^T µ_v µ_w^T + tr(S_uv)µ_w^T + µ_v^T S_uw + µ_u^T S_vw;
(ii) E(U^T U V^T V) = µ_u^T µ_u µ_v^T µ_v + 2tr(S_uv S_vu) + tr(S_uu)tr(S_vv) + tr(S_uu)µ_v^T µ_v + tr(S_vv)µ_u^T µ_u + 4µ_u^T S_uv µ_v;
(iii) E(U^T V V^T U) = µ_u^T µ_v µ_v^T µ_u + tr(S_uu S_vv) + tr(S_uv)tr(S_vu) + tr(S_uv^2) + µ_v^T S_uu µ_v + µ_u^T S_vv µ_u + µ_v^T S_uv µ_u + µ_u^T S_vu µ_v + 2tr(S_uv)µ_u^T µ_v;
(iv) E(U^T V W^T U) = µ_u^T µ_v µ_w^T µ_u + tr(S_uv)tr(S_uw) + tr(S_uu S_wv) + tr(S_uw S_uv) + tr(S_uv)µ_u^T µ_w + tr(S_uw)µ_u^T µ_v + µ_v^T S_uu µ_w + µ_v^T S_uw µ_u + µ_u^T S_vu µ_w + µ_u^T S_vw µ_u.   (A2)

The derivation of the vector formulas (A2) is based on their scalar versions [31,32]:
E(x_i x_j x_k) = µ_i µ_j µ_k + µ_i S_jk + µ_j S_ik + µ_k S_ij;
E(x_i x_j x_k x_l) = µ_i µ_j µ_k µ_l + S_ij S_kl + S_ik S_jl + S_il S_jk + µ_i µ_j S_kl + µ_i µ_k S_jl + µ_i µ_l S_jk + µ_j µ_k S_il + µ_j µ_l S_ik + µ_k µ_l S_ij,   (A3)
where µ_h = E(x_h) and S_pq = Cov(x_p, x_q).

To derive Equation (16), for simplicity we omit the time index k. Then, using Equations (7) and (13), the error can be written as:
e_z = z − ẑ^opt = x^T Ωx + a^T x − tr(Ω(P + x̂x̂^T)) − a^T x̂ = x^T Ωx − x̂^T Ωx̂ − tr(ΩP) + a^T e = (e + x̂)^T Ω(e + x̂) − x̂^T Ωx̂ − tr(ΩP) + a^T e = e^T Ωe + 2e^T Ωx̂ + a^T e − tr(ΩP),
where e = x − x̂, tr(Ωx̂x̂^T) = x̂^T Ωx̂, and x̂^T Ωe = e^T Ωx̂.   (A4)
Next, using the unbiasedness and orthogonality properties of the Kalman estimate, E(e) = 0 and E(ex̂^T) = 0, we obtain:
P_z^opt = E(e_z^2) = E(e^T Ωee^T Ωe) + 4E(e^T Ωx̂x̂^T Ωe) + a^T Pa + tr^2(ΩP) + 4E(e^T Ωee^T Ωx̂) + 2E(e^T Ωee^T)a − 2tr^2(ΩP) + 4E(e^T Ωx̂e^T)a.
In the case of the suboptimal estimate ẑ_k^sub, the derivation of the MSE (Equation (17)) is similar.
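The scalar fourth-moment identity (A3) underlying the proof can be verified by simulation; the mean vector and covariance below are arbitrary illustrative values.

```python
import numpy as np

rng = np.random.default_rng(3)
mu = np.array([0.4, -0.7, 1.2, 0.1])   # mean (illustrative)
A = rng.normal(size=(4, 4))
S = A @ A.T + np.eye(4)                # covariance

# Scalar fourth-moment identity (A3) for a Gaussian vector:
i, j, k, l = 0, 1, 2, 3
m4 = (mu[i] * mu[j] * mu[k] * mu[l]
      + S[i, j] * S[k, l] + S[i, k] * S[j, l] + S[i, l] * S[j, k]
      + mu[i] * mu[j] * S[k, l] + mu[i] * mu[k] * S[j, l] + mu[i] * mu[l] * S[j, k]
      + mu[j] * mu[k] * S[i, l] + mu[j] * mu[l] * S[i, k] + mu[k] * mu[l] * S[i, j])

# Monte-Carlo evaluation of the same moment
x = rng.multivariate_normal(mu, S, size=1_000_000)
m4_mc = np.mean(x[:, i] * x[:, j] * x[:, k] * x[:, l])
print(m4, m4_mc)  # agree up to Monte-Carlo noise
```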