A Framework of Covariance Projection on Constraint Manifold for Data Fusion †

A general framework of data fusion is presented based on projecting the probability distribution of true states and measurements around the predicted states and actual measurements onto the constraint manifold. The constraint manifold represents the constraints to be satisfied among true states and measurements, which is defined in the extended space with all the redundant sources of data such as state predictions and measurements considered as independent variables. By the general framework, we mean that it is able to fuse any correlated data sources while directly incorporating constraints and identifying inconsistent data without any prior information. The proposed method, referred to here as the Covariance Projection (CP) method, provides an unbiased and optimal solution in the sense of minimum mean square error (MMSE), if the projection is based on the minimum weighted distance on the constraint manifold. The proposed method not only offers a generalization of the conventional formula for handling constraints and data inconsistency, but also provides a new insight into data fusion in terms of a geometric-algebraic point of view. Simulation results are provided to show the effectiveness of the proposed method in handling constraints and data inconsistency.


Introduction
Data fusion is the process of obtaining a more meaningful and precise estimate of a state by combining data from multiple sources. The architecture of multisensor data fusion can be broadly categorized into two, depending on the way raw data are processed: (1) Centralized fusion architecture [1], where raw data from multiple sources are directly sent to and fused in the central node for state estimation and (2) Distributed fusion architecture [1,2], where data measured at multiple sources are processed independently at individual nodes to obtain local estimates before they are sent to the central node for fusion. In the centralized architecture, it is possible to apply data fusion methodology such as the Kalman filter [3] to all raw data received to yield optimal estimates in the sense of minimum variance. However, the centralized architecture can be costly especially for a large system in terms of infrastructure and communication overheads at the central node, let alone the issues of reliability and scalability. On the other hand, the distributed architecture is advantageous in reliability and scalability, with lower infrastructure and communication costs. Although advantageous, the distributed architecture needs to address statistical dependency among the local state estimates received from multiple nodes for fusion. This is due to the fact that local state estimates at individual nodes can be subject to the same process noise [4] and to double counting, i.e., sharing the same data sources among them [5]. Ignoring such statistical dependency or cross-correlation among multiple nodes leads to inconsistent results, causing divergence in data fusion [6]. In the case of known cross-correlation, the Bar-Shalom Campo (BC) formula [7] provides a consistent fusion result for a pair of redundant data sources, where the fused estimate is based on maximum likelihood [8].
A generalization to more than two data sources with known cross-correlations is given by weighted fusion algorithms of the generalized Millman's formula (GMF) [9] and weighted Kalman filter (WKF) [10]. Sensors often provide spurious and inconsistent data due to unexpected situations such as short duration spike faults, sensor glitches, a permanent failure or slowly developing failure due to sensor elements [5,11]. Since these types of uncertainties are not attributable to the inherent noise, they are difficult to predict and model. The fusion of inconsistent sensor data with correct data can lead to severely inaccurate results [12]. For example, when exposed to abnormalities and outliers, a Kalman filter would easily diverge [13]. Hence, a data validation scheme is required to identify and eliminate the sensor faults/outliers/inconsistencies before fusion.
The detection of inconsistency needs either a priori information often in the form of specific failure model(s) or data redundancy [5]. Model-based approaches use the generated residuals between the model outputs and actual measurements to detect and remove faults. For instance, in [14], the Nadaraya-Watson estimator and a priori observations are used to validate sensor measurements. Similarly, a priori system model information as a reference is used to detect failures in filtered state estimates [15][16][17]. However, requirement of the prior information restricts the usage of these methods in the general case where prior information is not available or unmodeled failure occurs. A method to detect spurious data based on the Bayesian framework is proposed in [18,19]. The method adds a term to the Bayesian formulation which has the effect of increasing the covariance of the fused probability distribution when measurement from one of the sensor is inconsistent with respect to the other. However, the method is based on heuristics and assumes independence of sensor estimates in its analysis. In [20], the Covariance Union (CU) method is proposed where the fused covariance is enlarged to cover all local means and covariances in such a way that the fused estimate is consistent under spurious data. However, the method incurs high computational cost and results in an inappropriately large conservative fused result.
In some applications, the state variables observed in a multisensory system may be subject to additional constraints. These constraints can arise due to the basic laws of physics, kinematics or geometry consideration of a system or due to the mathematical relations to satisfy among states. For instance, the energy conservation laws in an undamped mechanical system; Kirchhoff's laws in electric circuits; a road constraint in a vehicle-tracking scenario [21]; an orthonormal constraint in quaternion-based estimation [22] etc. These constraints if properly included can lead to improvement in state estimation and data fusion.
Various methods have been proposed to incorporate linear constraints among the state variables of dynamic systems [23][24][25][26][27][28]. For instance, the dimensionality reduction method converts a constrained estimation problem to an unconstrained one of lower dimension by eliminating some state variables using the constraints [25]. However, the state variables in a reduced dimension model may become difficult to interpret and their physical meaning may be lost [23]. The pseudo-measurement method satisfies the linear constraints among state variables by treating the state constraints as additional perfect/noise-free measurements [26][27][28]. However, this method increases the computational complexity of state estimation due to an increase in the dimension of augmented measurement. Furthermore, due to the singularity of augmented measurement covariance, the method may cause numerical problems [23,29]. A popular approach, the estimate projection method, projects the unconstrained estimate obtained from conventional Kalman filtering onto the constraint subspace using classical optimization methods [23,24]. Unfortunately, the method may not lead to the true Sensors 2018, 18, 1610 3 of 23 constraint optimum since the projection method merely gives the solution as a feasible point that is closest to the unconstrained minimum.
This paper presents a unified and general data fusion framework, referred to as the Covariance Projection (CP) method to fuse multiple data sources under arbitrary correlations and linear constraints as well as data inconsistency. The method projects the probability distribution of true states and measurements around the predicted states and actual measurements onto the constraint manifold representing the constraints to be satisfied among true states and measurements. The proposed method also provides a framework for identifying and removing outliers in a fusion architecture where only sensor estimates may be available at the fusion center. This paper is an extended and improved version of the conference paper [30]. What was presented in [30] is a preliminary new framework of data fusion that we proposed. On the other hand, what is presented here represents a much more detailed implementation and refinement of the concept proposed in the conference paper. Specifically, this paper includes the following additions: (1) a detailed analysis of the equivalence of the proposed method to conventional methods for fusing redundant data sources; (2) handling linear constraints simultaneously under the proposed data fusion framework; (3) refining the mathematical formula and technical descriptions associated with them; and (4) detailed analysis of the method with additional simulations that deal with state estimation and data fusion in the presence of correlations, outliers and constraints.

Problem Statement
Consider a distributed sensor architecture [1], where each sensor system is equipped with a tracking system to provide local estimates of some quantity of interest in the form of mean and covariance. Assume the following linear dynamic system model for each local sensor system, where k is the discrete time, A k is the system matrix, B k is the input matrix, u k is the input vector and x k is the state vector. The system process is affected by zero mean Gaussian noise w k with covariance matrix Q. The sensor measurements are approximated as, where H k is the observation matrix and n represents the number of sensors. v k i is Gaussian noise with covariance matrices R i , i = 1, 2, . . . , n. Each sensor systems employs a Kalman filter to provide local state mean estimatex k and its covariance P k [31]. A prediction of the state estimatex − k and its estimation error covariance P − k can be computed based on process model (1), The Kalman filter then provides the state estimatex k and its covariance P k as, with the Kalman gain, The estimates provided by sensor systems are assumed to be correlated due to the common process noise or double counting, that is, P ij = cov x i ,x j = 0. To ensure optimality of the fused results, the cross-covariance P ij should be properly incorporated. Due to the inherent nature of the sensor and environmental factors [32], the sensor measurements can also be perturbed by unmodel random faults e k i , Subsequently, the estimates computed by local sensor systems may be spurious and inconsistent. Therefore, validation of the sensor estimates is required to remove inconsistencies before the fusion process.
In addition, the states x k of the sensor systems may subject to linear constraints due to the geometry of the system environment or the mathematical description of the system [23], such that, where C ∈ R n×m and c ∈ R n are both known. C is assumed to have a full row rank. The constraints provide deterministic information about the state variables and can be used to improve the fusion accuracy. These issues of correlations, data inconsistency and state constraints motivate the development of the Covariance Projection method, which is described next.

Proposed Approach
The proposed method first represents the probability of true states and measurements in the extended space around the data from state predictions and sensor measurements, where the extended space is formed by taking states and measurements as independent variables. Any constraints among true states and measurements that should be satisfied are then represented as a constraint manifold in the extended space. This is shown schematically in Figure 1a for filtering as an example (refer to Equations (1)-(6)). Data fusion is accomplished by projecting the probability distribution of true states and measurements onto the constraint manifold.
where ∈ ℝ × and ∈ ℝ are both known. is assumed to have a full row rank. The constraints provide deterministic information about the state variables and can be used to improve the fusion accuracy.
These issues of correlations, data inconsistency and state constraints motivate the development of the Covariance Projection method, which is described next.

Proposed Approach
The proposed method first represents the probability of true states and measurements in the extended space around the data from state predictions and sensor measurements, where the extended space is formed by taking states and measurements as independent variables. Any constraints among true states and measurements that should be satisfied are then represented as a constraint manifold in the extended space. This is shown schematically in Figure 1a for filtering as an example (refer to Equations (1)-(6)). Data fusion is accomplished by projecting the probability distribution of true states and measurements onto the constraint manifold.
More specifically, consider two mean estimates, and , of the state ∈ ℝ , with their respective covariances as , ∈ ℝ × . Furthermore, the estimates are assumed to be correlated with cross-covariance . The mean estimates and their covariances together with their crosscovariance in ℝ are then transformed to an extended space of ℝ along with the linear constraint between the two estimates: where and are constant matrices of compatible dimensions. In the case where and estimate the same entity, and become identity matrix . Figure 1b illustrates schematically the fusion of and in the extended space based on the proposed CP method. Fusion takes place by finding the point on the constraint manifold that represents the minimum weighted distance from in ℝ , where the weight is given by . As seen later, the proposed CP method with the minimum weighted distance is shown to be equivalent to the minimum variance estimates but advantageous for dealing with additional constraints and data inconsistency.
To find a point on the constraint manifold with minimum weighted distance, we apply the whitening transform (WT) defined as, More specifically, consider two mean estimates,x 1 andx 2 , of the state x ∈ R N , with their respective covariances as P 1 , P 2 ∈ R N×N . Furthermore, the estimates are assumed to be correlated with cross-covariance P 12 . The mean estimates and their covariances together with their cross-covariance in R N are then transformed to an extended space of R 2N along with the linear constraint between the two estimates:x where C 1 and C 2 are constant matrices of compatible dimensions. In the case wherex 1 andx 2 estimate the same entity, C 1 and C 2 become identity matrix I. Figure 1b illustrates schematically the fusion of x 1 andx 2 in the extended space based on the proposed CP method. Fusion takes place by finding the point on the constraint manifold that represents the minimum weighted distance fromx in R 2N , where the weight is given by P. As seen later, the proposed CP method with the minimum weighted distance is shown to be equivalent to the minimum variance estimates but advantageous for dealing with additional constraints and data inconsistency.
To find a point on the constraint manifold with minimum weighted distance, we apply the whitening transform (WT) defined as, W = D −1/2 E T , where D and E are the eigenvalue and eigenvector matrices of P. Applying WT, where the matrix M = [C 1 C 2 ] T is the subspace of the constraint manifold. Figure 2 illustrates the transformation of the probability distribution as an ellipsoid into a unit circle after WT. The probability distribution is then orthogonally projected on the transformed manifold M W to satisfy the constraints between the data sources in the transformed space as illustrated in Figure 2. Inverse WT is applied to obtain the fused mean estimate and covariance in the original space, where P r = M W M W T M W −1 M W T is the orthogonal projection matrix. Using the definition of various components in (10) and (11), a close form simplification can be obtained as, where the matrix = [ ] is the subspace of the constraint manifold. Figure 2 illustrates the transformation of the probability distribution as an ellipsoid into a unit circle after WT. The probability distribution is then orthogonally projected on the transformed manifold to satisfy the constraints between the data sources in the transformed space as illustrated in Figure 2. Inverse WT is applied to obtain the fused mean estimate and covariance in the original space, where = is the orthogonal projection matrix. Using the definition of various components in (10) and (11), a close form simplification can be obtained as, The details of the simplification are provided in Appendix A. Due to the projection in extended space of ℝ , (12) and (13) provide a fused result with respect to each data source. In the case where and estimate the same entity, that is, = [ ] , the fused result will be same for the two data sources. As such, a close form equation for fusing redundant data sources in ℝ can be obtained from (12) and (13)  Given mean estimates , , …, of a state ∈ ℝ with their respective covariances , , … , ∈ ℝ × and known cross-covariances , , = 1, … , , (14) and (15) can be used to obtain the optimal fused mean estimate and covariance with = [ , , … , ] . For fusing correlated estimates from redundant sources, the CP method is equivalent to the weighted fusion algorithms [9,10], which compute the fused mean estimate and covariance as a summation of weighted individual estimates as, with ∑ = . Where the weights are determined by solving some cost function of (16) and (17) such that, ∑ = . Equivalently, the CP fused mean and covariance can be written as, The details of the simplification are provided in Appendix A. Due to the projection in extended space of R 2N , (12) and (13) provide a fused result with respect to each data source. In the case wherê x 1 andx 2 estimate the same entity, that is, M = [I N I N ] T , the fused result will be same for the two data sources. As such, a close form equation for fusing redundant data sources in R N can be obtained from (12) and (13) as, Given n mean estimatesx 1 ,x 2 , . . . ,x n of a state x ∈ R N with their respective covariances P 1 , P 2 , . . . , P n ∈ R N×N and known cross-covariances P ij , i, j = 1, . . . , n, (14) and (15) can be used to obtain the optimal fused mean estimate and covariance with M = [I N1 , I N2 , . . . , I Nn ] T .
For fusing correlated estimates from n redundant sources, the CP method is equivalent to the weighted fusion algorithms [9,10], which compute the fused mean estimate and covariance as a summation of weighted individual estimates as, Where the weights c i are determined by solving some cost function of (16) and (17) such that, ∑ n i=1 c i = I. Equivalently, the CP fused mean and covariance can be written as, In the particular case of two data sources, the CP fused solution reduces to the well-known Bar-Shalom Campo formula [7], Although equivalent to the traditional approaches in fusing redundant data sources, the proposed method offers a generalized framework not only for fusing correlated data sources but also for handling linear constraints and data inconsistency simultaneously within the framework.
The proposed method provides an unbiased and optimal fused estimate in the sense of minimum mean square error (MMSE). Theorem 1. For n unbiased mean estimatesx 1 ,x 2 , . . . ,x n , the fused mean estimate x provided by the CP method is an unbiased estimator of x, that is, E( x) = E(x). (18), we can write,

Proof. From
Taking the expectation on both sides, we get, Since the sensor estimatesx 1 ,x 2 , . . . ,x n are unbiased, we have E( where ∑ n i=1 L i = I. This concludes that the fused state estimate x is an unbiased estimate of x.
Theorem 2. The fused covariance P of the CP method is smaller than the individual covariances, that is, Proof. From equation (15), we can write, By Schwartz matrix inequality, we have, where M is the constraint between data sources and M i = [I Ni , 0, . . . , 0] T is the constraint matrix for P i . The equality holds for P i = P ij , that is, P = P i , when P i = P ij , j = 1, 2, . . . , n.
It can be observed from (14) and (15) that computation of cross-covariance P ij is needed to compute the fused mean and covariance. Cross-covariance among the local estimates can be computed as [9, 10,33], where K i and K j are the Kalman gain of source i and j respectively for i, j = 1, . . . , n and P k−1 ij represent the cross covariance of the previous cycle between source i and j.

Fusion in the Presence of Spurious Data
Due to the inherent nature of sensor devices and the real-world environment, the sensor observations may also be affected by random faults. Subsequently, the local estimates provided by sensor systems in a distributed architecture may be spurious and inconsistent. This may cause the fusion methodologies to fail since they are based on the assumption of consistent input sensor estimates. Therefore, a validation scheme is required to detect and remove the spurious estimates from the fusion pool. The proposed approach exploits the constraint manifold among sensor estimates to identify any data inconsistency. The identification of inconsistent data is based on the distance from the constraint manifold to the mean of redundant data sources in the extended space that provides a confidence measure with the relative disparity among data sources. Assuming a joint multivariate normal distribution for the data sources, the data confidence can be measured by computing the distance from the constraint manifold as illustrated in Figure 3. It can be observed from (14) and (15) that computation of cross-covariance is needed to compute the fused mean and covariance. Cross-covariance among the local estimates can be computed as [9,10,33], where and are the Kalman gain of source and respectively for , = 1, … , and represent the cross covariance of the previous cycle between source and . □

Fusion in the Presence of Spurious Data
Due to the inherent nature of sensor devices and the real-world environment, the sensor observations may also be affected by random faults. Subsequently, the local estimates provided by sensor systems in a distributed architecture may be spurious and inconsistent. This may cause the fusion methodologies to fail since they are based on the assumption of consistent input sensor estimates. Therefore, a validation scheme is required to detect and remove the spurious estimates from the fusion pool. The proposed approach exploits the constraint manifold among sensor estimates to identify any data inconsistency. The identification of inconsistent data is based on the distance from the constraint manifold to the mean of redundant data sources in the extended space that provides a confidence measure with the relative disparity among data sources. Assuming a joint multivariate normal distribution for the data sources, the data confidence can be measured by computing the distance from the constraint manifold as illustrated in Figure 3. Consider the joint space representation of n sensor estimates (x N1 , P 1 ), (x N2 , P n ), . . . , (x Nn , P 1 ), where N is the dimension of the state vector. The distance d from the constraint manifold can be calculated as, where x is the point on the manifold and can be obtain by using (12). In the case of two data sources with meanx 1 ,x 2 ∈ R N and respective covariance matrices P 1 and P 2 ∈ R N×N . The distance d can be obtained as, The point on the manifold is given as, The details of simplifications are provided in Appendix B. From (24), it can be observed that distance d is a weighted distance between the two data sources and it can provide a measure of nearness or farness of the two data sources to each other. A large value of d implies a large separation while a small d signifies closeness of the data sources. In other words, the distance from the manifold provides an indication of the relative disparity among data sources.

Theorem 3.
For n data sources of N dimension, the d distance (23) follow a chi-squared distribution with nN degrees of freedom (DOF), that is, d ∼ χ 2 (Nn).
Under the whitening transformation, WP −1 W = I, Wx =x W and W x = x W . Thus, we can write, Since d is a chi-square distribution with Nn DOF, then for any significance level α ∈ (0, 1), χ 2 α (Nn) is defined such that the probability, This is depicted in Figure 4. Hence, to have a confidence of 100 × (1 − α) percent, d should be less than respective critical value as illustrated in Figure 4. A chi-square table [34] can be used to obtain the critical value for the confidence distance with a particular significance level and DOF. A value of α = 0.05 is assumed in this paper unless specified.  Figure 4. Hence, to have a confidence of 100 × (1 − ) percent, should be less than respective critical value as illustrated in Figure 4. A chi-square table [34] can be used to obtain the critical value for the confidence distance with a particular significance level and DOF. A value of = 0.05 is assumed in this paper unless specified. □

Inconsistency Detection and Exclusion
To obtain reliable and consistent fusion results, it is important that the inconsistent estimates be identified and excluded before fusion. For this reason, at each time step when the fusion center receives computed estimates from sensor nodes, distance is calculated. A chi-square table is then used to obtain the critical value for a particular significance level and DOF. A computed distance less than the critical value mean that we are confident about the closeness of sensor estimates and

Inconsistency Detection and Exclusion
To obtain reliable and consistent fusion results, it is important that the inconsistent estimates be identified and excluded before fusion. For this reason, at each time step when the fusion center receives computed estimates from sensor nodes, distance d is calculated. A chi-square table is then used to obtain the critical value for a particular significance level and DOF. A computed distance d less than the critical value mean that we are confident about the closeness of sensor estimates and that they can be fused together to provide a better estimate of the underlying states. On the other hand, a distance d greater than or equal to the critical value indicate spuriousness of the sensor estimates. At least one of the sensor estimate is significantly different than the other sensor estimates. To exclude the outliers, a distance from the manifold is computed for every estimate and compared with the respective critical values. For n mean estimatesx 1 ,x 2 , . . . ,x n with respective covariances P 1 , P 2 , . . . , P n and cross-covariances P ij for i, j = 1, . . . , n, the hypothesis and decision rule are summarized as follows,

Hypotheses:
H 0 :x 1 =x 2 = · · · =x n H 1 : α (Nn) If the hypothesis H 0 is accepted, then the estimates are optimally fused using (14) and (15). On the other hand, rejection of null hypothesis means that at least one of the sensor estimates is significantly different than the other estimates. Then, a distance from the manifold is computed for each of the estimates as, . . , n with x computed using (14). The outliers are identified and eliminated based on the respective critical value, that is, if d i ≥ χ 2 α (N), they are rejected, where N is the dimension of an individual data source.

Effect of Correlation on d Distance
Since the estimates provided by multiple data sources are correlated, it is important to consider the effect of cross-correlation in the calculation of confidence distance. Consider two sensor estimateŝ x 1 ∈ R 1 andx 2 ∈ R 1 with respective variances σ 2 1 and σ 2 2 and cross-covariance σ 2 12 = ρ σ 2 1 σ 2 2 , where ρ ∈ [−1, 1] is the correlation coefficient. The d distance for the pair of multivariate Gaussian estimates x 1 , σ 2 1 and x 2 , σ 2 2 , with cross-covariance σ 2 12 can be written as, It is apparent that the distance between the mean values is affected by the correlation between the data sources. Figure 5 illustrates the dependency of confidence distance d on the correlation coefficient. Figure 5a shows the distance d with changing correlation coefficient from −1 to 1. It can be observed that a positive correlation between the data sources results in a large d distance. This means that a slight separation between the positively correlated data sources indicates spuriousness with high significance as compared to negatively correlated and uncorrelated data sources. Figure 5b shows the scenario in which a data source (with changing mean and constant variance) is moving away from another data source (with constant mean and constant variance). The distance d is plotted for various values of correlation coefficients. The y-axis shows the percentage of rejection of the null hypothesis H 0 . It can be noted that ignoring the cross-correlation in distance d results in underestimated or overestimated confidence and may lead to an incorrect rejection of the true null hypothesis (Type I error) or incorrect retaining of false null hypothesis (Type II error). The proposed framework inherently takes care of any cross-correlation among multiple data sources in the computation of distance d. The sensor measurements are assumed to be cross-correlated. It is also assumed that the sensor 1, sensor 2 and sensor 3 measurements are independently corrupted by unmodeled noise and produce inconsistent data for 33%, 33% and 34% of the time respectively. The sensors compute local estimates of the state and send it to the fusion center. Three strategies for combining the local sensor estimates are compared: (1) CP, which fuses the three sensor estimates using (14) and (15) without removing outliers; (2) CP WO-d means the outliers were identified and rejected before fusion based on (27) with = 0, that is, correlation in computation of is ignored and; (3) CP WO-dC, reject the outliers based on (27) with taking into account the cross-correlation. Figure 6 shows the fused solution of three sensors when the estimate provided by sensor 2 is in disagreement with sensor 1 and 3. It can be observed from Figure 6 that neglecting the cross-correlation in CP WO-d results in Type II error, that is, all the three estimates are fused despite the fact that estimate 2 is inconsistent. CP WO-dC correctly identifies and eliminates the spurious estimate before the fusion process. Example: Consider a numerical simulation with the constant state, Three sensors are used to estimate the state x k , where the measurements of the sensors are affected with respective variances of R 1 , R 2 and R 3 . The values for the parameters assumed in the simulation are, The sensor measurements are assumed to be cross-correlated. It is also assumed that the sensor 1, sensor 2 and sensor 3 measurements are independently corrupted by unmodeled noise and produce inconsistent data for 33%, 33% and 34% of the time respectively. The sensors compute local estimates of the state and send it to the fusion center. Three strategies for combining the local sensor estimates are compared: (1) CP, which fuses the three sensor estimates using (14) and (15) without removing outliers; (2) CP WO-d means the outliers were identified and rejected before fusion based on (27) with σ 2 12 = 0, that is, correlation in computation of d is ignored and; (3) CP WO-dC, reject the outliers based on (27) with taking into account the cross-correlation. Figure 6 shows the fused solution of three sensors when the estimate provided by sensor 2 is in disagreement with sensor 1 and 3. It can be observed from Figure 6 that neglecting the cross-correlation in CP WO-d results in Type II error, that is, all the three estimates are fused despite the fact that estimate 2 is inconsistent. CP WO-dC correctly identifies and eliminates the spurious estimate before the fusion process. Figure 7 shows the estimated position after the fusion of the three sensors' estimates for 100 samples. It can be seen that the presence of outliers greatly affects the outcome of multisensor data fusion. As depicted in Figure 7, eliminating outliers before fusion improves the estimation performance. The fused samples of CP WO-d and CP WO-dC on average lies closer to the actual state. Figure 7 also shows the difference in fusion performance when outliers are identified with and without considering cross-correlation. It can be noted that neglecting the correlation affects the estimation quality because of Type I and Type II errors.

Fusion under Linear Constraints
The system model of a linear dynamic system takes into account the relation and dependencies among components of the state vector. In some applications, however, the state variables may be subject to additional constraints due to the basic laws of physics, geometry of the system environment or due to the mathematical description of the state vector. Imposing such certain information in an otherwise probabilistic setting should yield a more accurate estimate that is guaranteed to be feasible.
Consider a linear dynamic system model, where represents the discrete-time index, is the system matrix, is the input matrix, is the input vector and is the state vector. The system process noise with covariance matrix and measurement noise with covariance are assumed to be correlated, that is,

Fusion under Linear Constraints
The system model of a linear dynamic system takes into account the relation and dependencies among components of the state vector. In some applications, however, the state variables may be subject to additional constraints due to the basic laws of physics, geometry of the system environment or due to the mathematical description of the state vector. Imposing such certain information in an otherwise probabilistic setting should yield a more accurate estimate that is guaranteed to be feasible.
Consider a linear dynamic system model, where represents the discrete-time index, is the system matrix, is the input matrix, is the input vector and is the state vector. The system process noise with covariance matrix

Fusion under Linear Constraints
The system model of a linear dynamic system takes into account the relation and dependencies among components of the state vector. In some applications, however, the state variables may be subject to additional constraints due to the basic laws of physics, geometry of the system environment or due to the mathematical description of the state vector. Imposing such certain information in an otherwise probabilistic setting should yield a more accurate estimate that is guaranteed to be feasible.
Consider a linear dynamic system model, where k represents the discrete-time index, A k is the system matrix, B k is the input matrix, u k is the input vector and x k is the state vector. The system process noise w k with covariance matrix Q and measurement noise v k with covariance R are assumed to be correlated, that is, The state x k ∈ R N is known to be constrained as, For c = 0, the state space can be translated by a factor c such that Cx k = 0. After constrained state estimation, the state space can be translated back by the factor c to satisfy Cx k = c. Hence, without loss of generality, the c = 0 case is considered for analysis here. The matrix C ∈ R n×m is assumed to have a full row rank. A row deficient matrix C signifies the presence of redundant constraints. In such a case, we can simply remove the linearly dependent rows from C. In the following, the estimate projection (EP) method is briefly reviewed which is followed by the Covariance Projection (CP) method for linear constraints among state variables.

Estimate Projection Method
The estimate projection (EP) method [23,24] projects the unconstrained estimate obtained from Kalman filtering onto the constraint subspace to satisfy the linear constraints among state variables. Let us denote the unconstrained filtered estimate and constrained estimate as (x u , P u ) and (x p , P p ), respectively. Then the following optimization problem is solved to obtain the constrained estimate, where U is any symmetric positive definite weighting matrix. Solving (31) using Lagrange multipliers results in a constrained mean and covariance,x where J is the projector on the null space of constrained matrix C, defined as, Any symmetrical positive definite matrix can be used as a weighting matrix U to obtain the constrained estimate but the two most common choices are identity matrix I and inverse of unconstrained covariance P u −1 .

Covariance Projection Method for Linear Constraints
The CP framework incorporates any linear constraints among states without any additional processing. Let us denote the constrained filtered estimate of the CP method as (x c , P c ). The extended space representation of the state predictions and measurements of multiple sensors can be written as, Then the CP estimate in the presence of linear constraints among states can be obtained using (12) and (13) as, where the M c matrix is the subspace of the constraint among the state prediction x − k and sensor measurements z k i as well as linear constraints C among state variables. The subspace of the linear constraint among state prediction and sensor measurements can be written as, Then, M c is a combination of M and C, that is, The projection of the probability distribution of true states and measurements around the predicted states and actual measurements onto the constraint manifold M c in the extended space provide the filtered or fused estimate of state prediction and sensor measurements as well as completely satisfying the linear constraints among the states directly in one step.
It can be observed from (31) that the EP method forces the unconstrained estimate on the linear constraint under some norm to satisfy the linear constraints among state variables. Consequently, the true optimality cannot be guaranteed due to the fact that the projected point close to the unconstrained estimate does not imply that it is close to the true constrained state [29]. On the other hand, the unified constraint matrix M c of the CP method ensures that the linear constraints among state variables are exactly satisfied. Furthermore, as compared to the EP method that needs the online projection steps (32) and (33) after filtering, the M c matrix for the CP method can be computed offline. This means that the EP method is computationally less efficient. Additionally, the proposed CP method is inherently suitable for taking care of any cross-correlation in the constrained estimation process.

Simulation Results
In this section, illustrative examples are provided to demonstrate the effectiveness of the theoretical results derived in the previous sections. We use the Monte Carlo technique [35], a method extensively used in a wide variety of fields such as physical science, computational biology, statistics, computational geometry, artificial intelligence, engineering, decision theory, and quantitative finance (see, e.g., the recent works [36][37][38][39][40]).

Tracking in the Presence of Correlations and Outliers
Consider a target tracking scenario characterized by the following dynamic system model, with the state vector x k = x, .
x T . Where x and .
x are the position and velocity of the target at time k, respectively. T = 0.5 s is the sampling period. The system process is affected by zero mean Gaussian noise with covariance matrix Q. Four sensors are employed to track the movement of the target, where the sensor measurements are approximated by the following equation, The measurements of the sensors are affected by noise v k i with respective covariances of R 1 , R 2 , R 3 and R 4 . The covariances of the process noise and sensor measurement noises used in the simulation are, Starting from an initial value of [100, 3], in each time step the individual sensor uses (36) to predict the state of the target and then update the state prediction by its own sensor measurements. The local estimates are assumed to be correlated and (22) is used to compute the cross-correlation among local estimates. The estimated mean and covariances of the states by each sensor are sent to the fusion center, where they are fused by the CP method. Table 1 summarizes the trace of covariance of local sensors along with the trace of fused covariance provided by the CP method. As seen from Table 1, the trace of covariance provided by the CP method is less than the individual local state estimates, that is, trace P ≤ traceP i , i = 1, . . . , 4. This means that the fused result is better than each of the local state estimates. In order to further verify this theoretical result, we compute the mean square error (MSE) as, where V is the number of Monte Carlo trials andx i (k) and x i (k) are the estimated and actual state vector respectively. Since, Then we have [41], The simulation is carried out for 1000 Monte Carlo runs and the local estimates provided by four sensor nodes along with the fused result of the CP method are shown in Figure 8. The straight lines in Figure 8 denote the trace of error covariance matrices and the solid curve represents the MSE of local and fused estimates. It can be observed from Figure 8 that the MSE of the individual sensor node fluctuates around the trace, which is consistent with (38). Furthermore, the accuracy relation of local sensor estimates and fused estimates in terms of MSE in Figure 8 is coincident with the theoretical results in Table 1.
The simulation is carried out for 1000 Monte Carlo runs and the local estimates provided by four sensor nodes along with the fused result of the CP method are shown in Figure 8. The straight lines in Figure 8 denote the trace of error covariance matrices and the solid curve represents the MSE of local and fused estimates. It can be observed from Figure 8 that the MSE of the individual sensor node fluctuates around the trace, which is consistent with (38). Furthermore, the accuracy relation of local sensor estimates and fused estimates in terms of MSE in Figure 8 is coincident with the theoretical results in Table 1.  Consider the same dynamic example of four sensors with the following system and measurement equations, where the process and measurement noise parameters are the same. Now, it is also assumed that the sensor 1, sensor 2, sensor 3 and sensor 4 measurements are independently affected by unmodeled random noise for 5%, 15%, 20% and 10% of the time, respectively, and thus the estimates provided by the sensors are spurious. The control input alternate between 1 and −1 and set to a value of 1 if ̇ < 30 otherwise it is changed to −1 until ̇ < 5. Starting from an initial value of [100, 3], in each time step the individual sensor node compute local filtered estimates. The estimated mean and covariances by each sensor are sent to the fusion center, where they are fused. The three fusion strategies of CP (fusion without outlier removal), CP WO-d (outlier removal without considering cross-correlation) and CP WO-dC (taking care of correlation in outlier removal) are compared based on root mean squared error (RMSE) between the actual state value and fused estimate of the state. The inconsistency is detected with significance level = 0.05. Figure 9a,b illustrate the RMSE of the target position and velocity respectively versus time for 1000 Monte Carlo runs. Table 2 summarizes the average RMSE of position and velocity. It can be observed from Figure 9 and Table 2 that the presence of outliers deteriorates the performance of multisensor data fusion. Eliminating the outliers before fusion greatly improves the estimation quality. Figure 9 and Table 2 also shows the difference in fusion performance of CP WO-d and CP WO-dC, when outliers are identified with and without Consider the same dynamic example of four sensors with the following system and measurement equations, where the process and measurement noise parameters are the same. Now, it is also assumed that the sensor 1, sensor 2, sensor 3 and sensor 4 measurements are independently affected by unmodeled random noise e k i for 5%, 15%, 20% and 10% of the time, respectively, and thus the estimates provided by the sensors are spurious. The control input alternate between 1 and −1 and set to a value of 1 if .
x < 30 otherwise it is changed to −1 until . x < 5. Starting from an initial value of [100, 3], in each time step the individual sensor node compute local filtered estimates. The estimated mean and covariances by each sensor are sent to the fusion center, where they are fused. The three fusion strategies of CP (fusion without outlier removal), CP WO-d (outlier removal without considering cross-correlation) and CP WO-dC (taking care of correlation in outlier removal) are compared based on root mean squared error (RMSE) between the actual state value and fused estimate of the state. The inconsistency is detected with significance level α = 0.05. Figure 9a,b illustrate the RMSE of the target position and velocity respectively versus time for 1000 Monte Carlo runs. Table 2 summarizes the average RMSE of position and velocity. It can be observed from Figure 9 and Table 2 that the presence of outliers deteriorates the performance of multisensor data fusion. Eliminating the outliers before fusion greatly improves the estimation quality. Figure 9 and Table 2 also shows the difference in fusion performance of CP WO-d and CP WO-dC, when outliers are identified with and without consideration of cross-correlation in distance d respectively. It can be noted that consideration of correlation in distance d improves the estimation quality in presence of outliers by avoiding Type I and Type II errors. consideration of cross-correlation in distance respectively. It can be noted that consideration of correlation in distance improves the estimation quality in presence of outliers by avoiding Type I and Type II errors.

Target Tracking in the Presence of Linear Constraints
Consider a 2D target tracking problem [24] with the following system equation,

Target Tracking in the Presence of Linear Constraints
Consider a 2D target tracking problem [24] with the following system equation, where x 2 T is the state vector, with the first two states as the north and east position with the last two states as the north and east velocity of the target respectively. A sensor measures the position and velocity of the system as, . w k and v k are the Gaussian process and measurement noise, respectively. Suppose that we have additional information that the vehicle is moving on the road with a heading angle of θ from the east x 2 , then we can write, Due to the heading of the vehicle, the states are dependent on each other, thus providing us additional constraints, Based on the constraint matrix M = I 4 H T between the state prediction and sensor measurement and linear constraints (43) among the state variables, the unified constraint matrix M c can be obtained as, The covariance of the process and measurement noise are assumed to be, x 2 < 30 otherwise it is changed to −1 until .
x 1 or .
x 2 < 5. The process and measurement noise are assumed to be correlated with covariance, where F 1 and F 2 are the cholesky decomposition of Q and R respectively. ρ is the correlation coefficient and assumed as 0.4 in the simulation. Starting from the initial value, in each time step (41) is used to predict the states of the vehicle. The states are then updated with the sensor measurements (42). Using M c from (44) in (34) and (35), the proposed method directly satisfies the constraint between the perdition and measurement as well as the linear constraints among states due to the heading of the vehicle. On the other hand, the EP method first employs a Kalman filter to obtain the unconstrained state estimates and then project the unconstrained state estimates on the linear constraints subspace to satisfy the constraints among states. The performance of the proposed method is compared with the unconstrained estimate and estimate provided by the EP method in terms of RMSE. Table 3 summarizes the average RMSE of the individual states for 1000 Monte Carlo runs. It can be observed from Table 3 that the RMSE of the CP method is lower than the other estimators for all states. For the EP method, the use of P u −1 as a weighting parameter provides better results than using I. The comparative results of different methods can be also seen in Figure 10a,b, where Figure 10a is the RMSE of the northerly position and Figure 10b is the RMSE of northerly velocity. It can be seen that the proposed method performs better as compared to the unconstrained state estimation and EP method. Figure 11a,b show the variance of northerly position and northerly velocity of the vehicle, respectively. It is proven that the EP method with weighting parameter P u −1 has the smallest covariance [24]. This can also be observed from Figure 11a,b where the EP method with weight P u −1 provides smaller variance than the unconstrained estimate and EP method by identitying matrix I as the weighting parameter. However, the CP method results in even smaller state variance than all the competing methods as seen in Figure 11a,b.

Methods
Average RMSE However, the CP method results in even smaller state variance than all the competing methods as seen in Figure 11a,b.

Conclusions
In this paper, we propose a general approach to data fusion under arbitrary correlations and linear constraints as well as data inconsistency. The proposed method provides an unbiased and optimal solution in the sense of MMSE for fusing data from multiple sources. The method improves the fusion accuracy by automatically detecting and removing inconsistent estimates from the fusion pool by a statistical confidence measure. Moreover, it is shown that by considering the crosscorrelation among local estimates, the proposed method avoids the deterioration of the fusion accuracy due to Type I and Type II errors. Without any additional manipulation, the proposed method completely satisfies any linear constraints among state variables. The method improves the accuracy of constrained state estimation by satisfying the linear constraints among state variables and provides better results than the unconstrained state estimation and constrained state estimation using the estimate projection method.
Future work includes the extension of the proposed method for state estimation of non-linear dynamic systems. Another avenue is examining the proposed method for incorporating non-linear constraints among state variables.
Author Contributions: Sukhan Lee proposed the concept of data fusion based on covariance projection in the extended space as well as its application to outlier removal, while Muhammad Abu Bakr carried out the mathematical implementation and simulation analysis. Abu Bakr drafted the paper. Lee supervised Abu Bakr with critical assessment of the draft for quality revision.

Conclusions
In this paper, we propose a general approach to data fusion under arbitrary correlations and linear constraints as well as data inconsistency. The proposed method provides an unbiased and optimal solution in the sense of MMSE for fusing data from multiple sources. The method improves the fusion accuracy by automatically detecting and removing inconsistent estimates from the fusion pool by a statistical confidence measure. Moreover, it is shown that by considering the cross-correlation among local estimates, the proposed method avoids the deterioration of the fusion accuracy due to Type I and Type II errors. Without any additional manipulation, the proposed method completely satisfies any linear constraints among state variables. The method improves the accuracy of constrained state estimation by satisfying the linear constraints among state variables and provides better results than the unconstrained state estimation and constrained state estimation using the estimate projection method.
Future work includes the extension of the proposed method for state estimation of non-linear dynamic systems. Another avenue is examining the proposed method for incorporating non-linear constraints among state variables.
Author Contributions: Sukhan Lee proposed the concept of data fusion based on covariance projection in the extended space as well as its application to outlier removal, while Muhammad Abu Bakr carried out the

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
The fused mean estimate and covariance of the covariance projection (CP) method are given as,