Multisensor Estimation Fusion on Statistical Manifold

In this paper, we characterize local estimates from multiple distributed sensors as posterior probability densities, which are assumed to belong to a common parametric family. Adopting the information-geometric viewpoint, we regard this family as a Riemannian manifold endowed with the Fisher metric, and formulate the fused density as an informative barycenter that minimizes the sum of its geodesic distances to all local posterior densities. Under the assumption of a multivariate elliptical distribution (MED), two fusion methods are developed by using the minimal Manhattan distance instead of the geodesic distance on the manifold of MEDs; both share the same mean estimation fusion but differ in covariance estimation fusion. One obtains the fused covariance estimate by a robust fixed point iterative algorithm with theoretical convergence, and the other provides an explicit expression for the fused covariance estimate. At different heavy-tailed levels, the fusion results for two local estimates of a static target show that the two methods approximate the informative barycenter better than some existing fusion methods. An application to distributed estimation fusion for dynamic systems with heavy-tailed process and observation noises demonstrates the performance of the two proposed fusion algorithms.


Introduction
Multisensor data fusion has been studied and widely applied in many important areas, such as image processing [1,2], wireless sensor networks [3,4] and remote sensing [5,6]. The research mainly centers on two basic architectures, i.e., centralized fusion and distributed fusion. The latter has a lower communication burden and higher flexibility and reliability, but faces challenges in dealing with the cross-correlation among local estimation errors caused by common process noises, correlated measurement noises and past data exchanges [7][8][9].
Considering the limitations of traditional Bayesian estimation methods such as some Kalman filtering variants, many estimation fusion methods utilize only the first two moments (i.e., mean and covariance) of each local posterior density for data fusion. For instance, the covariance intersection (CI) fusion method [10,11] propagates each local information pair consisting of the information matrix (i.e., the inverse covariance matrix) and the information vector (i.e., the state estimate left-multiplied by the information matrix), and then forms a convex combination of all local information pairs with specified normalized weights to yield a conservative fused estimate. Specifically, the determinant-minimization CI (DCI) method [12] selects weights by minimizing the trace or the determinant of the fused covariance matrix, which has also been given a set-theoretic interpretation [13]. Applying the set-theoretic criterion, some novel set-theoretic methods such as the relaxed Chebyshev center CI (RCC-CI) [14] and the analytic center CI (AC-CI) [15] take both the local state estimate and the covariance matrix of the estimation error into account to obtain the optimal weights, whereas the DCI considers only the latter.
The probability density function (PDF) constitutes a complete probabilistic description of the state estimate (i.e., the point estimate of the quantity of concern) from each sensor. (ii) We derive the explicit expression for the MHD-based projection (MHDP) from a point onto the MED submanifold with a fixed mean, and then develop two fusion methods, MHDP-I and MHDP-E, by relaxing the fusion criterion; both share the same mean estimation fusion, obtained by a fixed point iteration, but differ in the form of the fused covariance. (iii) The MHDP-I obtains the fused covariance using a robust fixed point iterative algorithm with theoretical convergence, while the MHDP-E provides an explicit expression for the fused covariance by introducing a Lie algebraic structure; we also provide a crucial theorem presenting the exponential and logarithmic mappings on the MED submanifold with a fixed mean. (iv) Simulations indicate that the two proposed information-geometric methods outperform some existing fusion methods under non-Gaussian distributions.
The remainder of this paper is organized as follows. Section 2 introduces basic facts of information geometry and some useful results concerning the MED manifold. Section 3 formulates an information-geometric fusion criterion based on the minimal MHD, and Section 4 proposes the MHDP-I and MHDP-E fusion methods. Numerical examples in Section 5 demonstrate the superiority of the two methods. Section 6 concludes the paper. All proofs are given in Appendices A-F.

Notations
Throughout this paper, we use lightface letters for scalars and scalar-valued mappings, boldface lowercase letters for vectors, and boldface capital letters for matrices and matrix-valued mappings. All vectors are column vectors. The notation S_+^m denotes the set of m × m real symmetric positive-definite matrices, R_+ the set of all positive real numbers, and R^m the set of all m-dimensional real column vectors. The symbol I_m stands for the identity matrix of order m. For a matrix A, A^T and tr(A) denote its transpose and trace, respectively. Moreover, Exp(·), Log(·) and arccosh(·) denote the matrix exponential, the matrix logarithm and the inverse hyperbolic cosine, respectively.

Preliminaries
In this section, we review basic notions of information geometry and present some important results for later use.

Statistical Manifold and Fisher Metric
Consider a statistical model S = {p(x; θ) : θ ∈ Θ} of probability densities with the global coordinate system θ = (θ_1, . . . , θ_m), where Θ is an open subset of R^m. Using the Fisher metric g induced by the m × m Fisher information matrix with the (i, j)-th entry

g_ij(θ) = E_θ[ (∂ log p(x; θ)/∂θ_i) (∂ log p(x; θ)/∂θ_j) ],

where E_θ[·] denotes the expectation with respect to p(x; θ), the statistical manifold S can be endowed with a natural Riemannian differentiable structure [44]. For brevity, we abbreviate the point p(x; θ) of S by its coordinate θ. Let T_θS be the tangent space at the point θ in S; an inner product on T_θS is then defined by the Fisher metric g, written as ⟨·, ·⟩_θ : T_θS × T_θS → R.
Thus, the geodesic distance between two points θ_0 and θ_1 in S, also called the Rao distance or Fisher information distance, can be obtained by solving the functional minimization problem

F(θ_0, θ_1) = min_{γ ∈ Γ} ∫_0^1 √(⟨γ̇(t), γ̇(t)⟩_{γ(t)}) dt,

where the dot over a variable denotes its derivative with respect to t ∈ R, and the set Γ consists of all piecewise smooth curves linking θ_0 to θ_1. For any ν ∈ T_θS, there exists a unique geodesic segment γ(t; θ_0, ν), t ∈ [0, 1], satisfying γ(0) = θ_0 and γ̇(0) = ν, and the exponential mapping is then defined as exp_{θ_0}(ν) = γ(1; θ_0, ν). Conversely, the logarithmic mapping log_{θ_0}(θ_1) maps θ_1 to the tangent vector ν at θ_0. As a generalization, a manifold retraction R(·) is a smooth mapping from the tangent bundle TS = {T_θS : θ ∈ Θ} into S (see, e.g., [45]), whose restriction R_θ(·) to an open ball B(0, r) of radius r in T_θS satisfies R_θ(0) = θ and dR_θ|_0 = id_{T_θS}. Moreover, its inverse mapping R_θ^{-1}(·), called the lifting mapping, exists in a neighborhood of θ. In particular, the exponential and logarithmic mappings are the most commonly used retraction and lifting mappings, respectively.
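To make the Fisher metric concrete, the following sketch (our own illustration, not from the paper) numerically approximates the Fisher information matrix of a univariate Gaussian with coordinates θ = (µ, σ), by taking the Monte Carlo expectation of the outer product of score vectors; the analytic result diag(1/σ², 2/σ²) serves as a sanity check.

```python
import numpy as np

def fisher_info_mc(mu, sigma, n_samples=200_000, eps=1e-4, seed=0):
    """Monte Carlo estimate of g_ij(theta) = E_theta[d_i log p * d_j log p]."""
    rng = np.random.default_rng(seed)
    x = rng.normal(mu, sigma, n_samples)

    def log_p(x, mu, sigma):
        return (-0.5 * np.log(2 * np.pi) - np.log(sigma)
                - 0.5 * ((x - mu) / sigma) ** 2)

    # score components via central differences
    d_mu = (log_p(x, mu + eps, sigma) - log_p(x, mu - eps, sigma)) / (2 * eps)
    d_sg = (log_p(x, mu, sigma + eps) - log_p(x, mu, sigma - eps)) / (2 * eps)
    scores = np.stack([d_mu, d_sg])          # 2 x N matrix of score samples
    return scores @ scores.T / n_samples     # empirical E[score score^T]

G = fisher_info_mc(0.0, 2.0)   # analytic metric here: diag(1/4, 1/2)
```

For σ = 2 the estimate concentrates near diag(0.25, 0.5), with the off-diagonal entries vanishing, matching the well-known Gaussian Fisher information.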

Information Geometry of Elliptical Distributions
An m-dimensional random vector x has a multivariate elliptical distribution EL_h^m(µ, Σ) if its probability density function is of the form

p(x; µ, Σ) = c_h |Σ|^{-1/2} h( (x − µ)^T Σ^{-1} (x − µ) )

for some generating function h(·) and normalizing constant c_h, where µ ∈ R^m is the mean vector and Σ ∈ S_+^m is the positive-definite scatter matrix. From [46], x has covariance matrix κ_h Σ with a scale parameter κ_h determined by h.
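For illustration, the elliptical form above can be instantiated with the generator of a multivariate Student's t distribution with ν degrees of freedom; the helper below is a sketch under that assumption (the function name and interface are ours, and c_h and h are folded into a single log-constant).

```python
import numpy as np
from scipy.special import gammaln

def mvt_pdf(x, mu, Sigma, nu):
    """Elliptical density with the multivariate Student's t generator."""
    m = len(mu)
    d = x - mu
    q = d @ np.linalg.solve(Sigma, d)            # (x-mu)^T Sigma^{-1} (x-mu)
    log_c = (gammaln((nu + m) / 2) - gammaln(nu / 2)
             - 0.5 * m * np.log(nu * np.pi)
             - 0.5 * np.linalg.slogdet(Sigma)[1])
    return float(np.exp(log_c - 0.5 * (nu + m) * np.log1p(q / nu)))
```

For m = 1, Σ = 1 this reduces to the standard univariate t density, which gives a quick correctness check.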

Remark 1.
For other classical elliptical distributions, such as the Pearson type VII class, the multivariate Cauchy distribution and a special subfamily of MEDs, the reference [47] has derived analytic forms of a_h and b_h. In practical applications, however, some special MEDs (e.g., the Gaussian mixture distribution and the Gaussian-Student's t mixture distribution [41]) used to model non-Gaussian behavior may not admit such analytical forms, in which case the two parameters can be computed numerically from (9), or the MEDs can be approximated by specified classical elliptical distributions.
By introducing a Riemannian structure associated with the Fisher metric, the set of MEDs sharing the same fixed function h can be regarded as a Riemannian manifold M with coordinate system (µ, Σ). Note that the dimension of M is m + m(m + 1)/2. From [47], the squared line element at a point θ = (µ, Σ) in M is expressed in terms of the constants a_h and b_h induced by h, and the corresponding geodesic equations follow accordingly. Next, we derive explicit expressions for the geodesic distance and the geodesic curve between two given endpoints on the two-dimensional MED manifold M (i.e., m = 1) with coordinate system (µ, σ^2), which will be used in Section 5.1.

Theorem 1.
Consider the two-dimensional MED manifold M and two points p_1 = EL_h^1(µ_1, σ_1^2) and p_2 = EL_h^1(µ_2, σ_2^2); then the geodesic distance between p_1 and p_2 and the geodesic curve linking them admit closed-form expressions, whose constants are determined by the boundary conditions. Proof. See Appendix A.
However, in the general case m > 1, obtaining a closed form of the Rao distance on the MED manifold M remains an open problem, except on some submanifolds with a constant mean vector or a constant scatter matrix (see, e.g., [48]).

Some Submanifolds of Elliptical Distributions
One commonly used submanifold of M is M_{[µ_0,·]} = {EL_h^m(µ_0, Σ) : Σ ∈ S_+^m}, which has a fixed mean µ_0 and coordinate system (µ_0, Σ), or Σ for short. It is a totally geodesic submanifold, meaning that each geodesic of M_{[µ_0,·]} is also a geodesic of M. As shown in [48], for any two tangent vectors ν_1, ν_2 ∈ T_θ(M_{[µ_0,·]}) at a point θ = (µ_0, Σ), the Fisher inner product takes an explicit form, and the Rao distance between two points θ_0 = (µ_0, Σ_0) and θ_1 = (µ_0, Σ_1) is, up to an h-dependent scale factor, the affine-invariant distance determined by the eigenvalues of Σ_0^{-1}Σ_1. Another special submanifold, M_{[·,αΣ_0]}, is defined with a fixed scatter matrix Σ_0 ∈ S_+^m and coordinate system (µ, αΣ_0), or (µ, α) for short. As in [43], an explicit expression for the Rao distance on M_{[·,αΣ_0]} is available.
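In the Gaussian case, the Rao distance on the fixed-mean submanifold reduces (up to a constant scale) to the affine-invariant SPD distance; the following sketch, assuming that Gaussian-case formula, computes it from the generalized eigenvalues of the pencil (Σ_1, Σ_0).

```python
import numpy as np
from scipy.linalg import eigvalsh

def rao_spd(S0, S1):
    """Affine-invariant distance: sqrt(0.5 * sum_i log^2 lambda_i(S0^{-1} S1))."""
    lam = eigvalsh(S1, S0)     # generalized eigenvalues of the pencil (S1, S0)
    return float(np.sqrt(0.5 * np.sum(np.log(lam) ** 2)))
```

The distance is symmetric and invariant under congruence transformations Σ → AΣA^T, which is why it respects the geometry of the scatter matrices rather than their Euclidean embedding.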

Manhattan Distances
At present, there is no explicit expression for the Rao distance on the full MED manifold M. Motivated by the ℓ_1 (Manhattan) distance in Euclidean space, which measures the length of a path along the coordinate curves, we introduce an intermediate point between θ_0 = (µ_0, Σ_0) and θ_1 = (µ_1, Σ_1), and then construct a one-parameter class A of Manhattan distances (MHDs), each obtained by summing the Rao distance from θ_0 to the intermediate point on the submanifold (25) and the Rao distance from the intermediate point to θ_1 on the submanifold (22). Moreover, to provide a tighter upper bound for the Rao distance F(θ_0, θ_1) on M, the minimal MHD is obtained as the minimum over the class A by optimally choosing the parameter α.
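The minimal MHD can be sketched as a one-dimensional search over the intermediate scale α. The two component distances below are stand-ins (a Mahalanobis mean leg and the affine-invariant SPD scatter leg), not the paper's exact h-dependent submanifold formulas, so this only illustrates the structure of the optimization.

```python
import numpy as np
from scipy.linalg import eigvalsh
from scipy.optimize import minimize_scalar

def spd_dist(S0, S1):
    lam = eigvalsh(S1, S0)
    return np.sqrt(0.5 * np.sum(np.log(lam) ** 2))

def min_mhd(mu0, S0, mu1, S1):
    """Minimal Manhattan distance via a 1-D search over the
    intermediate scale alpha (stand-in component distances)."""
    def mhd(log_alpha):
        a = np.exp(log_alpha)
        # leg 1: mean displacement measured against the scaled scatter
        leg1 = np.sqrt((mu1 - mu0) @ np.linalg.solve(a * S0, mu1 - mu0))
        # leg 2: scatter displacement on the fixed-mean submanifold
        leg2 = spd_dist(a * S0, S1)
        return leg1 + leg2
    res = minimize_scalar(mhd, bounds=(-5.0, 5.0), method="bounded")
    return res.fun, float(np.exp(res.x))
```

Optimizing over log α keeps the search on R_+ and avoids boundary issues at α → 0.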

Fusion Criterion
Consider a distributed system with n sensors observing a common state x ∈ R^m. As in many practical scenarios for dynamic target tracking [49,50], the dynamic system is assumed to have heavy-tailed process and observation noises. We therefore adopt the MED assumption to better exploit the heavy-tailed features inherent in the noises, and denote by p_k = EL_h^m(x_k, P_k) the local posterior density from the k-th sensor. The goal is to fuse all p_k into a single fused density when the correlations among the local sensors are unavailable.

Information-Geometric Criterion
In practice, the Kullback-Leibler divergence (KLD) and the Rao distance are widely accepted information-theoretic quantities that capture higher-order statistical information, but the former is not a true distance in the mathematical sense, lacking both symmetry and the triangle inequality. In contrast, as a natural intrinsic measure, the Rao distance not only allows statistical issues to be treated while respecting the nonlinear geometric structure of MEDs, but is also invariant under coordinate transformations. From the viewpoint of information geometry, this geometric picture develops an intrinsic understanding of statistical models, and also provides a better avenue for estimation fusion. Taking the sum of geodesic distances F(p_k, p) between p = EL_h^m(µ, Σ) and all p_k on the MED manifold M as the cost function, we formulate the fusion criterion

(x̂, P̂) = arg min_{(µ,Σ)} Σ_{k=1}^n F(p_k, p),   (29)

to fuse all available local information p_k into a single posterior density p̂ = EL_h^m(x̂, P̂). The aforementioned intrinsic features of the Rao distance ensure a unique fusion result in a multisensor fusion network involving different measurement frameworks. However, owing to the absence of a closed form for the Rao distance on the manifold of MEDs, its tight upper bound, the minimal MHD (28), is applied in the fusion criterion (29) instead, where the intermediate points are p_{α_k} = EL_h^m(µ, α_k P_k). The advantages of using the minimal MHD to measure the dissimilarity between two MEDs are clear: (i) the minimal MHD (28) is constructed from two Rao distances on the submanifolds (22) and (25) by inserting an intermediate point, and thus has a decoupling feature while inheriting good properties of the Rao distance; (ii) as a tight upper bound for the Rao distance on the MED manifold, the minimal MHD (28) can be computed efficiently.
Moreover, its superior approximation performance has been fully verified by comparison with the Rao distances on the manifolds of MTDs and MGGDs (see [43] for more details).

Decoupling Fusion Criterion
Define the MHD-based projection distance (MHDPD) from a point p_k onto the submanifold M_{[µ,·]} as the minimal MHD over the scale factor of the intermediate point. Before solving the optimization problem (30), we first derive explicit expressions for the MHD-based projection (MHDP) from a point onto the submanifold M_{[µ,·]} and the corresponding MHDPD.

Theorem 2. The MHDP of p_k onto M_{[µ,·]} is the intermediate point p_{α̂_k} = EL_h^m(µ, α̂_k P_k) attained at the optimal scale factor α̂_k. Proof. See Appendix B.
As an application to distributed dynamic systems, each available local estimate (x_k, P_k) ∈ R^m × S_+^m can be restricted with high reliability to a bounded local feasible set S_k around the true location (µ_k, Σ_k) (see, e.g., [51]). Therefore, by the property of continuous functions on compact sets, it is straightforward to decompose the optimization (30) into two steps, first over µ and then over Σ, yielding (35). Similar to the relaxing strategy in [42], we solve (35) by successively optimizing over Σ and µ. Specifically, exchanging the summation and the minimization over Σ, and simplifying the objective function inside the summation, (35) reduces to (38), or equivalently, from (31), (32) and (34), to the mean fusion problem (39). In general, (39) is not equivalent to the original optimization problem (35), since it provides a lower bound for the minimum of the objective function in (35), but it makes good sense for the mean fusion based on the following insights: (i) as depicted in Figure 1, we replace the local posterior densities p_k with their MHDPs onto M_{[µ,·]} to measure the dissimilarity between the sought-for fused posterior density p̂ and p_k in the fusion criterion (30).
(ii) The inner minimizations in (39) project each local density p_k onto the MED submanifold M_{[µ,·]} with a common fixed mean to obtain its substitute p_{α̂_k}. An appropriate candidate x̂ for the mean variable µ is then selected by the outer optimization, which minimizes the sum of MHDPDs from the local posterior densities p_k onto M_{[µ,·]} with the proper scale factors α̂_k. Additionally, moving each local estimate (x_k, P_k) to its MHDP (x̂, α̂_k P_k) inevitably increases the local estimation error. It is observed from (40) that α̂_k(x̂) tends to 1 from the right as the fused mean x̂ approaches the k-th local mean estimate x_k, so it is reasonable to replace the local scatter matrix P_k with α̂_k P_k. We can then fuse all MHDPs p_{α̂_1}, . . . , p_{α̂_n} on the totally geodesic submanifold M_{[x̂,·]} with the fixed mean x̂ to seek the fused scatter matrix P̂, by minimizing the sum of squared geodesic distances d^2(p_{α̂_k}, p) in the fusion criterion (41). It is worth noting that the squared geodesic distance, used as the cost function in (41), differs from the original fusion criterion (29). The main reasons are as follows: (i) the covariance fusion (41) is in fact performed on the space S_+^m, since the Rao distance d on M_{[x̂,·]} depends only on the coordinate Σ ∈ S_+^m. Moreover, the Riemannian mean (also called the Riemannian center of mass) of data points p_1, . . . , p_n in S_+^m, which is the unique global minimizer of Σ_{k=1}^n d^2(p_k, p) for a geodesic distance d(·, ·) on S_+^m (see [52,53]), has been widely studied on the manifold of covariance matrices [54]. (ii) The criterion Σ_{k=1}^n d(p_k, p) gives a robust alternative to the center of mass, called the Riemannian median [55,56]. Both the Riemannian mean and the Riemannian median have been successfully applied to signal detection (e.g., [57,58]) and have shown their respective advantages.
However, when only two sensors are considered (i.e., n = 2), any point on the shortest geodesic segment between p_1 and p_2 is a Riemannian median, which is quite undesirable for fusion. In addition, the final mean fusion (39) does not contradict this claim, since it adopts the MHDPD as its cost function. (iii) Owing to the intractable root operation inside the geodesic distance, the squared geodesic distance is used for the convenience of solving the optimization problem (41). As illustrated in Section 4.2, we develop an efficient iterative strategy for the fused scatter estimate in the criterion (41) with at least linear convergence.
In summary, the final fused mean estimate x̂ and scatter matrix estimate P̂ can be obtained by solving (39) and (41), respectively.

Two MHDP-Based Fusion Methods
In this section, we first derive a fixed point iteration for the mean fusion (39), and then provide two different covariance estimation fusions: one by iteratively solving the optimization problem (41), and the other by introducing a Lie algebraic structure on M_{[x̂,·]} to obtain an explicit form of the fused covariance.

Mean Estimation Fusion
Define two auxiliary functions of µ and rewrite (39) as the optimization problem (44).

Theorem 3. The solution of the optimization problem (44) can be expressed in the implicit form

x̂ = ( Σ_{k=1}^n ŵ_k P_k^{-1} )^{-1} Σ_{k=1}^n ŵ_k P_k^{-1} x_k,   (45)

where the normalized weights ŵ_k depend on x̂ through (46). Proof. See Appendix C.
In Theorem 3, the fused mean estimate x̂ given by (45) has the same form as the traditional CI estimate, but in general with different weights. To solve the implicit Equation (45) for x̂, we derive a fixed point iteration in the following theorem.

Theorem 4. By adopting the fixed point iteration

µ_{t+1} = ( Σ_{k=1}^n ŵ_k(µ_t) P_k^{-1} )^{-1} Σ_{k=1}^n ŵ_k(µ_t) P_k^{-1} x_k,   (47)

with the normalized weights ŵ_k(µ_t) given by (48), the resulting sequence {µ_t, t ∈ N} converges to a solution x̂ of (45) as t tends to infinity.

Proof. See Appendix D.
As shown in Algorithm 1, the fixed point iteration (47) is adopted to compute the optimal mean estimate x̂ in (44), with convergence guaranteed by Theorem 4.

Algorithm 1: MHDP-based mean estimation fusion.
Input: {(x_k, P_k)}_{k=1,...,n} and tolerance ε
Output: x̂
1 Set t = 0, r = ε;
2 Initialize µ_0 with the CI estimate using equal weights;
3 while r ≥ ε do
4   Run the mean iteration (47);
5   Compute the norm r = ‖µ_{t+1} − µ_t‖;
6   t ← t + 1;
7 end
8 return x̂ = µ_t

Remark 2. Similar to the GP method in [42], the fused scatter matrix estimate might be set as P̂ = ( Σ_{k=1}^n ŵ_k P_k^{-1} )^{-1} (49), since the fused mean estimate x̂ has the CI form (45); the fused covariance estimate would then naturally be Q̂ = κ_h P̂ (50), with each local covariance Q_k = κ_h P_k. Nevertheless, (50) subjectively adopts the CI covariance fusion, and is better replaced by the following two MHDP-based covariance fusion forms.
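The structure of Algorithm 1 can be sketched as follows. The weight function here is a hypothetical stand-in (the paper's weights (46)/(48) depend on the MHDPD), so only the CI-form update and the stopping rule mirror the algorithm.

```python
import numpy as np

def fuse_mean(xs, Ps, tol=1e-10, max_iter=500):
    """CI-form fixed-point mean fusion with HYPOTHETICAL weights
    (stand-in for the paper's MHDPD-based weights)."""
    Pinvs = [np.linalg.inv(P) for P in Ps]
    mu = np.mean(xs, axis=0)                     # equal-weight CI-style start
    for _ in range(max_iter):
        # hypothetical weights: decay with Mahalanobis distance to each x_k
        d = np.array([np.sqrt((mu - x) @ Pi @ (mu - x))
                      for x, Pi in zip(xs, Pinvs)])
        w = np.exp(-d)
        w /= w.sum()                             # normalize the weights
        A = sum(wk * Pi for wk, Pi in zip(w, Pinvs))
        b = sum(wk * (Pi @ x) for wk, Pi, x in zip(w, Pinvs, xs))
        mu_new = np.linalg.solve(A, b)           # CI-form update (45)
        if np.linalg.norm(mu_new - mu) < tol:
            return mu_new
        mu = mu_new
    return mu
```

Because each update is a convex combination of the local estimates in the information domain, the iterate stays inside the convex hull of the x_k, which matches the intuition behind the CI form.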

Iterative Solution for Covariance Estimation Fusion
Define the auxiliary function ψ_k(Σ) as the squared Rao distance from the MHDP p_{α̂_k} to p on M_{[x̂,·]}; inserting the Rao distance (24) into (51) then yields an expression in terms of the projected scatter matrices P̃_k = α̂_k P_k. In the theorem below, we derive the Riemannian gradient of ψ_k(Σ) on M_{[x̂,·]}.
Theorem 5. The Riemannian gradient of ψ_k(Σ) on the submanifold M_{[x̂,·]} admits an explicit expression in terms of Σ and P̃_k. Proof. See Appendix E.
In general, the gradient equation (58) cannot be solved explicitly owing to the noncommutativity on S_+^m, but we formulate a fixed point iteration for the scatter estimation fusion, in which the current iterate is updated by averaging, in its tangent space, the logarithmic images of the projected scatters P̃_k and mapping the average back to the manifold. This is a very efficient iterative strategy for solving (58), with at least linear convergence (see [60] for a detailed proof). The fusion method above is called the MHDP-I method (i.e., Algorithm 2).
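A minimal sketch of such a fixed-point scheme, using the standard Karcher-mean iteration on SPD matrices in place of the paper's exact iteration (59):

```python
import numpy as np
from scipy.linalg import expm, logm, sqrtm

def karcher_mean(Ps, tol=1e-10, max_iter=100):
    """Riemannian (Karcher) mean of SPD matrices by fixed-point iteration:
    S <- S^{1/2} Exp( mean_k Log(S^{-1/2} P_k S^{-1/2}) ) S^{1/2}."""
    S = np.mean(Ps, axis=0)                      # arithmetic-mean initialization
    for _ in range(max_iter):
        R = np.real(sqrtm(S))
        Rinv = np.linalg.inv(R)
        # average the log-images of the data in the tangent space at S
        T = sum(np.real(logm(Rinv @ P @ Rinv)) for P in Ps) / len(Ps)
        S_new = R @ expm(T) @ R                  # map back to the manifold
        if np.linalg.norm(S_new - S, "fro") < tol:
            return S_new
        S = S_new
    return S
```

For two scalar inputs the fixed point is the geometric mean, a convenient check on the implementation.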

Explicit Solution for Covariance Estimation Fusion
The submanifold M_{[x̂,·]} with the fixed mean x̂ can be identified with the symmetric positive-definite matrix space S_+^m equipped with the Riemannian metric induced by (23). Meanwhile, we can endow S_+^m with a group structure by the following definition: (S_+^m, ⊙) is a commutative group whose group operations are defined as follows: (i) multiplication: g_1 ⊙ g_2 := Exp(Log(g_1) + Log(g_2)), for any g_1, g_2 ∈ S_+^m; (ii) inverse: i(g) := g^{-1}, for any g ∈ S_+^m; (iii) identity element: e = I_m.
It is easy to verify that the operations ⊙ and i are compatible with the Riemannian structure on S_+^m (see, e.g., [61]); hence S_+^m is a Lie group. As is well known, a Lie group is locally equivalent to a linear space, in which the local neighborhood of any element can be described by its tangent space. The tangent space T_e M_{[x̂,·]} at the identity element e = I_m, also called the Lie algebra g, is isomorphic to the space S^m of real symmetric matrices. Theoretically, the retraction mapping R(·) establishes a vector space isomorphism from (S^m, +) to (M_{[x̂,·]}, ⊙). We thus endow M_{[x̂,·]}, namely S_+^m, with a "Lie vector space structure". It is important to stress that, based on the linear structure of the Lie algebra, we can handle the covariance/scatter estimation fusion with Euclidean operations, i.e., arithmetic averaging in the Log-Euclidean domain, instead of Riemannian ones. This proceeds as follows: (i) move the MHDPs P̃_1, . . . , P̃_n into a neighborhood of the identity element e = I_m ∈ M_{[x̂,·]} via the left translation L_{P̂}(·) by the sought-for scatter estimate P̂, obtaining L_{P̂}(P̃_k) = P̂^{-1} ⊙ P̃_k; then shift all L_{P̂}(P̃_k) to g by the lifting mapping R_e^{-1}(·) at e, obtaining the vectors R_e^{-1}(P̂^{-1} ⊙ P̃_k) for k = 1, . . . , n. (ii) Owing to the linear structure of g, the arithmetic average of the R_e^{-1}(P̂^{-1} ⊙ P̃_k) minimizes the total displacement from each P̃_k to the sought-for P̂ on the Lie algebra g (see, e.g., [60]); hence (1/n) Σ_{k=1}^n R_e^{-1}(P̂^{-1} ⊙ P̃_k) = 0, or equivalently (60). Let the retraction mapping be R_e(·) = exp_e(·) and the lifting mapping R_e^{-1}(·) = log_e(·) on M_{[x̂,·]}, whose analytical expressions are given in the theorem below.
Theorem 6. On the submanifold M_{[x̂,·]}, the exponential and logarithmic mappings at a point Σ_1 are given by

exp_{Σ_1}(ν) = Σ_1 Exp(Σ_1^{-1} ν),   (61)
log_{Σ_1}(Σ_2) = Σ_1 Log(Σ_1^{-1} Σ_2),   (62)

for any Σ_1, Σ_2 ∈ S_+^m and ν ∈ T_{Σ_1} M_{[x̂,·]}. Proof. See Appendix F.
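A quick numerical check (our own sketch) that the two mappings of Theorem 6 are mutual inverses:

```python
import numpy as np
from scipy.linalg import expm, logm

def exp_map(S1, v):
    """exp_{S1}(v) = S1 Exp(S1^{-1} v)."""
    return S1 @ expm(np.linalg.solve(S1, v))

def log_map(S1, S2):
    """log_{S1}(S2) = S1 Log(S1^{-1} S2)."""
    return S1 @ np.real(logm(np.linalg.solve(S1, S2)))
```

Since S1^{-1} S2 is similar to an SPD matrix, its eigenvalues are real and positive, so the matrix logarithm is well defined and exp_map(S1, log_map(S1, S2)) recovers S2 exactly.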
Substituting (61) and (62) into (60) yields Σ_{k=1}^n Log(P̂^{-1} ⊙ P̃_k) = 0. Then, by P̂^{-1} ⊙ P̃_k = Exp(Log P̃_k − Log P̂) and the definition of ⊙, we have the fused scatter matrix estimate

P̂ = Exp( (1/n) Σ_{k=1}^n Log P̃_k ),   (64)

and the fused covariance matrix estimate Q̂ = κ_h P̂ (65).

Remark 4. As shown in [60], the explicit scatter matrix estimate (64) coincides with the arithmetic mean in the log-domain, i.e., the Log-Euclidean Fréchet mean.
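The explicit MHDP-E covariance step (64) is simply the Log-Euclidean mean. A minimal sketch, with plain SPD inputs standing in for the projected scatters P̃_k:

```python
import numpy as np
from scipy.linalg import expm, logm

def log_euclidean_mean(Ps):
    """Fused scatter as in (64): Exp of the arithmetic mean of matrix logs."""
    return expm(sum(np.real(logm(P)) for P in Ps) / len(Ps))
```

Unlike the iterative Riemannian mean, this estimate is computed in closed form with one logarithm per input and a single exponential, which is the source of the MHDP-E method's lower computational complexity.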
In summary, the fusion method above is called the MHDP-E method (i.e., Algorithm 3). For comparison, the MHDP-I and MHDP-E fusion algorithms share the same mean fusion (see Algorithm 1), but the latter possesses an explicit form (65) for the fused covariance matrix estimate. In addition, many existing fusion methods have the same form as the CI but differ in their weights. For example, the KLA uses equal weights 1/n, while many others, including the DCI, FCI, GP and QMMHD, have their own weights depending on the local estimates. Taking two sensors tracking a one-dimensional target as an instance, the fused estimates of the KLA, DCI, FCI, GP and QMMHD theoretically lie on the (straight) Euclidean line segment between the two local estimates, as shown in Figure 2 of Section 5.1. However, the MED manifold M with the Riemannian structure is generally not flat [47], so the proposed MHDP-I and MHDP-E are more reasonable, improving the covariance estimation fusion based on this geometric structure.

Numerical Examples
In this section, we provide two numerical examples, static and dynamic target tracking, to show the performance of the two fusion methods MHDP-I and MHDP-E. The traditional fusion algorithms DCI [16], FCI [15] and KLA [18], and the two information-geometric methods GP [42] and QMMHD [43], are taken for comparison.

Static Target Tracking
To intuitively evaluate the performance of these distributed fusion methods, we consider two local estimates (x_1, P_1) = (2, 5) and (x_2, P_2) = (4, 7) of a static target from two local sensors, similar to the settings in [42]. Note that the MHDP-I and MHDP-E methods yield the same covariance fusion here, since 1 × 1 matrices commute. In Figure 2, six subfigures display the geodesic distance between the fused density and the informative barycenter under six different MED assumptions. Here, any optimal solution of (29) lies on the geodesic segment linking (x_1, P_1) and (x_2, P_2); we uniquely determine the informative barycenter by minimizing the sum of its squared distances to the two local estimates. Since the MHDP-I and MHDP-E are developed by relaxing (29), examining the informative barycenter improves our understanding of their fusion performance as the heavy-tailed level varies.
(i) The MTD has the degree-of-freedom (DOF) parameter ν > 2 (the higher ν, the lower the heavy-tailed level), and approaches the Gaussian distribution as ν tends to infinity. As depicted in subfigures (a), (b) and (c), the MHDP-E is consistently superior to the other methods regardless of the DOF, since its fused density is closest to the informative barycenter. (ii) The MGGD has the shape parameter β > 0 (the higher β, the lower the heavy-tailed level) and reduces to the Gaussian distribution when β = 1. Similar to the MTD case, subfigures (d), (e) and (f) show that the MHDP-E outperforms the other fusion methods for different shape parameters.
Moreover, the aforementioned fusion methods DCI, FCI, KLA, GP and QMMHD have the same form as (45), but in general with different weights. Thus, in Figure 2 their fused estimates lie on the (straight) Euclidean segment between (x_1, P_1) and (x_2, P_2), while the MHDP-E places its estimate fairly close to the (curved) geodesic segment. The MHDP-I and MHDP-E are therefore considered more reasonable, because the MED manifold M is indeed non-Euclidean.

Dynamic Target Tracking
Consider a distributed dynamic system with n sensors and one fusion center for tracking a two-dimensional moving target, where n_0 sensors take one-dimensional measurements and n − n_0 sensors take multidimensional measurements, the state vector x_k consists of the position and velocity, and the sampling interval is T = 1 s. Following the practical scenarios of [49,50], the outlier-contaminated state and measurement noises are modeled as in (69)-(71), where "w.p." denotes "with probability". The local state estimate x̂_i(k|k) and the scatter matrix P_i(k|k) of the estimation error are computed by the specified filter when the i-th sensor obtains its own measurement at instant k. The fusion center fuses all local estimates to obtain the final estimate (x̂(k|k), P̂(k|k)) by a specified fusion method, and then transmits it back to each local sensor. To compare the various fusion methods, i.e., the DCI, FCI, KLA, GP, QMMHD, MHDP-I and MHDP-E, we use the Student's t based filter (STF) [49] and the Gaussian-Student's t mixture distribution based Kalman filter (GSTMDKF) [41] and evaluate the root mean squared errors (RMSEs) of the fused results in the following two cases: (i) Case I (two sensors, i.e., n_0 = n = 2): the variances Q = 1 and R_k^1 = R_k^2 = 16, with the measurement matrices and the initializations of the two filters specified accordingly. (ii) Case II (three sensors, i.e., n_0 = 2, n = 3): the system parameters Q, R_k^1, R_k^2, h_k^1, h_k^2, x̂(0|0), P(0|0) have the same settings as in Case I, with an additional multidimensional sensor. It is seen from (69)-(71) that the process and observation noises follow heavy-tailed distributions. The STF models the initial state vector and the noise distributions with u = 200 as in (75)-(78), by which each local posterior density can be represented by an MTD with DOF parameter ν = 3, i.e., ν_i = 3, i = 1, . . . , 4.
In the GSTMDKF setup, a Gaussian-Student's t mixture distribution (see [41] for more details) with fixed DOF 3 is used to model the heavy-tailed noises (69)-(71) with u = 100; the initial state vector follows a Gaussian distribution, i.e., x_0 ∼ N(x̂(0|0), P(0|0)), and the other parameters are the same as in [41]. As explained in [41], the GSTM distribution is as efficient as the Student's t distribution in the presence of outliers, and thus the proposed MHDP-E and MHDP-I methods fuse all posterior densities by modeling them as MTDs with the common DOF parameter 3. Note that the DCI, FCI, KLA and GP utilize only the first two moments (i.e., x̂_i(k|k) and κ_h P_i(k|k) for the i-th local posterior density), and the FCI, KLA and GP fusion methods are developed under the Gaussian assumption, so that the posterior density of the i-th sensor is modeled as N(x̂_i(k|k), κ_h P_i(k|k)). Using the two filters STF and GSTMDKF, Figure 3 shows the RMSEs of the position over 100 time steps and 500 Monte Carlo runs in the two cases for the aforementioned fusion methods; the DCI is omitted from Figure 3 because it performs worst. We can see that the proposed MHDP-I and MHDP-E are consistently better than the other fusion methods whichever filtering method is adopted. Moreover, Table 1 reports the averaged root mean squared errors (ARMSEs) of the various methods in both cases. The performance of all fusion methods in the three-sensor case improves significantly over the two-sensor case, and the MHDP-I performs slightly better than the MHDP-E when using the GSTMDKF. In contrast to the MHDP-I, however, the MHDP-E has lower computational complexity owing to its explicit expression for the fused covariance estimate.

Conclusions
In this work, we formulated an information-geometric fusion criterion, taking the geodesic distance as the loss function, and implemented the fusion of local posterior densities on the MED manifold endowed with the Fisher metric. Two distributed estimation fusion algorithms, MHDP-I and MHDP-E, were proposed by using the minimal Manhattan distance instead of the geodesic distance on the manifold of MEDs; both share the same mean estimation fusion but differ in the form of the covariance estimation fusion. On the MED submanifold with a fixed mean, the MHDP-I fuses all MHD-based projections of the local posterior densities by minimizing the squared geodesic distances from a sought-for fused density, for which we developed a robust fixed point iterative algorithm with theoretical convergence. By introducing a specified Lie algebraic structure on this submanifold and deriving its exponential and logarithmic mappings, the MHDP-E provides an explicit expression for the fused covariance estimate. Numerical examples have shown that the two proposed information-geometric methods outperform five other fusion methods.

Appendix

Then, the squared line element along the geodesic takes the form of (11) after the transformation (A1); comparing with (11), we obtain s̃ = 2/(12b_h − 1) s, where s is the arc length parameter before the transformation (A1). Therefore, by Theorem 6.5 in [62], the geodesic distance between p_1 and p_2 follows. To that end, using Theorem 6.2 in [62], we obtain the geodesic curve on the MED manifold as µ(s) = (δ_1 + 2δ_2 tanh(s/d_h + ε))/ ..., where δ_1, δ_2 and ε are determined by the boundary conditions. Note that for the case µ_1 = µ_2, the geodesic is given by (15) and (16), i.e., the logarithmic mapping. As a result, replacing the left-hand side of (A30) with ν ∈ T_{Σ_1} M_{[x̂,·]}, we obtain the exponential mapping exp_{Σ_1}(ν) = Σ_1 Exp(Σ_1^{-1} ν). (A31)