Conservative Quantization of Covariance Matrices with Applications to Decentralized Information Fusion †

Information fusion in networked systems poses challenges with respect to both theory and implementation. Limited available bandwidth can become a bottleneck when high-dimensional estimates and associated error covariance matrices need to be transmitted. Compression of estimates and covariance matrices can endanger desirable properties like unbiasedness and may lead to unreliable fusion results. In this work, quantization methods for estimates and covariance matrices are presented and their usage with the optimal fusion formulas and covariance intersection is demonstrated. The proposed quantization methods significantly reduce the bandwidth required for data transmission while retaining unbiasedness and conservativeness of the considered fusion methods. Their performance is evaluated using simulations, showing their effectiveness even in the case of substantial data reduction.


Introduction
Interconnected sensor systems can gather more data, are more robust to faults and outliers, and can cover larger regions than a single sensor system. Such networked systems can also benefit from heterogeneous sensing modalities and parameterizations. Typical examples are wireless sensor networks which are used, for instance, in environmental monitoring [1][2][3], building automation [4,5], or moving object tracking [6,7]. A single node in a wireless sensor network often has limited energy, processing, and storage resources, and the wireless transmission of data is the most energy-intensive operation performed by the node while processing sensor data exhibits relatively low energy demands [8]. Even for networked systems that do not use wireless data transmission or that have sufficient energy resources, communication can be a limiting factor when nodes need to transmit large-scale estimates, which may occur in cooperative map building [9], cooperative localization [10,11], or multi-object tracking [12].
From the accruing sensor data, the interconnected devices can compute state estimates locally, e.g., by employing Kalman filter methods. Such estimates are typically supplied with error covariance matrices, which need to be transmitted and stored alongside the estimates, to be able to assess their uncertainty and combine them reliably. Therefore, reducing the amount of transmitted data through prior compression is key to meeting bandwidth limitations when high-dimensional state estimates are exchanged and to ensure long operating times of battery-driven sensor nodes when wireless data transmission is used. A comprehensive survey of lossless and lossy compression methods that are suitable for wireless sensor networks is given in [13]. The surveyed methods include multiple probabilistic quantization-based approaches [14][15][16][17] tailored to estimation problems. However, only scalar estimates are considered and their respective variances are assumed to be known to the receiver. Quantization as a means of data reduction has also been applied to Kalman filtering, a prominent example being the sign of innovations Kalman filter [18,19]. Again, the required covariance matrices are assumed to be known to the receiver. In contrast to the previous works, it is assumed in [20] that the receiver has no prior knowledge of the covariance matrices, which therefore need to be transmitted via the network. The authors develop data reduction methods for covariance matrices based on conservative diagonal approximations and, in [21], they investigate techniques to select subsets of the information to be transmitted. These methods assess how the selected information contributes to the receiver's estimation quality.
In this paper, similarly to [20], we assume that the receiver has no knowledge about the covariance matrix at the transmitter. Hence, the sender has to prepare both its estimate and covariance matrix for transmission. In general, the covariance matrix will not be diagonal and dominates the amount of data that needs to be transmitted as its number of elements grows quadratically with the dimension of the estimate. The individual quantization of each coefficient constitutes a promising approach to reduce the data. However, such a quantized covariance matrix, in general, does not reliably account for the uncertainty of the estimate and can even violate the positive definiteness of the covariance matrix. For this reason, we study and compare two different approaches to compute a quantized covariance matrix that conservatively bounds the actual error covariance matrix. The first scheme employs a quantization based on diagonal dominance while the second scheme relies on a modified Cholesky decomposition. To further compress the data to be transmitted, we also investigate a quantization of the estimates. As typical fusion algorithms rely on unbiasedness, we employ a quantizer that preserves this property. The proposed quantization schemes yield conservative estimates that reliably assess the estimation error and can be further processed at the receiver.
At the receiver, the estimates are typically fed to a fusion algorithm to combine them with other estimates and to improve the estimation accuracy. Fusion algorithms that strive to minimize the error of the fusion result need access to the covariance matrices of the input estimates. Optimal fusion algorithms [22,23] can be designed if cross-correlations between the estimates are also known. They typically require the transmission of additional information [24] or specific communication strategies [25,26]. In the case where correlations are unknown, conservative fusion algorithms compute a bound on the actual but unknown error covariance matrix of the fusion results. Examples of such algorithms are covariance intersection (CI) [27], fast covariance intersection (FCI) [28,29], and inverse covariance intersection (ICI) [30,31], which are guaranteed to produce results with a conservative uncertainty quantification in the form of a covariance matrix. Other algorithms such as ellipsoidal intersection (EI) [32,33] provide no such guarantee but are typically less conservative. In this paper, we study how the proposed quantization schemes integrate with fusion algorithms and consider both optimal fusion and covariance intersection. This paper is an extended version of [34], which proposed a quantization technique for covariance intersection. Here, we study the use of quantization in a broader sense to cover different fusion algorithms, and we propose an additional quantization scheme for covariance matrices that provides tighter bounds at the expense of higher computational demand. In total, this paper's contributions address three different aspects:
Estimate Quantization. We extend the probabilistic quantization method from [15,35] to vector-valued correlated random variables in order to generate unbiased quantized estimates [34] and conservative bounds on their error covariance matrices.
Covariance Quantization. We propose two approaches to the conservative quantization of covariance matrices. The first scheme uses diagonal dominance [34]. As an alternative, we study a modified Cholesky decomposition and compare it to the first approach.
Fusion of Estimates. We apply the quantization schemes to both an optimal fusion algorithm and covariance intersection in order to demonstrate that reliable estimates are attained.
Implementations of the proposed quantization schemes, written in Python, are provided as supplementary material, the link to which can be found at the end of the paper.

Notation
Lowercase letters $x \in \mathbb{R}$ denote scalar quantities and additional underlining $\underline{x} \in \mathbb{R}^n$ indicates $n$-dimensional vector-valued quantities. The standard basis vectors of $\mathbb{R}^n$ are $\underline{e}_1, \ldots, \underline{e}_n$ and $\underline{0}_n$ denotes the $n$-dimensional zero vector. Bold uppercase letters indicate $n \times n$ matrices such as $\mathbf{X} \in \mathbb{R}^{n \times n}$. The $i$th coefficient of a vector and the $(i,j)$th coefficient of a matrix are $x^i$ and $X^{i,j}$, respectively. In addition, index ranges such as $i{:}j$, $i{:}$, and ${:}i$ are used to extract subvectors and submatrices. As an example, $\underline{x}^{i:j}$ is the subvector containing the $i$th to the $j$th coefficient of $\underline{x}$ and $\mathbf{X}^{:i,j}$ are the first $i$ coefficients of the $j$th column of $\mathbf{X}$. The use of boldface as in $\mathbf{x} \in \mathbb{R}$ and $\underline{\mathbf{x}} \in \mathbb{R}^n$ indicates random scalars and random vectors, respectively. Uppercase calligraphic letters $\mathcal{A}$ indicate sets. In particular, $\mathcal{S}^n$ is used to denote the set of symmetric matrices in $\mathbb{R}^{n \times n}$ and $\mathcal{S}^n_+$ to denote the set of symmetric positive semi-definite (PSD) matrices in $\mathbb{R}^{n \times n}$. For $\mathbf{X} \in \mathcal{S}^n_+$ and $\mathbf{Y} \in \mathcal{S}^n_+$, the notation $\mathbf{X} \preceq \mathbf{Y}$ signifies that $\mathbf{Y} - \mathbf{X} \in \mathcal{S}^n_+$. If $\mathbf{X} \preceq \mathbf{Y}$, then $\mathbf{Y}$ is called an upper bound for $\mathbf{X}$. For (conditional) expectations the symbols $E(\cdot)$ and $E(\cdot \mid \cdot)$ are used. The unconditional covariance between two random quantities is designated by $C(\cdot, \cdot)$ or by $C(\cdot)$ if the arguments are identical. Similarly, the conditional covariance is denoted by $C(\cdot, \cdot \mid \cdot)$ or $C(\cdot \mid \cdot)$.

Considered Problem
The process of quantization approximates a continuous quantity using a discrete one. In this work, we consider the quantization of covariance matrices and estimates with the goal of reducing the bandwidth and storage requirements on an interconnected sensor system. We demonstrate how optimal fusion and covariance intersection can be applied to quantized data while retaining some of their desirable properties.
For our purposes, a quantizer is a map $q : \mathcal{D} \to \mathcal{C}$, where the domain $\mathcal{D}$ is a closed, coefficient-wise bounded subset of either $\mathbb{R}^n$ or $\mathbb{R}^{n \times n}$, and the codomain $\mathcal{C}$, the so-called codebook, is a finite set. Quantizing covariance matrices for use in fusion methods is not straightforward. Naive coefficient-wise quantization of a covariance matrix can lead to a result that underestimates the uncertainty encoded in the original covariance matrix, or even worse, is not a valid covariance matrix anymore. This can cause divergence in certain estimation algorithms [27]. Ideally, the quantized covariance matrix $q(X)$ should be an upper bound on, that is, a conservative estimate of, the original matrix $X$ in the sense that $q(X) \succeq X$ holds. This averts divergence and guarantees that the confidence ellipsoid induced by $q(X)$ contains the one induced by $X$ [27]. The described situation is illustrated on the left side of Figure 1. Conservative quantization of covariance matrices can be achieved by enforcing certain conditions on the quantization error, as will be discussed in Section 4.
Similarly to quantizing covariance matrices, quantizing estimates for use in fusion methods creates certain challenges. Deterministic quantization of an estimate can introduce bias and additional noise, which (1) biases the results obtained from fusion methods and (2) invalidates the covariance matrix associated with the estimate. This is visualized on the right side of Figure 1. Both of the aforementioned issues can be addressed by using randomized quantizers, which will be discussed in Section 5 in the context of applying fusion methods to quantized data.
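The failure mode of naive coefficient-wise quantization can be made concrete with a small numerical sketch. The matrix, the grid spacing, and the helper `round_to_grid` are illustrative assumptions and do not appear in the paper; the grid is an unbounded uniform grid rather than one of the finite codebooks introduced later.

```python
import numpy as np

def round_to_grid(X, step):
    """Naive coefficient-wise rounding to the nearest multiple of `step`
    (hypothetical helper, unbounded uniform grid)."""
    return step * np.round(X / step)

# A valid covariance matrix: symmetric with positive eigenvalues.
X = np.array([[2.2, 1.3],
              [1.3, 0.8]])

Q = round_to_grid(X, step=0.5)       # [[2.0, 1.5], [1.5, 1.0]]

eig_Q = np.linalg.eigvalsh(Q)        # ascending eigenvalues of the rounded matrix
eig_err = np.linalg.eigvalsh(Q - X)  # ascending eigenvalues of the quantization error

print("min eigenvalue of Q:    ", eig_Q[0])    # negative: Q is not a covariance matrix
print("min eigenvalue of Q - X:", eig_err[0])  # negative: Q is not an upper bound on X
```

In this example the rounded matrix has a negative determinant, so it is neither PSD nor conservative, which is exactly the situation the conservative quantizers of Section 4 are meant to rule out.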

Conservative Quantization of Covariance Matrices
In the following, conservative quantizers for covariance matrices, i.e., symmetric positive semi-definite (PSD) matrices, are derived. To that end, let $q_c : \mathcal{D}_c \to \mathcal{C}_c$ be a quantizer that maps PSD matrices from a coefficient-wise bounded and closed set $\mathcal{D}_c \subset \mathcal{S}^n_+$ to a finite codebook $\mathcal{C}_c \subset \mathbb{R}^{n \times n}$. The quantizer $q_c$ should satisfy the condition
$$\forall X \in \mathcal{D}_c : \; q_c(X) \succeq X \qquad (1)$$
to ensure that the quantized matrix $q_c(X)$ is an upper bound on the original matrix $X$. With the quantization error defined as $\Delta(X) = q_c(X) - X$, this can also be expressed as
$$\forall X \in \mathcal{D}_c : \; \Delta(X) \succeq 0 . \qquad (2)$$
In other words, the quantization error must always be PSD for the quantized matrix to be an upper bound on the original matrix. Ideally, the quantizer should not only produce conservative results but should also minimize the total quantization error $\|\Delta(X)\|_F$. This requires enumerating all elements in $\mathcal{C}_c$ in the worst case and is thus not computationally feasible, even for relatively small matrices. Practical quantizers will therefore not be able to minimize the total quantization error exactly. To remain computationally tractable, the quantizers considered in this work operate in two steps: First, the off-diagonal coefficients are individually rounded to the nearest codeword in some off-diagonal codebook $\mathcal{C}_o \subset \mathbb{R}$. Then, the diagonal coefficients are individually rounded up to a codeword in some diagonal codebook $\mathcal{C}_d \subset \mathbb{R}$ so as to make the quantization error PSD.

Covariance Quantization Based on Diagonal Dominance
The rounding method for diagonal coefficients considered in this section is based on the notion of diagonal dominance. Diagonal dominance is a simple sufficient condition for a symmetric matrix, such as the quantization error matrix $\Delta(X)$, to be PSD. A symmetric matrix $X \in \mathcal{S}^n$ is said to be diagonally dominant if
$$X^{i,i} \ge \sum_{j \neq i} |X^{i,j}| \qquad (3)$$
holds for each row $i = 1, \ldots, n$. The connection between diagonal dominance and positive semi-definiteness is obtained immediately by applying the Gershgorin circle theorem [36] to a diagonally dominant matrix to lower bound its eigenvalues.
Theorem 1. Let $X \in \mathcal{S}^n$ be diagonally dominant, then $X \succeq 0$ holds.
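The Gershgorin argument behind Theorem 1 can be spelled out in one line. The sketch below assumes that diagonal dominance means the usual row-wise condition, i.e., each diagonal coefficient is at least the sum of the absolute off-diagonal coefficients in its row; $R_i$ is a symbol introduced here for the Gershgorin radius.

```latex
% Gershgorin: every (real) eigenvalue \lambda of the symmetric matrix X lies in
% some interval |\lambda - X^{i,i}| \le R_i with radius R_i = \sum_{j \neq i} |X^{i,j}|.
% Diagonal dominance, X^{i,i} \ge R_i for all i, therefore gives
\lambda \;\ge\; X^{i,i} - \sum_{j \neq i} \lvert X^{i,j} \rvert \;\ge\; 0
\qquad \text{for some } i \in \{1, \ldots, n\},
% so all eigenvalues are nonnegative and X \succeq 0.
```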
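The approach to conservative quantization of a PSD matrix $X \in \mathcal{D}_c$ pursued here is to first quantize the off-diagonal coefficients of $X$ using a codebook $\mathcal{C}_o \subset \mathbb{R}$ and to then quantize the diagonal coefficients using a codebook $\mathcal{C}_d \subset \mathbb{R}$ such that (3) is satisfied for the quantization error $\Delta(X)$. This leads to the quantizer
$$q_c(X)^{i,j} = \begin{cases} \mathrm{rd}(X^{i,j}) & \text{if } i \neq j , \\ \Big\lceil X^{i,i} + \sum_{k \neq i} \big| \mathrm{rd}(X^{i,k}) - X^{i,k} \big| \Big\rceil & \text{if } i = j , \end{cases} \qquad (4)$$
where $\mathrm{rd}(\cdot)$ rounds to the nearest codeword in the off-diagonal codebook $\mathcal{C}_o$ and $\lceil \cdot \rceil$ rounds up to the nearest codeword in the diagonal codebook $\mathcal{C}_d$. The codebooks are given by (5) and (6), where $x_{\max}$ is the maximum off-diagonal codeword.
Proof. The quantization error of an off-diagonal coefficient is $\delta_o / 2$ at most. Therefore, a common upper bound holds for all perturbed diagonal coefficients. Since this bound equals $\max(\mathcal{C}_d)$, rounding up the perturbed diagonal coefficients is always possible, and the claim holds.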
When not stated otherwise, the above conditions for a well-defined $q_c$ are implicitly assumed to hold. The next theorem confirms that the output of $q_c$ is indeed an upper bound for its input.
Theorem 3. The quantizer $q_c : \mathcal{D}_c \to \mathcal{C}_c$ proposed above has PSD quantization error $\Delta(X) = q_c(X) - X$ for all $X \in \mathcal{D}_c$ and is thus conservative, that is, $q_c(X) \succeq X$ holds for all $X \in \mathcal{D}_c$.
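Proof. In the following, we omit the dependence of $\Delta(X)$ on $X$ for brevity. The off-diagonal quantization errors are $\Delta^{i,j} = \mathrm{rd}(X^{i,j}) - X^{i,j}$ and the diagonal ones are
$$\Delta^{i,i} = \Big\lceil X^{i,i} + \sum_{j \neq i} |\Delta^{i,j}| \Big\rceil - X^{i,i} .$$
By the definition of $\lceil \cdot \rceil$ we have
$$\Delta^{i,i} \ge \sum_{j \neq i} |\Delta^{i,j}| ,$$
and the claim follows from Theorem 1.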
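Furthermore, the quantizer $q_c$ introduced above is optimal in the sense that, given codebooks $\mathcal{C}_d$ and $\mathcal{C}_o$, there is no quantizer with symmetric diagonally dominant quantization error $\Delta(X)$ that has a smaller total quantization error $\|\Delta(X)\|_F$.
Theorem 4. Let $q_c : \mathcal{D}_c \to \mathcal{C}_c$ be defined by (4) with coefficient-wise codebooks $\mathcal{C}_d$ and $\mathcal{C}_o$ defined by (5) and (6). Given $X \in \mathcal{D}_c$, the quantization error $\Delta(X) = q_c(X) - X$ is the minimizer of the problem
$$\min_{\Delta \in \mathcal{S}^n} \|\Delta\|_F^2 \quad \text{subject to} \quad \Delta^{i,i} \ge \sum_{j \neq i} |\Delta^{i,j}| , \quad X^{i,j} + \Delta^{i,j} \in \mathcal{C}_o \ (i \neq j) , \quad X^{i,i} + \Delta^{i,i} \in \mathcal{C}_d ,$$
where the dependency of $\Delta(X)$ on $X$ has been omitted for brevity.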
Proof. The problem can be reformulated as a nested minimization, the inner one being over the diagonal coefficients given the off-diagonal coefficients and the outer one being over the off-diagonal coefficients given the solutions $\Delta^{i,i}_*$ of the inner optimization. Furthermore, the inner minimization can be split into decoupled minimizations over the individual diagonal coefficients, each subject to $\Delta^{i,i} \ge \sum_{j \neq i} |\Delta^{i,j}|$ and $X^{i,i} + \Delta^{i,i} \in \mathcal{C}_d$. The values
$$\Delta^{i,i}_* = \Big\lceil X^{i,i} + \sum_{j \neq i} |\Delta^{i,j}| \Big\rceil - X^{i,i}$$
are the optimal solutions to these subproblems. They exist because $q_c$ is well defined. The minimum cost of each decoupled problem is thus $|\Delta^{i,i}_*|^2$, which is non-decreasing in each $|\Delta^{i,j}|$. Using this intermediate result, the outer minimization problem can be seen to attain its minimum by separately minimizing the $|\Delta^{i,j}|^2$, as due to the non-decreasing property, $|\Delta^{i,i}_*|^2$ is minimal if each $|\Delta^{i,j}|^2$ is minimal. Thus, the minimum is, by definition of $\mathrm{rd}(\cdot)$, attained by setting $\Delta^{i,j}_* = \mathrm{rd}(X^{i,j}) - X^{i,j}$.
Although the above quantizer minimizes the conservativeness of the quantized matrix in the sense of Theorem 4, the inequalities (3) are only sufficient and not necessary for the quantization error to be PSD. Hence, the results of this method are usually more conservative than necessary. The proposed quantizer has a low computational complexity of $O(n^2)$ because the matrix coefficients are quantized individually.
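The two-step procedure of this section (round the off-diagonal coefficients, then shift and round up the diagonal) can be sketched in a few lines of NumPy. This is a minimal sketch and not the supplementary implementation: it assumes unbounded uniform grids with spacings `step_o` and `step_d` instead of the finite codebooks, so rounding up is always possible, and the function name `quantize_dd` is hypothetical.

```python
import numpy as np

def quantize_dd(X, step_o, step_d):
    """Diagonal-dominance-based conservative quantization (sketch).

    Assumption: codebooks are unbounded uniform grids (multiples of
    step_o / step_d), not the finite codebooks used in the paper.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    off = ~np.eye(n, dtype=bool)          # mask of off-diagonal positions

    Q = np.zeros_like(X)
    # Step 1: round every off-diagonal coefficient to the nearest grid point.
    Q[off] = step_o * np.round(X[off] / step_o)

    # Step 2: add the absolute off-diagonal errors of each row to the diagonal
    # and round the result up, so that Q - X becomes diagonally dominant.
    err = np.abs(Q - X)
    np.fill_diagonal(err, 0.0)            # only off-diagonal errors count
    shifted = np.diag(X) + err.sum(axis=1)
    Q[np.diag_indices(n)] = step_d * np.ceil(shifted / step_d)
    return Q

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
X = A @ A.T                               # random PSD test matrix
Q = quantize_dd(X, step_o=0.25, step_d=0.25)
print(np.linalg.eigvalsh(Q - X).min())    # nonnegative up to rounding error
```

Each coefficient is touched a constant number of times, which matches the quadratic complexity stated above.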

Covariance Quantization Based on Modified Cholesky Decomposition
The covariance matrix quantization approach presented in the previous section is computationally efficient but can be overly conservative. An alternative quantizer that is guaranteed to be no more conservative, at the cost of increased computational expense, is presented in this section. The basic approach of first quantizing the off-diagonal coefficients and then finding quantized diagonal coefficients that make the overall quantization result conservative is retained. However, instead of employing diagonal dominance to find the quantized diagonal coefficients, a modified Cholesky factorization adopted from [37] is leveraged. We motivate the proposed method by first introducing the Cholesky decomposition in conjunction with a result relating its existence to positive semi-definiteness [36] (Corollary 7.2.9).
Theorem 5. Let $X \in \mathcal{S}^n$ be a symmetric matrix and $P \in \mathbb{R}^{n \times n}$ a permutation matrix. (Permutation matrices are orthogonal matrices that arise by permuting the rows and columns of an identity matrix. Matrix multiplication with a permutation matrix permutes either the rows or the columns of the other matrix, depending on the order of multiplication.) Then there is a lower triangular matrix $L \in \mathbb{R}^{n \times n}$ with nonnegative diagonal coefficients such that
$$P X P^\top = L L^\top \qquad (11)$$
holds if and only if $X$ is positive semi-definite. The above factorization is called a (pivoted) Cholesky decomposition of $X$ with Cholesky factor $L$.
Should $X$ not be PSD, a so-called modified Cholesky decomposition can be performed to find a diagonal nonnegative matrix $D$ such that a Cholesky decomposition $P(X + D)P^\top = LL^\top$ exists [37][38][39]. In the following, the basic recursive approach to simultaneously compute the matrices $D$, $P$, and $L$ is introduced, based on the exposition in [37]. The recursion begins by setting $X_1 = X$. The computations
$$\tilde{X}_k = P_k X_k P_k^\top + s_k e_k e_k^\top , \qquad X_{k+1} = \tilde{X}_k - x_k x_k^\top$$
are then performed for $k = 1, \ldots, n$. Each $P_k$ is a permutation matrix swapping two rows and columns such that $(P_k X_k P_k^\top)^{k,k} = X_k^{m,m}$ with $m \ge k$ determined according to some criterion. For now we will assume $m = k$ so that $P_k = I$. The $s_k$ are nonnegative perturbations applied to the $k$th diagonal coefficient of $P_k X_k P_k^\top$. The vector $x_k \in \mathbb{R}^n$ is selected to cancel the $k$th row and column of $\tilde{X}_k$. This is achieved by letting
$$x_k = \begin{cases} \tilde{X}_k^{:,k} \big/ \sqrt{\tilde{X}_k^{k,k}} & \text{if } \tilde{X}_k^{k,k} > 0 , \\ 0_n & \text{otherwise,} \end{cases}$$
and results in the $k - 1$ upper-most/left-most rows and columns of $X_k$ being zero.
For the recursion to terminate successfully, $s_k$ must either be such that $\tilde{X}_k^{k,k}$ is positive or such that $\tilde{X}_k^{k,k:}$ is zero. This is always possible, as $s_k$ can be arbitrarily large. Unraveling the recursion up to $X_{n+1}$ and using the fact that $X_{n+1} = 0$ gives an expression (15) that can be written in the more condensed form
$$P (X + D) P^\top = L L^\top \qquad (16)$$
by introducing $P = \prod_{k=1}^{n} P_{n-k+1}$, $L = [\, l_1 \ \cdots \ l_n \,]$ with $l_k = \prod_{j=1}^{n-k} P_{n-j+1} x_k$, and a suitably defined matrix $D$. It can be shown that $P$ is a permutation matrix, $L$ is lower triangular with nonnegative diagonal coefficients, and $D$ has its diagonal populated with the $s_k$ and is otherwise zero. Hence, (16) is a Cholesky decomposition of $X + D$ and according to Theorem 5, $X + D$ must be positive semi-definite. Note that the above recursion can be computed in-place essentially like an ordinary Cholesky decomposition (see for instance [40] (Algorithm 4.2.2)), the main differences being the diagonal shifts $s_k$ and allowing $x_k = 0_n$.
We will now describe how the above approach can be applied to finding quantized diagonal coefficients that make the overall quantization result conservative. For that, first quantize each off-diagonal coefficient of the given PSD matrix $X \in \mathcal{D}_c$ by rounding it to the nearest codeword in $\mathcal{C}_o = \{x_{\max} - k\delta_o \mid 0 \le k < 2^b\}$ (see Section 4.1), resulting in the preliminary quantized matrix $X_o(X)$ with
$$X_o(X)^{i,j} = \begin{cases} \mathrm{rd}(X^{i,j}) & \text{if } i \neq j , \\ X^{i,i} & \text{if } i = j , \end{cases}$$
and preliminary quantization error $\Delta_o(X) = X_o(X) - X$. Note that in the remainder of this section we will omit the dependence of $X_o(X)$ and $\Delta_o(X)$ on $X$ for brevity. Since the diagonal elements of $\Delta_o$ are zero, $\Delta_o$ cannot be PSD unless it is zero, as can be easily verified. Then the modified Cholesky decomposition of $\Delta_o$ is computed, giving a diagonal matrix $D$ such that $\Delta_o + D \succeq 0$ holds. Adding $D$ to $X_o$ and rounding the diagonal coefficients up to the nearest codeword in the diagonal codebook $\mathcal{C}_d$ yields the quantized matrix $q_c(X)$. Assume for the moment that the quantizer defined above is well-defined, i.e., that there always are codewords in $\mathcal{C}_d$ that the perturbed diagonal elements can be rounded up to. In that case, the following theorem applies.
Theorem 6. The quantizer $q_c : \mathcal{D}_c \to \mathcal{C}_c$ as defined above is conservative.
Proof. Due to the modified Cholesky decomposition, it holds that $X_o - X + D = \Delta_o + D \succeq 0$. By rounding the diagonal of $X_o + D$ up, we get $X_o + D + \Delta_d$ where $\Delta_d$ is diagonal and nonnegative and thus $\Delta_d \succeq 0$. Therefore, $X_o + D + \Delta_d - X \succeq X_o + D - X \succeq 0$ or equivalently $q_c(X) \succeq X$ holds.
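The construction used in Theorem 6 can be illustrated with a simplified modified Cholesky recursion. The sketch below makes several assumptions that differ from the method in the paper: no pivoting is performed (every $P_k = I$), the shift $s_k$ follows a simple Gershgorin-type rule chosen here for illustration rather than the rule adopted from [37], and the codebooks are unbounded uniform grids. It therefore demonstrates conservativeness only and does not reproduce the bounds of Theorem 7. The function names are hypothetical.

```python
import numpy as np

def modified_cholesky_shifts(A, tol=1e-12):
    """Return shifts s >= 0 such that A + diag(s) is PSD (unpivoted sketch).

    Assumption: s_k lifts the k-th pivot to at least the sum of the absolute
    remaining entries in its row. This is NOT the rule from [37].
    """
    W = np.array(A, dtype=float)          # working copy, plays the role of X_k
    n = W.shape[0]
    s = np.zeros(n)
    for k in range(n):
        radius = np.abs(W[k, k + 1:]).sum()
        s[k] = max(0.0, radius - W[k, k])
        pivot = W[k, k] + s[k]
        if pivot > tol:
            col = W[k + 1:, k].copy()
            # Outer-product update that cancels the k-th row and column.
            W[k + 1:, k + 1:] -= np.outer(col, col) / pivot
        # Otherwise the remaining row is (numerically) zero and x_k = 0.
    return s

def quantize_mc(X, step_o, step_d):
    """Conservative quantization via diagonal shifts from the sketch above."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    X_o = step_o * np.round(X / step_o)   # round all coefficients ...
    np.fill_diagonal(X_o, np.diag(X))     # ... but keep the original diagonal
    delta_o = X_o - X                     # preliminary error, zero diagonal
    s = modified_cholesky_shifts(delta_o)
    Q = X_o.copy()
    Q[np.diag_indices(n)] = step_d * np.ceil((np.diag(X) + s) / step_d)
    return Q

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
X = A @ A.T
Q = quantize_mc(X, step_o=0.5, step_d=0.5)
print(np.linalg.eigvalsh(Q - X).min())    # nonnegative up to rounding error
```

Because the shifts are computed from Schur complements of the preliminary error rather than from its raw rows, the cubic cost of the factorization appears here as the outer-product update inside the loop.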
So far, the diagonal perturbations s k have been assumed to be almost arbitrary. In order to guarantee that the quantizer is well-defined, we adopt the specific choice described in [37]. This has several advantageous implications as can be seen from Theorem 7 [37] (Theorem 5.1.2) and the subsequent corollaries.

Theorem 7.
If s k is chosen as in (20) to compute the modified Cholesky decomposition of some X ∈ S n , then X + D is positive semi-definite (it can be rank-deficient) and the upper bound holds for k = 1, . . . , n. The result is valid, provided each P k swaps the kth row and column only with some subsequent row and column.
Applying the above theorem to the quantizer proposed in this section, it is evident that with the given choice of s k , the diagonal perturbations are always smaller than or equal to the maximum perturbation required by the quantization approach from Section 4.1.
Inspecting the proof in [37], it can indeed be deduced that every $s_k$ is smaller than or equal to its corresponding perturbation (after permutation using the $P_k$) in the diagonal dominance based approach.
Proof. By applying Theorem 7 to $\Delta_o$, it follows that $s_k \le \max_j \sum_{i=1, i \neq j}^{n} |\Delta_o^{i,j}|$ holds for $k = 1, \ldots, n$. Comparing to (4), $s_k$ can be seen to be smaller than or equal to the maximum amount added to diagonal coefficients by the quantizer from Section 4.1. Hence, following the same argument as for Theorem 2 and using the bound on $s_k$, the claim follows.

Corollary 2. The quantizer $q_c : \mathcal{D}_c \to \mathcal{C}_c$ proposed in this section has lower or identical total quantization error $\|\Delta\|_F$ compared to the quantizer from Section 4.1.
Proof. By the discussion above, the diagonal perturbations $s_k$ are smaller than or equal to those used by the quantizer based on diagonal dominance. This translates into a reduced absolute deviation from the original diagonal elements, also after rounding up. The off-diagonal quantization errors are identical. Hence, the total quantization error is less than or equal to that of the quantizer based on diagonal dominance.
The final ingredient in the above quantization approach is the choice of the permutation matrices $P_k$ which, up to now, have been assumed to be identity matrices. We adopt the choice of $P_k$ proposed in [37], which greatly improved performance in our experiments. Each matrix $P_k$ is chosen in order to swap two rows and columns such that $(P_k X_k P_k^\top)^{k,k} = X_k^{m,m}$ with $m = \arg\max_{k \le i \le n} g_k^i$, where $g_k$ is a recursively computed vector that is initialized at $k = 1$ and updated for each $k = 1, \ldots, n$ after choosing the respective $P_k$ and $s_k$ (see [37] for the initialization and update rules). These changes do not affect the theoretical results, as they only pertain to the strategy of choosing $P_k$. We do not take into account the remaining modifications proposed in [37], as they do not seem to improve performance in our case. Due to the modified Cholesky decomposition, the quantizer introduced in this section has computational complexity of $O(n^3)$, compared to the $O(n^2)$ complexity of the quantizer from Section 4.1.

Applications to Information Fusion
The goal of any fusion algorithm is to combine estimates x a ∈ R n and x b ∈ R n of the same random quantity x ∈ R n to obtain an, in some sense, improved estimate x f ∈ R n of x. Typically, the estimates x a and x b are provided in conjunction with error covariance matrix estimates C aa ≈ C(x a − x) and C bb ≈ C(x b − x) and the fusion method uses them to compute an error covariance estimate C f f ≈ C(x f − x) for the fused estimate x f .
In the following, two fusion methods from the literature that are unbiased ($E(x_f) = E(x)$) and conservative ($C_{ff} \succeq C(x_f - x)$) under certain conditions are introduced, and their application with quantized error covariance matrices and estimate vectors is considered. In that context, a quantizer for unbiased estimate vectors is derived that retains their unbiasedness and provides a conservative estimate of the error covariance matrix of the quantization result. Finally, it is demonstrated how the covariance quantization methods from Section 4 can be applied in conjunction with the unbiased estimate quantizer to retain unbiasedness and conservativeness of the fusion methods.

Optimal Fusion and Covariance Intersection
In the following, let $x_a \in \mathbb{R}^n$ and $x_b \in \mathbb{R}^n$ be unbiased estimates of some random vector $x \in \mathbb{R}^n$ and let
$$C = \begin{bmatrix} C_{aa} & C_{ab} \\ C_{ba} & C_{bb} \end{bmatrix}$$
be their joint error covariance matrix. The fused estimate and its associated estimated error covariance matrix will be denoted by $x_f$ and $C_{ff}$, respectively. Note that $C_{ff}$ need not necessarily be identical to the actual error covariance matrix $C(x_f - x)$. In case the cross-covariance $C_{ab}$ is known, the optimal fusion result in the BLUE (best linear unbiased estimator) sense is given by the Bar-Shalom-Campo formulas [22]
$$x_f = x_a + (C_{aa} - C_{ab})(C_{aa} + C_{bb} - C_{ab} - C_{ba})^{-1}(x_b - x_a) , \qquad (25)$$
$$C_{ff} = C_{aa} - (C_{aa} - C_{ab})(C_{aa} + C_{bb} - C_{ab} - C_{ba})^{-1}(C_{aa} - C_{ba}) . \qquad (26)$$
The estimated error covariance matrix computed using the Bar-Shalom-Campo formulas is exact, i.e., $C_{ff} = C(x_f - x)$ holds [22]. The cross-covariance $C_{ab}$ required by (25) and (26) can be tracked, e.g., using samples [41] to encode the cross-correlations or by a square-root decomposition [24] of the noise covariance matrices. Both approaches require additional data to be transmitted. If the error covariance matrices $C_{aa}$ and $C_{bb}$ are not known, but upper bounds $\hat{C}_{aa} \succeq C_{aa}$ and $\hat{C}_{bb} \succeq C_{bb}$ are available, applying (25) and (26) using these upper bounds yields a conservative error covariance matrix estimate [42]. In the common case where the cross-covariance is unknown but not negligible, setting $C_{ab} = 0$ in (25) and (26) generally does not produce a conservative error covariance matrix estimate, i.e., $C_{ff} \not\succeq C(x_f - x)$. This means that the confidence ellipsoid induced by $C_{ff}$ does not contain the confidence ellipsoid induced by $C(x_f - x)$ for any confidence level. An example of this behavior is illustrated on the left side of Figure 2.
Figure 2. Confidence ellipsoids of the involved covariance matrices, including $C_{ff}$ (orange, dashed), when $C_{ab} \neq 0$. The result obtained using optimal fusion with the erroneous assumption $C_{ab} = 0$ is shown on the left. The result achieved using CI is shown on the right.
The covariance intersection (CI) algorithm, originally devised by Julier and Uhlmann [27], enables the conservative and unbiased fusion of the estimates $x_a$ and $x_b$, regardless of their generally unknown cross-covariance, as long as conservative error covariance matrix estimates $\hat{C}_{aa} \succeq C_{aa}$ and $\hat{C}_{bb} \succeq C_{bb}$ are available. The CI algorithm itself is defined by
$$x_f = C_{ff} \big( \omega \hat{C}_{aa}^{-1} x_a + (1 - \omega) \hat{C}_{bb}^{-1} x_b \big) , \qquad (27)$$
$$C_{ff} = \big( \omega \hat{C}_{aa}^{-1} + (1 - \omega) \hat{C}_{bb}^{-1} \big)^{-1} , \qquad (28)$$
where any $\omega \in [0, 1]$ gives an unbiased estimate $x_f$ and an upper bound $C_{ff}$ on its error covariance matrix $C(x_f - x)$. The weight $\omega$ is determined numerically by minimizing either the trace or the determinant of $C_{ff}$. The right side of Figure 2 shows that using CI, the confidence ellipsoid induced by $C_{ff}$ contains the one induced by $C(x_f - x)$.
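Both fusion rules are short enough to sketch directly. The following NumPy sketch uses the standard forms of the Bar-Shalom-Campo and CI equations; the function names, the example covariance matrices, and the grid search over $\omega$ (in place of a proper scalar optimizer) are assumptions made for illustration. The example compares each reported $C_{ff}$ with the true error covariance of the corresponding linear combination for a known, nonzero cross-covariance.

```python
import numpy as np

def fuse_bar_shalom_campo(xa, Caa, xb, Cbb, Cab):
    """BLUE fusion of two estimates with known cross-covariance Cab."""
    S = Caa + Cbb - Cab - Cab.T
    K = (Caa - Cab) @ np.linalg.inv(S)
    xf = xa + K @ (xb - xa)
    Cff = Caa - K @ (Caa - Cab.T)
    return xf, Cff

def fuse_ci(xa, Caa, xb, Cbb, num_weights=101):
    """Covariance intersection; omega is chosen by a grid search on trace(Cff)."""
    Ia, Ib = np.linalg.inv(Caa), np.linalg.inv(Cbb)
    best_w, best_C = None, None
    for w in np.linspace(0.0, 1.0, num_weights):
        C = np.linalg.inv(w * Ia + (1.0 - w) * Ib)
        if best_C is None or np.trace(C) < np.trace(best_C):
            best_w, best_C = w, C
    xf = best_C @ (best_w * Ia @ xa + (1.0 - best_w) * Ib @ xb)
    return xf, best_C, best_w

# Illustrative data: two correlated estimates with a valid joint covariance.
Caa = np.array([[2.0, 0.5], [0.5, 1.0]])
Cbb = np.array([[1.0, -0.2], [-0.2, 3.0]])
Cab = 0.5 * np.eye(2)
xa, xb = np.array([1.0, 0.0]), np.array([0.0, 1.0])

def true_cov(Ka, Kb):
    """Actual error covariance of the fused estimate Ka @ xa + Kb @ xb."""
    return Ka @ Caa @ Ka.T + Ka @ Cab @ Kb.T + Kb @ Cab.T @ Ka.T + Kb @ Cbb @ Kb.T

# Optimal fusion while wrongly assuming Cab = 0: reported Cff is too small.
_, C_naive = fuse_bar_shalom_campo(xa, Caa, xb, Cbb, np.zeros((2, 2)))
gap_naive = C_naive - true_cov(C_naive @ np.linalg.inv(Caa), C_naive @ np.linalg.inv(Cbb))

# CI: reported Cff bounds the true error covariance.
_, C_ci, w = fuse_ci(xa, Caa, xb, Cbb)
gap_ci = C_ci - true_cov(w * C_ci @ np.linalg.inv(Caa), (1 - w) * C_ci @ np.linalg.inv(Cbb))

print(np.linalg.eigvalsh(gap_naive).min())  # negative: not conservative
print(np.linalg.eigvalsh(gap_ci).min())     # nonnegative: conservative
```

A grid search keeps the sketch dependency-free; in practice a bounded scalar minimizer over $\omega$ would typically be used instead.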

Unbiased Conservative Quantization of Estimates
Applying the Bar-Shalom-Campo formulas (25) and (26) and the covariance intersection Equations (27) and (28) to quantized estimate vectors requires some consideration as naively quantizing the unbiased estimate vectors x a and x b does not retain their unbiasedness which in turn leads to biased fused estimates x f . Moreover, quantizing estimate vectors increases their error covariance matrices, which has to be accounted for in order to retain conservativeness of the fusion algorithms. To address these issues we derive a randomized quantizer for unbiased estimate vectors that produces unbiased quantized estimate vectors and provides an upper bound on their error covariance matrices.
We begin by introducing a randomized quantizer $q_s : \mathcal{D}_s \to \mathcal{C}_s$ with $\mathcal{D}_s \subset \mathbb{R}$ and $\mathcal{C}_s \subset \mathbb{R}$ satisfying $\min(\mathcal{D}_s), \max(\mathcal{D}_s) \in \mathcal{C}_s$, that has the desired unbiasedness property. The quantizer was proposed for the quantization of estimates in a different but equivalent form in [15] (see also [35,43]) and is defined by
$$q_s(\hat{x}) = \mathrm{rd}(\hat{x} + n) , \qquad (29)$$
where $\hat{x} \in \mathbb{R}$ is an estimate of a random variable $x \in \mathbb{R}$, $\mathrm{rd}(\cdot)$ rounds to the nearest codeword in $\mathcal{C}_s$, and $n \in \mathbb{R}$ is independently uniformly distributed in the closed interval $[-\delta_s/2, \delta_s/2]$. The quantizer therefore consists of rounding combined with additive dither [44]. The codebook is given by $\mathcal{C}_s = \{x_{\max} - k\delta_s \mid 0 \le k < 2^b\}$, where $x_{\max}$ is the maximum codeword, $\delta_s = x_{\max}/2^{b-1}$ is the increment between adjacent codewords, and $b$ is the number of bits required to represent a codeword. The assumption $\min(\mathcal{D}_s), \max(\mathcal{D}_s) \in \mathcal{C}_s$ guarantees a quantization error bounded by $\delta_s$. It is shown in [15] that $q_s$ satisfies (30), that is, it does not add bias and provides an upper bound on the quantization result's variance. Undesirably, the upper bound is on the variance $C(q_s(\hat{x}))$, not the error variance $C(q_s(\hat{x}) - x)$, and the two quantities coincide only when $x$ is deterministic. We propose a randomized quantizer $q_m : \mathcal{D}_m \to \mathcal{C}_m$ for coefficient-wise bounded estimate vectors $\hat{x} \in \mathcal{D}_m \subset \mathbb{R}^n$ of some random vector $x \in \mathbb{R}^n$ that is a coefficient-wise version of the one given by (29) [34]. Its domain and codebook are the Cartesian products $\mathcal{D}_m = \mathcal{D}_s^n$ and $\mathcal{C}_m = \mathcal{C}_s^n$. The quantization process can thus be described by
$$q_m(\hat{x}) = \mathrm{rd}(\hat{x} + n) , \qquad (31)$$
where $\mathrm{rd}(\cdot)$ now denotes coefficient-wise rounding and $n \in \mathbb{R}^n$ has independent coefficients uniformly distributed in the closed interval $[-\delta_s/2, \delta_s/2]$. As an immediate consequence of (30) applied coefficient-wise to $q_m(\hat{x})$, the quantizer $q_m$ is unbiased as well. Due to rounding and dither, the quantized estimate's error $q_m(\hat{x}) - x$ contains additional noise compared to the original estimate's error $\hat{x} - x$.
Consequently, the known error covariance matrix $C(\hat{x} - x)$ of the original estimate must be adapted to reflect the increased uncertainty. In general, computing the exact covariance matrix of $q_m(\hat{x}) - x$ is infeasible without knowledge of the distribution of $\hat{x}$ (if the distribution of $\hat{x}$ were known, the approach in [45] could be used to approximate $C(q_m(\hat{x}) - x)$ arbitrarily well). Therefore, a conservative upper bound for $C(q_m(\hat{x}) - x)$, in a similar vein as (30), is determined.
Theorem 8. Let $q_m : \mathcal{D}_m \to \mathcal{C}_m$ be as in (31) and let $\hat{x} \in \mathcal{D}_m$ be an estimate of a random vector $x \in \mathbb{R}^n$, then $C(q_m(\hat{x}) - x) \preceq C(\hat{x} - x) + \delta_s^2 I$ holds.
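Proof. The estimation error covariance matrix after quantization can be expanded into
$$C(q_m(\hat{x}) - x) = C(q_m(\hat{x}) - \hat{x}) + C(q_m(\hat{x}) - \hat{x}, \hat{x} - x) + C(\hat{x} - x, q_m(\hat{x}) - \hat{x}) + C(\hat{x} - x)$$
by adding and subtracting $\hat{x}$. The cross-terms can be shown to be zero by using the definition of $q_m$, unbiasedness, and the tower rule to obtain
$$C(q_m(\hat{x}) - \hat{x}, \hat{x} - x) = E\Big( E\big( (\mathrm{rd}(\hat{x} + n) - \hat{x})(\hat{x} - x)^\top \,\big|\, \hat{x} \big) \Big) . \qquad (32)$$
Due to $q_m$ being unbiased (also conditionally), $E(\mathrm{rd}(\hat{x} + n) \mid \hat{x}) = \hat{x}$ holds. Since $x$ and $n$ are independent, we have $E(x \, \mathrm{rd}(\hat{x} + n)^\top \mid \hat{x}) = E(x \mid \hat{x}) \, E(\mathrm{rd}(\hat{x} + n) \mid \hat{x})^\top$. Applying these equations to the inner expectation of (32) shows that $C(q_m(\hat{x}) - \hat{x}, \hat{x} - x) = 0$, as claimed. As $C(\hat{x} - x)$ is known, it only remains to bound $C(q_m(\hat{x}) - \hat{x})$. Since $q_m$ is (conditionally) unbiased and $n$ has independent coefficients,
$$C\big(q_m(\hat{x})^i - \hat{x}^i, \, q_m(\hat{x})^j - \hat{x}^j \,\big|\, \hat{x}\big) = 0$$
follows for $i \neq j$, which by the tower rule and unbiasedness results in $C(q_m(\hat{x})^i - \hat{x}^i, q_m(\hat{x})^j - \hat{x}^j) = 0$. Furthermore, it holds that $|q_m(\hat{x})^i - \hat{x}^i| \le \delta_s$ and thus $C(q_m(\hat{x})^i - \hat{x}^i) \le \delta_s^2$ for $i = 1, \ldots, n$. Combined, this means that $C(q_m(\hat{x}) - \hat{x}) \preceq \delta_s^2 I$. The claimed upper bound then follows immediately.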
The upper bound given above is fast to compute and does not require any knowledge of the distribution of either $x$ or $\hat{x}$, at the cost of being overly conservative, particularly if a component of $\hat{x}$ is concentrated between two codewords or for large $\delta_s$.
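As a rough illustration of how loose the bound can be, consider the scalar case with an estimate lying between two adjacent codewords. The symbols $c$ (the lower codeword) and $p$ (the relative position of the estimate between the two codewords) are introduced here only for illustration and are not part of the original derivation. For $\hat{x} = c + p\,\delta_s$ with $p \in [0, 1]$, dithered rounding yields the upper codeword with probability $p$ and the lower codeword with probability $1 - p$, so that

$$\mathrm{E}\big((q_s(\hat{x}) - \hat{x})^2 \mid \hat{x}\big) = p\,(1-p)^2\,\delta_s^2 + (1-p)\,p^2\,\delta_s^2 = p\,(1-p)\,\delta_s^2 \le \tfrac{1}{4}\,\delta_s^2 .$$

In this scalar setting, the conditional variance added by the quantizer never exceeds $\delta_s^2/4$, whereas the bound derived from $|q_s(\hat{x}) - \hat{x}| \le \delta_s$ charges $\delta_s^2$ regardless of where the estimate lies.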

Quantized Optimal Fusion and Covariance Intersection
We are now in a position to formulate a process that allows the Bar-Shalom-Campo formulas or covariance intersection to be applied to quantized estimates and covariance matrices while retaining the unbiasedness and conservativeness of said fusion methods. The proposed approach is as follows:

1. Quantize the estimates $\hat{x}_a$ and $\hat{x}_b$ so that the quantization results remain unbiased. Account for the potential increase in uncertainty due to the quantization process. Both goals are achieved by employing the unbiased, conservative estimate quantizer introduced in the previous subsection.
2. Quantize the error covariance matrices of the quantized estimates conservatively. This is done using either the quantizer from Section 4.1 or the one from Section 4.2.
3. Apply the Bar-Shalom-Campo formulas or covariance intersection to the quantized estimates and quantized error covariance matrices. Since the quantized estimates are unbiased and the quantized error covariance matrices are conservative, the fusion result will also be unbiased and conservative.

In the following, the above process using CI in conjunction with either the diagonal dominance (DD)-based quantizer from Section 4.1 or the modified Cholesky (MC) decomposition-based quantizer from Section 4.2 will be referred to as DD-CI and MC-CI, respectively. The methods obtained by replacing CI with the optimal (OPT) fusion formulas in the DD-CI and MC-CI methods will be referred to as DD-OPT and MC-OPT.
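The three steps above can be sketched end to end with covariance intersection. The code below is a simplified illustration under several assumptions and should not be read as the quantizers of Sections 4.1 and 4.2: all function names are hypothetical, the codebooks are replaced by an unbounded grid of spacing `delta`, and the covariance quantizer is a stand-in in the spirit of the DD-based approach that rounds the off-diagonal elements and then shifts and rounds up the diagonal so that the result is never smaller than its input in the positive semidefinite sense. The CI weight is chosen by a plain grid search that minimizes the trace.

```python
import numpy as np

def quantize_estimate(x_hat, cov, delta, rng):
    """Step 1: unbiased dithered rounding to a grid of spacing delta, covariance inflated."""
    noise = rng.uniform(-delta / 2, delta / 2, size=x_hat.shape)
    x_q = delta * np.rint((x_hat + noise) / delta)
    return x_q, cov + delta ** 2 * np.eye(x_hat.size)

def quantize_cov_conservative(cov, delta):
    """Step 2 (stand-in): round off-diagonals, then shift and round up the diagonal."""
    q = delta * np.rint(cov / delta)
    err = np.abs(q - cov)
    np.fill_diagonal(err, 0.0)
    # Each diagonal entry absorbs the off-diagonal rounding errors of its row,
    # which makes q - cov diagonally dominant and hence positive semidefinite.
    diag = np.diag(cov) + err.sum(axis=1)
    np.fill_diagonal(q, delta * np.ceil(diag / delta))
    return q

def covariance_intersection(xa, Ca, xb, Cb, n_grid=101):
    """Step 3: CI with the weight picked by a grid search over the fused trace."""
    Ia, Ib = np.linalg.inv(Ca), np.linalg.inv(Cb)
    best = None
    for w in np.linspace(0.0, 1.0, n_grid):
        C = np.linalg.inv(w * Ia + (1 - w) * Ib)
        if best is None or np.trace(C) < np.trace(best[1]):
            best = (C @ (w * Ia @ xa + (1 - w) * Ib @ xb), C)
    return best

# Hypothetical test case: two correlated estimates of a three-dimensional state.
rng = np.random.default_rng(0)
n, delta = 3, 0.25
L = rng.standard_normal((2 * n, 2 * n))
P = L @ L.T                                   # joint error covariance
x = rng.standard_normal(n)                    # ground truth
z = L @ rng.standard_normal(2 * n)            # correlated estimation errors
xa, xb = x + z[:n], x + z[n:]
Ca, Cb = P[:n, :n], P[n:, n:]

xa_q, Ca_infl = quantize_estimate(xa, Ca, delta, rng)
xb_q, Cb_infl = quantize_estimate(xb, Cb, delta, rng)
Ca_q = quantize_cov_conservative(Ca_infl, delta)
Cb_q = quantize_cov_conservative(Cb_infl, delta)
x_f, C_f = covariance_intersection(xa_q, Ca_q, xb_q, Cb_q)
print("fused estimate:", x_f)
print("trace of fused covariance:", np.trace(C_f))
```

The cross-covariance block of `P` is deliberately not passed to the fusion step, since CI does not require it; the optimal fusion formulas would use it in place of the weight search.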

Results and Discussion
The total quantization errors of the proposed covariance matrix quantizers are evaluated using randomly selected covariance matrices. In addition, the performance of DD-CI/OPT and MC-CI/OPT relative to CI/optimal fusion is evaluated by applying the methods to randomly generated data. Finally, the DD-CI and MC-CI approaches are evaluated in a decentralized 2D target tracking scenario.

Evaluation of the Covariance Quantizers
The two proposed covariance matrix quantizers are applied to independent random covariance matrices. The random matrices are generated as $X = LL^{\mathsf T}$, where $L \in \mathbb{R}^{n \times n}$ has zero-mean, normally distributed elements with variance one. The Frobenius norms of the resulting quantization error matrices are averaged over all samples. Figure 3 shows the averaged Frobenius norm of the diagonal dominance-based quantizer and the relative improvement in average norm achieved by the modified Cholesky decomposition-based quantizer for varying numbers of bits per codeword b and matrix dimensions n. For the codebooks, $x_{\max} = 50.0$ was used and 10,000 samples were included in each average.
The quantization error increases monotonically as b decreases with larger n leading to stronger deterioration of performance. The dependency on b is a result of the quantization resolution decreasing exponentially for decreasing b. The dependency on n is due to the fact that for increasing n the quantization error per element remains roughly the same, whereas the number of matrix elements increases quadratically. Therefore the off-diagonal quantization error increases for increasing n. The shift applied to the diagonal elements to ensure conservativeness must then grow larger with increasing n, thereby increasing the diagonal quantization error. The relative improvement in average norm can be seen to be approximately constant over b, the notable exception being b ≤ 2 where there is little to no improvement. This behavior can be understood by considering the limiting case of b = 1. In said case, each element of the quantized matrix is either zero or the maximum codeword of the respective codebook. Since the off-diagonal elements are identical in both approaches and the diagonal elements are always rounded up, the quantized matrix and thus the quantization error are identical in both approaches.
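The dependence of the resolution on b, including the limiting case b = 1, can be made concrete with a small sketch. It assumes that the covariance codebooks follow the same construction as the scalar codebook of the estimate quantizer, that is, codewords $x_{\max} - k\delta$ with $\delta = x_{\max}/2^{b-1}$; the helper name `codebook` is hypothetical.

```python
import numpy as np

def codebook(x_max, bits):
    """Codewords x_max - k * delta for 0 <= k < 2**bits, with delta = x_max / 2**(bits - 1)."""
    delta = x_max / 2 ** (bits - 1)
    return x_max - delta * np.arange(2 ** bits), delta

# Spacing and range of the codebook for the value x_max = 50.0 used in this evaluation.
for bits in (1, 2, 4, 8):
    words, delta = codebook(50.0, bits)
    print(bits, len(words), delta, words.min(), words.max())
```

Under this assumption, every removed bit doubles the spacing between codewords, and for b = 1 the codebook contains only the maximum codeword and zero, which matches the limiting case discussed above.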

Evaluation of Quantized Optimal Fusion and Quantized Covariance Intersection
The test data for this evaluation are generated by first drawing a zero-mean Gaussian random vector $x \in \mathbb{R}^n$ with covariance matrix $\mathrm{C}(x) = I$. This vector represents the ground truth. Then, a random vector $z = [z_a^{\mathsf T}, z_b^{\mathsf T}]^{\mathsf T} \in \mathbb{R}^{2n}$ is drawn from a zero-mean Gaussian distribution with covariance matrix $LL^{\mathsf T}$, where $L \in \mathbb{R}^{2n \times 2n}$ has zero-mean Gaussian elements with variance one. Finally, two correlated estimates $\hat{x}_a = x + z_a$ and $\hat{x}_b = x + z_b$ of $x$ are computed. Optimal fusion, covariance intersection, and their quantized versions are applied to the estimates $\hat{x}_a$ and $\hat{x}_b$ using their known conditional (cross-)covariance matrices, which are the corresponding $n \times n$ blocks of $LL^{\mathsf T}$ because the estimation errors are $z_a$ and $z_b$.

The mean squared errors $\mathrm{MSE}_f = \mathrm{E}(\|\hat{x}_f - x\|^2)$, where $\hat{x}_f$ is the fused estimate with $f$ indicating the fusion approach, are computed by repeatedly generating test data, applying the fusion approaches, and averaging the squared Euclidean norm of the resulting estimate errors. The mean traces $\mathrm{MTR}_f = \mathrm{E}(\mathrm{tr}(C_{ff}))$ of the error covariance estimates $C_{ff}$, where $f$ indicates the fusion approach, are also computed by averaging. If any quantization operation fails for any of the test data, the computed MSE and averaged trace are discarded. Figures 4 and 5 compare DD-OPT to optimal fusion and MC-OPT to DD-OPT, respectively. Figures 6 and 7 show the same quantities but compare DD-CI to CI and MC-CI to DD-CI, respectively. In all figures, varying dimensions n and numbers of bits per codeword b are considered both with and without quantizing estimate vectors. The results are obtained by averaging over 1000 independent trials and using $x_{\max} = 65.0$ for the codebooks $C_m$, $C_d$, and $C_o$.

From Figure 4, it can be seen that for DD-OPT the increase of the estimate MSE and of the averaged trace of the error covariance estimate is moderate except for small b. Moreover, the increase in the averaged trace is larger than the increase in MSE for all n and b. This is to be expected, since the quantization process retains conservativeness. The dimensionality of the test data has varying influence on the performance, depending on b and on whether quantized estimates are being used. Using quantized estimate vectors is seen to adversely affect performance. In fact, when quantizing the estimate vectors, the error covariance quantization fails below five bits per codeword due to the excessively inflated error covariance estimate produced by the estimate quantization. Figure 5 shows that the MC-OPT approach performs better than the DD-OPT approach in terms of the average trace of the error covariance estimate and in most cases also in terms of actual MSE. Larger n leads to larger improvements. The improved performance in terms of average trace is guaranteed by the theoretical results from Section 4.2.
Note that there is no guarantee that the actual MSE reduces if the error covariance matrices are quantized more accurately. It can also be seen that there is an optimal number of bits per codeword and that for small b there is little to no improvement. The latter result is due to the phenomenon discussed in Section 6.1. The results for CI, DD-CI, and MC-CI displayed in Figures 6 and 7 exhibit the same general behavior as the results for OPT, DD-OPT, and MC-OPT discussed above, also in terms of theoretical guarantees. One notable difference is the relative improvement of actual MSE. In contrast to optimal fusion, the improvement does not seem to diminish for large n and small b.
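The data generation and the two error metrics used in this subsection can be sketched for the unquantized optimal fusion baseline. The code is an illustration under assumptions rather than the evaluation code behind the figures: the function names are hypothetical, the Bar-Shalom-Campo formulas are written in their common two-estimate form with a known cross-covariance, and the dimension and number of trials are chosen freely.

```python
import numpy as np

def optimal_fusion(xa, xb, Caa, Cbb, Cab):
    """Bar-Shalom-Campo fusion of two estimates with known cross-covariance Cab."""
    S = Caa + Cbb - Cab - Cab.T
    K = (Caa - Cab) @ np.linalg.inv(S)
    x_f = xa + K @ (xb - xa)
    C_f = Caa - K @ (Caa - Cab).T
    return x_f, C_f

def run_trials(n, trials, rng):
    """Average squared error (MSE) and average fused trace (MTR) over random test data."""
    sq_err, traces = [], []
    for _ in range(trials):
        x = rng.standard_normal(n)                # ground truth with C(x) = I
        L = rng.standard_normal((2 * n, 2 * n))
        z = L @ rng.standard_normal(2 * n)        # z ~ N(0, L L^T) given L
        P = L @ L.T                               # conditional joint error covariance
        xa, xb = x + z[:n], x + z[n:]
        x_f, C_f = optimal_fusion(xa, xb, P[:n, :n], P[n:, n:], P[:n, n:])
        sq_err.append(np.sum((x_f - x) ** 2))
        traces.append(np.trace(C_f))
    return np.mean(sq_err), np.mean(traces)

mse, mtr = run_trials(n=2, trials=4000, rng=np.random.default_rng(0))
print("MSE:", mse, "MTR:", mtr)
```

Without quantization the two averages should agree up to sampling noise, because the fused covariance is exact. Inserting conservative quantizers before the fusion step would be expected to raise the mean trace by more than the MSE, which is the pattern described for Figure 4.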

Evaluation of Quantized Covariance Intersection in 2D Tracking Scenario
This evaluation scenario considers two sensor nodes that cooperatively track an object. The object is characterized by a discrete-time (nearly) constant acceleration model affected by the zero-mean white Gaussian noise term $w_k \in \mathbb{R}^6$. The six-dimensional state $x_k \in \mathbb{R}^6$ consists of position, velocity, and acceleration in both the $x_1$- and $x_2$-directions. The corresponding matrices of the process model depend on the time step $\tau > 0$ [46]. For the Monte Carlo simulation with 1000 runs, the initial states $x_0$ are drawn at random. Two sensor nodes $a$ and $b$ are simulated that observe projections of position and velocity. The zero-mean white Gaussian measurement noise terms $v_k^a, v_k^b \in \mathbb{R}^2$ have the covariance matrices

$$R^a = 0.5 \begin{bmatrix} 1 & 0 \\ 0 & 0.1 \end{bmatrix}, \qquad R^b = 0.8 \begin{bmatrix} 1 & 0 \\ 0 & 0.5 \end{bmatrix},$$

respectively. Each sensor node uses a Kalman filter to compute estimates for 50 time steps. Sensor node a transmits its state and error covariance estimate to sensor node b at every 5th time step. Prior to transmission, it quantizes the estimates with the proposed method and codebook parameter $x_{\max} = 30.0$. Node b fuses its own estimate with the received one by employing CI. Every 11th time step, sensor node b quantizes and transmits its state and error covariance estimate to node a, which again fuses it with its own estimate using CI. The receiving node in both cases reinitializes its own estimate with the fusion result.

Figures 8 and 9 compare DD-CI and MC-CI, using different numbers of bits per codeword, against CI using 64-bit floating point numbers. The results for higher numbers of bits per codeword are close to the estimates obtained through CI with 64-bit floats. However, even a 5-bit quantization still yields reasonable results. Quantization using fewer than 5 bits per codeword leads to overly conservative bounds on the error covariance matrices that cannot be encoded using the given codebook. The estimate MSE exhibits an initial transient peak.
It is here that the improved performance of MC-CI over DD-CI can be observed most clearly.
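To put the bit counts of this scenario into perspective, the size of a single transmission can be estimated with a short calculation. The sketch rests on assumptions that the text does not spell out: only the $n(n+1)/2$ unique entries of the symmetric covariance matrix are sent, every transmitted value occupies exactly b bits, and protocol overhead is ignored. The helper `payload_bits` is hypothetical.

```python
def payload_bits(n, bits_per_value, include_estimate=True):
    """Bits per transmission: n estimate coefficients plus n(n+1)/2 unique covariance entries."""
    cov_entries = n * (n + 1) // 2
    values = cov_entries + (n if include_estimate else 0)
    return values * bits_per_value

n = 6                                  # state dimension of the tracking scenario
full = payload_bits(n, 64)             # 64-bit floating point reference
for b in (5, 8, 16):
    print(b, payload_bits(n, b), payload_bits(n, b) / full)
```

Under these assumptions, a 5-bit codebook reduces the payload for the six-dimensional state to roughly 8% of the 64-bit representation.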

Conclusions
Available bandwidth and energy budget can be limiting factors for the data transmission capabilities of interconnected sensor systems. Algorithms for decentralized information fusion in networks require the exchange of estimates and, in some cases, covariance matrices. If covariance matrices need to be transmitted, they dominate the amount of transmitted data. In this paper, we have proposed two methods for the conservative quantization of covariance matrices and a method for the unbiased conservative quantization of estimates, and have applied them to optimal fusion and covariance intersection. The presented quantization approaches retain unbiasedness and conservativeness of the considered fusion methods while reducing the amount of data that must be transmitted. We have empirically demonstrated the effectiveness of the proposed covariance quantization methods, individually and in conjunction with fusion methods. Further improvements in performance could be achieved by using varying, possibly data-dependent quantization resolutions for subsets of the coefficients of the considered covariance matrices. Moreover, the proposed quantization schemes can also be applied to other sensor fusion algorithms like inverse covariance intersection. For future work, theoretical results concerning the convergence behavior of state and covariance estimates when using quantized data in a decentralized setting are of interest. Conservative vector quantization for covariance matrices is also worth consideration.