Multi-Sensor Adaptive Weighted Data Fusion Based on Biased Estimation

In order to avoid the loss of optimality of the optimal weighting factor in some cases and to further reduce the estimation error of unbiased estimators, a multi-sensor adaptive weighted data fusion algorithm based on biased estimation is proposed. First, it is proven that the estimation error of an unbiased estimator can be further reduced, and the reasons why the optimal weighting factor loses its optimality are analyzed. Second, a method is proposed for constructing a biased estimation value from an unbiased estimation value and for calculating the optimal weighting factor from the estimation error. Finally, the performance of least squares estimation data fusion, batch estimation data fusion, and biased estimation data fusion is compared through simulation tests, and the results show that biased estimation data fusion has a clear advantage in accuracy, stability, and noise resistance.


Introduction
Data fusion can exploit the complementary information of multiple data sources, so data fusion algorithms can suppress negative environmental factors and produce more accurate results [1,2]. Data fusion combined with various other algorithms can achieve better results [3,4], so it has been widely used in many fields [5,6]. However, data fusion can only be applied directly to linear systems in which all noises are independent of each other; for a nonlinear system, it must be combined with the extended Kalman filter or the unscented Kalman filter [7], and if the noises are correlated, they need to be decoupled by linear transformations [8].
It is necessary to first preprocess the original data, which includes removing abnormal data, filtering systematic deviations, and losslessly compressing the data. Abnormal data are removed using the Grubbs criterion, deviations are filtered with bias estimation algorithms [9,10], and in distributed fusion, if the data need to be restored in the data center, lossless compression is required [11].
Before conducting multi-sensor data fusion, each sensor can also use various algorithms to estimate the true value (called the estimation value), which can improve the fusion accuracy because the estimation value has higher accuracy than the measurement value.
When the measured physical quantity does not change over time and there is sufficient understanding of the system, linear minimum mean square error estimation is the best method; this algorithm finds a linear function that minimizes the mean square error from a single measurement value [12][13][14]. Otherwise, it is best to use least squares estimation [15,16] or batch estimation [17]. When the measured physical quantity changes over time, only Kalman filtering is suitable [18,19].
Adaptive weighted data fusion is a widely used multi-sensor data fusion algorithm with a unique optimal weighting factor that gives the fusion estimation value a lower estimation variance than the measurement variance of every sensor, but it requires prior knowledge of the measurement variances to run [20]. However, combining it with certain unbiased estimation algorithms (hereinafter referred to as unbiased estimation data fusion) not only further reduces the estimation variance but also removes the need for prior knowledge.
Unbiased estimation data fusion has two shortcomings. The first is that unbiasedness does not guarantee that the estimation results are reliable: according to the Gauss-Markov theorem, the variance of an unbiased estimator has a lower bound, and when that lower bound is itself very large, even a minimum-variance unbiased estimate may differ from the true value (hereinafter, this difference is referred to as the estimation error) by more than an acceptable amount [21]. The second is that the optimal weighting factor is not necessarily optimal: it is calculated from the measurement variance, but a smaller measurement variance does not necessarily mean a smaller measurement error, so in some cases the optimal weighting factor is not optimal. Therefore, this paper proposes a biased estimation data fusion algorithm that likewise requires no prior knowledge; it uses unbiased estimation values to construct biased estimation values with lower estimation errors and calculates the optimal weighting factor from the estimation error to ensure its optimality.
This paper is organized as follows: Section 2 reviews some existing unbiased estimation data fusion algorithms, Section 3 describes the shortcomings of the existing algorithms and the feasibility of the improved methods, Section 4 describes the specific implementation of the algorithms in this paper, Section 5 compares the performance of the different algorithms, and Section 6 summarizes the entire content and provides prospects for the future.

Unbiased Estimation Data Fusion
When N sensors measure an invariant physical quantity and all the sensor nodes in the network measure it at the same time, the measurement equation is

$$z_i(k) = H_i x + v_i(k), \quad i = 1, 2, \cdots, N,$$

where x is the unknown quantity to be measured, µ is the true value of x, k is the discrete time index, $z_i(k)$ is the measurement value of the ith sensor at moment k, $H_i$ is the measurement matrix of the ith sensor (the $H_i$ are often unit matrices), and $v_i(k)$ is the measurement noise, with $v_i(k) \sim N(0, v_i^2)$, the noises independent of each other, and $v_i^2$ called the measurement variance. Moreover, by the additivity of the normal distribution, $z_i(k) \sim N(\mu, v_i^2)$. In this system, only $z_i(k)$ is known; how can one use $z_i(k)$ to estimate the true value µ? Since the measurement equations involve multiple sensors whose measurement variances may differ, adaptive weighted data fusion is often used to solve such problems, but adaptive weighted data fusion needs prior knowledge of the measurement variances of all the sensors; in practice, the measurement variance can be obtained by experiments.
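The measurement model above can be simulated directly. The sketch below is illustrative only: the sensor count, true value, and variances are assumed values, not figures from the paper, and the identity measurement matrix case is used.

```python
import numpy as np

# Sketch of the measurement model z_i(k) = mu + v_i(k) (H_i = I).
# The sensor count, true value, and variances are illustrative assumptions.
rng = np.random.default_rng(0)
mu = 5.0                          # true value of the measured quantity
variances = [0.1, 0.25, 0.5]      # measurement variances v_i^2 of N = 3 sensors
K = 1000                          # number of discrete time steps

# z[i, k]: measurement of sensor i at time k, z_i(k) ~ N(mu, v_i^2)
z = np.stack([mu + rng.normal(0.0, v**0.5, K) for v in variances])

# By unbiasedness, each sensor's sample mean should be close to mu.
sample_means = z.mean(axis=1)
```

Each row of `z` is one sensor's measurement sequence; the fusion algorithms discussed below operate on such data.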

Unbiased Estimation
Unbiased estimation refers to the following: if the mathematical expectation of an estimator is equal to the true value of the estimated parameter (this property is called unbiasedness), then the estimator is called an unbiased estimation of that parameter. Because the expectation of an unbiased estimator equals the true value, using unbiased estimators instead of measurement values does not increase the error. The unbiased estimate of the sample variance can replace the measurement variance of a sensor, allowing adaptive weighted data fusion to operate without knowing the measurement variances. Two commonly used unbiased estimation algorithms are as follows.

Least squares estimation: assuming that a particular sensor collects n data $z(1), \cdots, z(n)$ over a period of time, T, the estimation value, and $\hat{\sigma}^2$, the estimation variance, are given as follows:

$$T = \frac{1}{n}\sum_{k=1}^{n} z(k), \qquad \hat{\sigma}^2 = \frac{1}{n(n-1)}\sum_{k=1}^{n} \big(z(k) - T\big)^2.$$

Batch estimation: assuming that a particular sensor collects n data over a period of time, these n data are randomly divided into two groups, where the ith group contains $n_i$ data. $T_i$, the sample mean, and $\sigma_i^2$, the sample variance, are given as follows:

$$T_i = \frac{1}{n_i}\sum_{k=1}^{n_i} z_i(k), \qquad \sigma_i^2 = \frac{1}{n_i - 1}\sum_{k=1}^{n_i} \big(z_i(k) - T_i\big)^2.$$

According to the theory of batch estimation in statistics, the optimal estimation value T and the optimal estimation variance $\hat{\sigma}^2$ of these two sets of data are given as follows:

$$T = \frac{\sigma_2^2\, T_1 + \sigma_1^2\, T_2}{\sigma_1^2 + \sigma_2^2}, \qquad \hat{\sigma}^2 = \frac{\sigma_1^2\, \sigma_2^2}{\sigma_1^2 + \sigma_2^2}.$$

Adaptive Weighted Data Fusion
Adaptive weighted data fusion principle: assume that N sensors measure the same unknown quantity x (with true value µ), the measurement value of the ith sensor is $M_i$, its measurement variance is $v_i^2$, and its measurement data follow $N(\mu, v_i^2)$. The unbiased estimator $\hat{M}$ is given as follows:

$$\hat{M} = \sum_{i=1}^{N} w_i M_i, \qquad \sum_{i=1}^{N} w_i = 1.$$

Since $\hat{M}$ is an unbiased estimator, its estimation variance is

$$\hat{\sigma}^2 = \sum_{i=1}^{N} w_i^2 v_i^2.$$

In order to find the $w_i$ that minimize $\hat{\sigma}^2$ (called the optimal weighting factors), the auxiliary function constructed according to the Lagrange multiplier method is

$$F = \sum_{i=1}^{N} w_i^2 v_i^2 + \lambda \left( \sum_{i=1}^{N} w_i - 1 \right).$$

According to multivariate extremum theory, the solution of the following partial differential equations minimizes the estimation variance $\hat{\sigma}^2$:

$$\frac{\partial F}{\partial w_i} = 2 w_i v_i^2 + \lambda = 0, \quad i = 1, 2, \cdots, N.$$

Simplifying gives $w_i = -\lambda / (2 v_i^2)$, and substituting into the constraint $\sum_i w_i = 1$ yields the formula for $w_i$:

$$w_i = \frac{1 / v_i^2}{\sum_{j=1}^{N} 1 / v_j^2}.$$

So $\hat{M}$ and $\hat{\sigma}^2$ are given as follows:

$$\hat{M} = \frac{\sum_{i=1}^{N} M_i / v_i^2}{\sum_{j=1}^{N} 1 / v_j^2}, \qquad \hat{\sigma}^2 = \frac{1}{\sum_{i=1}^{N} 1 / v_i^2}.$$

When adaptive weighted data fusion is combined with other unbiased estimation algorithms, $M_i$ and $v_i^2$ in Equation (15) can be replaced with an unbiased estimation value and an unbiased estimation variance. Figure 1 shows the flowchart of batch estimation data fusion; the flowcharts of the other unbiased estimation data fusion algorithms are much the same.
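A minimal sketch of the fusion step with the optimal weighting factor $w_i = (1/v_i^2)\,/\,\sum_j (1/v_j^2)$ derived above:

```python
import numpy as np

def adaptive_weighted_fusion(values, variances):
    """Fuse sensor values with inverse-variance (optimal) weights."""
    values = np.asarray(values, float)
    inv = 1.0 / np.asarray(variances, float)
    w = inv / inv.sum()                  # optimal weighting factors, sum to 1
    fused = float(w @ values)            # fusion estimation value M_hat
    fused_var = 1.0 / inv.sum()          # fused variance, <= min(variances)
    return fused, fused_var, w
```

Note that the fused variance $1/\sum_i (1/v_i^2)$ is never larger than the smallest individual variance, which is the stated advantage of adaptive weighted fusion.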

Rationale
Statistical inference is a statistical method of inferring the whole from samples, and there are two kinds of errors in statistical inference: systematic error and random error. Unbiased estimation has no systematic error because of its unbiasedness, but random error remains, so its estimation results are not necessarily reliable.
The optimal weighting factor is calculated from the measurement variance alone, but a smaller measurement variance does not imply a smaller measurement error. Although the measurement variances of the data measured by different sensors vary, the mathematical expectations are all equal, so in practice it can easily happen that a sensor has a smaller measurement variance but a larger measurement error.

Biased Estimators
Theorem 1. The error of the unbiased estimator can be further optimized.
Proof of Theorem 1. Assume that the true value of the measured unknown quantity x is µ and the sensor measurement data follow $N(\mu, v^2)$. Let E be an unbiased estimator constructed as a linear combination of the sensor measurement data, and let the estimation variance of E be $\sigma^2$; then $E \sim N(\mu, \sigma^2)$. Define the estimation error est as follows:

$$est = E - \mu.$$

It is easy to obtain $E(est) = 0$ and $D(est) = \sigma^2$, so est follows $N(0, \sigma^2)$. Assuming that the estimation error is tolerable when $est \in [-z, z]$, the probability that the estimation error is tolerable is

$$P(est \in [-z, z]) = \int_{-z}^{z} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{t^2}{2\sigma^2}}\, dt.$$

If $z = \sigma$, we obtain $P(est \in [-\sigma, \sigma]) \approx 68.3\%$, which means that for any σ there is about a 31.7% chance that est exceeds the estimation standard deviation σ, so even if σ is very small, est may still be reducible. □

Theorem 2. A biased estimation value with a smaller error can be constructed from an unbiased estimation value.
Proof of Theorem 2. Assume that the unbiased estimator A follows $N(\mu, \sigma^2)$, and define the biased estimator corresponding to A as B with $B = A + s$, where s is called the offset and is a nonzero variable. The following result can be obtained:

$$E(B) = E(A) + E(s) = \mu + E(s) \neq \mu.$$

Because E(B) is not equal to the true value µ, B is a biased estimator.
Assume that $A_1$ is a specific estimate of A with estimation error $d_1 = |A_1 - \mu|$; let $s_1$ be the offset corresponding to $A_1$ with $s_1 \propto d_1$, and let the biased estimator corresponding to $A_1$ be $B_1 = A_1 + s_1$. The estimation error of $B_1$ is $e_1 = |A_1 - \mu + s_1|$, so when the sign of $s_1$ is opposite to that of $(A_1 - \mu)$ and $|s_1| < 2d_1$,

$$e_1 = |A_1 - \mu + s_1| = \big|\, d_1 - |s_1| \,\big| < d_1.$$

In other words, when the sign of $s_1$ is opposite to $(A_1 - \mu)$ and $|s_1| < 2d_1$, the estimation error of the biased estimation is smaller than that of the unbiased estimation. □
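Both claims can be checked numerically. The sketch below uses arbitrary illustrative values for σ, µ, and the specific estimate $A_1$; it is a sanity check of the two proofs, not part of the algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)

# --- Theorem 1: for est ~ N(0, sigma^2), P(|est| <= sigma) ~= 68.3%,
# i.e., about 31.7% of estimates exceed one standard deviation no matter
# how small sigma is. sigma is an arbitrary illustrative value.
sigma = 0.05
est = rng.normal(0.0, sigma, 200_000)
p_within = float(np.mean(np.abs(est) <= sigma))

# --- Theorem 2: an offset with sign opposite to (A1 - mu) and |s1| < 2*d1
# shrinks the error of the biased estimate B1 = A1 + s1.
mu = 5.0
A1 = 5.4                                # a specific unbiased estimate
d1 = abs(A1 - mu)                       # unbiased estimation error, 0.4
s1 = -0.3 if A1 > mu else 0.3           # opposite sign, |s1| < 2*d1
B1 = A1 + s1
e1 = abs(B1 - mu)                       # biased estimation error, 0.1 < d1
```

The Monte Carlo estimate `p_within` lands near 0.683, and `e1 < d1` holds exactly as the proof of Theorem 2 predicts.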

Optimal Weighting Factors
Theorem 3. In some cases, the optimal weighting factor is not optimal.
Proof of Theorem 3. Suppose there are two sensors $S_1$ and $S_2$ that simultaneously measure the unknown quantity x (with true value µ). Their measurement variances are $v_1^2$ and $v_2^2$ with $v_1^2 < v_2^2$, and their measurement data are $z_1$ and $z_2$. For adaptive weighted data fusion, $w_1$, the weight of $S_1$, is always greater than $w_2$, the weight of $S_2$, since

$$w_1 = \frac{1/v_1^2}{1/v_1^2 + 1/v_2^2} > \frac{1/v_2^2}{1/v_1^2 + 1/v_2^2} = w_2.$$

Define $m_i$, the measurement error (unlike est, $m_i$ only considers the magnitude of the error, not its sign), as follows:

$$m_i = |z_i - \mu|.$$

When $m_1 > m_2$, the fusion assigns the larger weight to the measurement with the larger error, which implies that the estimation error of the fusion estimation value computed from the optimal weighting factors is not minimal. So, when $v_1^2 < v_2^2$ but $m_1 > m_2$, the optimal weighting factor is not optimal, and the probability of this event is

$$P(m_1 > m_2) = P(|z_1 - \mu| > |z_2 - \mu|).$$

It can be shown that $P(m_1 > m_2) > 0$, which indicates that the weights of any two sensors may fail to be optimal. The above proof uses measurement values and measurement variances, but replacing them with estimation values and estimation variances is equally valid. □

Theorem 4. It is optimal to compute the optimal weighting factor using the measurement error.
Proof of Theorem 4. Suppose there are N sensors measuring the unknown quantity x (with true value µ), their measurement values are $M_1, M_2, \cdots, M_N$, and their measurement errors are $m_1, m_2, \cdots, m_N$ with $0 < m_i < m_{i+1}$. The optimal weighting factors calculated from the measurement errors are $w_1, w_2, \cdots, w_N$; because $m_i < m_{i+1}$, we have $w_i > w_{i+1}$, and the corresponding fusion estimation value $\hat{M}_1$ is

$$\hat{M}_1 = \sum_{i=1}^{N} w_i M_i.$$

In this paper, only one case is proved, in which two weights differ from $w_1, w_2, \cdots, w_N$; the proof is easily generalized to the other cases.
Counterfactual: suppose that with weighting factors $v_1, v_2, w_3, \cdots, w_N$ where $v_1 < v_2$, the fusion estimation value $\hat{M}_2$ is more accurate than $\hat{M}_1$, where

$$\hat{M}_2 = v_1 M_1 + v_2 M_2 + \sum_{i=3}^{N} w_i M_i.$$

The common term of $\hat{M}_1$ and $\hat{M}_2$ is $\hat{M}_3 = \sum_{i=3}^{N} w_i M_i$, with $E(\hat{M}_3) = \alpha\mu$ where $\alpha = \sum_{i=3}^{N} w_i$. So $\hat{M}_1$ and $\hat{M}_2$ take the form

$$\hat{M}_1 = \hat{M}_3 + w_1 M_1 + w_2 M_2, \qquad \hat{M}_2 = \hat{M}_3 + v_1 M_1 + v_2 M_2.$$

Because $w_1 + w_2 = v_1 + v_2 = 1 - \alpha$ for both $\hat{M}_1$ and $\hat{M}_2$, we have $E(\hat{M}_3) + (w_1 + w_2)\mu = \alpha\mu + (1-\alpha)\mu = \mu$, and since $|M_i - \mu| = m_i$, the larger $|w_1 m_1 + w_2 m_2|$ and $|v_1 m_1 + v_2 m_2|$ are, the less accurate $\hat{M}_1$ and $\hat{M}_2$ are.
Because $m_1 > 0$ and $m_2 > 0$, let $\theta = w_1 + w_2 = v_1 + v_2$ and define

$$f(\gamma) = \gamma\, m_1 + (\theta - \gamma)\, m_2, \quad \gamma \in (0, \theta).$$

The derivative of $f(\gamma)$ is

$$f'(\gamma) = m_1 - m_2 < 0,$$

so the larger γ is, the smaller $f(\gamma)$ is. Because $w_1 > w_2$ and $v_1 < v_2$, we have $w_1 > \theta/2$ and $v_1 < \theta/2$, hence $f(w_1) < f(v_1)$, i.e., $w_1 m_1 + w_2 m_2 < v_1 m_1 + v_2 m_2$, and $\hat{M}_1$ is more accurate than $\hat{M}_2$. This contradicts the hypothesis, so the optimal weighting factor calculated from the estimation error is better. The above proof uses measurement values and measurement errors, but replacing them with estimation values and estimation errors is equally valid. □
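Theorems 3 and 4 can also be checked by simulation. The sketch below uses assumed variances; note that the Theorem 4 check is an oracle comparison (it weights by the true errors, which are unknown in practice), so it verifies the claim rather than giving a practical recipe.

```python
import numpy as np

rng = np.random.default_rng(3)
mu = 5.0

# --- Theorem 3: with v1 < v2, the lower-variance sensor still produces the
# larger measurement error with clearly nonzero probability.
v1, v2 = 0.3, 0.5                       # standard deviations (assumed values)
n = 100_000
m1 = np.abs(rng.normal(mu, v1, n) - mu)
m2 = np.abs(rng.normal(mu, v2, n) - mu)
p = float(np.mean(m1 > m2))             # Monte Carlo estimate of P(m1 > m2)

# --- Theorem 4: error-based weights versus the usual variance-based weights.
stds = np.array([0.3, 0.4, 0.5])
w_var = (1 / stds**2) / np.sum(1 / stds**2)
trials = 5000
err_w = np.empty(trials)                # fusion error, error-based weights
err_v = np.empty(trials)                # fusion error, variance-based weights
for t in range(trials):
    z = rng.normal(mu, stds)
    m = np.abs(z - mu) + 1e-12          # true errors (oracle information)
    w_err = (1 / m**2) / np.sum(1 / m**2)
    err_w[t] = abs(w_err @ z - mu)
    err_v[t] = abs(w_var @ z - mu)
```

On average, the error-based weights yield a smaller fusion error than the variance-based weights, consistent with Theorem 4.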

Algorithm Implementation
The algorithm in Section 4.1 only uses measurement data from a single sensor, so it can be implemented in a distributed manner.The algorithms in Sections 4.2 and 4.3 include data fusion algorithms, so they can only be implemented using centralized methods.

Unbiased Estimation
Suppose that N sensors simultaneously measure the unknown quantity x (with true value µ) and that all the sensor nodes in the network measure the observed physical quantity at the same time. For any of the sensors, if it collects 10 data over a period of time, sorted from smallest to largest as $a_1, a_2, \cdots, a_{10}$, then the unbiased estimators $E_1 \sim E_4$ are constructed as follows. Assuming that the sensor follows $N(\mu, \sigma^2)$, the theoretical estimation variances are as follows. Because $a_1 \sim a_{10}$ are data sorted by size, these estimators have lower estimation variances; through experiments, it was found that the estimation variances of $E_1 \sim E_3$ are approximately $\sigma^2/7.5$, $\sigma^2/8$, and $\sigma^2/9$. Therefore, $E_1 \sim E_3$ are independent of each other and have estimation variances smaller than the measurement variance.
$\hat{E}$, the mean of the estimation values, and $\sigma_i^2$, the estimation variances, are given as follows. The adaptive weighted data fusion algorithm is then used to obtain $\hat{x}$, the unbiased estimation value, and $\hat{\sigma}^2$, the unbiased estimation variance, with the specific formulas as follows. Because $E_4$ is the least squares estimator, theoretically the accuracy of the fused estimate $\hat{x}$ will be higher than that of the least squares estimate.
If the number of measurement data collected by a sensor within a certain period is m and m is not equal to 10, the unbiased estimators can be constructed according to the following rules:

1. Each group consists of at least two measurement values, and no group uses duplicate measurement values. The unbiased estimator corresponding to each group is the mean of the measurement values within the group.
2. When m is even (m = 2k), for any measurement value $a_i$, it is necessary to ensure that $a_i$ and $a_{m+1-i}$ are in the same group.
3. When m is odd (m = 2k + 1), for any measurement value $a_i$ with $i \neq k + 1$, it is necessary to ensure that $a_i$ and $a_{m+1-i}$ are in the same group. In addition, it is also necessary to ensure that $a_k$, $a_{k+1}$, and $a_{k+2}$ are in the same group.
4. Add an additional group that includes all measurement values, such as $E_4$ in Equation (30).
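One possible reading of these rules can be sketched as follows. The paper's exact groups for the 10-data case are not reproduced here, so the pairing below (each sorted value $a_i$ grouped with $a_{m+1-i}$, the middle three values merged when m is odd, plus one all-data group) is an illustrative interpretation.

```python
import numpy as np

def build_estimators(data):
    """Build group-mean unbiased estimators from one sensor's data
    (one interpretation of rules 1-4; the paper's exact groups may differ)."""
    a = np.sort(np.asarray(data, float))
    m = len(a)
    k = m // 2
    # Rule 2/3: pair a_i with a_{m+1-i} (0-indexed: a[i] with a[m-1-i]).
    groups = [[a[i], a[m - 1 - i]] for i in range(k)]
    if m % 2 == 1:
        # Rule 3: fold the unpaired median into the group of its neighbors,
        # so that a_k, a_{k+1}, a_{k+2} share one group.
        groups[k - 1].append(a[k])
    # Rule 4: one additional group containing all measurement values.
    groups.append(list(a))
    # Rule 1: each group's estimator is the mean of its values.
    return [float(np.mean(g)) for g in groups]
```

For m = 10 this yields five pair estimators plus the all-data (least squares) estimator, each unbiased by symmetry of the pairing around the sample median.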

Biased Estimation
$|s_i|$, the magnitude of the offset $s_i$, needs to be determined first. For the ith sensor, define its unbiased estimation value and unbiased estimation variance as $A_i$ and $\sigma_i^2$. The fusion estimation value $\hat{A}$ and the fusion estimation variance $\hat{\sigma}^2$ of these unbiased estimation values are computed by the adaptive weighted data fusion algorithm. Define $g_i$ as the degree of deviation of the ith unbiased estimation value from the true value (with $\hat{A}$ replacing the true value), given as follows. Define $\omega_i$ as the coefficient of $|s_i|$; its value is determined by $g_i$: when $g_i < 0.1$, $\omega_i = 0$; when $0.1 < g_i < 1$, $\omega_i = 0.5$; otherwise, $\omega_i = 1$. ω is used to control the size of the offset and avoid excessive offset.
$|s_i|$ is then given as follows. Next, the sign of $s_i$ is determined: $s_i$ takes a negative sign when $A_i > \hat{A}$ and a positive sign otherwise, so $B_i$ is given as follows:

$$B_i = A_i - \mathrm{sign}\big(A_i - \hat{A}\big) \cdot |s_i|.$$
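The construction can be sketched as below. The paper's exact formulas for $g_i$ and $|s_i|$ are not reproduced above, so both the deviation measure (relative deviation from $\hat{A}$) and the offset magnitude $\omega_i\,|A_i - \hat{A}|$ used here are illustrative assumptions; only the ω thresholds and the sign rule come from the text.

```python
import numpy as np

def biased_estimates(A, A_hat):
    """Turn unbiased estimates A_i into biased estimates B_i pulled toward
    the fused value A_hat. g and |s_i| below are assumed forms."""
    A = np.asarray(A, float)
    g = np.abs(A - A_hat) / abs(A_hat)          # assumed deviation measure g_i
    # omega thresholds from the text: 0 below 0.1, 0.5 up to 1, else 1.
    omega = np.where(g < 0.1, 0.0, np.where(g < 1.0, 0.5, 1.0))
    s_mag = omega * np.abs(A - A_hat)           # assumed offset magnitude |s_i|
    s = -np.sign(A - A_hat) * s_mag             # sign rule: pull toward A_hat
    return A + s                                # biased estimates B_i
```

Estimates already close to $\hat{A}$ (small $g_i$) are left untouched, moderate deviations are halved, and large deviations are pulled all the way to $\hat{A}$, which matches the stated purpose of ω.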

Data Fusion
The application of the adaptive weighted data fusion algorithm requires the measurement variance, which is replaced by the estimation variance in unbiased estimation data fusion. In biased estimation data fusion, the estimation variance is calculated from the estimation error, and the design concept of the calculation formula comes from the Kalman gain. Define $e_i$ as the biased estimation error of $B_i$:

$$e_i = |B_i - \mu|.$$

The true value is unknown, so $\hat{A}$ is used instead, and $e_i^2$ is given as

$$e_i^2 = \big(B_i - \hat{A}\big)^2.$$

Then, through 100,000 experiments, $\overline{e_i^2}$, the mean of $e_i^2$, and $\overline{g_i}$, the mean of $g_i$, are calculated for different g-value intervals. The values of $\overline{e_i^2}$ and $\overline{g_i}$ over the three g intervals are shown in Table 1.
Table 1. Mean estimation error and mean g for different g intervals.
$\overline{e_i^2}$ and $\overline{g_i}$ can be related as follows. $\sigma_{B_i}^2$, which represents the biased estimation variance of $B_i$, is then given as follows. Thus, $w_i$, the optimal weighting factor, and $\hat{M}$, the fusion estimation value, are given as follows:

$$w_i = \frac{1 / \sigma_{B_i}^2}{\sum_{j=1}^{N} 1 / \sigma_{B_j}^2}, \qquad \hat{M} = \sum_{i=1}^{N} w_i B_i.$$

The flowchart of biased estimation data fusion is shown in Figure 2.
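The final fusion step can be sketched as below, with the squared biased estimation errors $e_i^2 = (B_i - \hat{A})^2$ standing in for the variances in the standard weight formula (the Table 1 correction relating $\overline{e_i^2}$ to $\overline{g_i}$ is not reproduced here, and the small epsilon guarding against a zero error is an implementation assumption).

```python
import numpy as np

def fuse_biased(B, A_hat, eps=1e-12):
    """Fuse biased estimates B_i with weights from their squared errors
    relative to the fused unbiased estimate A_hat."""
    B = np.asarray(B, float)
    e2 = (B - A_hat) ** 2 + eps        # squared biased estimation errors
    w = (1 / e2) / np.sum(1 / e2)      # optimal weighting factors
    return float(w @ B)                # fusion estimation value M_hat
```

Estimates that land closer to $\hat{A}$ receive larger weights, so the fused value is dominated by the lowest-error sensors rather than the lowest-variance ones.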

Results
The simulation test is divided into two parts: 1. a normal distribution white noise test (all noise follows a normal distribution) and 2. a uniform distribution white noise test (the environmental noise is uniformly distributed white noise). The test items for the two tests are shown in Table 2. The testing mode is as follows: ten rounds of experiments are conducted for each combination of test items, and ten tests are conducted in each round; the average of the ten test results is used as the result of that round. Three sensors are used for each test, with 10 measurement data used per sensor.
The test indicators are the mean relative error (mre) and the mean square error (mse); mre is used to compare the average performance of the three algorithms, and mse is used to compare their stability. Their calculation formulas are as follows:

$$mre_i = \frac{1}{10}\sum_{j=1}^{10} \frac{|m_{ij} - \mu|}{\mu}, \qquad mse_i = \frac{1}{10}\sum_{j=1}^{10} \big(m_{ij} - \mu\big)^2,$$

where i represents the number of the test round and $m_{ij}$ represents the estimated value of the jth test in the ith round. Relative improvement (ri) is used to compare the performance difference between two algorithms, and its calculation formula is

$$ri = \frac{V_\alpha - V_\beta}{V_\alpha} \times 100\%,$$

where $V_\alpha$ and $V_\beta$ represent the same indicator value (e.g., mre) for algorithms α and β. If the indicator has multiple values, the mean of the indicator is calculated first, and then ri. If ri is greater than 0, it means that algorithm β improves on algorithm α by ri.
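The three metrics can be sketched directly; the normalization of mre by the true value µ is reconstructed from the description above, so treat it as an assumption.

```python
import numpy as np

def mre(estimates, mu):
    """Mean relative error of one round's estimates against the true value."""
    m = np.asarray(estimates, float)
    return float(np.mean(np.abs(m - mu) / abs(mu)))

def mse(estimates, mu):
    """Mean square error of one round's estimates against the true value."""
    m = np.asarray(estimates, float)
    return float(np.mean((m - mu) ** 2))

def ri(v_alpha, v_beta):
    """Relative improvement of algorithm beta over algorithm alpha, in %."""
    return 100.0 * (v_alpha - v_beta) / v_alpha
```

A positive `ri` means algorithm β improved on algorithm α, matching the convention used in the tables below.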

Normal Distribution White Noise Test
The test results are as follows (Figures 3-5 show the test results when the measurement noise is 0.25; the remaining result graphs are Figures A1-A6 in Appendix A). From Table 3, it can be seen that the mre of biased estimation data fusion (BED) decreased by an average of 9.25% and 10.26% compared to least squares estimation data fusion (LSD) and batch estimation data fusion (BTD), respectively. Therefore, BED has higher fusion accuracy.
From Table 4, it can be seen that the mse of BED decreased by an average of 16.24% and 20.28% compared to LSD and BTD, respectively. Therefore, BED has higher stability.
Moreover, as the noise variance increases, the average increases in the mre and mse of LSD are 0.297% and 0.036, those of BTD are 0.306% and 0.038, and those of BED are 0.269% and 0.03. So, the mre and mse of BED decreased by an average of 11.59% and 18.85% compared to LSD and BTD, respectively. Therefore, BED has higher noise resistance.

Uniform Distribution White Noise Test
The test results are as follows (Figures 6-8 show the test results when the measurement noise is 0.25; the remaining result graphs are shown in Figures A7-A12). From Table 5, it can be seen that the mre of BED decreased by an average of 8.56% and 6.58% compared to LSD and BTD, respectively. Therefore, BED has higher fusion accuracy, although its mre advantage is reduced by 22.39% compared to the normal distribution white noise test.
From Table 6, it can be seen that the mse of BED decreased by an average of 17.42% and 21.01% compared to LSD and BTD, respectively. Therefore, BED has higher stability, and its mse advantage is increased by 5.23% compared to the normal distribution white noise test.

Figure 1. Flowchart of batch estimation data fusion.

Table 2. The test items for the two tests.

Table 3. mre of all algorithms when noise follows normal distribution.

Table 4. mse of all algorithms when noise follows normal distribution.

Table 5. mre of all algorithms when noise follows uniform distribution.

Table 6. mse of all algorithms when noise follows uniform distribution.