An Integrated Model for Robust Multisensor Data Fusion

This paper presents an integrated model aimed at obtaining robust and reliable results in decision-level multisensor data fusion applications. The proposed model combines Dempster-Shafer evidence theory with an extreme learning machine. It includes three main improvements: a mass construction algorithm to build reasonable basic belief assignments (BBAs); an evidence synthesis method to obtain a comprehensive BBA for an information source from several mass functions or experts; and a new way to make high-precision decisions based on an extreme learning machine (ELM). Unlike some universal classification methods, the proposed model can be applied directly in multisensor data fusion applications, not only in conventional classification. Experimental results demonstrate that the proposed model is able to yield robust and reliable results in multisensor data fusion problems. In addition, this paper draws some conclusions that have implications for future studies.


Introduction
Multisensor data fusion is a technology that combines information from several sensors into a unified result [1]. In multisensor data fusion, the information to be handled is typically random, vague, imprecise and heterogeneous. The developed data fusion framework needs to be able to eliminate the [...] functions or experts maintain their own positions, and we have found that the synthetic BBA is always more reliable than independent BBAs. However, how to build a reasonable BBA synthesizing algorithm becomes another problem. The Jousselme distance [21] is a widely accepted way of measuring the distance between two bodies of evidence, and it properly reflects the degree of conflict between them. Hence, we develop the synthesizing algorithm using the Jousselme distance to obtain synthetic BBAs, which represent comprehensive knowledge of the information source by combining the views of different BBA functions or experts.
The decision-making mechanism is also vital for obtaining highly accurate fusion results. From the perspective of dimensionality reduction, the unified BBA can be regarded as a comprehensive representation of the original multisensor data. It is much easier to make a decision with the unified BBA than with the original data. Traditionally, the Transferable Belief Model (TBM) is used to convert the final BBA to a Pignistic probability. However, the transferred Pignistic probability depends essentially on the singleton classes, while the compound classes have no decisive influence on the final decision. When the belief assignments of the compound classes are larger than those of the singleton classes, the accuracy of TBM is questionable. ELM [22] is a fast, easily implemented ANN that requires no iterative operation, and to our knowledge its accuracy is no worse than that of other ANNs. Thus, a decision-making mechanism based on the extreme learning machine (ELM) is presented to solve the decision-making problem.
A systematic multisensor data fusion model is built on the basis of the above three main improvements. The framework is illustrated in detail and includes four steps: BBA construction, BBA synthesis, combination of evidence and decision making. Experimental results and analysis on the IRIS data set and the Diabetes Diagnosis data set are presented to show the performance and accuracy of the proposed algorithm.
The remainder of this paper is organized as follows: Section 2 introduces the preliminaries of the Dempster Shafer evidence theory. Section 3 illustrates the proposed method in detail. The experiments, along with the observations, are provided in Section 4. Conclusions and discussions are presented in Section 5.

Preliminaries of Dempster Shafer Evidence Theory
Dempster Shafer evidence theory (DSET) is an extension of classical probability theory. It is a flexible evidential reasoning approach for dealing with the uncertainty in multisensor data fusion. Let $\Omega = \{\omega_1, \cdots, \omega_c\}$ be a finite non-empty set whose elements are mutually exclusive and exhaustive. $\Omega$ is called the frame of discernment, and the corresponding power set is $2^{\Omega}$, which is composed of all possible subsets of $\Omega$. The mass function on $2^{\Omega}$ is defined as a function $m: 2^{\Omega} \to [0, 1]$ that satisfies the following property:

$$m(\emptyset) = 0, \qquad \sum_{A \subseteq \Omega} m(A) = 1$$

where $\emptyset$ denotes the null set and $m(A)$ is called the basic belief assignment (BBA) of subset $A$. The numerical value of $m(A)$ can be interpreted as the degree of support for the proposition $A$ given all relevant and available evidence. A focal element is a subset $A$ with non-zero mass assignment, and we call $(A, m(A))$ a piece of evidence. For a given subset $A$ of $\Omega$, the belief function and plausibility function of $A$ are denoted by $Bel(A)$ and $Pl(A)$, respectively. $Bel(A)$ is the total mass of the elements belonging to $A$, and $Pl(A)$ is the maximum total mass of the elements that may be distributed in $A$. Therefore, they are defined as:

$$Bel(A) = \sum_{B \subseteq A} m(B), \qquad Pl(A) = \sum_{B \cap A \neq \emptyset} m(B)$$

where $Bel(A)$ and $Pl(A)$ are the lower and upper belief of hypothesis $A$, respectively. $Bel(A)$ and $Pl(A)$ satisfy the following relation:

$$Pl(A) = 1 - Bel(\bar{A})$$

where $\bar{A}$ is the complementary set of $A$. When $Bel(A) = Pl(A)$, $A$ must be a singleton class. For an arbitrary focal element $A$ in $\Omega$, its BBA is distributed within the belief interval $[Bel(A), Pl(A)]$. In DSET, Dempster's combinational rule can be used to fuse all independent pieces of evidence into one. It is expressed as:

$$m(A) = (m_1 \oplus \cdots \oplus m_n)(A) = \frac{1}{1 - K} \sum_{A_1 \cap \cdots \cap A_n = A} \; \prod_{i=1}^{n} m_i(A_i), \qquad K = \sum_{A_1 \cap \cdots \cap A_n = \emptyset} \; \prod_{i=1}^{n} m_i(A_i)$$

where $\oplus$ denotes the combinational operator, $A_i$ designates the focal element of data source $i$, and $K$ indicates the conflict among the sources to be combined. After combining, a Pignistic probability can be obtained by using the Transferable Belief Model (TBM), and a typical transfer formula is defined as [17]:

$$BetP(\omega) = \sum_{A \subseteq \Omega,\; \omega \in A} \frac{m(A)}{|A|}$$

where $BetP(\omega)$ is the transferred Pignistic probability of $\omega$. Finally, a decision can be made by choosing the class with the maximum Pignistic probability as the result of the multisensor data fusion process.
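As a minimal sketch of the two operations described above, the following Python snippet combines two mass functions with Dempster's rule and then applies the Pignistic transform. The function names (`dempster_combine`, `pignistic`), the dictionary representation of a BBA and the numeric masses are illustrative assumptions, not part of the original model.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two BBAs (dict: frozenset -> mass) with Dempster's rule.

    Returns the normalized combined BBA and the conflict K.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass that would fall on the empty set is counted as conflict.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("Sources are in total conflict; the rule is undefined.")
    normalized = {s: v / (1.0 - conflict) for s, v in combined.items()}
    return normalized, conflict


def pignistic(m):
    """Pignistic transform: share each mass equally among the elements of its set."""
    bet = {}
    for subset, mass in m.items():
        for element in subset:
            bet[element] = bet.get(element, 0.0) + mass / len(subset)
    return bet


# Hypothetical two-class frame and two illustrative sources.
OMEGA = frozenset({"a", "b"})
m1 = {frozenset({"a"}): 0.6, OMEGA: 0.4}
m2 = {frozenset({"b"}): 0.5, OMEGA: 0.5}

fused, K = dempster_combine(m1, m2)
bet = pignistic(fused)
decision = max(bet, key=bet.get)
```

Because Dempster's rule is associative, more than two sources can be fused by applying `dempster_combine` repeatedly.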

Overview
The main goal of the framework is to guarantee accurate and robust decision-level fusion results, even in situations with highly complex and nonlinear data sources. Robustness is therefore the guiding principle of the algorithms throughout the process, and it is supported by the three aforementioned measures: BBA construction, BBA synthesis and decision making. The process of the data fusion model is divided into four steps, as shown in Figure 1.
Let $\Omega = \{\omega_1, \cdots, \omega_c\}$ be the frame of discernment containing $c$ elements, let $F = \{f_1, \cdots, f_k\}$ be the BBA functions or experts, and let $S = \{s_1, \cdots, s_n\}$ be a set of sensors. In step 1, there are $k$ functions or experts and $n$ sensors. Every BBA constructing function or expert generates a BBA for each sensor, hence $k \times n$ BBAs are obtained in step 1.
Step 2 is the BBA synthesis process, in which the developed synthesis algorithm reduces the $k \times n$ BBAs to $n$ synthetic BBAs. The combination is conducted in step 3: using Dempster's combinational rule, the $n$ synthetic BBAs are combined into one unified BBA.
Step 4 is the decision making process. With a trained ELM, the unified BBA will be transferred into the final output, which is easy for making decisions.

BBA Constructing Function Model
For decision fusion, local classification or decision results are essential before they can be fused into a unified one. In scenarios where there is a large amount of raw data, uploading it directly to the cluster node (or sink node) is very costly. Uploading local decision results instead greatly reduces both the amount of data transmitted and the energy consumed, which is significant for distributed sensor networks, especially wireless sensor networks (WSNs). Therefore, a BBA constructing algorithm is necessary to obtain the local classification results.
In expert systems, BBAs are constructed based on human decisions. This paper focuses on constructing a BBA from data sources. Distance is a widely used measure of the similarity between an object and a class. However, each kind of distance reflects its own view. Let $x$ be the object to be classified. The training set is $\Gamma = \{(x_1, L_1), \cdots, (x_N, L_N)\}$, where $L_i$ ($i = 1, \ldots, N$) is the training class with respect to $x_i$. There are various kinds of distances, such as the Euclidean distance, Mahalanobis distance and Manhattan distance, to name a few. Here, we express the set of distances as $D = \{d_1, \ldots, d_k\}$, where $d_j$ ($j = 1, \ldots, k$) is the $j$th distance definition in $D$.
First, the distance between the object and each training sample must be calculated. The general distance can be expressed as:

$$d_j^{\,i} = \mathrm{dis}_j\left(\lVert x - x_i \rVert\right), \qquad i = 1, \ldots, N, \quad j = 1, \ldots, k$$

where $d_j^{\,i}$ is the distance between $x$ and $x_i$ according to the $j$th distance definition in $D$. The sample set can be the whole given sample set or the nearest objects around $x$ in the given sample set. If a sample $x_i$ belongs to a class $\omega_q$, then the mass should be assigned to two subsets of $\Omega$, namely $\{\omega_q\}$ and $\Omega$. The assigned BBA $m_j(\cdot \mid x_i)$ can then be defined as:

$$m_j(\{\omega_q\} \mid x_i) = \alpha\,\varphi(d_j^{\,i}), \qquad m_j(\Omega \mid x_i) = 1 - \alpha\,\varphi(d_j^{\,i}), \qquad m_j(A \mid x_i) = 0 \;\; \forall A \in 2^{\Omega} \setminus \{\{\omega_q\}, \Omega\}$$

where $m_j(\cdot \mid x_i)$ is the assigned BBA of $x$ according to the $j$th distance definition in $D$, $\alpha \in (0, 1)$ is a constant, and $\varphi$ is a monotonically decreasing radial basis function with $\varphi(0) = 1$ and $\lim_{d \to \infty} \varphi(d) = 0$. It can be postulated as:

$$\varphi(d) = \exp(-\gamma d^{2})$$

where $\gamma$ is a positive constant. In [23], a method for optimizing the parameters $\alpha$ and $\gamma$ using the KNN method is described. Here we give a more exact form of $\gamma$, defined as: [...] where $\bar{d}$ is the mean value of the distance between the object and each class in $\Omega$, and $\upsilon$ can be changed to adjust the discrimination degree of the obtained BBA. After considering each pattern, the BBA function is calculated as:

$$m_j = \bigoplus_{x_i \in \Gamma} m_j(\cdot \mid x_i)$$

where $m_j$ is the BBA with respect to function $f_j$, and $m_j(\cdot \mid x_i)$ is the BBA calculated by Equations (10)-(12). With the $k$ different BBA functions, for a hypothesis $A$, the generated BBAs are $\{m_1(A), \ldots, m_k(A)\}$. The next step composes the synthetic BBA from the $k$ BBAs generated by the different functions.
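The construction described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: it assumes the Euclidean distance, a decreasing function of the form exp(-gamma * d^2) with a fixed hypothetical `gamma`, per-sample BBAs merged with Dempster's rule, and three made-up two-dimensional training points labeled with the IRIS class abbreviations.

```python
import math
from itertools import product

# Frame of discernment; the compound set Omega carries the unassigned belief.
OMEGA = frozenset({"Se", "Ve", "Vi"})


def phi(d, gamma):
    """Assumed decreasing function with phi(0) = 1 and phi(d) -> 0 as d grows."""
    return math.exp(-gamma * d * d)


def sample_bba(x, xi, label, alpha, gamma):
    """BBA induced by one training sample: mass on {label} and the rest on Omega."""
    d = math.dist(x, xi)  # Euclidean distance (Python 3.8+)
    support = alpha * phi(d, gamma)
    return {frozenset({label}): support, OMEGA: 1.0 - support}


def combine(m1, m2):
    """Dempster's rule for two BBAs given as dict: frozenset -> mass."""
    out, conflict = {}, 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            out[common] = out.get(common, 0.0) + ma * mb
        else:
            conflict += ma * mb
    return {s: v / (1.0 - conflict) for s, v in out.items()}


def construct_bba(x, training, alpha=0.85, gamma=1.0):
    """Fuse the BBAs induced by every training sample into one BBA for x."""
    bba = {OMEGA: 1.0}  # vacuous BBA, the neutral element of Dempster's rule
    for xi, label in training:
        bba = combine(bba, sample_bba(x, xi, label, alpha, gamma))
    return bba


# Hypothetical training samples (feature vector, class label).
training = [((5.0, 3.4), "Se"), ((6.0, 2.8), "Ve"), ((6.6, 3.0), "Vi")]
bba = construct_bba((5.1, 3.5), training)
best = max((s for s in bba if s != OMEGA), key=bba.get)
```

Since `alpha` is below 1, every per-sample BBA keeps some mass on `OMEGA`, so the conflict never reaches 1 and the normalization is always defined.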

BBA Synthetic Algorithm
Correct BBAs are the prerequisite for applying DSET in multi-source information fusion. In reality, there is no universally applicable BBA-constructing function. Different kinds of BBA functions are based on different theories, and each of them reflects its own view. For one information source, the BBAs generated by different functions are generally different. Dempster's combinational rule requires mutually independent pieces of evidence, so it cannot be applied to combine BBAs obtained from the same source. Hence, a new method that combines these BBAs into a comprehensive one covering a wide range of aspects of uncertainty is important for obtaining reasonable synthetic BBAs. According to reference [21], the distance between two bodies of evidence can be defined as:

$$d_{BPA}(m_1, m_2) = \sqrt{\tfrac{1}{2} \left(\vec{m}_1 - \vec{m}_2\right)^{T} \underline{\underline{D}} \left(\vec{m}_1 - \vec{m}_2\right)}$$

where $\vec{m}_1$ and $\vec{m}_2$ are evidence vectors, and $\underline{\underline{D}}$ is a positive definite matrix whose coefficients are obtained as:

$$\underline{\underline{D}}(A, B) = \frac{|A \cap B|}{|A \cup B|}, \qquad A, B \in 2^{\Omega}$$

In order to turn the reliability of a function into a metric, we define the credibility of a BBA as: [...] where $Crd_j$ is the credibility of the BBA with respect to mass function $f_j$. The expression above means that a BBA lying farther from the other BBAs is assigned a lower credibility value and is regarded as less reliable. The sum of the credibilities is:

$$Crd = \sum_{j=1}^{k} Crd_j \tag{20}$$

Thus, the normalized credibility $w_j$ can be calculated by:

$$w_j = \frac{Crd_j}{Crd}$$

The sum of the $w_j$ is $\sum_{j=1}^{k} w_j = 1$. Then, the synthetic BBA can be calculated by the following weighted sum:

$$\tilde{m}(A) = \sum_{j=1}^{k} w_j\, m_j(A)$$

where $\tilde{m}(A)$ denotes the hybrid BBA with respect to hypothesis $A$, and $w_j$ is the weight of the BBA with respect to function $f_j$. The value of $w_j$ shows the reliability of the corresponding function: a more believable function is assigned a larger credibility than the other functions. The final BBA is regarded as the local decision and uploaded to the fusion center.
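The synthesis step can be sketched as follows. The Jousselme distance follows the standard definition from [21]; the credibility used here (the summed similarity, one minus the distance, to all other BBAs) is only an assumed stand-in that has the property described above, namely that a BBA far from the others receives a lower weight. The function names and the three example BBAs are hypothetical.

```python
import math


def jaccard(a, b):
    """Coefficient |A n B| / |A u B| for two non-empty subsets."""
    return len(a & b) / len(a | b)


def jousselme(m1, m2, subsets):
    """Jousselme distance between two BBAs given as mass lists aligned with `subsets`."""
    diff = [p - q for p, q in zip(m1, m2)]
    quad = sum(
        diff[i] * jaccard(a, b) * diff[j]
        for i, a in enumerate(subsets)
        for j, b in enumerate(subsets)
    )
    return math.sqrt(max(0.5 * quad, 0.0))  # clamp tiny negative rounding errors


def synthesize(bbas, subsets):
    """Weighted-sum synthesis of several BBAs built for the same source."""
    k = len(bbas)
    # ASSUMPTION: credibility = total similarity (1 - distance) to the other BBAs.
    crd = [
        sum(1.0 - jousselme(bbas[i], bbas[j], subsets) for j in range(k) if j != i)
        for i in range(k)
    ]
    total = sum(crd)
    weights = [c / total for c in crd]
    synthetic = [
        sum(w * m[s] for w, m in zip(weights, bbas)) for s in range(len(subsets))
    ]
    return synthetic, weights


# Focal sets {a}, {b} and the compound set {a, b}.
subsets = [frozenset({"a"}), frozenset({"b"}), frozenset({"a", "b"})]
# Three illustrative BBAs for one source; the third disagrees with the other two.
bbas = [[0.7, 0.1, 0.2], [0.6, 0.2, 0.2], [0.1, 0.7, 0.2]]

synthetic, weights = synthesize(bbas, subsets)
```

In this example the third BBA lies farthest from the others, so it receives the smallest weight and the synthetic BBA stays close to the majority view.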

Combination of the Synthetic BBAs
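Dempster's combinational rule has been widely accepted as a method for combining various pieces of evidence into a unified one. With the $n$ obtained pieces of evidence, where $\tilde{m}_i$ denotes the synthetic BBA of the $i$th source, the unified evidence can be defined as:

$$m(A) = (\tilde{m}_1 \oplus \cdots \oplus \tilde{m}_n)(A) = \frac{1}{1 - K} \sum_{A_1 \cap \cdots \cap A_n = A} \; \prod_{i=1}^{n} \tilde{m}_i(A_i)$$

where $K$ denotes the conflict degree, and the final unified BBA $m$ contains the comprehensive knowledge of all information sources. Note that other combinational rules, including the method proposed in Section 3.3, can be applied instead of Dempster's combinational rule.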

ELM Based Decision Making Model
Unlike the traditional TBM transfer mechanism, this paper uses a decision-making algorithm based on ELM. The extreme learning machine (ELM) was proposed by Huang and his colleagues [24]. This single-hidden-layer feedforward neural network (SLFN) has a fast learning speed and both universal approximation and classification capabilities [25]. In ELM, the input weights of the hidden neurons can be generated randomly, and they are independent of the application. Thus, ELM requires no iterative calculation to determine the input weights, which is a big advantage over traditional artificial neural networks.
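As shown in Figure 2, ELM can be directly applied to conduct the decision-making process. For $N$ arbitrary distinct samples $(x_i, t_i)$ ($i = 1, \ldots, N$), where the input data $x_i = (m_i(\omega_1), \ldots, m_i(\omega_c), m_i(\Omega)) \in \mathbb{R}^{c+1}$ and the output data $t_i = (t_{i1}, \ldots, t_{ic}) \in \mathbb{R}^{c}$, the output of the network with $L$ hidden nodes can be expressed as:

$$o_i = \sum_{j=1}^{L} \beta_j\, g(w_j \cdot x_i + b_j), \qquad i = 1, \ldots, N$$

where $o_i$ is the network output for $x_i$, $w_j = [w_{j1}, \ldots, w_{j(c+1)}]^{T}$ is the input weight vector between the input nodes and the $j$th hidden node, $\beta_j = [\beta_{j1}, \ldots, \beta_{jc}]^{T}$ is the output weight vector between the $j$th hidden node and the output nodes, and $b_j$ is the bias threshold of the $j$th hidden node. $w_j$ and $b_j$ are generated randomly and are independent of any specific application.
Let $g(\cdot)$ be the activation function. The above $N$ equations can be written as:

$$H\beta = T$$

with:

$$H = \begin{bmatrix} g(w_1 \cdot x_1 + b_1) & \cdots & g(w_L \cdot x_1 + b_L) \\ \vdots & \ddots & \vdots \\ g(w_1 \cdot x_N + b_1) & \cdots & g(w_L \cdot x_N + b_L) \end{bmatrix}_{N \times L}, \qquad \beta = \begin{bmatrix} \beta_1^{T} \\ \vdots \\ \beta_L^{T} \end{bmatrix}_{L \times c}, \qquad T = \begin{bmatrix} t_1^{T} \\ \vdots \\ t_N^{T} \end{bmatrix}_{N \times c}$$

where $H$ is the hidden-layer output matrix. The output weight matrix can be calculated by:

$$\hat{\beta} = H^{\dagger} T$$

where $H^{\dagger}$ is the Moore-Penrose generalized inverse of matrix $H$. To improve the robustness of the generalization performance, the above expression can be modified as [22]:

$$\hat{\beta} = H^{T} \left( \frac{I}{C} + H H^{T} \right)^{-1} T$$

where $C$ is a regularization parameter. The ELM algorithm can be designed in 3 steps: (1) assign the input weight matrix $w$ and bias $b$ randomly; (2) calculate the output matrix $H$ of the hidden nodes; (3) calculate the output weight matrix $\hat{\beta}$. After training, the ELM is able to perform specific functions, such as approximation and regression.
Given $G$ newly observed unified BBAs with masses $M = \{m_1, \ldots, m_G\}$, where the $g$-th one is $m_g$, $g = 1, \ldots, G$, the output of the ELM is:

$$O = [o_1, \ldots, o_G], \qquad o_g = \sum_{j=1}^{L} \hat{\beta}_j\, g(w_j \cdot m_g + b_j)$$

where $O$ is the output matrix, and:

$$\mathrm{label}_g = \arg\max_{q \in \{1, \ldots, c\}} O_{qg}$$

where $\mathrm{label}_g$ is the final decision with respect to the $g$-th column of $O$. Note that other ANN algorithms, such as BP and RBF networks, can also be applied in this step; the decision-making policy is the same as for ELM.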
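The three training steps and the decision rule can be sketched with NumPy as shown below. This is a minimal illustration under stated assumptions: the class name `ELM`, the helper `make_bbas`, the regularization value and the synthetic two-class BBAs are hypothetical, the Gaussian activation exp(-z^2) is an assumed reading of the 'radbas' function, and the regularized solution is written in the algebraically equivalent form (I/C + H^T H)^-1 H^T T.

```python
import numpy as np


class ELM:
    """Single-hidden-layer network with random input weights (a sketch, not the authors' code)."""

    def __init__(self, n_hidden=20, C=1e3, seed=0):
        self.n_hidden = n_hidden
        self.C = C
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Gaussian activation g(z) = exp(-z^2), applied to w . x + b.
        return np.exp(-((X @ self.W + self.b) ** 2))

    def fit(self, X, T):
        # Step 1: assign input weights and biases randomly.
        self.W = self.rng.uniform(-1.0, 1.0, (X.shape[1], self.n_hidden))
        self.b = self.rng.uniform(-1.0, 1.0, self.n_hidden)
        # Step 2: compute the hidden-layer output matrix H.
        H = self._hidden(X)
        # Step 3: solve the regularized least-squares problem for beta.
        A = np.eye(self.n_hidden) / self.C + H.T @ H
        self.beta = np.linalg.solve(A, H.T @ T)
        return self

    def predict(self, X):
        # Decision: class with the largest network output.
        return np.argmax(self._hidden(X) @ self.beta, axis=1)


def make_bbas(n, rng):
    """Synthetic unified BBAs (m(w1), m(w2), m(Omega)) for two classes."""
    y = rng.integers(0, 2, n)
    omega = rng.uniform(0.0, 0.6, n)      # mass left on the compound set
    frac = rng.uniform(0.6, 1.0, n)       # share of the rest given to the true class
    win, lose = (1 - omega) * frac, (1 - omega) * (1 - frac)
    X = np.where(y[:, None] == 0, np.c_[win, lose, omega], np.c_[lose, win, omega])
    return X, y


rng = np.random.default_rng(1)
X_train, y_train = make_bbas(200, rng)
X_test, y_test = make_bbas(100, rng)

model = ELM(n_hidden=20).fit(X_train, np.eye(2)[y_train])  # one-hot targets
accuracy = float(np.mean(model.predict(X_test) == y_test))
```

No iteration is involved: the only fitted quantity is `beta`, obtained from a single linear solve.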

Experimental Results
This section describes the experiments performed to test the proposed data fusion algorithm. In the following experiments, we evaluate our model in three steps. First, we use the IRIS data set to illustrate the performance of the proposed mass construction algorithm. Second, we use the Diabetes Database data set to train our model, and then collect data from people aged 40 to 60 using human body sensors and predict their health condition. Finally, the last experiment applies the proposed framework to vehicle type classification. These tests and their corresponding results are described in the following sections.

Experiment on IRIS Data Set
In this experiment, we use the IRIS data collected by the statistician Fisher [26] to simulate the algorithm. The accuracy of the constructed BBAs is measured by:

$$P(\omega_q) = \frac{N_q}{N} \times 100\%$$

where $P(\omega_q)$ denotes the accuracy rate with respect to class $\omega_q$, $N$ is the total number of BBAs in the test sets, and $N_q$ is the number of accurate BBAs. If the object to be classified belongs to $\omega_q$, $m(\{\omega_q\})$ should be larger than the mass of any other singleton class. The accuracies are calculated on the 7 source data sets. The accuracies of the BBAs for different sources and BBA functions are shown in Table 1.
Figure 3 shows some of the BBAs calculated using the Euclidean distance (Eu). The horizontal axis denotes the index of the test object, and the vertical axis represents the mass assignments of each object. The masses of each object sum to 1. The first 10 objects are Se (setosa), the next 10 objects are Ve (versicolor), and the last 10 objects are Vi (virginica). In Table 1, the syn-BBA denotes the BBAs synthesized from the four BBAs with different distance definitions. We make the following observations:
(1) The proposed mass construction method is able to build BBAs from observed data and information accurately and effectively. With a distinguishable data set, the mass of the compound classes is much lower than the sum of the masses of the singleton classes, and the belief assignment of the class to which the object belongs is always much larger than those of the other classes. As shown in Figure 3b-d and Table 1, the accuracies for PL, (SL, SW), and (SL, SW, PL, PW) calculated with Eu are 96.67%, 83.33%, and 98.33%, respectively. Given an ambiguous data set, as shown in Figure 3a, the masses of each object are likely to be confusing, which significantly decreases the accuracy rate and the belief assignments of Ω (Se|Ve|Vi); the accuracy is only 58.33%.
(2) Higher-dimensional data enhances the accuracy of BBAs. With the same BBA function, the BBAs obtained from SW in Figure 3a have low accuracy (58.33%), while the accuracy is much higher in Figure 3c (83.33%), where the BBAs are calculated from (SL, SW). In Figure 3d, the accuracy of the BBAs is 98.33% and the dimension is 4. In a higher-dimensional space, the boundaries between the plant classes are separated more clearly. Generally, higher-dimensional data yields more accurate BBAs, as the BBA accuracies for (SL, SW, PL, PW) are higher than the others, except for the Ch-BBA function.
(3) The synthesized BBA is able to describe the data source comprehensively. When assigning belief for an information source, different methods hold their own views, and their results may differ. Thus, synthesizing different BBAs into a unified comprehensive BBA yields a result that reflects the views of the majority.

Experiment on Diabetes Data Set
The Pima Indians Diabetes Database (available at [27]) was developed at the Applied Physics Laboratory, Johns Hopkins University. The eight indices in the data represent the diagnostic signs of diabetes according to World Health Organization criteria. The database comprises data from 768 women over the age of 21 residing in Phoenix (Arizona, USA). All examples belong to either the positive class (denoted by 1) or the negative class (denoted by 0). All input values are within [0, 1]. To test the proposed method, 75% (576) and 25% (192) of the samples are chosen randomly for training and testing, respectively, at each trial.

Experimental Results with Changing α
To get a better understanding of the proposed algorithm, additional experiments are conducted. In another trial, we set different values of α to determine its influence on BBA construction and on the accuracy of the final result. The object is selected randomly from the test data set. The three classes of the power set are {Diagnosis, Not Diagnosis, All}, where 'All' denotes the compound set. Parameter α is increased monotonically from 0 to 1, and each value is used to calculate the corresponding BBA. The BBAs obtained with different α are shown in Figure 4. The accuracies of the final unified BBA and the final decision are shown in Figure 5. In Figure 4, the BBA is obtained from the same object, which belongs to the diabetes diagnosis class. In Figure 5, the accuracy rates of the BBA and of the decisions are both determined on the training data set and the testing data set. From Figures 4 and 5, we can draw the following conclusions:
(1) Parameter α changes the belief assignments of each subset in $2^{\Omega}$. When α is close to 0, $m(\Omega) \approx 1$ and the belief assignments of the singleton classes are close to 0. With increasing α, $m(\Omega)$ decreases to a very low level while the belief assignments of the singleton classes increase to high levels. The gap between them will gradually diminish. It is therefore strongly advised to set α > 0.7 to obtain a high degree of differentiation in the BBA.
(2) Parameter α has no influence on the average accuracy of the BBA and decisions. In Figure 4, the BBA is larger for 'Diagnosis' than for 'Not Diagnosis', regardless of the value of α. In Figure 5, the accuracies of the unified BBAs on the training data set and testing data set are 68.7% and 66.7%, respectively. The accuracies of the final decision results on the training data set and testing data set are about 79% and 78%, respectively. Note that the fluctuation in decision accuracy is caused by the instability of ELM. The stable accuracy rates illustrate that α has no influence on the accuracy of the BBA or of the decisions.
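The two conclusions above can be illustrated with a small sweep over the discounting parameter. The sketch below is a toy setting, not the experiment itself: it assumes a decreasing function of the form exp(-gamma * d^2), one training sample per class at made-up distances `d1` and `d2`, and the closed form of Dempster's rule for two such simple BBAs. The function name `two_class_bba` is hypothetical.

```python
import math


def two_class_bba(d1, d2, alpha, gamma=1.0):
    """Combined BBA from one 'Diagnosis' sample at distance d1 and one
    'Not Diagnosis' sample at distance d2 (Dempster's rule in closed form)."""
    s1 = alpha * math.exp(-gamma * d1 ** 2)  # support for 'Diagnosis'
    s2 = alpha * math.exp(-gamma * d2 ** 2)  # support for 'Not Diagnosis'
    conflict = s1 * s2                       # mass on the empty intersection
    return {
        "Diagnosis": s1 * (1 - s2) / (1 - conflict),
        "Not Diagnosis": s2 * (1 - s1) / (1 - conflict),
        "All": (1 - s1) * (1 - s2) / (1 - conflict),
    }


# Sweep alpha for an object that lies closer to the 'Diagnosis' sample.
alphas = (0.1, 0.3, 0.5, 0.7, 0.9)
sweep = {a: two_class_bba(0.5, 1.2, a) for a in alphas}

for a in alphas:
    bba = sweep[a]
    print(f"alpha={a:.1f}  " + "  ".join(f"{k}={v:.3f}" for k, v in bba.items()))
```

In this toy setting the mass on 'All' shrinks as the parameter grows, while the ordering of the two singleton classes never changes, which is consistent with the observation that the parameter affects the sharpness of the BBA but not which class it favors.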

Experimental Results of Accuracies
Many algorithms and methods, such as the BP neural network [28], the support vector machine (SVM) [29], ELM [22][23][24], and others, can be applied to this database to obtain classification results. To obtain a clear picture of the performance, we compare different algorithms, including BPNN, SVM, ELM, evidential data fusion with the Pignistic transfer method (DSET-P) and the proposed DSET-E.
In this test, the parameter C of the SVM algorithm is set to 10, and its accuracy results are obtained with 317 support vectors on average. The BPNN and ELM each use 20 hidden nodes. DSET-P and DSET-E use the same process to calculate the unified BBA, so their final BBAs are the same.
The parameter [...] is set at 1, and the BBA function uses only the Mahalanobis distance because we have found that it has a high accuracy in constructing masses. In DSET-P, the unified BBA is converted to a probability using the Pignistic transferring method in [19]. In DSET-E, the unified BBA is the input of a trained ELM, which is used to transform the BBA into results and make decisions.
As shown in Table 2, all accuracies are averages over 100 repeated trials. From Table 2, we make the following observations:
(1) The proposed DSET-E algorithm performs well on classification problems. It obtains a testing accuracy of 78.14%, outperforming the other methods, although the margin is small. Compared with the DSET-P method, the accuracy increases from 66.67% to 78.14%, which indicates that the whole algorithm is reasonable and effective.
(2) The traditional Pignistic transferring method is not a desirable algorithm in evidential data fusion, especially in situations with highly complex and nonlinear data sources. The accuracy of DSET-P is 66.67%, which is much lower than the accuracies of the other methods. In this problem, the highly complex source data are difficult to distinguish, and the final unified BBA has a low accuracy when calculated by Equation (30). The final decision made by the Pignistic probability transferring model has the same accuracy as the unified BBA, which is 66.67%. With the same final unified BBA, the DSET-E decision accuracy on the test objects is higher by 11.47 percentage points.
(3) The belief assignments of the compound classes are also important in decision-making. In the traditional Pignistic probability transferring model, the belief assignment of a compound class is divided up among its elements, which makes no difference to the decision. Clearly, this is not suitable for all conditions. The belief assignment of a compound class expresses uncertainty, which makes it difficult to decide to which class it should belong. It should be allocated to the singleton classes according to the actual situation.
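A short worked example, assuming the standard Pignistic transform over a two-class frame $\Omega = \{\omega_1, \omega_2\}$ and purely illustrative masses, shows why the compound class cannot change the decision under this transform:

$$
\begin{aligned}
BetP(\omega_1) &= m(\{\omega_1\}) + \tfrac{1}{2}\, m(\Omega), \qquad
BetP(\omega_2) = m(\{\omega_2\}) + \tfrac{1}{2}\, m(\Omega) \\
BetP(\omega_1) - BetP(\omega_2) &= m(\{\omega_1\}) - m(\{\omega_2\})
\end{aligned}
$$

For instance, with $m(\{\omega_1\}) = 0.30$, $m(\{\omega_2\}) = 0.25$ and $m(\Omega) = 0.45$, the transform gives $BetP(\omega_1) = 0.525$ and $BetP(\omega_2) = 0.475$. The decision is fixed by the singleton masses alone, even though the compound class holds the largest share of belief. A learned mapping such as the ELM is not bound by this equal split.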

Experiment on Vehicle Type Classification Data Set
In this experiment, a data set for vehicle type classification (the data set can be downloaded at [30]) is used to test the proposed algorithm. In the test, 23 wireless distributed sensor nodes are used to classify the types of the vehicles. When a vehicle is passing by, the nearby sensor nodes are able to record the signals in three modalities: acoustic, seismic and infrared. We use the recorded acoustic and seismic signals to classify two possible vehicles: Assault Amphibian Vehicle (AAV) and DragonWagon (DW). Before classification, feature vectors must be extracted from the raw signals. A detailed introduction of the feature extraction method can be found in [31].
The experiment includes two parts. Part one is the classification based on the whole data set; in this scenario, universal classification algorithms can be used directly, and their classification results are presented. Part two is the classification conducted in a distributed multisensor data fusion manner. Experimental results of local classification and data fusion are also presented in the following sections.
In the first test, five classification methods are used: k-NN, ELM, SVM, DSET-P and DSET-E. The sample set consists of 535 feature vectors, which are randomly selected from the whole provided feature data set. Within the sample set, 277 feature vectors belong to vehicle AAV, and the rest are DW. The validation set has 236 feature vectors, which are also randomly selected from the whole feature data set. The classification results are given in Table 3. As shown in Table 3, five classification algorithms are used for local classification. The parameter k of the k-NN method, the number of hidden nodes of ELM and the parameter C of SVM are set to 15, 100 and 1, respectively. The mass construction used in DSET-P and DSET-E is the method proposed in Section 3.2, and parameter α is set to 0.85. From Table 3, we can conclude that the proposed DSET-E gives a more reliable result than the other methods, which is consistent with the results in Table 2.
We then conducted the task in a multisensor data fusion scenario, in which each sensor node has its own sample data set that it collected itself. Since the energy and bandwidth of wireless sensors are strictly limited, uploading the raw data to the sink node is impractical. Therefore, a local classification needs to be conducted in each sensor node, and these local results are then uploaded to the fusion center for the final decision by data fusion algorithms. Apart from DSET-P and DSET-E, the algorithms used in Table 3 cannot be applied directly in this distributed setting. The local classification accuracies and the final data fusion accuracies are shown in Tables 4 and 5, respectively.
As shown in Tables 4 and 5, 11 sensor nodes are used for the collaborative data fusion task. Both test sets have 1177 vector samples, of which 615 vectors belong to AAV and the other 562 vectors are DW. From Tables 3 to 5, it can be concluded that the proposed method performs well in multisensor data fusion applications, because it is able to obtain reliable and robust results. In Tables 2 and 3, the accuracies of DSET-E are always higher than the accuracies of the other classification methods. In Table 4, the average local classification accuracies for AAV and DW are 68.13% and 59.83%, respectively. However, the fusion results are greatly improved by both the DSET-P and DSET-E methods. The final average accuracy of DSET-E is 75.32%, which is 2.52% higher than that of DSET-P. In addition, the average accuracy of DSET-E is close to the result of DSET-E in Table 3, which is 76.62%. These results also show that the proposed BBA constructing function is reasonable and effective for the DSET-based data fusion framework.
To conclude the experimental section, the three tests show that the proposed mass construction and decision-making methods are reasonable and practical. The comparisons with other universal classification methods (i.e., k-NN, BPNN, ELM and SVM) show that the proposed method is able to obtain robust and reliable results. The experiment on vehicle type classification demonstrates that the proposed method performs well in multisensor data fusion applications, not only in conventional classification problems. Therefore, the proposed model is robust and reliable in multisensor data fusion.

Discussion and Conclusions
In conclusion, this paper proposed a systematic multisensor data fusion model to obtain robust and high-precision fusion results. DSET is used to provide a flexible way to combine multiple information sources into a unified one, and ELM is applied to make decisions. The combination of the two theories achieves a greater capacity in multisensor data fusion. Compared with the existing methods, the proposed framework gives more flexibility and rationality in constructing reasonable BBAs from data sources. Additionally, the framework is able to make decisions according to the actual situation. Moreover, it is stable and easy to implement. With adequate training samples, the algorithm is able to reason, learn and make decisions in a coherent process. The drawback is that it needs to be trained and its computational complexity is greater than that of traditional DSET-P, even though ELM is 'extremely' fast among ANNs. However, it should be clear that the proposed method is not intended to achieve a great improvement over other classification algorithms; rather, it is aimed at building a robust and reliable data fusion model that is practical for multisensor applications. Thus, accuracy improvement is not the key point of the proposed model. Although the improvement in training accuracy and testing accuracy is not significant, the results still indicate that the proposed model is robust and reliable.
It is necessary to emphasize that in Section 3.3, the principle 'minority is subordinate to the majority' underlies the synthesis algorithm, which is useful when the performance of the adopted BBA functions or experts is unknown to us. It can be modified to 'outstanding is preferred', which means that the weights of the BBA functions or experts are assigned according to their own accuracies. In Section 3.5, the activation function is the 'radbas' function. Other function types are also feasible. In Section 4.2, the experimental results indicate that an effective decision-making method exists after the combinational step, although it is not the Pignistic transferring method. If the conventional Pignistic transfer method could be improved, DSET could be greatly promoted in real applications. Future work may involve the following: (1) discovering a new decision-making method to get rid of the low accuracy limitation of the existing Pignistic methods; (2) developing a DSET-embedded ELM that is able to deal with pattern recognition or classification problems; and (3) exploring more inherent laws in the transducer mechanism and probability.