A Reliability-Based Multisensor Data Fusion with Application in Target Classification.

The theory of belief functions has been extensively utilized in many practical applications involving decision making. One such application is the classification of targets based on the pieces of information extracted from the individual attributes describing the target. Each piece of information is usually modeled as a basic probability assignment (BPA), also known as a mass function. The determination of the BPA has remained an open problem. Although fuzzy membership functions such as triangular and Gaussian functions have been widely used to model the likelihood estimation function based on historical data, less emphasis has been placed on the impact of the spread of the membership function on the decision accuracy of the reasoning process. Conflict in the combination of BPAs may arise when poorly characterized fuzzy membership functions are used to induce belief mass. In this work, we propose a multisensor data fusion method within the framework of belief theory for target classification, in which the shape/spread of the membership function is adjusted during the training/modeling stage to improve the classification accuracy while removing the need to compute the credibility. To further enhance the performance of the proposed method, the reliability factor is deployed not only to manage possible conflict among the participating bodies of evidence for better decision accuracy but also to reduce the number of sources for improved efficiency. The effectiveness of the proposed method was evaluated using both real-world and artificial datasets.


Introduction
An integral component of an effective and efficient defense system to aid the commander in situational awareness of the battlefield is target classification. The task of classifying targets into a predefined set of classes depends on a group of features or attributes that characterize the different categories. Sensors such as radar, infrared (IR) cameras, and electronic support measures (ESM) are often deployed to acquire relevant information regarding the different attributes [1,2]. Attributes may include signature and kinematic features such as speed, acceleration, altitude, radar cross-section (RCS), shape, length, transmission frequency, and pulse repetition interval (PRI) [1,2].
Classification of a target requires data about the different attributes. The information extracted from the data is usually characterized by uncertainty due to ambiguity, imprecision, vagueness, incompleteness, noise, and conflict [3][4][5]. This uncertainty degrades the quality of the information fusion system. Consequently, how to deal with uncertainty effectively and efficiently has become a topic of interest among researchers in the field of information fusion systems. Multisensor data fusion can effectively address this problem. Dealing with uncertainty through data fusion raises three fundamental problems: (1) representation of uncertain information, (2) aggregation of two or more pieces of uncertain information, and (3) making a reasonable decision based on the aggregated pieces [...] sensor, usually a direct consequence of sensor defect and poor calibration, (2) an improper belief function model due to poor estimation of the likelihood function and inappropriate selection of the metric for the distance-based method, and (3) a large number of information sources. The membership functions are used to estimate the likelihood of the various classes; this means that improper characterization of the fuzzy membership functions may induce conflicts. The idea behind this study is that by adjusting the spread of the membership function, we can improve the decision accuracy of the method proposed in [13].
In this study, we propose a reliability-based multisensor data fusion method, termed the reliability-based Dempster Shafer rule of combination (RDSRC), within the framework of belief theory for target classification, in which the shape/spread of the membership functions is adjusted during the training/modeling stage. Only the reliability is used to assign weights to the various information sources; the proposed method does not utilize credibility. Since every attribute (information source) of the unknown target produces a local declaration in the form of a belief function, calculating the credibility of each belief function for every query target would add overhead to the reasoning process. Besides the computational requirement, credibility based on a distance or similarity measure rests on the assumption that the majority of the belief functions are reliable.
The proposed method is closely related to the work in [24]. However, they are different in the following respects: in this approach, we use triangular membership functions to model the historical data regarding the different attributes of the various target classes as opposed to the Gaussian membership function used in [24]. The reliability in this approach was calculated using an evaluation criterion based on the concordance index, while the Jaccard index was utilized in the determination of the static reliability in [24]. The spread of membership function is adjustable in our proposed method while it is fixed in [24]. Although the tuning of the spread of the membership function in the proposed method introduces additional overheads, it is only incurred offline. In [13,24], credibility/dynamic reliability is calculated at the reasoning phase, which creates an extra cost for on-line identification. The method of generating the BPA is different from the one used in [24]. This work is basically an extension/modification of [13]. The major contributions of the newly proposed method are summarized as follows.

1. We introduced a tuning parameter for the likelihood estimation function and demonstrated its impact on the decision accuracy of the classification system.
2. We proposed the average pairwise discordance index (APDI) as a selection criterion to reduce the number of evidence sources before the deployment of the DS framework.
3. Three real-world and one artificially generated datasets were used to show the performance of the proposed method in terms of accuracy.
The rest of the paper is organized as follows. The basic preliminaries are briefly discussed in Section 2. In Section 3, the proposed reliability-based multisensor data fusion with application in target classification is presented. Section 4 shows the effectiveness of the proposed approach on both the real and artificial datasets. The conclusion is contained in Section 5.

Dempster-Shafer Theory (DST)
The Dempster Shafer (DS) theory, often referred to as the theory of belief functions, was originally introduced by Dempster in [7] and later developed by Shafer in [8]. The theory of belief functions allows probabilities to be assigned to subsets instead of only mutually exclusive singletons. It can model uncertainty better than the probability theory [16]. The basics of the DS theory include the frame of discernment, functions, the DS rule of combination, and the probability transformation.

Frame of Discernment
Let Ω be a set of M mutually exclusive and exhaustive hypotheses. Ω is known as the frame of discernment. The power set 2^Ω is the set of all possible subsets of Ω.

Functions
For all A, B ⊆ Ω, evidence can be represented by functions which include: mass function, belief functions, and plausibility functions [25].
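The three functions can be illustrated with a short sketch. It uses the standard definitions from belief function theory, which are assumed here because the displayed equations are not reproduced in this text: a mass function m assigns non-negative values to subsets of Ω that sum to one with m(∅) = 0, Bel(A) sums the masses of the non-empty subsets of A, and Pl(A) sums the masses of the sets that intersect A. The dictionary representation and the class names C1 to C3 are illustrative choices.

```python
def belief(m, A):
    """Bel(A): total mass of the non-empty focal sets contained in A."""
    A = frozenset(A)
    return sum(v for B, v in m.items() if B and B <= A)


def plausibility(m, A):
    """Pl(A): total mass of the focal sets that intersect A."""
    A = frozenset(A)
    return sum(v for B, v in m.items() if B & A)


# Illustrative mass function on the frame {C1, C2, C3} (values are made up).
m = {
    frozenset({"C1"}): 0.5,
    frozenset({"C2"}): 0.2,
    frozenset({"C1", "C2", "C3"}): 0.3,  # mass on the whole frame (ignorance)
}

print(belief(m, {"C1"}), plausibility(m, {"C1"}))
```

For any A, Bel(A) ≤ Pl(A), and the gap between the two reflects the mass that could still move to or away from A.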

The DS Rule of Combination
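To fuse evidence from multiple independent sources, the DS rule of combination is used. Suppose m_1 and m_2 are two mass functions obtained from two independent sources on the same frame of discernment Ω. The combined mass is defined as [8]
$$m(A) = (m_1 \oplus m_2)(A) = \frac{1}{1-K} \sum_{B \cap C = A} m_1(B)\, m_2(C), \qquad K = \sum_{B \cap C = \emptyset} m_1(B)\, m_2(C),$$
∀A, B, C ⊆ Ω and A ≠ ∅, with m(∅) = 0.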
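A minimal sketch of the DS rule of combination follows, assuming mass functions are stored as dictionaries that map frozensets of hypotheses to masses. The function name `dempster_combine` and the example values are illustrative, not taken from the paper.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule (normalized conjunctive rule)."""
    combined = {}
    conflict = 0.0  # K: total mass assigned to empty intersections
    for (B, v1), (C, v2) in product(m1.items(), m2.items()):
        A = B & C
        if A:
            combined[A] = combined.get(A, 0.0) + v1 * v2
        else:
            conflict += v1 * v2
    if 1.0 - conflict < 1e-12:
        raise ValueError("sources are in total conflict; the rule is undefined")
    return {A: v / (1.0 - conflict) for A, v in combined.items()}


omega = frozenset({"C1", "C2", "C3"})
m1 = {frozenset({"C1"}): 0.6, omega: 0.4}
m2 = {frozenset({"C2"}): 0.5, omega: 0.5}
fused = dempster_combine(m1, m2)
print(fused)
```

Here the conflict is K = 0.6 × 0.5 = 0.3, so the surviving products are divided by 0.7.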

Probability Transformation
The mass function obtained after applying the DS combination rule is not adequate for decision making; consequently, a probability transformation is required to obtain probability values from the fused mass function. The Pignistic probability function introduced in [18] is often applied; it divides the mass of every focal set equally among the hypotheses it contains.

Fuzzy Set Theory
Fuzzy set theory is a theoretical framework for handling imperfection in data [26]. It is built on the concept of fuzzy sets, which model uncertainty due to imprecision and vagueness. A fuzzy set is described by a membership function, which allows an object to belong to different classes with varying degrees of membership in [0, 1] [26].

Fuzzy Membership Function
The membership function µ maps each element x to a value µ(x) in [0, 1]. Although there are several membership functions, the commonly used ones are the Gaussian, triangular, and trapezoidal functions. The Gaussian membership function for a set A is a bell-shaped curve determined by its center and its spread. A triangular fuzzy number A can be described by the triplet (a, b, c), with a membership value that rises linearly from 0 at a to 1 at b and falls linearly back to 0 at c. Although the triangular fuzzy number is not as smooth as the Gaussian membership function, it is simpler and easier to use.
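The two membership functions can be sketched as follows. The forms used are the common textbook ones, assumed here because the displayed equations are not reproduced in this text: a Gaussian with center `mean` and spread `sigma`, and a triangle with a < b < c. The parameter names are illustrative.

```python
import math


def gaussian_mf(x, mean, sigma):
    """Gaussian membership: 1 at the center, decaying with distance from it."""
    return math.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2))


def triangular_mf(x, a, b, c):
    """Triangular membership for the triplet (a, b, c), assuming a < b < c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)  # rising edge
    return (c - x) / (c - b)      # falling edge


print(triangular_mf(3.0, 1.0, 2.0, 4.0))  # halfway down the falling edge
```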

Type−2 Fuzzy Sets
The selection of the right type of membership function is one of the major challenges of using a Type−1 fuzzy membership function for data and uncertainty representation. A Type−2 fuzzy set can be used to solve the problem. The fact that different spreads of the membership function produce different accuracies is consistent with one of the reasons for using Type−2 fuzzy sets: to model uncertainty about the appropriate membership function. A Type−2 fuzzy set can be viewed as a collection of many embedded Type−1 fuzzy sets [27,28]. The membership value in a Type−2 fuzzy set is itself a fuzzy set. The traditional Type−1 fuzzy set is two dimensional (2D), whereas the Type−2 fuzzy set is three dimensional (3D), comprising the element, the primary membership value, and the secondary membership value, denoted by x, u, and µ, respectively, as shown in Figures 1 and 2. The area between the lower membership function (LMF) and the upper membership function (UMF) is termed the footprint of uncertainty (FOU) [28]. We provide this brief introduction to the Type−2 fuzzy set because it forms the basis of the intuition for varying the spread factor of the membership function to model uncertainty about the membership function.
It can be observed that Type−2 fuzzy set can represent and deal with uncertainty associated with membership function by leveraging the additional degree of freedom provided by the newly introduced third dimension and the footprint of uncertainty.

Similarity Between Fuzzy Sets
In the literature, there are several measures of similarity between two fuzzy sets. The Jaccard index and the concordance index are of interest to us in this context. As shown in Figure 3, let us assume that µ_1 and µ_2 are two Gaussian membership functions for class C_1 and class C_2. The concordance index S_C and the Jaccard index S_J between them are defined in [29].
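As an illustration of a similarity measure between two fuzzy sets, the sketch below evaluates the Jaccard index numerically. It assumes the common min/max form (area under the pointwise minimum divided by area under the pointwise maximum), which may differ in detail from the definition in [29]. The concordance index is not sketched because its definition is not reproduced in this text. The grid and the Gaussian parameters are made up.

```python
import numpy as np


def gaussian_mf(x, mean, sigma):
    """Gaussian membership function evaluated on a NumPy array."""
    return np.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2))


def jaccard_index(mu1, mu2, grid):
    """Ratio of intersection area to union area on a uniform grid.

    The grid spacing cancels in the ratio, so plain sums are sufficient.
    """
    y1, y2 = mu1(grid), mu2(grid)
    return float(np.minimum(y1, y2).sum() / np.maximum(y1, y2).sum())


grid = np.linspace(-10.0, 20.0, 3001)
mu_1 = lambda x: gaussian_mf(x, mean=4.0, sigma=1.0)  # illustrative class C1
mu_2 = lambda x: gaussian_mf(x, mean=5.0, sigma=1.0)  # illustrative class C2
s_j = jaccard_index(mu_1, mu_2, grid)
print(s_j)
```

Identical membership functions give an index of 1, and well-separated ones give a value close to 0, which is what makes such an index usable as a measure of how poorly an attribute discriminates between two classes.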

Proposed Method
The target classification problem can be formulated in the same way as the general data classification problem, as follows. Let X = {x_1, x_2, ..., x_n} be a set of n training samples with corresponding class labels {y_1, y_2, ..., y_n}, where x_k is an N-dimensional attribute vector with class label y_k, and y_k ∈ Ω = {C_1, C_2, ..., C_M}, a set of M classes. Suppose that, in a target classification system, a set of N sensors {s_1, s_2, ..., s_N} measuring different attributes of the target produces a collection of basic probability assignments (BPAs) {m_1, m_2, ..., m_N}. The primary goal of a target classification system is to assign the unknown target x to one of the members of the frame of discernment based on the combination of the different pieces of evidence induced by the different attribute measurements.
In [23], it was asserted that conflict within the framework of belief theory could be attributed to an improper likelihood estimation function. Since the fuzzy membership function employed as the likelihood estimation model is parameterized by the spread factor γ, a poor choice of γ may result in high conflict and, consequently, a degradation in the performance of the reasoning process. The motivation for this study came from the application of the interval Type−2 fuzzy set to the representation of uncertainty. An interval Type−2 fuzzy set is a form of Type−2 fuzzy set with a uniform secondary membership function, and it is characterized by upper and lower membership functions. It was discovered that using the lower membership function at the modeling stage did not yield the same accuracy as using the upper membership function, even though the only difference between the two is the spread/width. This gives the insight that by varying the spread parameter, we can improve the accuracy of the proposed method.
In this work, the theory of belief functions is proposed as a multisensor data fusion approach for target classification. Individual attributes of the unknown target induce local declarations in the form of belief functions by assigning masses to each of the subsets of the frame of discernment. To address possible conflict, a reliability degree, which is essentially the normalized average pairwise discordance index (APDI), is proposed based on the discriminatory power of each evidence source (attribute or feature). The reliability degree is then used as a weighting factor to obtain a weighted average belief function. The weighted average basic probability assignment (BPA) is fused to produce the final BPA. Decisions are taken based on the probability transformation of the final BPA. The flowchart of the proposed method is shown in Figure 4.
Each class i with attribute j is represented by a triangular membership function built from statistical information. The mean x̄_ij and the standard deviation σ_ij for every class C_i (i = 1, 2, ..., M) and attribute A_j (j = 1, 2, ..., N) are computed from the training samples, where x_ijt is the t-th sample value for attribute j and class i, and T is the sample size for the class. For class i and a given attribute j, the triplet of the triangular fuzzy number shown in Figure 5 is then determined by x̄_ij, σ_ij, and γ, where γ is a tuning parameter (adjustment factor) rather than the fixed value of 2 used in [12,13].
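The modeling step can be sketched as below. Two points are assumptions, since the displayed equations are not reproduced in this text: the triplet is taken to be (x̄_ij − γσ_ij, x̄_ij, x̄_ij + γσ_ij), and σ_ij is taken to be the sample standard deviation (divisor T − 1). The function name `fit_triangular_models` and the toy data are hypothetical.

```python
import numpy as np


def fit_triangular_models(X, y, gamma=2.0):
    """Return {class label: array of shape (n_attributes, 3)} of triplets (a, b, c).

    Assumed triplet: (mean - gamma * std, mean, mean + gamma * std).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    models = {}
    for label in np.unique(y):
        Xc = X[y == label]
        mean = Xc.mean(axis=0)
        std = Xc.std(axis=0, ddof=1)  # assumption: sample standard deviation
        models[label.item()] = np.stack(
            [mean - gamma * std, mean, mean + gamma * std], axis=1
        )
    return models


# Toy training set: 2 attributes, 2 classes, 3 samples per class.
X = [[1, 10], [2, 20], [3, 30], [10, 1], [12, 3], [14, 5]]
y = [0, 0, 0, 1, 1, 1]
models = fit_triangular_models(X, y, gamma=2.0)
print(models[0])
```

A larger γ widens every triangle, which increases the overlap between classes; a smaller γ narrows them and leaves more measurements outside every class model.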

Determination of the Reliability Degree
A reliability factor was proposed in [13]; it is defined in terms of M, the number of classes, and S_C(C_i, C_k), the concordance index between classes i and k. We define the reliability degree as the average pairwise discordance index (APDI), and the reliability of each source j is normalized. The normalization is required to satisfy the constraint imposed by the DS combination rule. In [13], the reliability factor was not normalized; normalization occurs after its combination with the credibility degree, before the DS fusion.
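A hedged sketch of the reliability computation follows. It assumes that the discordance between two classes is one minus their concordance index, that the APDI of a source is the mean of this quantity over all distinct class pairs, and that normalization divides each APDI by the sum over sources so that the weights add up to one. None of these forms is confirmed by the text above, and the concordance matrices are made-up inputs.

```python
import numpy as np


def apdi(S):
    """Average pairwise discordance for one source.

    S is an (M, M) symmetric matrix of pairwise concordance values between
    the M class models of that source (assumed to lie in [0, 1]).
    """
    S = np.asarray(S, dtype=float)
    i, k = np.triu_indices(S.shape[0], k=1)  # all pairs with i < k
    return float(np.mean(1.0 - S[i, k]))


def normalized_reliability(concordance_per_source):
    """Normalize the APDI scores so that they sum to one (assumed constraint)."""
    scores = np.array([apdi(S) for S in concordance_per_source])
    return scores / scores.sum()


# Source 1 separates the classes well; source 2 does not (illustrative values).
S1 = [[1.0, 0.2, 0.4], [0.2, 1.0, 0.6], [0.4, 0.6, 1.0]]
S2 = [[1.0, 0.8, 0.8], [0.8, 1.0, 0.8], [0.8, 0.8, 1.0]]
weights = normalized_reliability([S1, S2])
print(weights)
```

Because the scores depend only on the class models, they can be computed once at the modeling stage and reused for every query target.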

Reasoning
Reasoning entails the generation, analysis, and combination of BPAs. No calculation of credibility is required at the reasoning stage; only the (static) reliability obtained at the modeling stage is used. Credibility derived from the similarity among pieces of evidence assumes that a greater proportion of the evidence sources is credible. This assumption may not always hold.

Generation of the Basic Probability Assignment (BPA)
This is where information modeled as belief functions is extracted from sensor measurements. The attribute values of the unknown target are at a lower level of abstraction and are mapped into a higher level of information abstraction in the form of BPAs. The BPAs are generated based on the similarity between the different attribute values and the fuzzy models (membership functions) obtained from the historical/training data. Due to its simplicity, a method similar to the one used in [11][12][13] is adopted in this work.
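The mapping from an attribute value to a BPA can be illustrated with a deliberately simplified stand-in. This is not the generation method of [11][12][13], whose details are not given here: the sketch only normalizes the triangular membership values over the singleton classes and assigns all mass to the whole frame when the measurement falls outside every class model. All names and triplets are hypothetical.

```python
def triangular_mf(x, a, b, c):
    """Triangular membership for the triplet (a, b, c), assuming a < b < c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def bpa_from_measurement(x, class_triplets):
    """Simplified BPA: normalized memberships on singletons, else total ignorance."""
    frame = frozenset(class_triplets)
    mu = {cls: triangular_mf(x, *t) for cls, t in class_triplets.items()}
    total = sum(mu.values())
    if total == 0.0:
        return {frame: 1.0}  # the measurement supports no class in particular
    return {frozenset({cls}): v / total for cls, v in mu.items() if v > 0.0}


# Hypothetical class models for a single attribute.
triplets = {"C1": (0.0, 2.0, 4.0), "C2": (2.0, 4.0, 6.0)}
print(bpa_from_measurement(3.0, triplets))
print(bpa_from_measurement(10.0, triplets))
```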

Computation of the Weighted Average BPA
Suppose there are N evidence sources provided by N sensors. For any proposition A, a subset of the frame of discernment, the weighted average mass function m_wae(A) is a weighted combination of the confidence pooled from the different evidence sources, as defined in [20].
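A minimal sketch of the weighted average, assuming each weight is the normalized reliability of the corresponding source and that the weights sum to one, so the result is again a valid mass function. The BPAs and weights are illustrative.

```python
def weighted_average_bpa(bpas, weights):
    """Return the weighted average of the BPAs: sum over sources of weight * mass."""
    m_wae = {}
    for m, w in zip(bpas, weights):
        for A, v in m.items():
            m_wae[A] = m_wae.get(A, 0.0) + w * v
    return m_wae


omega = frozenset({"C1", "C2"})
bpas = [
    {frozenset({"C1"}): 0.8, omega: 0.2},  # BPA from source 1
    {frozenset({"C2"}): 0.6, omega: 0.4},  # BPA from source 2
]
weights = [0.75, 0.25]  # assumed normalized reliabilities
m_wae = weighted_average_bpa(bpas, weights)
print(m_wae)
```

The less reliable second source still contributes, but its support for C2 is discounted to 0.15 before any combination takes place.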

Dempster Shafer (DS) Fusion
Having obtained the weighted average evidence, the next step is to apply the traditional DS rule of combination to m_wae (N − 1) times [19].
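The repeated self-combination can be sketched as follows, with mass functions stored as dictionaries keyed by frozensets. The helper names and the example values are illustrative, and the combination function is a plain implementation of Dempster's rule.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Dempster's rule for two mass functions on the same frame."""
    combined, conflict = {}, 0.0
    for (B, v1), (C, v2) in product(m1.items(), m2.items()):
        A = B & C
        if A:
            combined[A] = combined.get(A, 0.0) + v1 * v2
        else:
            conflict += v1 * v2
    if 1.0 - conflict < 1e-12:
        raise ValueError("total conflict; the rule is undefined")
    return {A: v / (1.0 - conflict) for A, v in combined.items()}


def fuse_weighted_average(m_wae, n_sources):
    """Combine the weighted average BPA with itself (n_sources - 1) times."""
    fused = dict(m_wae)
    for _ in range(n_sources - 1):
        fused = dempster_combine(fused, m_wae)
    return fused


omega = frozenset({"C1", "C2"})
m_wae = {frozenset({"C1"}): 0.6, frozenset({"C2"}): 0.15, omega: 0.25}
final_bpa = fuse_weighted_average(m_wae, n_sources=2)
print(final_bpa)
```

Each self-combination concentrates mass on the best-supported hypotheses, so the dominant class in m_wae becomes more dominant in the final BPA.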

Decision Making
This essentially consists of the transformation of the belief function and the application of an appropriate decision rule.

Pignistic Transformation
The final BPA retrieved from the DS fusion cannot be employed directly for decision making; hence, a transformation of the final mass function into a probability distribution is required. A well-known probability transformation is the Pignistic probability transformation of the transferable belief model, defined as [18]
$$BetP(A) = \sum_{B \subseteq \Omega,\ B \neq \emptyset} \frac{|A \cap B|}{|B|}\, m(B) \qquad (17)$$
The ultimate goal of target classification is to assign the unknown target to one of the known classes; thus |A| equals 1, and (17) reduces to (18):
$$BetP(A) = \sum_{B \subseteq \Omega,\ A \subseteq B} \frac{m(B)}{|B|} \qquad (18)$$
We adopted the Pignistic probability for decision making following the justification of its suitability provided in [30].

Decision Rule
Assign the unknown target to the class with the highest Pignistic probability.
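The transformation and the decision rule can be sketched together. The sketch assumes a normalized final BPA with no mass on the empty set, so each focal set simply shares its mass equally among its members; the function names and the example BPA are illustrative.

```python
def pignistic(m):
    """Pignistic probability of each singleton class: sum of m(B) / |B| over B containing it."""
    betp = {}
    for B, v in m.items():
        for cls in B:
            betp[cls] = betp.get(cls, 0.0) + v / len(B)
    return betp


def classify(m):
    """Return the class with the highest pignistic probability, plus all probabilities."""
    betp = pignistic(m)
    return max(betp, key=betp.get), betp


final_bpa = {
    frozenset({"C1"}): 0.5,
    frozenset({"C1", "C2"}): 0.2,
    frozenset({"C1", "C2", "C3"}): 0.3,
}
label, betp = classify(final_bpa)
print(label, betp)
```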

Selection of Spread Factor γ
The following example illustrates the impact of γ on the accuracy of the proposed model on the Iris dataset using only reliability as a weighting factor. Three models are built using three different values of γ, as shown in Figure 6 with the triangular fuzzy numbers TFN1 to TFN3. Model1, Model2, and Model3 can be viewed as the lower, the mid, and the upper membership functions, respectively. The accuracy obtained with the three values of γ by applying 5-fold cross-validation 10 times is shown in Table 1. The focus of this study is not to deploy the Type−2 fuzzy set as the likelihood estimation model but to present it as part of the intuition behind this study. Having discovered that changing the value of γ alters the decision accuracy of the DS model, the next question is how to select the value of γ. The training set is used to determine a suitable value. With 5-fold cross-validation, we increase γ from 1.5 to 4.5 with a step size of 0.1 and record the corresponding accuracies on the training set. The value of γ that returns the maximum accuracy is selected to build the fuzzy models.
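The grid search over γ can be sketched as below. The callable `evaluate` is a hypothetical placeholder for whatever returns the cross-validated training accuracy of the model built with a given γ; only the grid (1.5 to 4.5 in steps of 0.1) and the pick-the-maximum rule come from the text. Ties are resolved in favor of the smallest γ, which is an arbitrary choice of this sketch.

```python
def select_gamma(evaluate, start=1.5, stop=4.5, step=0.1):
    """Return (best_gamma, best_score) over the grid start, start + step, ..., stop."""
    n_steps = int(round((stop - start) / step))
    # Build the grid from integers to avoid accumulating floating-point drift.
    gammas = [round(start + i * step, 10) for i in range(n_steps + 1)]
    scores = [evaluate(g) for g in gammas]
    best = max(range(len(gammas)), key=scores.__getitem__)
    return gammas[best], scores[best]


# Stand-in objective with a known peak, used only to exercise the search.
best_gamma, best_score = select_gamma(lambda g: -(g - 2.3) ** 2)
print(best_gamma)
```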

Selection of Evidence Source
In the proposed framework, every attribute measurement is considered a source of evidence that induces a corresponding belief function. The implication is that using every attribute obtained from the signature and kinematic sets of the targets for characterization will unavoidably lead to high processing costs [31]. The long processing time comes from the combination of the various belief functions using the DS rule of combination. In addition to high processing costs, conflict in evidential reasoning can also be attributed to a large number of evidence sources [23]. Reducing the number of sources is analogous to the challenge of dimensionality reduction in conventional machine learning.
Dimensionality reduction is one of the most well-known strategies for removing irrelevant and redundant features. The strategies can be broadly categorized into feature extraction and feature selection [32]. In feature extraction, the original feature space is transformed into a new feature space with a reduced dimension. In feature selection, by contrast, a subset of the original feature space that enhances the performance of the machine learning algorithm is selected. In this study, feature selection is preferred because it preserves the interpretation of the original attributes. As a result, we incorporate a preprocessing stage that reduces the cardinality of the measurement set based on the significance of each attribute in relation to its discriminatory capability for the various target classes. Only a set of significant attributes is selected as sources of information to produce the basic probability assignment (BPA). In traditional machine learning, feature selection can be subdivided into two groups [33]:
• Filter: Features are ranked based on evaluation criteria independent of learning algorithms. Filter methods have proven to be computationally efficient for feature subset selection.
• Wrapper: The ranking of individual features utilizes learning algorithms. Wrapper methods are more computationally expensive than filter methods.
Suppose there are N information sources, I_j (j = 1, ..., N). The proposed selection method is a filter-based approach that utilizes the average pairwise discordance index (APDI). It is implemented through the following steps:

1. Evaluate each source using the average pairwise discordance index (APDI) given in (20), where M is the number of classes, Y_i and Y_k are classes i and k, respectively, and S_Conc(Y_i, Y_k) is the concordance index between classes i and k.
2. Select sources based on a threshold:
• Compute the mean APDI.
• Select each source whose APDI is at least equal to the mean APDI.
A pseudocode for the proposed method of selecting information sources is presented in Algorithm 1.
Algorithm 1
for j = 1 to N do
    Compute the APDI, APDI(I_j)
end
Compute the mean of the APDI, ĀPDI
for j = 1 to N do
    if APDI(I_j) ≥ ĀPDI then
        Select I_j
    end
end
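The threshold step of the selection procedure can be sketched as follows, assuming the APDI of every source has already been computed. The function name and the scores are illustrative.

```python
def select_sources(apdi_scores):
    """Return the indices of the sources whose APDI is at least the mean APDI."""
    mean_apdi = sum(apdi_scores) / len(apdi_scores)
    return [j for j, score in enumerate(apdi_scores) if score >= mean_apdi]


# Illustrative APDI values for four sources; the mean is 0.5.
apdi_scores = [0.75, 0.25, 0.5, 0.5]
selected = select_sources(apdi_scores)
print(selected)
```

Because the threshold is the mean, at least one source is always retained, and when all sources score equally every one of them is kept.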

Simulation
Four problems consisting of three real datasets and one synthetic dataset were used to demonstrate the capability of the proposed reliability-based multisensor data fusion approach. The performance of the proposed method was compared with the recently proposed method in [13] and with Decision Trees (DT) using five-fold cross-validation.

Real Datasets
The three real datasets, Iris, Wine, and Breast Cancer, were obtained from the UCI Machine Learning Repository. Details of the datasets are given in Table 2.

Synthetic Dataset
A similar method of generating a synthetic dataset for an airborne target recognition problem for an air surveillance system used in [34] is adopted to illustrate the capability of the proposed method. Each target is described by three features: speed, acceleration, and length. The target belongs to one of the three classes of Commercial plane, Bomber, or Fighter. Recognition is based on a multisensor system to measure the average speed, the maximum acceleration, and the average length. The feature intervals for the various airborne target classes are shown in Table 3. A total of 300 samples were generated with 100 samples for each of the classes based on the information provided in Table 3.

Results
We repeated the 5-fold cross-validation 10 times, and the average accuracy of each method is displayed in Table 4 and Figure 7. With the introduction of the proposed attribute selection method, the experimental results are displayed in Tables 5-7. The ranking order of the attributes for the different datasets is shown in Table 5, and their associated APDI values are displayed in Table 6. The average accuracy for both the full and the reduced datasets after repeating the 5-fold cross-validation 10 times is shown in Table 7 and Figure 8.
Acc1 is the accuracy with the full set, while Acc2 is the accuracy with the reduced set. |Full set| and |Reduced set| are the cardinalities of the full and the reduced sets, respectively. Figures 9-12 show the effect of the spread factor on the average recognition accuracy of the proposed method with 5-fold cross-validation after 10 trials for both the full and the reduced sets for the different problems.
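The evaluation protocol (5-fold cross-validation repeated 10 times, with the accuracies averaged) can be sketched as below. The callable `fit_and_score` is a hypothetical placeholder that trains on the given training indices and returns the accuracy on the test indices; the random shuffling and the seed are choices of this sketch, not details reported in the paper.

```python
import numpy as np


def repeated_kfold_accuracy(fit_and_score, n_samples, k=5, repeats=10, seed=0):
    """Average accuracy over `repeats` independent shuffled k-fold splits."""
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(repeats):
        folds = np.array_split(rng.permutation(n_samples), k)
        for i in range(k):
            test_idx = folds[i]
            train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
            scores.append(fit_and_score(train_idx, test_idx))
    return float(np.mean(scores))
```

In this setting, `fit_and_score` would build the fuzzy models and reliabilities from the training indices and classify the held-out samples, so no test sample influences the model it is scored on.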

Discussion
It can be observed from the simulation results that the spread factor is of crucial importance to the decision accuracy of the classification system. It can be seen that the average accuracy of the newly proposed method, RDSRC, is better than the earlier method, RCDSRC, as well as DT. To reduce the number of evidence sources, we propose the APDI as an evaluation index. The average accuracy with the reduced evidence source is better than that of the full evidence source for both Iris and Wine datasets. However, for the Wisconsin breast cancer dataset and the artificially generated data set for the target classification problem, the classification accuracy with the full set is better than that of the reduced set.

Conclusions
We have proposed a reliability-based multisensor data fusion method with application in target classification. This approach fundamentally consists of the representation of the training sets using triangular fuzzy membership functions and the generation of local declarations in the form of belief functions by mapping the various attribute measurements into basic probability assignments (BPAs). The various BPAs are preprocessed using the normalized reliability degree, based on the goodness/importance of each attribute, to obtain the weighted average BPA. The weighted average BPA is fused with itself using the traditional DS rule of combination to obtain a final declaration (BPA). Then, decisions are made based on the Pignistic probability transformation of the final BPA. This approach does not require the computation of the credibility. In extensive simulations, the average accuracy of the newly proposed method was better than that of RCDSRC and DT on both the real and artificial datasets. The proposed selection method does not capture redundancy among information sources. Future research will be directed towards incorporating a strategy to handle redundancy among the different sources.