Information Fusion in a Multi-Source Incomplete Information System Based on Information Entropy

As we move into the information age, the amount of data in various fields has increased dramatically, and data sources have become more widely distributed. Missing data have accordingly become increasingly common, giving rise to incomplete multi-source information systems. In this context, drawing on rough set theory, we study methods of multi-source fusion for incomplete information systems. This paper presents a method for fusing incomplete multi-source systems based on information entropy; in particular, the fusion method is validated by comparison with another method. Furthermore, extensive experiments are conducted on six UCI data sets to verify the performance of the proposed method. The experimental results indicate that the entropy-based multi-source fusion approach significantly outperforms mean value fusion.


Introduction
Information fusion integrates multiple information sources to obtain more accurate and definite inferences than the data provided by any single source; several definitions have been proposed in the literature [1][2][3][4][5][6][7][8][9]. The theory of information fusion was first used in the military field, where it is defined as a multi-level, multi-aspect process for handling problems. In fact, data fusion can be broadly summarized as the process of synthesizing comprehensive intelligence from multi-sensor data and information according to established rules and analysis methods, and, on this basis, providing the information required by users, such as decisions, tasks, or tracks. The basic purpose of data fusion is therefore to obtain information that is more reliable than the data from any single input. Over time, information fusion technology has become increasingly important in the field of information services. Multi-source information fusion is one of the most important parts of information services in the age of big data, and many productive achievements have been made. Many scholars have conducted research on multi-source information fusion. For example, Hai [10] investigated predictions of formation drillability based on multi-source information fusion. Cai et al. [11] researched multi-source information fusion-based fault diagnosis of a ground-source heat pump using a Bayesian network. Ribeiro et al. [12] studied an algorithm for data and information fusion that includes concepts from multi-criteria decision-making and computational intelligence, especially fuzzy multi-criteria decision-making and mixture aggregation operators with weighting functions. Several related papers have studied entropy measures for other fuzzy extensions. For instance, Wei et al. [13] proposed uncertainty measures of extended hesitant fuzzy linguistic term sets. Based on interval-valued intuitionistic fuzzy soft sets, Liu et al. [14] proposed a theoretical development of the entropy. Yang et al. [15] proposed cross-entropy measures of linguistic hesitant intuitionistic fuzzy systems.
An information system is the main expression of an information source and the basic structure underlying information fusion. An information system is a data table that describes the relationships among objects and attributes. There is a great deal of uncertainty in the process of information fusion, and rough set theory is usually used to measure the uncertainty in an information table. Rough set theory, which was introduced by Pawlak [16][17][18][19][20], is an extension of classical set theory. In data analysis, it can be considered a mathematical and soft computational tool for handling imprecision, vagueness, and uncertainty. This relatively new soft computing methodology has received a great deal of attention in recent years, and its effectiveness has been confirmed by successful applications in many science and engineering fields, including pattern recognition, data mining, image processing, and medical diagnosis [21,22]. Rough set theory is based on a classification mechanism: classification is modeled as an equivalence relation over a specific universe, and this equivalence relation constitutes a partition of the universe. A concept (or, more precisely, the extension of a concept) is represented by a subset of a universe of objects and is approximated by a pair of definable concepts in a logical language. The main idea of rough set theory is to use known knowledge in a knowledge base to approximate inaccurate and uncertain knowledge. This is of fundamental importance to artificial intelligence and cognitive science. Since an information system is the basic structure underlying information fusion, and rough set theory is usually used to measure the uncertainty in an information system, it is feasible to use rough set theory for information fusion. Some scholars have conducted research in this field. For example, Grzymala-Busse [23] presented and compared nine different approaches to missing attribute values; for testing, both the naive classification and the new classification techniques of LERS (Learning from Examples based on Rough Sets) were used. Dong et al. [24] researched the processing of information fusion based on rough set theory. Wang et al. [25] investigated multi-sensor information fusion based on rough sets. Huang et al. [26] proposed a novel method for tourism analysis with multiple outcome capability based on rough set theory. Luo et al. [27] studied the incremental update of rough set approximations under the graded indiscernibility relation. Yuan et al. [28] considered multi-sensor information fusion based on rough set theory. In addition, Khan et al. [29,30] used views of the membership of objects to study rough sets and notions of approximation in multi-source situations. Md et al. [31] proposed a modal logic for multi-source tolerance approximation spaces based on the principle of considering only the information that sources have about objects. Lin et al. studied an information fusion approach based on combining multi-granulation rough sets with evidence theory [32]. Recently, Balazs and Velásquez conducted a systematic study of opinion mining and information fusion [33].
However, these methods of information fusion are all based on complete information systems; much less research has been conducted on incomplete information systems (IISs). Jin et al. [34] studied feature selection in incomplete multi-sensor information systems based on positive approximation in rough set theory. IISs arise from limits on the ability to acquire data, from the production environment, and from other factors that leave the original data with unknown attribute values. As science has developed, people have found many ways to obtain information. An information box [35] can have multiple information sources, and every information source can be used to construct an information system. If all the information sources are incomplete, then they can be used to construct multiple incomplete information systems. The motivation for this paper is therefore as follows. Judging from the current research situation, most methods of information system fusion are based on complete information systems; to broaden the research background of information fusion, we study methods of incomplete information system fusion. To reduce the amount of information lost in the process of fusion, we propose a method that uses information entropy to fuse incomplete information systems. In particular, our fusion method is validated by comparison with another method. In this paper, we discuss the multi-source fusion of incomplete information tables based on information entropy; after comparison with the mean value fusion method, we conclude that the method proposed here is more effective. The rest of this paper is organized as follows: Some relevant notions are reviewed in Section 2.
In Section 3, we define conditional entropy in a multi-source decision system, propose a fusion method based on conditional entropy, and design an algorithm for creating a new information table from a multi-source decision table based on conditional entropy. In Section 4, we use data sets downloaded from UCI to demonstrate the validity and reliability of our method, and we analyze the experimental results. The paper ends with conclusions in Section 5.

Preliminaries
In this section, we briefly review some basic concepts relating to rough set theory, incomplete information systems, incomplete decision systems, and conditional entropy (CE) in incomplete decision systems. More details can be found in the literature [16,[36][37][38][39]].

Rough Sets
In rough set theory, let S = (U, AT, V, f) be an information system, where U = {x_1, x_2, ..., x_n} is the object set, AT = {a_1, a_2, ..., a_m} is the attribute set, V is the set of corresponding attribute values, and f : U × AT → V is a mapping function.
Let P ⊆ AT and P ≠ ∅. The intersection of all the equivalence relations determined by the attributes in P is called the equivalence relation on P, or the indiscernibility relation, and is denoted by IND(P):

IND(P) = {(x, y) ∈ U × U | ∀a ∈ P, f(x, a) = f(y, a)}.
Let X be a subset of U and x an object of U. The equivalence class of x with respect to R is defined by [x]_R = {y ∈ U | (x, y) ∈ R}, which represents the equivalence class that contains x. When a set X can be expressed as a union of equivalence classes, the set X can be precisely defined; otherwise, X can only be approximated. In rough set theory, upper and lower approximation sets are used to describe the set X. Given a finite nonempty set U, called the universe, an equivalence relation R on U, and X ⊆ U, the lower and upper approximations of X are defined by

R_*(X) = {x ∈ U | [x]_R ⊆ X},  R^*(X) = {x ∈ U | [x]_R ∩ X ≠ ∅}.

The R-positive region, negative region, and boundary region of X are defined, respectively, as

POS_R(X) = R_*(X),  NEG_R(X) = U − R^*(X),  BND_R(X) = R^*(X) − R_*(X).
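The definitions above translate directly into set operations over the equivalence classes. The following is a minimal Python sketch; the toy table and all function names are illustrative, not from the paper:

```python
def equivalence_classes(universe, table, attrs):
    """Partition `universe` by equality of the attribute values in `attrs`."""
    classes = {}
    for x in universe:
        key = tuple(table[x][a] for a in attrs)
        classes.setdefault(key, set()).add(x)
    return list(classes.values())

def approximations(universe, table, attrs, X):
    """Return (R_*(X), R^*(X)) under the indiscernibility relation IND(attrs)."""
    lower, upper = set(), set()
    for cls in equivalence_classes(universe, table, attrs):
        if cls <= X:        # equivalence class wholly inside X -> lower approximation
            lower |= cls
        if cls & X:         # equivalence class meets X -> upper approximation
            upper |= cls
    return lower, upper

# Toy system: six objects described by a single attribute 'a'.
U = {1, 2, 3, 4, 5, 6}
table = {1: {'a': 0}, 2: {'a': 0}, 3: {'a': 1}, 4: {'a': 1}, 5: {'a': 2}, 6: {'a': 2}}
X = {1, 2, 3}
lower, upper = approximations(U, table, ['a'], X)
positive, boundary, negative = lower, upper - lower, U - upper
```

Here `lower == {1, 2}` and `upper == {1, 2, 3, 4}`, so the boundary region is `{3, 4}` and the negative region is `{5, 6}`.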
The approximation accuracy and roughness of a concept X with respect to an attribute set R are defined, respectively, as

α_R(X) = |R_*(X)| / |R^*(X)|,  ρ_R(X) = 1 − α_R(X),

where |X| denotes the cardinality of the set X. They are often used for measuring uncertainty in rough set theory.
The approximation accuracy for rough classification was proposed by Pawlak [19] in 1991. By employing the attribute set R, the approximation accuracy gives the percentage of possibly correct decisions made when classifying objects.
Let DS = (U, AT ∪ D, V, f) be a decision system, U/D = {Y_1, Y_2, ..., Y_m} be a classification of the universe U, and R ⊆ AT be an attribute set. The R-lower and R-upper approximations of U/D are defined as

R_*(U/D) = R_*(Y_1) ∪ R_*(Y_2) ∪ ... ∪ R_*(Y_m),
R^*(U/D) = R^*(Y_1) ∪ R^*(Y_2) ∪ ... ∪ R^*(Y_m).

The approximation accuracy of U/D for R is defined as

α_R(U/D) = |R_*(U/D)| / |R^*(U/D)|.

Recently, Dai and Xu [40] extended this measure to incomplete decision systems, replacing equivalence classes with tolerance classes. The corresponding approximation roughness of U/D for R is defined as

ρ_R(U/D) = 1 − α_R(U/D).
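The accuracy of a whole classification U/D is the ratio of the sizes of the unions of the per-class lower and upper approximations. A minimal sketch (toy data and helper names are our own):

```python
def approximations(universe, table, attrs, X):
    """(lower, upper) approximations of X under equality of `attrs` values."""
    classes = {}
    for x in universe:
        classes.setdefault(tuple(table[x][a] for a in attrs), set()).add(x)
    lower, upper = set(), set()
    for cls in classes.values():
        if cls <= X:
            lower |= cls
        if cls & X:
            upper |= cls
    return lower, upper

def classification_accuracy(universe, table, attrs, decision_classes):
    """alpha_R(U/D) = |union of lower approx.| / |union of upper approx.|."""
    lower_union, upper_union = set(), set()
    for Y in decision_classes:
        lo, up = approximations(universe, table, attrs, Y)
        lower_union |= lo
        upper_union |= up
    return len(lower_union) / len(upper_union)

U = {1, 2, 3, 4, 5, 6}
table = {1: {'a': 0}, 2: {'a': 0}, 3: {'a': 1}, 4: {'a': 1}, 5: {'a': 2}, 6: {'a': 2}}
acc = classification_accuracy(U, table, ['a'], [{1, 2, 3}, {4, 5, 6}])
```

With this toy table the lower union is `{1, 2, 5, 6}` and the upper union is all six objects, so `acc` is 4/6.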

Incomplete Information System
A quadruple IS = (U, AT, V, f) is an information system, where U is a nonempty finite set of objects, AT is a nonempty finite set of attributes, V = ∪_{a∈AT} V_a, where V_a is the domain of attribute a, and f : U × AT → V is an information function such that f(x, a) ∈ V_a for each a ∈ AT and x ∈ U. A decision system (DS) is a quadruple DS = (U, AT ∪ DT, V, f), where AT is the condition attribute set, DT is the decision attribute set, AT ∩ DT = ∅, and V is the union of the attribute domains.
If there exist a ∈ AT and x ∈ U such that f(x, a) is equal to a missing value (denoted "*"), then the information system is an incomplete information system (IIS); otherwise, it is a complete information system (CIS). If * ∉ V_DT but * ∈ V_AT, then we call the decision system an incomplete decision system (IDS). If * ∉ V_DT and * ∉ V_AT, then the decision system is a complete decision system (CDS).
Because of the missing values, the equivalence relation is not suitable for incomplete information systems. Therefore, Kryszkiewicz [36,37] defined a tolerance relation for incomplete information systems. Given an incomplete information system IIS = (U, AT, V, f), for any attribute subset B ⊆ AT, let T(B) denote the binary tolerance relation between objects that are possibly indiscernible in terms of B. T(B) is defined as

T(B) = {(x, y) ∈ U × U | ∀a ∈ B, f(x, a) = f(y, a) or f(x, a) = * or f(y, a) = *}.

The tolerance class of an object x with respect to an attribute set B is denoted by T_B(x) = {y | (x, y) ∈ T(B)}. For X ⊆ U, the lower and upper approximations of X with respect to B are defined as

B_*(X) = {x ∈ U | T_B(x) ⊆ X},  B^*(X) = {x ∈ U | T_B(x) ∩ X ≠ ∅}.
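Kryszkiewicz's tolerance relation can be sketched as follows; `'*'` stands for the missing value, and the toy data and names are illustrative:

```python
MISSING = '*'

def tolerance_class(universe, table, x, attrs):
    """T_B(x): objects possibly indiscernible from x on every attribute in attrs."""
    return {
        y for y in universe
        if all(table[x][a] == MISSING or table[y][a] == MISSING
               or table[x][a] == table[y][a] for a in attrs)
    }

def tolerance_approximations(universe, table, attrs, X):
    """Lower/upper approximations of X built from tolerance classes."""
    lower = {x for x in universe if tolerance_class(universe, table, x, attrs) <= X}
    upper = {x for x in universe if tolerance_class(universe, table, x, attrs) & X}
    return lower, upper

# Object 2 has a missing value, so it is tolerant with everything.
U = {1, 2, 3}
table = {1: {'a': 0}, 2: {'a': MISSING}, 3: {'a': 1}}
lower, upper = tolerance_approximations(U, table, ['a'], {1, 2})
```

Here the tolerance classes are T(1) = {1, 2}, T(2) = {1, 2, 3}, and T(3) = {2, 3}, so `lower == {1}` and `upper == {1, 2, 3}`.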

Multi-Source Incomplete Information Fusion
With the development of science and technology, people have access to increasing numbers of channels from which to obtain information. The diversity of these channels has produced a large number of incomplete information sources, that is, a multi-source incomplete information system. Investigating the special properties of such a system and fusing its information are a focus of the information technology field. In this section, we present a new fusion method for multi-source incomplete information systems and compare our fusion method with the mean value fusion method in a small experiment.

Multi-Source Information Systems
Let us consider the scenario in which we obtain information regarding a set of objects from different sources. The information from each source is collected in an information system of the above form, and thus a family of single information systems with the same domain is obtained; this is called a multi-source information system [41].

Definition 1. (see [32]) A multi-source information system can be defined as

MS = {IS_i | IS_i = (U, AT_i, {(V_a)_{a∈AT_i}}, f_i), i = 1, 2, ..., s},

where U is a finite non-empty set of objects, AT_i is a finite non-empty set of attributes of each subsystem, {V_a} is the value set of attribute a ∈ AT_i, and f_i : U × AT_i → {(V_a)_{a∈AT_i}} is such that f_i(x, a) ∈ V_a for all x ∈ U and a ∈ AT_i. In particular, a multi-source decision information system is given by

MS = {(U, AT_i, {(V_a)_{a∈AT_i}}, f_i, D, g_d), i = 1, 2, ..., s},

where D is a finite non-empty set of decision attributes and g_d : U → V_d for any d ∈ D, where V_d is the domain of decision attribute d. The multi-source information system includes s single information sources. The s overlapping single-source information systems form an information box with s levels, as shown in Figure 1, which comes from our previous study [35].

Multi-Source Incomplete Information System
Definition 2. A multi-source incomplete information system (MIIS) is defined as MIIS = {IIS_i | i = 1, 2, ..., s}, where:

1. IIS_i is the incomplete information system of subsystem i;
2. U is a finite non-empty set of objects;
3. AT_i is the finite non-empty set of attributes for subsystem i;
4. {V_a} is the value set of attribute a ∈ AT_i;
5. f_i : U × AT_i → {(V_a)_{a∈AT_i}} is the information function of subsystem i, where f_i(x, a) may take the missing value *.

In particular, a multi-source incomplete decision information system is given by adding a finite non-empty set D of decision attributes and g_d : U → V_d for any d ∈ D, where V_d is the domain of decision attribute d.

Multi-Source Incomplete Information Fusion
Because the information tables in the information box are not complete, we propose a new fusion method.

Definition 3. Let IIS be an incomplete information system and U = {x_1, x_2, ..., x_n}. For any a ∈ AT and x_i, x_j ∈ U, we denote by dis_a(x_i, x_j) the distance between two objects of U with respect to attribute a; this distance is used in Definitions 4 and 5 to build the tolerance relation T(a) through the threshold condition dis_a(x, y) ≤ L_a. The tolerance class of an object x with respect to an attribute set B is denoted by T_B(x) = {y | (x, y) ∈ T(B)}.

In the literature [39], Dai et al. proposed a new conditional entropy to evaluate the uncertainty in an incomplete decision system. Given an incomplete decision system IDS = (U, AT ∪ DT, V, f) with U = {u_1, u_2, ..., u_n}, an attribute set B ⊆ AT, and U/D = {Y_1, Y_2, ..., Y_m}, the conditional entropy of D with respect to B is defined as

H(D|B) = −(1/n) Σ_{i=1}^{n} Σ_{j=1}^{m} (|T_B(u_i) ∩ Y_j| / |T_B(u_i)|) log(|T_B(u_i) ∩ Y_j| / |T_B(u_i)|).

Because the conditional entropy is monotonic, and because the attribute set B increases in importance as the conditional entropy decreases, we have the following definition.

Definition 6. Let I_1, I_2, ..., I_s be s incomplete information systems, U = {u_1, u_2, ..., u_n}, a ∈ AT, and U/D = {Y_1, Y_2, ..., Y_m}. The uncertainty of D with respect to information source I_q (q = 1, 2, ..., s) for attribute a is defined as

H_a(D|I_q) = −(1/n) Σ_{i=1}^{n} Σ_{j=1}^{m} (|T_a^q(u_i) ∩ Y_j| / |T_a^q(u_i)|) log(|T_a^q(u_i) ∩ Y_j| / |T_a^q(u_i)|),

where T_a^q(u_i) is the tolerance class of u_i in source I_q with respect to attribute a.
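Under one plausible reading of Definition 6 (per-attribute tolerance classes and logarithm base 2; the helper names and toy data below are our own, not from the paper), the conditional entropy can be computed as:

```python
from math import log2

MISSING = '*'

def tolerance_class(universe, table, x, a):
    """Objects possibly indiscernible from x on the single attribute a."""
    return {
        y for y in universe
        if table[x][a] == MISSING or table[y][a] == MISSING
        or table[x][a] == table[y][a]
    }

def conditional_entropy(universe, table, a, decision_classes):
    """H_a(D|I) = -(1/n) sum_i sum_j p_ij log2 p_ij,
    with p_ij = |T_a(u_i) & Y_j| / |T_a(u_i)|."""
    n = len(universe)
    h = 0.0
    for x in universe:
        T = tolerance_class(universe, table, x, a)
        for Y in decision_classes:
            p = len(T & Y) / len(T)
            if p > 0:
                h -= p * log2(p)
    return h / n

U = {1, 2, 3, 4}
D = [{1, 2}, {3, 4}]
clean = {1: {'a': 0}, 2: {'a': 0}, 3: {'a': 1}, 4: {'a': 1}}   # attribute agrees with D
noisy = {1: {'a': 0}, 2: {'a': 0}, 3: {'a': 0}, 4: {'a': 1}}   # one object mislabeled
```

`conditional_entropy(U, clean, 'a', D)` is 0, while the noisy table yields a strictly positive entropy, so under Definition 7 the clean source would be preferred for attribute `'a'`.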
Because the conditional entropy of Dai [39] is monotonic, H_a(D|I_q) (q = 1, 2, ..., s) for attribute a is also monotonic, and, for attribute a, the smaller the conditional entropy is, the more important the information source is. We thus have the following definition.

Definition 7. Let I_1, I_2, ..., I_s be s incomplete information systems. We define the l-th (l ∈ {1, 2, ..., s}) incomplete information system, which is the most important for attribute a, as

l_a = arg min_{q = 1, ..., s} H_a(D|I_q),

where l_a indexes the information source that is the most important for attribute a.
Example 1. Let us consider a real medical examination issue at a hospital. When diagnosing leukemia, there are 10 patients x_i (i = 1, 2, ..., 10) to be considered. They undergo medical examinations at four hospitals, which test six indicators a_i (i = 1, 2, ..., 6), where a_1-a_6 are, respectively, the "hemoglobin count", "leukocyte count", "blood fat", "blood sugar", "platelet count", and "Hb level". Tables 1-4 are incomplete evaluation tables based on the medical examinations performed at the four hospitals; the symbol "*" means that an expert cannot determine the level of an indicator. Suppose V_D = {Leukemia patient, Non-leukemia patient} and U/D = {Y_1, Y_2}, where Y_1 = {x_1, x_2, x_6, x_8, x_9} and Y_2 = {x_3, x_4, x_5, x_7, x_10}. The conditional entropies of the information sources of D with respect to I_q (q = 1, 2, 3, 4) for the attributes a_i (i = 1, 2, ..., 6) are listed in Table 5. Because the conditional entropy can be used to evaluate the importance of an information source for attribute a, we can determine the most important source for every attribute by using Definition 7 and Table 5: the smaller the conditional entropy is, the more important the information source is for attribute a. Therefore, I_1 is the most important for a_1 and a_6, I_2 is the most important for a_3 and a_5, and I_4 is the most important for a_2 and a_4; I_3 is not the most important for any attribute. A new information system (NIS) is established from parts of each table: we take the attribute values of a_1 and a_6 from I_1, those of a_3 and a_5 from I_2, and those of a_2 and a_4 from I_4, where V_{a_i}^{I_q} (q = 1, 2, 3, 4; i = 1, 2, ..., 6) represents the range of attribute a_i under I_q. We thus obtain the new information system (NIS) after fusion, shown in Table 6. The fusion process is shown in Figure 2.
Suppose that there is a multi-source information system MS = {I_1, I_2, ..., I_s} that contains s information systems and that there are n objects and m attributes in each information system I_i (i = 1, 2, ..., s). We calculate the conditional entropy of each attribute by using Definition 6. Then, we determine the source with the minimum conditional entropy for each attribute by using Definition 7. In Figure 2, different colors of rough lines express which source is selected for the corresponding attributes. The selected attribute values are then integrated into a new information system. In practical applications, mean value fusion is one of the most common fusion methods; we compare this type of method with conditional entropy fusion on the basis of approximation accuracy. The results of the two fusion methods are presented in Tables 6 and 7. Using Tables 6 and 7, we compute and compare the approximation accuracies of the results of the two fusion methods; see Table 8.
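The paper does not spell out the mean value fusion baseline in detail; a natural reading, sketched below under our own assumptions, averages each attribute value over the sources that actually observed it and keeps `'*'` only when every source is missing:

```python
MISSING = '*'

def mean_value_fusion(sources):
    """Fuse tables {obj: {attr: value}} by averaging the known values per cell."""
    fused = {}
    for x in sources[0]:
        fused[x] = {}
        for a in sources[0][x]:
            vals = [t[x][a] for t in sources if t[x][a] != MISSING]
            # Average the observed values; stay missing if nothing was observed.
            fused[x][a] = sum(vals) / len(vals) if vals else MISSING
    return fused

# Two toy sources describing one object with two attributes.
s1 = {1: {'a': 1.0, 'b': MISSING}}
s2 = {1: {'a': 3.0, 'b': MISSING}}
fused = mean_value_fusion([s1, s2])
```

Here `fused[1]['a']` is 2.0, while `fused[1]['b']` stays `'*'` because neither source observed it.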

Table 8. Approximation accuracy: multi-source (CE) fusion, 0.42857; mean value fusion, 0.33333.

By comparing the approximation accuracies, we see that multi-source fusion is better than mean value fusion. Therefore, we design a multi-source fusion algorithm (Algorithm 1) and analyze its computational complexity.
The given algorithm (Algorithm 1) is a new approach to multi-source information fusion; its approximation accuracy is better than that of mean value fusion in the example of Section 3.3. First, we calculate all the tolerance classes T_a^q(x) for every x ∈ U and attribute a. Then, the conditional entropy H_a(D|I_q) is computed for each information source q and attribute a. Finally, the source with the minimum conditional entropy is selected for each attribute a, and the results are spliced into a new table. The computational complexity of Algorithm 1 is shown in Table 9.

Input: A multi-source information system MS = {I_1, I_2, ..., I_s}.

In steps 4 and 5 of Algorithm 1, we compute all the tolerance classes T_a^q(x) for every x ∈ U and attribute a. Steps 6-14 calculate the conditional entropy for information source q and attribute a. Steps 17-26 find the minimum of the conditional entropy and the corresponding source for every a ∈ AT. Finally, the results are returned.
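The steps above can be sketched end to end: compute H_a(D|I_q) for every source and attribute, take the arg-min source per attribute (Definition 7), and splice the winning columns into a new table. This is a minimal sketch under our own assumptions (log base 2, per-attribute tolerance classes); names and toy data are illustrative:

```python
from math import log2

MISSING = '*'

def tolerance_class(universe, table, x, a):
    return {y for y in universe
            if table[x][a] == MISSING or table[y][a] == MISSING
            or table[x][a] == table[y][a]}

def conditional_entropy(universe, table, a, decision_classes):
    h = 0.0
    for x in universe:
        T = tolerance_class(universe, table, x, a)
        for Y in decision_classes:
            p = len(T & Y) / len(T)
            if p > 0:
                h -= p * log2(p)
    return h / len(universe)

def entropy_fusion(sources, universe, attrs, decision_classes):
    """Per attribute, copy the column of the source with minimum conditional entropy."""
    fused = {x: {} for x in universe}
    for a in attrs:
        # l_a = argmin_q H_a(D | I_q)
        best = min(range(len(sources)),
                   key=lambda q: conditional_entropy(universe, sources[q], a,
                                                     decision_classes))
        for x in universe:
            fused[x][a] = sources[best][x][a]
    return fused

U = {1, 2, 3, 4}
D = [{1, 2}, {3, 4}]
I1 = {1: {'a': 0}, 2: {'a': 0}, 3: {'a': 1}, 4: {'a': 1}}  # consistent with D
I2 = {1: {'a': 0}, 2: {'a': 0}, 3: {'a': 0}, 4: {'a': 1}}  # noisier on 'a'
fused = entropy_fusion([I1, I2], U, ['a'], D)
```

Here `I1` has zero conditional entropy on `'a'`, so the fused table takes the `'a'` column from `I1`.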

Experimental Evaluation
In this section, to further illustrate the correctness of the conclusions of the previous example, we conduct a series of experiments to show that the approximation accuracy of conditional entropy fusion is generally higher than that of mean value fusion, based on standard data sets from the machine learning data repository of the University of California at Irvine (http://archive.ics.uci.edu/ml/datasets.html): "Statlog (Vehicle Silhouettes)", "Letter Recognition", "Phishing Websites", "Robot Execution Failures", "Semeion Handwritten Digit", and "SPECTF Heart" (see Table 10). The experimental programs were run on a personal computer with the hardware and software described in Table 11. To build a realistic multi-source incomplete information system, we propose a method for obtaining incomplete data from multiple sources. First, to obtain incomplete data, a complete data set with some data randomly deleted is used as the original incomplete data set. Then, a multi-source incomplete decision table is constructed by adding Gaussian noise and random noise to the original incomplete data set.
Let MIIS = {I_1, I_2, ..., I_s} be a multi-source incomplete decision table constructed from the original incomplete information table I.
First, s numbers (g_1, g_2, ..., g_s) that follow an N(0, σ) distribution, where σ is the standard deviation, are generated. Gaussian noise is added as follows:

I_i(x, a) = I(x, a) + g_i if I(x, a) ≠ *, and I_i(x, a) = * otherwise,

where I(x, a) is the value of object x for attribute a in the original incomplete information table and I_i(x, a) is the value of object x for attribute a in the i-th incomplete information source. Then, s random numbers (e_1, e_2, ..., e_s) between −e and e, where e is a random error threshold, are generated. Random noise is added as follows:

I_i(x, a) = I(x, a) + e_i if I(x, a) ≠ *, and I_i(x, a) = * otherwise.

Next, 40% of the objects are randomly selected from the original incomplete information table I, and Gaussian noise is added to these objects. Then, 20% of the objects are randomly selected from the rest of the original incomplete information table I, and random noise is added to these objects.
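The construction of the noisy multi-source tables can be sketched as follows. This is a hedged reading of the procedure above (the function name, seed handling, and per-cell noise draws are our own simplifications):

```python
import random

MISSING = '*'

def make_multi_source(table, s, sigma=0.5, e=0.5, seed=0):
    """Build s noisy copies of an incomplete table {obj: {attr: value}}.
    40% of objects get Gaussian N(0, sigma) noise, a further 20% get uniform
    noise in [-e, e], the rest are copied unchanged; '*' cells stay missing."""
    rng = random.Random(seed)
    objs = list(table)
    rng.shuffle(objs)
    n = len(objs)
    gauss = set(objs[:int(0.4 * n)])
    unif = set(objs[int(0.4 * n):int(0.6 * n)])
    sources = []
    for _ in range(s):
        t = {}
        for x in table:
            t[x] = {}
            for a, v in table[x].items():
                if v == MISSING:
                    t[x][a] = MISSING            # missing values are preserved
                elif x in gauss:
                    t[x][a] = v + rng.gauss(0, sigma)
                elif x in unif:
                    t[x][a] = v + rng.uniform(-e, e)
                else:
                    t[x][a] = v
        sources.append(t)
    return sources

original = {i: {'a': float(i), 'b': MISSING} for i in range(10)}
sources = make_multi_source(original, s=3)
```

Each generated source keeps the same objects and attributes as the original table, and every `'*'` cell survives the noise step untouched.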

Related Works and Conclusion Analysis
In different fields of science, the standard deviation of the Gaussian noise and the random error threshold of the random noise may differ. In this paper, we conducted 20 experiments for each data set, setting the standard deviation σ and the random error threshold e to values from 0 to 2, increasing by 0.1 in each experiment. For CE fusion and mean value fusion, the approximation accuracy of U/D for each data set is displayed in Table 12 and Figures 3-8. We can easily see from Figures 3-8 and Table 12 that when the noise is small, in most cases, the approximation accuracy of CE fusion is slightly higher than that of mean value fusion. Within a certain range, as the noise increases, the approximation accuracy of CE fusion becomes much better than that of mean value fusion.
By observing the approximation accuracies of the extensions of concepts under CE fusion and mean value fusion for the six data sets, we find that in most cases the approximation accuracy of CE fusion is higher than that of mean value fusion. Within a certain range, as the amount of noise increases, the accuracies under both fusion methods trend upward, but not strictly monotonically.

Conclusions
In this paper, we studied multi-source information fusion from the viewpoint of conditional entropy. There are many information sources with null values in the age of big data. To solve the problem of integrating multiple incomplete information sources, we studied an approach based on multi-source information fusion. We transformed a multi-source information system into a single information table by using this fusion method. Furthermore, we used rough set theory to investigate the fused information table and compared the accuracy of our fusion method with that of the mean value fusion method. According to the accuracies, CE fusion is better than mean value fusion under most conditions. In this paper, we constructed six multi-source information systems, each containing 10 single information sources. Based on these data sets, a series of experiments was conducted, and the results showed the effectiveness of the proposed fusion method. This study will be useful for fusing uncertain information in multi-source information systems, and it provides valuable options for data processing in multi-source environments.

Figure 1. A multi-source information box.

Definition 4. Given an incomplete information system IIS = (U, AT, V, f), for any attribute a ∈ AT, let T(a) denote the binary tolerance relation between objects that are possibly indiscernible in terms of a. T(a) is defined as T(a) = {(x, y) | dis_a(x, y) ≤ L_a}, where L_a indicates the threshold associated with attribute a. The tolerance class of object x with respect to attribute a is denoted by T_a(x) = {y | (x, y) ∈ T(a)}.

Definition 5. Given an incomplete information system IIS = (U, AT, V, f), for any attribute subset B ⊆ AT, let T(B) denote the binary tolerance relation between objects that are possibly indiscernible in terms of B. T(B) is defined as T(B) = ∩_{a∈B} T(a).

Figure 2. The process of multi-source information fusion.

Figure 4. Approximation accuracies for the decision classes in data set S (VS).

Figure 5. Approximation accuracies for the decision classes in data set AS-N.

Figure 6. Approximation accuracies for the decision classes in data set IS.

Figure 7. Approximation accuracies for the decision classes in data set S (LS).

Figure 8. Approximation accuracies for the decision classes in data set EES.

Table 5. The conditional entropy of information sources for different attributes.

Table 6. The result of multi-source information fusion.

Table 7. The result of mean value fusion of multiple information sources.
Algorithm 1 (fragment, steps 7-22):

7   for i = 1 : |U| do
8       for j = 1 : m do
9           if |T_a^q(x_i) ∩ Y_j| > 0 then
10              HCE ← HCE − (|T_a^q(x_i) ∩ Y_j| / |T_a^q(x_i)|) log(|T_a^q(x_i) ∩ Y_j| / |T_a^q(x_i)|);
11          end
12      end
13  end
14  H_a(D|I_q) ← HCE;  // record CE for attribute a and information source q
15  end
16  end
17  for each a ∈ AT do
18      minCE ← +∞;
19      for q = 1 : s do
20          if H_a(D|I_q) < minCE then
21              minCE ← H_a(D|I_q);
22              l_a ← q;

Table 11. Description of the experimental environment.

Figure 3. Approximation accuracies for the decision classes in data set WC. CE and M stand for CE fusion and mean value fusion, respectively.

Table 12. Approximation accuracies of conditional entropy fusion (CE) and mean value fusion (M) for each data set.