Entropy 2011, 13(4), 860-901; doi:10.3390/e13040860
Article
A Feature Subset Selection Method Based On High-Dimensional Mutual Information
1 Institute of Developmental Biology and Molecular Medicine, Fudan University, 220 Handan Road, Shanghai 200433, China
2 School of Life Sciences, Fudan University, 220 Handan Road, Shanghai 200433, China
3 School of Computer Engineering, Nanyang Technological University, 50 Nanyang Avenue, 639798, Singapore
* Authors to whom correspondence should be addressed.
Received: 8 January 2011; in revised form: 18 March 2011 / Accepted: 23 March 2011 / Published: 19 April 2011
Abstract: Feature selection is an important step in building accurate classifiers and provides a better understanding of the data sets. In this paper, we propose a feature subset selection method based on high-dimensional mutual information. We also propose using the entropy of the class attribute as a criterion to determine the appropriate subset of features when building classifiers. We prove that if the mutual information between a feature set X and the class attribute Y equals the entropy of Y, then X is a Markov blanket of Y. We show that in some cases it is infeasible to approximate the high-dimensional mutual information with algebraic combinations of pairwise mutual information in any form. For these kinds of data sets, an exhaustive search over all combinations of features is a prerequisite for finding the optimal feature subsets for classification. We show that our approach outperforms existing filter feature subset selection methods on most of the 24 selected benchmark data sets.
Keywords: feature selection; mutual information; entropy; information theory; Markov blanket; classification
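The abstract's two central claims (that I(X;Y) = H(Y) signals a sufficient feature set, and that pairwise mutual information can fail to approximate the high-dimensional quantity) can be illustrated with a small sketch. This is not the authors' algorithm: the XOR data set, the helper names `entropy` and `mutual_information`, and the use of plain empirical frequencies are all assumptions made for illustration only.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(samples):
    """Empirical Shannon entropy (in bits) of a list of hashable outcomes."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())


def mutual_information(xs, ys):
    """Empirical I(X;Y) = H(X) + H(Y) - H(X,Y), in bits."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


# Hypothetical toy data: Y = X1 XOR X2, each input combination seen once.
rows = list(product([0, 1], repeat=2))  # joint feature vectors (X1, X2)
x1 = [a for a, _ in rows]
x2 = [b for _, b in rows]
y = [a ^ b for a, b in rows]

h_y = entropy(y)                        # entropy of the class attribute
mi_x1 = mutual_information(x1, y)       # pairwise: X1 alone vs. Y
mi_x2 = mutual_information(x2, y)       # pairwise: X2 alone vs. Y
mi_joint = mutual_information(rows, y)  # high-dimensional: {X1, X2} vs. Y

print(f"H(Y)={h_y:.3f}  I(X1;Y)={mi_x1:.3f}  "
      f"I(X2;Y)={mi_x2:.3f}  I(X1,X2;Y)={mi_joint:.3f}")
```

In this toy case each feature alone carries no information about Y, so no algebraic combination of the two pairwise terms can recover the joint value, while the pair together reaches I(X1,X2;Y) = H(Y). That is the situation in which the abstract argues that combinations of features must be searched jointly.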
Cite This Article
MDPI and ACS Style
Zheng, Y.; Kwoh, C.K. A Feature Subset Selection Method Based On High-Dimensional Mutual Information. Entropy 2011, 13, 860-901.
AMA Style
Zheng Y, Kwoh CK. A Feature Subset Selection Method Based On High-Dimensional Mutual Information. Entropy. 2011; 13(4):860-901.
Chicago/Turabian Style
Zheng, Yun; Kwoh, Chee Keong. 2011. "A Feature Subset Selection Method Based On High-Dimensional Mutual Information." Entropy 13, no. 4: 860-901.
