Entropy 2011, 13(4), 860-901; doi:10.3390/e13040860
Article

A Feature Subset Selection Method Based On High-Dimensional Mutual Information

Yun Zheng 1,2,* and Chee Keong Kwoh 3,*
1 Institute of Developmental Biology and Molecular Medicine, Fudan University, 220 Handan Road, Shanghai 200433, China
2 School of Life Sciences, Fudan University, 220 Handan Road, Shanghai 200433, China
3 School of Computer Engineering, Nanyang Technological University, 50 Nanyang Avenue, 639798, Singapore
* Authors to whom correspondence should be addressed.
Received: 8 January 2011; in revised form: 18 March 2011 / Accepted: 23 March 2011 / Published: 19 April 2011
Abstract: Feature selection is an important step in building accurate classifiers and provides a better understanding of the data sets. In this paper, we propose a feature subset selection method based on high-dimensional mutual information. We also propose using the entropy of the class attribute as a criterion for determining the appropriate subset of features when building classifiers. We prove that if the mutual information between a feature set X and the class attribute Y equals the entropy of Y, then X is a Markov blanket of Y. We show that in some cases it is infeasible to approximate the high-dimensional mutual information with algebraic combinations of pairwise mutual information of any form. In addition, an exhaustive search over all combinations of features is a prerequisite for finding the optimal feature subsets for classifying these kinds of data sets. We show that our approach outperforms existing filter feature subset selection methods on most of the 24 selected benchmark data sets.
Keywords: feature selection; mutual information; entropy; information theory; Markov blanket; classification
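
The abstract's claim about pairwise mutual information can be illustrated with a small sketch. It assumes a toy XOR data set (Y = X1 XOR X2 with all four input combinations equally likely), which is chosen here for illustration and is not taken from the paper; the helper names `entropy` and `mutual_information` are likewise hypothetical. Each single feature carries no information about Y, yet the pair of features together has mutual information equal to the entropy of Y, which is the condition the abstract uses to identify an appropriate feature subset.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(samples):
    """Empirical Shannon entropy (in bits) of a sequence of hashable symbols."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def mutual_information(xs, ys):
    """Empirical I(X;Y) = H(X) + H(Y) - H(X,Y).

    Each element of xs may be a tuple, so X can be a whole feature set.
    """
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


# Toy data (an assumption for illustration): Y = X1 XOR X2, uniform inputs.
rows = list(product([0, 1], repeat=2))   # every (x1, x2) combination once
x1 = [a for a, _ in rows]
x2 = [b for _, b in rows]
y = [a ^ b for a, b in rows]

h_y = entropy(y)                          # H(Y)
mi_x1 = mutual_information(x1, y)         # I(X1;Y), pairwise
mi_x2 = mutual_information(x2, y)         # I(X2;Y), pairwise
mi_joint = mutual_information(rows, y)    # I({X1,X2};Y), high-dimensional

print(f"H(Y)={h_y}  I(X1;Y)={mi_x1}  I(X2;Y)={mi_x2}  I(X1,X2;Y)={mi_joint}")
```

Because both pairwise terms are zero here, no algebraic combination of them can recover the joint value, and only evaluating the features together reveals that the pair determines Y. These are plug-in estimates from counts; on real, finite data sets they are biased, so the equality with H(Y) would need a tolerance.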

Cite This Article

MDPI and ACS Style

Zheng, Y.; Kwoh, C.K. A Feature Subset Selection Method Based On High-Dimensional Mutual Information. Entropy 2011, 13, 860-901.

AMA Style

Zheng Y, Kwoh CK. A Feature Subset Selection Method Based On High-Dimensional Mutual Information. Entropy. 2011; 13(4):860-901.

Chicago/Turabian Style

Zheng, Yun; Kwoh, Chee Keong. 2011. "A Feature Subset Selection Method Based On High-Dimensional Mutual Information." Entropy 13, no. 4: 860-901.

Entropy, EISSN 1099-4300. Published by MDPI AG, Basel, Switzerland.