Discretization Based on Entropy and Multiple Scanning
Department of Electrical Engineering and Computer Science, University of Kansas, 3014 Eaton Hall, Lawrence, KS 66045, USA
Department of Expert Systems and Artificial Intelligence, University of Information Technology and Management, Rzeszow 35-225, Poland
Received: 28 February 2013; in revised form: 16 April 2013 / Accepted: 18 April 2013 / Published: 25 April 2013
(This article belongs to the Special Issue Big Data)
Abstract: In this paper we present an entropy-driven methodology for discretization. Recently, the original entropy-based discretization was enhanced with two options for selecting the best numerical attribute. In the first option, Dominant Attribute, the attribute with the smallest conditional entropy of the concept given the attribute is selected for discretization, and then the best cut point is determined. In the second option, Multiple Scanning, all attributes are scanned a number of times, and at the same time the best cut points are selected for all attributes. We present the results of experiments on 17 benchmark data sets, including large data sets with 175 attributes or 25,931 cases. For comparison, we also include the results of experiments on the same data sets using the global versions of the well-known Equal Interval Width and Equal Frequency per Interval discretization methods. The entropy-driven technique enhanced both of these methods by converting them into globalized methods. Our experiments show that the Multiple Scanning methodology is significantly better than both Dominant Attribute and the better of the results of the Globalized Equal Interval Width and Equal Frequency per Interval methods (two-tailed test, 0.01 level of significance).
Keywords: numerical attributes; entropy; discretization; data mining
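The Dominant Attribute option described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the function names, the toy data, the choice of midpoints between consecutive distinct values as cut point candidates, and the use of a binary split to score each candidate are all assumptions made for illustration only.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of concept labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def conditional_entropy(values, labels):
    """H(concept | attribute): entropy of the labels inside each group of
    equal attribute values, weighted by the relative group size."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    total = len(labels)
    return sum(len(g) / total * entropy(g) for g in groups.values())


def dominant_attribute(table, labels):
    """Pick the attribute with the smallest conditional entropy.
    `table` maps a (hypothetical) attribute name to its list of values."""
    return min(table, key=lambda name: conditional_entropy(table[name], labels))


def best_cut_point(values, labels):
    """Return (cut, entropy) for the cut with the smallest conditional entropy.
    Assumption: candidates are midpoints between consecutive distinct values,
    and each candidate is scored by the binary split `value < cut`.
    Returns None when the attribute has fewer than two distinct values."""
    distinct = sorted(set(values))
    best = None
    for low, high in zip(distinct, distinct[1:]):
        cut = (low + high) / 2
        below = [v < cut for v in values]
        h = conditional_entropy(below, labels)
        if best is None or h < best[1]:
            best = (cut, h)
    return best


# Toy data, invented for this example only.
labels = ["no", "no", "yes", "yes"]
table = {"a": [1, 1, 2, 2], "b": [5, 6, 5, 6]}

chosen = dominant_attribute(table, labels)
cut, cut_entropy = best_cut_point(table[chosen], labels)
```

In this toy table, attribute `a` separates the two concepts perfectly while `b` carries no information about them, so `a` is selected and a cut between its two values is chosen. Multiple Scanning, by contrast, would select a cut point for every attribute in each scan rather than for one dominant attribute.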
Cite This Article
Grzymala-Busse, J.W. Discretization Based on Entropy and Multiple Scanning. Entropy 2013, 15(5), 1486-1502.