Improving the Naive Bayes Classifier via a Quick Variable Selection Method Using Maximum of Entropy
Abstract

Variable selection methods play an important role in the field of attribute mining. The Naive Bayes (NB) classifier is a simple and popular classification method that yields good results in a short processing time, which makes it well suited to very large datasets. Its performance, however, depends strongly on the relationships among the variables. The Info-Gain (IG) measure, which is based on entropy, can be used as a quick variable selection method: it ranks the attribute variables by the information they provide about a variable under study, as estimated from a dataset. Its main drawback is that the measure is always non-negative, so an information threshold must be set for each dataset in order to select the set of most important variables. We introduce here a new quick variable selection method that generalizes the IG-based approach. It uses imprecise probabilities and the maximum-entropy measure to select the most informative variables without requiring a threshold. Combined with the Naive Bayes classifier, this new variable selection method improves on the original and provides a valuable tool for handling datasets with a very large number of features and a huge amount of data, where more complex methods are not computationally feasible.
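To make the IG-based ranking concrete, the following is a minimal sketch (not code from the paper) of how Info-Gain, IG(C; X) = H(C) - H(C | X), ranks attributes on a tiny, invented toy dataset; the feature names and data are illustrative only, and note that selecting a final subset from this ranking still requires choosing a threshold, which is the drawback the paper addresses:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H(C), in bits, of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(feature_values, labels):
    """Info-Gain IG(C; X) = H(C) - H(C | X) for a single attribute X."""
    n = len(labels)
    conditional = 0.0
    for v in set(feature_values):
        subset = [c for x, c in zip(feature_values, labels) if x == v]
        conditional += len(subset) / n * entropy(subset)
    return entropy(labels) - conditional

# Hypothetical toy dataset: rank attributes by IG with respect to the class.
data = {
    "outlook": ["sun", "sun", "rain", "rain"],  # perfectly predicts the class
    "windy":   ["y", "n", "y", "n"],            # carries no information
}
labels = ["no", "no", "yes", "yes"]
ranking = sorted(data, key=lambda f: info_gain(data[f], labels), reverse=True)
```

Here `outlook` receives IG = 1 bit and `windy` IG = 0, so `outlook` ranks first; because IG is never negative, deciding where to cut this ranking is exactly the threshold problem the proposed maximum-entropy method avoids.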
Cite This Article
Abellán, J.; Castellano, J.G. Improving the Naive Bayes Classifier via a Quick Variable Selection Method Using Maximum of Entropy. Entropy 2017, 19, 247.