Open Access Article
Computation 2019, 7(1), 16; https://doi.org/10.3390/computation7010016

Extreme Multiclass Classification Criteria

A. Choromanska * and I. Kumar Jain
NYU Tandon School of Engineering, Department of Electrical and Computer Engineering, 5 MetroTech Center, Brooklyn, NY 11201, USA
* Author to whom correspondence should be addressed.
Received: 2 February 2019 / Revised: 5 March 2019 / Accepted: 8 March 2019 / Published: 12 March 2019
(This article belongs to the Special Issue Machine Learning for Computational Science and Engineering)

Abstract

We analyze the theoretical properties of the recently proposed objective function for the efficient online construction and training of multiclass classification trees in settings where the label space is very large. We establish the key properties of this objective and provide a complete proof that maximizing it simultaneously encourages balanced trees and improves the purity of the class distributions at subsequent levels of the tree. We further explore its connection to three well-known entropy-based decision tree criteria, namely Shannon entropy, Gini-entropy, and its modified variant, for which efficient optimization strategies are largely unknown in the extreme multiclass setting. We show theoretically that this objective can be viewed as a surrogate function for all of these entropy criteria and that maximizing it indirectly optimizes them as well. We derive boosting guarantees and obtain a closed-form expression for the number of iterations needed to reduce the considered entropy criteria below an arbitrary threshold. The resulting theorem relies on a weak hypothesis assumption that depends directly on the considered objective function. Finally, we prove that optimizing the objective directly reduces the multiclass classification error of the decision tree.
Keywords: multiclass classification; decision trees; boosting
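
This abstract page does not reproduce the objective or the entropy criteria themselves. As a rough, non-authoritative illustration of the quantities involved, the sketch below assumes the standard multiclass definitions of Shannon entropy and Gini-entropy, a square-root "modified" Gini variant, and a split objective of the form J(h) = 2 Σ_i π_i |P − P_i| (as in the authors' earlier LOMtree line of work), where π_i is the probability that an example reaching the node has label i, P_i = Pr(h(x) > 0 | y = i), and P = Σ_i π_i P_i. All function names and exact forms here are illustrative assumptions, not definitions taken from this paper.

import numpy as np

def shannon_entropy(pi):
    """Shannon entropy H(pi) = -sum_i pi_i log pi_i (standard definition)."""
    pi = np.asarray(pi, dtype=float)
    nz = pi[pi > 0.0]
    return float(-np.sum(nz * np.log(nz)))

def gini_entropy(pi):
    """Gini-entropy G(pi) = sum_i pi_i (1 - pi_i) (standard definition)."""
    pi = np.asarray(pi, dtype=float)
    return float(np.sum(pi * (1.0 - pi)))

def modified_gini(pi):
    """A square-root Gini variant, sum_i sqrt(pi_i (1 - pi_i)).
    NOTE: assumed form; the paper's exact modified variant may differ."""
    pi = np.asarray(pi, dtype=float)
    return float(np.sum(np.sqrt(pi * (1.0 - pi))))

def split_objective(pi, p_cond):
    """Hypothetical balanced-and-pure split objective,
    J = 2 * sum_i pi_i * |P - P_i|, with P = sum_i pi_i * P_i.
    pi[i]     : probability that an example at this node has label i
    p_cond[i] : probability that the node hypothesis sends class i right
    J is maximal (= 1) for a perfectly balanced split (P = 1/2) that is
    perfectly pure (every P_i equal to 0 or 1)."""
    pi = np.asarray(pi, dtype=float)
    p_cond = np.asarray(p_cond, dtype=float)
    P = float(np.dot(pi, p_cond))  # overall probability of going right
    return 2.0 * float(np.dot(pi, np.abs(P - p_cond)))

# Toy check on 4 equiprobable classes: send classes 0-1 right, 2-3 left.
pi = np.full(4, 0.25)
p_cond = np.array([1.0, 1.0, 0.0, 0.0])
print(split_objective(pi, p_cond))           # 1.0 -> balanced and pure
print(split_objective(pi, np.full(4, 0.5)))  # 0.0 -> balanced but impure
print(shannon_entropy(pi), gini_entropy(pi), modified_gini(pi))

Under this reading, maximizing J rewards hypotheses that split the probability mass in half (balance) while routing each individual class almost entirely to one side (purity), which is the behavior one would expect from a surrogate for the entropy criteria named in the abstract.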
This is an open access article distributed under the Creative Commons Attribution (CC BY 4.0) License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Cite This Article

MDPI and ACS Style

Choromanska, A.; Kumar Jain, I. Extreme Multiclass Classification Criteria. Computation 2019, 7, 16.

Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.
