Open Access Article
Entropy 2016, 18(8), 282; doi:10.3390/e18080282

How Is a Data-Driven Approach Better than Random Choice in Label Space Division for Multi-Label Classification?

1 Department of Computational Intelligence, Wrocław University of Technology, Wybrzeże Stanisława Wyspiańskiego 27, 50-370 Wrocław, Poland
2 Illimites Foundation, Gajowicka 64 lok. 1, 53-422 Wrocław, Poland
3 Department of Computer Science, TU Dortmund University, August-Schmidt-Straße 4, 44221 Dortmund, Germany
* Author to whom correspondence should be addressed.
Academic Editor: Andreas Holzinger
Received: 1 February 2016 / Revised: 12 July 2016 / Accepted: 19 July 2016 / Published: 30 July 2016

Abstract

We propose using five data-driven community detection approaches from social networks to partition the label space in the task of multi-label classification as an alternative to random partitioning into equal subsets as performed by RAkELd. We evaluate modularity maximization using fast greedy and leading eigenvector approximations, as well as the infomap, walktrap and label propagation algorithms. For this purpose, we propose to construct a label co-occurrence graph (in both weighted and unweighted versions) based on training data and perform community detection to partition the label set. Each partition then constitutes a label space for a separate multi-label classification sub-problem. As a result, we obtain an ensemble of multi-label classifiers that jointly covers the whole label space. Based on the binary relevance and label powerset classification methods, we compare community detection-based label space divisions against random baselines on 12 benchmark datasets over five evaluation measures. We discover that data-driven approaches are more efficient and more likely to outperform RAkELd than binary relevance or label powerset are, in every evaluated measure. For all measures, apart from Hamming loss, data-driven approaches are significantly better than RAkELd (α = 0.05), and at least one data-driven approach is more likely to outperform RAkELd than a priori methods in the case of RAkELd’s best performance. This is the largest RAkELd evaluation published to date, with 250 samplings per value for 10 values of the RAkELd parameter k on 12 datasets.
Keywords: label space clustering; label co-occurrence; label grouping; multi-label classification; clustering; machine learning; random k-label sets; ensemble classification
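
The label space division described in the abstract can be illustrated with a minimal Python sketch. It builds a label co-occurrence graph (weighted or unweighted) from toy training label sets, partitions the labels with a simplified, deterministic label propagation routine, and contrasts this with a RAkELd-style random partition into subsets of size k. The function names, the toy data, and the propagation variant are assumptions made for illustration only; they are not the implementations evaluated in the article, and the classifier training step on each partition is omitted.

```python
import random
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(label_sets, weighted=True):
    """Map each label pair (a, b), a < b, to how often it co-occurs in a sample."""
    edges = defaultdict(int)
    for labels in label_sets:
        for a, b in combinations(sorted(set(labels)), 2):
            edges[(a, b)] += 1
    if not weighted:
        # Unweighted version: an edge exists if the pair co-occurs at least once.
        return {edge: 1 for edge in edges}
    return dict(edges)


def random_partition(labels, k, seed=0):
    """RAkELd-style baseline: shuffle the labels and cut them into disjoint
    subsets of size k (the last subset may be smaller)."""
    shuffled = sorted(labels)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i:i + k] for i in range(0, len(shuffled), k)]


def label_propagation(labels, edges, max_rounds=100):
    """Simplified, deterministic label propagation (illustrative stand-in).

    Each node adopts the community with the largest total edge weight among
    its neighbours, and only moves when that strictly beats its current one.
    """
    neighbours = defaultdict(dict)
    for (a, b), weight in edges.items():
        neighbours[a][b] = weight
        neighbours[b][a] = weight
    community = {label: label for label in labels}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(labels):
            votes = defaultdict(int)
            for other, weight in neighbours[node].items():
                votes[community[other]] += weight
            if not votes:
                continue  # isolated label keeps its own community
            best = max(sorted(votes), key=lambda c: votes[c])
            if votes[best] > votes.get(community[node], 0):
                community[node] = best
                changed = True
        if not changed:
            break
    groups = defaultdict(list)
    for label in sorted(labels):
        groups[community[label]].append(label)
    return sorted(groups.values())


# Hypothetical training label sets, one per training sample.
samples = [
    {"a", "b"}, {"a", "b", "c"}, {"b", "c"},
    {"d", "e"}, {"d", "e"}, {"c", "d"},
]
labels = sorted({label for sample in samples for label in sample})
graph = cooccurrence_graph(samples, weighted=True)

print(label_propagation(labels, graph))  # data-driven partition
print(random_partition(labels, k=2))     # random baseline partition
```

On this toy data the data-driven partition groups the labels as {a, b, c} and {d, e}, following the co-occurrence structure, whereas the random baseline ignores it. In the scheme the abstract describes, a separate multi-label classifier would then be trained on each resulting label subset.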

This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


MDPI and ACS Style

Szymański, P.; Kajdanowicz, T.; Kersting, K. How Is a Data-Driven Approach Better than Random Choice in Label Space Division for Multi-Label Classification? Entropy 2016, 18, 282.


Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

Entropy, EISSN 1099-4300. Published by MDPI AG, Basel, Switzerland.