Open Access Article
Information 2017, 8(2), 52

Multi-Label Classification from Multiple Noisy Sources Using Topic Models

Department of Computer Science and Automation, Indian Institute of Science, Bangalore-560012, India
This paper is an extended version of our paper published in TMNZ 2016 and IEEE ICTAI 2016.
Author to whom correspondence should be addressed.
Academic Editor: Willy Susilo
Received: 24 January 2017 / Revised: 24 April 2017 / Accepted: 27 April 2017 / Published: 5 May 2017
(This article belongs to the Special Issue Text Mining Applications and Theory)


Multi-label classification is a well-known supervised machine learning setting in which each instance is associated with multiple classes. Examples include annotating images with multiple labels and assigning multiple tags to a web page. Since several labels can be assigned to a single instance, one of the key challenges in this problem is to learn the correlations between the classes. Our first contribution addresses the case where the labels come from a perfect source. For this setting, we propose a novel topic model (ML-PA-LDA). The distinguishing feature of our model is that the classes that are present, as well as the classes that are absent, generate the latent topics and hence the words. Extensive experimentation on real-world datasets reveals the superior performance of the proposed model. A natural way to procure a training dataset is to mine user-generated content or to collect labels directly from users on a crowdsourcing platform. In this more practical crowdsourcing scenario, an additional challenge arises because the labels of the training instances are provided by noisy, heterogeneous crowd-workers of unknown quality. With this motivation, we further extend our topic model to the scenario where the labels are provided by multiple noisy sources, and refer to this model as ML-PA-LDA-MNS. In experiments with simulated noisy annotators, the proposed model learns the qualities of the annotators well, even with minimal training data.
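The multiple-noisy-sources setting described above can be illustrated with a small simulation. The sketch below is not ML-PA-LDA-MNS and does not reproduce the paper's experiments: it assumes, purely for illustration, that each annotator has a single scalar quality (the probability of reporting each class label correctly), and it uses majority voting and agreement rates as a crude stand-in for the paper's variational inference. All function names are hypothetical.

```python
import random


def simulate_annotations(true_labels, qualities, rng):
    """Simulate noisy annotators (illustrative assumption, not the paper's model).

    Each annotator reports every binary class label correctly with
    probability equal to that annotator's quality, and flips it otherwise.
    Returns a list indexed as annotations[annotator][instance][class].
    """
    annotations = []
    for q in qualities:
        noisy = [
            [y if rng.random() < q else 1 - y for y in instance]
            for instance in true_labels
        ]
        annotations.append(noisy)
    return annotations


def majority_vote(annotations):
    """Aggregate labels per instance and class by strict majority."""
    n_annotators = len(annotations)
    n_instances = len(annotations[0])
    n_classes = len(annotations[0][0])
    return [
        [
            1 if 2 * sum(a[i][c] for a in annotations) > n_annotators else 0
            for c in range(n_classes)
        ]
        for i in range(n_instances)
    ]


def agreement_rate(annotator_labels, reference):
    """Fraction of (instance, class) entries where the annotator matches the reference."""
    total = sum(len(row) for row in reference)
    agree = sum(
        int(a == r)
        for row_a, row_r in zip(annotator_labels, reference)
        for a, r in zip(row_a, row_r)
    )
    return agree / total


if __name__ == "__main__":
    rng = random.Random(0)
    # Ten instances, four classes; labels drawn uniformly at random (toy data).
    truth = [[rng.randint(0, 1) for _ in range(4)] for _ in range(10)]
    noisy = simulate_annotations(truth, [0.9, 0.8, 0.6], rng)
    consensus = majority_vote(noisy)
    for idx, labels in enumerate(noisy):
        # Agreement with the consensus is a rough proxy for annotator quality.
        print(idx, round(agreement_rate(labels, consensus), 2))
```

Agreement with the majority vote is only a rough proxy for annotator quality, since the consensus itself can be wrong when most annotators are unreliable. That limitation is one reason to estimate annotator qualities jointly with the labels, as the abstract describes.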
Keywords: multi-label classification; topic models; multiple sources; variational inference

Figure 1

This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).

Share & Cite This Article

MDPI and ACS Style

Padmanabhan, D.; Bhat, S.; Shevade, S.; Narahari, Y. Multi-Label Classification from Multiple Noisy Sources Using Topic Models. Information 2017, 8, 52.


Information, EISSN 2078-2489, published by MDPI AG, Basel, Switzerland