An Informed Framework for Training Classifiers from Social Media†
1 Department of Computer Science and Engineering, Hankuk University of Foreign Studies, 81 Oedae-ro, Mohyeon-myeon, Cheoin-gu, Yongin-si, Gyeonggi-do 449-791, South Korea
2 Department of Computer Science, University of Verona, Strada Le Grazie 15, I-37134 Verona, Italy
* Author to whom correspondence should be addressed.
† This paper is an extended version of our paper published in 18th International Conference on Image Analysis and Processing (ICIAP 2015), Genoa, Italy, 7–11 September 2015.
Academic Editor: Andreas Holzinger
Entropy 2016, 18(4), 130; https://doi.org/10.3390/e18040130
Received: 18 January 2016 / Revised: 22 March 2016 / Accepted: 28 March 2016 / Published: 9 April 2016
(This article belongs to the Special Issue Machine Learning and Entropy: Discover Unknown Unknowns in Complex Data Sets)
Extracting information from social media has become a major focus of companies and researchers in recent years. Aside from the study of the social aspects, it has also been found feasible to exploit the collaborative strength of crowds to help solve classical machine learning problems like object recognition. In this work, we focus on the generally underappreciated problem of building effective datasets for training classifiers by automatically assembling data from social media. We detail some of the challenges of this approach and outline a framework that uses expanded search queries to retrieve more qualified data. In particular, we concentrate on collaboratively tagged media on the social platform Flickr, and on the problem of image classification to evaluate our approach. Finally, we describe a novel entropy-based method to incorporate an information-theoretic principle to guide our framework. Experimental validation against well-known public datasets shows the viability of this approach and marks an improvement over the state of the art in terms of simplicity and performance.
Keywords: training sets; image classification; Shannon entropy; social media
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Supplementary File 1: ZIP-Document (ZIP, 44217 KiB)
MDPI and ACS Style
Cheng, D.S.; Abdulhak, S.A. An Informed Framework for Training Classifiers from Social Media. Entropy 2016, 18, 130. https://doi.org/10.3390/e18040130