Open Access Article

Zero-Shot Deep Learning for Media Mining: Person Spotting and Face Clustering in Video Big Data

1 Department of Electronics, College of Electrical and Computer Engineering, Chungbuk National University, Cheongju 28644, Korea
2 Informatics Department, Electronics Research Institute (ERI), Giza 12622, Egypt
3 Computer Engineering Department, Faculty of Engineering, Cairo University, Giza 12613, Egypt
4 University of Science and Technology, Zewail City of Science and Technology, October Gardens, Giza 12578, Egypt
* Authors to whom correspondence should be addressed.
Electronics 2019, 8(12), 1394; https://doi.org/10.3390/electronics8121394
Received: 23 October 2019 / Revised: 15 November 2019 / Accepted: 19 November 2019 / Published: 22 November 2019
(This article belongs to the Section Artificial Intelligence)
The analysis of frame sequences in talk show videos, which is necessary for media mining and television production, requires significant manual effort and is very time-consuming. Given the vast number of unlabeled face frames in talk show videos, we address the problem of recognizing and clustering faces. In this paper, we propose a TV media mining system based on a deep convolutional neural network trained with a triplet loss minimization method. The main function of the proposed system is to index and cluster video data, which supports effective media production analysis of individuals in talk show videos and rapid identification of a specific individual in video data in real time. Our system uses several face datasets: Labeled Faces in the Wild (LFW), a collection of unlabeled web face images, YouTube Faces, and a talk show faces dataset. In the recognition (person spotting) task, our system achieves an F-measure of 0.996 on the collection of unlabeled web face images and an F-measure of 0.972 on the talk show faces dataset. In the clustering task, our system achieves F-measures of 0.764 and 0.935 on the YouTube Faces database and the LFW dataset, respectively, and an F-measure of 0.832 on the talk show faces dataset, improvements of 5.4%, 6.5%, and 8.2% over previous methods.
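The abstract does not spell out the triplet loss it refers to. As a minimal sketch, assuming the common formulation on squared Euclidean distances between embeddings, the loss for one (anchor, positive, negative) triplet can be written as below. The function names and the margin value of 0.2 are illustrative assumptions, not details taken from the paper.

```python
from typing import Sequence


def squared_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 0.2,  # assumed value, chosen only for illustration
) -> float:
    """Hinge-style triplet loss for a single triplet.

    The loss is zero once the negative is farther from the anchor
    than the positive by at least `margin` (in squared distance).
    """
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D embeddings (hypothetical values, not from any dataset).
anchor = [0.0, 0.0]
positive = [0.0, 0.1]        # same identity, close to the anchor
easy_negative = [1.0, 0.0]   # different identity, already far away
hard_negative = [0.1, 0.0]   # different identity, as close as the positive

easy_loss = triplet_loss(anchor, positive, easy_negative)
hard_loss = triplet_loss(anchor, positive, hard_negative)
```

In this toy case the easy negative contributes no loss, while the hard negative is penalized because it is not yet separated from the anchor by the margin. Minimizing such a loss over many triplets is what pulls faces of the same person together in embedding space, which is the property that distance-based recognition and clustering rely on.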
Keywords: face clustering; face recognition; face detection; CNN; KL divergence; triplet loss

MDPI and ACS Style

Abdallah, M.S.; Kim, H.; Ragab, M.E.; Hemayed, E.E. Zero-Shot Deep Learning for Media Mining: Person Spotting and Face Clustering in Video Big Data. Electronics 2019, 8, 1394.

