Open Access Article
Sensors 2018, 18(9), 3040

CNN-Based Multimodal Human Recognition in Surveillance Environments

Division of Electronics and Electrical Engineering, Dongguk University, 30 Pil-dong-ro, 1-gil, Jung-gu, Seoul 100-715, Korea
Author to whom correspondence should be addressed.
Received: 7 August 2018 / Revised: 7 September 2018 / Accepted: 8 September 2018 / Published: 11 September 2018
(This article belongs to the Special Issue Deep Learning-Based Image Sensors)


Most current research on human recognition focuses on re-identifying body images captured by several cameras in outdoor environments; there is almost no research on indoor human recognition. Previous work on indoor recognition has mainly focused on face recognition, because the camera is usually closer to a person indoors than outdoors. However, indoor surveillance cameras are installed near the ceiling and capture images from above at a downward angle, so in most cases people do not look directly at the camera. As a result, it is often difficult to capture frontal face images, and facial recognition accuracy drops sharply when they are unavailable. One way to overcome this problem is to use both the face and the body for recognition. With indoor cameras, however, often only part of the target's body falls within the camera's viewing angle, which reduces recognition accuracy. To address these problems, this paper proposes a multimodal human recognition method that uses both the face and the body and is based on deep convolutional neural networks (CNNs). Specifically, to handle cases in which part of the body is not captured, the face and the body are recognized by separate CNNs, VGG Face-16 and ResNet-50, and their results are combined by score-level fusion using the weighted sum rule to improve recognition performance. Experiments on the custom-made Dongguk face and body database (DFB-DB1) and the open ChokePoint database show that the proposed method achieves higher recognition accuracy (equal error rates of 1.52% and 0.58%, respectively) than recognition based on the face or body alone and than methods used in previous studies.
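As a rough illustration of the two technical ideas named in the abstract, score-level fusion by the weighted sum rule and the equal error rate (EER), the following Python sketch may help. It is not the authors' implementation. It assumes that each modality yields a distance-like matching score (lower means more similar), that scores are min-max normalized before fusion, and that the weight `w_face` is a free parameter; the function names and the sample numbers are hypothetical.

```python
import numpy as np


def min_max_normalize(scores):
    """Rescale scores to [0, 1] so both modalities share one range (assumed step)."""
    scores = np.asarray(scores, dtype=float)
    lo, hi = scores.min(), scores.max()
    if hi == lo:
        # All scores identical: nothing to rescale.
        return np.zeros_like(scores)
    return (scores - lo) / (hi - lo)


def weighted_sum_fusion(face_scores, body_scores, w_face=0.5):
    """Weighted sum rule: fused = w * face + (1 - w) * body."""
    face = np.asarray(face_scores, dtype=float)
    body = np.asarray(body_scores, dtype=float)
    return w_face * face + (1.0 - w_face) * body


def equal_error_rate(genuine, impostor):
    """Approximate EER for distance scores (a pair is accepted if score <= threshold).

    Scans every observed score as a threshold and returns the mean of the
    false acceptance rate and false rejection rate where they are closest.
    """
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    thresholds = np.unique(np.concatenate([genuine, impostor]))
    best_gap, eer = np.inf, 1.0
    for t in thresholds:
        frr = np.mean(genuine > t)     # genuine pairs wrongly rejected
        far = np.mean(impostor <= t)   # impostor pairs wrongly accepted
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return eer


if __name__ == "__main__":
    # Hypothetical distances for four genuine and four impostor pairs.
    face_gen, body_gen = [0.1, 0.2, 0.9, 0.3], [0.2, 0.1, 0.3, 0.2]
    face_imp, body_imp = [0.8, 0.7, 0.9, 0.6], [0.7, 0.9, 0.8, 0.6]
    fused_gen = weighted_sum_fusion(face_gen, body_gen, w_face=0.5)
    fused_imp = weighted_sum_fusion(face_imp, body_imp, w_face=0.5)
    print("face-only EER:", equal_error_rate(face_gen, face_imp))
    print("fused EER:", equal_error_rate(fused_gen, fused_imp))
```

In practice the weight would be chosen on training data, for example by sweeping `w_face` over a grid and keeping the value that gives the lowest EER; that selection procedure is likewise an assumption here, not a detail taken from the abstract.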
Keywords: multimodal human recognition; surveillance environment; CNN; human recognition by face and body

Figure 1

This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).

Share & Cite This Article

MDPI and ACS Style

Koo, J.H.; Cho, S.W.; Baek, N.R.; Kim, M.C.; Park, K.R. CNN-Based Multimodal Human Recognition in Surveillance Environments. Sensors 2018, 18, 3040.


Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

Sensors, EISSN 1424-8220, published by MDPI AG, Basel, Switzerland