Open Access Article

Towards Real-Time Facial Landmark Detection in Depth Data Using Auxiliary Information

Visual Computing Lab, School of Computing, Mathematics and Digital Technology, Manchester Metropolitan University, Chester Street, Manchester M1 5GD, UK
Image Metrics Ltd., Manchester M1 3HZ, UK
Author to whom correspondence should be addressed.
Symmetry 2018, 10(6), 230
Received: 15 May 2018 / Revised: 7 June 2018 / Accepted: 14 June 2018 / Published: 17 June 2018
(This article belongs to the Special Issue Deep Learning for Facial Informatics)
Modern facial motion capture systems employ a two-pronged approach for capturing and rendering facial motion. Visual (2D) data is used for tracking the facial features and predicting facial expression, whereas depth (3D) data is used to build a series of expressions on 3D face models. An issue with modern research approaches is the use of a single data stream that provides little indication of the 3D facial structure. We compare and analyse the performance of Convolutional Neural Networks (CNNs) using visual, depth and merged data to identify facial features in real-time using a depth sensor. First, we review facial landmarking algorithms and their datasets for depth data. We address the limitations of the current datasets by introducing the Kinect One Expression Dataset (KOED). Then, we propose the use of CNNs on single and merged data streams for facial landmark detection. We contribute to existing work by performing a full evaluation of which streams are the most effective for facial landmarking. Furthermore, we improve upon existing work by extending the neural networks to predict 3D landmarks in real-time, with additional observations on the impact of using 2D landmarks as auxiliary information. We evaluate the performance using Mean Square Error (MSE) and Mean Average Error (MAE). We observe that the single data stream predicts accurate facial landmarks on depth data when auxiliary information is used to train the network. The code and dataset used in this paper will be made available.
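To make the two evaluation measures concrete, the following is a minimal sketch of how MSE and MAE might be computed over a set of predicted 3D landmarks. It is not the authors' evaluation code. The function name `landmark_errors`, the `(x, y, z)` tuple layout, and the choice to average over every coordinate of every landmark are assumptions made for illustration, and MAE is taken here to be the mean of the absolute coordinate differences.

```python
from typing import Sequence, Tuple

# Hypothetical layout: one (x, y, z) tuple per facial landmark.
Landmark = Tuple[float, float, float]


def landmark_errors(
    predicted: Sequence[Landmark], target: Sequence[Landmark]
) -> Tuple[float, float]:
    """Return (mse, mae) averaged over every coordinate of every landmark.

    Assumption: errors are pooled over all coordinates, and MAE is the
    mean of the absolute differences.
    """
    if len(predicted) != len(target) or not predicted:
        raise ValueError("predicted and target must be non-empty and equal in length")

    # Flatten to one signed difference per coordinate.
    diffs = [
        p - t
        for pred_point, true_point in zip(predicted, target)
        for p, t in zip(pred_point, true_point)
    ]
    count = len(diffs)
    mse = sum(d * d for d in diffs) / count
    mae = sum(abs(d) for d in diffs) / count
    return mse, mae


if __name__ == "__main__":
    predicted = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
    target = [(0.0, 0.0, 3.0), (1.0, 1.0, 1.0)]
    mse, mae = landmark_errors(predicted, target)
    print(f"MSE={mse:.3f} MAE={mae:.3f}")
```

Because MSE squares each difference, a single badly placed landmark raises it far more than it raises MAE, which is one reason for reporting both measures.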
Keywords: deep learning; RGB; depth; facial landmarking; merging networks

MDPI and ACS Style

Kendrick, C.; Tan, K.; Walker, K.; Yap, M.H. Towards Real-Time Facial Landmark Detection in Depth Data Using Auxiliary Information. Symmetry 2018, 10, 230.
