
Single Image Video Prediction with Auto-Regressive GANs

Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
Information Systems Technology and Design (ISTD), Singapore University of Technology and Design, Singapore 487372, Singapore
Cognitive Systems Lab, Department of Mathematics and Computer Science, University of Bremen, 28359 Bremen, Germany
Department of Experimental Psychology, University College London, 26 Bedford Way, London WC1H 0AP, UK
Department of Computer Science, Goethe University Frankfurt, Robert-Meyer-Str. 11-15, 60325 Frankfurt, Germany
Author to whom correspondence should be addressed.
Academic Editor: Mariusz Szwoch
Sensors 2022, 22(9), 3533
Received: 1 March 2022 / Revised: 29 April 2022 / Accepted: 3 May 2022 / Published: 6 May 2022
(This article belongs to the Special Issue Emotion Recognition Based on Sensors)
In this paper, we introduce an approach for predicting future frames from a single input image. Our method generates an entire video sequence based on the information contained in the input frame. We adopt an autoregressive generation process, i.e., the output of each time step is fed as the input to the next step. Unlike video prediction methods that use "one-shot" generation, our method preserves many more details from the input image while also capturing the critical pixel-level changes between frames. We overcome the problem of generation-quality degradation by introducing a "complementary mask" module in our architecture, and we show that this allows the model to focus only on generating the pixels that need to change while reusing from the previous frame those that should remain static. We empirically validate our method against various video prediction models on the UT Dallas Dataset and show that our approach generates high-quality, realistic video sequences from one static input image. In addition, we validate the robustness of our method by testing a pre-trained model on the unseen ADFES facial expression dataset. We also provide qualitative results of our model on a human action dataset: the Weizmann Action database.
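The autoregressive loop with complementary masking described in the abstract can be sketched as follows. This is a minimal illustration in pure Python (single-channel frames as 2-D lists), not the paper's implementation: in the actual model, `generator` is a GAN; here it is any callable returning a candidate frame and a per-pixel mask in [0, 1], and all names are illustrative.

```python
# Sketch of autoregressive frame generation with a "complementary mask":
# regenerate only the pixels that change, reuse static pixels from the
# previous frame. Hypothetical names; not the authors' code.

def composite(candidate, mask, prev):
    """Per-pixel: next = mask * candidate + (1 - mask) * prev."""
    return [
        [m * c + (1.0 - m) * p for c, m, p in zip(crow, mrow, prow)]
        for crow, mrow, prow in zip(candidate, mask, prev)
    ]

def generate_sequence(generator, first_frame, num_steps):
    """Feed each output frame back in as the next step's input."""
    frames = [first_frame]
    prev = first_frame
    for _ in range(num_steps):
        candidate, mask = generator(prev)
        prev = composite(candidate, mask, prev)  # masked update of prev
        frames.append(prev)
    return frames
```

Because the mask gates the update, pixels with mask value 0 are copied verbatim from the previous frame, which is how the approach avoids the progressive quality degradation typical of purely autoregressive generation.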
Keywords: video prediction; autoregressive GANs; emotion generation
MDPI and ACS Style

Huang, J.; Chia, Y.K.; Yu, S.; Yee, K.; Küster, D.; Krumhuber, E.G.; Herremans, D.; Roig, G. Single Image Video Prediction with Auto-Regressive GANs. Sensors 2022, 22, 3533.