Open Access Article

3D-CNN-Based Fused Feature Maps with LSTM Applied to Action Recognition

Information and Communication Engineering, Beijing Institute of Technology, Beijing 100081, China
Authors to whom correspondence should be addressed.
Future Internet 2019, 11(2), 42;
Received: 20 December 2018 / Revised: 6 February 2019 / Accepted: 6 February 2019 / Published: 13 February 2019
(This article belongs to the Special Issue Innovative Topologies and Algorithms for Neural Networks)


Human activity recognition is an active field of research in computer vision with numerous applications. Recently, deep convolutional networks and recurrent neural networks (RNNs) have received increasing attention in multimedia studies and have yielded state-of-the-art results. In this work, we propose a new framework that combines 3D-CNN and LSTM networks. First, we integrate the discriminative information of a video into a single map, called a ‘motion map’, by using a deep 3-dimensional convolutional network (C3D). A motion map and the next video frame can be integrated into a new motion map; the network is trained by iteratively increasing the length of the training videos, and the final network is then used to generate the motion map of the whole video. Next, a linear weighted fusion scheme fuses the network feature maps into spatio-temporal features. Finally, a long short-term memory (LSTM) encoder-decoder produces the final predictions. The method is simple to implement and retains both discriminative and dynamic information. Improved results on public benchmark datasets demonstrate the effectiveness and practicability of the proposed method.
Keywords: action recognition; fused features; 3D convolution neural network; motion map; long short-term memory
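
The abstract names two mechanisms without giving formulas: a motion map that is updated one frame at a time, and a linear weighted fusion of feature maps. The following is a minimal sketch of both ideas, not the authors' implementation. The function names `fuse_feature_maps` and `build_motion_map`, the normalisation of the weights to sum to 1, and the choice to start the motion map from the first frame are all assumptions made for illustration. In the article the combining step is a trained C3D network; here it is an arbitrary callable supplied by the caller.

```python
import numpy as np


def fuse_feature_maps(feature_maps, weights):
    """Linear weighted fusion: sum_i w_i * F_i.

    Assumption: weights are non-negative and are normalised to sum to 1.
    All feature maps must share the same shape.
    """
    maps = np.stack([np.asarray(m, dtype=np.float64) for m in feature_maps])
    w = np.asarray(weights, dtype=np.float64)
    if w.ndim != 1 or w.shape[0] != maps.shape[0]:
        raise ValueError("need exactly one weight per feature map")
    if np.any(w < 0) or w.sum() == 0:
        raise ValueError("weights must be non-negative and not all zero")
    w = w / w.sum()
    # Contract the weight vector with the stacking axis of the maps.
    return np.tensordot(w, maps, axes=1)


def build_motion_map(frames, combine):
    """Fold a video into one map, one frame at a time.

    Assumption: the map starts as the first frame, then
    M_t = combine(M_{t-1}, frame_t) for every later frame.
    `combine` is a stand-in for the trained network in the article.
    """
    frames = iter(frames)
    motion_map = np.asarray(next(frames), dtype=np.float64)
    for frame in frames:
        motion_map = combine(motion_map, np.asarray(frame, dtype=np.float64))
    return motion_map


if __name__ == "__main__":
    # Hypothetical stand-in for the learned update: a plain average.
    average = lambda m, f: 0.5 * (m + f)
    video = [np.full((4, 4), float(t)) for t in range(3)]
    motion = build_motion_map(video, average)
    fused = fuse_feature_maps([motion, video[-1]], weights=[1.0, 3.0])
    print(fused.shape)
```

Normalising the weights keeps the fused map on the same scale as its inputs, which is convenient when the result is passed to a downstream model such as an LSTM; whether the article normalises its weights is not stated in the abstract.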

Figure 1

This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).


MDPI and ACS Style

Arif, S.; Wang, J.; Ul Hassan, T.; Fei, Z. 3D-CNN-Based Fused Feature Maps with LSTM Applied to Action Recognition. Future Internet 2019, 11, 42.

Future Internet, EISSN 1999-5903, published by MDPI AG, Basel, Switzerland.