Open Access Article
Symmetry 2018, 10(9), 370

Multimedia Data Modelling Using Multidimensional Recurrent Neural Networks

College of Intelligence Science, National University of Defense Technology, Changsha 410073, China
Department of Computer Science, University College London, London WC1E 6BT, UK
Department of Computer Science, Sichuan University, Chengdu 610065, China
Unmanned Systems Research Center, National Innovation Institute of Defense Technology, Beijing 100071, China
Authors to whom correspondence should be addressed.
Received: 19 August 2018 / Revised: 23 August 2018 / Accepted: 27 August 2018 / Published: 1 September 2018
(This article belongs to the Special Issue Information Technology and Its Applications 2018)


Modelling multimedia data such as text, images, or videos usually involves analysing, predicting, or reconstructing them. The recurrent neural network (RNN) is a powerful machine learning approach that models such data recursively. Its variant, the long short-term memory (LSTM), extends the RNN with the ability to retain information over longer time spans. Although the capacity of an LSTM can be increased by widening it or adding layers, doing so usually requires additional parameters and runtime, which can make learning harder. We therefore propose a Tensor LSTM in which the hidden states are tensorised as multidimensional arrays (tensors) and updated through a cross-layer convolution. Because parameters are spatially shared within the tensor, we can efficiently widen the model without extra parameters by increasing the tensorised size; because the deep computations of each time step are absorbed by the temporal computations of the time series, we can implicitly deepen the model with little extra runtime by delaying the output. Our experiments show that the model is well suited to various multimedia data modelling tasks, including text generation, text calculation, image classification, and video prediction.
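
The parameter-sharing idea in the abstract can be illustrated with a small NumPy sketch. It is a simplified illustration under stated assumptions, not the authors' formulation: the names `cross_layer_conv`, `tensor_lstm_step`, `init_params`, and `num_params`, the zero "same" padding, the gate ordering, and the choice to drop the input row after the convolution are all hypothetical. The hidden state is a tensor of shape (P, M), where P is the tensorised size and M the number of channels; one convolution kernel is slid along the P axis, so the parameter count does not depend on P.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def cross_layer_conv(h_cat, kernel, bias):
    """Zero-padded 1D convolution along the tensorised axis (assumed 'same' padding).

    h_cat:  (P + 1, M) projected input stacked on top of the hidden tensor
    kernel: (K, M, C) weights shared by every position along the first axis (K odd)
    bias:   (C,)
    returns (P + 1, C)
    """
    K = kernel.shape[0]
    pad = K // 2
    padded = np.pad(h_cat, ((pad, pad), (0, 0)))
    out = np.empty((h_cat.shape[0], kernel.shape[2]))
    for p in range(h_cat.shape[0]):
        window = padded[p:p + K]  # (K, M) neighbourhood around position p
        out[p] = np.tensordot(window, kernel, axes=([0, 1], [0, 1])) + bias
    return out


def tensor_lstm_step(x, h_prev, c_prev, params):
    """One time step of the sketch: h_prev and c_prev have shape (P, M)."""
    x_proj = x @ params["W_x"] + params["b_x"]          # (M,)
    h_cat = np.vstack([x_proj[None, :], h_prev])        # (P + 1, M)
    a = cross_layer_conv(h_cat, params["kernel"], params["bias"])
    a = a[1:]                                           # assumption: drop the input row -> (P, 4M)
    g, i, f, o = np.split(a, 4, axis=1)                 # assumed gate ordering
    c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)   # standard LSTM memory update
    h = sigmoid(o) * np.tanh(c)
    return h, c


def init_params(input_dim, M, K, rng):
    """Hypothetical initialiser; note that P does not appear anywhere."""
    return {
        "W_x": rng.normal(scale=0.1, size=(input_dim, M)),
        "b_x": np.zeros(M),
        "kernel": rng.normal(scale=0.1, size=(K, M, 4 * M)),
        "bias": np.zeros(4 * M),
    }


def num_params(params):
    return sum(v.size for v in params.values())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = init_params(input_dim=5, M=8, K=3, rng=rng)
    for P in (4, 16):  # widen the hidden tensor while reusing the same parameters
        h, c = np.zeros((P, 8)), np.zeros((P, 8))
        h, c = tensor_lstm_step(rng.normal(size=5), h, c, params)
        print(P, h.shape, num_params(params))
```

In this sketch a kernel of size 3 moves information at most one position along the tensor per time step, so an output read from the far end of the tensor reflects a given input only several steps later. That is one way to read the abstract's remark about deepening the model by delaying the output.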
Keywords: multimedia data modelling; recurrent neural network (RNN); long short-term memory (LSTM); tensor; convolution; deep learning


This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).


MDPI and ACS Style

He, Z.; Gao, S.; Xiao, L.; Liu, D.; He, H. Multimedia Data Modelling Using Multidimensional Recurrent Neural Networks. Symmetry 2018, 10, 370.


Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

Symmetry, EISSN 2073-8994, published by MDPI AG, Basel, Switzerland