
Action Recognition and Tracking Using Deep Learning

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Physical Sensors".

Deadline for manuscript submissions: closed (30 October 2022) | Viewed by 8321

Special Issue Editors


Prof. Dr. Cheng-Hung Lin
Guest Editor
Dept. of Electrical Engineering, National Taiwan Normal University, Taipei, Taiwan
Interests: parallel computation; GPU programming; machine learning; Internet of Things

Prof. Dr. Chen-Chien James Hsu
Guest Editor
Dept. of Electrical Engineering, National Taiwan Normal University, Taipei, Taiwan
Interests: navigation of mobile robots; evolutionary algorithms (EAs) and their applications; image-based distance measurement and localization; digital (sampled-data) control systems

Dr. Ying-Hui Lai
Guest Editor
Department of Biomedical Engineering, National Yangming University, Taipei, Taiwan
Interests: hearing and speech sciences; biomedical signal processing; artificial intelligence; assistive devices for hearing and communication

Special Issue Information

Dear Colleagues,

In recent years, deep-learning-based action recognition and tracking has been widely used in many fields, such as security surveillance, healthcare, sports science, and somatosensory entertainment. Action recognition and tracking can be performed using low-level appearance features, such as color, optical flow, and spatiotemporal gradients, or using skeleton information derived from the human pose. Several challenges remain. Although the recognition task has been studied for a long time and has seen many breakthroughs, most current methods still rely on large numbers of labeled samples, and achieving high accuracy without extensive labeled data remains difficult. In addition, for action segmentation and temporal action localization, annotation itself is a problem: it is hard to label the exact time points at which an action starts and ends, and predicting the time interval during which an action occurs is correspondingly challenging. Weakly supervised and unsupervised learning approaches have been proposed for this problem, but their accuracy still needs to be improved. An even more difficult task is spatiotemporal action detection (or spatiotemporal action localization), whose goal is to locate the human body and identify the action being performed, which amounts to carrying out recognition and tracking at the same time. Furthermore, tasks such as DeepFake detection, video generation, and video noise reduction urgently need further development, and deep video compression is also a very important application.
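As a concrete illustration of the low-level appearance features mentioned above, the following minimal Python sketch computes dense optical flow between two consecutive frames with OpenCV's Farneback method. It is illustrative only and not part of the Special Issue; synthetic frames are used so the snippet is self-contained.

```python
import cv2
import numpy as np

# Two synthetic grayscale frames with a small horizontal shift between them,
# standing in for consecutive frames of a video clip.
rng = np.random.default_rng(0)
prev = rng.integers(0, 256, size=(240, 320), dtype=np.uint8)
curr = np.roll(prev, shift=2, axis=1)

# Farneback dense optical flow: one (dx, dy) displacement vector per pixel.
# Positional arguments: prev, next, flow, pyr_scale, levels, winsize,
# iterations, poly_n, poly_sigma, flags.
flow = cv2.calcOpticalFlowFarneback(prev, curr, None, 0.5, 3, 15, 3, 5, 1.2, 0)

# Magnitude/orientation maps like these are a common motion input to
# two-stream action recognition networks.
mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
print(flow.shape, float(mag.mean()))
```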

This Special Issue addresses innovative developments, technologies, and challenges related to action recognition and tracking using deep learning. It seeks the latest findings from research and ongoing projects. Review articles that provide readers with an overview of current research trends and solutions are also welcome. Potential topics include, but are not limited to, the following:

  • Action recognition and tracking models;
  • Lightweight models for action recognition and tracking;
  • RGB-based action recognition model;
  • Skeleton-based action recognition model;
  • Spatiotemporal action detection task;
  • Action segmentation task (weakly supervised or unsupervised);
  • DeepFake detection;
  • Video generation task;
  • Video noise reduction;
  • Deep video compression;
  • Smart surveillance.

Prof. Dr. Cheng-Hung Lin
Prof. Dr. Chen-Chien James Hsu
Dr. Ying-Hui Lai
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (3 papers)


Research

15 pages, 8308 KiB  
Article
Two-Level Attention Module Based on Spurious-3D Residual Networks for Human Action Recognition
by Bo Chen, Fangzhou Meng, Hongying Tang and Guanjun Tong
Sensors 2023, 23(3), 1707; https://doi.org/10.3390/s23031707 - 3 Feb 2023
Cited by 16 | Viewed by 2587
Abstract
In recent years, deep learning techniques have excelled in video action recognition. However, currently commonly used video action recognition models minimize the importance of different video frames and spatial regions within some specific frames when performing action recognition, which makes it difficult for the models to adequately extract spatiotemporal features from the video data. In this paper, an action recognition method based on improved residual convolutional neural networks (CNNs) for video frames and spatial attention modules is proposed to address this problem. The network can guide what and where to emphasize or suppress with essentially little computational cost using the video frame attention module and the spatial attention module. It also employs a two-level attention module to emphasize feature information along the temporal and spatial dimensions, respectively, highlighting the more important frames in the overall video sequence and the more important spatial regions in some specific frames. Specifically, we create the video frame and spatial attention map by successively adding the video frame attention module and the spatial attention module to aggregate the spatial and temporal dimensions of the intermediate feature maps of the CNNs to obtain different feature descriptors, thus directing the network to focus more on important video frames and more contributing spatial regions. The experimental results further show that the network performs well on the UCF-101 and HMDB-51 datasets. Full article
(This article belongs to the Special Issue Action Recognition and Tracking Using Deep Learning)
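For readers unfamiliar with this kind of architecture, the following minimal PyTorch sketch illustrates the general two-level attention idea described in the abstract: a frame-level (temporal) attention map followed by a spatial attention map applied to intermediate CNN feature maps. The module names, tensor shapes, and reduction ratio are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class FrameAttention(nn.Module):
    """Weights each of the T frames of a clip-level feature map."""
    def __init__(self, channels: int):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(channels, channels // 8),
                                nn.ReLU(inplace=True),
                                nn.Linear(channels // 8, 1))

    def forward(self, x):                       # x: (B, T, C, H, W)
        ctx = x.mean(dim=(3, 4))                # (B, T, C) global-average pool
        w = torch.softmax(self.fc(ctx), dim=1)  # (B, T, 1) frame weights
        return x * w.unsqueeze(-1).unsqueeze(-1)

class SpatialAttention(nn.Module):
    """Highlights informative spatial regions within each frame."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):                       # x: (B, T, C, H, W)
        b, t, c, h, w = x.shape
        x2 = x.reshape(b * t, c, h, w)
        desc = torch.cat([x2.mean(1, keepdim=True),
                          x2.amax(1, keepdim=True)], dim=1)  # (B*T, 2, H, W)
        attn = torch.sigmoid(self.conv(desc))                # (B*T, 1, H, W)
        return (x2 * attn).reshape(b, t, c, h, w)

# Toy usage: 2 clips, 8 frames, 64-channel intermediate feature maps.
feats = torch.randn(2, 8, 64, 14, 14)
out = SpatialAttention()(FrameAttention(64)(feats))
print(out.shape)  # torch.Size([2, 8, 64, 14, 14])
```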

21 pages, 1434 KiB  
Article
Fast Temporal Graph Convolutional Model for Skeleton-Based Action Recognition
by Mihai Nan and Adina Magda Florea
Sensors 2022, 22(19), 7117; https://doi.org/10.3390/s22197117 - 20 Sep 2022
Cited by 4 | Viewed by 1532
Abstract
Human action recognition has a wide range of applications, including Ambient Intelligence systems and user assistance. Starting from the recognized actions performed by the user, a better human–computer interaction can be achieved, and improved assistance can be provided by social robots in real-time scenarios. In this context, the performance of the prediction system is a key aspect. The purpose of this paper is to introduce a neural network approach based on various types of convolutional layers that can achieve a good performance in recognizing actions but with a high inference speed. The experimental results show that our solution, based on a combination of graph convolutional networks (GCN) and temporal convolutional networks (TCN), is a suitable approach that reaches the proposed goal. In addition to the neural network model, we design a pipeline that contains two stages for obtaining relevant geometric features, data augmentation and data preprocessing, also contributing to an increased performance. Full article
(This article belongs to the Special Issue Action Recognition and Tracking Using Deep Learning)
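The following minimal PyTorch sketch illustrates the general GCN + TCN combination mentioned in the abstract: a graph convolution that aggregates features over skeleton joints, followed by a temporal convolution along the frame axis. The adjacency matrix, joint count, and layer sizes are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class GcnTcnBlock(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, adj: torch.Tensor):
        super().__init__()
        # Row-normalised adjacency (with self-loops) of the skeleton graph.
        a = adj + torch.eye(adj.size(0))
        self.register_buffer("adj", a / a.sum(dim=1, keepdim=True))
        self.gcn = nn.Conv2d(in_ch, out_ch, kernel_size=1)        # joint-wise channel mixing
        self.tcn = nn.Conv2d(out_ch, out_ch, kernel_size=(9, 1),  # temporal convolution
                             padding=(4, 0))
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):                 # x: (B, C, T, V) with V skeleton joints
        x = self.gcn(x)                   # mix channels per joint
        x = torch.einsum("bctv,vw->bctw", x, self.adj)  # aggregate over neighbouring joints
        return self.relu(self.tcn(x))     # convolve along the time axis

# Toy usage: 4 clips, 3D joint coordinates, 50 frames, 5 joints in a chain.
adj = torch.tensor([[0, 1, 0, 0, 0],      # a tiny hypothetical skeleton graph
                    [1, 0, 1, 0, 0],
                    [0, 1, 0, 1, 0],
                    [0, 0, 1, 0, 1],
                    [0, 0, 0, 1, 0]], dtype=torch.float32)
block = GcnTcnBlock(3, 16, adj)
out = block(torch.randn(4, 3, 50, 5))
print(out.shape)  # torch.Size([4, 16, 50, 5])
```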

15 pages, 4555 KiB  
Article
Vision-Based Learning from Demonstration System for Robot Arms
by Pin-Jui Hwang, Chen-Chien Hsu, Po-Yung Chou, Wei-Yen Wang and Cheng-Hung Lin
Sensors 2022, 22(7), 2678; https://doi.org/10.3390/s22072678 - 31 Mar 2022
Cited by 7 | Viewed by 3432
Abstract
Robotic arms have been widely used in various industries and have the advantages of cost savings, high productivity, and efficiency. Although robotic arms are good at increasing efficiency in repetitive tasks, they still need to be re-programmed and optimized when new tasks are to be deployed, resulting in detrimental downtime and high cost. It is therefore the objective of this paper to present a learning from demonstration (LfD) robotic system to provide a more intuitive way for robots to efficiently perform tasks through learning from human demonstration on the basis of two major components: understanding through human demonstration and reproduction by robot arm. To understand human demonstration, we propose a vision-based spatial-temporal action detection method to detect human actions that focuses on meticulous hand movement in real time to establish an action base. An object trajectory inductive method is then proposed to obtain a key path for objects manipulated by the human through multiple demonstrations. In robot reproduction, we integrate the sequence of actions in the action base and the key path derived by the object trajectory inductive method for motion planning to reproduce the task demonstrated by the human user. Because of the capability of learning from demonstration, the robot can reproduce the tasks that the human demonstrated with the help of vision sensors in unseen contexts. Full article
(This article belongs to the Special Issue Action Recognition and Tracking Using Deep Learning)
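As a purely illustrative sketch of the reproduction step described in the abstract, the following toy Python code pairs a recognized action sequence (the "action base") with a key path of object waypoints to form an ordered plan. All names, data structures, and coordinates are hypothetical and do not reflect the authors' system.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class PlanStep:
    action: str                          # e.g. "grasp", "move", "release"
    target: Tuple[float, float, float]   # Cartesian waypoint in metres

def build_plan(action_base: List[str],
               key_path: List[Tuple[float, float, float]]) -> List[PlanStep]:
    """Pair each recognised action with the corresponding key-path waypoint."""
    if len(action_base) != len(key_path):
        raise ValueError("action sequence and key path must align")
    return [PlanStep(a, p) for a, p in zip(action_base, key_path)]

# Toy usage with a three-step pick-and-place demonstration.
plan = build_plan(["grasp", "move", "release"],
                  [(0.30, 0.10, 0.05), (0.45, -0.05, 0.20), (0.60, 0.00, 0.05)])
for step in plan:
    print(step.action, "->", step.target)
```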
