Supervised Learning Applications of Action Recognition and Action Prediction
A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Artificial Intelligence".
Deadline for manuscript submissions: 15 November 2026
Special Issue Information
Dear Colleagues,
Human action recognition and prediction are fundamental research topics in computer vision and intelligent systems, and their applications in video surveillance, healthcare monitoring, human–computer interaction, robotics, sports analytics, and autonomous driving are becoming increasingly relevant. Over the past decade, deep learning methods, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), graph convolutional networks (GCNs), and Transformers, have significantly advanced the state of the art, enabling more accurate and robust recognition and prediction of human actions from video, skeleton, and multimodal data. More recently, the rapid development of large language models (LLMs) and multimodal large language models (MLLMs) has opened new directions in video understanding, offering powerful capabilities in visual reasoning, open-vocabulary action recognition, video captioning, and instruction following for action analysis.
This Special Issue aims to collect cutting-edge research on both supervised learning approaches and emerging LLM/MLLM-driven methodologies for action recognition and prediction. Particular emphasis is placed on how large-scale foundation models are transforming video understanding, spanning novel model architectures, efficient training strategies, zero-shot and few-shot action recognition, cross-modal knowledge transfer, and LLM-assisted temporal reasoning. Both theoretical contributions and practical application studies that push the boundaries of current methods are welcome, including, but not limited to, work on the integration of multimodal large language models for enhanced action understanding.
In this Special Issue, original research articles and reviews are welcome. Research areas may include (but are not limited to) the following:
- Supervised deep learning models for video-based action recognition and prediction;
- Skeleton-based and pose-based action recognition using deep learning and graph neural networks;
- Temporal action detection, localization, and segmentation;
- Early action prediction and future activity anticipation;
- Multimodal fusion strategies for action understanding (RGB, depth, skeleton, audio, and text);
- Attention mechanisms and Transformer architectures for action analysis;
- Large language models (LLMs) and multimodal large language models (MLLMs) for video understanding;
- Zero-shot, few-shot, and open-vocabulary action recognition leveraging foundation models;
- Video captioning, video question answering, and video-language alignment for action analysis;
- LLM-assisted temporal reasoning and action chain prediction;
- Knowledge distillation and efficient deployment of large models for action recognition;
- Action recognition in complex, real-world, and domain-specific scenarios;
- Applications in autonomous driving, healthcare, sports analytics, smart environments, and human–robot interaction.
Dr. Yun Tie
Guest Editor
Manuscript Submission Information
Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.
Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.
Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.
Keywords
- action recognition
- action prediction
- supervised learning
- deep learning
- video understanding
- skeleton-based recognition
- temporal action detection
- multimodal fusion
- large language models (LLMs)
- multimodal large language models (MLLMs)
- foundation models
- zero-shot action recognition
- video-language alignment
Benefits of Publishing in a Special Issue
- Ease of navigation: Grouping papers by topic helps scholars navigate broad-scope journals more efficiently.
- Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
- Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
- External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
- Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.
Further information on MDPI's Special Issue policies is available on the MDPI website.
