Topic Editors
Deep Visual Recognition: Methods, and Applications
Topic Information
Dear Colleagues,
Deep visual recognition focuses on enabling machines to automatically perceive and understand visual information from images and videos. Powered by deep learning, particularly Convolutional Neural Networks (CNNs) and more recently Vision Transformers (ViTs), the field has advanced significantly beyond traditional computer vision approaches that relied on hand-crafted features. By learning hierarchical and semantic representations directly from data, deep models achieve robust performance on complex visual tasks, evolving from pattern recognition toward knowledge-driven and explainable intelligence that transforms visual data into interpretable, actionable insights for trustworthy decision-making.
Core methods in deep visual recognition address fundamental tasks such as image classification, object detection, semantic and instance segmentation, as well as face and human action recognition. Large-scale datasets (e.g., ImageNet, COCO) and high-performance computing have driven rapid improvements, while recent learning strategies—including self-supervised learning, transfer learning, and few-shot learning—have reduced dependence on extensive labeled data. Multimodal and foundation models further enhance generalization across diverse visual domains.
Deep visual recognition has become a key enabling technology in real-world applications such as autonomous driving, medical image analysis, intelligent surveillance, robotics, and industrial automation. Despite its success, challenges remain in handling domain shifts, occlusions, lighting variations, and real-time constraints. Future research increasingly emphasizes robustness, interpretability, and computational efficiency, positioning deep visual recognition as a central component of intelligent perception systems.
Prof. Dr. Min Young Kim
Dr. Francisco Gomez-Donoso
Topic Editors
Keywords
- deep visual recognition
- convolutional neural networks (CNNs)
- vision transformers (ViTs)
- object detection
- semantic segmentation
- self-supervised learning
- multimodal learning
- autonomous systems
Participating Journals
| Journal Name | Impact Factor | CiteScore | Launched Year | First Decision (median) | APC |
|---|---|---|---|---|---|
| AI | 5.0 | 6.9 | 2020 | 19.2 Days | CHF 1800 |
| Electronics | 2.6 | 6.1 | 2012 | 16.4 Days | CHF 2400 |
| Machine Learning and Knowledge Extraction | 6.0 | 9.9 | 2019 | 27 Days | CHF 1800 |
| Robotics | 3.3 | 7.7 | 2012 | 23.7 Days | CHF 1800 |
| Sensors | 3.5 | 8.2 | 2001 | 17.8 Days | CHF 2600 |
Preprints.org is a multidisciplinary platform offering a preprint service designed to facilitate the early sharing of your research. It supports and empowers your research journey from the very beginning.
MDPI Topics is collaborating with Preprints.org and has established a direct connection between MDPI journals and the platform. Authors are encouraged to take advantage of this opportunity by posting their preprints at Preprints.org prior to publication:
- Share your research immediately: Disseminate your ideas prior to publication and establish priority for your work.
- Safeguard your intellectual contribution: Protect your ideas with a time-stamped preprint that serves as proof of your research timeline.
- Boost visibility and impact: Increase the reach and influence of your research by making it accessible to a global audience.
- Gain early feedback: Receive valuable input and insights from peers before submitting to a journal.
- Ensure broad indexing: Preprints are indexed by Web of Science (Preprint Citation Index), Google Scholar, Crossref, SHARE, PrePubMed, Scilit, and Europe PMC.