Topic Editors

School of Electronics Engineering, IT College, Kyungpook National University, Daegu 41566, Republic of Korea
Department of Computer Science, University of Alicante, 03690 Alicante, Spain

Deep Visual Recognition: Methods, and Applications

Abstract submission deadline
30 August 2026
Manuscript submission deadline
30 October 2026

Topic Information

Dear Colleagues,

Deep Visual Recognition focuses on enabling machines to automatically perceive and understand visual information from images and videos. Powered by deep learning, particularly Convolutional Neural Networks (CNNs) and, more recently, Vision Transformers (ViTs), the field has advanced well beyond traditional computer vision approaches that relied on hand-crafted features. By learning hierarchical and semantic representations directly from data, deep models achieve robust performance on complex visual recognition tasks, evolving from pattern recognition toward knowledge-driven and explainable intelligence that transforms visual data into interpretable, actionable insights for trustworthy decision-making.
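The hierarchical feature learning mentioned above starts with local feature extractors. As a toy illustration (not a framework implementation), the sketch below implements a single 2D convolution in pure Python, the building block that CNN layers stack to form hierarchical representations; the image, kernel, and function names are all illustrative.

```python
# Toy illustration of the local feature extraction performed by one CNN layer.
# Stacking many such layers (plus nonlinearities) yields the hierarchical,
# semantic representations described above.

def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation of a grayscale image with a kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            out[i][j] = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
    return out

# A vertical-edge kernel responds strongly where intensity changes left to right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
edge_kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]
response = conv2d(image, edge_kernel)  # strong response at the 0->1 boundary
```

In a trained network the kernels are learned from data rather than hand-crafted, which is precisely what separates deep visual recognition from the feature-engineering approaches it superseded.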

Core methods in deep visual recognition address fundamental tasks such as image classification, object detection, semantic and instance segmentation, as well as face and human action recognition. Large-scale datasets (e.g., ImageNet, COCO) and high-performance computing have driven rapid improvements, while recent learning strategies—including self-supervised learning, transfer learning, and few-shot learning—have reduced dependence on extensive labeled data. Multimodal and foundation models further enhance generalization across diverse visual domains.
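Among the label-efficient strategies listed above, few-shot learning can be illustrated with a prototypical-style nearest-prototype classifier. The sketch below is a minimal pure-Python toy: the feature vectors stand in for embeddings a pretrained backbone would produce, and all names are illustrative assumptions, not any specific published method.

```python
# Toy sketch of few-shot classification: average the few labeled examples per
# class into a prototype, then assign a query to the nearest prototype.

def prototype(vectors):
    """Element-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def classify(query, support):
    """support maps label -> list of feature vectors; returns nearest label."""
    protos = {label: prototype(vecs) for label, vecs in support.items()}
    return min(protos, key=lambda label: sq_dist(query, protos[label]))

# Two labeled examples ("shots") per class, as hypothetical 2-D embeddings.
support = {
    "cat": [[1.0, 0.1], [0.9, 0.2]],
    "dog": [[0.1, 1.0], [0.2, 0.9]],
}
label = classify([0.8, 0.3], support)  # query lies closer to the cat prototype
```

The same idea underlies transfer learning in practice: a backbone pretrained on a large dataset supplies the embeddings, so only the lightweight comparison step needs the scarce labeled data.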

Deep visual recognition has become a key enabling technology in real-world applications such as autonomous driving, medical image analysis, intelligent surveillance, robotics, and industrial automation. Despite its success, challenges remain in handling domain shifts, occlusions, lighting variations, and real-time constraints. Future research increasingly emphasizes robustness, interpretability, and computational efficiency, positioning deep visual recognition as a central component of intelligent perception systems.

Prof. Dr. Min Young Kim
Dr. Francisco Gomez-Donoso
Topic Editors

Keywords

  • deep visual recognition
  • convolutional neural networks (CNNs)
  • vision transformers (ViTs)
  • object detection
  • semantic segmentation
  • self-supervised learning
  • multimodal learning
  • autonomous systems

Participating Journals

Journal                                    Impact Factor  CiteScore  Launched  First Decision (median)  APC
AI                                         5.0            6.9        2020      19.2 days                CHF 1800
Electronics                                2.6            6.1        2012      16.4 days                CHF 2400
Machine Learning and Knowledge Extraction  6.0            9.9        2019      27 days                  CHF 1800
Robotics                                   3.3            7.7        2012      23.7 days                CHF 1800
Sensors                                    3.5            8.2        2001      17.8 days                CHF 2600

Preprints.org is a multidisciplinary platform offering a preprint service designed to facilitate the early sharing of your research. It supports and empowers your research journey from the very beginning.

MDPI Topics is collaborating with Preprints.org and has established a direct connection between MDPI journals and the platform. Authors are encouraged to take advantage of this opportunity by posting their preprints at Preprints.org prior to publication:

  1. Share your research immediately: Disseminate your ideas prior to publication and establish priority for your work.
  2. Safeguard your intellectual contribution: Protect your ideas with a time-stamped preprint that serves as proof of your research timeline.
  3. Boost visibility and impact: Increase the reach and influence of your research by making it accessible to a global audience.
  4. Gain early feedback: Receive valuable input and insights from peers before submitting to a journal.
  5. Ensure broad indexing: Preprints are indexed in Web of Science (Preprint Citation Index), Google Scholar, Crossref, SHARE, PrePubMed, Scilit, and Europe PMC.

Published Papers (1 paper)

34 pages, 36077 KB  
Article
Modular Multi-Attribute Vehicle Analysis by Color, License Plate, Make and Sub-Model Using YOLO and OCR: A Benchmark Across YOLO Versions
by Cristian Japhet Islas-Yañez, Viridiana Hernández-Herrera and Moisés Márquez-Olivera
Sensors 2026, 26(9), 2785; https://doi.org/10.3390/s26092785 - 29 Apr 2026
Abstract
We present a modular multi-attribute vehicle analysis pipeline that integrates YOLO-based models and an OCR engine into a single workflow. The system detects vehicles, classifies color, recognizes make and sub-model, detects license plates, and extracts plate characters to generate a structured vehicle record. Vehicle detection is reported with standard metrics (precision, recall, and mAP@0.5), while license plate detection is reported at IoU = 0.3 to reflect the small-object nature of plates and downstream OCR usability. Among the evaluated versions, YOLOv8 provides the most balanced overall performance across modules, while maintaining real-time-equivalent throughput of approximately 18–22 FPS for the full pipeline on recorded traffic videos, depending on scene complexity. We emphasize module-level evaluation and runtime benchmarking; instance-level end-to-end identification across unique vehicles is defined as future work once track-based ground truth becomes available.
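The abstract scores detections at different IoU thresholds (0.5 for vehicles, 0.3 for small license plates). The sketch below shows the standard bounding-box IoU computation behind such thresholds; it is a generic illustration, not the paper's implementation, and the box coordinates are made up for the example.

```python
# Standard intersection-over-union for axis-aligned boxes (x1, y1, x2, y2),
# the quantity compared against a threshold when deciding whether a
# predicted box counts as a correct detection.

def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def is_hit(pred, gt, threshold):
    """True if a predicted box counts as a detection at the IoU threshold."""
    return iou(pred, gt) >= threshold

gt = (0, 0, 10, 10)
pred = (2, 0, 12, 10)  # same size, shifted right by 2: IoU = 80/120 ≈ 0.667
```

A shifted plate prediction like this would count as a hit at IoU = 0.3 but not at a stricter 0.7, which is why a relaxed threshold is reasonable for small objects whose boxes only need to be tight enough for downstream OCR.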
(This article belongs to the Topic Deep Visual Recognition: Methods, and Applications)
