Advances in Computer Vision: Emerging Trends and Applications

A special issue of Algorithms (ISSN 1999-4893). This special issue belongs to the section "Algorithms for Multidisciplinary Applications".

Deadline for manuscript submissions: 30 April 2026

Special Issue Editors

Guest Editor
Department of Computer Science, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul, Republic of Korea
Interests: energy informatics; computer vision; virtual and augmented reality; bioinformatics; IoT; IIoT; machine learning; deep learning

Guest Editor
Advanced Research and Innovation Center (ARIC), Khalifa University of Science and Technology, Abu Dhabi P.O. Box 127788, United Arab Emirates
Interests: person re-identification; bioinformatics; energy informatics; video analytics; computer vision; deep learning; machine learning; activity recognition; video summarization

Special Issue Information

Dear Colleagues,

The rapid growth of big data has transformed the landscape of computer vision, particularly in surveillance applications. With vast amounts of visual data generated daily, advanced computer vision techniques are essential for extracting actionable insights and enhancing real-time decision making. This Special Issue focuses on applications such as intelligent surveillance systems, behavior analysis, disaster management, anomaly detection, activity recognition, person re-identification, and crowd management. We encourage submissions that tackle the challenges of processing and analyzing large-scale visual datasets while addressing the ethical implications of surveillance technologies. This Special Issue aims to advance the understanding and use of computer vision for big data by fostering dialogue on innovative methodologies and practical applications.

The topics of interest include, but are not limited to, the following:

  • Intelligent surveillance systems;
  • Behavior analysis and activity recognition;
  • Disaster management and response;
  • Anomaly detection in video streams;
  • Human behavior analysis;
  • Crowd management and safety;
  • Person re-identification in surveillance applications;
  • Object detection and segmentation.

Dr. Noman Khan
Dr. Samee Ullah Khan
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once registered, the submission form can be accessed on the same website. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Algorithms is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and written in clear English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • computer vision
  • activity recognition
  • intelligent surveillance systems
  • behavior analysis
  • person re-identification
  • object detection

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies is available on the MDPI website.

Published Papers (4 papers)

Research

23 pages, 3875 KB  
Article
Edge AI for Industrial Visual Inspection: YOLOv8-Based Visual Conformity Detection Using Raspberry Pi
by Marcelo T. Okano, William Aparecido Celestino Lopes, Sergio Miele Ruggero, Oduvaldo Vendrametto and João Carlos Lopes Fernandes
Algorithms 2025, 18(8), 510; https://doi.org/10.3390/a18080510 - 14 Aug 2025
Abstract
This paper presents a lightweight and cost-effective computer vision solution for automated industrial inspection using You Only Look Once (YOLO) v8 models deployed on embedded systems. The YOLOv8 Nano model, trained for 200 epochs, achieved a precision of 0.932, an mAP@0.5 of 0.938, and an F1-score of 0.914, with an average inference time of ~470 ms on a Raspberry Pi 500, confirming its feasibility for real-time edge applications. The proposed system aims to replace the physical jigs used for dimensional verification of extruded polyamide tubes in the automotive sector. The YOLOv8 Nano and YOLOv8 Small models were trained on a Graphics Processing Unit (GPU) workstation and subsequently tested on a Central Processing Unit (CPU)-only Raspberry Pi 500 to evaluate their performance in constrained environments. The experimental results show that the Small model achieved higher accuracy (a precision of 0.951 and an mAP@0.5 of 0.941) but required a significantly longer inference time (~1315 ms), while the Nano model achieved faster execution (~470 ms) with stable metrics (a precision of 0.932 and an mAP@0.5 of 0.938), making it more suitable for real-time applications. The system was validated on real images captured in an industrial setting, confirming its feasibility for edge artificial intelligence (AI) scenarios. These findings reinforce the feasibility of embedded AI in smart manufacturing, demonstrating that compact models can deliver reliable performance without requiring high-end computing infrastructure.
(This article belongs to the Special Issue Advances in Computer Vision: Emerging Trends and Applications)
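
The reported setup, a YOLOv8 Nano model running CPU-only inference, can be reproduced in outline with the Ultralytics Python API. The sketch below is illustrative only: the weights file and image path are hypothetical stand-ins for the authors' fine-tuned model and inspection images, and timings will vary by device.

```python
# Minimal sketch: timing YOLOv8 Nano inference on a CPU-only device.
# "yolov8n.pt" and "tube_sample.jpg" are placeholder names, not files
# from the paper.
import time

from ultralytics import YOLO  # pip install ultralytics

model = YOLO("yolov8n.pt")  # Nano variant; the authors fine-tuned theirs for 200 epochs

start = time.perf_counter()
results = model.predict("tube_sample.jpg", device="cpu", imgsz=640)
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"Inference time: {elapsed_ms:.0f} ms")  # ~470 ms is reported on a Raspberry Pi 500

for box in results[0].boxes:  # detected objects, if any
    print(int(box.cls), float(box.conf), box.xyxy.tolist())
```

Swapping in the Small weights ("yolov8s.pt") would illustrate, in outline, the accuracy-versus-latency trade-off the authors report.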

20 pages, 25324 KB  
Article
DGSS-YOLOv8s: A Real-Time Model for Small and Complex Object Detection in Autonomous Vehicles
by Siqiang Cheng, Lingshan Chen and Kun Yang
Algorithms 2025, 18(6), 358; https://doi.org/10.3390/a18060358 - 11 Jun 2025
Cited by 1
Abstract
Object detection in complex road scenes is vital for autonomous driving but faces challenges such as object occlusion, small target sizes, and irregularly shaped targets. To address these issues, this paper introduces DGSS-YOLOv8s, a model designed to enhance detection accuracy while sustaining high-FPS performance within the You Only Look Once version 8 small (YOLOv8s) framework. The key innovation lies in the synergistic integration of several architectural enhancements: the DCNv3_LKA_C2f module, leveraging Deformable Convolution v3 (DCNv3) and Large Kernel Attention (LKA) to better capture complex object shapes; an Optimized Feature Pyramid Network structure (Optimized-GFPN) for improved multi-scale feature fusion; the Detect_SA module, incorporating spatial Self-Attention (SA) at the detection head for broader context awareness; and an Inner-Shape Intersection over Union (IoU) loss function to improve bounding box regression accuracy. These components collectively target the aforementioned challenges in road environments. Evaluations on the Berkeley DeepDrive 100K (BDD100K) and Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) datasets demonstrate the model’s effectiveness. Compared to the baseline YOLOv8s, DGSS-YOLOv8s achieves mean Average Precision (mAP)@50 improvements of 2.4% (BDD100K) and 4.6% (KITTI). Significant gains were observed for challenging categories, notably 87.3% mAP@50 for cyclists on KITTI, and small object detection (AP-small) improved by up to 9.7% on KITTI. Crucially, DGSS-YOLOv8s achieved processing speeds suitable for autonomous driving, operating at 103.1 FPS (BDD100K) and 102.5 FPS (KITTI) on an NVIDIA GeForce RTX 4090 GPU. These results highlight that DGSS-YOLOv8s effectively balances enhanced detection accuracy for complex scenarios with high processing speed, demonstrating its potential for demanding autonomous driving applications.
(This article belongs to the Special Issue Advances in Computer Vision: Emerging Trends and Applications)
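
The Inner-Shape IoU loss is not specified in the abstract, but the general Inner-IoU idea, computing the overlap on auxiliary boxes rescaled around each box's centre, can be sketched as follows. This is a simplified illustration under that assumption; the paper's exact formulation may differ.

```python
# Simplified Inner-IoU-style overlap: IoU computed on boxes shrunk
# (ratio < 1) or grown (ratio > 1) around their centres. Illustrative
# only; not the paper's exact Inner-Shape IoU.
import torch

def inner_iou(box1: torch.Tensor, box2: torch.Tensor, ratio: float = 0.8) -> torch.Tensor:
    """box1, box2: (..., 4) tensors in (x1, y1, x2, y2) format."""
    def rescale(box):
        cx = (box[..., 0] + box[..., 2]) / 2
        cy = (box[..., 1] + box[..., 3]) / 2
        w = (box[..., 2] - box[..., 0]) * ratio
        h = (box[..., 3] - box[..., 1]) * ratio
        return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2

    x1a, y1a, x2a, y2a = rescale(box1)
    x1b, y1b, x2b, y2b = rescale(box2)

    inter_w = (torch.minimum(x2a, x2b) - torch.maximum(x1a, x1b)).clamp(min=0)
    inter_h = (torch.minimum(y2a, y2b) - torch.maximum(y1a, y1b)).clamp(min=0)
    inter = inter_w * inter_h
    union = (x2a - x1a) * (y2a - y1a) + (x2b - x1b) * (y2b - y1b) - inter
    return inter / (union + 1e-7)

pred = torch.tensor([[10.0, 10.0, 50.0, 50.0]])
target = torch.tensor([[12.0, 8.0, 48.0, 52.0]])
loss = 1.0 - inner_iou(pred, target)  # a regression loss in [0, 1]
print(loss)
```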

17 pages, 6914 KB  
Article
YOLO-TC: An Optimized Detection Model for Monitoring Safety-Critical Small Objects in Tower Crane Operations
by Dong Ding, Zhengrong Deng and Rui Yang
Algorithms 2025, 18(1), 27; https://doi.org/10.3390/a18010027 - 6 Jan 2025
Cited by 3
Abstract
Ensuring operational safety within high-risk environments, such as construction sites, is paramount, especially for tower crane operations where distractions can lead to severe accidents. Despite existing behavioral monitoring approaches, the task of identifying small yet hazardous objects like mobile phones and cigarettes in real time remains a significant challenge in ensuring operator compliance and site safety. Traditional object detection models often fall short in crane operator cabins due to complex lighting conditions, cluttered backgrounds, and the small physical scale of target objects. To address these challenges, we introduce YOLO-TC, a refined object detection model tailored specifically for tower crane monitoring applications. Built upon the robust YOLOv7 architecture, our model integrates a novel channel–spatial attention mechanism, ECA-CBAM, into the backbone network, enhancing feature extraction without an increase in parameter count. Additionally, we propose the HA-PANet architecture to achieve progressive feature fusion, addressing scale disparities and prioritizing small object detection while reducing noise from unrelated objects. To improve bounding box regression, the MPDIoU Loss function is employed, resulting in superior accuracy for small, critical objects in dense environments. The experimental results on both the PASCAL VOC benchmark and a custom dataset demonstrate that YOLO-TC outperforms baseline models, showcasing its robustness in identifying high-risk objects under challenging conditions. This model holds significant promise for enhancing automated safety monitoring, potentially reducing occupational hazards by providing a proactive, resilient solution for real-time risk detection in tower crane operations.
(This article belongs to the Special Issue Advances in Computer Vision: Emerging Trends and Applications)
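
The abstract describes ECA-CBAM as a channel–spatial attention hybrid. As a point of reference, the channel half, Efficient Channel Attention (ECA), can be written in a few lines of PyTorch: global average pooling followed by a 1-D convolution across channels, adding almost no parameters. This is a generic ECA sketch, not the paper's exact ECA-CBAM module.

```python
# Generic ECA-style channel attention (after Wang et al., CVPR 2020).
# The paper combines this idea with CBAM-style spatial attention;
# only the channel branch is sketched here.
import torch
import torch.nn as nn

class ECA(nn.Module):
    def __init__(self, kernel_size: int = 3):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.pool(x)                       # (B, C, 1, 1): per-channel statistics
        y = y.squeeze(-1).transpose(1, 2)      # (B, 1, C) for the 1-D conv
        y = self.conv(y)                       # local cross-channel interaction
        y = torch.sigmoid(y).transpose(1, 2).unsqueeze(-1)  # back to (B, C, 1, 1)
        return x * y                           # reweight input channels

features = torch.randn(2, 64, 32, 32)
print(ECA()(features).shape)  # torch.Size([2, 64, 32, 32])
```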

Other

15 pages, 2850 KB  
Brief Report
Exploring the Frequency Domain Point Cloud Processing for Localisation Purposes in Arboreal Environments
by Rosa Pia Devanna, Miguel Torres-Torriti, Kamil Sacilik, Necati Cetin and Fernando Auat Cheein
Algorithms 2025, 18(8), 522; https://doi.org/10.3390/a18080522 - 18 Aug 2025
Abstract
Point clouds from 3D sensors such as LiDAR are increasingly used in agriculture for tasks like crop characterisation, pest detection, and leaf area estimation. While traditional point cloud processing typically occurs in Cartesian space using methods such as principal component analysis (PCA), this paper introduces a novel frequency-domain approach for point cloud registration. The central idea is that point clouds can be transformed and analysed in the spectral domain, where key frequency components capture the most informative spatial structures. By selecting and registering only the dominant frequencies, our method achieves significant reductions in localisation error and computational complexity. We validate this approach using public datasets and compare it with standard Iterative Closest Point (ICP) techniques. Our method, which applies ICP only to points in selected frequency bands, reduces localisation error from 4.37 m to 1.22 m (MSE), an improvement of approximately 72%. These findings highlight the potential of frequency-domain analysis as a powerful and efficient tool for point cloud registration in agricultural and other GNSS-challenged environments.
(This article belongs to the Special Issue Advances in Computer Vision: Emerging Trends and Applications)
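
One way to realise the band-selection idea, voxelising the cloud, transforming it with a 3-D FFT, and keeping only the strongest frequency components before running ICP, is sketched below with NumPy. The grid size and the 5% retention threshold are illustrative assumptions; the paper's band selection and registration details may differ.

```python
# Hedged sketch: retain only dominant frequency components of a
# voxelised point cloud. Grid size and retention fraction are
# illustrative, not the paper's settings.
import numpy as np

def dominant_band_filter(points: np.ndarray, grid: int = 64, keep: float = 0.05) -> np.ndarray:
    """points: (N, 3) array; returns a filtered (grid, grid, grid) volume."""
    # Normalise coordinates into [0, grid) and build an occupancy grid.
    mins, maxs = points.min(axis=0), points.max(axis=0)
    idx = ((points - mins) / (maxs - mins + 1e-9) * (grid - 1)).astype(int)
    occ = np.zeros((grid, grid, grid))
    occ[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0

    # Zero out all but the top `keep` fraction of frequency magnitudes.
    spectrum = np.fft.fftn(occ)
    threshold = np.quantile(np.abs(spectrum), 1.0 - keep)
    spectrum[np.abs(spectrum) < threshold] = 0.0

    # Back to the spatial domain; strong responses mark informative structure.
    return np.real(np.fft.ifftn(spectrum))

cloud = np.random.rand(10_000, 3)  # stand-in for a LiDAR scan of an orchard row
filtered = dominant_band_filter(cloud)
print(filtered.shape)  # (64, 64, 64)
# ICP (e.g., Open3D's registration_icp) would then run only on points
# that fall in high-response voxels.
```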
