Deep Learning-Based Computer Vision Technology and Its Applications

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 20 October 2026 | Viewed by 2833

Special Issue Editor


Guest Editor
Department of Industrial Electronics, School of Engineering, University of Minho, 4800-058 Guimarães, Portugal
Interests: medical image processing; assisted surgery; autonomous vehicle sensors; GNSS positioning systems; moving target synthetic aperture radar

Special Issue Information

Dear Colleagues,

Convolutional Neural Networks (CNNs) enable computer vision systems to learn visual features from large datasets and to perform tasks such as object detection, recognition, and localization; texture discrimination; facial recognition; and defect detection with high accuracy. For this Special Issue, we seek high-quality original research articles regarding all aspects of computer vision. We welcome both theoretical and practical studies of high technical quality across various disciplines, with the aim of highlighting methods employed in one area that may also apply to other areas.

Topics of interest include, but are not limited to, the following:

  • Systems for facilitating medical diagnostics;
  • Assisted surgery;
  • Autonomous vehicles;
  • Manufacturing (quality control);
  • Security and surveillance (facial recognition, etc.);
  • Agriculture (disease monitoring, crop yield assessment, etc.);
  • Retail and logistics (facilitating inventory management by automating stock tracking and visual auditing within warehouses and stores);
  • Moving-target indicators in synthetic aperture radar.

Dr. Carlos Lima
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • semantic/instance segmentation
  • automatic brain segmentation
  • wireless capsule endoscopy
  • cystoscopic image analysis
  • surgical-tool detection and segmentation
  • object recognition and scene segmentation
  • defect detection
  • facial recognition
  • pest detection and/or prediction
  • automatic inventory management

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad-scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (4 papers)

Research

14 pages, 14871 KB  
Article
Towards Accurate Face Detection Under Occlusion, Class Imbalance and Small-Scale Challenges
by Linrunjia Liu, Dayong Li, Shuai Wu and Qiguang Miao
Appl. Sci. 2026, 16(8), 3738; https://doi.org/10.3390/app16083738 - 10 Apr 2026
Viewed by 431
Abstract
To address face occlusion, low detection rates of small-scale faces, and sample imbalance in dense visual scenarios, we propose a YOLOv7-based detector with four key improvements: (1) an optimized MPConv module to enhance feature extraction; (2) a novel CFPM to boost sensitivity to occluded samples; (3) an integration of the DyHead block in IDetect to mitigate feature loss from sample imbalance; (4) an SW-SCE loss function with a dual-input network to better detect small faces. Experiments on the WiderFace dataset show that our method improves detection performance over the baseline by 1.2%, 1.8%, and 3% on the easy, medium, and hard subsets, respectively. These gains strengthen face detection in dense, challenging environments with heavy occlusion and small-scale targets.
(This article belongs to the Special Issue Deep Learning-Based Computer Vision Technology and Its Applications)

17 pages, 2684 KB  
Article
Semantic-Enhanced Bidirectional Multimodal Fusion for 3D Object Detection Under Adverse Weather
by Tianzhe Jiao, Yuming Chen, Xiaoyue Feng, Chaopeng Guo and Jie Song
Appl. Sci. 2026, 16(6), 2943; https://doi.org/10.3390/app16062943 - 18 Mar 2026
Viewed by 519
Abstract
Multimodal fusion methods leveraging various sensors provide strong support for 3D object detection. However, under adverse weather conditions such as rain, fog, snow, and intense glare, complex environmental factors can degrade sensor data quality, leading to increased false positives and missed detections. In addition, sensor modalities (e.g., LiDAR and cameras) inherently vary in information density, and directly fusing them can cause critical details in high-density data to be diluted by low-density data, thereby increasing errors. To address these issues, we propose a Semantic-Enhanced Bidirectional Multimodal Fusion (SeBFusion) framework. By introducing a semantic enhancement mechanism and a bidirectional fusion strategy, SeBFusion mitigates the impact of noise under adverse weather and alleviates information dilution in multimodal fusion. Specifically, SeBFusion first employs a virtual point generation and camera semantic injection module to selectively map image semantic features into 3D space, producing semantically enhanced LiDAR features to compensate for the sparsity of the raw LiDAR point cloud. Then, during cross-modal interaction, we design a bidirectional cross-attention fusion module. This module estimates the confidence of each modality and adaptively reweights the bidirectional information flow, thereby reducing the risk of noise propagation across modalities and improving the robustness and accuracy of 3D object detection in complex environments. Experiments on adverse-weather versions of datasets such as KITTI-C and nuScenes-C validate the effectiveness and superiority of the proposed method. On the nuScenes-C dataset, it achieves 66.2% and 66.6% mAP under fog and snow conditions, respectively.
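The confidence-based reweighting mentioned in this abstract can be illustrated with a toy sketch. This is not the SeBFusion module (the abstract does not give its equations); it only assumes that each modality has a scalar confidence score and that the scores are normalized with a softmax before blending two feature vectors. The names `softmax` and `fuse_by_confidence` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scalar scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_by_confidence(lidar_feat, camera_feat, lidar_conf, camera_conf):
    """Blend two equal-length feature vectors, weighting each modality
    by its softmax-normalized confidence (an illustrative assumption,
    not the authors' cross-attention design)."""
    w_lidar, w_cam = softmax([lidar_conf, camera_conf])
    return [w_lidar * l + w_cam * c for l, c in zip(lidar_feat, camera_feat)]


# Equal confidence: the fused vector is the plain average.
balanced = fuse_by_confidence([1.0, 0.0], [0.0, 1.0], 0.0, 0.0)

# Camera degraded (e.g. by fog): the fused vector stays close to LiDAR.
foggy = fuse_by_confidence([1.0, 0.0], [0.0, 1.0], 5.0, -5.0)
```

In this sketch a low-confidence modality contributes little to the fused vector, which is the intuition behind limiting noise propagation across modalities.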

23 pages, 3475 KB  
Article
YOLO-GSD-seg: YOLO for Guide Rail Surface Defect Segmentation and Detection
by Shijun Lai, Zuoxi Zhao, Yalong Mi, Kai Yuan and Qian Wang
Appl. Sci. 2026, 16(3), 1261; https://doi.org/10.3390/app16031261 - 26 Jan 2026
Viewed by 786
Abstract
To address the challenges of accurately extracting features from elongated scratches, irregular defects, and small-scale surface flaws on high-precision linear guide rails, this paper proposes a novel instance segmentation algorithm tailored for guide rail surface defect detection. The algorithm integrates the YOLOv8 instance segmentation framework with deformable convolutional networks and multi-scale feature fusion to enhance defect feature extraction and segmentation performance. A dedicated guide rail surface defect (GSD) segmentation dataset is constructed to support model training and evaluation. In the backbone, the DCNv3 module is incorporated to strengthen the extraction of elongated and irregular defect features while simultaneously reducing model parameters. In the feature fusion network, a multi-scale feature fusion module and a triple-feature encoding module are introduced to jointly capture global contextual information and preserve fine-grained local defect details. Furthermore, a Channel and Position Attention Module (CPAM) is employed to integrate global and local features, improving the model’s sensitivity to channel and positional cues of small-target defects and thereby enhancing segmentation accuracy. Experimental results show that, compared with the original YOLOv8n-Seg, the proposed method achieves improvements of 3.9% and 3.8% in Box and Mask mAP50, respectively, while maintaining a real-time inference speed of 148 FPS. Additional evaluations on the public MSD dataset further demonstrate the model’s strong versatility and robustness.
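For readers unfamiliar with the metric, mAP50 counts a prediction as correct when its intersection over union (IoU) with a ground-truth region is at least 0.5. The sketch below shows only the box IoU and the 0.5 matching test; it is not the paper's evaluation code, and the function names and sample boxes are illustrative assumptions. Mask mAP50 applies the same threshold to pixel-mask IoU instead.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred, gt, threshold=0.5):
    """True if the prediction would count as a hit at the IoU 0.5 threshold."""
    return box_iou(pred, gt) >= threshold


gt_box = (0, 0, 10, 10)
good_pred = (2, 0, 12, 10)   # intersection 80, union 120, IoU = 2/3
poor_pred = (8, 8, 18, 18)   # intersection 4, union 196, IoU is about 0.02
```

Averaging precision over the recall curve for each class, and then over classes, turns these per-prediction matches into the reported mAP50 figure.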

Review

52 pages, 4733 KB  
Review
Monocular Camera Localization in Known Environments: An In-Depth Review
by Hailun Yan, Albert Lau and Hongchao Fan
Appl. Sci. 2026, 16(5), 2332; https://doi.org/10.3390/app16052332 - 27 Feb 2026
Viewed by 586
Abstract
Monocular camera localization in known environments is a critical task for applications like autonomous navigation, augmented reality, and robotic positioning, requiring precise spatial awareness. Unlike localization in unknown environments, which builds maps in real time, this task leverages pre-existing data for higher accuracy. This review comprehensively analyzes monocular camera localization methods in known environments, categorizing them into 2D-2D feature matching, 2D-3D feature matching, and regression-based approaches. It consolidates foundational techniques and recent advancements, providing inter-class and intra-class performance comparisons on mainstream datasets. Key findings show that 2D-3D methods generally offer the highest accuracy, especially in structured outdoor environments, due to robust use of 3D spatial information. However, recent scene coordinate regression methods, such as ACE and ACE++, achieve comparable or superior performance in indoor scenes with more efficient pipelines. This review highlights challenges and proposes future directions: (1) synthetic data generation to meet deep learning demands, while addressing domain gaps; (2) improving generalization to unseen scenes and reducing retraining; (3) multi-sensor fusion for enhanced robustness; (4) exploring transformer-based and graph neural network architectures; (5) developing lightweight models for real-time performance on resource-constrained devices. This review aims to guide researchers and practitioners in method selection and identify key research directions.
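As background for the 2D-3D category, a candidate camera pose is commonly scored by reprojection error: each known 3D point is projected into the image and compared with its matched 2D feature. The sketch below assumes an ideal pinhole camera without lens distortion; the function names, intrinsics, and sample values are illustrative and are not taken from the review.

```python
import math


def project(point_w, R, t, fx, fy, cx, cy):
    """Project a 3D world point into pixel coordinates.

    R is a 3x3 rotation (nested lists) and t a 3-vector mapping world
    coordinates to the camera frame; fx, fy, cx, cy are pinhole intrinsics.
    """
    xc = [sum(R[i][j] * point_w[j] for j in range(3)) + t[i] for i in range(3)]
    if xc[2] <= 0:
        raise ValueError("point is behind the camera")
    return (fx * xc[0] / xc[2] + cx, fy * xc[1] / xc[2] + cy)


def reprojection_error(point_w, pixel, R, t, fx, fy, cx, cy):
    """Euclidean pixel distance between the projection and the observed feature."""
    u, v = project(point_w, R, t, fx, fy, cx, cy)
    return math.hypot(u - pixel[0], v - pixel[1])


# Identity pose: the camera frame coincides with the world frame.
R_id = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t_zero = [0.0, 0.0, 0.0]

uv = project([1.0, 2.0, 4.0], R_id, t_zero, 100.0, 100.0, 50.0, 40.0)
err = reprojection_error([1.0, 2.0, 4.0], (78.0, 94.0), R_id, t_zero,
                         100.0, 100.0, 50.0, 40.0)
```

A pose solver would search for the `R` and `t` that keep this error small across many 2D-3D matches, usually inside a robust loop that discards outlier matches.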