Image Processing and Analysis for Object Detection: 3rd Edition

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Sensing and Imaging".

Deadline for manuscript submissions: 30 June 2026 | Viewed by 9271

Special Issue Editor


Prof. Dr. Kaihua Zhang
Guest Editor
School of Information and Control, Nanjing University of Information Science and Technology, Nanjing, China
Interests: computer vision; pattern recognition

Special Issue Information

Dear Colleagues,

In recent years, interest in deep learning techniques for computer vision has grown enormously. As deep learning comes to encompass almost every field of science and engineering, computer vision remains one of its primary application areas. In particular, applying deep learning to computer vision tasks has enabled numerous unprecedented applications, such as high-accuracy object detection, visual tracking, image segmentation, image/video super-resolution, satellite image processing, and salient object detection, at performance levels that conventional methods cannot reach.

This Special Issue aims to cover the latest advances in computer vision, involving the use of sensors (such as cameras, video cameras, and drones) for image acquisition, the use of deep learning methods, and a special focus on both low-level and high-level computer vision tasks. Original research and review articles are welcome. Potential topics may include, but are not limited to, the following:

  • Image/video super-resolution with deep learning approaches;
  • Object detection, visual tracking, and image/video segmentation with deep learning approaches;
  • Supervised and unsupervised learning for image/video processing;
  • Satellite image processing with deep learning techniques;
  • Low-light image enhancement using deep learning approaches.

Prof. Dr. Kaihua Zhang
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • augmented reality
  • artificial intelligence
  • computer vision
  • classification algorithms
  • defect detection
  • deep learning
  • feature extraction
  • image processing
  • image classification
  • image super-resolution
  • machine vision
  • object detection, tracking, and recognition techniques
  • semantic segmentation
  • sensing technologies
  • sensor fusion and technologies
  • visual tracking
  • vision sensors
  • video classification

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (7 papers)


Research

20 pages, 2605 KB  
Article
Spatial-Frequency Decoupling Alignment Encoding for Remote Sensing Change Detection
by Xu Zhang, Yue Du, Weiran Zhou and Kaihua Zhang
Sensors 2026, 26(6), 1979; https://doi.org/10.3390/s26061979 - 21 Mar 2026
Viewed by 520
Abstract
Existing remote sensing change detection methods often struggle to accurately capture the contours of complex change targets and subtle textural differences. This makes it difficult to effectively distinguish between the boundaries of change targets and the background. To address this challenge, we propose a novel method called spatial-frequency decoupling alignment encoding (SDA-Encoding), which is designed to fully leverage information from both the spatial and frequency domains. Specifically, we first use a Transformer encoder to extract bi-temporal features. Next, we apply wavelet transform to decouple these features into low-frequency and high-frequency components. In the multi-scale high-frequency interaction (MHI) module, we combine local spatial enhancement using spatial pyramid pooling with cross-scale dependency supplementation via the dual-domain alignment fusion (DAF) module. Meanwhile, in the position-aware low-frequency enhancement (PLE) module, spatial position sensitivity is restored using coordinate attention, and region-level contextual dependencies are captured through the selective fusion attention (SFA) module. Finally, the two frequency-domain branches are complementarily fused within the spatial domain to achieve unified detection of both fine-grained and structural changes. Experimental results on three benchmark datasets demonstrate the significant performance improvements of SDA-Encoding.
(This article belongs to the Special Issue Image Processing and Analysis for Object Detection: 3rd Edition)
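The wavelet decoupling step this abstract describes can be illustrated with a one-level 2D Haar transform, which splits a feature map into one low-frequency band (LL) and three high-frequency detail bands (LH, HL, HH). This is a generic sketch of the transform, not code from the paper; the function name and NumPy formulation are illustrative assumptions:

```python
import numpy as np

def haar2d(x):
    """One-level 2D Haar transform of a single-channel feature map.

    Splits x (H x W, with even H and W) into a low-frequency band (ll)
    and three high-frequency detail bands (lh, hl, hh).
    """
    a = x[0::2, 0::2]  # top-left of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 4.0  # low frequency: local averages
    lh = (a + b - c - d) / 4.0  # detail: differences across rows
    hl = (a - b + c - d) / 4.0  # detail: differences across columns
    hh = (a - b - c + d) / 4.0  # detail: diagonal differences
    return ll, lh, hl, hh
```

On a constant input the three detail bands are zero and everything lands in the low-frequency band, which is why high-frequency bands isolate edges and fine textures.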

22 pages, 2817 KB  
Article
A Dual-Branch Spatial Interaction and Multi-Scale Separable Aggregation Driven Hybrid Network for Infrared Image Super-Resolution
by Jiajia Liu, Wenxiang Dong, Xuan Zhao, Jianhua Liu and Xiaoguang Tu
Sensors 2026, 26(4), 1332; https://doi.org/10.3390/s26041332 - 19 Feb 2026
Viewed by 403
Abstract
Single image super-resolution (SISR) is a classical computer vision task that aims to reconstruct a high-resolution image from a low-resolution input, thereby improving detail sharpness and visual quality. In recent years, convolutional neural network (CNN)-based methods and transformer-based methods using self-attention mechanisms have achieved significant progress in visible-image super-resolution. However, the direct application of these two types of methods to infrared images still poses considerable challenges. On the one hand, infrared images generally suffer from low signal-to-noise ratio, blurred edges, and missing details, and relying only on local convolutions makes it difficult to adequately model long-range dependencies across regions. On the other hand, although pure transformer models have a strong global modeling ability, they usually have large numbers of parameters and are sensitive to the amount of training data, making it difficult to balance efficiency and detail restoration in infrared imaging scenarios. To address these issues, we propose a hybrid neural network architecture for infrared image super-resolution reconstruction, termed RDSR (Residual Dual-branch Separable Super-Resolution Network), which organically integrates multi-scale depthwise separable convolutions with shifted-window self-attention. Specifically, we design a dual-branch spatial interaction module (BDSI, Dual-Branch Spatial Interaction) and a multi-scale separable spatial aggregation module (MSSA, Multi-Scale Separable Spatial Aggregation). The BDSI module models correlations along rows and columns through grouped convolutions in the horizontal and vertical directions, effectively strengthening the spatial information interaction between the convolution branch and the self-attention branch. The MSSA module replaces the conventional MLP with three parallel depthwise separable convolution branches, improving the feature representation and nonlinear modeling through multi-scale spatial aggregation and a star-shaped gating operation. The experimental results on multiple public infrared image datasets show that for ×2 and ×4 upscaling, the proposed RDSR achieves higher PSNR and SSIM values than CNN-based methods such as EDSR, RCAN, and RDN, as well as transformer-based methods such as SwinIR, DAT, and HAT, demonstrating the effectiveness of the proposed modules and the overall framework.
(This article belongs to the Special Issue Image Processing and Analysis for Object Detection: 3rd Edition)
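The abstract reports PSNR and SSIM gains over CNN- and transformer-based baselines. As a reference point, PSNR is simple to compute; below is a minimal sketch over flat pixel lists (the function name and signature are illustrative, not from the paper):

```python
import math

def psnr(ref, recon, peak=255.0):
    """Peak signal-to-noise ratio (dB) between two equal-length pixel lists."""
    mse = sum((r - t) ** 2 for r, t in zip(ref, recon)) / len(ref)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(peak * peak / mse)
```

Higher PSNR means the reconstruction is closer to the reference; SSIM complements it by scoring local structure rather than raw pixel error.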

27 pages, 4033 KB  
Article
DCDW-YOLOv11: An Intelligent Defect-Detection Method for Key Transmission-Line Equipment
by Dezhi Wang, Riqing Song, Minghui Liu, Xingqian Wang, Chengyu Zhang, Ziang Wang and Dongxue Zhao
Sensors 2026, 26(3), 1029; https://doi.org/10.3390/s26031029 - 4 Feb 2026
Viewed by 707
Abstract
The detection of defects in key transmission-line equipment under complex environments often suffers from insufficient accuracy and reliability due to background interference and multi-scale feature variations. To address this issue, this paper proposes an improved defect detection model based on YOLOv11, named DCDW-YOLOv11. The model introduces deformable convolution C2f_DCNv3 in the backbone network to enhance adaptability to geometric deformations of targets, and incorporates the convolutional block attention module (CBAM) to highlight defect features while suppressing background interference. In the detection head, a dynamic head structure (DyHead) is adopted to achieve cross-layer multi-scale feature fusion and collaborative perception, along with the WIoU loss function to optimize bounding box regression and sample weight allocation. Experimental results demonstrate that on the transmission-line equipment defect dataset, DCDW-YOLOv11 achieves an accuracy, recall, and mAP of 94.4%, 92.8%, and 96.3%, respectively, representing improvements of 2.8%, 7.0%, and 4.4% over the original YOLOv11, and outperforming other mainstream detection models. The proposed method can provide high-precision and highly reliable defect detection support for intelligent inspection of transmission lines in complex scenarios.
(This article belongs to the Special Issue Image Processing and Analysis for Object Detection: 3rd Edition)
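DCDW-YOLOv11 adopts the WIoU loss for bounding-box regression. The whole IoU-loss family starts from plain intersection over union; a minimal sketch follows (the (x1, y1, x2, y2) box format is an assumption here, and this is not the paper's WIoU weighting itself):

```python
def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])  # intersection corners
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)
```

WIoU builds on this overlap term with a dynamic per-sample weighting scheme, which is what the abstract refers to as optimizing sample weight allocation.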

15 pages, 3967 KB  
Article
Low-Light Image Segmentation on Edge Computing System
by Sung-Chan Choi and Sung-Yeon Kim
Sensors 2026, 26(1), 327; https://doi.org/10.3390/s26010327 - 4 Jan 2026
Viewed by 702
Abstract
Segmenting low-light images, such as images showing cracks on tunnel walls, is challenging due to limited visibility. Hence, we need to combine image brightness enhancement and a segmentation algorithm. We introduce essential preliminaries, specifically highlighting deep learning-based low-light image enhancement methods and the pixel-level image segmentation algorithm. After that, we provide a three-step low-light image segmentation algorithm. The proposed algorithm begins with brightness and contrast enhancement of low-light images, followed by accurate segmentation using a U-Net model. Through various experiments, we report the performance metrics of the proposed low-light image segmentation algorithm and compare its performance against several baseline models. Furthermore, we demonstrate the implementation of the proposed low-light image segmentation pipeline on an edge computing platform. The implementation results show that the proposed algorithm is sufficiently fast for real-time processing.
(This article belongs to the Special Issue Image Processing and Analysis for Object Detection: 3rd Edition)
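The pipeline's first step is brightness and contrast enhancement. The paper's actual enhancement network is not reproduced here; as a baseline illustration only, simple gamma correction lifts dark tones (pixel values normalized to [0, 1]; the function is a hypothetical sketch):

```python
def gamma_correct(pixels, gamma=0.5):
    """Brighten normalized pixel values in [0, 1]; gamma < 1 lifts dark tones."""
    return [p ** gamma for p in pixels]
```

With gamma = 0.5, a dark pixel at 0.25 is lifted to 0.5 while black and white stay fixed, which is why gamma curves are a common pre-processing baseline before segmentation.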

20 pages, 3806 KB  
Article
Fusing Multi-Temporal Context for Image Super-Resolution Reconstruction in Cultural Heritage Monitoring
by Caiyan Chen, Fulong Chen, Sheng Gao, Hongqiang Li, Xinru Zhang and Yanni Cheng
Sensors 2026, 26(1), 228; https://doi.org/10.3390/s26010228 - 30 Dec 2025
Cited by 1 | Viewed by 592
Abstract
Effective conservation of World Heritage Sites relies on high-precision and continuous dynamic monitoring of their status. However, cloud cover, limitations in sensor resolution, and the vast distribution of heritage areas make it challenging to consistently acquire high-resolution imagery for key years, thereby hindering accurate characterization of their temporal evolution. To overcome this bottleneck, this paper proposes a temporal change-aware super-resolution reconstruction model. This model innovatively utilizes the temporal evolution information of heritage landscapes as a key clue for reconstructing high-quality imagery of the target year. We design a multi-branch architecture that takes the low-resolution image of the target year as the core input, while also incorporating the high- and low-resolution images from its preceding (t − 1) and subsequent (t + 1) years. Through parallel encoding branches, the model separately learns to: (1) extract spatial features from the multi-temporal low-resolution images, and (2) explicitly model the change patterns recorded in the high-resolution imagery from year t − 1 to t + 1, via a dedicated temporal change encoder. Finally, by deeply fusing these features, the model generates a simulated high-resolution image for the target year (t). Experimental results on a real-world dataset of the Weiyang Palace (WYP) core area (2017–2019), with 2018 as the target year, demonstrate that the proposed method achieves superior performance, significantly outperforming traditional single-image super-resolution models and a contrastive model without explicit temporal change modeling.
(This article belongs to the Special Issue Image Processing and Analysis for Object Detection: 3rd Edition)
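The temporal change encoder described above learns change patterns between the year t − 1 and t + 1 images. At its simplest, a change cue is a thresholded per-pixel difference; the toy sketch below only illustrates that cue (the threshold value and function name are illustrative, not the paper's learned encoder):

```python
def change_mask(img_prev, img_next, thresh=0.1):
    """Binary change mask from two equal-length pixel lists (values in [0, 1])."""
    return [abs(a - b) > thresh for a, b in zip(img_prev, img_next)]
```

The learned encoder replaces this fixed threshold with features trained end-to-end, but the underlying signal is the same: where the bracketing years differ, the target-year reconstruction should be allowed to change.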

30 pages, 7695 KB  
Article
RTUAV-YOLO: A Family of Efficient and Lightweight Models for Real-Time Object Detection in UAV Aerial Imagery
by Ruizhi Zhang, Jinghua Hou, Le Li, Ke Zhang, Li Zhao and Shuo Gao
Sensors 2025, 25(21), 6573; https://doi.org/10.3390/s25216573 - 25 Oct 2025
Cited by 2 | Viewed by 2959
Abstract
Real-time object detection in Unmanned Aerial Vehicle (UAV) imagery is critical yet challenging, requiring high accuracy amidst complex scenes with multi-scale and small objects, under stringent onboard computational constraints. While existing methods struggle to balance accuracy and efficiency, we propose RTUAV-YOLO, a family of lightweight models based on YOLOv11 tailored for UAV real-time object detection. First, to mitigate the feature imbalance and progressive information degradation of small objects in current architectures' multi-scale processing, we developed a Multi-Scale Feature Adaptive Modulation module (MSFAM) that enhances small-target feature extraction capabilities through adaptive weight generation mechanisms and dual-pathway heterogeneous feature aggregation. Second, to overcome the limitations in contextual information acquisition exhibited by current architectures in complex scene analysis, we propose a Progressive Dilated Separable Convolution Module (PDSCM) that achieves effective aggregation of multi-scale target contextual information through continuous receptive field expansion. Third, to preserve fine-grained spatial information of small objects during feature map downsampling operations, we engineered a Lightweight DownSampling Module (LDSM) to replace the traditional convolutional module. Finally, to rectify the insensitivity of current Intersection over Union (IoU) metrics toward small objects, we introduce the Minimum Point Distance Wise IoU (MPDWIoU) loss function, which enhances small-target localization precision through the integration of distance-aware penalty terms and adaptive weighting mechanisms. Comprehensive experiments on the VisDrone2019 dataset show that RTUAV-YOLO achieves an average improvement of 3.4% and 2.4% in mAP50 and mAP50-95, respectively, compared to the baseline model, while reducing the number of parameters by 65.3%. Its generalization capability for UAV object detection is further validated on the UAVDT and UAVVaste datasets. The proposed model is deployed on a typical airborne platform, Jetson Orin Nano, providing an effective solution for real-time object detection scenarios in actual UAVs.
(This article belongs to the Special Issue Image Processing and Analysis for Object Detection: 3rd Edition)
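The MPDWIoU loss is described as IoU plus distance-aware penalty terms and adaptive weighting. The minimum-point-distance idea it builds on penalizes the distances between corresponding box corners, normalized by image size; the sketch below shows only that core term (the paper's full loss with its adaptive weights is not reproduced, and the signature is illustrative):

```python
def mpd_iou(a, b, img_w, img_h):
    """IoU minus normalized squared corner distances (boxes: x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    iou = inter / union
    norm = img_w ** 2 + img_h ** 2  # squared image diagonal
    d_tl = (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2  # top-left corners
    d_br = (a[2] - b[2]) ** 2 + (a[3] - b[3]) ** 2  # bottom-right corners
    return iou - d_tl / norm - d_br / norm
```

Unlike plain IoU, the corner-distance terms keep a useful gradient for small, slightly misaligned boxes, which is exactly the regime where small-object localization suffers.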

21 pages, 3489 KB  
Article
GA-YOLOv11: A Lightweight Subway Foreign Object Detection Model Based on Improved YOLOv11
by Ning Guo, Min Huang and Wensheng Wang
Sensors 2025, 25(19), 6137; https://doi.org/10.3390/s25196137 - 4 Oct 2025
Cited by 2 | Viewed by 2455
Abstract
Modern subway platforms are generally equipped with platform screen door systems to enhance safety, but the gap between the platform screen doors and train doors may cause passengers or objects to become trapped, leading to accidents. Addressing the issues of excessive parameter counts and computational complexity in existing foreign object intrusion detection algorithms, as well as false positives and false negatives for small objects, this article introduces a lightweight deep learning model based on YOLOv11n, named GA-YOLOv11. First, a lightweight GhostConv convolution module is introduced into the backbone network to reduce computational resource waste in irrelevant areas, thereby lowering model complexity and computational load. Additionally, the GAM attention mechanism is incorporated into the head network to enhance the model's ability to distinguish features, enabling precise identification of object location and category, and significantly reducing the probability of false positives and false negatives. Experimental results demonstrate that in comparison to the original YOLOv11n model, the improved model achieves 3.3%, 3.2%, 1.2%, and 3.5% improvements in precision, recall, mAP@0.5, and mAP@0.5:0.95, respectively. Compared with the original YOLOv11n model, the number of parameters and GFLOPs were reduced by 18% and 7.9%, respectively, while maintaining the same model size. The improved model is more lightweight while ensuring real-time performance and accuracy, designed for detecting foreign objects in subway platform gaps.
(This article belongs to the Special Issue Image Processing and Analysis for Object Detection: 3rd Edition)
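GhostConv's saving comes from generating only part of the output channels with a standard convolution and deriving the rest with cheap depthwise operations, per the GhostNet design. A back-of-the-envelope parameter count makes the reduction concrete (function names and the 50/50 channel split are illustrative assumptions, not the paper's exact configuration):

```python
def conv_params(c_in, c_out, k):
    """Parameters of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k

def ghost_params(c_in, c_out, k, dw_k=3):
    """GhostConv-style count: half the channels from a standard conv, the
    other half from a cheap depthwise conv over those primary features."""
    primary = c_out // 2
    return c_in * primary * k * k + primary * dw_k * dw_k
```

For a 64-to-128 channel 3x3 layer this roughly halves the parameters, which is how the backbone gets lighter without shrinking the feature width.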
