Topic Editors

Remote Sensing and Telecommunication Laboratory, Engineering Department, University of Sannio, 82100 Benevento, Italy
Department of Computer Science, Royal Holloway, University of London, Surrey TW20 0EX, UK

Computer Vision and Image Processing, 3rd Edition

Abstract submission deadline: 30 September 2026
Manuscript submission deadline: 31 December 2026

Topic Information

Dear Colleagues,

The field of computer vision and image processing has advanced significantly in recent years, with new techniques and applications emerging constantly. Building on the success of the first and second editions, we are pleased to announce a third edition on this exciting topic. We invite researchers, academics, and practitioners to submit original research articles, reviews, or case studies that address the latest developments in computer vision and image processing. Topics of interest include, but are not limited to:

  • Deep learning for image classification and recognition
  • Object detection and tracking
  • Image segmentation and analysis
  • 3D reconstruction and modeling
  • Image and video compression
  • Image enhancement and restoration
  • Medical image processing and analysis
  • Augmented and virtual reality

Submissions should be original and must not have been published or be under review elsewhere. All papers will be peer-reviewed by at least two experts in the field, and accepted papers will be published together on the topic website. To submit your paper, please visit the journal's website and follow the submission guidelines. For any queries, please contact the Topic Editors.

We look forward to receiving your submissions and sharing the latest advancements in computer vision and image processing with our readers.

Prof. Silvia Liberata Ullo
Prof. Dr. Li Zhang
Topic Editors

Keywords

  • 3D acquisition, processing, and visualization
  • scene understanding
  • multimodal sensor processing and fusion
  • multispectral, color, and greyscale image processing
  • industrial quality inspection
  • computer vision for robotics
  • computer vision for surveillance
  • airborne and satellite on-board image acquisition platforms
  • computational models of vision
  • imaging psychophysics

Participating Journals

Journal (Abbreviation) | Impact Factor | CiteScore | Launched Year | First Decision (median) | APC
Applied Sciences (applsci) | 2.5 | 5.5 | 2011 | 16 days | CHF 2400
Electronics (electronics) | 2.6 | 6.1 | 2012 | 16.4 days | CHF 2400
Journal of Imaging (jimaging) | 3.3 | 6.7 | 2015 | 18 days | CHF 1800
Modelling (modelling) | 1.5 | 2.2 | 2020 | 24.9 days | CHF 1200
Remote Sensing (remotesensing) | 4.1 | 8.6 | 2009 | 24.3 days | CHF 2700

Preprints.org is a multidisciplinary platform offering a preprint service designed to facilitate the early sharing of your research. It supports and empowers your research journey from the very beginning.

MDPI Topics is collaborating with Preprints.org and has established a direct connection between MDPI journals and the platform. Authors are encouraged to take advantage of this opportunity by posting their preprints at Preprints.org prior to publication:

  1. Share your research immediately: disseminate your ideas prior to publication and establish priority for your work.
  2. Safeguard your intellectual contribution: protect your ideas with a time-stamped preprint that serves as proof of your research timeline.
  3. Boost visibility and impact: increase the reach and influence of your research by making it accessible to a global audience.
  4. Gain early feedback: receive valuable input and insights from peers before submitting to a journal.
  5. Ensure broad indexing: preprints are indexed by Web of Science (Preprint Citation Index), Google Scholar, Crossref, SHARE, PrePubMed, Scilit, and Europe PMC.

Published Papers (9 papers)

26 pages, 12156 KB  
Article
Precision Micro-Vibration Measurement for Linear Array Imaging via Complex Morlet Wavelet Phase Magnification
by Meiyi Zhu, Dezhi Zheng, Ying Zhang and Shuai Wang
Appl. Sci. 2026, 16(7), 3518; https://doi.org/10.3390/app16073518 - 3 Apr 2026
Abstract
Traditional vision-based vibration measurement is fundamentally constrained by the low sampling rates of area-scan cameras and the noise sensitivity of existing motion magnification algorithms. To overcome these spatiotemporal barriers, we propose a high-fidelity framework that integrates ultra-high-speed line-scan imaging with a 1D Complex Morlet Wavelet Phase-Based Video Magnification (CMW-PVM) algorithm. By extracting and manipulating the localized phase of 1D spatial signals, CMW-PVM effectively decouples structural dynamics from background noise while eliminating the computational redundancy associated with 2D spatial pyramid methods. Simulations demonstrate that CMW-PVM significantly extends the linear magnification range (up to α = 35) while preserving exceptional structural fidelity (FSIM > 0.87) under severe noise conditions (SNR = 10 dB). Experimental validation against a laser Doppler vibrometer (LDV) reveals near-perfect kinematic accuracy, with a relative amplitude error of only 1.65%. Furthermore, at 100 Hz high-frequency excitation, the system successfully resolves microscopic displacements (≈10 μm) without temporal aliasing, enabled not by violating sampling theory but by leveraging the high physical line rate of the line-scan sensor. This establishes a robust, non-contact, and computationally efficient paradigm for broadband, micro-amplitude vibration monitoring in industrial environments.
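
For readers unfamiliar with phase-based magnification, the following minimal NumPy sketch illustrates the core idea on synthetic 1D line-scan frames: extract the local phase with a complex Morlet filter, then amplify its temporal deviations. All parameters and function names here are illustrative assumptions, not the authors' CMW-PVM implementation.

```python
import numpy as np

def morlet_kernel(sigma=4.0, omega0=1.5, half_width=16):
    # Complex Morlet wavelet: a complex exponential under a Gaussian window.
    x = np.arange(-half_width, half_width + 1)
    return np.exp(1j * omega0 * x) * np.exp(-x**2 / (2 * sigma**2))

def magnify_phase(lines, alpha=20.0):
    """lines: (T, N) stack of 1D frames -> phase-magnified frames."""
    psi = morlet_kernel()
    coeffs = np.stack([np.convolve(f, psi, mode="same") for f in lines])
    ref = coeffs.mean(axis=0)                    # complex temporal reference
    dev = np.angle(coeffs * np.conj(ref))        # wrap-safe phase deviation
    # Keep the amplitude, amplify only the temporal phase deviations.
    magnified = np.abs(coeffs) * np.exp(1j * (np.angle(ref) + alpha * dev))
    return magnified.real

# Synthetic demo: an intensity bump vibrating by ~0.05 px over 200 frames.
T, N = 200, 256
x = np.arange(N)
shift = 0.05 * np.sin(2 * np.pi * 5 * np.arange(T) / T)
lines = np.exp(-((x[None, :] - 128 - shift[:, None]) ** 2) / 50.0)
out = magnify_phase(lines, alpha=20.0)           # motion amplified ~20x
```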

26 pages, 6199 KB  
Article
WeatherMAR: Complementary Masking of Paired Tokens for Adverse-Weather Image Restoration
by Junyuan Ma, Qunbo Lv and Zheng Tan
J. Imaging 2026, 12(4), 154; https://doi.org/10.3390/jimaging12040154 - 2 Apr 2026
Abstract
Image restoration under adverse weather conditions has attracted increasing attention because of its importance for both human perception and downstream vision applications. Existing methods, however, are often designed for a single degradation type. We present WeatherMAR, a multi-weather restoration framework that formulates adverse-weather restoration as a paired-domain completion problem in a shared continuous token space. Specifically, WeatherMAR concatenates degraded and clean token sequences into a joint paired-domain sequence and performs restoration through masked autoregressive modeling, in which self-attention enables direct cross-domain interaction. To strengthen conditional learning while avoiding trivial paired correspondences, we introduce complementary bidirectional masking together with an optional reverse objective used only during training to encourage degradation-aware representations. WeatherMAR further employs a conditional diffusion objective for continuous token prediction and adopts a progress-to-step schedule to improve inference efficiency. Extensive experiments on standard multi-weather benchmarks, including Snow100K, Outdoor-Rain, and RainDrop, show that WeatherMAR achieves the best PSNR/SSIM on Snow100K-S (38.14/0.9684), the best SSIM on Outdoor-Rain (0.9396), and the best PSNR on Snow100K-L (32.58) and RainDrop (33.12). These results demonstrate that paired-domain token completion provides an effective solution for adverse-weather restoration.
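
The complementary-masking idea is easy to picture in code. The sketch below, with shapes and names of our own choosing rather than WeatherMAR's actual code, builds a joint paired-domain sequence in which every position is visible in exactly one of the two domains.

```python
import numpy as np

def complementary_masks(seq_len, mask_ratio=0.75, rng=None):
    """Boolean masks (True = replaced by a mask token) for the two domains.
    Wherever the clean token is masked, the paired degraded token stays
    visible, and vice versa, so no position is trivially copyable."""
    rng = rng or np.random.default_rng(0)
    clean_mask = rng.random(seq_len) < mask_ratio
    return ~clean_mask, clean_mask               # (degraded_mask, clean_mask)

def build_paired_sequence(degraded, clean, mask_token, d_mask, c_mask):
    """Concatenate both domains into one (2L, D) joint sequence."""
    d = np.where(d_mask[:, None], mask_token, degraded)
    c = np.where(c_mask[:, None], mask_token, clean)
    return np.concatenate([d, c], axis=0)

L, D = 196, 64                                   # e.g., 14x14 tokens, dim 64
degraded, clean = np.random.randn(L, D), np.random.randn(L, D)
d_mask, c_mask = complementary_masks(L)
pair = build_paired_sequence(degraded, clean, np.zeros(D), d_mask, c_mask)
```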

24 pages, 3448 KB  
Article
Gaussian-Guided Stage-Aware Deformable FPN with Coarse-to-Fine Unit-Circle Resolver for Oriented SAR Ship Detection
by Liangjie Meng, Qingle Guo, Danxia Li, Jinrong He and Zhixin Li
Remote Sens. 2026, 18(7), 1019; https://doi.org/10.3390/rs18071019 - 29 Mar 2026
Abstract
Synthetic Aperture Radar (SAR) enables all-weather maritime surveillance, yet oriented bounding box (OBB) ship detection remains challenging in complex scenes. Strong sea clutter and dense harbor scatterers often mask the slender characteristics of ships as well as the weak responses of small ships. Meanwhile, the periodicity of angle parameterization introduces regression discontinuities, and near-symmetric, bright-scatterer-dominated signatures further cause heading ambiguity, undermining the stability of orientation prediction. Moreover, in most detectors, multi-scale feature fusion and angle estimation lack explicit coordination, and rotated-box localization performance is often jointly affected by feature degradation and unstable orientation prediction. To this end, we propose a unified framework that simultaneously strengthens multi-scale representations and stabilizes orientation modeling. Specifically, we design a Gaussian-Guided Stage-Aware Deformable Feature Pyramid Network (GSDFPN) and a Coarse-to-Fine Unit-Circle Resolver (CF-UCR). GSDFPN enhances multi-scale fusion with two plug-in components: (i) a Gaussian-guided High-level Semantic Refinement Module (GHSRM) that suppresses clutter-dominated semantics while strengthening ship-responsive cues, and (ii) a Stage-aware Deformable Fusion Module (SDFM) for low-level features, which disentangles channels into a geometry-preserving spatial stream and a clutter-resistant semantic stream, and couples them via deformable interaction with bidirectional cross-stream gating to better capture the inherent slender characteristics of ships and localize small ships. For orientation, CF-UCR decomposes angle prediction into direction-cluster classification and intra-cluster residual regression on the unit circle, effectively mitigating periodicity-induced discontinuities and stabilizing rotated-box estimation. On SSDD+ and RSDD, our method achieves AP/AP50/AP75 of 0.5390/0.9345/0.4529 and 0.4895/0.9210/0.4712, respectively, while reaching APs75/APm75/APl75 of 0.5614/0.8300/0.8392 and 0.4986/0.8163/0.8934, evidencing strong rotated-box localization across target scales in complex maritime scenes.
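
The coarse-to-fine angle parameterization can be illustrated with a simple round trip: classify the angle into a direction cluster, regress the residual within the cluster, and decode back on the unit circle. The cluster count and residual convention below are our assumptions, not necessarily those of CF-UCR.

```python
import numpy as np

K = 8                          # direction clusters (our choice)
STEP = 2 * np.pi / K           # angular width of one cluster

def encode(theta):
    """Angle in [0, 2*pi) -> (cluster id, residual in [-STEP/2, STEP/2))."""
    theta = np.mod(theta, 2 * np.pi)
    cluster = np.floor(theta / STEP).astype(int)
    residual = theta - (cluster + 0.5) * STEP    # offset from cluster center
    return cluster, residual

def decode(cluster, residual):
    """Coarse class + fine residual -> angle back on the unit circle."""
    return np.mod((cluster + 0.5) * STEP + residual, 2 * np.pi)

angles = np.random.default_rng(0).uniform(0, 2 * np.pi, 1000)
c, r = encode(angles)
assert np.allclose(decode(c, r), angles)         # lossless round trip
# The residual is bounded, so regression never crosses the periodic boundary.
```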

21 pages, 4335 KB  
Article
Real-Time Small UAV Detection in Complex Airspace Using YOLOv11 with Residual Attention and High-Resolution Feature Enhancement
by Chuang Han, Md Redwan Ullah, Amrul Kayes, Khalid Hasan, Md Abdur Rouf, Md Rakib Hasan, Shen Tao, Guo Gengli and Mohammad Masum Billah
J. Imaging 2026, 12(3), 140; https://doi.org/10.3390/jimaging12030140 - 20 Mar 2026
Abstract
Detecting small unmanned aerial vehicles (UAVs) in complex airspace presents significant challenges due to their minimal pixel footprint, resemblance to birds, and frequent occlusion. To address these issues, we propose YOLOv11-ResCBAM, a novel real-time detection framework that integrates a Residual Convolutional Block Attention Module (ResCBAM) and a high-resolution P2 detection head into the YOLOv11 architecture. ResCBAM enhances channel and spatial feature refinement while preserving original feature contexts through residual connections, and the P2 head maintains fine spatial details crucial for small-object localization. Evaluated on a custom dataset of 4917 images (11,733 after augmentation) across three classes (drone, bird, airplane), our model achieves a mean average precision at the 0.5–0.95 IoU threshold (mAP@0.5–0.95) of 0.845, representing a 7.9% improvement over the baseline YOLOv11n, while maintaining real-time inference at 50.51 FPS. Cross-dataset validation on VisDrone2019-DET and UAVDT benchmarks demonstrates promising generalization trends. This work demonstrates the effectiveness of the proposed approach for UAV surveillance systems, balancing detection accuracy with computational efficiency for deployment in security-critical environments.
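
A residual attention block of this flavor is straightforward to sketch in PyTorch: channel and spatial attention in sequence, with a skip connection that preserves the original feature context. This is a generic illustration of the idea, not the paper's exact ResCBAM.

```python
import torch
import torch.nn as nn

class ResCBAM(nn.Module):
    def __init__(self, channels, reduction=16, spatial_kernel=7):
        super().__init__()
        # Channel attention: shared MLP over global avg- and max-pooled features.
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )
        # Spatial attention: conv over channel-wise avg and max maps.
        self.spatial = nn.Conv2d(2, 1, spatial_kernel,
                                 padding=spatial_kernel // 2, bias=False)

    def forward(self, x):
        ca = torch.sigmoid(self.mlp(x.mean((2, 3), keepdim=True))
                           + self.mlp(x.amax((2, 3), keepdim=True)))
        y = x * ca
        sa = torch.sigmoid(self.spatial(torch.cat(
            [y.mean(1, keepdim=True), y.amax(1, keepdim=True)], dim=1)))
        return x + y * sa   # residual skip preserves the original features

feats = torch.randn(2, 64, 32, 32)
out = ResCBAM(64)(feats)   # same shape as the input
```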

23 pages, 2010 KB  
Article
Visibility-Prior Guided Dual-Stream Mixture-of-Experts for Robust Facial Expression Recognition Under Complex Occlusions
by Siyuan Ma, Long Liu, Mingzhi Cheng, Peijun Qin, Zixuan Han, Cui Chen, Shizhao Yang and Hongjuan Wang
Electronics 2026, 15(6), 1230; https://doi.org/10.3390/electronics15061230 - 16 Mar 2026
Abstract
Facial occlusion induces sample-wise reliability shifts in facial expression recognition (FER), where the usefulness of global context and local discriminative cues varies dramatically with the amount of visible facial information. Existing occlusion-robust FER studies often evaluate under limited or homogeneous occlusion settings and commonly adopt static fusion strategies, which are insufficient for complex and heterogeneous real-world occlusions. In this work, we establish a rigorous occlusion robustness evaluation protocol by constructing a fixed offline test benchmark with diverse synthetic occlusion patterns (e.g., masks, sunglasses, texture blocks, and mixed occlusions) on top of public FER test splits. We further propose a Dual-Stream Adaptive Weighting Mixture-of-Experts framework (DS-AW-MoE) that fuses a global contextual expert and a local discriminative expert via an occlusion-aware weighting network. Crucially, we introduce a facial visibility assessment as a task-agnostic prior to explicitly regulate expert contributions, enabling dynamic re-allocation of model capacity according to input-dependent feature reliability. Extensive experiments on public datasets and the constructed occlusion benchmark demonstrate that DS-AW-MoE achieves more stable recognition under complex occlusions, characterized by a smaller and more consistent performance drop. To support reproducibility under dataset license constraints, we will release an anonymous, fully runnable repository containing the complete occlusion synthesis pipeline, evaluation protocol, and configuration files, allowing researchers to reproduce the benchmark after obtaining the original datasets.
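
The visibility-prior gating reduces to a weighted fusion of two expert outputs. The sketch below is our reading of the abstract (a scalar gate driven by a visibility score), not the released DS-AW-MoE code.

```python
import numpy as np

def fuse_experts(global_logits, local_logits, visibility):
    """visibility in [0, 1]: estimated fraction of the face that is visible.
    High visibility trusts the local discriminative expert; heavy occlusion
    falls back on the global contextual expert. In the paper the weight is
    produced by a learned occlusion-aware network; here it is the prior itself."""
    w = float(np.clip(visibility, 0.0, 1.0))
    return w * local_logits + (1.0 - w) * global_logits

g = np.array([1.2, -0.3, 0.8])              # global-context expert logits
l = np.array([0.1, 2.1, -0.5])              # local-cue expert logits
print(fuse_experts(g, l, visibility=0.9))   # mostly follows the local expert
```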

28 pages, 5420 KB  
Article
HEMS-RTDETR: A Lightweight Edge-Enhanced and Deformation-Aware Detector for Floating Debris in Complex Water Environments
by Yiwei Cui, Xinyi Jiang, Haiting Yu, Meizhen Lei and Jia Ren
Electronics 2026, 15(6), 1226; https://doi.org/10.3390/electronics15061226 - 15 Mar 2026
Abstract
Floating debris detection in complex aquatic environments holds significant importance for water resource protection and maritime safety monitoring. However, this task faces three core challenges: severe background interference leading to blurred target textures, significant non-rigid deformations, and the frequent loss of small targets at long distances. To address these issues, we propose a high-performance lightweight detection algorithm, termed High-Efficiency Edge-Aware Multi-Scale Real-Time Detection Transformer (HEMS-RTDETR), built upon the Real-Time Detection Transformer (RT-DETR) architecture. First, to suppress disturbances induced by water surface ripples and specular reflections, a Cross-Stage Partial Multi-Scale Edge Information Enhancement (CSP-MSEIE) module is introduced to reconstruct the backbone network. By removing computational redundancy while incorporating explicit edge enhancement, feature extraction capability and noise robustness for weak-texture targets are significantly improved. Second, to handle irregular debris morphology, a Deformable Attention Transformer (DAT) module is integrated, enabling adaptive attention focusing on geometrically deformed regions. Finally, an Efficient Multi-Scale Bidirectional Feature Pyramid Network (EMBSFPN) is constructed to enhance cross-scale semantic interaction and alleviate small-target signal loss. Experimental results demonstrate that, compared with RTDETR-r18, HEMS-RTDETR reduces parameters to 12.57 M, improves mAP@0.5 and mAP@0.5:0.95 by 2.44% and 3.05%, respectively, and maintains real-time inference at 93 FPS, indicating strong robustness and application potential in dynamic aquatic environments.
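
Explicit edge enhancement, in its simplest form, adds a normalized gradient-magnitude map back onto the input. The toy version below uses a Sobel operator and an additive form as illustrative assumptions; the actual CSP-MSEIE module is more elaborate.

```python
import numpy as np
from scipy.ndimage import sobel

def edge_enhance(img, strength=0.5):
    # Gradient magnitude as an explicit edge map, added back to the input
    # to emphasize weak-texture target boundaries against smooth water.
    gx, gy = sobel(img, axis=1), sobel(img, axis=0)
    edges = np.hypot(gx, gy)
    edges /= edges.max() + 1e-8
    return img + strength * edges

img = np.random.rand(64, 64)
enhanced = edge_enhance(img)          # same shape, boundaries accentuated
```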

23 pages, 14232 KB  
Article
A Dual-Branch Perception Network for High-Precision Oriented Object Detection in Remote Sensing
by Qi Wang and Wei Sun
Remote Sens. 2026, 18(5), 839; https://doi.org/10.3390/rs18050839 - 9 Mar 2026
Abstract
With the rapid evolution of remote sensing earth observation technology, high-resolution object detection is crucial in military and civilian domains but faces challenges from expansive views and complex backgrounds. Small objects are particularly challenging due to their low pixel coverage, poor textures, and susceptibility to drastic illumination changes and background clutter. To address these problems, this paper proposes MDCA-YOLO for oriented object detection. A Dual-Branch Perception Module (DBPM) combines large-kernel and strip convolutions to establish long-range dependencies, accurately capturing the geometric features of tiny objects even in the absence of local details. A Multi-Adaptive Selection Fusion (MASF) module addresses cross-scale feature loss by adaptively enhancing feature responses while suppressing background noise. Finally, a reconstructed decoupled detection head, CoordAttOBB, significantly improves angle regression accuracy while reducing complexity. Experimental results on the DIOR-R dataset show that MDCA-YOLO surpasses YOLO11s, improving mAP50 and mAP50:95 by 2.5% and 2.7%, respectively, demonstrating the algorithm's superiority in remote sensing tasks.
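
Strip convolutions are a cheap way to approximate a large receptive field, as the sketch below shows: a 1×k and a k×1 depthwise convolution in sequence cover a k×k neighborhood at a fraction of the cost. Layer sizes and the fusion scheme are illustrative guesses, not the paper's DBPM.

```python
import torch
import torch.nn as nn

class StripBranch(nn.Module):
    """1xK then Kx1 depthwise convolutions cover a KxK neighborhood at a
    fraction of the cost of a full KxK kernel; a modest large-kernel path
    is fused alongside for isotropic context."""
    def __init__(self, channels, k=11):
        super().__init__()
        self.h = nn.Conv2d(channels, channels, (1, k),
                           padding=(0, k // 2), groups=channels)
        self.v = nn.Conv2d(channels, channels, (k, 1),
                           padding=(k // 2, 0), groups=channels)
        self.large = nn.Conv2d(channels, channels, 5, padding=2,
                               groups=channels)

    def forward(self, x):
        return x + self.large(x) + self.v(self.h(x))   # residual fusion

x = torch.randn(1, 32, 64, 64)
y = StripBranch(32)(x)                # shape preserved: (1, 32, 64, 64)
```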

31 pages, 11349 KB  
Article
Recognition, Localization and 3D Geometric Morphology Calculation of Microblind Holes in Complex Backgrounds Based on the Improved YOLOv11 Network and AVC Algorithm
by Chengfen Zhang, Dong Xia, Ruizhao Chen, Qunfeng Niu, Tao Wang and Li Wang
J. Imaging 2026, 12(3), 96; https://doi.org/10.3390/jimaging12030096 - 24 Feb 2026
Abstract
Quality inspection of microblind holes, which requires accurately identifying contour features and precisely measuring 3D morphological parameters across holes of different sizes, depths, and contour shapes simultaneously, has long been challenging for machine-vision systems. This study takes cigarette microblind holes (diameter 0.1–0.2 mm, depth approximately 35 µm) as the research object, focusing on two major challenges: recognizing and localizing microblind hole contours against complex texture backgrounds, and accurately calculating their 3D geometric morphology. An improved YOLOv11s model is proposed for multiobject detection of microblind holes in images with complex texture backgrounds, extracting their features completely. An Area–Volume Computation (AVC) algorithm, based on discrete integral estimation and curve-fitting principles, is also proposed to compute their surface area and volume. The experimental results show that the precision, recall, mAP@0.5, mAP@0.5:0.95, and prediction time of the improved YOLOv11 network are 0.915, 0.948, 0.925, 0.615, and 1.27 ms, respectively. The relative errors (REs) of the surface area and volume calculations of the microblind holes are 5.236% and 3.964%, respectively. The proposed method achieves sufficiently accurate microblind hole recognition, localization, and 3D morphology calculation to meet on-site cigarette inspection criteria, and provides a reference for detecting other similar objects against complex texture backgrounds and accurately calculating their 3D geometry.
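
If the hole is summarized by a fitted radial depth profile z(r), surface area and volume follow from discrete integration of the surface-of-revolution formulas. The sketch below makes that assumption for illustration; the paper's AVC algorithm may differ in its fitting and integration details.

```python
import numpy as np

def trapezoid(y, x):
    # Plain trapezoidal rule (avoids NumPy-version differences in np.trapz).
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def hole_volume_and_area(r, z):
    """r: radii from the hole axis (m), ascending; z: depth at each radius (m).
    Volume    V = integral of 2*pi*r*z(r) dr
    Wall area S = integral of 2*pi*r*sqrt(1 + (dz/dr)^2) dr"""
    dz_dr = np.gradient(z, r)
    volume = trapezoid(2 * np.pi * r * z, r)
    area = trapezoid(2 * np.pi * r * np.sqrt(1.0 + dz_dr**2), r)
    return volume, area

# Example: a 0.15 mm diameter, 35 um deep hole with a smooth cosine profile.
R = 75e-6
r = np.linspace(0.0, R, 200)
z = 35e-6 * 0.5 * (1.0 + np.cos(np.pi * r / R))
V, S = hole_volume_and_area(r, z)     # volume in m^3, wall area in m^2
```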

22 pages, 38551 KB  
Article
Tiny Object Detection via Normalized Gaussian Label Assignment and Multi-Scale Hybrid Attention
by Shihao Lin, Li Zhong, Si Chen and Da-Han Wang
Remote Sens. 2026, 18(3), 396; https://doi.org/10.3390/rs18030396 - 24 Jan 2026
Abstract
The rapid development of Convolutional Neural Networks (CNNs) has markedly boosted the performance of object detection in remote sensing. Nevertheless, tiny objects typically account for an extremely small fraction of the total area in remote sensing images, rendering existing IoU-based or area-based evaluation metrics highly sensitive to minor pixel deviations. Meanwhile, classic detection models face inherent bottlenecks in efficiently mining discriminative features for tiny objects, leaving the task of tiny object detection in remote sensing images as an ongoing challenge in this field. To alleviate these issues, this paper proposes a tiny object detection method based on Normalized Gaussian Label Assignment and Multi-scale Hybrid Attention. Firstly, 2D Gaussian modeling is performed on the feature receptive field and the actual bounding box, using Normalized Bhattacharyya Distance for precise similarity measurement. Furthermore, a candidate sample quality ranking mechanism is constructed to select high-quality positive samples. Finally, a Multi-scale Hybrid Attention module is designed to enhance the discriminative feature extraction of tiny objects. The proposed method achieves 25.7% and 27.9% AP on the AI-TOD-v2 and VisDrone2019 datasets, respectively, significantly improving the detection capability of tiny objects in complex remote sensing scenarios.
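
Modeling boxes as 2D Gaussians and comparing them with a (normalized) Bhattacharyya distance yields a similarity that degrades smoothly under small pixel shifts, unlike hard IoU. The sketch below uses the standard Bhattacharyya formula for Gaussians and an exponential normalization as assumptions; the paper's exact normalization is not reproduced here.

```python
import numpy as np

def box_to_gaussian(cx, cy, w, h):
    """Axis-aligned box -> 2D Gaussian: mean at the center, covariance
    from the squared half-extents."""
    mu = np.array([cx, cy], dtype=float)
    sigma = np.diag([(w / 2.0) ** 2, (h / 2.0) ** 2])
    return mu, sigma

def bhattacharyya_similarity(g1, g2):
    (m1, s1), (m2, s2) = g1, g2
    s = 0.5 * (s1 + s2)
    d = m1 - m2
    dist = (0.125 * d @ np.linalg.solve(s, d)
            + 0.5 * np.log(np.linalg.det(s)
                           / np.sqrt(np.linalg.det(s1) * np.linalg.det(s2))))
    return np.exp(-dist)              # (0, 1] similarity, 1 = identical

a = box_to_gaussian(10, 10, 4, 4)
b = box_to_gaussian(11, 10, 4, 4)     # a one-pixel shift of a tiny box
print(bhattacharyya_similarity(a, b)) # ~0.97: degrades smoothly, unlike IoU
```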
