Search Results (927)

Search Parameters:
Keywords = low visibility imaging

21 pages, 5958 KB  
Article
Robust Satellite Techniques (RSTs) for SO2 Detection with MSG-SEVIRI Data: A Case Study of the 2021 Tajogaite Eruption
by Rui Mota, Carolina Filizzola, Alfredo Falconieri, Francesco Marchese, Nicola Pergola, Valerio Tramutoli, Artur Gil and José Pacheco
Remote Sens. 2025, 17(19), 3345; https://doi.org/10.3390/rs17193345 - 1 Oct 2025
Viewed by 375
Abstract
Volcanic gas emissions, particularly sulfur dioxide (SO₂), are a critical component of volcano monitoring programs: SO₂ strongly affects air quality, the climate, and human health. Additionally, SO₂ can be used to assess the state of a volcano and the progression of an individual eruption, and it can serve as a proxy for volcanic ash. The 2021 Tajogaite eruption on La Palma (Spain) emitted large amounts of SO₂ over 85 days, with the plume reaching Central Europe. In this study, we present the results achieved by monitoring Tajogaite SO₂ emissions from 19 September to 31 October 2021 at different acquisition times (i.e., 10:00, 12:00, 14:00, and 16:00 UTC). An optimized configuration of the Robust Satellite Technique (RST) approach, tailored to volcanic SO₂ detection and exploiting the Spinning Enhanced Visible and InfraRed Imager (SEVIRI) channel at an 8.7 µm wavelength, was used. The results, assessed against masks drawn from the EUMETSAT Volcanic Ash RGB, show that the RST product identified volcanic SO₂ plumes on approximately 81% of eruption days, with a very low false-positive rate (2% and 0.3% for the mid/low- and high-confidence-level RST products, respectively), a weighted precision of ~79%, and an F1-score of ~54%. In addition, comparison with the Tropospheric Monitoring Instrument (TROPOMI) S5P Product Algorithm Laboratory (S5P-PAL) L3 gridded daily SO₂ CBR product shows that RST-SEVIRI detections were mostly associated with SO₂ plumes having a column density greater than 0.4 Dobson Units (DU). This study opens up promising scenarios for the near-real-time monitoring of volcanic SO₂ by means of the Flexible Combined Imager (FCI) aboard the Meteosat Third Generation (MTG) satellites, which offers improved instrumental features compared with SEVIRI.
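The core of the RST approach is a standardized temporal anomaly: each pixel's current brightness temperature is compared with the mean and standard deviation of a multi-year archive of scenes acquired under homogeneous observational conditions (same month, same time slot). A minimal sketch of that logic follows; the paper's exact spectral test, archive construction, and confidence thresholds are not reproduced, and the example threshold is illustrative.

```python
import numpy as np

def rst_anomaly_index(current_bt, reference_stack):
    """ALICE-like local variation index used in RST-style change detection.

    current_bt      : 2D array, 8.7 um brightness temperature of the scene.
    reference_stack : 3D array (time, y, x) of historical cloud-free scenes
                      from the same month and acquisition slot.
    """
    mu = np.nanmean(reference_stack, axis=0)    # pixel-wise temporal mean
    sigma = np.nanstd(reference_stack, axis=0)  # pixel-wise temporal std
    sigma = np.where(sigma > 0, sigma, np.nan)  # guard against flat pixels
    return (current_bt - mu) / sigma            # standardized anomaly

# SO2 absorption depresses the 8.7 um brightness temperature, so strongly
# negative indices flag candidate plume pixels (threshold is illustrative):
# so2_mask = rst_anomaly_index(bt_now, archive) < -2.0
```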

25 pages, 13955 KB  
Article
Adaptive Energy–Gradient–Contrast (EGC) Fusion with AIFI-YOLOv12 for Improving Nighttime Pedestrian Detection in Security
by Lijuan Wang, Zuchao Bao and Dongming Lu
Appl. Sci. 2025, 15(19), 10607; https://doi.org/10.3390/app151910607 - 30 Sep 2025
Viewed by 97
Abstract
In security applications, visible-light pedestrian detectors are highly sensitive to changes in illumination and fail under low-light or nighttime conditions, while infrared sensors, though resilient to lighting, often produce blurred object boundaries that hinder precise localization. To address these complementary limitations, we propose a practical multimodal pipeline, Adaptive Energy–Gradient–Contrast (EGC) Fusion with AIFI-YOLOv12, that first fuses infrared and low-light visible images using per-pixel weights derived from local energy, gradient magnitude, and contrast measures, and then detects pedestrians with an improved YOLOv12 backbone. The detector integrates an AIFI attention module at high semantic levels, replaces selected modules with A2C2f blocks to enhance cross-channel feature aggregation, and preserves P3–P5 outputs to improve small-object localization. We evaluate the complete pipeline on the LLVIP dataset and report Precision, Recall, mAP@50, mAP@50–95, GFLOPs, FPS, and detection time, comparing against YOLOv8 and YOLOv10–YOLOv12 baselines (n and s scales). Quantitative and qualitative results show that the proposed fusion restores complementary thermal and visible details and that the AIFI-enhanced detector yields more robust nighttime pedestrian detection while maintaining a competitive computational profile suitable for real-world security deployments.
(This article belongs to the Special Issue Advanced Image Analysis and Processing Technologies and Applications)
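The per-pixel weighting idea behind the EGC fusion step can be sketched directly: compute local energy, gradient magnitude, and contrast for each input, turn their combined salience into a normalized weight map, and blend. The measure definitions, window size, and combination rule below are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np
from scipy import ndimage

def egc_fusion(ir, vis, win=7, eps=1e-6):
    """Blend IR and low-light visible images with per-pixel weights built
    from local energy, gradient magnitude, and contrast (illustrative
    salience measures; the paper's definitions may differ)."""
    def salience(img):
        img = img.astype(np.float64)
        mean = ndimage.uniform_filter(img, win)
        mean_sq = ndimage.uniform_filter(img ** 2, win)
        energy = mean_sq                              # local energy (mean of squares)
        gx = ndimage.sobel(img, axis=1)
        gy = ndimage.sobel(img, axis=0)
        grad = np.hypot(gx, gy)                       # gradient magnitude
        contrast = np.sqrt(np.maximum(mean_sq - mean ** 2, 0))  # local std
        return energy + grad + contrast

    s_ir, s_vis = salience(ir), salience(vis)
    w_ir = s_ir / (s_ir + s_vis + eps)                # normalized IR weight
    return w_ir * ir + (1.0 - w_ir) * vis
```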
24 pages, 5484 KB  
Article
TFI-Fusion: Hierarchical Triple-Stream Feature Interaction Network for Infrared and Visible Image Fusion
by Mingyang Zhao, Shaochen Su and Hao Li
Information 2025, 16(10), 844; https://doi.org/10.3390/info16100844 - 30 Sep 2025
Viewed by 188
Abstract
As a key technology in multimodal information processing, infrared and visible image fusion holds significant application value in fields such as military reconnaissance, intelligent security, and autonomous driving. To address the limitations of existing methods, this paper proposes the Hierarchical Triple-Feature Interaction Fusion Network (TFI-Fusion). Built on a hierarchical triple-stream feature interaction mechanism, the network achieves high-quality fusion through a two-stage, separate-model processing approach. In the first stage, one model extracts low-rank components (representing global structural features) and sparse components (representing local detail features) from the source images via the Low-Rank Sparse Decomposition (LSRSD) module, while capturing cross-modal shared features using the Shared Feature Extractor (SFE). In the second stage, another model performs fusion and reconstruction: it first enhances the complementarity between low-rank and sparse features through the newly introduced Bi-Feature Interaction (BFI) module, realizes multi-level feature fusion via the Triple-Feature Interaction (TFI) module, and finally generates fused images with rich scene representation through feature reconstruction. This separate-model design reduces memory usage and improves runtime speed. Additionally, a multi-objective optimization function is designed around the network's characteristics. Experiments demonstrate that TFI-Fusion exhibits excellent fusion performance, effectively preserving image details and enhancing feature complementarity, thus providing reliable visual data support for downstream tasks.
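The first-stage decomposition separates each source image into a low-rank part (global structure) and a sparse part (local detail). A classical RPCA-style alternating shrinkage illustrates the idea; it is a numerical stand-in for the paper's learned LSRSD module, with heuristic shrinkage parameters.

```python
import numpy as np

def low_rank_sparse_split(X, lam=0.1, n_iter=20):
    """Crude RPCA-style split X ~ L + S: singular-value thresholding for
    the low-rank (global structure) part, soft thresholding for the sparse
    (local detail) part. Illustrative stand-in, not a learned module."""
    L = np.zeros_like(X)
    S = np.zeros_like(X)
    tau = lam * np.linalg.norm(X, 2)  # shrinkage scale (heuristic)
    for _ in range(n_iter):
        # Low-rank update: shrink singular values of the sparse residual.
        U, sig, Vt = np.linalg.svd(X - S, full_matrices=False)
        L = (U * np.maximum(sig - tau, 0)) @ Vt
        # Sparse update: soft-threshold what the low-rank part cannot explain.
        R = X - L
        S = np.sign(R) * np.maximum(np.abs(R) - lam, 0)
    return L, S
```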

34 pages, 9527 KB  
Article
High-Resolution 3D Thermal Mapping: From Dual-Sensor Calibration to Thermally Enriched Point Clouds
by Neri Edgardo Güidi, Andrea di Filippo and Salvatore Barba
Appl. Sci. 2025, 15(19), 10491; https://doi.org/10.3390/app151910491 - 28 Sep 2025
Viewed by 232
Abstract
Thermal imaging is increasingly applied in remote sensing to identify material degradation, monitor structural integrity, and support energy diagnostics. However, its adoption is limited by the low spatial resolution of thermal sensors compared to RGB cameras. This study proposes a modular pipeline to generate thermally enriched 3D point clouds by fusing RGB and thermal imagery acquired simultaneously with a dual-sensor unmanned aerial vehicle system. The methodology includes geometric calibration of both cameras, image undistortion, cross-spectral feature matching, and projection of radiometric data onto the photogrammetric model through a computed homography. Thermal values are extracted using a custom parser and assigned to 3D points based on visibility masks and interpolation strategies. Calibration achieved an 81.8% chessboard detection rate, yielding subpixel reprojection errors. Among twelve evaluated algorithms, LightGlue retained 99% of its matches and delivered a reprojection accuracy of 18.2% at 1 px, 65.1% at 3 px, and 79% at 5 px. A case study on photovoltaic panels demonstrates the method's capability to map thermal patterns with low temperature deviation from ground-truth data. Developed entirely in Python, the workflow integrates with Agisoft Metashape or other photogrammetry software. The proposed approach enables cost-effective, high-resolution thermal mapping with applications in civil engineering, cultural heritage conservation, and environmental monitoring.
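The projection step reduces to estimating a homography from cross-spectral matches and resampling the radiometric image into RGB pixel coordinates. A compact sketch with OpenCV follows, using ORB as a stand-in for the LightGlue matcher evaluated in the paper; the file names are placeholders.

```python
import cv2
import numpy as np

# Placeholder inputs: one RGB frame and the co-acquired thermal frame.
rgb = cv2.imread("rgb_frame.png", cv2.IMREAD_GRAYSCALE)
thermal = cv2.imread("thermal_frame.png", cv2.IMREAD_GRAYSCALE)

# Cross-spectral feature matching (ORB here; the paper uses LightGlue).
orb = cv2.ORB_create(4000)
k1, d1 = orb.detectAndCompute(thermal, None)
k2, d2 = orb.detectAndCompute(rgb, None)
matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)

src = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)  # 3 px threshold

# Resample the radiometric image into RGB pixel coordinates so each
# RGB-derived 3D point can be assigned a temperature value.
thermal_in_rgb = cv2.warpPerspective(thermal, H, rgb.shape[::-1])
```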

38 pages, 14848 KB  
Article
Image Sand–Dust Removal Using Reinforced Multiscale Image Pair Training
by Dong-Min Son, Jun-Ru Huang and Sung-Hak Lee
Sensors 2025, 25(19), 5981; https://doi.org/10.3390/s25195981 - 26 Sep 2025
Viewed by 368
Abstract
This study proposes an image-enhancement method to address the challenges of low visibility and color distortion in images captured during yellow sandstorms for image-sensor-based outdoor surveillance systems. The technique combines traditional image processing with deep learning to improve image quality while preserving color consistency during transformation. Conventional methods can partially improve color representation and reduce blurriness in sand–dust environments; however, they are limited in their ability to restore fine details and sharp object boundaries. In contrast, the proposed method incorporates Retinex-based processing into the training phase, enabling enhanced clarity and sharpness in the restored images. The proposed framework comprises three main steps. First, a cycle-consistent generative adversarial network (CycleGAN) is trained with unpaired images to generate synthetically paired data. Second, the CycleGAN is retrained using these generated images along with clear images obtained through multiscale image decomposition, allowing the model to transform dust-degraded images into clear ones. Finally, color preservation is achieved by selecting the A and B chrominance channels from the small-scale model to maintain the original color characteristics. The experimental results confirmed that the proposed method effectively restores image color and removes sand–dust interference, thereby providing enhanced visual quality under sandstorm conditions. Specifically, it outperformed algorithm-based dust removal methods such as Sand-Dust Image Enhancement (SDIE), Chromatic Variance Consistency Gamma and Correction-Based Dehazing (CVCGCBD), and Rank-One Prior (ROP+), as well as machine-learning-based methods including Fusion strategy and the Two-in-One Low-Visibility Enhancement Network (TOENet), achieving a Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE) score of 17.238, which indicates improved perceptual quality, and a Local Phase Coherence-Sharpness Index (LPC-SI) value of 0.973, indicating enhanced sharpness. Both metrics showed superior performance compared to conventional methods. When applied to Closed-Circuit Television (CCTV) systems, the proposed method is expected to mitigate the color distortion and image blurring caused by sand–dust, thereby effectively improving visual clarity in practical surveillance applications.
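The final color-preservation step (carrying the A and B chrominance channels over from one result while keeping the restored luminance) can be sketched in a few lines; the use of OpenCV's LAB conversion here is an assumption based on the channel names in the abstract, and the reference image stands in for the paper's small-scale model output.

```python
import cv2

def preserve_chrominance(restored_bgr, color_ref_bgr):
    """Keep the luminance (L) of the dust-removed output while taking the
    A/B chrominance channels from a color-faithful reference, mirroring the
    color-preservation step described above (color-space handling assumed)."""
    lab_restored = cv2.cvtColor(restored_bgr, cv2.COLOR_BGR2LAB)
    lab_ref = cv2.cvtColor(color_ref_bgr, cv2.COLOR_BGR2LAB)
    lab_restored[..., 1:] = lab_ref[..., 1:]   # swap in the A and B channels
    return cv2.cvtColor(lab_restored, cv2.COLOR_LAB2BGR)
```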

16 pages, 2888 KB  
Article
A Novel Application of Deep Learning–Based Estimation of Fish Abundance and Temporal Patterns in Agricultural Drainage Canals for Sustainable Ecosystem Monitoring
by Shigeya Maeda and Tatsuru Akiba
Sustainability 2025, 17(19), 8578; https://doi.org/10.3390/su17198578 - 24 Sep 2025
Viewed by 316
Abstract
Agricultural drainage canals provide critical habitats for fish species that are highly sensitive to agricultural practices. However, conventional monitoring methods such as capture surveys are invasive and labor-intensive: they can disturb fish populations and hinder long-term ecological assessment. There is therefore a strong need for effective, non-invasive monitoring techniques. In this study, we developed a practical method using the YOLOv8n deep learning model to automatically detect and quantify fish occurrence in underwater images from a canal in Ibaraki Prefecture, Japan. The model showed high performance in validation (F1-score = 91.6%, Precision = 95.1%, Recall = 88.4%) but reduced performance under real field conditions (F1-score = 61.6%) due to turbidity, variable lighting, and sediment resuspension. By correcting for detection errors, we estimated that approximately 7300 individuals of Pseudorasbora parva and 80 individuals of Cyprinus carpio passed through the observation site during a seven-hour monitoring period. These findings demonstrate the feasibility of deep learning-based monitoring for capturing temporal patterns of fish occurrence in agricultural drainage canals. This approach provides a promising tool for sustainable aquatic ecosystem management in agricultural landscapes and underscores the need for further improvements in recall under turbid, low-visibility conditions.
(This article belongs to the Section Environmental Sustainability and Applications)
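Correcting raw detector counts with validation-stage precision and recall is a simple first-order adjustment, sketched below; the authors' actual error-correction model is not described in the abstract, so this is only one plausible reading.

```python
# With estimated precision p and recall r, the true number of passages N
# relates to detected counts D roughly as N ~ D * p / r: multiply by
# precision to remove false alarms, divide by recall to add back misses.
def corrected_count(detections: int, precision: float, recall: float) -> float:
    true_positives = detections * precision   # remove false alarms
    return true_positives / recall            # compensate for missed fish

# e.g., 5000 detections at p = 0.95, r = 0.88 -> about 5398 individuals
print(round(corrected_count(5000, 0.95, 0.88)))  # 5398
```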

25 pages, 3276 KB  
Article
CPB-YOLOv8: An Enhanced Multi-Scale Traffic Sign Detector for Complex Road Environment
by Wei Zhao, Lanlan Li and Xin Gong
Information 2025, 16(9), 798; https://doi.org/10.3390/info16090798 - 15 Sep 2025
Viewed by 612
Abstract
Traffic sign detection is critically important for intelligent transportation systems, yet persistent challenges such as multi-scale variation and complex background interference severely degrade detection accuracy and real-time performance. To address these limitations, this study presents CPB-YOLOv8, an advanced multi-scale detection framework based on the YOLOv8 architecture. A Cross-Stage Partial-Partitioned Transformer Block (CSP-PTB) is incorporated into the feature extraction stage to preserve semantic information during downsampling while enhancing global feature representation. For feature fusion, a four-level bidirectional feature pyramid (BiFPN) integrated with a P2 detection layer significantly improves small-target detection capability. Further gains come from an optimized loss function that balances multi-scale objective localization. Comprehensive evaluations were conducted on TT100K, CCTSDB, and a custom multi-scenario road image dataset capturing urban and suburban environments at 1920 × 1080 resolution. The results are compelling: on TT100K, CPB-YOLOv8 achieved 90.73% mAP@0.5 with a 12.5 MB model size, exceeding the YOLOv8s baseline by 3.94 percentage points and achieving 6.43% higher small-target recall; on CCTSDB, it attained near-saturation performance of 99.21% mAP@0.5. Crucially, the model demonstrated exceptional robustness across diverse environmental conditions. Rigorous analysis on CCTSDB subsets partitioned by weather and illumination, alongside validation on a separate self-collected dataset reserved solely for inference, confirmed strong adaptability to real-world distribution shifts and low-visibility scenarios. Cross-dataset validation and visual comparisons further substantiated the model's robustness and its effective suppression of background interference.
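The BiFPN fusion nodes referenced above combine same-resolution features with learnable, normalized non-negative weights ("fast normalized fusion"). A minimal PyTorch version of one such node is shown below; how CPB-YOLOv8 wires its four levels and the P2 head is specific to the paper.

```python
import torch
import torch.nn as nn

class FastNormalizedFusion(nn.Module):
    """One BiFPN fusion node: learnable non-negative weights normalized so
    the output stays scale-stable (the standard BiFPN formulation)."""
    def __init__(self, n_inputs: int, eps: float = 1e-4):
        super().__init__()
        self.w = nn.Parameter(torch.ones(n_inputs))
        self.eps = eps

    def forward(self, *feats: torch.Tensor) -> torch.Tensor:
        w = torch.relu(self.w)             # keep the weights non-negative
        w = w / (w.sum() + self.eps)       # fast normalization
        return sum(wi * f for wi, f in zip(w, feats))

# Usage: fuse two same-shape maps, e.g. a top-down path and a lateral input.
# fuse = FastNormalizedFusion(2); out = fuse(p3_td, p3_in)
```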

24 pages, 11967 KB  
Article
Smartphone-Based Edge Intelligence for Nighttime Visibility Estimation in Smart Cities
by Chengyuan Duan and Shiqi Yao
Electronics 2025, 14(18), 3642; https://doi.org/10.3390/electronics14183642 - 15 Sep 2025
Viewed by 409
Abstract
Impaired visibility, a major global environmental threat, results from light scattering by atmospheric particulate matter. While digital photographs are increasingly used for daytime visibility estimation, such methods are largely ineffective at night owing to the different scattering effects. Here, we introduce an image-based algorithm for inferring nighttime visibility from a single photograph by analyzing the forward scattering index and optical thickness retrieved from the glow effects around light sources. Using photographs crawled from social media platforms across mainland China, we estimated nationwide visibility over one year with the proposed algorithm, achieving high goodness-of-fit values (R² = 0.757; RMSE = 4.318 km) and demonstrating robust performance across varied nighttime scenarios. The model also captures both chronic and episodic visibility degradation, including localized pollution events. These results highlight the potential of ubiquitous smartphone photography as a low-cost, scalable, real-time sensing solution for nighttime atmospheric monitoring in urban areas.
(This article belongs to the Special Issue Advanced Edge Intelligence in Smart Environments)
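The retrieval starts from the glow around point light sources: the radial decay of intensity encodes forward scattering by particulates. A sketch of the profile extraction is given below; the paper's forward scattering index and optical-thickness inversion are not reproduced here.

```python
import numpy as np

def radial_glow_profile(img, cx, cy, r_max=100):
    """Mean intensity versus radius around a detected light source at
    (cx, cy); the decay of this glow profile is the raw observable that
    scattering-based retrievals work from."""
    y, x = np.indices(img.shape)
    r = np.hypot(x - cx, y - cy).astype(int)
    mask = r < r_max
    total = np.bincount(r[mask], weights=img[mask].astype(float))
    counts = np.bincount(r[mask])
    return total / np.maximum(counts, 1)   # mean intensity per radius bin
```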

12 pages, 1009 KB  
Article
Contrast-Enhanced Transcranial Doppler for Detecting Residual Leaks—A Single-Center Study on the Effectiveness of Percutaneous PFO Closure
by Malwina Smolarek-Nicpoń, Grzegorz Smolka, Aleksandra Michalewska-Włudarczyk, Piotr Pysz, Anetta Lasek-Bal, Wojciech Wojakowski and Andrzej Kułach
J. Clin. Med. 2025, 14(18), 6483; https://doi.org/10.3390/jcm14186483 - 15 Sep 2025
Viewed by 382
Abstract
Background: A persistent connection between the atria, known as a patent foramen ovale (PFO), is present in approximately 25% of the general population. PFO closure is indicated in patients under 60 years of age who have experienced an embolic stroke of undetermined source (ESUS) or a transient ischemic attack (TIA) confirmed by neurological imaging, and in selected cases of peripheral embolism. Follow-up after the procedure is indicated to confirm the position of the occluder, assess the effectiveness of the closure, and evaluate any potential thrombus formation on the device. Methods: We analyzed data from 75 consecutive patients who underwent percutaneous PFO closure and were followed up for at least one year. The procedure was performed under fluoroscopy and transesophageal echocardiography (TEE) guidance, and occluder size was selected using TEE multiplanar reconstruction (MPR). All patients had standard transthoracic echocardiography (TTE) at 1 and 6–12 months after the procedure. To assess long-term efficacy, contrast-enhanced transcranial Doppler (ce-TCD) was performed at 12 months to record high-intensity transient signals (HITSs). Patients with positive ce-TCD results underwent TEE. Results: During follow-up evaluations at 1 and 6–12 months (TTE), we did not observe any device dislodgements, thrombi, or residual leaks visible on TTE. ce-TCD detected HITSs in eight patients, prompting additional TEE examinations in seven cases. In five of the seven patients, a leak around the occluder was identified, including two patients with grade 2 HITSs. Conclusions: Assessing the effectiveness of PFO occluder placement is crucial for estimating residual embolic risk and thus the need for antithrombotic therapy. Even low grades of HITSs observed on ce-TCD help identify patients with residual leaks confirmed on TEE.
(This article belongs to the Special Issue Patent Foramen Ovale 2023: More Lights than Shadows)

15 pages, 2947 KB  
Article
Visible-Light Spectroscopy and Laser Scattering for Screening Brewed Coffee Types Using a Low-Cost Portable Platform
by Eleftheria Maliaritsi, Georgios Violakis and Evangelos Hristoforou
Electronics 2025, 14(18), 3625; https://doi.org/10.3390/electronics14183625 - 12 Sep 2025
Viewed by 336
Abstract
Visible-light spectroscopy has long been used to assess quality indicators in coffee, from green beans to brewed beverages. High-end absorption spectroscopy systems can identify chemical compounds, monitor roasting chemistry, and support flavor profiling. Despite advances in low-cost spectroscopy, such techniques are rarely applied during coffee-drink preparation; instead, most coffee shops rely on simple refractometers to measure total dissolved solids (TDS) as a proxy for beverage strength. This study explores a portable, low-cost screening system that integrates visible absorption-transmittance, laser-induced scattering, and fluorescence spectroscopy to estimate brew strength and investigate potential differentiation between coffee-drink types. Experiments were conducted on four common drink preparations. A dual-region exponential decay model was applied to the absorption-transmittance spectra, while laser-scattered light imaging revealed distinctive color patterns across samples. The results demonstrate the feasibility of optical fingerprinting as a non-invasive tool to support quality assessment in brewed coffee.
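A dual-region exponential decay fit of the kind described can be set up with scipy in a few lines; the split wavelength, decay form, and initial guesses below are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def exp_decay(wl, a, k, c):
    """Single exponential decay in wavelength."""
    return a * np.exp(-k * wl) + c

def fit_dual_region(wavelengths, absorbance, split_nm=550):
    """Fit separate exponential decays to two spectral regions, echoing the
    dual-region model in the abstract (550 nm split point assumed)."""
    lo = wavelengths < split_nm
    p_lo, _ = curve_fit(exp_decay, wavelengths[lo], absorbance[lo],
                        p0=(1.0, 0.01, 0.0), maxfev=10000)
    p_hi, _ = curve_fit(exp_decay, wavelengths[~lo], absorbance[~lo],
                        p0=(1.0, 0.01, 0.0), maxfev=10000)
    return p_lo, p_hi   # (amplitude, decay rate, offset) per region
```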

15 pages, 4635 KB  
Article
GLNet-YOLO: Multimodal Feature Fusion for Pedestrian Detection
by Yi Zhang, Qing Zhao, Xurui Xie, Yang Shen, Jinhe Ran, Shu Gui, Haiyan Zhang, Xiuhe Li and Zhen Zhang
AI 2025, 6(9), 229; https://doi.org/10.3390/ai6090229 - 12 Sep 2025
Viewed by 675
Abstract
In modern computer vision, pedestrian detection holds significant importance for applications such as intelligent surveillance, autonomous driving, and robot navigation. However, single-modal images struggle to achieve high-precision detection in complex environments. To address this, this study proposes GLNet-YOLO, a framework based on cross-modal deep feature fusion that improves pedestrian detection in complex environments by fusing feature information from visible-light and infrared images. Extending the YOLOv11 architecture, the framework adopts a dual-branch network structure to process the visible-light and infrared modal inputs separately, and introduces the FM module to realize global feature fusion and enhancement, as well as the DMR module to accomplish local feature separation and interaction. Experimental results show that on the LLVIP dataset, compared to the single-modal YOLOv11 baseline, the fused model improves mAP@50 by 9.2% over the visible-light-only model and 0.7% over the infrared-only model. This significantly improves detection accuracy under low-light and complex background conditions and enhances the robustness of the algorithm; its effectiveness is further verified on the KAIST dataset.
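The dual-branch layout, one stem per modality followed by a fusion stage, has a simple skeleton; the FM (global fusion) and DMR (local separation/interaction) modules are paper-specific, so a plain 1×1-conv merge stands in for them here.

```python
import torch
import torch.nn as nn

class DualBranchFusion(nn.Module):
    """Skeleton of a dual-branch front end: one stem per modality, then a
    channel-wise merge (a stand-in for GLNet-YOLO's FM/DMR modules)."""
    def __init__(self, ch: int = 32):
        super().__init__()
        def stem():
            return nn.Sequential(nn.Conv2d(3, ch, 3, stride=2, padding=1),
                                 nn.SiLU())
        self.vis_stem, self.ir_stem = stem(), stem()
        self.fuse = nn.Conv2d(2 * ch, ch, kernel_size=1)  # global merge

    def forward(self, vis: torch.Tensor, ir: torch.Tensor) -> torch.Tensor:
        return self.fuse(torch.cat([self.vis_stem(vis),
                                    self.ir_stem(ir)], dim=1))
```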

5 pages, 1216 KB  
Abstract
Low-Power Vibrothermography for Detecting and Quantifying Defects on CFRP Composites
by Zulham Hidayat, Muhammet E. Torbali, Konstantinos Salonitis, Nicolas P. Avdelidis and Henrique Fernandes
Proceedings 2025, 129(1), 50; https://doi.org/10.3390/proceedings2025129050 - 12 Sep 2025
Viewed by 245
Abstract
Detecting and quantifying barely visible impact damage (BVID) in carbon fiber-reinforced polymer (CFRP) materials is a key challenge in maintaining the safety and reliability of composite structures. This study presents the application of low-power vibrothermography to identify and quantify such defects. Using a long-wave infrared (LWIR) camera, thermal data were captured from CFRP specimens that exhibit BVID. We also explore how image processing, specifically principal component analysis (PCA) and sparse principal component analysis (SPCA), can enhance thermal contrast and improve the accuracy of defect sizing. By combining low-energy excitation with advanced data analysis, this research aims to develop a more accessible and reliable approach to non-destructive testing (NDT) of composite materials.
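PCA-based enhancement of a thermal sequence (often called principal component thermography) treats each frame as a sample over pixels and keeps the leading components, where defect contrast concentrates. A minimal sketch, assuming a (time, height, width) frame stack:

```python
import numpy as np
from sklearn.decomposition import PCA

def principal_component_thermography(frames, n_components=5):
    """PCA over a thermal sequence: flatten each frame to a pixel vector,
    remove the per-pixel temporal mean, and decompose. Early components
    are contrast-enhanced maps in which BVID stands out. (Swap in
    sklearn's SparsePCA to mirror the SPCA variant, at higher cost.)"""
    t, h, w = frames.shape
    X = frames.reshape(t, h * w).astype(np.float64)
    X -= X.mean(axis=0)                   # remove per-pixel mean over time
    pca = PCA(n_components=n_components)
    pca.fit(X)
    return pca.components_.reshape(n_components, h, w)
```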

19 pages, 20856 KB  
Article
A Wavelet-Recalibrated Semi-Supervised Network for Infrared Small Target Detection Under Data Scarcity
by Cheng Jiang, Jingwen Ma, Xinpeng Zhang, Chiming Tong, Zhongqi Ma and Yongshi Jie
Sensors 2025, 25(18), 5677; https://doi.org/10.3390/s25185677 - 11 Sep 2025
Viewed by 349
Abstract
Infrared small target detection has long faced significant challenges due to the extremely small size of targets, low contrast, and the scarcity of annotated data. To tackle these issues, we propose a wavelet-recalibrated semi-supervised network (WRSSNet) that integrates synthetic data augmentation, feature reconstruction, and semi-supervised learning, aiming to fully exploit the potential of unlabeled infrared images under limited supervision. We construct a dataset containing 843 visible-light small target images and employ an improved CycleGAN model to convert them into high-quality pseudo-infrared images, effectively expanding the training data for infrared small target detection. In addition, we design a lightweight wavelet-enhanced channel recalibration and fusion (WECRF) module, which integrates wavelet decomposition with both channel and spatial attention mechanisms. This module enables adaptive reweighting and efficient fusion of multi-scale features, highlighting high-frequency details and weak target responses. Extensive experiments on two public infrared small target datasets, NUAA-SIRST and IRSTD-1K, demonstrate that WRSSNet achieves higher detection accuracy and lower false alarm rates than several state-of-the-art methods while maintaining low computational complexity.
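The wavelet side of WECRF can be illustrated with a single-level decomposition that amplifies the high-frequency sub-bands, where weak small-target responses live, before reconstruction. The fixed gain below stands in for the module's learned channel and spatial attention weights.

```python
import numpy as np
import pywt

def wavelet_highfreq_boost(feat_map, wavelet="haar", gain=1.5):
    """Toy wavelet recalibration: decompose a 2D feature map, amplify the
    horizontal/vertical/diagonal detail sub-bands, and reconstruct. The
    fixed gain is an illustrative stand-in for learned attention weights."""
    cA, (cH, cV, cD) = pywt.dwt2(feat_map, wavelet)
    boosted = (cA, (gain * cH, gain * cV, gain * cD))
    return pywt.idwt2(boosted, wavelet)
```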

25 pages, 18797 KB  
Article
AEFusion: Adaptive Enhanced Fusion of Visible and Infrared Images for Night Vision
by Xiaozhu Wang, Chenglong Zhang, Jianming Hu, Qin Wen, Guifeng Zhang and Min Huang
Remote Sens. 2025, 17(18), 3129; https://doi.org/10.3390/rs17183129 - 9 Sep 2025
Viewed by 674
Abstract
Under night vision conditions, visible-spectrum images often fail to capture background details, and conventional visible and infrared fusion methods generally overlay thermal signatures without preserving latent features in low-visibility regions. This paper proposes a novel deep learning-based fusion algorithm to enhance visual perception in night driving scenarios. First, a locally adaptive enhancement algorithm corrects underexposed and overexposed regions in the visible images, preventing oversaturation during brightness adjustment. Second, ResNet152 extracts hierarchical feature maps from the enhanced visible and infrared inputs, with max pooling and average pooling operations preserving critical features and distinct information across these feature maps. Finally, Linear Discriminant Analysis (LDA) reduces dimensionality and decorrelates the features, and the fused image is reconstructed by weighted integration of the source images. Experimental results on benchmark datasets show that our approach outperforms state-of-the-art methods in both objective metrics and subjective visual assessments.
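Locally adaptive exposure correction of the visible input can be approximated with tile-based luminance equalization; CLAHE is used below as a stand-in for the paper's enhancement algorithm, and operating on the L channel only is an assumption.

```python
import cv2

def local_exposure_correct(bgr, grid=8):
    """Locally adaptive brightness correction before fusion: raise
    underexposed regions and tame overexposed ones by equalizing luminance
    per tile, without touching chrominance (CLAHE as a stand-in)."""
    lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(grid, grid))
    lab[..., 0] = clahe.apply(lab[..., 0])   # equalize only the L channel
    return cv2.cvtColor(lab, cv2.COLOR_LAB2BGR)
```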

29 pages, 3367 KB  
Article
Small Object Detection in Synthetic Aperture Radar with Modular Feature Encoding and Vectorized Box Regression
by Xinmiao Du and Xihong Wu
Remote Sens. 2025, 17(17), 3094; https://doi.org/10.3390/rs17173094 - 5 Sep 2025
Viewed by 1033
Abstract
Object detection in synthetic aperture radar (SAR) imagery poses significant challenges due to low resolution, small objects, arbitrary orientations, and complex backgrounds. Standard object detectors often fail to capture sufficient semantic and geometric cues for such tiny targets. To address this, we propose a new Convolutional Neural Network (CNN) framework, the Deformable Vectorized Detection Network (DVDNet), specifically designed for detecting small, oriented, and densely packed objects in SAR images. DVDNet consists of a Grouped-Deformable Convolution for adapting receptive fields to diverse object scales, a Local Binary Pattern (LBP) Enhancement Module that enriches texture representations and enhances the visibility of small or camouflaged objects, and a Vector Decomposition Module that enables accurate regression of oriented bounding boxes via learnable geometric vectors. DVDNet is embedded in a two-stage detection architecture and is particularly effective at preserving the fine-grained features critical for small object localization. Its performance is validated on two SAR small target detection datasets, HRSID and SSDD, where it achieves 90.9% mAP on HRSID and 87.2% mAP on SSDD. Its generalizability was also verified on a self-built SAR ship dataset and the optical remote sensing dataset HRSC2016. Across all these experiments, DVDNet outperforms standard detectors. Notably, our framework shows substantial gains in precision and recall on small object subsets, validating the importance of combining deformable sampling, texture enhancement, and vector-based box representation for high-fidelity small object detection in complex SAR scenes.
(This article belongs to the Special Issue Deep Learning Techniques and Applications of MIMO Radar Theory)
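Vector-based oriented-box regression can be made concrete with a small decoder: a center point plus one learnable edge vector and an aspect ratio determine all four corners. This is one plausible parameterization, not necessarily DVDNet's exact one.

```python
import numpy as np

def obb_from_vectors(cx, cy, ux, uy, ratio):
    """Decode an oriented bounding box from a center (cx, cy), one
    half-extent vector (ux, uy) along the box orientation, and an aspect
    ratio fixing the perpendicular half-extent. Returns the 4 corners."""
    u = np.array([ux, uy])              # half-extent along the orientation
    w = ratio * np.array([-uy, ux])     # perpendicular half-extent
    c = np.array([cx, cy])
    return np.stack([c + u + w, c + u - w, c - u - w, c - u + w])

# Sanity check: unit vector along x with ratio 0.5 gives a 2 x 1 axis-aligned
# box about the origin.
print(obb_from_vectors(0, 0, 1, 0, 0.5))
```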
