Search Results (622)

Search Parameters:
Keywords = infrared and visible light images

25 pages, 5269 KB  
Article
Micro-Multiband Imaging (µMBI) in the Technical Study and Condition Assessment of Paintings: An Insight into Its Potential and Limitations
by Miguel A. Herrero-Cortell, Irene Samaniego-Jiménez, Candela Belenguer-Salvador, Marta Raïch-Creus, Laura Osete-Cortina, Arianna Abbafati, Anna Vila, Marcello Picollo and Laura Fuster-López
Heritage 2026, 9(2), 54; https://doi.org/10.3390/heritage9020054 - 31 Jan 2026
Abstract
Multiband imaging (MBI) is a non-invasive, portable digital technique that has become increasingly widespread in the technical study and condition assessment of paintings, owing to its affordability and ease of use. This paper presents an experimental study aimed at optimising MBI at the microscopic scale, referred to as micro-multiband imaging (µMBI), in order to expand its diagnostic capabilities. A range of µMBI techniques was used on custom-made mock-ups composed of pigments selected for their spectral responses and representative of traditional artistic materials. The techniques used included polarised and unpolarised visible light microphotography (µVIS), raking light microphotography (µRL), transmitted light microphotography (µTL), ultraviolet-induced visible luminescence microphotography (µUVL), near-infrared microphotography (µIR), near-infrared micro-trans-irradiation (µIRT), and near-infrared false-colour microphotography (µIRFC). The results obtained through µMBI were compared with those from standard MBI methods, allowing for a critical discussion of the strengths and limitations of this emerging approach. The results show that µMBI provides high-resolution, spatially specific insights into materials and painting techniques, offering a more detailed understanding at the microscale of how a painting was executed. It also enables the assessment of deterioration processes (e.g., cracking, delamination, and metal soap formation), contributing to a deeper comprehension of the origin and progression of failure phenomena and supporting the development of more informed, preventive conservation strategies.
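Of the techniques listed above, the near-infrared false-colour composite (µIRFC/IRFC) is conventionally produced by a channel shift in which the infrared record takes over the red channel. A minimal sketch, assuming co-registered visible RGB and near-infrared captures as NumPy arrays (the array names are illustrative):

```python
import numpy as np

def irfc_composite(vis_rgb: np.ndarray, ir: np.ndarray) -> np.ndarray:
    """Infrared false-colour (IRFC) composite via the conventional
    channel shift used in technical photography of paintings:
    output R <- IR, output G <- visible R, output B <- visible G.
    Assumes vis_rgb (H, W, 3) and ir (H, W) are co-registered and
    share the same dtype and exposure normalisation.
    """
    out = np.empty_like(vis_rgb)
    out[..., 0] = ir               # red channel carries the IR record
    out[..., 1] = vis_rgb[..., 0]  # green channel carries visible red
    out[..., 2] = vis_rgb[..., 1]  # blue channel carries visible green
    return out
```

Pigments with similar visible colour but different infrared behaviour then separate visibly in the composite, which is what makes IRFC useful for pigment discrimination at both the macro and micro scales.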

20 pages, 49658 KB  
Article
Dead Chicken Identification Method Based on a Spatial-Temporal Graph Convolution Network
by Jikang Yang, Chuang Ma, Haikun Zheng, Zhenlong Wu, Xiaohuan Chao, Cheng Fang and Boyi Xiao
Animals 2026, 16(3), 368; https://doi.org/10.3390/ani16030368 - 23 Jan 2026
Abstract
In intensive cage rearing systems, accurate dead hen detection remains difficult due to complex environments, severe occlusion, and the high visual similarity between dead hens and live hens in a prone posture. To address these issues, this study proposes a dead hen identification method based on a Spatial-Temporal Graph Convolutional Network (STGCN). Unlike conventional static image-based approaches, the proposed method introduces temporal information to enable dynamic spatial-temporal modeling of hen health states. First, a multimodal fusion algorithm is applied to visible light and thermal infrared images to strengthen multimodal feature representation. Then, an improved YOLOv7-Pose algorithm is used to extract the skeletal keypoints of individual hens, and the ByteTrack algorithm is employed for multi-object tracking. Based on these results, spatial-temporal graph-structured data of hens are constructed by integrating spatial and temporal dimensions. Finally, a spatial-temporal graph convolution model is used to identify dead hens by learning spatial-temporal dependency features from skeleton sequences. Experimental results show that the improved YOLOv7-Pose model achieves an average precision (AP) of 92.8% in keypoint detection. Based on the constructed spatial-temporal graph data, the dead hen identification model reaches an overall classification accuracy of 99.0%, with an accuracy of 98.9% for the dead hen category. These results demonstrate that the proposed method effectively reduces interference caused by feeder occlusion and ambiguous visual features. By using dynamic spatial-temporal information, the method substantially improves the robustness and accuracy of dead hen detection in complex cage rearing environments, providing a new technical route for intelligent monitoring of poultry health status.
(This article belongs to the Special Issue Welfare and Behavior of Laying Hens)
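The hand-off from pose tracking to graph learning in a pipeline like this amounts to packing each tracked bird's keypoint sequence into the (C, T, V) tensor, plus a normalised adjacency matrix, that a spatial-temporal graph convolution consumes. A minimal sketch under that assumption; the joint count and skeleton edges below are hypothetical stand-ins, not the paper's definitions:

```python
import numpy as np

# Hypothetical 8-joint hen skeleton; the edge list is illustrative only.
NUM_JOINTS = 8
EDGES = [(0, 1), (1, 2), (2, 3), (1, 4), (4, 5), (1, 6), (6, 7)]

def build_stgcn_input(keypoint_seq):
    """Pack one tracked hen's keypoints into the (C, T, V) tensor an
    ST-GCN expects: C = (x, y, confidence), T frames, V joints.

    keypoint_seq: list of T arrays, each (V, 3), from the pose detector.
    Returns the input tensor and a symmetrically normalised adjacency.
    """
    x = np.stack(keypoint_seq, axis=0)    # (T, V, 3)
    x = np.transpose(x, (2, 0, 1))        # (C, T, V)
    a = np.eye(NUM_JOINTS)                # self-loops
    for i, j in EDGES:
        a[i, j] = a[j, i] = 1.0
    d = a.sum(axis=1)
    a_norm = a / np.sqrt(np.outer(d, d))  # D^-1/2 (A + I) D^-1/2
    return x, a_norm
```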

33 pages, 23667 KB  
Article
Full-Wave Optical Modeling of Leaf Internal Light Scattering for Early-Stage Fungal Disease Detection
by Da-Young Lee and Dong-Yeop Na
Agriculture 2026, 16(2), 286; https://doi.org/10.3390/agriculture16020286 - 22 Jan 2026
Abstract
Disease-induced modifications in leaf architecture disrupt optical properties and internal light-scattering dynamics. Accurate modeling of leaf-scale light scattering is therefore essential not only for understanding how disease affects the availability of light for chlorophyll absorption, but also for evaluating its potential as an early optical marker for plant disease detection prior to visible symptom development. Conventional ray-tracing and radiative-transfer models rely on high-frequency approximations and thus fail to capture diffraction and coherent multiple-scattering effects when internal leaf structures are comparable in size to optical wavelengths. To overcome these limitations, we present a GPU-accelerated finite-difference time-domain (FDTD) framework for full-wave simulation of light propagation within plant leaves, using anatomically realistic dicot and monocot leaf cross-section geometries. Microscopic images acquired from publicly available sources were segmented into distinct tissue regions and assigned wavelength-dependent complex refractive indices to construct realistic electromagnetic models. The proposed FDTD framework successfully reproduced characteristic reflectance and transmittance spectra of healthy leaves across the visible and near-infrared (NIR) ranges. Quantitative agreement between the FDTD-computed spectral reflectance and transmittance and those predicted by the reference PROSPECT leaf optical model was evaluated using Lin's concordance correlation coefficient. Higher concordance was observed for dicot leaves (Cb = 0.90) than for monocot leaves (Cb = 0.79), indicating stronger agreement for anatomically complex dicot structures. Furthermore, simulations mimicking an early-stage fungal infection in a dicot leaf, modeled by the geometric introduction of melanized hyphae penetrating the cuticle and upper epidermis, revealed a pronounced reduction in visible green reflectance and a strong suppression of the NIR reflectance plateau. These trends are consistent with experimental observations reported in previous studies. Overall, this proof-of-concept study represents the first full-wave FDTD-based optical modeling of internal light scattering in plant leaves. The proposed framework enables direct electromagnetic analysis of pre- and post-penetration light-scattering dynamics during early fungal infection and establishes a foundation for exploiting leaf-scale light scattering as a next-generation, pre-symptomatic diagnostic indicator for plant fungal diseases.
(This article belongs to the Special Issue Exploring Sustainable Strategies That Control Fungal Plant Diseases)
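For intuition about the Yee update scheme at the core of any FDTD solver, the sketch below shows the one-dimensional case (Ez/Hy fields, Courant number fixed at 1, a Gaussian soft source). The slab index and grid size are illustrative; the paper's solver operates on 2-D leaf cross-section geometries with wavelength-dependent complex refractive indices:

```python
import numpy as np

def fdtd_1d(eps_r: np.ndarray, steps: int, src_pos: int = 10) -> np.ndarray:
    """Minimal 1-D Yee FDTD update loop (Ez/Hy, Courant number = 1)."""
    imp0 = 377.0  # free-space impedance (ohms)
    ez = np.zeros(eps_r.size)
    hy = np.zeros(eps_r.size)
    for t in range(steps):
        hy[:-1] += (ez[1:] - ez[:-1]) / imp0              # update H from curl E
        ez[1:] += (hy[1:] - hy[:-1]) * imp0 / eps_r[1:]   # update E from curl H
        ez[src_pos] += np.exp(-((t - 30.0) / 10.0) ** 2)  # Gaussian soft source
    return ez

# e.g. a 'tissue' slab of refractive index 1.4 (eps_r = n**2) in air:
eps = np.ones(400)
eps[200:260] = 1.4 ** 2
field = fdtd_1d(eps, steps=500)
```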

19 pages, 3198 KB  
Article
Interface-Engineered Zn@TiO2 and Ti@ZnO Nanocomposites for Advanced Photocatalytic Degradation of Levofloxacin
by Ishita Raval, Atindra Shukla, Vimal G. Gandhi, Khoa Dang Dang, Niraj G. Nair and Van-Huy Nguyen
Catalysts 2026, 16(1), 109; https://doi.org/10.3390/catal16010109 - 22 Jan 2026
Abstract
The extensive consumption of freshwater resources and the continuous discharge of pharmaceutical residues pose serious risks to aquatic ecosystems and public health. In this study, pristine ZnO, TiO2, Zn@TiO2, and Ti@ZnO nanocomposites were synthesized via a precipitation-assisted solid–liquid interface method and systematically evaluated for the photocatalytic degradation of the antibiotic levofloxacin under UV and visible light irradiation. The structural, optical, and surface properties of the synthesized materials were characterized using X-ray diffraction (XRD), Fourier transform infrared spectroscopy (FTIR), scanning electron microscopy (SEM), UV–visible diffuse reflectance spectroscopy (UV–DRS), and X-ray photoelectron spectroscopy (XPS). XRD analysis confirmed the crystalline nature of all samples, while SEM images revealed spherical and agglomerated morphologies. Photocatalytic experiments were conducted using a 50-ppm levofloxacin solution with a catalyst dosage of 1 g L−1. Pristine ZnO exhibited limited visible-light activity (33.81%) but high UV-driven degradation (92.98%), whereas TiO2 showed comparable degradation efficiencies under UV (78.6%) and visible light (78.9%). Notably, Zn@TiO2 nanocomposites demonstrated superior photocatalytic performance, achieving over 90% degradation under UV and nearly 70% under visible light, while Ti@ZnO composites exhibited less than 60% degradation. The enhanced activity of Zn@TiO2 is attributed to improved interfacial charge transfer, suppressed electron–hole recombination, and extended light absorption. These findings highlight Zn@TiO2 nanocomposites as promising photocatalysts for efficient treatment of pharmaceutical wastewater under dual-light irradiation.
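The degradation percentages quoted above follow from concentration ratios, and runs of this kind are commonly summarised by an apparent pseudo-first-order rate constant. A small sketch of both calculations; the kinetic model is a common convention in photocatalysis, not something stated in the abstract, and the sample values are illustrative:

```python
import numpy as np

def degradation_efficiency(c0: float, ct: float) -> float:
    """Percent degradation from initial and time-t concentrations,
    e.g. obtained from calibrated UV-Vis absorbance of levofloxacin."""
    return 100.0 * (c0 - ct) / c0

def pseudo_first_order_k(t_min, c):
    """Apparent rate constant k (min^-1) from ln(C0/C) = k*t,
    fitted by least squares over the sampled time points."""
    y = np.log(c[0] / np.asarray(c, dtype=float))
    k, _intercept = np.polyfit(np.asarray(t_min, dtype=float), y, 1)
    return k

print(degradation_efficiency(50.0, 3.5))   # ~93%, cf. ZnO under UV
print(pseudo_first_order_k([0, 30, 60, 90], [50.0, 30.0, 18.0, 10.8]))
```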

17 pages, 3642 KB  
Article
Spatiotemporal Analysis for Real-Time Non-Destructive Brix Estimation in Apples
by Ha-Na Kim, Myeong-Won Bae, Yong-Jin Cho and Dong-Hoon Lee
Agriculture 2026, 16(2), 172; https://doi.org/10.3390/agriculture16020172 - 9 Jan 2026
Abstract
Predicting internal quality parameters of apples, such as Brix and water content, is essential for quality control. Existing near-infrared (NIR) and hyperspectral imaging (HSI)-based techniques have limited applicability due to their dependence on equipment and environmental sensitivity. In this study, a transportable quality assessment system was proposed using spatiotemporal domain analysis with long-wave infrared (LWIR)-based thermal diffusion phenomics, enabling non-destructive prediction of the internal Brix of apples during transport. After cooling, the thermal gradient of the apple surface during the cooling-to-equilibrium interval was extracted. This gradient was used as an input variable for multiple linear regression, Ridge, and Lasso models, and the prediction performance was assessed. Overall, 492 specimens from five apple cultivars (Hongro, Arisoo, Sinano Gold, Stored Fuji, and Fuji) were included in the experiment. The thermal diffusion response of each specimen was imaged at a sampling frequency of 8.9 Hz using LWIR-based thermal imaging, and the temperature changes over time were compared. In cross-validation of the integrated model for all cultivars, the coefficient of determination (R2cv) was 0.80 and the RMSEcv was 0.86 °Brix, demonstrating stable prediction accuracy within ±1 °Brix. By cultivar, Arisoo (Cultivar 2) and Fuji (Cultivar 5) showed high prediction reliability (R2cv = 0.74–0.77), while Hongro (Cultivar 1) and Stored Fuji (Cultivar 4) showed relatively weak correlations. This is thought to be due to differences in thermal diffusion characteristics between cultivars, depending on their tissue density and water content. The LWIR-based thermal diffusion analysis presented in this study is less sensitive to changes in reflectance and illuminance than conventional NIR and visible light spectrophotometry, as it enables real-time measurements during transport without requiring a separate light source. Surface heat distribution phenomics due to external heat sources serves as an index that proximally reflects changes in the internal Brix of apples. In future work, this could be developed into a reliable commercial screening system by obtaining extensive data accounting for diversity between cultivars and by elucidating the effects of interference from external environmental factors.
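The regression step reduces to fitting the extracted cooling-curve features against reference Brix and reporting cross-validated R2 and RMSE. A minimal sketch with scikit-learn; the arrays below are random placeholders standing in for the thermal-gradient features and refractometer readings:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
X = rng.random((492, 20))          # thermal-gradient features per fruit
y = 10.0 + 4.0 * rng.random(492)   # reference Brix (placeholder values)

y_cv = cross_val_predict(Ridge(alpha=1.0), X, y, cv=10)
r2_cv = r2_score(y, y_cv)
rmse_cv = mean_squared_error(y, y_cv) ** 0.5   # in degrees Brix
print(f"R2cv = {r2_cv:.2f}, RMSEcv = {rmse_cv:.2f} °Brix")
```

Swapping Ridge for Lasso or plain LinearRegression reproduces the three model families compared in the paper.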

16 pages, 4121 KB  
Article
Uncovering Fishing Area Patterns Using Convolutional Autoencoder and Gaussian Mixture Model on VIIRS Nighttime Imagery
by Jeong Chang Seong, Jina Jang, Jiwon Yang, Seung Hee Choi and Chul Sue Hwang
ISPRS Int. J. Geo-Inf. 2026, 15(1), 25; https://doi.org/10.3390/ijgi15010025 - 5 Jan 2026
Abstract
The availability of nighttime satellite imagery provides unique opportunities for monitoring fishing activity in data-sparse ocean regions. This study leverages Visible Infrared Imaging Radiometer Suite (VIIRS) Day/Night Band monthly composite imagery to identify and classify recurring spatial patterns of fishing activity in the Korean Exclusive Economic Zone from 2014 to 2024. While prior research has primarily produced static hotspot maps, our approach advances geospatial fishing activity identification by employing machine learning techniques to group similar spatiotemporal configurations, thereby capturing recurring fishing patterns and their temporal variability. A convolutional autoencoder and a Gaussian Mixture Model (GMM) were used to cluster the VIIRS imagery. The results revealed seven major nighttime light hotspots and four cluster patterns: Cluster 0 dominated in December, January, and February; Cluster 1 in March, April, and May; Cluster 2 in July, August, and September; and Cluster 3 in October and November. Interannual variability was also identified: in particular, Clusters 0 and 3 expanded into later months in recent years (2022–2024), whereas Cluster 1 contracted. These findings align with environmental changes in the region, including rising ocean temperatures and declining primary productivity. By integrating autoencoders with probabilistic clustering, this research demonstrates a framework for uncovering recurrent fishing activity patterns and highlights the utility of satellite imagery with GeoAI in advancing marine fisheries monitoring.
(This article belongs to the Special Issue Spatial Data Science and Knowledge Discovery)
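The clustering stage amounts to fitting a Gaussian Mixture Model to the autoencoder's latent codes and assigning each monthly composite a cluster label. A minimal sketch with scikit-learn; the latent array is a random placeholder, and choosing the component count by BIC is an assumption rather than the paper's stated procedure:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
Z = rng.standard_normal((132, 64))   # latent codes: 11 years x 12 months

# Pick the component count by BIC, then label each monthly composite.
best_k = min(
    range(2, 8),
    key=lambda k: GaussianMixture(n_components=k, random_state=0).fit(Z).bic(Z),
)
labels = GaussianMixture(n_components=best_k, random_state=0).fit_predict(Z)
print(best_k, labels[:12])   # e.g. four seasonal clusters over one year
```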

16 pages, 8307 KB  
Article
Accurate Automatic Object Identification Under Complex Lighting Conditions via AI Vision on Enhanced Infrared Polarization Images
by Ruixin Jia, Hongming Fei, Han Lin, Yibiao Yang, Xin Liu, Mingda Zhang and Liantuan Xiao
Optics 2026, 7(1), 3; https://doi.org/10.3390/opt7010003 - 3 Jan 2026
Abstract
Object identification (OI) is widely used in fields like autonomous driving, security, robotics, environmental monitoring, and medical diagnostics. OI using infrared (IR) images provides high visibility in low light for all-day operation compared to visible light. However, low contrast often causes OI failure in complex scenes with similar target and background temperatures. There is therefore a stringent requirement to enhance IR image contrast for accurate OI, and it is ideal to develop a fully automatic process for identifying objects in IR images under any lighting condition, especially in photon-deficient conditions. Here, we demonstrate for the first time a highly accurate automatic IR OI process based on the combination of polarization IR imaging and artificial intelligence (AI) vision (YOLOv7), which can quickly identify objects with a high discrimination confidence level (DCL, up to 0.96). In addition, we demonstrate that accurate IR OI is possible with a high DCL in complex environments, such as photon-deficient and foggy conditions and objects behind opaque covers. Finally, by retraining the model, the approach can be extended to other object classes; in this paper, we use a UAV as an example in our experiments, further expanding the capabilities of the method. Our method therefore enables broad OI applications with high all-day performance.
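Polarization IR imaging typically derives its contrast-enhanced input from the linear Stokes parameters estimated from four polarizer-angle frames; the degree of linear polarization (DoLP) separates targets from backgrounds of similar temperature. A sketch of the standard computation (the paper's exact enhancement pipeline may differ):

```python
import numpy as np

def dolp_image(i0, i45, i90, i135):
    """Degree of linear polarisation from four polarizer-angle frames.

    Standard Stokes estimates: S0 = (I0 + I45 + I90 + I135) / 2,
    S1 = I0 - I90, S2 = I45 - I135; DoLP = sqrt(S1^2 + S2^2) / S0.
    Inputs are co-registered float arrays of equal shape.
    """
    s0 = 0.5 * (i0 + i45 + i90 + i135)
    s1 = i0 - i90
    s2 = i45 - i135
    return np.sqrt(s1 ** 2 + s2 ** 2) / np.maximum(s0, 1e-6)
```

The DoLP map (or a fusion of it with the intensity image) is then fed to the detector in place of the raw low-contrast IR frame.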

15 pages, 3046 KB  
Article
Maritime Small Target Image Detection Algorithm Based on Improved YOLOv11n
by Zhaohua Liu, Yanli Sun, Pengfei He, Ningbo Liu and Zhongxun Wang
Sensors 2026, 26(1), 163; https://doi.org/10.3390/s26010163 - 26 Dec 2025
Abstract
To address the challenges posed by small ships (such as small patrol boats) in complex open-sea backgrounds, including small target size, insufficient feature information, and high missed detection rates, this paper proposes a maritime small target image detection algorithm based on an improved YOLOv11n. Firstly, the BIE module is introduced into the neck feature fusion stage of YOLOv11n. Utilizing its dual-branch information interaction design, independent branches for key features of maritime small targets in infrared and visible light images are constructed, enabling the progressive fusion of infrared and visible light target features. Secondly, RepViTBlock is incorporated into the backbone network and combined with the C3k2 module of YOLOv11n to form C3k2-RepViTBlock. Through its lightweight attention mechanism and multi-branch convolution structure, this addresses the C3k2 module's insufficient capture of tiny target features and enhances the model's ability to extract local features of maritime small targets. Finally, the ConvAttn module is embedded at the end of the backbone network. With its dynamic small-kernel convolution, it adaptively extracts the contour features of small targets, keeping the overall model lightweight while reducing the missed detection rate for maritime small targets. Experiments on a collected infrared and visible light ship image dataset (IVships) and a public dataset (SeaShips) show that, while adding only a small number of parameters, the improved algorithm increases mAP@0.5 by 1.9% and 1.7% and average precision by 2.2% and 2.4%, respectively, compared with the original model, significantly improving its small target detection capability.
(This article belongs to the Section Remote Sensors)

26 pages, 8829 KB  
Article
YOLO-MSLT: A Multimodal Fusion Network Based on Spatial Linear Transformer for Cattle and Sheep Detection in Challenging Environments
by Yixing Bai, Yongquan Li, Ruoyu Di, Jingye Liu, Xiaole Wang, Chengkai Li and Pan Gao
Agriculture 2026, 16(1), 35; https://doi.org/10.3390/agriculture16010035 - 23 Dec 2025
Abstract
Accurate detection of cattle and sheep is a core task in precision livestock farming. However, the complexity of agricultural settings, where visible light images perform poorly under low-light or occluded conditions and infrared images are limited in resolution, poses significant challenges for current smart monitoring systems. To tackle these challenges, this study aims to develop a robust multimodal fusion detection network for the accurate and reliable detection of cattle and sheep in complex scenes. To achieve this, we propose YOLO-MSLT, a multimodal fusion detection network based on YOLOv10, which leverages the complementary nature of visible light and infrared data. The core of YOLO-MSLT incorporates a Cross Flatten Fusion Transformer (CFFT), composed of the Linear Cross-modal Spatial Transformer (LCST) and Deep-wise Enhancement (DWE), designed to enhance modality collaboration by performing complementary fusion at the feature level. Furthermore, a Content-Guided Attention Feature Pyramid Network (CGA-FPN) is integrated into the neck to improve the representation of multi-scale object features. Validation was conducted on a cattle and sheep dataset built from 5056 pairs of multimodal images (visible light and infrared) collected in the Manas River Basin, Xinjiang. Results demonstrate that YOLO-MSLT performs robustly in complex terrain, low-light, and occlusion scenarios, achieving an mAP@0.5 of 91.8% and a precision of 93.2%, significantly outperforming mainstream detection models. This research provides an impactful and practical solution for cattle and sheep detection in challenging agricultural environments.
(This article belongs to the Section Farm Animal Production)
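Feature-level complementary fusion of the kind CFFT performs can be illustrated with a generic cross-modal attention block, in which visible-light tokens query infrared tokens and the attended result is fused residually. A PyTorch sketch under that assumption; it is an illustrative stand-in, not the paper's LCST/DWE design:

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """Visible features query infrared features; the attended IR
    information is added back to the visible stream residually."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.proj = nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, dim))

    def forward(self, vis: torch.Tensor, ir: torch.Tensor) -> torch.Tensor:
        # vis, ir: (B, C, H, W) feature maps from the two backbones.
        b, c, h, w = vis.shape
        q = vis.flatten(2).transpose(1, 2)    # (B, HW, C) visible tokens
        kv = ir.flatten(2).transpose(1, 2)    # (B, HW, C) infrared tokens
        fused, _ = self.attn(q, kv, kv)       # IR complements VIS
        fused = q + self.proj(fused)          # residual fusion
        return fused.transpose(1, 2).reshape(b, c, h, w)
```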

14 pages, 3389 KB  
Article
A Cascaded Enhancement-Fusion Network for Visible-Infrared Imaging in Darkness
by Hanchang Huang, Hao Liu, Hailu Wang, Yunzhuo Yang, Chuan Guo, Minsun Chen and Kai Han
Photonics 2025, 12(12), 1231; https://doi.org/10.3390/photonics12121231 - 15 Dec 2025
Abstract
This paper presents a cascaded imaging method that combines low-light enhancement and visible–long-wavelength infrared (VIS-LWIR) image fusion to mitigate image degradation in dark environments. The framework incorporates a Low-Light Enhancer Network (LLENet) for improving visible image illumination and a heterogeneous information fusion subnetwork (IXNet) for integrating features from enhanced VIS and LWIR images. Using a joint training strategy with a customized loss function, the approach effectively preserves salient targets and texture details. Experimental results on the LLVIP, M3FD, TNO, and MSRS datasets demonstrate that the method produces high-quality fused images with superior performance on quantitative metrics. It also exhibits excellent generalization ability, maintains a compact model size with low computational complexity, and significantly enhances performance in high-level visual tasks like object detection, particularly in challenging low-light scenarios.
(This article belongs to the Special Issue Technologies and Applications of Optical Imaging)
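Joint losses for enhancement-plus-fusion cascades commonly combine a pixel-intensity term with a gradient (texture) term. The abstract does not spell out the customized loss, so the sketch below is a conventional formulation under that assumption, not the paper's exact function:

```python
import torch
import torch.nn.functional as F

def fusion_loss(fused, vis_enh, ir, w_grad: float = 10.0):
    """Keep the brighter of the two inputs per pixel and preserve the
    stronger horizontal gradient. fused, vis_enh, ir: (B, 1, H, W)
    single-channel tensors in [0, 1]."""
    l_int = F.l1_loss(fused, torch.maximum(vis_enh, ir))
    kx = torch.tensor([[[[-1.0, 0.0, 1.0]]]], device=fused.device)
    g_f, g_v, g_i = (torch.abs(F.conv2d(t, kx, padding=(0, 1)))
                     for t in (fused, vis_enh, ir))
    l_grad = F.l1_loss(g_f, torch.maximum(g_v, g_i))
    return l_int + w_grad * l_grad
```

In a joint training setup, a term of this form is summed with the low-light enhancement loss so both subnetworks are optimised together.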

18 pages, 1457 KB  
Article
Research on Multi-Modal Fusion Detection Method for Low-Slow-Small UAVs Based on Deep Learning
by Zhengtang Liu, Yongjie Zou, Zhenzhen Hu, Han Xue, Meng Li and Bin Rao
Drones 2025, 9(12), 852; https://doi.org/10.3390/drones9120852 - 11 Dec 2025
Abstract
To address the technical challenges in detecting Low-Slow-Small Unmanned Aerial Vehicle (LSS-UAV) cluster targets, such as weak target signals coupled with strong interference from complex environments, this paper proposes a visible-infrared multi-modal fusion detection method based on deep learning. The method uses deep learning techniques to separately identify morphological features in visible light images and thermal radiation features in infrared images. A hierarchical multi-modal fusion framework integrating feature-level and decision-level fusion is designed, incorporating an Environment-Aware Dynamic Weighting (EADW) mechanism and Dempster-Shafer (D-S) evidence theory. This framework leverages the complementary advantages of feature-level and decision-level fusion, effectively enhancing the detection and recognition capability, as well as the system robustness, for LSS-UAV cluster targets in complex environments. Experimental results demonstrate that the proposed method achieves a detection accuracy of 93.5% for LSS-UAV clusters in complex urban environments, an average improvement of 18.7% over single-modal methods, while the false alarm rate is reduced to 4.2%. Furthermore, the method demonstrates strong environmental adaptability, maintaining high performance under challenging conditions such as nighttime and haze. This method provides an efficient and reliable technical solution for LSS-UAV cluster target detection.
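Decision-level fusion with D-S evidence theory combines the per-modality mass functions through Dempster's rule, renormalising away conflicting mass. A self-contained sketch with hypothetical class labels and belief values:

```python
def dempster_combine(m1: dict, m2: dict) -> dict:
    """Dempster's rule of combination for two mass functions whose
    focal elements are frozensets of hypotheses. Mass assigned to
    empty intersections (conflict) is removed by renormalisation."""
    combined, conflict = {}, 0.0
    for a, ma in m1.items():
        for b, mb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + ma * mb
            else:
                conflict += ma * mb
    k = 1.0 - conflict
    return {s: v / k for s, v in combined.items()}

# e.g. visible-branch and infrared-branch beliefs about one track:
m_vis = {frozenset({"uav"}): 0.7, frozenset({"uav", "bird"}): 0.3}
m_ir = {frozenset({"uav"}): 0.6, frozenset({"bird"}): 0.4}
print(dempster_combine(m_vis, m_ir))   # mass on 'uav' rises to ~0.83
```

An environment-aware weighting such as EADW would additionally discount each branch's masses before combination, e.g. down-weighting the visible branch at night.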

23 pages, 7617 KB  
Article
A Dual-Modal Adaptive Pyramid Transformer Algorithm for UAV Cross-Modal Object Detection
by Qiqin Li, Ming Yang, Xiaoqiang Zhang, Nannan Wang, Xiaoguang Tu, Xijun Liu and Xinyu Zhu
Sensors 2025, 25(24), 7541; https://doi.org/10.3390/s25247541 - 11 Dec 2025
Abstract
Unmanned Aerial Vehicles (UAVs) play vital roles in traffic surveillance, disaster management, and border security, highlighting the importance of reliable infrared–visible image detection under complex illumination conditions. However, UAV-based infrared–visible detection still faces challenges in multi-scale target recognition, robustness to lighting variations, and efficient cross-modal information utilization. To address these issues, this study proposes a lightweight Dual-modality Adaptive Pyramid Transformer (DAP) module integrated into the YOLOv8 framework. The DAP module employs a hierarchical self-attention mechanism and a residual fusion structure to achieve adaptive multi-scale representation and cross-modal semantic alignment while preserving modality-specific features. This design enables effective feature fusion with reduced computational cost, enhancing detection accuracy in complex environments. Experiments on the DroneVehicle and LLVIP datasets demonstrate that the proposed DAP-based YOLOv8 achieves mAP50:95 scores of 61.2% and 62.1%, respectively, outperforming conventional methods. The results validate the capability of the DAP module to optimize cross-modal feature interaction and improve UAV real-time infrared–visible target detection, offering a practical and efficient solution for UAV applications such as traffic monitoring and disaster response.
(This article belongs to the Section Remote Sensors)

29 pages, 6232 KB  
Article
Research on Multi-Temporal Infrared Image Generation Based on Improved CLE Diffusion
by Hua Gong, Wenfei Gao, Fang Liu and Yuanjing Ma
Computers 2025, 14(12), 548; https://doi.org/10.3390/computers14120548 - 11 Dec 2025
Abstract
To address the problems of dynamic brightness imbalance in image sequences and blurred object edges in multi-temporal infrared image generation, we propose an improved multi-temporal infrared image generation model based on CLE Diffusion. First, the model adopts CLE Diffusion to capture the dynamic evolution patterns of image sequences. By modeling brightness variation through the noise evolution of the diffusion process, it enables controllable generation across multiple time points. Second, we design a periodic time encoding strategy and a feature linear modulator, and build a temporal control module. Through channel-level modulation, this module jointly models temporal information and brightness features to improve the model's temporal representation capability. Finally, to tackle structural distortion and edge blurring in infrared images, we design a multi-scale edge pyramid strategy and build a structure consistency module based on attention mechanisms. This module jointly computes multi-scale edge and structural features to enforce edge enhancement and structural consistency. Extensive experiments on both public visible-light and self-constructed infrared multi-temporal datasets demonstrate our model's state-of-the-art (SOTA) performance. It generates high-quality images across all time points, achieving superior results on the PSNR, SSIM, and LPIPS metrics, and the generated images have clear edges and structural consistency.
(This article belongs to the Special Issue Advanced Image Processing and Computer Vision (2nd Edition))
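A periodic time encoding for diurnal imagery can be built from sin/cos harmonics of a 24-hour cycle, so that times near midnight map to nearby codes. A sketch under that assumption; the paper's exact encoding and dimensionality are not given in the abstract:

```python
import torch

def periodic_time_encoding(t_hours: torch.Tensor, dim: int = 64,
                           period: float = 24.0) -> torch.Tensor:
    """Encode times of day as sin/cos pairs at harmonics of one period,
    so 23:00 and 01:00 receive nearby codes. t_hours: (N,) tensor."""
    k = torch.arange(dim // 2, dtype=torch.float32) + 1.0
    phase = 2.0 * torch.pi * t_hours[:, None] * k[None, :] / period
    return torch.cat([torch.sin(phase), torch.cos(phase)], dim=-1)

codes = periodic_time_encoding(torch.tensor([0.0, 6.0, 12.0, 18.0]))
# codes: (4, 64); fed to a modulator that scales/shifts feature channels
```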

18 pages, 13145 KB  
Article
CDFFusion: A Color-Deviation-Free Fusion Network for Nighttime Infrared and Visible Images
by Hao Chen, Tinghua Zhang, Shijie Zhai, Xiaoyun Tong and Rui Zhu
Sensors 2025, 25(23), 7337; https://doi.org/10.3390/s25237337 - 2 Dec 2025
Abstract
The purpose of infrared and visible image fusion is to integrate their complementary information into a single image, thereby increasing the information it conveys. However, previous methods often struggle to extract information hidden in darkness, and existing methods that integrate brightness enhancement and image fusion can cause overexposure, image blocking effects, and color deviation. Therefore, we propose a visible light and infrared image fusion method, CDFFusion, for low-light scenarios. The core idea is to use Retinex theory to decompose the illumination and reflection components of visible light images at the feature level, and then fuse and decode the reflection features with infrared features to obtain the Y component of the fused image. Next, the proposed color mapping formula is used to adjust the Cb and Cr components of the original visible light image; finally, these are concatenated with the Y component of the fused image to obtain the final fused image. The SF, CC, Nabf, Qabf, SCD, MS-SSIM, and ΔE indicators of this method reached 17.6531, 0.6619, 0.1075, 0.4279, 1.2760, 0.8335, and 0.0706, respectively, on the LLVIP dataset. The experimental results show that this method can effectively alleviate visual overexposure and image blocking effects, and it has the smallest color deviation.
(This article belongs to the Section Sensing and Imaging)
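The final recombination step, fused luma plus chroma from the visible image, is easy to illustrate with OpenCV. A minimal sketch; the paper additionally remaps Cb and Cr with its own color mapping formula, which is replaced here by a plain copy as a simplifying assumption:

```python
import cv2
import numpy as np

def recombine_luma_chroma(fused_y: np.ndarray, vis_bgr: np.ndarray) -> np.ndarray:
    """Replace the visible image's luma with the fused Y channel and
    keep its chroma. fused_y: (H, W) uint8; vis_bgr: (H, W, 3) uint8.
    Note that OpenCV stores the channels in Y, Cr, Cb order."""
    ycrcb = cv2.cvtColor(vis_bgr, cv2.COLOR_BGR2YCrCb)
    ycrcb[..., 0] = fused_y                     # swap in the fused luma
    return cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2BGR)
```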

20 pages, 5765 KB  
Article
Infrared–Visible Fusion via Cross-Modality Attention and Small-Object Enhancement for Pedestrian Detection
by Jie Yang, Yanxuan Jiang, Dengyin Jiang and Zhichao Chen
ISPRS Int. J. Geo-Inf. 2025, 14(12), 477; https://doi.org/10.3390/ijgi14120477 - 2 Dec 2025
Abstract
Pedestrian detection under low illumination and complex environments remains a significant challenge for vision-based systems, particularly in safety-critical applications such as urban rail transit. To address the limitations of single-modality detection in adverse conditions, this paper proposes IVIFusion, a lightweight yet robust pedestrian detection framework that fuses infrared and visible images at the feature level. The method integrates a dual-branch Transformer-based backbone for modality-specific feature extraction and introduces a Cross-Modality Attention Fusion Module (CMAFM) to adaptively enhance cross-modal representations while suppressing noise. Furthermore, a dedicated small-object detection layer is incorporated to improve the recall of distant and occluded pedestrians. Extensive experiments conducted on the public LLVIP dataset and the custom HGPD dataset demonstrate the superior performance of IVIFusion, achieving mAP0.5 scores of 98.6% and 97.2%, respectively. The results validate the effectiveness of the proposed architecture in handling challenging lighting conditions while maintaining real-time efficiency and low computational cost.
