Search Results (102)

Search Parameters:
Keywords = night scene

20 pages, 3147 KiB  
Article
Crossed Wavelet Convolution Network for Few-Shot Defect Detection of Industrial Chips
by Zonghai Sun, Yiyu Lin, Yan Li and Zihan Lin
Sensors 2025, 25(14), 4377; https://doi.org/10.3390/s25144377 - 13 Jul 2025
Viewed by 359
Abstract
In resistive polymer humidity sensors, the quality of the resistor chips directly affects sensor performance. Detecting chip defects remains challenging due to the scarcity of defective samples, which limits traditional supervised-learning methods that require abundant labeled data. While few-shot learning (FSL) shows promise for industrial defect detection, existing approaches struggle with mixed-scene conditions (e.g., daytime and nighttime scenes). In this work, we propose a crossed wavelet convolution network (CWCN), comprising a dual-pipeline crossed wavelet convolution training framework (DPCWC) and a loss calculation module named ProSL. Our method applies wavelet transform convolution and prototype learning to industrial defect detection, effectively fusing feature information from multiple scenarios and improving detection performance. Experiments across various few-shot tasks on chip datasets demonstrate the superior detection quality of CWCN, with an improvement in mAP ranging from 2.76% to 16.43% over other FSL methods. Experiments on the open-source NEU-DET dataset further validate the proposed method. Full article
(This article belongs to the Section Sensing and Imaging)
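The abstract mentions prototype learning but gives no implementation detail. As a rough illustration of the general idea only (not CWCN's actual ProSL module), the sketch below classifies query embeddings by their distance to class prototypes computed from a handful of labelled support samples; the array shapes and two-class toy data are invented for the example.

```python
import numpy as np

def class_prototypes(support_feats, support_labels):
    """Mean embedding per class from the few labelled (support) samples."""
    classes = np.unique(support_labels)
    protos = np.stack(
        [support_feats[support_labels == c].mean(axis=0) for c in classes]
    )
    return classes, protos

def classify_by_prototype(query_feats, prototypes):
    """Assign each query embedding to the nearest class prototype (Euclidean)."""
    d = np.linalg.norm(query_feats[:, None, :] - prototypes[None, :, :], axis=-1)
    return d.argmin(axis=1)

# Toy usage: 2 classes ("normal", "defect"), 5-shot support set, 3 query samples.
rng = np.random.default_rng(0)
support = np.concatenate([rng.normal(0, 1, (5, 16)), rng.normal(3, 1, (5, 16))])
labels = np.array([0] * 5 + [1] * 5)
classes, protos = class_prototypes(support, labels)
queries = rng.normal(3, 1, (3, 16))
print(classes[classify_by_prototype(queries, protos)])  # expected: mostly class 1
```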

26 pages, 6668 KiB  
Article
Dark Ship Detection via Optical and SAR Collaboration: An Improved Multi-Feature Association Method Between Remote Sensing Images and AIS Data
by Fan Li, Kun Yu, Chao Yuan, Yichen Tian, Guang Yang, Kai Yin and Youguang Li
Remote Sens. 2025, 17(13), 2201; https://doi.org/10.3390/rs17132201 - 26 Jun 2025
Viewed by 612
Abstract
Dark ships, vessels that deliberately disable their AIS signals, constitute a grave maritime safety hazard, and their detection is hindered by over-reliance on AIS, inadequate surveillance coverage, and significant mismatch rates. This paper proposes an improved multi-feature association method that integrates satellite remote sensing and AIS data, with a focus on oriented bounding box course estimation, to improve the detection of dark ships and enhance maritime surveillance. First, an oriented bounding box object detection model (YOLOv11n-OBB) is trained to overcome the orientation limitations of horizontal bounding boxes. Second, by integrating position, dimensions (length and width), and course characteristics, we devise a joint cost function that evaluates the combined contribution of multiple features. An advanced JVC global optimization algorithm is then employed to ensure high-precision association in dense scenes. Finally, by combining data from the Gaofen-6 (optical) and Gaofen-3B (SAR) satellites, a day-and-night collaborative monitoring framework is constructed to address the blind spots of single-sensor monitoring at night or in adverse weather. The detection model achieves a high average precision (AP50) of 0.986 on the optical dataset and 0.903 on the SAR dataset. The association accuracy of the multi-feature association algorithm is 91.74% for optical-image/AIS matching and 91.33% for SAR-image/AIS matching, with association rates of 96.03% (optical) and 74.24% (SAR). This study provides an efficient technical tool for maritime safety regulation through multi-source data fusion and algorithmic innovation. Full article
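To make the association step more concrete, here is a minimal sketch of a joint cost built from position, dimension, and course disagreement, solved with SciPy's linear_sum_assignment as a stand-in for the JVC solver the paper uses. The weights, normalizing constants, and toy detections/AIS records are invented; the paper's actual cost function is not reproduced here.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def course_diff(a_deg, b_deg):
    """Smallest absolute difference between two courses, in degrees."""
    d = np.abs(a_deg - b_deg) % 360.0
    return np.minimum(d, 360.0 - d)

def joint_cost(det, ais, w_pos=1.0, w_dim=0.5, w_course=0.5):
    """Weighted cost combining position, dimension, and course disagreement."""
    pos = np.hypot(det["x"] - ais["x"], det["y"] - ais["y"]) / 500.0   # metres, scaled
    dim = (abs(det["length"] - ais["length"]) + abs(det["width"] - ais["width"])) / 50.0
    crs = course_diff(det["course"], ais["course"]) / 180.0
    return w_pos * pos + w_dim * dim + w_course * crs

# Toy data: two image detections and two AIS tracks.
dets = [dict(x=120, y=40, length=180, width=30, course=92),
        dict(x=900, y=310, length=60, width=12, course=10)]
ais  = [dict(x=910, y=300, length=65, width=11, course=14),
        dict(x=130, y=35, length=175, width=28, course=95)]

cost = np.array([[joint_cost(d, a) for a in ais] for d in dets])
rows, cols = linear_sum_assignment(cost)   # globally optimal one-to-one matching
print(list(zip(rows, cols)))               # detection i matched to AIS track j
```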

17 pages, 2200 KiB  
Article
Visual Place Recognition Based on Dynamic Difference and Dual-Path Feature Enhancement
by Guogang Wang, Yizhen Lv, Lijie Zhao and Yunpeng Liu
Sensors 2025, 25(13), 3947; https://doi.org/10.3390/s25133947 - 25 Jun 2025
Viewed by 390
Abstract
To address appearance drift and susceptibility to noise interference in visual place recognition (VPR), we propose DD–DPFE, a Dynamic Difference and Dual-Path Feature Enhancement method. Differential attention mechanisms are embedded in the DINOv2 model to mitigate the effects of interference, and serial-parallel adapters are added to allow efficient model parameter migration and task adaptation. Our method constructs a dual-path feature enhancement module in which global and local branches work in synergy. The global branch employs a dynamic fusion mechanism with a multi-layer Transformer encoder to strengthen the structured spatial representation and cope with appearance changes, while the local branch suppresses over-responses to redundant noise through an adaptive weighting mechanism and fuses contextual information from a multi-scale feature aggregation module to enhance scene robustness. Experimental results show that the proposed architecture yields clear improvements across different environmental tests, most notably in the simulated night scene, verifying that the method effectively enhances the discriminative power and anti-interference ability of the system in complex scenes. Full article
(This article belongs to the Section Electronic Sensors)

18 pages, 4774 KiB  
Article
InfraredStereo3D: Breaking Night Vision Limits with Perspective Projection Positional Encoding and Groundbreaking Infrared Dataset
by Yuandong Niu, Limin Liu, Fuyu Huang, Juntao Ma, Chaowen Zheng, Yunfeng Jiang, Ting An, Zhongchen Zhao and Shuangyou Chen
Remote Sens. 2025, 17(12), 2035; https://doi.org/10.3390/rs17122035 - 13 Jun 2025
Viewed by 459
Abstract
In fields such as military reconnaissance, forest fire prevention, and autonomous driving at night, there is an urgent need for high-precision three-dimensional reconstruction in low-light or night environments. Acquiring remote sensing data with RGB cameras relies on external light, so image quality declines significantly and often cannot meet task requirements. Lidar-based methods perform poorly in rain and fog, in close-range scenes, and in scenarios requiring thermal imaging data. In contrast, infrared cameras can overcome these challenges because their imaging mechanism differs from those of RGB cameras and lidar. However, research on three-dimensional scene reconstruction from infrared images is relatively immature, especially in infrared binocular stereo matching, where two main challenges remain: first, there is no dataset dedicated to infrared binocular stereo matching; second, the lack of texture information in infrared images limits the extension of RGB-based methods to the infrared reconstruction problem. To address these problems, this study first constructs an infrared binocular stereo matching dataset and then proposes an innovative transformer method based on perspective projection positional encoding to perform the infrared binocular stereo matching task. A stereo matching network combining a transformer with a cost volume is constructed. Existing work on transformer positional encoding usually adopts a parallel projection model to simplify computation; our method is instead based on the actual perspective projection model, so that each pixel is associated with a different projection ray. This effectively resolves the feature extraction and matching difficulties caused by insufficient texture in infrared images and significantly improves matching accuracy. Experiments on the infrared binocular stereo matching dataset proposed in this paper demonstrate the effectiveness of the method. Full article
(This article belongs to the Collection Visible Infrared Imaging Radiometers and Applications)
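The key idea of perspective projection positional encoding, as the abstract describes it, is that each pixel carries its own projection ray rather than a shared parallel direction. The sketch below shows one way to compute per-pixel unit rays from hypothetical pinhole intrinsics; it is not the paper's actual encoding network.

```python
import numpy as np

def pixel_rays(K, height, width):
    """Unit viewing ray for every pixel under a perspective (pinhole) model.

    Unlike a parallel-projection assumption, each pixel gets its own direction
    K^-1 [u, v, 1]^T, which could then feed a transformer positional encoding.
    """
    u, v = np.meshgrid(np.arange(width), np.arange(height))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # 3 x HW
    rays = np.linalg.inv(K) @ pix
    rays /= np.linalg.norm(rays, axis=0, keepdims=True)
    return rays.T.reshape(height, width, 3)

K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 256.0],
              [0.0, 0.0, 1.0]])      # hypothetical intrinsics
rays = pixel_rays(K, 512, 640)
print(rays[0, 0], rays[256, 320])    # corner ray vs. near-principal-axis ray
```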

20 pages, 24073 KiB  
Article
Comparison of Directional and Diffused Lighting for Pixel-Level Segmentation of Concrete Cracks
by Hamish Dow, Marcus Perry, Jack McAlorum and Sanjeetha Pennada
Infrastructures 2025, 10(6), 129; https://doi.org/10.3390/infrastructures10060129 - 25 May 2025
Viewed by 452
Abstract
Visual inspections of concrete infrastructure in low-light environments require external lighting to ensure adequate visibility. Directional lighting sources, where an image scene is illuminated with an angled lighting source from one direction, can enhance the visibility of surface defects in an image. This paper compares directional and diffused scene illumination images for pixel-level concrete crack segmentation. A novel directional lighting image segmentation algorithm is proposed, which applies crack segmentation image processing techniques to each directionally lit image before combining all images into a single output, highlighting the extremities of the defect. This method was benchmarked against two diffused lighting crack detection techniques across a dataset with crack widths typically ranging from 0.07 mm to 0.4 mm. When tested on cracked and uncracked data, the directional lighting method significantly outperformed other benchmarked diffused lighting methods, attaining a 10% higher true-positive rate (TPR), 12% higher intersection over union (IoU), and 10% higher F1 score with minimal impact on precision. Further testing on only cracked data revealed that directional lighting was superior across all crack widths in the dataset. This research shows that directional lighting can enhance pixel-level crack segmentation in infrastructure requiring external illumination, such as low-light indoor spaces (e.g., tunnels and containment structures) or night-time outdoor inspections (e.g., pavement and bridges). Full article
(This article belongs to the Section Infrastructures Inspection and Maintenance)
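As a rough sketch of the combination step described above (per-direction segmentation followed by merging into a single output), the code below applies a simple thresholding-based crack mask to each directionally lit image and takes the union of the masks. The thresholding parameters and file names are placeholders, not the paper's algorithm.

```python
import cv2
import numpy as np

def crack_mask(gray):
    """Very rough single-image crack segmentation: adaptive threshold + cleanup."""
    th = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                               cv2.THRESH_BINARY_INV, 31, 10)
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
    return cv2.morphologyEx(th, cv2.MORPH_OPEN, kernel)

def combine_directional(images):
    """Union of per-direction masks: a crack visible under any lighting
    direction ends up in the combined output."""
    masks = [crack_mask(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)) for img in images]
    return np.bitwise_or.reduce(masks)

# Usage (paths are placeholders): four images of the same surface,
# each illuminated from a different direction.
# imgs = [cv2.imread(f"dir_{i}.png") for i in range(4)]
# combined = combine_directional(imgs)
```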

14 pages, 210 KiB  
Article
No Small Parts (Only Speechless Women)
by Paige Martin Reynolds
Humanities 2025, 14(5), 111; https://doi.org/10.3390/h14050111 - 20 May 2025
Viewed by 346
Abstract
When it comes to acting in modern productions of Shakespeare’s plays, size is more than all talk. That is, though how much a character speaks often serves as the measure of a role’s size, “small parts” may have a lot to say—and, as it turns out, the actors playing them may have a lot (or too little) to do. Some modern approaches to dramaturgy and practice may mean that the performers playing roles not qualified as large are susceptible to isolation throughout the artistic process, possibly having reduced rehearsal time. If the number of spoken lines influences the number of rehearsal hours, an actor playing a “small part” may be at a disadvantage when it comes to opportunities for character development and the benefits of creative collaboration. (In a rehearsal process for A Midsummer Night’s Dream, for example, how active might Hippolyta’s participation be if she is not doubling as Titania?) Additionally, having fewer lines on the stage can mean inheriting more labor behind the scenes, since an available body is a valuable commodity in the economy of production (what tasks might Ursula undertake during Much Ado About Nothing?). The tension between “playing conditions” and “working conditions” in the theater is thus especially heightened for Shakespeare’s women, whose onstage existence can throw an uncanny shadow upon the offstage experiences of those who play them. Full article
22 pages, 9648 KiB  
Article
Three-Dimensional Real-Scene-Enhanced GNSS/Intelligent Vision Surface Deformation Monitoring System
by Yuanrong He, Weijie Yang, Qun Su, Qiuhua He, Hongxin Li, Shuhang Lin and Shaochang Zhu
Appl. Sci. 2025, 15(9), 4983; https://doi.org/10.3390/app15094983 - 30 Apr 2025
Viewed by 664
Abstract
With the acceleration of urbanization, surface deformation monitoring has become crucial. Existing monitoring systems face several challenges, such as data singularity, the poor nighttime monitoring quality of video surveillance, and fragmented visual data. To address these issues, this paper presents a 3D real-scene (3DRS)-enhanced GNSS/intelligent vision surface deformation monitoring system. The system integrates GNSS monitoring terminals and multi-source meteorological sensors to accurately capture minute displacements at monitoring points and multi-source Internet of Things (IoT) data, which are then automatically stored in MySQL databases. To enhance the functionality of the system, the visual sensor data are fused with 3D models through streaming media technology, enabling 3D real-scene augmented reality to support dynamic deformation monitoring and visual analysis. WebSocket-based remote lighting control is implemented to enhance the quality of video data at night. The spatiotemporal fusion of UAV aerial data with 3D models is achieved through Blender image-based rendering, while edge detection is employed to extract crack parameters from intelligent inspection vehicle data. The 3DRS model is constructed through UAV oblique photography, 3D laser scanning, and the combined use of SVSGeoModeler and SketchUp. A visualization platform for surface deformation monitoring is built on the 3DRS foundation, adopting an “edge collection–cloud fusion–terminal interaction” approach. This platform dynamically superimposes GNSS and multi-source IoT monitoring data onto the 3D spatial base, enabling spatiotemporal correlation analysis of millimeter-level displacements and early risk warning. Full article

22 pages, 7233 KiB  
Article
R-SABMNet: A YOLOv8-Based Model for Oriented SAR Ship Detection with Spatial Adaptive Aggregation
by Xiaoting Li, Wei Duan, Xikai Fu and Xiaolei Lv
Remote Sens. 2025, 17(3), 551; https://doi.org/10.3390/rs17030551 - 6 Feb 2025
Cited by 4 | Viewed by 1235
Abstract
Synthetic Aperture Radar (SAR) is widely used for ship detection because of its robust performance under various weather conditions and its ability to operate effectively both day and night. However, ships in SAR images exhibit complex land scattering interference, variable scales, and dense spatial arrangements, and existing algorithms do not address these challenges effectively. To enhance detection accuracy, this paper proposes the Rotated model with Spatial Aggregation and a Balanced-Shifted Mechanism (R-SABMNet), built upon YOLOv8. First, we introduce the Spatial-Guided Adaptive Feature Aggregation (SG-AFA) module, which enhances sensitivity to ship features while suppressing land scattering interference. We then propose the Balanced Shifted Multi-Scale Fusion (BSMF) module, which enhances local detail information and improves adaptability to multi-scale targets. Finally, we introduce the Gaussian Wasserstein Distance Loss (GWD), which addresses localization errors arising from angle and scale inconsistencies in dense scenes. R-SABMNet outperforms other deep learning-based methods on the SSDD+ and HRSID datasets, achieving a detection accuracy of 96.32%, a recall of 93.13%, and an average accuracy of 95.28% on SSDD+. Full article
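For readers unfamiliar with the Gaussian Wasserstein distance used here as a localization loss, the sketch below models each oriented box as a 2D Gaussian and evaluates the squared 2-Wasserstein distance between the two Gaussians. GWD-based losses typically wrap this distance in an additional nonlinear transform, which is omitted; the box values are toy numbers.

```python
import numpy as np
from scipy.linalg import sqrtm

def obb_to_gaussian(cx, cy, w, h, theta):
    """Model an oriented box as a 2D Gaussian: mean = centre,
    covariance = rotated diagonal of (w/2)^2 and (h/2)^2."""
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    S = np.diag([(w / 2.0) ** 2, (h / 2.0) ** 2])
    return np.array([cx, cy]), R @ S @ R.T

def gwd2(box_a, box_b):
    """Squared 2-Wasserstein distance between the two box Gaussians."""
    m1, S1 = obb_to_gaussian(*box_a)
    m2, S2 = obb_to_gaussian(*box_b)
    s1h = sqrtm(S1)
    cross = np.real(sqrtm(s1h @ S2 @ s1h))
    return float(np.sum((m1 - m2) ** 2) + np.trace(S1 + S2 - 2.0 * cross))

# Two nearly identical oriented boxes (cx, cy, w, h, theta) -> small distance.
print(gwd2((50, 50, 40, 10, 0.3), (52, 49, 38, 11, 0.35)))
```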

22 pages, 25824 KiB  
Article
NoctuDroneNet: Real-Time Semantic Segmentation of Nighttime UAV Imagery in Complex Environments
by Ruokun Qu, Jintao Tan, Yelu Liu, Chenglong Li and Hui Jiang
Drones 2025, 9(2), 97; https://doi.org/10.3390/drones9020097 - 27 Jan 2025
Viewed by 1129
Abstract
Nighttime semantic segmentation represents a challenging frontier in computer vision, made particularly difficult by severe low-light conditions, pronounced noise, and complex illumination patterns. These challenges intensify when dealing with Unmanned Aerial Vehicle (UAV) imagery, where varying camera angles and altitudes compound the difficulty. In this paper, we introduce NoctuDroneNet (Nocturnal UAV Drone Network, hereinafter referred to as NoctuDroneNet), a real-time segmentation model tailored specifically for nighttime UAV scenarios. Our approach integrates convolution-based global reasoning with training-only semantic alignment modules to effectively handle diverse and extreme nighttime conditions. We construct a new dataset, NUI-Night, focusing on low-illumination UAV scenes to rigorously evaluate performance under conditions rarely represented in standard benchmarks. Beyond NUI-Night, we assess NoctuDroneNet on the Varied Drone Dataset (VDD), a normal-illumination UAV dataset, demonstrating the model’s robustness and adaptability to varying flight domains despite the lack of large-scale low-light UAV benchmarks. Furthermore, evaluations on the Night-City dataset confirm its scalability and applicability to complex nighttime urban environments. NoctuDroneNet achieves state-of-the-art performance on NUI-Night, surpassing strong real-time baselines in both segmentation accuracy and speed. Qualitative analyses highlight its resilience to under-/over-exposure and small-object detection, underscoring its potential for real-world applications like UAV emergency landings under minimal illumination. Full article

23 pages, 3232 KiB  
Article
Comparative Analysis of LiDAR and Photogrammetry for 3D Crime Scene Reconstruction
by Fatemah M. Sheshtar, Wajd M. Alhatlani, Michael Moulden and Jong Hyuk Kim
Appl. Sci. 2025, 15(3), 1085; https://doi.org/10.3390/app15031085 - 22 Jan 2025
Cited by 3 | Viewed by 4286
Abstract
Accurate and fast 3D mapping of crime scenes is crucial in law enforcement, as first responders often need to document scenes in detail under challenging conditions and within a limited time. Traditional methods often fail to capture the detail required to understand these scenes comprehensively. This study investigates the effectiveness of recent mobile phone-based mapping technologies equipped with a LiDAR (Light Detection and Ranging) sensor. The performance of LiDAR and pure photogrammetry is evaluated under different illumination (day and night) and scanning conditions (slow and fast scanning) in a mock-up crime scene. The results reveal that mapping with the iPhone LiDAR in daylight with 5 min of fast scanning performs best, yielding an error of 0.1084 m. The cloud-to-cloud distance analysis further showed that 90% of the points exhibited errors below 0.1224 m, demonstrating the utility of these tools for rapid and portable scanning of crime scenes. Full article
(This article belongs to the Special Issue Current Advances in 3D Scene Classification and Object Recognition)
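A cloud-to-cloud distance of the kind reported above can be sketched as a nearest-neighbour query from each scanned point to a reference cloud, followed by percentile statistics. The random clouds below merely stand in for a phone scan and a ground-truth model.

```python
import numpy as np
from scipy.spatial import cKDTree

def cloud_to_cloud(source, reference):
    """Nearest-neighbour distance from every source point to the reference cloud."""
    dist, _ = cKDTree(reference).query(source)
    return dist

# Toy usage with random clouds standing in for a scan and a ground-truth model.
rng = np.random.default_rng(1)
reference = rng.uniform(0, 5, (20000, 3))
scan = reference[:15000] + rng.normal(0, 0.02, (15000, 3))  # noisy partial copy

d = cloud_to_cloud(scan, reference)
print(f"mean error {d.mean():.4f} m, 90th percentile {np.percentile(d, 90):.4f} m")
```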

28 pages, 21353 KiB  
Article
ThermalGS: Dynamic 3D Thermal Reconstruction with Gaussian Splatting
by Yuxiang Liu, Xi Chen, Shen Yan, Zeyu Cui, Huaxin Xiao, Yu Liu and Maojun Zhang
Remote Sens. 2025, 17(2), 335; https://doi.org/10.3390/rs17020335 - 19 Jan 2025
Cited by 4 | Viewed by 2965
Abstract
Thermal infrared (TIR) images capture temperature in a non-invasive manner, making them valuable for generating 3D models that reflect the spatial distribution of thermal properties within a scene. Current TIR image-based 3D reconstruction methods primarily focus on static conditions, which only capture the spatial distribution of thermal radiation but lack the ability to represent its temporal dynamics. The absence of dedicated datasets and effective methods for dynamic 3D representation are two key challenges that hinder progress in this field. To address these challenges, we propose a novel dynamic thermal 3D reconstruction method, named ThermalGS, based on 3D Gaussian Splatting (3DGS). ThermalGS employs a data-driven approach to directly learn both scene structure and dynamic thermal representation, using RGB and TIR images as input. The position, orientation, and scale of Gaussian primitives are guided by the RGB mesh. We introduce feature encoding and embedding networks to integrate semantic and temporal information into the Gaussian primitives, allowing them to capture dynamic thermal radiation characteristics. Moreover, we construct the Thermal Scene Day-and-Night (TSDN) dataset, which includes multi-view, high-resolution aerial RGB reference images and TIR images captured at five different times throughout the day and night, providing a benchmark for dynamic thermal 3D reconstruction tasks. Experimental results demonstrate that the proposed method achieves state-of-the-art performance on the TSDN dataset, with an average absolute temperature error of 1 °C and the ability to predict surface temperature variations over time. Full article

25 pages, 3292 KiB  
Article
Lane Detection Based on CycleGAN and Feature Fusion in Challenging Scenes
by Eric Hsueh-Chan Lu and Wei-Chih Chiu
Vehicles 2025, 7(1), 2; https://doi.org/10.3390/vehicles7010002 - 1 Jan 2025
Cited by 3 | Viewed by 1495
Abstract
Lane detection is a pivotal technology in intelligent driving systems. By identifying the position and shape of the lane, the vehicle can stay in the correct lane and avoid accidents. Image-based deep learning is currently the most advanced approach to lane detection: such models already recognize lanes well in ordinary daytime scenes and can run at near real-time speed. However, they often fail to accurately identify lanes in challenging scenarios such as night, dazzle, or shadow, and the lack of diversity in training data restricts their ability to handle different environments. This paper proposes a novel method that trains CycleGAN on existing daytime and nighttime datasets to extract features of different styles and scales, thereby enriching the model input. CycleGAN is used as a domain adaptation model combined with an image segmentation model to boost performance across scene styles, and the proposed consistency loss function mitigates performance disparities between scenarios. Experimental results indicate that our method enhances the detection performance of existing lane detection models in challenging scenarios. This research helps improve the dependability and robustness of intelligent driving systems, ultimately making roads safer and enhancing the driving experience. Full article
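The abstract does not spell out the form of its consistency loss. One plausible reading, sketched below under that assumption, is an agreement term between the segmentation probabilities predicted for an image and for its CycleGAN-translated counterpart; the L1 form, array shapes, and random probability maps are illustrative only.

```python
import numpy as np

def consistency_loss(prob_day, prob_night):
    """Mean absolute difference between the class-probability maps predicted
    for an image and for its style-translated counterpart. Driving the two
    towards agreement discourages the segmenter from behaving differently
    across day/night styles."""
    return float(np.mean(np.abs(prob_day - prob_night)))

# prob_* stand in for softmax outputs of a lane-segmentation network,
# shape (num_classes, H, W); random maps here just exercise the function.
rng = np.random.default_rng(0)
p_day = rng.dirichlet(np.ones(2), size=(64, 64)).transpose(2, 0, 1)
p_night = 0.9 * p_day + 0.1 * rng.dirichlet(np.ones(2), size=(64, 64)).transpose(2, 0, 1)
print(consistency_loss(p_day, p_night))
```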

27 pages, 9095 KiB  
Article
BMFusion: Bridging the Gap Between Dark and Bright in Infrared-Visible Imaging Fusion
by Chengwen Liu, Bin Liao and Zhuoyue Chang
Electronics 2024, 13(24), 5005; https://doi.org/10.3390/electronics13245005 - 19 Dec 2024
Viewed by 1121
Abstract
The fusion of infrared and visible light images is a crucial technology for enhancing visual perception in complex environments and plays a pivotal role in improving performance on subsequent advanced visual tasks. However, because visible light image quality degrades significantly in low-light or nighttime scenes, most existing fusion methods struggle to obtain sufficient texture details and salient features in such scenes, which lowers fusion quality. To address this issue, this article proposes a new image fusion method called BMFusion, whose aim is to significantly improve the quality of fused images in low-light or nighttime scenes and to generate high-quality fused images around the clock. We first design a brightness attention module composed of brightness attention units, which extracts multimodal features by combining the SimAm attention mechanism with a Transformer architecture and applies gradual brightness attention during feature extraction to enhance both brightness and features. Second, a complementary fusion module deeply fuses infrared and visible light features, ensuring that each modality complements and reinforces the other during fusion while minimizing information loss. In addition, a feature reconstruction network combining CLIP-guided semantic vectors and neighborhood attention enhancement is proposed for the reconstruction stage; it uses the KAN module to perform channel-adaptive optimization, ensuring semantic consistency and detail integrity of the fused image. Experimental results on a large number of public datasets demonstrate that BMFusion generates fusion images with higher visual quality and richer details in night and low-light environments than various state-of-the-art (SOTA) algorithms, while also significantly improving performance on advanced visual tasks. This shows the great potential and application prospects of the method in multimodal image fusion. Full article
(This article belongs to the Topic Computer Vision and Image Processing, 2nd Edition)

14 pages, 13514 KiB  
Article
A Nighttime Driving-Scene Segmentation Method Based on Light-Enhanced Network
by Lihua Bi, Wenjiao Zhang, Xiangfei Zhang and Canlin Li
World Electr. Veh. J. 2024, 15(11), 490; https://doi.org/10.3390/wevj15110490 - 27 Oct 2024
Cited by 1 | Viewed by 1495
Abstract
To address semantic segmentation of night driving-scene images, which often have low brightness, low contrast, and uneven illumination, a nighttime driving-scene segmentation method based on a light-enhanced network is proposed. First, we design a light enhancement network comprising two parts: a color correction module and a parameter predictor. The color correction module mitigates the impact of illumination variations on the segmentation network by adjusting the color information of the image, while the parameter predictor analyzes global content, including brightness, contrast, hue, and exposure level, to accurately predict the parameters of an image filter and thereby enhance image quality. The output of the light enhancement network is then fed into the segmentation network to obtain the final segmentation prediction. Experimental results show that the proposed method achieves a mean Intersection over Union (mIoU) of 59.4% on the Dark Zurich-test dataset, outperforming other segmentation algorithms for nighttime driving scenes. Full article
(This article belongs to the Special Issue Vehicle-Road Collaboration and Connected Automated Driving)
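The mIoU figure quoted above is a standard segmentation metric. As a reference, the sketch below computes it from a confusion matrix accumulated over predicted and ground-truth label maps, using toy 4x4 maps with three classes.

```python
import numpy as np

def miou(pred, gt, num_classes):
    """Mean Intersection over Union from flattened label maps."""
    conf = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(conf, (gt.ravel(), pred.ravel()), 1)   # rows: ground truth, cols: prediction
    inter = np.diag(conf).astype(float)
    union = conf.sum(0) + conf.sum(1) - inter
    valid = union > 0
    return float((inter[valid] / union[valid]).mean())

# Toy check: two 4x4 label maps with 3 classes.
gt   = np.array([[0, 0, 1, 1], [0, 0, 1, 1], [2, 2, 2, 2], [2, 2, 2, 2]])
pred = np.array([[0, 0, 1, 0], [0, 0, 1, 1], [2, 2, 2, 1], [2, 2, 2, 2]])
print(miou(pred, gt, 3))
```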

20 pages, 8023 KiB  
Article
Channel Interaction and Transformer Depth Estimation Network: Robust Self-Supervised Depth Estimation Under Varied Weather Conditions
by Jianqiang Liu, Zhengyu Guo, Peng Ping, Hao Zhang and Quan Shi
Sustainability 2024, 16(20), 9131; https://doi.org/10.3390/su16209131 - 21 Oct 2024
Viewed by 1450
Abstract
Monocular depth estimation provides low-cost environmental information for intelligent systems such as autonomous vehicles and robots, supporting sustainable development by reducing reliance on expensive, energy-intensive sensors and making technology more accessible and efficient. However, in practical applications, monocular vision is highly susceptible to adverse weather conditions, significantly reducing depth perception accuracy and limiting its ability to deliver reliable environmental information. To improve the robustness of monocular depth estimation in challenging weather, this paper first utilizes generative models to adjust image exposure and generate synthetic images of rainy, foggy, and nighttime scenes, enriching the diversity of the training data. Next, a channel interaction module and Multi-Scale Fusion Module are introduced. The former enhances information exchange between channels, while the latter effectively integrates multi-level feature information. Finally, an enhanced consistency loss is added to the loss function to prevent the depth estimation bias caused by data augmentation. Experiments on datasets such as DrivingStereo, Foggy CityScapes, and NuScenes-Night demonstrate that our method, CIT-Depth, exhibits superior generalization across various complex conditions. Full article
(This article belongs to the Section Sustainable Transportation)