MDPI - Publisher of Open Access Journals

33 pages, 172200 KB

Open Access Article

HDCGAN+: A Low-Illumination UAV Remote Sensing Image Enhancement and Evaluation Method Based on WPID

by Kelly Chen Ke, Min Sun, Xinyi Wang, Dong Liu and Hanjun Yang

Remote Sens. 2026, 18(7), 999; https://doi.org/10.3390/rs18070999 - 26 Mar 2026

Cited by 1 | Viewed by 542

Remote sensing images acquired by UAVs under nighttime or low-illumination conditions suffer from insufficient illumination, leading to degraded image quality, detail loss, and noise, which restrict their application in public security and disaster emergency scenarios. Although existing machine learning-based enhancement methods can recover part of the missing information, they often cause color distortion and texture inconsistency. This study proposes an improved low-illumination image enhancement method based on a Weakly Paired Image Dataset (WPID), combining the Hierarchical Deep Convolutional Generative Adversarial Network (HDCGAN) with a low-rank image fusion strategy to enhance the quality of low-illumination UAV remote sensing images. First, YCbCr color channel separation is applied to preserve color information from visible images. Then, a Low-Rank Representation Fusion Network (LRRNet) is employed to perform structure-aware fusion between thermal infrared (TIR) and visible images, thereby enabling effective preservation of structural details and realistic color appearance. Furthermore, a weakly paired training mechanism is incorporated into HDCGAN to enhance detail restoration and structural fidelity. To achieve objective evaluation, a structural consistency assessment framework is constructed based on semantic segmentation results from the Segment Anything Model (SAM). Experimental results demonstrate that the proposed method outperforms state-of-the-art approaches in both visual quality and application-oriented evaluation metrics.
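
The YCbCr separation step in this abstract can be illustrated with a minimal per-pixel sketch. It assumes the full-range BT.601 (JFIF) conversion coefficients, which the abstract does not specify, and it does not implement LRRNet: `fused_y` is a hypothetical placeholder for whatever luminance value a fusion stage would produce. The function names are illustrative only.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (BT.601 / JFIF coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Inverse conversion, rounded and clipped to the 8-bit range."""
    r = y + 1.402 * (cr - 128.0)
    g = y - 0.344136 * (cb - 128.0) - 0.714136 * (cr - 128.0)
    b = y + 1.772 * (cb - 128.0)
    return tuple(min(255, max(0, round(v))) for v in (r, g, b))


def recolor_with_fused_luma(visible_rgb, fused_y):
    """Keep Cb/Cr from the visible pixel and swap in a new luminance value.

    fused_y is a hypothetical stand-in for the output of a fusion stage.
    """
    _, cb, cr = rgb_to_ycbcr(*visible_rgb)
    return ycbcr_to_rgb(fused_y, cb, cr)


if __name__ == "__main__":
    dark_pixel = (40, 30, 20)
    print(recolor_with_fused_luma(dark_pixel, 120.0))
```

Because only the Y channel is replaced, the chroma of the visible pixel carries through to the output, which is the motivation the abstract gives for separating the channels first.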

(This article belongs to the Section Remote Sensing Image Processing)

22 pages, 7022 KB

Open Access Article

Mapping Spectral Composition of Nighttime Lighting in Urban Green Spaces Using SDGSAT-1 NTL Data and Google Earth Imagery

by Yuan Yuan, Zhiqiang Lu, Hongbo Liu, Boyang Wang, Yanni Xu, Zhirong Zhang, Jiahuan Li and Bin Wu

Remote Sens. 2026, 18(5), 732; https://doi.org/10.3390/rs18050732 - 28 Feb 2026

Viewed by 729

Abstract

Characterizing the spectral composition of artificial light at night (ALAN) within urban green spaces (UGS) is vital for ecological conservation, yet traditional sensors often lack the requisite spatial and spectral resolution for fine-scale analysis. To address this gap, this study leverages high-resolution multispectral nighttime light (NTL) data from SDGSAT-1 to perform a fine-scale characterization of lighting across diverse UGS typologies. We developed UGS-STUNet, a semantic segmentation framework based on the Swin Transformer architecture, to accurately extract five UGS categories from Google Earth imagery. Two specialized spectral indices, the blue-to-green (B/G) and green-to-red (G/R) ratios, were derived from SDGSAT-1 NTL data to quantify the lighting’s spectral composition. Application in Shanghai demonstrated that UGS-STUNet achieved a precision of 85.72%, significantly outperforming existing methods. Our findings reveal that street trees are subjected to the highest red-light intensity and the lowest B/G and G/R ratios due to their proximity to roadway illumination. In contrast, forest patches and belts exhibit higher spectral ratios, indicating a relatively higher exposure to blue and green wavelengths. This study provides a robust and scalable method for monitoring the spectral quality of urban nightscapes, offering critical insights for sustainable urban planning and lighting mitigation strategies to safeguard global biodiversity and public health.
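
The two band ratios named in this abstract can be sketched as follows. This is a minimal illustration that assumes per-pixel radiances for the red, green, and blue NTL bands are already available as plain numbers; the function names, the zero-radiance guard, and the sample values are assumptions, not details from the paper.

```python
def spectral_ratios(red, green, blue, eps=1e-9):
    """Return (B/G, G/R) for one pixel; a ratio is None when its denominator is ~0."""
    bg = blue / green if green > eps else None
    gr = green / red if red > eps else None
    return bg, gr


def mean_ratios(pixels):
    """Average B/G and G/R over an iterable of (red, green, blue) pixels.

    Pixels with an undefined ratio (dark denominator) are skipped for that ratio.
    """
    bg_values, gr_values = [], []
    for red, green, blue in pixels:
        bg, gr = spectral_ratios(red, green, blue)
        if bg is not None:
            bg_values.append(bg)
        if gr is not None:
            gr_values.append(gr)
    mean_bg = sum(bg_values) / len(bg_values) if bg_values else None
    mean_gr = sum(gr_values) / len(gr_values) if gr_values else None
    return mean_bg, mean_gr


if __name__ == "__main__":
    # Made-up radiances for a patch lit mostly by reddish street lighting.
    patch = [(4.0, 2.0, 1.0), (8.0, 4.0, 1.0), (0.0, 0.0, 0.0)]
    print(mean_ratios(patch))
```

Low values of both ratios correspond to red-dominated lighting, which matches the pattern the abstract reports for street trees.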

(This article belongs to the Special Issue Nighttime Light Remote Sensing Products for Sustainable Development Goals (SDGs))

30 pages, 14511 KB

Open Access Article

Rural Settlement Segmentation in Large-Scale Remote Sensing Imagery Using MSF-AL Auto-Labeling and the SELPFormer Model

by Qian Zhou, Yongqi Sun, Yanjun Tian, Qiqi Deng, Shireli Erkin and Yongnian Gao

Remote Sens. 2026, 18(4), 579; https://doi.org/10.3390/rs18040579 - 12 Feb 2026

Viewed by 599

Abstract

Accurate delineation of rural settlements at large spatial extents is fundamental to territorial spatial governance, rural revitalization, and the improvement of human living environments. However, in medium-resolution remote sensing imagery, rural settlement patches are typically small, morphologically complex, and easily confused with other impervious surfaces. As a result, existing products still fall short in characterizing these features. Here, we propose a lightweight Transformer-based semantic segmentation model, SELPFormer, and develop a multi-source fusion automatic labeling pipeline that integrates the Global Impervious Surface Dynamics dataset, OpenStreetMap spatial priors, and nighttime light constraints. Built upon SegFormer as the backbone, SELPFormer introduces a lightweight pyramid pooling module at the deepest feature level to aggregate multi-scale global context and embeds an SCSE channel–spatial attention mechanism into deep features to suppress background interference. In addition, it incorporates an efficient local attention module into multi-scale lateral connections to enhance boundary and texture representations, thereby jointly improving small-object recognition and fine boundary preservation. We evaluate the proposed method using Landsat multispectral imagery covering five provinces on the North China Plain. SELPFormer achieves IoU = 74.23%, mIoU = 86.43%, F1 = 85.21%, OA = 98.69%, and Kappa = 0.8452 under a unified training and evaluation protocol, yielding IoU gains of +1.44, +3.98, and +12.35 percentage points over SegFormer, U-Net, and DeepLabV3+, respectively. SELPFormer has 15.44 M parameters and attains a parameter efficiency of 3.93% IoU per million parameters and an ROC-AUC of 0.993, indicating strong threshold-independent discriminative capability.
These results indicate that the proposed method can effectively extract rural settlements from medium-resolution imagery and provides a generic “global–channel–local” collaborative framework for model design and data construction.
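
The metrics reported in this abstract (IoU, mIoU, F1, OA, Kappa) all follow from a binary confusion matrix. The sketch below uses the standard textbook definitions and assumes that mIoU is the mean of the settlement and background IoUs; the paper's exact evaluation protocol is not given in the abstract, and the pixel counts are made up.

```python
def binary_segmentation_metrics(tp, fp, fn, tn):
    """Standard binary segmentation metrics from pixel counts.

    tp/fp/fn/tn are counts for the foreground (settlement) class.
    """
    n = tp + fp + fn + tn
    iou_fg = tp / (tp + fp + fn)   # foreground IoU
    iou_bg = tn / (tn + fp + fn)   # background IoU
    f1 = 2 * tp / (2 * tp + fp + fn)
    oa = (tp + tn) / n             # overall accuracy
    # Cohen's kappa: agreement beyond what class frequencies alone would give.
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (oa - p_expected) / (1 - p_expected)
    return {
        "IoU": iou_fg,
        "mIoU": (iou_fg + iou_bg) / 2,
        "F1": f1,
        "OA": oa,
        "Kappa": kappa,
    }


if __name__ == "__main__":
    # Made-up counts: a small foreground class on a large background.
    for name, value in binary_segmentation_metrics(40, 10, 10, 940).items():
        print(f"{name}: {value:.4f}")
```

With a rare foreground class, OA stays high even when IoU is moderate, which is why the abstract reports IoU and Kappa alongside overall accuracy.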

(This article belongs to the Special Issue The Recent Progression of Machine Learning in Remote Sensing: Theory and Modelling (Second Edition))

17 pages, 3062 KB

Open Access Article

Dynamic Multi-Parameter Sensing Technology for Ecological Flows Based on the Improved DSC-YOLOv8n Model

by Jun Yu, Yongsheng Li, Ting Wang, Peipei Zhang, Wenlong Jiang and Lei Xing

Water 2026, 18(2), 146; https://doi.org/10.3390/w18020146 - 6 Jan 2026

Viewed by 539

Abstract

Ecological flow management is important for maintaining ecosystem stability and promoting sustainable development. Dynamic ecological flow regulation depends on precise real-time monitoring of water levels and flow velocities. To address challenges in ecological flow monitoring, including maintenance difficulties and insufficient accuracy, an improved DSC-YOLOv8n-seg model is proposed for dynamic multi-parameter sensing, achieving more efficient object detection and semantic segmentation. Compared with traditional affine transformation-edge detection, this approach enables joint recognition of water level lines and staff gauge characters, achieving an average recognition error of ±1.2 cm, with a model accuracy of 93.1%, recall rate of 94.5%, and mAP50:95 of 93.9%. A deep learning-based spectral principal direction recognition method was also employed to calculate the surface water flow velocity, which demonstrated stable and efficient performance, achieving a relative error of 0.005 m/s for the surface velocity. Experimental results confirm that the method can effectively address issues such as environmental interference, exhibiting enhanced robustness in low-light and nighttime scenarios. The proposed method provides efficient and accurate identification for dynamic water level monitoring and for real-time detection of river surface flow velocities to improve ecological flow management.

(This article belongs to the Section New Sensors, New Technologies and Machine Learning in Water Sciences)

16 pages, 1131 KB

Open Access Article

HDRSeg-UDA: Semantic Segmentation for HDR Images with Unsupervised Domain Adaptation

by Huei-Yung Lin and Ming-Yiao Chen

Smart Cities 2026, 9(1), 10; https://doi.org/10.3390/smartcities9010010 - 4 Jan 2026

Viewed by 1116

Abstract

Accurate detection and localization of traffic objects are essential for autonomous driving tasks such as path planning. While semantic segmentation can provide pixel-level classification, existing networks often fail under challenging conditions like nighttime or rain. In this paper, we introduce a new training framework that combines unsupervised domain adaptation (UDA) with high dynamic range (HDR) imaging. The proposed network uses labeled daytime images along with unlabeled nighttime HDR images. By utilizing the fine details typically lost in conventional SDR images due to dynamic range compression, and incorporating the UDA training strategy, the framework effectively trains a model that is capable of semantic segmentation across adverse weather conditions. Experiments conducted on four datasets demonstrate substantial improvements in inference performance under nighttime and rainy scenarios. The accuracy for daytime images is also enhanced through expanded training diversity.

(This article belongs to the Section Artificial Intelligence and LLM Agents for Data-Driven Decisions in Smart Cities)

25 pages, 3835 KB

Open Access Article

BuildFunc-MoE: An Adaptive Multimodal Mixture-of-Experts Network for Fine-Grained Building Function Identification

by Ru Wang, Zhan Zhang, Daoyu Shu, Nan Jia, Fang Wan, Wenkai Hu, Xiaoling Chen and Zhenghong Peng

Remote Sens. 2026, 18(1), 90; https://doi.org/10.3390/rs18010090 - 26 Dec 2025

Cited by 1 | Viewed by 1925

Abstract

Fine-grained building function identification (BFI) is essential for sustainable urban development, land-use analysis, and data-driven spatial planning. Recent progress in fully supervised semantic segmentation has advanced multimodal BFI; however, most approaches still rely on static fusion and lack explicit multi-scale alignment. As a result, they struggle to adaptively integrate heterogeneous inputs and suppress cross-modal interference, which constrains representation learning. To overcome these limitations, we propose BuildFunc-MoE, an adaptive multimodal Mixture-of-Experts (MoE) network built on an effective end-to-end Swin-UNet backbone. The model treats high-resolution remote sensing imagery as the primary input and integrates auxiliary geospatial data such as nighttime light imagery, DEM, and point-of-interest information. An Adaptive Multimodal Fusion Gate (AMMFG) first refines auxiliary features into informative fused representations, which are then combined with the primary modality and passed through multi-scale Swin-MoE blocks that extend standard Swin Transformer blocks with MoE routing. This enables fine-grained, dynamic fusion and alignment between primary and auxiliary modalities across feature scales. BuildFunc-MoE further introduces a Shared Task-Expert Module (STEM), which extends the MoE framework to share experts between the main BFI task and auxiliary tasks (road extraction, green space segmentation, and water body detection), enabling parameter-level transfer. This design enables complementary feature learning, where structural and contextual information jointly enhance the discrimination of building functions, thereby improving identification accuracy while maintaining model compactness. Experiments on the proposed Wuhan-BF multimodal dataset show that, under identical supervision, BuildFunc-MoE outperforms the strongest multimodal baseline by over 2% on average across metrics. 
Both PyTorch and LuoJiaNET implementations validate its effectiveness, while the latter achieves higher accuracy and faster inference through optimized computation. Overall, BuildFunc-MoE offers a scalable solution for fine-grained BFI with strong potential for urban planning and sustainable governance.

(This article belongs to the Special Issue High-Resolution Remote Sensing Image Processing and Applications)

25 pages, 18442 KB

Open Access Article

Exploring the Spatial Coupling Between Visual and Ecological Sensitivity: A Cross-Modal Approach Using Deep Learning in Tianjin’s Central Urban Area

by Zhihao Kang, Chenfeng Xu, Yang Gu, Lunsai Wu, Zhiqiu He, Xiaoxu Heng, Xiaofei Wang and Yike Hu

Land 2025, 14(11), 2104; https://doi.org/10.3390/land14112104 - 23 Oct 2025

Cited by 2 | Viewed by 1363

Abstract

Amid rapid urbanization, Chinese cities face mounting ecological pressure, making it critical to balance environmental protection with public well-being. As visual perception accounts for over 80% of environmental information acquisition, it plays a key role in shaping experiences and evaluations of ecological space. However, current ecological planning often overlooks public perception, leading to increasing mismatches between ecological conditions and spatial experiences. While previous studies have attempted to introduce public perspectives, a systematic framework for analyzing the spatial relationship between ecological and visual sensitivity remains lacking. This study takes 56,210 street-level points in Tianjin’s central urban area to construct a coordinated analysis framework of ecological and perceptual sensitivity. Visual sensitivity is derived from social media sentiment analysis (via GPT-4o) and street-view image semantic features extracted using the ADE20K semantic segmentation model, and subsequently processed through a Multilayer Perceptron (MLP) model. Ecological sensitivity is calculated using an Analytic Hierarchy Process (AHP)-based model integrating elevation, slope, normalized difference vegetation index (NDVI), land use, and nighttime light data. A coupling coordination model and bivariate Moran’s I are employed to examine spatial synergy and mismatches between the two dimensions. Results indicate that while 72.82% of points show good coupling, spatial mismatches are widespread. The dominant types include “HL” (high visual–low ecological) areas (e.g., Wudadao) with high visual attention but low ecological resilience, and “LH” (low visual–high ecological) areas (e.g., Huaiyuanli) with strong ecological value but low public perception.
This study provides a systematic path for analyzing the spatial divergence between ecological and perceptual sensitivity, offering insights into ecological landscape optimization and perception-driven street design.
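
The coupling coordination model mentioned in this abstract is not spelled out there. The sketch below uses the two-subsystem form that is common in the literature: coupling degree C = 2·sqrt(U1·U2)/(U1 + U2), comprehensive index T = a·U1 + b·U2, and coordination degree D = sqrt(C·T). The equal weights a = b = 0.5 and the assumption that both sensitivities are normalized to [0, 1] are illustrative choices, not details confirmed by the source.

```python
import math


def coupling_coordination(u_visual, u_ecological, a=0.5, b=0.5):
    """Return (C, T, D) for two sensitivity scores normalized to [0, 1].

    C: coupling degree, T: weighted comprehensive index, D: coordination degree.
    The weights a and b are assumed values, not taken from the paper.
    """
    total = u_visual + u_ecological
    if total == 0:
        return 0.0, 0.0, 0.0
    c = 2 * math.sqrt(u_visual * u_ecological) / total
    t = a * u_visual + b * u_ecological
    d = math.sqrt(c * t)
    return c, t, d


if __name__ == "__main__":
    # Balanced point versus a high-visual, low-ecological ("HL"-like) point.
    print(coupling_coordination(0.64, 0.64))
    print(coupling_coordination(0.9, 0.1))
```

Under this formulation, C equals 1 whenever the two scores are equal and falls as they diverge, so mismatched points such as the second example receive a lower D even when one score is high.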

(This article belongs to the Special Issue Sustainable Urbanscapes: The Role of Green Infrastructure on the Resilience of Ecosystem Services)

27 pages, 5654 KB

Open Access Article

Intelligent Detection and Description of Foreign Object Debris on Airport Pavements via Enhanced YOLOv7 and GPT-Based Prompt Engineering

by Hanglin Cheng, Ruoxi Zhang, Ruiheng Zhang, Yihao Li, Yang Lei and Weiguang Zhang

Sensors 2025, 25(16), 5116; https://doi.org/10.3390/s25165116 - 18 Aug 2025

Cited by 4 | Viewed by 2585

Abstract

Foreign Object Debris (FOD) on airport pavements poses a serious threat to aviation safety, making accurate detection and interpretable scene understanding crucial for operational risk management. This paper presents an integrated multi-modal framework that combines an enhanced YOLOv7-X detector, a cascaded YOLO-SAM segmentation module, and a structured prompt engineering mechanism to generate detailed semantic descriptions of detected FOD. Detection performance is improved through the integration of Coordinate Attention, Spatial–Depth Conversion (SPD-Conv), and a Gaussian Similarity IoU (GSIoU) loss, leading to a 3.9% gain in mAP@0.5 for small objects with only a 1.7% increase in inference latency. The YOLO-SAM cascade leverages high-quality masks to guide structured prompt generation, which incorporates spatial encoding, material attributes, and operational risk cues, resulting in a substantial improvement in description accuracy from 76.0% to 91.3%. Extensive experiments on a dataset of 12,000 real airport images demonstrate competitive detection and segmentation performance compared to recent CNN- and transformer-based baselines while achieving robust semantic generalization in challenging scenarios, such as complete darkness, low-light, high-glare nighttime conditions, and rainy weather. A runtime breakdown shows that the enhanced YOLOv7-X requires 40.2 ms per image, SAM segmentation takes 142.5 ms, structured prompt construction adds 23.5 ms, and BLIP-2 description generation requires 178.6 ms, resulting in an end-to-end latency of 384.8 ms per image. Although this does not meet strict real-time video requirements, it is suitable for semi-real-time or edge-assisted asynchronous deployment, where detection robustness and semantic interpretability are prioritized over ultra-low latency. 
The proposed framework offers a practical, deployable solution for airport FOD monitoring, combining high-precision detection with context-aware description generation to support intelligent runway inspection and maintenance decision-making.
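
The runtime breakdown quoted in this abstract can be checked with a few lines of arithmetic. The sketch assumes the four stages run strictly one after another with no overlap, which is consistent with the reported total; the stage labels and function names are illustrative.

```python
# Per-stage latencies in milliseconds, as reported in the abstract.
STAGE_MS = {
    "detection (enhanced YOLOv7-X)": 40.2,
    "segmentation (SAM)": 142.5,
    "structured prompt construction": 23.5,
    "description generation (BLIP-2)": 178.6,
}


def end_to_end_ms(stages):
    """Total latency, assuming strictly sequential stages with no overlap."""
    return sum(stages.values())


def throughput_per_second(total_ms):
    """Images processed per second at the given per-image latency."""
    return 1000.0 / total_ms


if __name__ == "__main__":
    total = end_to_end_ms(STAGE_MS)
    for name, ms in STAGE_MS.items():
        print(f"{name}: {ms:.1f} ms ({100 * ms / total:.1f}% of total)")
    print(f"end-to-end: {total:.1f} ms, about {throughput_per_second(total):.2f} images/s")
```

The sum reproduces the stated 384.8 ms, which corresponds to roughly 2.6 images per second. SAM and BLIP-2 together account for most of the latency, so the detector itself is not the bottleneck for real-time use.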

(This article belongs to the Special Issue AI and Smart Sensors for Intelligent Transportation Systems)

22 pages, 25824 KB

Open Access Article

NoctuDroneNet: Real-Time Semantic Segmentation of Nighttime UAV Imagery in Complex Environments

by Ruokun Qu, Jintao Tan, Yelu Liu, Chenglong Li and Hui Jiang

Drones 2025, 9(2), 97; https://doi.org/10.3390/drones9020097 - 27 Jan 2025

Cited by 3 | Viewed by 3112

Abstract

Nighttime semantic segmentation represents a challenging frontier in computer vision, made particularly difficult by severe low-light conditions, pronounced noise, and complex illumination patterns. These challenges intensify when dealing with Unmanned Aerial Vehicle (UAV) imagery, where varying camera angles and altitudes compound the difficulty. In this paper, we introduce NoctuDroneNet (Nocturnal UAV Drone Network), a real-time segmentation model tailored specifically for nighttime UAV scenarios. Our approach integrates convolution-based global reasoning with training-only semantic alignment modules to effectively handle diverse and extreme nighttime conditions. We construct a new dataset, NUI-Night, focusing on low-illumination UAV scenes to rigorously evaluate performance under conditions rarely represented in standard benchmarks. Beyond NUI-Night, we assess NoctuDroneNet on the Varied Drone Dataset (VDD), a normal-illumination UAV dataset, demonstrating the model’s robustness and adaptability to varying flight domains despite the lack of large-scale low-light UAV benchmarks. Furthermore, evaluations on the Night-City dataset confirm its scalability and applicability to complex nighttime urban environments. NoctuDroneNet achieves state-of-the-art performance on NUI-Night, surpassing strong real-time baselines in both segmentation accuracy and speed. Qualitative analyses highlight its resilience to under-/over-exposure and its ability to detect small objects, underscoring its potential for real-world applications like UAV emergency landings under minimal illumination.

20 pages, 5608 KB

Open Access Article

Cross-Granularity Infrared Image Segmentation Network for Nighttime Marine Observations

by Hu Xu, Yang Yu, Xiaomin Zhang and Ju He

J. Mar. Sci. Eng. 2024, 12(11), 2082; https://doi.org/10.3390/jmse12112082 - 18 Nov 2024

Cited by 4 | Viewed by 2093

Abstract

Infrared image segmentation in marine environments is crucial for enhancing nighttime observations and ensuring maritime safety. While recent advancements in deep learning have significantly improved segmentation accuracy, challenges remain because nighttime marine scenes have low contrast and noisy backgrounds. This paper introduces a cross-granularity infrared image segmentation network, CGSegNet, designed to address these challenges specifically for infrared images. The proposed method uses a hybrid cross-granularity feature framework to enhance segmentation performance in complex water surface scenarios. To reduce the semantic disparity between features of different granularity, we propose an adaptive multi-scale fusion (AMF) module that combines local granularity extraction with global context granularity. Additionally, we design a novel fusion module that incorporates handcrafted histogram of oriented gradients (HOG) features to improve edge detection accuracy under low-contrast conditions. Comprehensive experiments conducted on the public infrared segmentation dataset demonstrate that our method outperforms state-of-the-art techniques, achieving superior segmentation results compared to professional infrared image segmentation methods. The results highlight the potential of our approach in facilitating accurate infrared image segmentation for nighttime marine observation, with implications for maritime safety and environmental monitoring.

(This article belongs to the Section Ocean Engineering)

14 pages, 13514 KB

Open Access Article

A Nighttime Driving-Scene Segmentation Method Based on Light-Enhanced Network

by Lihua Bi, Wenjiao Zhang, Xiangfei Zhang and Canlin Li

World Electr. Veh. J. 2024, 15(11), 490; https://doi.org/10.3390/wevj15110490 - 27 Oct 2024

Cited by 3 | Viewed by 3241

Abstract

To solve the semantic segmentation problem of night driving-scene images, which often have low brightness, low contrast, and uneven illumination, a nighttime driving-scene segmentation method based on a light-enhanced network is proposed. First, we design a light enhancement network, which comprises two parts: a color correction module and a parameter predictor. The color correction module mitigates the impact of illumination variations on the segmentation network by adjusting the color information of the image. Meanwhile, the parameter predictor predicts the parameters of the image filter by analyzing global content, including factors such as brightness, contrast, hue, and exposure level, thereby enhancing the image quality. Subsequently, the output of the light enhancement network is input into the segmentation network to obtain the final segmentation prediction. Experimental results show that the proposed method achieves a mean Intersection over Union (mIoU) of 59.4% on the Dark Zurich-test dataset, outperforming other segmentation algorithms for nighttime driving scenes.

(This article belongs to the Special Issue Vehicle-Road Collaboration and Connected Automated Driving)

19 pages, 20515 KB

Open Access Article

Deep Neural Network-Based Flood Monitoring System Fusing RGB and LWIR Cameras for Embedded IoT Edge Devices

by Youn Joo Lee, Jun Young Hwang, Jiwon Park, Ho Gi Jung and Jae Kyu Suhr

Remote Sens. 2024, 16(13), 2358; https://doi.org/10.3390/rs16132358 - 27 Jun 2024

Cited by 10 | Viewed by 5553

Abstract

Floods are among the most common disasters, causing loss of life and enormous damage to private property and public infrastructure. Monitoring systems that detect and predict floods help respond quickly in the pre-disaster phase to prevent and mitigate flood risk and damages. Thus, this paper presents a deep neural network (DNN)-based real-time flood monitoring system for embedded Internet of Things (IoT) edge devices. The proposed system fuses long-wave infrared (LWIR) and RGB cameras to overcome a critical drawback of conventional RGB camera-based systems: severe performance deterioration at night. This system recognizes areas occupied by water using a DNN-based semantic segmentation network, whose input is a combination of RGB and LWIR images. Flood warning levels are predicted based on the water occupancy ratio calculated from the water segmentation result. The warning information is delivered to authorized personnel via a mobile message service. For real-time edge computing, the heavy semantic segmentation network is simplified by removing unimportant channels while maintaining performance by utilizing the network slimming technique. Experiments were conducted based on the dataset acquired from the sensor module with RGB and LWIR cameras installed in a flood-prone area. The results revealed that the proposed system successfully conducts water segmentation and correctly sends flood warning messages in both daytime and nighttime. Furthermore, all of the algorithms in this system were deployed on an embedded IoT edge device with a Qualcomm QCS610 System on Chip (SoC) and operated in real time.
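
The step from a water segmentation mask to a warning level, as described in this abstract, can be sketched as below. The mask is assumed to be a binary grid (1 = water), and the two thresholds and the level names are hypothetical values chosen for illustration; the abstract does not state the thresholds the authors use.

```python
def water_occupancy_ratio(mask):
    """Fraction of pixels labeled as water in a binary mask (rows of 0/1 values)."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask must contain at least one pixel")
    water = sum(1 for row in mask for value in row if value)
    return water / total


def warning_level(ratio, thresholds=(0.3, 0.6)):
    """Map an occupancy ratio to a warning level.

    The thresholds are hypothetical; a deployed system would calibrate
    them per site.
    """
    watch, alert = thresholds
    if ratio >= alert:
        return "alert"
    if ratio >= watch:
        return "watch"
    return "normal"


if __name__ == "__main__":
    mask = [
        [0, 0, 1, 1],
        [0, 1, 1, 1],
    ]
    ratio = water_occupancy_ratio(mask)
    print(ratio, warning_level(ratio))
```

In practice the ratio would likely be computed over a fixed region of interest rather than the whole frame, so that camera framing does not change the warning level.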

(This article belongs to the Special Issue Real-Time Flood Monitoring and Prediction Using Integrative Remote Sensing and AI)

20 pages, 28589 KB

Open Access Article

An Adaptive Semantic Segmentation Network for Adversarial Learning Domain Based on Low-Light Enhancement and Decoupled Generation

by Meng Wang, Zhuoran Zhang and Haipeng Liu

Appl. Sci. 2024, 14(8), 3295; https://doi.org/10.3390/app14083295 - 13 Apr 2024

Cited by 4 | Viewed by 2831

Abstract

Nighttime semantic segmentation suffers from low contrast, fuzzy imaging, and low-quality annotation, which significantly degrade the predicted masks. In this paper, we introduce a domain adaptive approach for nighttime semantic segmentation that transfers the source domain model to the target domain without relying on low-light image annotations. On the front end, a low-light image enhancement sub-network combining lightweight deep learning with mapping curve iteration is adopted to enhance nighttime foreground contrast. In the segmentation network, the body generation and edge preservation branches are implemented to generate consistent representations within the same semantic region. Additionally, a pixel weighting strategy is embedded to increase the prediction accuracy for small targets. During training, a discriminator is implemented to distinguish features between the source and target domains, thereby guiding the segmentation network for adversarial transfer learning. The proposed approach’s effectiveness is verified through testing on Dark Zurich, Nighttime Driving, and CityScapes, including evaluations of mIoU, PSNR, and SSIM. The results confirm that our approach surpasses existing baselines in segmentation scenarios.
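
Of the three metrics named in this abstract, PSNR is the simplest to state in code. The sketch below uses the standard definition, PSNR = 10·log10(MAX² / MSE), on flat lists of 8-bit intensities; it is a generic illustration, not the authors' evaluation script, and the sample values are made up.

```python
import math


def psnr(reference, enhanced, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(enhanced) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, enhanced)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    reference = [100, 100, 100, 100]
    enhanced = [110, 90, 110, 90]
    print(f"{psnr(reference, enhanced):.2f} dB")
```

PSNR measures pixel-level fidelity of the enhancement stage, whereas mIoU measures the downstream segmentation, so the two can move independently.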

19 pages, 10524 KB

Open Access Article

VELIE: A Vehicle-Based Efficient Low-Light Image Enhancement Method for Intelligent Vehicles

by Linwei Ye, Dong Wang, Dongyi Yang, Zhiyuan Ma and Quan Zhang

Sensors 2024, 24(4), 1345; https://doi.org/10.3390/s24041345 - 19 Feb 2024

Cited by 19 | Viewed by 5890

Abstract

In Advanced Driving Assistance Systems (ADAS), Automated Driving Systems (ADS), and Driver Assistance Systems (DAS), RGB camera sensors are extensively utilized for object detection, semantic segmentation, and object tracking. Although popular because of their low cost, RGB cameras are not robust in complex environments and underperform particularly in low-light conditions, which is a significant concern. To address these challenges, multi-sensor fusion systems and specialized low-light cameras have been proposed, but their high costs render them unsuitable for widespread deployment. Improvements in post-processing algorithms, by contrast, offer a more economical and effective solution. However, current research in low-light image enhancement still shows substantial gaps in detail enhancement on nighttime driving datasets and incurs high deployment costs, failing to achieve real-time inference and edge deployment. Therefore, this paper combines the Swin Vision Transformer with a gamma-transformation-integrated U-Net for the decoupled enhancement of initial low-light inputs, proposing a deep learning enhancement network named Vehicle-based Efficient Low-light Image Enhancement (VELIE). VELIE achieves state-of-the-art performance on various driving datasets with a processing time of only 0.19 s, significantly enhancing high-dimensional environmental perception tasks in low-light conditions.
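The abstract does not say how the gamma transformation is integrated into the U-Net, so the following is only a sketch of plain gamma correction, the classical operation the term usually refers to. The function name `gamma_correct`, the list-based input, and the clamping to [0, 1] are assumptions made for illustration and are not taken from VELIE.

```python
def gamma_correct(pixels, gamma):
    """Apply out = in ** gamma to intensities normalized to [0, 1].

    gamma < 1 brightens dark regions, gamma > 1 darkens them.
    Inputs outside [0, 1] are clamped first (an illustrative choice).
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return [min(1.0, max(0.0, p)) ** gamma for p in pixels]


# A dark pixel at 0.25 is lifted to 0.5 with gamma = 0.5,
# while pure black and pure white are left unchanged.
brightened = gamma_correct([0.0, 0.25, 1.0], 0.5)
```

Because the curve is fixed for a given gamma, a single global value tends either to under-enhance shadows or to wash out highlights, which is one reason learned enhancement networks are used instead.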

(This article belongs to the Special Issue Advances in Remote Sensing Image Enhancement and Classification)

18 pages, 4175 KB

Open Access Article

Semantic and Geometric-Aware Day-to-Night Image Translation Network

by Geonkyu Bang, Jinho Lee, Yuki Endo, Toshiaki Nishimori, Kenta Nakao and Shunsuke Kamijo

Sensors 2024, 24(4), 1339; https://doi.org/10.3390/s24041339 - 19 Feb 2024

Cited by 10 | Viewed by 4784

Abstract

Autonomous driving systems heavily depend on perception tasks for optimal performance. However, the prevailing datasets are primarily focused on scenarios with clear visibility (i.e., sunny and daytime). This concentration poses challenges in training deep-learning-based perception models for environments with adverse conditions (e.g., rainy and nighttime). In this paper, we propose an unsupervised network that translates images from day to night, addressing the ill-posed problem of learning the mapping between domains with unpaired data. The proposed method extracts both semantic and geometric information from input images in the form of attention maps. We assume that the multi-task network can extract semantic and geometric information during the estimation of semantic segmentation and depth maps, respectively. The image-to-image translation network integrates the two distinct types of extracted information, employing them as spatial attention maps. We compare our method with related works both qualitatively and quantitatively. The proposed method shows both qualitative and quantitative improvements in visual presentation over related work.

(This article belongs to the Special Issue Advances in Sensing, Imaging and Computing for Autonomous Driving)
