Search Results (984)

Search Parameters:
Keywords = lidar network

17 pages, 1903 KB  
Article
GMAFNet: Gated Mechanism Adaptive Fusion Network for 3D Semantic Segmentation of LiDAR Point Clouds
by Xiangbin Kong, Weijun Wu, Minghu Wu, Zhihang Gui, Zhe Luo and Chuyu Miao
Electronics 2025, 14(24), 4917; https://doi.org/10.3390/electronics14244917 - 15 Dec 2025
Abstract
Three-dimensional semantic segmentation plays a crucial role in advancing scene understanding in fields such as autonomous driving, drones, and robotic applications. Existing studies usually improve prediction accuracy by fusing data from vehicle-mounted cameras and vehicle-mounted LiDAR. However, current semantic segmentation methods face two main challenges: first, they often directly fuse 2D and 3D features, leading to the problem of information redundancy in the fusion process; second, there are often issues of image feature loss and missing point cloud geometric information in the feature extraction stage. From the perspective of multimodal fusion, this paper proposes a point cloud semantic segmentation method based on a multimodal gated attention mechanism. The method comprises a feature extraction network and a gated attention fusion and segmentation network. The feature extraction network utilizes a 2D image feature extraction structure and a 3D point cloud feature extraction structure to extract RGB image features and point cloud features, respectively. Through feature extraction and global feature supplementation, it effectively mitigates the issues of fine-grained image feature loss and point cloud geometric structure deficiency. The gated attention fusion and segmentation network increases the network’s attention to important categories such as vehicles and pedestrians through an attention mechanism and then uses a dynamic gated attention mechanism to control the respective weights of 2D and 3D features in the fusion process, enabling it to solve the problem of information redundancy in feature fusion. Finally, a 3D decoder is used for point cloud semantic segmentation. Experiments are conducted on the SemanticKITTI and nuScenes large-scale scene point cloud datasets. Full article
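
As an illustration of the dynamic gating described in this abstract, the sketch below (not the authors' GMAFNet code; all layer names and sizes are assumptions) computes a sigmoid gate from concatenated 2D and 3D per-point features and uses it to weight the two modalities before fusion.

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Minimal sketch of a dynamic gate that weights 2D image features
    against 3D point features before fusion (hypothetical layer sizes)."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, feat_2d: torch.Tensor, feat_3d: torch.Tensor) -> torch.Tensor:
        # feat_2d, feat_3d: (N_points, dim) per-point features from each branch
        g = self.gate(torch.cat([feat_2d, feat_3d], dim=-1))  # gate values in (0, 1)
        return g * feat_2d + (1.0 - g) * feat_3d               # channel-wise convex combination

fused = GatedFusion(64)(torch.randn(1024, 64), torch.randn(1024, 64))
```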

15 pages, 2248 KB  
Article
A Multimodal Sensor Fusion and Dynamic Prediction-Based Personnel Intrusion Detection System for Crane Operations
by Fengyu Wu, Maoqian Hu, Fangcheng Xie, Wenxie Bu and Zongxi Zhang
Processes 2025, 13(12), 4017; https://doi.org/10.3390/pr13124017 - 12 Dec 2025
Viewed by 150
Abstract
With the rapid development of industries such as construction and port hoisting, the operational safety of truck cranes in crowded areas has become a critical issue. Under complex working conditions, traditional monitoring methods are often plagued by issues such as compromised image quality, increased parallax computation errors, delayed fence response times, and inadequate accuracy in dynamic target recognition. To address these challenges, this study proposes a personnel intrusion detection system based on multimodal sensor fusion and dynamic prediction. The system utilizes the combined application of a binocular camera and a lidar, integrates the spatiotemporal attention mechanism and an improved LSTM network to predict the movement trajectory of the crane boom in real time, and generates a dynamic 3D fence with an advance margin. It classifies intrusion risks by matching the spatiotemporal prediction of pedestrian trajectories with the fence boundaries, and finally generates early warning information. The experimental results show that this method can significantly improve the detection accuracy of personnel intrusion under complex environments such as rain, fog, and strong light. This system provides a feasible solution for the safety monitoring of truck crane operations and significantly enhances operational safety. Full article
(This article belongs to the Section Chemical Processes and Systems)
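
The fence-matching step described above can be illustrated with a toy check (purely hypothetical thresholds and a spherical fence; not the paper's system): predicted person and boom positions are compared against a fence radius plus an advance margin over the prediction horizon.

```python
import numpy as np

def classify_intrusion(pred_person_xyz: np.ndarray,
                       pred_boom_xyz: np.ndarray,
                       fence_radius: float = 5.0,
                       advance_margin: float = 1.5) -> str:
    """Toy risk classifier: compare predicted person-to-boom distance against a
    dynamic fence radius plus an advance margin (illustrative values only)."""
    # pred_person_xyz, pred_boom_xyz: (T, 3) predicted trajectories over T future steps
    d = np.linalg.norm(pred_person_xyz - pred_boom_xyz, axis=1)
    if (d < fence_radius).any():
        return "alarm"        # predicted to enter the fence itself
    if (d < fence_radius + advance_margin).any():
        return "warning"      # predicted to enter the advance-margin zone
    return "safe"

print(classify_intrusion(np.array([[6.0, 0, 0], [5.5, 0, 0]]), np.zeros((2, 3))))
```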

20 pages, 2057 KB  
Article
Applying Deep Learning to Bathymetric LiDAR Point Cloud Data for Classifying Submerged Environments
by Nabila Tabassum, Henri Giudici, Vimala Nunavath and Ivar Oveland
Appl. Sci. 2025, 15(24), 12914; https://doi.org/10.3390/app152412914 - 8 Dec 2025
Viewed by 183
Abstract
Subsea environments are vital for global biodiversity, climate regulation, and human activities such as fishing, transport, and resource extraction. Accurate mapping and monitoring of these ecosystems are essential for sustainable management. Airborne LiDAR bathymetry (ALB) provides high-resolution underwater data but produces large and complex datasets that make efficient analysis challenging. This study employs deep learning (DL) models for the multi-class classification of ALB waveform data, comparing two recurrent neural networks, i.e., Long Short-Term Memory (LSTM) and Bidirectional LSTM (BiLSTM). A preprocessing pipeline was developed to extract and label waveform peaks corresponding to five classes: sea surface, water, vegetation, seabed, and noise. Experimental results from two datasets demonstrated high classification accuracy for both models, with LSTM achieving 95.22% and 94.85%, and BiLSTM obtaining 94.37% and 84.18% on Dataset 1 and Dataset 2, respectively. Results show that the LSTM exhibited robustness and generalization, confirming its suitability for modeling causal, time-of-flight ALB signals. Overall, the findings highlight the potential of DL-based ALB data processing to improve underwater classification accuracy, thereby supporting safe navigation, resource management, and marine environmental monitoring. Full article
(This article belongs to the Special Issue AI for Sustainability and Innovation—2nd Edition)
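
A minimal sketch of the kind of recurrent waveform classifier compared in this study (layer sizes and the five-class head are illustrative assumptions, not the paper's configuration); toggling `bidirectional` switches between the LSTM and BiLSTM variants.

```python
import torch
import torch.nn as nn

class WaveformClassifier(nn.Module):
    """Sketch of a recurrent classifier over ALB waveform samples with five
    classes (sea surface, water, vegetation, seabed, noise)."""
    def __init__(self, hidden: int = 64, num_classes: int = 5, bidirectional: bool = False):
        super().__init__()
        self.rnn = nn.LSTM(input_size=1, hidden_size=hidden,
                           batch_first=True, bidirectional=bidirectional)
        out_dim = hidden * (2 if bidirectional else 1)
        self.head = nn.Linear(out_dim, num_classes)

    def forward(self, waveform: torch.Tensor) -> torch.Tensor:
        # waveform: (batch, time, 1) -- one intensity sample per time bin
        out, _ = self.rnn(waveform)
        return self.head(out[:, -1, :])   # classify from the last time step

logits = WaveformClassifier(bidirectional=True)(torch.randn(8, 200, 1))  # (8, 5)
```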

19 pages, 2090 KB  
Article
Towards In-Vehicle Non-Contact Estimation of EDA-Based Arousal with LiDAR
by Jonas Brandstetter, Eva-Maria Knoch and Frank Gauterin
Sensors 2025, 25(23), 7395; https://doi.org/10.3390/s25237395 - 4 Dec 2025
Viewed by 306
Abstract
Driver monitoring systems are increasingly relying on physiological signals to assess cognitive and emotional states for improved safety and user experience. Electrodermal activity (EDA) is a particularly informative biomarker of arousal but is conventionally measured with skin-contact electrodes, limiting its applicability in vehicles. This work explores the feasibility of non-contact EDA estimation using Light Detection and Ranging (LiDAR) as a novel sensing modality. In a controlled laboratory setup, LiDAR reflection intensity from the forehead was recorded simultaneously with conventional finger-based EDA. Both classification and regression tasks were performed as follows: feature-based machine learning models (e.g., Random Forest and Extra Trees) and sequence-based deep learning models (e.g., CNN, LSTM, and TCN) were evaluated. Results demonstrate that LiDAR signals capture arousal-related changes, with the best regression model (Temporal Convolutional Network) achieving a mean absolute error of 14.6 on the normalized arousal factor scale (–50 to +50) and a correlation of r = 0.85 with ground-truth EDA. While random split validations yielded high accuracy, performance under leave-one-subject-out evaluation highlighted challenges in cross-subject generalization. The algorithms themselves were not the primary research focus but served to establish feasibility of the approach. These findings provide the first proof-of-concept that LiDAR can remotely estimate EDA-based arousal without direct skin contact, addressing a central limitation of current driver monitoring systems. Future research should focus on larger datasets, multimodal integration, and real-world driving validation to advance LiDAR towards practical in-vehicle deployment. Full article
(This article belongs to the Section Vehicular Sensing)
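
The temporal convolutional idea behind the best regression model can be sketched as a generic dilated causal ConvNet regressing a single arousal value (all sizes are assumptions, not the authors' architecture).

```python
import torch
import torch.nn as nn

class CausalBlock(nn.Module):
    """One dilated causal 1-D convolution block, the building unit of a TCN."""
    def __init__(self, channels: int, dilation: int, kernel: int = 3):
        super().__init__()
        self.pad = (kernel - 1) * dilation              # left-pad so no future leakage
        self.conv = nn.Conv1d(channels, channels, kernel, dilation=dilation)
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.conv(nn.functional.pad(x, (self.pad, 0)))
        return self.act(y) + x                          # residual connection

class TinyTCN(nn.Module):
    def __init__(self, channels: int = 32, levels: int = 4):
        super().__init__()
        self.inp = nn.Conv1d(1, channels, 1)
        self.blocks = nn.Sequential(*[CausalBlock(channels, 2 ** i) for i in range(levels)])
        self.out = nn.Linear(channels, 1)               # regress one arousal value

    def forward(self, intensity: torch.Tensor) -> torch.Tensor:
        # intensity: (batch, 1, time) -- LiDAR reflection-intensity time series
        h = self.blocks(self.inp(intensity))
        return self.out(h[:, :, -1])                    # prediction from the last step

arousal = TinyTCN()(torch.randn(4, 1, 512))             # (4, 1)
```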

16 pages, 13328 KB  
Article
Multi-Calib: A Scalable LiDAR–Camera Calibration Network for Variable Sensor Configurations
by Leyun Hu, Chao Wei, Meijing Wang, Zengbin Wu and Yang Xu
Sensors 2025, 25(23), 7321; https://doi.org/10.3390/s25237321 - 2 Dec 2025
Viewed by 333
Abstract
Traditional calibration methods rely on precise targets and frequent manual intervention, making them time-consuming and unsuitable for large-scale deployment. Existing learning-based approaches, while automating the process, are typically limited to single LiDAR–camera pairs, resulting in poor scalability and high computational overhead. To address these limitations, we propose a lightweight calibration network with flexibility in the number of sensor pairs, making it capable of jointly calibrating multiple cameras and LiDARs in a single forward pass. Our method employs a frozen pre-trained Swin Transformer as a shared backbone to extract unified features from both RGB images and corresponding depth maps. Additionally, we introduce a cross-modal channel-wise attention module to enhance key feature alignment and suppress irrelevant noise. Moreover, to handle variations in viewpoint, we design a modular calibration head that independently estimates the extrinsics for each LiDAR–camera pair. Through large-scale experiments on the nuScenes dataset, we show that our model, requiring merely 78.79 M parameters, attains a mean translation error of 2.651 cm and a rotation error of 0.246°, achieving comparable performance to existing methods while significantly reducing the computational cost. Full article
(This article belongs to the Section Vehicular Sensing)
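
The modular calibration head described above can be illustrated with a minimal sketch (hypothetical pair names, feature size, and quaternion-plus-translation output; not the Multi-Calib implementation): each LiDAR–camera pair gets its own small MLP on top of the shared backbone feature.

```python
import torch
import torch.nn as nn

class CalibHeads(nn.Module):
    """Sketch of per-pair calibration heads: each head regresses a unit
    quaternion (rotation) and a translation from a shared feature vector."""
    def __init__(self, pairs, feat_dim: int = 256):
        super().__init__()
        self.heads = nn.ModuleDict({
            name: nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, 7))
            for name in pairs
        })

    def forward(self, shared_feat: torch.Tensor) -> dict:
        out = {}
        for name, head in self.heads.items():
            p = head(shared_feat)                               # (batch, 7)
            quat = nn.functional.normalize(p[:, :4], dim=-1)    # unit quaternion
            out[name] = {"rotation": quat, "translation": p[:, 4:]}
        return out

extrinsics = CalibHeads(["front_cam__top_lidar", "rear_cam__top_lidar"])(torch.randn(2, 256))
```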

26 pages, 2310 KB  
Systematic Review
A Systematic Review of Intelligent Navigation in Smart Warehouses Using Prisma: Integrating AI, SLAM, and Sensor Fusion for Mobile Robots
by Domagoj Zimmer, Mladen Jurišić, Ivan Plaščak, Željko Barač, Hrvoje Glavaš, Dorijan Radočaj and Robert Benković
Eng 2025, 6(12), 339; https://doi.org/10.3390/eng6120339 - 1 Dec 2025
Viewed by 486
Abstract
This systematic review focuses on intelligent navigation as a core enabler of autonomy in smart warehouses, where mobile robots must dynamically perceive, reason, and act in complex, human-shared environments. By synthesizing advancements in AI-driven decision-making, SLAM, and multi-sensor fusion, the study highlights how intelligent navigation architectures reduce operational uncertainty and enhance task efficiency in logistics automation. Smart warehouses, powered by mobile robots and AGVs and integrated with AI and algorithms, are enabling more efficient storage with less human labour. This systematic review followed PRISMA 2020 guidelines to systematically identify, screen, and synthesize evidence from 106 peer-reviewed scientific articles (including primary studies, technical papers, and reviews) published between 2020 and 2025, sourced from Web of Science. Thematic synthesis was conducted across 8 domains: AI, SLAM, sensor fusion, safety, network, path planning, implementation, and design. The transition to smart warehouses requires modern technologies to automate tasks and optimize resources. This article examines how intelligent systems can be integrated with mathematical models to improve navigation accuracy, reduce costs and prioritize human safety. Real-time data management with precise information for AMRs and AGVs is crucial for low-risk operation. This article studies AI, the IoT, LiDAR, machine learning (ML), SLAM and other new technologies for the successful implementation of mobile robots in smart warehouses. Modern technologies such as reinforcement learning optimize the routes and tasks of mobile robots. Data and sensor fusion methods integrate information from various sources to provide a more precise understanding of the indoor environment and inventory. Semantic mapping enables mobile robots to navigate and interact with complex warehouse environments with high accuracy in real time. The article also analyses how virtual reality (VR) can improve the spatial orientation of mobile robots by developing sophisticated navigation solutions that reduce time and financial costs. Full article
(This article belongs to the Special Issue Interdisciplinary Insights in Engineering Research)

17 pages, 4983 KB  
Article
TAGNet: A Tidal Flat-Attentive Graph Network Designed for Airborne Bathymetric LiDAR Point Cloud Classification
by Ahram Song
ISPRS Int. J. Geo-Inf. 2025, 14(12), 466; https://doi.org/10.3390/ijgi14120466 - 28 Nov 2025
Viewed by 246
Abstract
Airborne LiDAR bathymetry (ALB) provides dense three-dimensional point clouds that enable the detailed mapping of tidal flat environments. However, surface classification using these point clouds remains challenging due to residual noise, water surface reflectivity, and subtle class boundaries that persist even after standard preprocessing. To address these challenges, this study introduces Tidal flat-Attentive Graph Network (TAGNet), a graph-based deep learning framework designed to leverage both local geometric relationships and global contextual cues for the point-wise classification of tidal flat surface classes. The model incorporates multi-scale EdgeConv layers for capturing fine-grained neighborhood structures and employs squeeze-and-excitation channel attention to enhance global feature representation. To validate TAGNet’s effectiveness, classification was conducted on ALB point clouds collected from adjacent tidal flat regions, focusing on four major surface classes: exposed flat, sea surface, sea floor, and vegetation. In benchmarking tests against baseline models, including Dynamic Graph Convolutional Neural Network, PointNeXt with Single-Scale Grouping, and PointNet Transformer, TAGNet consistently achieved higher macro F1-scores. Moreover, ablation studies isolating positional encoding, attention mechanisms, and detrended Z-features confirmed their complementary contributions to TAGNet’s performance. Notably, the full TAGNet outperformed all baselines by a substantial margin, particularly when distinguishing closely related classes, such as sea floor and exposed flat. These findings highlight the potential of graph-based architectures specifically designed for ALB data in enhancing the precision of coastal monitoring and habitat mapping. Full article
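
The two ingredients named here, EdgeConv-style neighborhood features and squeeze-and-excitation channel attention, can be sketched in simplified form (brute-force kNN within a single cloud, illustrative sizes; not TAGNet itself).

```python
import torch
import torch.nn as nn

def edge_conv(points: torch.Tensor, mlp: nn.Module, k: int = 16) -> torch.Tensor:
    """Simplified EdgeConv step: for each point, max-aggregate an MLP over
    (feature, neighbor - feature) pairs from its k nearest neighbors."""
    # points: (N, C)
    idx = torch.cdist(points, points).topk(k, largest=False).indices       # (N, k) neighbor ids
    nbrs = points[idx]                                                      # (N, k, C)
    edge = torch.cat([points.unsqueeze(1).expand(-1, k, -1),
                      nbrs - points.unsqueeze(1)], dim=-1)                  # (N, k, 2C)
    return mlp(edge).max(dim=1).values                                      # (N, C_out)

class SEAttention(nn.Module):
    """Squeeze-and-excitation over per-point feature channels (global context)."""
    def __init__(self, channels: int, r: int = 4):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(channels, channels // r), nn.ReLU(),
                                nn.Linear(channels // r, channels), nn.Sigmoid())

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        w = self.fc(feats.mean(dim=0, keepdim=True))     # squeeze over all points
        return feats * w                                 # re-weight each channel

pts = torch.randn(1024, 3)
feat = edge_conv(pts, nn.Sequential(nn.Linear(6, 64), nn.ReLU()))   # (1024, 64)
feat = SEAttention(64)(feat)
```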

21 pages, 3883 KB  
Article
Individual Tree-Level Biomass Mapping in Chinese Coniferous Plantation Forests Using Multimodal UAV Remote Sensing Approach Integrating Deep Learning and Machine Learning
by Yiru Wang, Zhaohua Liu, Jiping Li, Hui Lin, Jiangping Long, Guangyi Mu, Sijia Li and Yong Lv
Remote Sens. 2025, 17(23), 3830; https://doi.org/10.3390/rs17233830 - 26 Nov 2025
Cited by 1 | Viewed by 285
Abstract
Accurate estimation of individual tree aboveground biomass (AGB) is essential for understanding forest carbon dynamics, optimizing resource management, and addressing climate change. Conventional methods rely on destructive sampling, whereas unmanned aerial vehicle (UAV) remote sensing provides a non-destructive alternative. In this study, spectral indices, textural features, and canopy height attributes were extracted from high-resolution UAV optical imagery and Light Detection And Ranging (LiDAR) point clouds. We developed an improved YOLOv8 model (NB-YOLOv8), incorporating Neural Architecture Manipulation (NAM) attention and a Bidirectional Feature Pyramid Network (BiFPN), for individual tree detection. Combined with a random forest algorithm, this hybrid framework enabled accurate biomass estimation of Chinese fir, Chinese pine, and larch plantations. NB-YOLOv8 achieved superior detection performance, with 92.3% precision and 90.6% recall, outperforming the original YOLOv8 by 4.8% and 4.2%, and the watershed algorithm by 12.4% and 11.7%, respectively. The integrated model produced reliable tree-level AGB predictions (R2 = 0.65–0.76). SHapley Additive exPlanation (SHAP) analysis further revealed that local feature contributions often diverged from global rankings, underscoring the importance of interpretable modeling. These results demonstrate the effectiveness of combining deep learning and machine learning for tree-level AGB estimation, and highlight the potential of multi-source UAV remote sensing to support large-scale, fine-resolution forest carbon monitoring and management. Full article
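
The second stage of the hybrid framework, regressing AGB from per-crown features with a random forest, might look like the following sketch (the feature set and numbers are purely illustrative, not the study's data).

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical per-tree feature table: crown area (m^2), canopy height (m),
# mean NDVI, and one texture value extracted for each detected crown.
X = np.array([[12.4, 14.2, 0.71, 0.35],
              [ 8.1, 10.5, 0.64, 0.41],
              [15.9, 16.8, 0.75, 0.30]])
y = np.array([96.0, 52.0, 131.0])            # field-measured AGB (kg), illustrative values

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
print(rf.predict([[11.0, 13.0, 0.70, 0.36]]))   # AGB prediction for a new detected crown
```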

40 pages, 12756 KB  
Article
4D Pointwise Terrestrial Laser Scanning Calibration: Radiometric Calibration of Point Clouds
by Mansoor Sabzali and Lloyd Pilgrim
Sensors 2025, 25(22), 7035; https://doi.org/10.3390/s25227035 - 18 Nov 2025
Viewed by 428
Abstract
Terrestrial Laser Scanners (TLS), as monostatic LiDAR systems, emit and receive laser pulses through a single aperture, which ensures the simultaneous measurement of signal geometry and intensity. The relative intensity of a signal, defined as the ratio of received to transmitted power, directly describes the strength and quality of the reflected signal and the corresponding radiometric uncertainty of individual points. The LiDAR range equation provides the physical connection for characterizing signal strength as a function of reflectivity and other spatial parameters. In this research, theoretical developments of the texture-dependent LiDAR range equation, in conjunction with a neural network method, are presented. The two-step approach aims to improve the accuracy of signal intensities by enhancing signal reflectivity estimation and the precision of signal intensities by reducing their sensitivity to variations in spatial characteristics—range and incidence angle. This establishes the intensity as the standard fourth dimension of the 3D point cloud based on the inherent target quality. For validation, four terrestrial laser scanners—Leica ScanStation P50, Leica ScanStation C10, Leica RTC360, and Trimble X9—are evaluated. Results demonstrate significant improvements of at least 40% in accuracy and 97% in precision for the color intensities of individual points across the devices. This research enables a 4D TLS point cloud calibration framework for further investigations on other internal and external geometries of targets (target materials, roughness, albedo, and edgy and tilted surfaces), which allows the standardization of radiometric values. Full article
(This article belongs to the Section Radar Sensors)
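
The LiDAR range equation referred to above is commonly written, for an extended Lambertian target filling the beam, in a form such as the following; the authors' texture-dependent variant will differ in detail.

```latex
P_r \;=\; \frac{P_t \, D_r^{2} \, \rho \, \cos\alpha}{4 R^{2}} \, \eta_{\mathrm{sys}} \, \eta_{\mathrm{atm}}
```

Here P_t and P_r are the transmitted and received power, D_r the receiver aperture diameter, ρ the target reflectivity, α the incidence angle, R the range, and η_sys, η_atm the system and atmospheric transmission factors.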

20 pages, 4682 KB  
Article
EAS-Det: Edge-Aware Semantic Feature Fusion for Robust 3D Object Detection in LiDAR Point Clouds
by Huishan Wang, Jie Ma, Yuehua Zhao, Jianlei Zhang and Fangwei Chen
Remote Sens. 2025, 17(22), 3743; https://doi.org/10.3390/rs17223743 - 18 Nov 2025
Viewed by 504
Abstract
Accurate 3D object detection and localization in LiDAR point clouds are crucial for applications such as autonomous driving and UAV-based monitoring. However, existing detectors often suffer from the loss of critical geometric information during network processing, mainly due to downsampling and pooling operations. This leads to imprecise object boundaries and degraded detection accuracy, particularly for small objects. To address these challenges, we propose Edge-Aware Semantic Feature Fusion for Detection (EAS-Det), a lightweight, plug-and-play framework for LiDAR-based perception. The core module, Edge-Semantic Interaction (ESI), employs a dual-attention mechanism to adaptively fuse geometric edge cues with high-level semantic context, yielding multi-scale representations that preserve structural details while enhancing contextual awareness. EAS-Det is compatible with mainstream backbones such as PointPillars and PV-RCNN. Extensive experiments on the KITTI and Waymo datasets demonstrate consistent and significant improvements, achieving up to 10.34% and 8.66% AP gains for pedestrians and cyclists, respectively, on the KITTI benchmark. These results underscore the effectiveness and generalizability of EAS-Det for robust 3D object detection in complex real-world environments. Full article
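
The dual-attention fusion of edge and semantic streams can be illustrated roughly as follows (each stream gates the other before merging; a generic sketch of the idea, not the ESI module, with assumed BEV feature shapes).

```python
import torch
import torch.nn as nn

class EdgeSemanticFusion(nn.Module):
    """Toy dual-attention fusion: semantic features gate the edge stream and
    edge features gate the semantic stream, then both are merged."""
    def __init__(self, c: int = 64):
        super().__init__()
        self.att_edge = nn.Sequential(nn.Conv2d(c, c, 1), nn.Sigmoid())  # semantic -> gate for edge
        self.att_sem = nn.Sequential(nn.Conv2d(c, c, 1), nn.Sigmoid())   # edge -> gate for semantic
        self.merge = nn.Conv2d(2 * c, c, 1)

    def forward(self, edge: torch.Tensor, sem: torch.Tensor) -> torch.Tensor:
        # edge, sem: (batch, c, H, W) feature maps
        e = edge * self.att_edge(sem)
        s = sem * self.att_sem(edge)
        return self.merge(torch.cat([e, s], dim=1))

out = EdgeSemanticFusion()(torch.randn(2, 64, 128, 128), torch.randn(2, 64, 128, 128))
```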

14 pages, 3554 KB  
Article
Unsupervised Optical-Sensor Extrinsic Calibration via Dual-Transformer Alignment
by Yuhao Wang, Yong Zuo, Yi Tang, Xiaobin Hong, Jian Wu and Ziyu Bian
Sensors 2025, 25(22), 6944; https://doi.org/10.3390/s25226944 - 13 Nov 2025
Viewed by 355
Abstract
Accurate extrinsic calibration between optical sensors, such as camera and LiDAR, is crucial for multimodal perception. Traditional methods based on specific calibration targets exhibit poor robustness in complex optical environments such as glare, reflections, or low light, and they rely on cumbersome manual operations. To address this, we propose a fully unsupervised, end-to-end calibration framework. Our approach adopts a dual-Transformer architecture: a Vision Transformer extracts semantic features from the image stream, while a Point Transformer captures the geometric structure of the 3D LiDAR point cloud. These cross-modal representations are aligned and fused through a neural network, and a regression algorithm is used to obtain the 6-DoF extrinsic transformation matrix. A multi-constraint loss function is designed to enhance structural consistency between modalities, thereby improving calibration stability and accuracy. On the KITTI benchmark, our method achieves a mean rotation error of 0.21° and a translation error of 3.31 cm; on a self-collected dataset, it attains an average reprojection error of 1.52 pixels. These results demonstrate a generalizable and robust solution for optical-sensor extrinsic calibration, enabling precise and self-sufficient perception in real-world applications. Full article
(This article belongs to the Section Optical Sensors)
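
The 6-DoF output of such a regressor is typically assembled into a 4×4 extrinsic matrix; a small sketch (the (w, x, y, z) quaternion convention is an assumption for illustration, not taken from the paper).

```python
import numpy as np

def extrinsic_matrix(quat_wxyz: np.ndarray, t_xyz: np.ndarray) -> np.ndarray:
    """Assemble a 4x4 LiDAR-to-camera transform from a unit quaternion and a
    translation vector -- the kind of 6-DoF output a calibration regressor produces."""
    w, x, y, z = quat_wxyz / np.linalg.norm(quat_wxyz)
    R = np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
    ])
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t_xyz
    return T

T = extrinsic_matrix(np.array([1.0, 0.0, 0.0, 0.0]), np.array([0.1, -0.02, 0.3]))
```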

22 pages, 5851 KB  
Article
A Multi-Stage Deep Learning Framework for Multi-Source Cloud Top Height Retrieval from FY-4A/AGRI Data
by Yinhe Cheng, Long Shen, Jiulei Zhang, Hongjian He, Xiaomin Gu, Shengxiang Wang and Tinghuai Ma
Atmosphere 2025, 16(11), 1288; https://doi.org/10.3390/atmos16111288 - 12 Nov 2025
Viewed by 438
Abstract
Cloud Top Height (CTH), defined as the altitude of the highest cloud layer above mean sea level, is a crucial geophysical parameter for quantifying cloud radiative effects, analyzing severe weather systems, and improving climate models. To enhance the accuracy of CTH retrieval from Fengyun-4A (FY-4A) satellite data, this study proposes a multi-stage deep learning framework that progressively refines cloud parameter estimation. The method utilizes cloud information from the FY-4A/AGRI (Advanced Geosynchronous Radiation Imager) Level 1 calibrated scanning imager radiance data product to construct a multi-source data fusion neural network model. The model inputs combine multi-channel radiance data with cloud parameters, including Cloud Top Temperature (CTT) and Cloud Top Pressure (CTP). We used the CTH measurement data from the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO) satellite as a reference to verify the model output. Results demonstrate that the proposed multi-stage model significantly improves retrieval accuracy. Compared to the official FY-4A CTH product, the Mean Absolute Error (MAE) was reduced by 49.12% to 2.03 km, and the Pearson Correlation Coefficient (PCC) reached 0.85. To test the applicability of the model under complex weather conditions, we applied it to the CTH inversion of the double typhoon event on 10 August 2019. The model successfully characterized the spatial distribution of CTH within the typhoon regions. The results are consistent with the National Satellite Meteorological Centre (NSMC) reports and clearly reveal the different intensity evolutions of the two typhoons. This research provides an effective solution for high-precision retrieval of high-level cloud CTH at a large scale, using geostationary meteorological satellite remote sensing data. Full article
(This article belongs to the Section Atmospheric Techniques, Instruments, and Modeling)
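
The two headline metrics, MAE against CALIPSO and the Pearson correlation coefficient, can be computed on collocated samples as in this small sketch (the matching/collocation step itself is not shown, and the values are made up).

```python
import numpy as np

def evaluate_cth(retrieved_km: np.ndarray, calipso_km: np.ndarray) -> dict:
    """Mean absolute error and Pearson correlation between retrieved CTH
    and the CALIPSO reference, both in kilometres."""
    mae = np.mean(np.abs(retrieved_km - calipso_km))
    pcc = np.corrcoef(retrieved_km, calipso_km)[0, 1]
    return {"MAE_km": float(mae), "PCC": float(pcc)}

print(evaluate_cth(np.array([10.2, 7.8, 12.1]), np.array([9.5, 8.0, 13.0])))
```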

14 pages, 3128 KB  
Article
WCGNet: A Weather Codebook and Gating Fusion for Robust 3D Detection Under Adverse Conditions
by Wenfeng Chen, Fei Yan, Ning Wang, Jiale He and Yiqi Wu
Electronics 2025, 14(22), 4379; https://doi.org/10.3390/electronics14224379 - 10 Nov 2025
Viewed by 484
Abstract
Three-dimensional (3D) object detection constitutes a fundamental task in the field of environmental perception. While LiDAR provides high-precision 3D geometric data, its performance significantly degrades under adverse weather conditions like dense fog and heavy snow, where point cloud quality deteriorates. To address this challenge, WCGNet is proposed as a robust 3D detection framework that enhances feature representation against weather corruption. The framework introduces two key components: a Weather Codebook module and a Weather-Aware Gating Fusion module. The Weather Codebook, trained on paired clear and adverse weather scenes, learns to store clear-scene reference features, providing structural guidance for foggy scenarios. The Weather-Aware Gating Fusion module then integrates the degraded features with the codebook’s reference features through a spatial attention mechanism, a multi-head attention network, a gating mechanism, and a fusion module to dynamically recalibrate and combine features, thereby effectively restoring weather-robust representations. Additionally, a foggy point cloud dataset, nuScenes-fog, is constructed based on the nuScenes dataset. Systematic evaluations are conducted on nuScenes, nuScenes-fog, and the STF multi-weather dataset. Experimental results indicate that the proposed framework significantly enhances detection performance and generalization capability under challenging weather conditions, demonstrating strong adaptability across different weather scenarios. Full article
(This article belongs to the Special Issue Application of Machine Learning in Graphics and Images, 2nd Edition)
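
A toy version of the codebook-plus-gating idea (nearest-codeword lookup followed by a sigmoid gate; the sizes and the L2 lookup are assumptions, not WCGNet's design): degraded features retrieve their closest clear-scene reference and a gate decides how much of it to mix back in.

```python
import torch
import torch.nn as nn

class CodebookGate(nn.Module):
    """Sketch: match degraded features to learned clear-scene codewords and
    restore them through a gated residual."""
    def __init__(self, num_codes: int = 256, dim: int = 64):
        super().__init__()
        self.codebook = nn.Parameter(torch.randn(num_codes, dim))   # clear-scene references
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, degraded: torch.Tensor) -> torch.Tensor:
        # degraded: (N, dim) features extracted from a foggy/snowy point cloud
        idx = torch.cdist(degraded, self.codebook).argmin(dim=1)    # nearest codeword per feature
        ref = self.codebook[idx]                                    # (N, dim) retrieved references
        g = self.gate(torch.cat([degraded, ref], dim=-1))
        return degraded + g * ref                                   # gated restoration

restored = CodebookGate()(torch.randn(512, 64))
```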

2779 KB  
Proceeding Paper
Federated Edge Learning for Distributed Weed Detection in Precision Agriculture Using Multimodal Sensor Fusion
by Dasaradha Arangi and Neelamadhab Padhy
Eng. Proc. 2025, 118(1), 33; https://doi.org/10.3390/ECSA-12-26608 - 7 Nov 2025
Viewed by 46
Abstract
In this work, our goal is to develop a privacy-preserving distributed weed detection and management system. The proposed work integrates FEL (Federated Edge Learning) and deep learning with multi-modal sensor fusion to enhance the model’s performance while minimising data transfer, latency, and energy consumption. In this study, we used multimodal sensors, such as LiDAR (Light Detection and Ranging), RGB (Red–Green–Blue) cameras, multispectral imaging devices, and soil moisture sensors placed in controlled agricultural plots. Deep learning models, such as Convolutional Neural Networks (CNNs), LSTM–CNN hybrids, and Vision Transformers, were trained using standardized parameters. A proposed Federated CNN (FedCNN) was deployed across multiple edge devices, each locally trained on sensor data without exchanging raw data, using FedAvg and FedProx algorithms. The experiments revealed that FedCNN outperformed the other models, achieving the highest accuracy of 94.1%, precision of 94.3%, recall of 93.9%, F1-score of 94.1%, and AUC of 94.1% under the hybrid fusion strategy. We also compared centralized and federated learning performance. Full article
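
The FedAvg aggregation mentioned here reduces to a sample-weighted average of client model parameters; a minimal sketch (client selection and FedProx's proximal term are omitted).

```python
import copy
import torch

def fed_avg(client_states: list, client_sizes: list) -> dict:
    """FedAvg aggregation sketch: weight each client's parameters by its local
    sample count and average them into a new global state dict."""
    total = float(sum(client_sizes))
    global_state = copy.deepcopy(client_states[0])
    for key in global_state:
        global_state[key] = sum(
            state[key].float() * (n / total)
            for state, n in zip(client_states, client_sizes)
        )
    return global_state

# usage: global_model.load_state_dict(fed_avg([m.state_dict() for m in clients], sizes))
```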

21 pages, 8098 KB  
Article
Multi-Sensor AI-Based Urban Tree Crown Segmentation from High-Resolution Satellite Imagery for Smart Environmental Monitoring
by Amirmohammad Sharifi, Reza Shah-Hosseini, Danesh Shokri and Saeid Homayouni
Smart Cities 2025, 8(6), 187; https://doi.org/10.3390/smartcities8060187 - 6 Nov 2025
Viewed by 836
Abstract
Urban tree detection is fundamental to effective forestry management, biodiversity preservation, and environmental monitoring—key components of sustainable smart city development. This study introduces a deep learning framework for urban tree crown segmentation that exclusively leverages high-resolution satellite imagery from GeoEye-1, WorldView-2, and WorldView-3, thereby eliminating the need for additional data sources such as LiDAR or UAV imagery. The proposed framework employs a Residual U-Net architecture augmented with Attention Gates (AGs) to address major challenges, including class imbalance, overlapping crowns, and spectral interference from complex urban structures, using a custom composite loss function. The main contribution of this work is to integrate data from three distinct satellite sensors with varying spatial and spectral characteristics into a single processing pipeline, demonstrating that such well-established architectures can yield reliable, high-accuracy results across heterogeneous resolutions and imaging conditions. A further advancement of this study is the development of a hybrid ground-truth generation strategy that integrates NDVI-based watershed segmentation, manual annotation, and the Segment Anything Model (SAM), thereby reducing annotation effort while enhancing mask fidelity. In addition, by training on 4-band RGBN imagery from multiple satellite sensors, the model exhibits generalization capabilities across diverse urban environments. Despite being trained on a relatively small dataset comprising only 1200 image patches, the framework achieves state-of-the-art performance (F1-score: 0.9121; IoU: 0.8384; precision: 0.9321; recall: 0.8930). These results stem from the integration of the Residual U-Net with Attention Gates, which enhance feature representation and suppress noise from urban backgrounds, as well as from hybrid ground-truth generation and the combined BCE–Dice loss function, which effectively mitigates class imbalance. Collectively, these design choices enable robust model generalization and clear performance superiority over baseline networks such as DeepLab v3 and U-Net with VGG19. Fully automated and computationally efficient, the proposed approach delivers cost-effective, accurate segmentation using satellite data alone, rendering it particularly suitable for scalable, operational smart city applications and environmental monitoring initiatives. Full article
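
The combined BCE–Dice loss described above can be sketched as follows (equal weighting assumed for illustration; not the study's exact formulation).

```python
import torch
import torch.nn as nn

class BCEDiceLoss(nn.Module):
    """Sketch of a combined BCE + Dice loss for imbalanced binary crown masks."""
    def __init__(self, bce_weight: float = 0.5, eps: float = 1e-6):
        super().__init__()
        self.bce = nn.BCEWithLogitsLoss()
        self.w, self.eps = bce_weight, eps

    def forward(self, logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # logits, target: (batch, 1, H, W); target is a binary crown mask
        prob = torch.sigmoid(logits)
        inter = (prob * target).sum(dim=(1, 2, 3))
        dice = (2 * inter + self.eps) / (prob.sum(dim=(1, 2, 3)) + target.sum(dim=(1, 2, 3)) + self.eps)
        return self.w * self.bce(logits, target) + (1 - self.w) * (1 - dice).mean()

loss = BCEDiceLoss()(torch.randn(2, 1, 64, 64), torch.randint(0, 2, (2, 1, 64, 64)).float())
```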
