Search Results (283)

Search Parameters:
Keywords = outdoor scene

18 pages, 2545 KiB  
Article
Reliable Indoor Fire Detection Using Attention-Based 3D CNNs: A Fire Safety Engineering Perspective
by Mostafa M. E. H. Ali and Maryam Ghodrat
Fire 2025, 8(7), 285; https://doi.org/10.3390/fire8070285 - 21 Jul 2025
Abstract
Despite recent advances in deep learning for fire detection, much of the current research prioritizes model-centric metrics over dataset fidelity, particularly from a fire safety engineering perspective. Commonly used datasets are often dominated by fully developed flames, mislabel smoke-only frames as non-fire, or lack intra-video diversity due to redundant frames from limited sources. Some works treat smoke detection alone as early-stage detection, even though many fires (e.g., electrical or chemical) begin with visible flames and no smoke. Additionally, attempts to improve model applicability through mixed-context datasets—combining indoor, outdoor, and wildland scenes—often overlook the unique false alarm sources and detection challenges specific to each environment. To address these limitations, we curated a new video dataset comprising 1108 annotated fire and non-fire clips captured via indoor surveillance cameras. Unlike existing datasets, ours emphasizes early-stage fire dynamics (pre-flashover) and includes varied fire sources (e.g., sofa, cupboard, and attic fires), realistic false alarm triggers (e.g., flame-colored objects, artificial lighting), and a wide range of spatial layouts and illumination conditions. This collection enables robust training and benchmarking for early indoor fire detection. Using this dataset, we developed a spatiotemporal fire detection model based on the mixed-convolution ResNet (MC3_18) architecture, augmented with Convolutional Block Attention Modules (CBAM). The proposed model achieved 86.11% accuracy, 88.76% precision, and 84.04% recall, along with low false positive (11.63%) and false negative (15.96%) rates. Compared to its CBAM-free baseline, the model exhibits notable improvements in F1-score and interpretability, as confirmed by Grad-CAM++ visualizations highlighting attention to semantically meaningful fire features. These results demonstrate that effective early fire detection is inseparable from high-quality, context-specific datasets. Our work introduces a scalable, safety-driven approach that advances the development of reliable, interpretable, and deployment-ready fire detection systems for residential environments.
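For readers who want a concrete starting point, the sketch below shows one plausible way to attach a CBAM-style attention block to torchvision's MC3_18 video backbone, as the abstract describes. It is a minimal illustration, not the authors' implementation; the insertion point after layer4, the reduction ratio, and the two-class head are assumptions.

```python
# Minimal sketch (not the paper's code): CBAM-style attention on MC3_18.
import torch
import torch.nn as nn
from torchvision.models.video import mc3_18

class CBAM3D(nn.Module):
    """Channel + spatial attention for 5D (N, C, T, H, W) feature maps."""
    def __init__(self, channels, reduction=16, spatial_kernel=7):
        super().__init__()
        # Channel attention: shared MLP over global avg- and max-pooled descriptors.
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        # Spatial attention: 2-channel (avg, max) map -> 1-channel attention map.
        self.spatial = nn.Conv3d(2, 1, spatial_kernel, padding=spatial_kernel // 2)

    def forward(self, x):
        n, c = x.shape[:2]
        avg = self.mlp(x.mean(dim=(2, 3, 4)))   # (N, C)
        mx = self.mlp(x.amax(dim=(2, 3, 4)))    # (N, C)
        x = x * torch.sigmoid(avg + mx).view(n, c, 1, 1, 1)
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))

class FireDetector(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        base = mc3_18(weights=None)
        self.features = nn.Sequential(base.stem, base.layer1, base.layer2,
                                      base.layer3, base.layer4)
        self.cbam = CBAM3D(512)                 # layer4 emits 512 channels
        self.pool = nn.AdaptiveAvgPool3d(1)
        self.fc = nn.Linear(512, num_classes)

    def forward(self, clips):                   # clips: (N, 3, T, H, W)
        f = self.cbam(self.features(clips))
        return self.fc(self.pool(f).flatten(1))
```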

21 pages, 15478 KiB  
Review
Small Object Detection in Traffic Scenes for Mobile Robots: Challenges, Strategies, and Future Directions
by Zhe Wei, Yurong Zou, Haibo Xu and Sen Wang
Electronics 2025, 14(13), 2614; https://doi.org/10.3390/electronics14132614 - 28 Jun 2025
Abstract
Small object detection in traffic scenes presents unique challenges for mobile robots operating under constrained computational resources and highly dynamic environments. Unlike general object detection, small targets often suffer from low resolution, weak semantic cues, and frequent occlusion, especially in complex outdoor scenarios. This study systematically analyses the challenges, technical advances, and deployment strategies for small object detection tailored to mobile robotic platforms. We categorise existing approaches into three main strategies: feature enhancement (e.g., multi-scale fusion, attention mechanisms), network architecture optimisation (e.g., lightweight backbones, anchor-free heads), and data-driven techniques (e.g., augmentation, simulation, transfer learning). Furthermore, we examine deployment techniques on embedded devices such as Jetson Nano and Raspberry Pi, and we highlight multi-modal sensor fusion using Light Detection and Ranging (LiDAR), cameras, and Inertial Measurement Units (IMUs) for enhanced environmental perception. A comparative study of public datasets and evaluation metrics is provided to identify current limitations in real-world benchmarking. Finally, we discuss future directions, including robust detection under extreme conditions and human-in-the-loop incremental learning frameworks. This research aims to offer a comprehensive technical reference for researchers and practitioners developing small object detection systems for real-world robotic applications.
(This article belongs to the Special Issue New Trends in Computer Vision and Image Processing)
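As a hedged illustration of the first strategy the review names (multi-scale feature fusion), the sketch below implements a generic FPN-style top-down pathway; the channel sizes are hypothetical and the code is not tied to any particular detector in the survey.

```python
# Generic top-down multi-scale fusion (FPN-style) sketch for small objects.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopDownFusion(nn.Module):
    def __init__(self, in_channels=(256, 512, 1024), out_channels=256):
        super().__init__()
        self.lateral = nn.ModuleList([nn.Conv2d(c, out_channels, 1)
                                      for c in in_channels])
        self.smooth = nn.ModuleList([nn.Conv2d(out_channels, out_channels, 3,
                                               padding=1) for _ in in_channels])

    def forward(self, feats):        # feats: high-res -> low-res backbone maps
        laterals = [l(f) for l, f in zip(self.lateral, feats)]
        # Propagate coarse semantics down to fine resolutions, where small
        # objects still occupy enough pixels to be localized.
        for i in range(len(laterals) - 2, -1, -1):
            laterals[i] = laterals[i] + F.interpolate(
                laterals[i + 1], size=laterals[i].shape[-2:], mode="nearest")
        return [s(l) for s, l in zip(self.smooth, laterals)]
```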

19 pages, 4327 KiB  
Article
Research on a Two-Stage Human-like Trajectory-Planning Method Based on a DAC-MCLA Network
by Hao Xu, Guanyu Zhang and Huanyu Zhao
Vehicles 2025, 7(3), 63; https://doi.org/10.3390/vehicles7030063 - 24 Jun 2025
Abstract
Because unstructured environments are complex and tracked transportation vehicles must travel smoothly, making such a vehicle travel as safely and smoothly as it would under a skilled operator is a critical issue worth studying. To this end, this study proposes a trajectory-planning method for human-like maneuvering. First, several field equipment operators are invited to drive a model vehicle around densely distributed obstacles in an outdoor scene, and their manipulation data are collected. Next, by comparing the similarity of the lateral-displacement data and the degree of curvature change, the smoothest recordings are screened out, establishing a dataset of human manipulation behaviors for training and testing the trajectory-planning network. Then, with dynamic parameters as constraints, a two-stage planning approach uses a modified deep network model to map trajectory points at multiple future time steps through the relationship between the spatial environment and the time series. Finally, in experimental comparisons with multiple methods, the root-mean-square error and mean-absolute error between the planned and actual trajectories, together with the trajectory-fitting results, show that the proposed method can plan long-horizon trajectory points in line with human manipulation habits, and the standard deviation of the angular acceleration and the curvature of the planned trajectory show that it achieves satisfactory smoothness.
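A hedged sketch of the screening step described above: score each recorded demonstration by the variability of its discrete curvature and keep the smoothest fraction. The finite-difference curvature formula and the keep ratio are illustrative assumptions, not the paper's exact criterion.

```python
# Curvature-based smoothness screening of recorded trajectories (sketch).
import numpy as np

def discrete_curvature(xy):
    """k = (x'y'' - y'x'') / (x'^2 + y'^2)^(3/2) via finite differences."""
    dx, dy = np.gradient(xy[:, 0]), np.gradient(xy[:, 1])
    ddx, ddy = np.gradient(dx), np.gradient(dy)
    denom = (dx**2 + dy**2) ** 1.5 + 1e-9
    return (dx * ddy - dy * ddx) / denom

def screen_smooth_trajectories(trajectories, keep_ratio=0.5):
    """Keep the fraction of demonstrations with the least curvature variation."""
    scores = [np.std(discrete_curvature(t)) for t in trajectories]
    cutoff = np.quantile(scores, keep_ratio)
    return [t for t, s in zip(trajectories, scores) if s <= cutoff]
```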

18 pages, 3132 KiB  
Article
ICAFormer: An Image Dehazing Transformer Based on Interactive Channel Attention
by Yanfei Chen, Tong Yue, Pei An, Hanyu Hong, Tao Liu, Yangkai Liu and Yihui Zhou
Sensors 2025, 25(12), 3750; https://doi.org/10.3390/s25123750 - 15 Jun 2025
Cited by 1
Abstract
Single image dehazing is a fundamental task in computer vision, aiming to recover a clear scene from a hazy input image. To address the limitations of traditional dehazing algorithms—particularly in global feature association and local detail preservation—this study proposes a novel Transformer-based dehazing model enhanced by an interactive channel attention mechanism. The proposed architecture adopts a U-shaped encoder–decoder framework, incorporating key components such as a feature extraction module and a feature fusion module based on interactive attention. Specifically, the interactive channel attention mechanism facilitates cross-layer feature interaction, enabling the dynamic fusion of global contextual information and local texture details. The network architecture leverages a multi-scale feature pyramid to extract image information across different dimensions, while an improved cross-channel attention weighting mechanism enhances feature representation in regions with varying haze densities. Extensive experiments conducted on both synthetic and real-world datasets—including the RESIDE benchmark—demonstrate the superior performance of the proposed method. Quantitatively, it achieves PSNR gains of 0.53 dB for indoor scenes and 1.64 dB for outdoor scenes, alongside SSIM improvements of 1.4% and 1.7%, respectively, compared with the second-best performing method. Qualitative assessments further confirm that the proposed model excels in restoring fine structural details in dense haze regions while maintaining high color fidelity. These results validate the effectiveness of the proposed approach in enhancing both perceptual quality and quantitative accuracy in image dehazing tasks.
(This article belongs to the Section Sensing and Imaging)
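To make the cross-layer idea concrete, here is a minimal sketch of "interactive" channel attention: channel weights computed from a decoder feature re-weight the matching encoder feature before fusion. This illustrates the general mechanism only and is not the published ICAFormer module; the MLP shape and 1x1 fusion are assumptions.

```python
# Cross-layer channel attention sketch: decoder context gates encoder channels.
import torch
import torch.nn as nn

class InteractiveChannelAttention(nn.Module):
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())
        self.fuse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, encoder_feat, decoder_feat):
        # Global context from the decoder side gates encoder channels, letting
        # haze-density cues modulate fine texture features (same spatial size).
        w = self.mlp(decoder_feat.mean(dim=(2, 3)))      # (N, C)
        gated = encoder_feat * w[:, :, None, None]
        return self.fuse(torch.cat([gated, decoder_feat], dim=1))
```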

24 pages, 1732 KiB  
Article
Model-Based Design of Contrast-Limited Histogram Equalization for Low-Complexity, High-Speed, and Low-Power Tone-Mapping Operation
by Wei Dong, Maikon Nascimento and Dileepan Joseph
Electronics 2025, 14(12), 2416; https://doi.org/10.3390/electronics14122416 - 13 Jun 2025
Abstract
Imaging applications involving outdoor scenes and fast motion require sensing and processing of high-dynamic-range images at video rates. In turn, image signal processing pipelines that serve low-dynamic-range displays require tone mapping operators (TMOs). For high-speed and low-power applications with low-cost field-programmable gate arrays (FPGAs), global TMOs that employ contrast-limited histogram equalization prove ideal. To develop such TMOs, this work proposes a MATLAB–Simulink–Vivado design flow. A realized design capable of megapixel video rates using milliwatts of power requires only a fraction of the resources available in the lowest-cost Artix-7 device from Xilinx (now Advanced Micro Devices). Unlike histogram-based TMO approaches for nonlinear sensors in the literature, this work exploits Simulink modeling to reduce the total required FPGA memory by orders of magnitude with minimal impact on video output. After refactoring an approach from the literature that incorporates two subsystems (Base Histograms and Tone Mapping) to one incorporating four subsystems (Scene Histogram, Perceived Histogram, Tone Function, and Global Mapping), memory is exponentially reduced by introducing a fifth subsystem (Interpolation). As a crucial stepping stone between MATLAB algorithm abstraction and Vivado circuit realization, the Simulink modeling facilitated a bit-true design flow.
(This article belongs to the Special Issue Design of Low-Voltage and Low-Power Integrated Circuits)
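Contrast-limited histogram equalization itself is compact enough to sketch in a few lines of NumPy: clip the histogram, redistribute the excess uniformly, and map pixels through the normalized CDF. The bit depths and clip limit below are illustrative; the paper's contribution is the memory-lean Simulink/FPGA realization, which this sketch does not model.

```python
# Contrast-limited histogram equalization as a global tone mapper (sketch).
import numpy as np

def clhe_tonemap(hdr, in_bits=16, out_bits=8, clip_limit=4.0):
    bins = 2 ** in_bits
    hist, _ = np.histogram(hdr, bins=bins, range=(0, bins))
    # Limit any bin to clip_limit x the mean count, then spread the excess
    # uniformly so large uniform regions cannot monopolize output codes.
    ceiling = clip_limit * hist.mean()
    excess = np.maximum(hist - ceiling, 0).sum()
    hist = np.minimum(hist, ceiling) + excess / bins
    cdf = np.cumsum(hist)
    lut = np.round(cdf / cdf[-1] * (2 ** out_bits - 1)).astype(np.uint8)
    return lut[hdr]

# Example: map a synthetic 16-bit frame to 8 bits.
frame = np.random.randint(0, 2**16, size=(480, 640), dtype=np.uint16)
ldr = clhe_tonemap(frame)
```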

19 pages, 5986 KiB  
Article
Gaussian-UDSR: Real-Time Unbounded Dynamic Scene Reconstruction with 3D Gaussian Splatting
by Yang Sun, Yue Zhou, Bin Tian, Haiyang Wang, Yongchao Zhao and Songdi Wu
Appl. Sci. 2025, 15(11), 6262; https://doi.org/10.3390/app15116262 - 2 Jun 2025
Abstract
Unbounded dynamic scene reconstruction is crucial for applications such as autonomous driving, robotics, and virtual reality. However, existing methods struggle to reconstruct dynamic scenes in unbounded outdoor environments due to challenges such as lighting variation, object motion, and sensor limitations, leading to inaccurate geometry and low rendering fidelity. In this paper, we propose Gaussian-UDSR, a novel 3D Gaussian-based representation that efficiently reconstructs and renders high-quality, unbounded dynamic scenes in real time. Our approach fuses LiDAR point clouds and Structure-from-Motion (SfM) point clouds obtained from an RGB camera, significantly improving depth estimation and geometric accuracy. To address dynamic appearance variations, we introduce a Gaussian color feature prediction network, which adaptively captures global and local feature information, enabling robust rendering under changing lighting conditions. Additionally, a pose-tracking mechanism ensures precise motion estimation for dynamic objects, enhancing realism and consistency. We evaluated Gaussian-UDSR on the Waymo and KITTI datasets, demonstrating state-of-the-art rendering quality with an 8.8% improvement in PSNR, a 75% reduction in LPIPS, and a fourfold speed improvement over existing methods. Our approach enables efficient, high-fidelity 3D reconstruction and fast real-time rendering of large-scale dynamic environments, while significantly reducing model storage overhead.
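As a rough illustration of the fusion input described above, the sketch below merges LiDAR and SfM points into a single cloud, deduplicated on a voxel grid, as might be used to initialize the Gaussians. The voxel size and deduplication strategy are assumptions for illustration, not the paper's method.

```python
# Sketch: merge LiDAR and SfM point clouds for Gaussian initialization.
import numpy as np

def fuse_point_clouds(lidar_xyz, sfm_xyz, voxel=0.1):
    pts = np.vstack([lidar_xyz, sfm_xyz])
    # Voxel-grid deduplication: keep one point per occupied voxel so dense
    # LiDAR returns do not swamp sparse but well-localized SfM points.
    keys = np.floor(pts / voxel).astype(np.int64)
    _, idx = np.unique(keys, axis=0, return_index=True)
    return pts[idx]
```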

26 pages, 10564 KiB  
Article
DynaFusion-SLAM: Multi-Sensor Fusion and Dynamic Optimization of Autonomous Navigation Algorithms for Pasture-Pushing Robot
by Zhiwei Liu, Jiandong Fang and Yudong Zhao
Sensors 2025, 25(11), 3395; https://doi.org/10.3390/s25113395 - 28 May 2025
Abstract
Autonomous navigation based on multi-sensor fusion in complex pasture scenarios remains little studied, with shallow sensor fusion and insufficient path-cruising accuracy in complex outdoor environments. To address these problems, a multimodal autonomous navigation system is proposed based on a loosely coupled Cartographer–RTAB-Map (real-time appearance-based mapping) architecture. Through laser–visual–inertial multi-sensor data fusion, the system achieves high-precision mapping and robust path planning in complex scenes. First, mainstream laser SLAM algorithms (Hector, Gmapping, Cartographer) are compared in simulation; Cartographer shows a significant memory-efficiency advantage in large-scale scenarios and is chosen as the front-end odometry source. Second, a two-way position optimization mechanism is designed: (1) during mapping, Cartographer processes the laser scans together with IMU and odometer data to generate odometry estimates, which provide positioning compensation for RTAB-Map; (2) RTAB-Map fuses the depth-camera point cloud and laser data, corrects the global position through visual closed-loop detection, and then uses 2D localization to construct a bimodal environment representation containing a 2D raster map and a 3D point cloud, fully describing the simulated ranch environment and material morphology, on which the navigation framework of the pushing robot is built. During navigation, RTAB-Map's global localization is combined with AMCL's local localization, and IMU and odometer data are fused through an EKF to produce a smoother, more robust pose. Global path planning uses Dijkstra's algorithm, combined with the Timed Elastic Band (TEB) algorithm for local path planning. Finally, experimental validation in a laboratory-simulated pasture environment shows that fusing multi-source odometry significantly improves RTAB-Map's performance: the maximum absolute map-measurement error narrows from 24.908 cm to 4.456 cm, the maximum absolute relative error falls from 6.227% to 2.025%, and the absolute error at each location is markedly reduced. Multi-source odometry fusion also effectively prevents large-scale offset or drift during map construction. On this basis, the robot constructs a fused map containing the simulated pasture environment and material patterns. In navigation accuracy tests, the proposed method reduces the root-mean-square error (RMSE) by 1.7% and the standard deviation by 2.7% compared with RTAB-Map, and by 26.7% and 22.8%, respectively, compared with the AMCL algorithm. The robot successfully traverses six preset points, with the measured X, Y, and overall position errors at all six points meeting the requirements of the pasture-pushing task, and returns to the starting point after completing multi-point navigation, achieving autonomous navigation.
(This article belongs to the Section Navigation and Positioning)
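The EKF fusion of IMU and odometer data mentioned above can be illustrated with a minimal planar filter: odometry increments drive the prediction and an IMU yaw reading corrects it. The state layout, noise values, and yaw-only measurement model are illustrative assumptions, far simpler than a full robot-localization setup.

```python
# Minimal planar EKF sketch: odometry predicts, IMU yaw corrects.
import numpy as np

def ekf_step(x, P, odom, imu_yaw, Q=np.diag([0.01, 0.01, 0.005]), r_yaw=0.002):
    """x = [px, py, yaw]; odom = (dx, dy, dyaw) in the robot frame."""
    dx, dy, dyaw = odom
    c, s = np.cos(x[2]), np.sin(x[2])
    # Predict: compose the body-frame odometry increment onto the pose.
    x_pred = x + np.array([c * dx - s * dy, s * dx + c * dy, dyaw])
    F = np.array([[1, 0, -s * dx - c * dy],
                  [0, 1,  c * dx - s * dy],
                  [0, 0, 1]])
    P = F @ P @ F.T + Q
    # Update with the IMU yaw measurement (angle wrapping omitted for brevity).
    H = np.array([[0.0, 0.0, 1.0]])
    y = np.array([imu_yaw - x_pred[2]])
    S = H @ P @ H.T + r_yaw
    K = P @ H.T / S
    x_new = x_pred + (K * y).ravel()
    P_new = (np.eye(3) - K @ H) @ P
    return x_new, P_new
```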

28 pages, 16050 KiB  
Article
Advancing ALS Applications with Large-Scale Pre-Training: Framework, Dataset, and Downstream Assessment
by Haoyi Xiu, Xin Liu, Taehoon Kim and Kyoung-Sook Kim
Remote Sens. 2025, 17(11), 1859; https://doi.org/10.3390/rs17111859 - 27 May 2025
Abstract
The pre-training and fine-tuning paradigm has significantly advanced satellite remote sensing applications. However, its potential remains largely underexplored for airborne laser scanning (ALS), a key technology in domains such as forest management and urban planning. In this study, we address this gap by constructing a large-scale ALS point cloud dataset and evaluating its effectiveness in downstream applications. We first propose a simple, generalizable framework for dataset construction, designed to maximize land cover and terrain diversity while allowing flexible control over dataset size. We instantiate this framework using ALS, land cover, and terrain data collected across the contiguous United States, resulting in a dataset geographically covering 17,000+ km² (184 billion points) with diverse land cover and terrain types. As a baseline self-supervised learning model, we adopt BEV-MAE, a state-of-the-art masked autoencoder for 3D outdoor point clouds, and pre-train it on the constructed dataset. The resulting models are fine-tuned for several downstream tasks, including tree species classification, terrain scene recognition, and point cloud semantic segmentation. Our results show that pre-trained models consistently outperform their counterparts trained from scratch across all downstream tasks, demonstrating the strong transferability of the learned representations. Additionally, we find that scaling the dataset using the proposed framework leads to consistent performance improvements, whereas datasets constructed via random sampling fail to achieve comparable gains.
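To give a flavor of masked pre-training on ALS point clouds, the sketch below hides a random fraction of bird's-eye-view grid cells, whose points a masked autoencoder would then be trained to reconstruct. The cell size and mask ratio are illustrative assumptions; BEV-MAE's actual masking and decoder are more involved.

```python
# Sketch: random BEV-cell masking of an ALS point cloud for MAE pre-training.
import numpy as np

def mask_bev_cells(points, cell=2.0, mask_ratio=0.7,
                   rng=np.random.default_rng(0)):
    """points: (N, 3) ALS returns. Returns (visible, hidden) point subsets."""
    keys = np.floor(points[:, :2] / cell).astype(np.int64)
    cells, inverse = np.unique(keys, axis=0, return_inverse=True)
    hidden_cells = rng.random(len(cells)) < mask_ratio
    hidden = hidden_cells[inverse]
    return points[~hidden], points[hidden]
```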

24 pages, 1301 KiB  
Article
Semantic-Guided Multi-Feature Attention Aggregation Network for LiDAR-Based 3D Object Detection
by Jingwen Zhao, Zhicong Huang, Zhijie Zheng, Yunliang Long and Haifeng Hu
Electronics 2025, 14(11), 2154; https://doi.org/10.3390/electronics14112154 - 26 May 2025
Abstract
The sparse and uneven distribution of point clouds in LiDAR-captured outdoor scenes poses significant challenges for 3D object detection in autonomous driving. Specifically, the imbalance between foreground and background points can degrade detection accuracy. While existing approaches attempt to address this issue through sampling or segmentation strategies, effectively retaining informative foreground points and integrating features from multiple sources remains a challenge. To tackle these issues, we propose SMA2, a semantic-guided multi-feature attention aggregation network. It consists of two key components: the Keypoint Attention Enhancement (KAE) module, which refines keypoints by leveraging semantic information through attention-based local aggregation, and the Multi-Feature Attention Aggregation (MFAA) module, which adaptively integrates keypoint, voxel, and BEV features using a keypoint-guided attention mechanism. Compared to existing fusion methods such as PV-RCNN, SMA2 provides a more flexible and context-aware feature integration strategy. Experimental results on the KITTI test set demonstrate consistent performance improvements, especially in detection accuracy for small and distant objects. Additional tests on the Waymo and DAIR-V2X-V datasets further highlight the method's strong generalization capability across diverse environments.
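A minimal sketch of keypoint-guided attention over several feature sources (keypoint, voxel, BEV), in the spirit of the MFAA module described above; the single-query attention form and dimensions are assumptions for illustration, not the published module.

```python
# Sketch: a keypoint feature queries and fuses multiple per-keypoint sources.
import torch
import torch.nn as nn

class KeypointGuidedFusion(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.q = nn.Linear(dim, dim)        # query from the keypoint feature
        self.kv = nn.Linear(dim, 2 * dim)   # keys/values from each source

    def forward(self, kp_feat, source_feats):
        # kp_feat: (N, D); source_feats: (N, S, D) with S feature sources.
        q = self.q(kp_feat).unsqueeze(1)                    # (N, 1, D)
        k, v = self.kv(source_feats).chunk(2, dim=-1)       # (N, S, D) each
        attn = torch.softmax((q * k).sum(-1) / k.shape[-1] ** 0.5, dim=-1)
        return (attn.unsqueeze(-1) * v).sum(1)              # fused (N, D)
```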

20 pages, 24073 KiB  
Article
Comparison of Directional and Diffused Lighting for Pixel-Level Segmentation of Concrete Cracks
by Hamish Dow, Marcus Perry, Jack McAlorum and Sanjeetha Pennada
Infrastructures 2025, 10(6), 129; https://doi.org/10.3390/infrastructures10060129 - 25 May 2025
Abstract
Visual inspections of concrete infrastructure in low-light environments require external lighting to ensure adequate visibility. Directional lighting sources, where an image scene is illuminated with an angled lighting source from one direction, can enhance the visibility of surface defects in an image. This paper compares directional and diffused scene illumination images for pixel-level concrete crack segmentation. A novel directional lighting image segmentation algorithm is proposed, which applies crack segmentation image processing techniques to each directionally lit image before combining all images into a single output, highlighting the extremities of the defect. This method was benchmarked against two diffused lighting crack detection techniques across a dataset with crack widths typically ranging from 0.07 mm to 0.4 mm. When tested on cracked and uncracked data, the directional lighting method significantly outperformed other benchmarked diffused lighting methods, attaining a 10% higher true-positive rate (TPR), 12% higher intersection over union (IoU), and 10% higher F1 score with minimal impact on precision. Further testing on only cracked data revealed that directional lighting was superior across all crack widths in the dataset. This research shows that directional lighting can enhance pixel-level crack segmentation in infrastructure requiring external illumination, such as low-light indoor spaces (e.g., tunnels and containment structures) or night-time outdoor inspections (e.g., pavement and bridges).
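The combination step lends itself to a short sketch: segment each directionally lit image, then take the union of the per-direction masks so a crack edge visible under any lighting angle survives. The adaptive threshold below is a simple stand-in for the paper's segmentation pipeline, and the parameters are illustrative.

```python
# Sketch: per-direction crack segmentation combined by mask union.
import cv2
import numpy as np

def segment_directional(images):
    """images: list of uint8 grayscale frames, one per lighting direction."""
    masks = []
    for img in images:
        blur = cv2.GaussianBlur(img, (5, 5), 0)
        # Dark, thin cracks -> inverted adaptive threshold (stand-in method).
        mask = cv2.adaptiveThreshold(blur, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                     cv2.THRESH_BINARY_INV, 31, 5)
        masks.append(mask)
    return np.bitwise_or.reduce(masks)   # union across lighting directions
```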
(This article belongs to the Section Infrastructures Inspection and Maintenance)

20 pages, 11623 KiB  
Article
Research on the Improvement of the Signal Time Delay Estimation Method of Acoustic Positioning for Anti-Low Altitude UAVs
by Miao Liu, Jiyan Yu and Zhengpeng Yang
Sensors 2025, 25(9), 2735; https://doi.org/10.3390/s25092735 - 25 Apr 2025
Abstract
With the popularity of low-altitude small unmanned aerial vehicles (UAVs), UAVs are often used to take candid photos or even carry out malicious attacks. Acoustic detection can be used to locate UAVs in order to prevent such attacks. To address the large error of time delay estimation algorithms at a low SNR, a time delay estimation algorithm based on an improved weighting function combined with a generalized cubic cross-correlation is introduced. By analyzing and comparing the generalized cross-correlation time delay estimation performance of traditional weighting functions, an improved weighting function that combines an improved smoothed coherence transform (SCOT) and the phase transform (PHAT) is proposed. Compared with traditional generalized cross-correlation weighting functions, the improved function yields a sharper and higher peak and a smaller time delay estimation error at a low SNR. Combining the improved weighting function with the generalized cubic cross-correlation further raises and sharpens the main peak, and the resulting time delay estimation outperforms the generalized cubic cross-correlation and the generalized quadratic correlation alone. Experimental results show that in complex outdoor scenes, the positioning error of the unimproved GCC-PHAT method is 45.22 cm, while the positioning error of the improved weighted generalized cubic cross-correlation algorithm is no more than 22.1 cm, a 35.55% performance improvement over the unimproved GCC-PHAT method. This method helps improve the positioning of low-flying UAVs and can support anti-terrorism security against malicious UAV attacks.
(This article belongs to the Section Remote Sensors)
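For reference, the baseline the paper improves on, generalized cross-correlation with PHAT weighting, can be sketched in NumPy as below; the SCOT+PHAT hybrid weighting and the cubic cross-correlation are the paper's contributions and are not reproduced here.

```python
# GCC-PHAT time delay estimation between two microphone signals (sketch).
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None):
    n = len(sig) + len(ref)
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    R = SIG * np.conj(REF)
    # PHAT weighting: normalize magnitude, keep phase only.
    cc = np.fft.irfft(R / (np.abs(R) + 1e-12), n=n)
    max_shift = n // 2 if max_tau is None else min(int(fs * max_tau), n // 2)
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    return (np.argmax(np.abs(cc)) - max_shift) / fs   # delay in seconds
```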

15 pages, 3167 KiB  
Article
Building a Realistic Virtual Luge Experience Using Photogrammetry
by Bernhard Hollaus, Jonas Kreiner, Maximilian Gallinat, Meggy Hayotte and Denny Yu
Sensors 2025, 25(8), 2568; https://doi.org/10.3390/s25082568 - 18 Apr 2025
Abstract
Virtual reality (VR) continues to evolve, offering immersive experiences across various domains, especially in virtual training scenarios. The aim of this study is to present the development of a VR simulator and to examine its realism, usability, and acceptance by luge experts after an experiment with a VR simulation. We present a novel photogrammetry-to-VR sensing pipeline for the sport of luge, designed to be as close to the real luge experience as possible, potentially enabling users to learn critical techniques safely prior to real-world trials. Key features of our application include realistic terrain created with photogrammetry and responsive sled dynamics. Consultation with experts from the Austrian Luge Federation led to several design improvements to the VR environment, especially regarding user experience aspects such as lifelike feedback and interface responsiveness. Furthermore, user interaction was optimized to enable precise steering and maneuvering, and two learning modes were developed to accommodate user experience levels (novice and expert). The results indicated a good level of realism of the VR luge simulator: participants reported scene, audience behavior, and sound realism scores ranging from 3/5 to 4/5. Our findings indicated adequate usability (system usability score: 72.7, SD = 13.9), and moderate scores were observed for the acceptance of VRodel. In conclusion, our virtual luge application offers a promising avenue for exploring the potential of VR technology in delivering authentic outdoor recreation experiences that could increase safety in the sport of luge. By integrating advanced sensing, simulations, and interactive features, we aim to push the boundaries of realism in virtual lugeing and pave the way for future advancements in immersive entertainment and simulation applications.

32 pages, 8687 KiB  
Article
Hybrid Deep Learning Methods for Human Activity Recognition and Localization in Outdoor Environments
by Yirga Yayeh Munaye, Metadel Addis, Yenework Belayneh, Atinkut Molla and Wasyihun Admass
Algorithms 2025, 18(4), 235; https://doi.org/10.3390/a18040235 - 18 Apr 2025
Abstract
Activity recognition and localization in outdoor environments involve identifying and tracking human movements using sensor data, computer vision, or deep learning techniques. This process is crucial for applications such as smart surveillance, autonomous systems, healthcare monitoring, and human–computer interaction. However, several challenges arise in outdoor settings, including varying lighting conditions, occlusions caused by obstacles, environmental noise, and the complexity of differentiating between similar activities. This study presents a hybrid deep learning approach that integrates human activity recognition and localization in outdoor environments using Wi-Fi signal data. The study focuses on applying the hybrid long short-term memory–bidirectional gated recurrent unit (LSTM-BIGRU) architecture, designed to enhance the accuracy of activity recognition and location estimation. Moreover, experiments were conducted using a real-world dataset collected with the PicoScene Wi-Fi sensing device, which captures both magnitude and phase information. The results demonstrated a significant improvement in accuracy for both activity recognition and localization tasks. To mitigate data scarcity, this study utilized the conditional tabular generative adversarial network (CTGAN) to generate synthetic channel state information (CSI) data. Additionally, carrier frequency offset (CFO) and cyclic shift delay (CSD) preprocessing techniques were implemented to mitigate phase fluctuations. The experiments were conducted in a line-of-sight (LoS) outdoor environment, where CSI data were collected using the PicoScene Wi-Fi sensor platform across four different activities. Finally, a comparative analysis of the experimental results highlights the superior performance of the proposed hybrid LSTM-BIGRU model, achieving 99.81% and 98.93% accuracy for activity recognition and location prediction, respectively.
(This article belongs to the Section Algorithms for Multidisciplinary Applications)
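A hedged sketch of the hybrid recurrent classifier named above: an LSTM layer followed by a bidirectional GRU over CSI sequences. Layer widths, the sequence shape, and the four-class head are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch: hybrid LSTM + bidirectional-GRU classifier over CSI sequences.
import torch
import torch.nn as nn

class LSTMBiGRU(nn.Module):
    def __init__(self, csi_dim=64, hidden=128, num_classes=4):
        super().__init__()
        self.lstm = nn.LSTM(csi_dim, hidden, batch_first=True)
        self.bigru = nn.GRU(hidden, hidden, batch_first=True,
                            bidirectional=True)
        self.head = nn.Linear(2 * hidden, num_classes)

    def forward(self, csi):           # csi: (N, T, csi_dim)
        h, _ = self.lstm(csi)         # temporal features
        g, _ = self.bigru(h)          # bidirectional context
        return self.head(g[:, -1])    # classify from the last time step
```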

38 pages, 9310 KiB  
Review
From ADAS to Material-Informed Inspection: Review of Hyperspectral Imaging Applications on Mobile Ground Robots
by Daniil Valme, Anton Rassõlkin and Dhanushka C. Liyanage
Sensors 2025, 25(8), 2346; https://doi.org/10.3390/s25082346 - 8 Apr 2025
Cited by 1
Abstract
Hyperspectral imaging (HSI) has evolved from its origins in space missions to become a promising sensing technology for mobile ground robots, offering unique capabilities in material identification and scene understanding. This review examines the integration and applications of HSI systems in ground-based mobile platforms, with emphasis on outdoor implementations. The analysis covers recent developments in two main application domains: autonomous navigation and inspection tasks. In navigation, the review explores HSI applications in Advanced Driver Assistance Systems (ADAS) and off-road scenarios, examining how spectral information enhances environmental perception and decision making. For inspection applications, the investigation covers HSI deployment in search and rescue operations, mining exploration, and infrastructure monitoring. The review addresses key technical aspects including sensor types, acquisition modes, and platform integration challenges, particularly focusing on environmental factors affecting outdoor HSI deployment. Additionally, it analyzes available datasets and annotation approaches, highlighting their significance for developing robust classification algorithms. While recent advances in sensor design and processing capabilities have expanded HSI applications, challenges remain in real-time processing, environmental robustness, and system cost. The review concludes with a discussion of future research directions and opportunities for advancing HSI technology in mobile robotics applications.

19 pages, 15020 KiB  
Article
Discrete Diffusion-Based Generative Semantic Scene Completion
by Yiqi Wu, Xuan Huang, Boxiong Yang, Yong Chen, Fadi Aburaid and Dejun Zhang
Electronics 2025, 14(7), 1447; https://doi.org/10.3390/electronics14071447 - 3 Apr 2025
Abstract
Semantic scene completion through AI-driven content generation is a rapidly evolving field with crucial applications in 3D reconstruction and scene understanding. The task is considerably challenging due to the intrinsic sparsity and incompleteness of the input points generated by LiDAR. This paper proposes a generative semantic scene completion method based on a discrete denoising diffusion probabilistic model to tackle these issues. In the discrete diffusion phase, a weighted K-nearest-neighbor uniform transition kernel is introduced, based on feature distance in the discretized voxel space, to control the category-distribution transition process by capturing the local structure of the data, which better matches diffusion processes in the real world. Moreover, to mitigate the loss of feature information during point cloud voxelization, aggregated point features are integrated into the corresponding voxel space, enhancing the granularity of the completion. Accordingly, a combined loss function is designed for network training that considers both the KL divergence for global completion and the cross-entropy for local details. Evaluation results on multiple public outdoor datasets demonstrate that the proposed method effectively accomplishes semantic scene completion.
(This article belongs to the Section Artificial Intelligence)
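The combined training objective described above can be sketched directly: per-voxel cross-entropy for local detail plus a KL term matching the scene-level class distribution for global completion. The weighting and the way the global distributions are formed are illustrative assumptions, not the paper's exact loss.

```python
# Sketch: combined cross-entropy (local) + KL (global) completion loss.
import torch
import torch.nn.functional as F

def combined_loss(logits, target, lambda_kl=0.5):
    """logits: (N, C, X, Y, Z) voxel class scores; target: (N, X, Y, Z) labels."""
    ce = F.cross_entropy(logits, target)
    # Global term: match the scene-level class frequency distribution.
    pred_dist = torch.softmax(logits, dim=1).mean(dim=(2, 3, 4))        # (N, C)
    tgt_dist = F.one_hot(target.flatten(1),
                         num_classes=logits.shape[1]).float().mean(dim=1)
    kl = F.kl_div(pred_dist.clamp_min(1e-8).log(), tgt_dist,
                  reduction="batchmean")
    return ce + lambda_kl * kl
```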