Search Results (75)

Search Parameters:
Keywords = indoor dynamic scenes

18 pages, 2545 KiB  
Article
Reliable Indoor Fire Detection Using Attention-Based 3D CNNs: A Fire Safety Engineering Perspective
by Mostafa M. E. H. Ali and Maryam Ghodrat
Fire 2025, 8(7), 285; https://doi.org/10.3390/fire8070285 - 21 Jul 2025
Viewed by 474
Abstract
Despite recent advances in deep learning for fire detection, much of the current research prioritizes model-centric metrics over dataset fidelity, particularly from a fire safety engineering perspective. Commonly used datasets are often dominated by fully developed flames, mislabel smoke-only frames as non-fire, or lack intra-video diversity due to redundant frames from limited sources. Some works treat smoke detection alone as early-stage detection, even though many fires (e.g., electrical or chemical) begin with visible flames and no smoke. Additionally, attempts to improve model applicability through mixed-context datasets—combining indoor, outdoor, and wildland scenes—often overlook the unique false alarm sources and detection challenges specific to each environment. To address these limitations, we curated a new video dataset comprising 1108 annotated fire and non-fire clips captured via indoor surveillance cameras. Unlike existing datasets, ours emphasizes early-stage fire dynamics (pre-flashover) and includes varied fire sources (e.g., sofa, cupboard, and attic fires), realistic false alarm triggers (e.g., flame-colored objects, artificial lighting), and a wide range of spatial layouts and illumination conditions. This collection enables robust training and benchmarking for early indoor fire detection. Using this dataset, we developed a spatiotemporal fire detection model based on the mixed-convolution ResNet (MC3_18) architecture, augmented with Convolutional Block Attention Modules (CBAM). The proposed model achieved 86.11% accuracy, 88.76% precision, and 84.04% recall, along with low false positive (11.63%) and false negative (15.96%) rates. Compared to its CBAM-free baseline, the model exhibits notable improvements in F1-score and interpretability, as confirmed by Grad-CAM++ visualizations highlighting attention to semantically meaningful fire features. These results demonstrate that effective early fire detection is inseparable from high-quality, context-specific datasets. Our work introduces a scalable, safety-driven approach that advances the development of reliable, interpretable, and deployment-ready fire detection systems for residential environments.
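For readers unfamiliar with the attention block named in this abstract, the following is a minimal sketch of a CBAM-style module adapted to 3D (video) feature maps such as those produced by the MC3_18 backbone. It illustrates the standard CBAM design, not the authors' implementation; the reduction ratio and spatial kernel size are assumed defaults.

```python
# Minimal CBAM sketch for 3D video features: a channel gate from pooled
# descriptors followed by a spatial gate over channel-wise statistics.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CBAM3D(nn.Module):
    def __init__(self, channels: int, reduction: int = 16, spatial_kernel: int = 7):
        super().__init__()
        # Channel attention: shared MLP over global avg- and max-pooled descriptors.
        self.mlp = nn.Sequential(
            nn.Conv3d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv3d(channels // reduction, channels, 1, bias=False),
        )
        # Spatial attention: conv over concatenated channel-wise avg/max maps.
        self.spatial = nn.Conv3d(2, 1, spatial_kernel,
                                 padding=spatial_kernel // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, C, T, H, W)
        avg = F.adaptive_avg_pool3d(x, 1)
        mx = F.adaptive_max_pool3d(x, 1)
        x = x * torch.sigmoid(self.mlp(avg) + self.mlp(mx))      # channel gate
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))                # spatial gate

x = torch.randn(2, 64, 8, 56, 56)    # batch of 8-frame clips
print(CBAM3D(64)(x).shape)           # torch.Size([2, 64, 8, 56, 56])
```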

18 pages, 3132 KiB  
Article
ICAFormer: An Image Dehazing Transformer Based on Interactive Channel Attention
by Yanfei Chen, Tong Yue, Pei An, Hanyu Hong, Tao Liu, Yangkai Liu and Yihui Zhou
Sensors 2025, 25(12), 3750; https://doi.org/10.3390/s25123750 - 15 Jun 2025
Cited by 1 | Viewed by 587
Abstract
Single image dehazing is a fundamental task in computer vision, aiming to recover a clear scene from a hazy input image. To address the limitations of traditional dehazing algorithms—particularly in global feature association and local detail preservation—this study proposes a novel Transformer-based dehazing model enhanced by an interactive channel attention mechanism. The proposed architecture adopts a U-shaped encoder–decoder framework, incorporating key components such as a feature extraction module and a feature fusion module based on interactive attention. Specifically, the interactive channel attention mechanism facilitates cross-layer feature interaction, enabling the dynamic fusion of global contextual information and local texture details. The network architecture leverages a multi-scale feature pyramid to extract image information across different dimensions, while an improved cross-channel attention weighting mechanism enhances feature representation in regions with varying haze densities. Extensive experiments conducted on both synthetic and real-world datasets—including the RESIDE benchmark—demonstrate the superior performance of the proposed method. Quantitatively, it achieves PSNR gains of 0.53 dB for indoor scenes and 1.64 dB for outdoor scenes, alongside SSIM improvements of 1.4% and 1.7%, respectively, compared with the second-best performing method. Qualitative assessments further confirm that the proposed model excels in restoring fine structural details in dense haze regions while maintaining high color fidelity. These results validate the effectiveness of the proposed approach in enhancing both perceptual quality and quantitative accuracy in image dehazing tasks.
(This article belongs to the Section Sensing and Imaging)
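The paper's exact interactive channel attention is not reproduced here, but the idea of cross-layer channel interaction can be sketched as below: each branch is re-weighted by channel attention computed from the other branch before fusion. Module names, the gating form, and the fusion step are assumptions for illustration only.

```python
# Hedged sketch of cross-layer "interactive" channel-attention fusion.
import torch
import torch.nn as nn

class InteractiveChannelFusion(nn.Module):
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        def gate():  # squeeze-and-excite style channel gate
            return nn.Sequential(
                nn.AdaptiveAvgPool2d(1),
                nn.Conv2d(channels, channels // reduction, 1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels // reduction, channels, 1),
                nn.Sigmoid(),
            )
        self.gate_enc, self.gate_dec = gate(), gate()
        self.fuse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, enc: torch.Tensor, dec: torch.Tensor) -> torch.Tensor:
        # Each branch is modulated by attention derived from the *other* branch,
        # letting global context (decoder) and local texture (encoder) interact.
        enc = enc * self.gate_dec(dec)
        dec = dec * self.gate_enc(enc)
        return self.fuse(torch.cat([enc, dec], dim=1))

e = d = torch.randn(1, 64, 32, 32)
print(InteractiveChannelFusion(64)(e, d).shape)  # torch.Size([1, 64, 32, 32])
```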

26 pages, 5598 KiB  
Article
DeepLabV3+-Based Semantic Annotation Refinement for SLAM in Indoor Environments
by Shuangfeng Wei, Hongrui Tang, Changchang Liu, Tong Yang, Xiaohang Zhou, Sisi Zlatanova, Junlin Fan, Liping Tu and Yaqin Mao
Sensors 2025, 25(11), 3344; https://doi.org/10.3390/s25113344 - 26 May 2025
Cited by 1 | Viewed by 422
Abstract
Visual SLAM systems frequently encounter challenges in accurately reconstructing three-dimensional scenes from monocular imagery in semantically deficient environments, which significantly compromises robotic operational efficiency. While conventional manual annotation approaches can provide supplemental semantic information, they are inherently inefficient, procedurally complex, and labor-intensive. This paper presents an optimized DeepLabV3+-based framework for visual SLAM that integrates image semantic segmentation with automated point cloud semantic annotation. The proposed method utilizes MobileNetV3 as the backbone network for DeepLabV3+ to maintain segmentation accuracy while reducing computational demands. In this paper, we introduce a parameter-adaptive Density-Based Spatial Clustering of Applications with Noise (DBSCAN) clustering algorithm incorporating K-nearest neighbors and accelerated by KD-tree structures, effectively addressing the limitations of manual parameter tuning and erroneous annotations in conventional methods. Furthermore, a novel point cloud processing strategy featuring dynamic radius thresholding is developed to enhance annotation completeness and boundary precision. Experimental results demonstrate that our approach achieves significant improvements in annotation efficiency while preserving high accuracy, thereby providing reliable technical support for enhanced environmental understanding and navigation capabilities in indoor robotic applications.
(This article belongs to the Special Issue Indoor Localization Technologies and Applications)
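A parameter-adaptive DBSCAN of the kind described can be sketched with scikit-learn: a KD-tree accelerated k-nearest-neighbour query yields the k-distance curve, from which eps is picked automatically instead of being hand-tuned. The knee heuristic below is an assumption, not the paper's adaptation rule.

```python
# Sketch: eps chosen from the sorted k-distance curve, then DBSCAN clusters.
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.neighbors import NearestNeighbors

def adaptive_dbscan(points: np.ndarray, k: int = 8):
    # k-distance curve: distance of each point to its k-th nearest neighbour.
    nn = NearestNeighbors(n_neighbors=k, algorithm="kd_tree").fit(points)
    dists, _ = nn.kneighbors(points)
    kdist = np.sort(dists[:, -1])
    # Crude knee heuristic: point of maximum curvature along the sorted curve.
    eps = kdist[np.argmax(np.gradient(np.gradient(kdist)))]
    return DBSCAN(eps=float(eps), min_samples=k).fit_predict(points)

pts = np.random.rand(500, 3)  # stand-in for a point-cloud segment
labels = adaptive_dbscan(pts)
print(len(set(labels)) - (1 if -1 in labels else 0), "clusters")
```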

17 pages, 13434 KiB  
Article
Utilization of Calorimetric Analysis and Fire Dynamics Simulator (FDS) to Determine the Cause of Plant Fire in Taiwan: Thermogravimetric Analyzer (TGA), Differential Scanning Calorimetry (DSC), and FDS Reconstruction
by Yi-Hao Huang, Jen-Hao Chi and Chi-Min Shu
Processes 2025, 13(5), 1450; https://doi.org/10.3390/pr13051450 - 9 May 2025
Viewed by 531
Abstract
This study investigated a factory fire in which an unusual situation caused the deaths of two firefighters. The official fire investigation report was analyzed, records were obtained, and on-site investigations and interviews were conducted. Using these additional data and a calorimetric analysis to determine the combustibility of goods stored in the building at the time, a functional 3D model was produced, and a fire dynamics simulator (FDS) was run. The model was augmented using the results of calorimetric experiments for three types of primary goods stored in the warehouse area: paper lunch boxes, tissue paper, and corrugated boxes. The reaction heats obtained for the three sample types were 848.24, 468.29, and 301.21 J·g⁻¹, respectively, and the maximum mass losses were 98.522, 84.439, and 90.811 mass%, respectively. A full-scale fire scene reconstruction confirmed the fire propagation routes and changes in fire hazard factors, such as indoor temperature, visibility, and carbon monoxide concentration. The FDS results were compared to the NIST recommended values for firefighter heat exposure time. The cause of death of both firefighters was also investigated in terms of the heat resistance of the facepiece lenses of their self-contained breathing apparatus. Based on the findings of this study, recommendations can be made to forestall the recurrence of similar events.

19 pages, 24555 KiB  
Article
A Multi-Strategy Visual SLAM System for Motion Blur Handling in Indoor Dynamic Environments
by Shuo Huai, Long Cao, Yang Zhou, Zhiyang Guo and Jingyao Gai
Sensors 2025, 25(6), 1696; https://doi.org/10.3390/s25061696 - 9 Mar 2025
Cited by 2 | Viewed by 979
Abstract
Typical SLAM systems adhere to the assumption of environment rigidity, which limits their functionality when deployed in the dynamic indoor environments commonly encountered by household robots. Prevailing methods address this issue by employing semantic information for the identification and processing of dynamic objects in scenes. However, extracting reliable semantic information remains challenging due to the presence of motion blur. In this paper, a novel visual SLAM algorithm is proposed in which various approaches are integrated to obtain more reliable semantic information, consequently reducing the impact of motion blur on visual SLAM systems. Specifically, to accurately distinguish moving objects and static objects, we introduce a missed segmentation compensation mechanism into our SLAM system for predicting and restoring semantic information, and depth and semantic information are then leveraged to generate masks of dynamic objects. Additionally, to refine keypoint filtering, a probability-based algorithm for dynamic feature detection and elimination is incorporated into our SLAM system. Evaluation experiments using the TUM and Bonn RGB-D datasets demonstrated that our SLAM system achieves lower absolute trajectory error (ATE) than existing systems in different dynamic indoor environments, particularly those with large view angle variations. Our system can be applied to enhance the autonomous navigation and scene understanding capabilities of domestic robots.
(This article belongs to the Section Sensors and Robotics)
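A probability-based dynamic-feature filter of the kind described can be sketched as a per-point recursive update: each tracked keypoint carries a "dynamic" probability blended from per-frame evidence (segmentation-mask hit, reprojection error), and points are culled once the probability crosses a threshold. The evidence weights and thresholds below are illustrative assumptions.

```python
# Sketch: exponential blending of prior probability with current-frame evidence.
import numpy as np

def update_dynamic_prob(p: np.ndarray, in_dyn_mask: np.ndarray,
                        reproj_err: np.ndarray, err_thresh: float = 2.0,
                        alpha: float = 0.6) -> np.ndarray:
    evidence = 0.7 * in_dyn_mask.astype(float) + 0.3 * (reproj_err > err_thresh)
    return alpha * p + (1.0 - alpha) * evidence

p = np.zeros(4)                            # all points start as static
mask_hits = np.array([1, 1, 0, 0])         # inside a dynamic-object mask?
errors = np.array([3.5, 0.4, 0.3, 1.0])    # reprojection error (px)
for _ in range(5):                         # five consecutive frames of evidence
    p = update_dynamic_prob(p, mask_hits, errors)
print(p.round(2), "-> keep:", p < 0.5)     # masked points drift toward dynamic
```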

21 pages, 9794 KiB  
Article
Research on a Density-Based Clustering Method for Eliminating Inter-Frame Feature Mismatches in Visual SLAM Under Dynamic Scenes
by Zhiyong Yang, Kun Zhao, Shengze Yang, Yuhong Xiong, Changjin Zhang, Lielei Deng and Daode Zhang
Sensors 2025, 25(3), 622; https://doi.org/10.3390/s25030622 - 22 Jan 2025
Viewed by 995
Abstract
Visual SLAM relies on the motion information of static feature points in keyframes for both localization and map construction. Dynamic feature points interfere with inter-frame motion pose estimation, thereby affecting the accuracy of map construction and the overall robustness of the visual SLAM system. To address this issue, this paper proposes a method for eliminating feature mismatches between frames in visual SLAM under dynamic scenes. First, a spatial clustering-based RANSAC method is introduced. This method eliminates mismatches by leveraging the distribution of dynamic and static feature points, clustering the points, and separating dynamic from static clusters, retaining only the static clusters to generate a high-quality dataset. Next, the RANSAC method is applied to fit the geometric model of feature matches, eliminating local mismatches in the high-quality dataset with fewer iterations. The accuracy of the DSSAC-RANSAC method in eliminating feature mismatches between frames is then tested on both indoor and outdoor dynamic datasets, and the robustness of the proposed algorithm is further verified on self-collected outdoor datasets. Experimental results demonstrate that the proposed algorithm reduces the average reprojection error by 58.5% and 49.2%, respectively, when compared to traditional RANSAC and GMS-RANSAC methods. The reprojection error variance is reduced by 65.2% and 63.0%, while the processing time is reduced by 69.4% and 31.5%, respectively. Finally, the proposed algorithm is integrated into the initialization thread of ORB-SLAM2 and the tracking thread of ORB-SLAM3 to validate its effectiveness in eliminating feature mismatches between frames in visual SLAM.
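The cluster-then-RANSAC pipeline can be sketched as follows: matches are clustered by their displacement vectors, the dominant cluster is assumed to be the static background, and RANSAC then fits the epipolar geometry on that cleaner set. Thresholds are illustrative, not the tuned DSSAC-RANSAC parameters.

```python
# Sketch: DBSCAN over match displacements, then RANSAC on the static cluster.
import numpy as np
import cv2
from sklearn.cluster import DBSCAN

def filter_matches(pts1: np.ndarray, pts2: np.ndarray):
    flow = pts2 - pts1                            # per-match displacement
    labels = DBSCAN(eps=3.0, min_samples=10).fit_predict(flow)
    valid = labels >= 0
    if not valid.any():
        return np.zeros(len(pts1), bool)
    static = labels == np.bincount(labels[valid]).argmax()  # largest cluster
    F, inliers = cv2.findFundamentalMat(pts1[static], pts2[static],
                                        cv2.FM_RANSAC, 1.0, 0.999)
    if F is None or inliers is None:
        return static
    out = np.zeros(len(pts1), bool)
    out[np.flatnonzero(static)] = inliers.ravel().astype(bool)
    return out

pts1 = np.random.rand(200, 2).astype(np.float32) * 640
pts2 = pts1 + np.float32([1.5, 0.5])              # synthetic rigid shift
print(filter_matches(pts1, pts2).sum(), "static inlier matches")
```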

17 pages, 4607 KiB  
Article
Event-Based Visual/Inertial Odometry for UAV Indoor Navigation
by Ahmed Elamin, Ahmed El-Rabbany and Sunil Jacob
Sensors 2025, 25(1), 61; https://doi.org/10.3390/s25010061 - 25 Dec 2024
Cited by 6 | Viewed by 3231
Abstract
Indoor navigation is becoming increasingly essential for multiple applications. It is complex and challenging due to dynamic scenes, limited space, and, more importantly, the unavailability of global navigation satellite system (GNSS) signals. Recently, new sensors have emerged, namely event cameras, which show great potential for indoor navigation due to their high dynamic range and low latency. In this study, an event-based visual–inertial odometry approach is proposed, emphasizing adaptive event accumulation and selective keyframe updates to reduce computational overhead. The proposed approach fuses events, standard frames, and inertial measurements for precise indoor navigation. Features are detected and tracked on the standard images. The events are accumulated into frames and used to track the features between the standard frames. Subsequently, the IMU measurements and the feature tracks are fused to continuously estimate the sensor states. The proposed approach is evaluated using both simulated and real-world datasets. Compared with the state-of-the-art U-SLAM algorithm, our approach achieves a substantial reduction in the mean positional error and RMSE in simulated environments, showing up to 50% and 47% reductions along the x- and y-axes, respectively. The approach achieves 5–10 ms latency per event batch and 10–20 ms for frame updates, demonstrating real-time performance on resource-constrained platforms. These results underscore the potential of our approach as a robust solution for real-world UAV indoor navigation scenarios.
(This article belongs to the Special Issue Multi-sensor Integration for Navigation and Environmental Sensing)
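Adaptive event accumulation, the core preprocessing step here, can be sketched simply: events are integrated into a signed polarity frame until a fixed event count is reached, so the time window shrinks automatically under fast motion. The count threshold is an assumed value, not the paper's tuned setting.

```python
# Sketch: count-based event accumulation into signed polarity frames.
import numpy as np

def accumulate_events(events: np.ndarray, h: int, w: int, batch: int = 20000):
    """events: (N, 4) array of [x, y, t, polarity(+1/-1)], sorted by time."""
    for start in range(0, len(events), batch):
        chunk = events[start:start + batch]
        frame = np.zeros((h, w), np.float32)
        np.add.at(frame, (chunk[:, 1].astype(int), chunk[:, 0].astype(int)),
                  chunk[:, 3])                    # signed polarity image
        yield chunk[-1, 2], frame                 # (timestamp, event frame)

ev = np.column_stack([np.random.randint(0, 640, 50000),
                      np.random.randint(0, 480, 50000),
                      np.sort(np.random.rand(50000)),
                      np.random.choice([-1, 1], 50000)]).astype(np.float32)
for t, f in accumulate_events(ev, 480, 640):
    print(f"frame at t={t:.3f}, |activity|={np.abs(f).sum():.0f}")
```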

26 pages, 2585 KiB  
Article
Depth Prediction Improvement for Near-Field iToF Lidar in Low-Speed Motion State
by Mena Nagiub, Thorsten Beuth, Ganesh Sistu, Heinrich Gotzig and Ciarán Eising
Sensors 2024, 24(24), 8020; https://doi.org/10.3390/s24248020 - 16 Dec 2024
Cited by 1 | Viewed by 1194
Abstract
Current deep learning-based phase unwrapping techniques for iToF Lidar sensors focus mainly on static indoor scenarios, ignoring motion blur in dynamic outdoor scenarios. Our paper proposes a two-stage semi-supervised method to unwrap ambiguous depth maps affected by motion blur in dynamic outdoor scenes. The method trains on static datasets to learn unwrapped depth map prediction and then adapts to dynamic datasets using continuous learning methods. Additionally, blind deconvolution is introduced to mitigate the blur. The combined use of these methods produces high-quality depth maps with reduced blur noise.
(This article belongs to the Collection Navigation Systems and Sensors)
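The phase-unwrapping problem the method addresses reduces to choosing an integer wrap count: a modulation frequency f gives an unambiguous range c/(2f), and the predictor only needs to select n in depth = wrapped + n · range. A worked numeric sketch follows; the modulation frequency and coarse guess are assumed values.

```python
# Worked sketch of resolving the iToF depth ambiguity.
c = 299_792_458.0          # speed of light, m/s
f = 20e6                   # assumed modulation frequency, Hz
amb = c / (2 * f)          # unambiguous range ≈ 7.495 m

wrapped = 2.3              # measured (wrapped) depth, metres
coarse_guess = 17.0        # coarse depth from a learned predictor (stand-in)
n = round((coarse_guess - wrapped) / amb)      # integer wrap count -> 2
print(f"ambiguity range {amb:.3f} m, n={n}, depth={wrapped + n * amb:.3f} m")
```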

33 pages, 14639 KiB  
Article
Multi-Sensor Fusion for Wheel-Inertial-Visual Systems Using a Fuzzification-Assisted Iterated Error State Kalman Filter
by Guohao Huang, Haibin Huang, Yaning Zhai, Guohao Tang, Ling Zhang, Xingyu Gao, Yang Huang and Guoping Ge
Sensors 2024, 24(23), 7619; https://doi.org/10.3390/s24237619 - 28 Nov 2024
Cited by 2 | Viewed by 2749
Abstract
This paper investigates the odometry drift problem in differential-drive indoor mobile robots and proposes a multi-sensor fusion approach utilizing a Fuzzy Inference System (FIS) within a Wheel-Inertial-Visual Odometry (WIVO) framework to optimize the 6-DoF localization of the robot in unstructured scenes. The structure and principles of the multi-sensor fusion system are developed, incorporating an Iterated Error State Kalman Filter (IESKF) for enhanced accuracy. An FIS is integrated with the IESKF to address the limitations of traditional fixed covariance matrices in process and observation noise, which fail to adapt effectively to complex kinematic characteristics and visual observation challenges such as varying lighting conditions and unstructured scenes in dynamic environments. The fusion filter gains in FIS-IESKF are adaptively adjusted for noise predictions, optimizing the rule parameters of the fuzzy inference process. Experimental results demonstrate that the proposed method effectively enhances the localization accuracy and system robustness of differential-drive indoor mobile robots in dynamically changing movements and environments.
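The FIS-to-filter coupling can be illustrated with a toy fuzzy rule base that scales the measurement-noise covariance R from the innovation magnitude before the Kalman update. Membership breakpoints and rule outputs below are illustrative assumptions, not the paper's tuned rules.

```python
# Sketch: two fuzzy rules over the innovation norm, weighted-average defuzzified
# into a scale factor for the measurement-noise covariance R.
import numpy as np

def fuzzy_R_scale(innov_norm: float, lo: float = 0.5, hi: float = 2.0) -> float:
    # Complementary memberships: "small" vs "large" innovation.
    small = np.clip((hi - innov_norm) / (hi - lo), 0.0, 1.0)
    large = 1.0 - small
    # Rule 1: small innovation -> trust measurement (scale 1).
    # Rule 2: large innovation -> distrust measurement (inflate R by 10).
    return float(small * 1.0 + large * 10.0)

R0 = np.diag([0.01, 0.01, 0.02])          # nominal visual-noise covariance
for nu in (0.3, 1.2, 3.0):                # innovation norms over time
    R = fuzzy_R_scale(nu) * R0
    print(f"|innovation|={nu:.1f} -> R[0,0]={R[0, 0]:.3f}")
```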

28 pages, 20242 KiB  
Article
PLM-SLAM: Enhanced Visual SLAM for Mobile Robots in Indoor Dynamic Scenes Leveraging Point-Line Features and Manhattan World Model
by Jiale Liu and Jingwen Luo
Electronics 2024, 13(23), 4592; https://doi.org/10.3390/electronics13234592 - 21 Nov 2024
Cited by 1 | Viewed by 1383
Abstract
This paper proposes an enhanced visual simultaneous localization and mapping (vSLAM) algorithm tailored for mobile robots operating in indoor dynamic scenes. By incorporating point-line features and leveraging the Manhattan world model, the proposed PLM-SLAM framework significantly improves localization accuracy and map consistency. This algorithm optimizes the line features detected by the Line Segment Detector (LSD) through merging and pruning strategies, ensuring real-time performance. Subsequently, dynamic point-line features are rejected based on Lucas–Kanade (LK) optical flow, geometric constraints, and depth information, minimizing the impact of dynamic objects. The Manhattan world model is then utilized to reduce rotational estimation errors and optimize pose estimation. High-precision line feature matching and loop closure detection mechanisms further enhance the robustness and accuracy of the system. Experimental results demonstrate the superior performance of PLM-SLAM, particularly in high-dynamic indoor environments, outperforming existing state-of-the-art methods.
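The optical-flow-plus-geometry rejection step can be sketched with OpenCV: points are tracked by pyramidal Lucas-Kanade, and matches inconsistent with the epipolar geometry are discarded as dynamic. Thresholds are assumed values, not PLM-SLAM's parameters.

```python
# Sketch: LK tracking followed by RANSAC epipolar filtering of dynamic points.
import numpy as np
import cv2

def reject_dynamic(prev_gray, gray, pts, epi_thresh: float = 1.0):
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)
    ok = status.ravel() == 1
    p1, p2 = pts[ok], nxt[ok]
    F, inl = cv2.findFundamentalMat(p1, p2, cv2.FM_RANSAC, epi_thresh, 0.999)
    if F is None or inl is None:
        return p1, p2
    inl = inl.ravel().astype(bool)        # epipolar-consistent tracks = static
    return p1[inl], p2[inl]

prev = np.random.randint(0, 255, (480, 640), np.uint8)
curr = np.roll(prev, 2, axis=1)           # synthetic 2-px camera shift
pts = cv2.goodFeaturesToTrack(prev, 300, 0.01, 8)
p1, p2 = reject_dynamic(prev, curr, pts)
print(len(p1), "static point tracks kept")
```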

18 pages, 3002 KiB  
Article
Three-Dimensional Instance Segmentation Using the Generalized Hough Transform and the Adaptive n-Shifted Shuffle Attention
by Desire Burume Mulindwa, Shengzhi Du and Qingxue Liu
Sensors 2024, 24(22), 7215; https://doi.org/10.3390/s24227215 - 12 Nov 2024
Cited by 1 | Viewed by 1325
Abstract
The progress of 3D instance segmentation techniques has made it essential for several applications, such as augmented reality, autonomous driving, and robotics. Traditional methods usually have challenges with complex indoor scenes made of multiple objects with different occlusions and orientations. In this work, the authors present an innovative model that integrates a new adaptive n-shifted shuffle (ANSS) attention mechanism with the Generalized Hough Transform (GHT) for robust 3D instance segmentation of indoor scenes. The proposed technique leverages the n-shifted sigmoid activation function, which improves the adaptive shuffle attention mechanism, permitting the network to dynamically focus on relevant features across various regions. A learnable shuffling pattern is produced through the proposed ANSS attention mechanism to spatially rearrange the relevant features, thus augmenting the model’s ability to capture the object boundaries and their fine-grained details. The integration of GHT furnishes a vigorous framework to localize and detect objects in the 3D space, even when heavy noise and partial occlusions are present. The authors evaluate the proposed method on the challenging Stanford 3D Indoor Spaces Dataset (S3DIS), where it establishes its superiority over existing methods. The proposed approach achieves state-of-the-art performance in both mean Intersection over Union (IoU) and overall accuracy, showcasing its potential for practical deployment in real-world scenarios. These results illustrate that the integration of the ANSS and the GHT yields a robust solution for 3D instance segmentation tasks.
(This article belongs to the Section Intelligent Sensors)
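For orientation on the GHT component, here is a minimal classical Generalized Hough Transform voting loop (2D, translation-only): each boundary point casts votes for the object centre through a precomputed offset table, and accumulator peaks mark detected instances. The paper's learned 3D variant replaces the hand-built table with network features; this sketch only illustrates the voting principle.

```python
# Sketch: GHT centre voting with a toy offset table (R-table without angles).
import numpy as np

def ght_votes(points: np.ndarray, offsets: np.ndarray, shape) -> np.ndarray:
    acc = np.zeros(shape, np.int32)
    for p in points:                      # every detected boundary point
        for off in offsets:               # every model offset to the centre
            cy, cx = (p + off).astype(int)
            if 0 <= cy < shape[0] and 0 <= cx < shape[1]:
                acc[cy, cx] += 1          # vote
    return acc

offsets = np.array([[-5, 0], [5, 0], [0, -5], [0, 5]])    # toy offset table
true_centre = np.array([40, 60])
points = true_centre - offsets                             # ideal observations
acc = ght_votes(points, offsets, (80, 120))
print("peak vote at", np.unravel_index(acc.argmax(), acc.shape))  # (40, 60)
```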

22 pages, 20719 KiB  
Article
A Computationally Efficient Neuronal Model for Collision Detection with Contrast Polarity-Specific Feed-Forward Inhibition
by Guangxuan Gao, Renyuan Liu, Mengying Wang and Qinbing Fu
Biomimetics 2024, 9(11), 650; https://doi.org/10.3390/biomimetics9110650 - 22 Oct 2024
Cited by 1 | Viewed by 1645
Abstract
Animals utilize their well-evolved dynamic vision systems to perceive and evade collision threats. Driven by biological research, bio-inspired models based on lobula giant movement detectors (LGMDs) address certain gaps in constructing artificial collision-detecting vision systems with robust selectivity, offering reliable, low-cost, and miniaturized collision sensors across various scenes. Recent progress in neuroscience has revealed the energetic advantages of dendritic arrangements presynaptic to the LGMDs, which receive contrast polarity-specific signals on separate dendritic fields. Specifically, feed-forward inhibitory inputs arise from parallel ON/OFF pathways interacting with excitation. However, none of the previous research has investigated the evolution of a computational LGMD model with feed-forward inhibition (FFI) separated by opposite polarity. This study fills this gap by presenting an optimized neuronal model where FFI is divided into ON/OFF channels, each with distinct synaptic connections. To align with the energy efficiency of biological systems, we introduce an activation function associated with neural computation of FFI and interactions between local excitation and lateral inhibition within ON/OFF channels, ignoring non-active signal processing. This approach significantly improves the time efficiency of the LGMD model, focusing only on substantial luminance changes in image streams. The proposed neuronal model not only accelerates visual processing in relatively stationary scenes but also maintains robust selectivity to ON/OFF-contrast looming stimuli. Additionally, it can suppress translational motion to a moderate extent. Comparative testing with state-of-the-art models based on ON/OFF channels was conducted systematically using a range of visual stimuli, including indoor structured and complex outdoor scenes. The results demonstrated significant time savings in silico while retaining original collision selectivity. Furthermore, the optimized model was implemented in the embedded vision system of a micro-mobile robot, achieving the highest success ratio of collision avoidance at 97.51% while nearly halving the processing time compared with previous models. This highlights a robust and parsimonious collision-sensing mode that effectively addresses real-world challenges.
(This article belongs to the Special Issue Bio-Inspired and Biomimetic Intelligence in Robotics: 2nd Edition)
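The ON/OFF separation the model builds on can be sketched as half-wave rectification of the temporal luminance change, with sub-threshold cells skipped; that skipping is where the reported time savings come from. The threshold is an assumed value, not the paper's parameter.

```python
# Sketch: split frame-to-frame luminance change into ON/OFF channels and mark
# only supra-threshold cells as active for further processing.
import numpy as np

def on_off_channels(prev: np.ndarray, curr: np.ndarray, thresh: float = 8.0):
    diff = curr.astype(np.float32) - prev.astype(np.float32)
    on = np.where(diff > thresh, diff, 0.0)       # brightening excitation
    off = np.where(diff < -thresh, -diff, 0.0)    # darkening excitation
    active = (on > 0) | (off > 0)                 # cells worth processing
    return on, off, active

prev = np.full((120, 160), 100, np.uint8)
curr = prev.copy(); curr[40:80, 60:100] = 160     # looming bright patch
on, off, active = on_off_channels(prev, curr)
print(f"active cells: {active.mean():.1%}")       # sparse computation
```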

19 pages, 20386 KiB  
Article
YOD-SLAM: An Indoor Dynamic VSLAM Algorithm Based on the YOLOv8 Model and Depth Information
by Yiming Li, Yize Wang, Liuwei Lu and Qi An
Electronics 2024, 13(18), 3633; https://doi.org/10.3390/electronics13183633 - 12 Sep 2024
Cited by 3 | Viewed by 1895
Abstract
Aiming at the problems of low positioning accuracy and poor mapping effect of visual SLAM systems caused by poor-quality dynamic object masks in indoor dynamic environments, an indoor dynamic VSLAM algorithm based on the YOLOv8 model and depth information (YOD-SLAM) is proposed based on the ORB-SLAM3 system. Firstly, the YOLOv8 model obtains the original masks of a priori dynamic objects, and depth information is used to modify the masks. Secondly, the masks’ depth information and center points are used to determine a priori whether a dynamic object has been missed and whether its mask needs to be redrawn. Then, mask edge distance and depth information are used to judge the movement state of non-prior dynamic objects. Finally, all dynamic object information is removed, and the remaining static objects are used for pose estimation and dense point cloud mapping. The accuracy of camera positioning and the construction effect of dense point cloud maps are verified using the TUM RGB-D dataset and real environment data. The results show that YOD-SLAM has higher positioning accuracy and a better dense point cloud mapping effect in dynamic scenes than other advanced SLAM systems such as DS-SLAM and DynaSLAM.
(This article belongs to the Section Computer Science & Engineering)
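A hedged sketch of the mask-plus-depth idea using the public ultralytics API: YOLOv8 instance masks flag a priori dynamic classes, and each mask is tightened by keeping only pixels whose depth lies near the mask's median depth, so background bleeding in a loose mask does not delete static features. The class set and depth tolerance are assumptions, not the paper's values.

```python
# Sketch: YOLOv8-seg masks of dynamic classes refined by depth consistency.
import numpy as np
import cv2
from ultralytics import YOLO

model = YOLO("yolov8n-seg.pt")
DYNAMIC = {0}                                     # COCO class 0 = person

def dynamic_mask(rgb: np.ndarray, depth: np.ndarray, tol: float = 0.4):
    out = np.zeros(depth.shape, bool)
    res = model(rgb, verbose=False)[0]
    if res.masks is None:
        return out
    for m, cls in zip(res.masks.data.cpu().numpy(),
                      res.boxes.cls.cpu().numpy()):
        if int(cls) not in DYNAMIC:
            continue
        # Resize the network-resolution mask to the depth image size.
        m = cv2.resize((m > 0.5).astype(np.uint8),
                       (depth.shape[1], depth.shape[0])).astype(bool)
        if not m.any():
            continue
        med = np.median(depth[m])
        out |= m & (np.abs(depth - med) < tol)    # depth-consistent pixels only
    return out

rgb = np.random.randint(0, 255, (480, 640, 3), np.uint8)   # stand-in frame
depth = np.random.uniform(0.5, 5.0, (480, 640)).astype(np.float32)
print(dynamic_mask(rgb, depth).sum(), "dynamic pixels")
```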

24 pages, 7202 KiB  
Article
A WKNN Indoor Fingerprint Localization Technique Based on Improved Discrimination Capability of RSS Similarity
by Baofeng Wang, Qinghai Li, Jia Liu, Zumin Wang, Qiudong Yu and Rui Liang
Sensors 2024, 24(14), 4586; https://doi.org/10.3390/s24144586 - 15 Jul 2024
Viewed by 1343
Abstract
There are various indoor fingerprint localization techniques utilizing the similarity of received signal strength (RSS) to discriminate the similarity of positions. However, due to the varied states of different wireless access points (APs), each AP’s contribution to RSS similarity varies, which affects the accuracy of localization. In our study, we analyzed several critical causes that affect APs’ contributions, including APs’ health states and APs’ positions. Inspired by these insights, for a large-scale indoor space with ubiquitous APs, a threshold was set for all sample RSS to eliminate abnormal APs dynamically, a correction quantity for each RSS was provided by the distance between the AP and the sample position to emphasize closer APs, and a priority weight was designed from RSS differences (RSSD) to further optimize the capability of fingerprint distances (FDs, the Euclidean distances of RSS) to discriminate physical distances (PDs, the Euclidean distances of positions). Integrating the above policies into the classical WKNN algorithm, a new indoor fingerprint localization technique is defined, referred to as FDs’ discrimination capability improvement WKNN (FDDC-WKNN). Our simulation results showed that the correlation and consistency between FDs and PDs are well improved, with the strong correlation increasing from 0 to 76% and the high consistency increasing from 26% to 99%, which confirms that the proposed policies can greatly enhance the discrimination capability of RSS similarity. We also found that abnormal APs can have a significant impact on FDs’ discrimination capability. Further, by implementing the FDDC-WKNN algorithm in experiments, we obtained the optimal K value in both the simulation scene and a real library scene, under which the mean errors were reduced from 2.2732 m to 1.2290 m and from 4.0489 m to 2.4320 m, respectively. In addition, compared to not using FDDC-WKNN, the cumulative distribution function (CDF) curve of the localization errors converged faster and the error fluctuation was smaller, which demonstrates that FDDC-WKNN has stronger robustness and more stable localization performance.
(This article belongs to the Special Issue Sensors and Techniques for Indoor Positioning and Localization)
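The WKNN baseline the paper improves on is compact enough to show in full: fingerprint distances select the K most similar database samples, and the position estimate is their inverse-distance-weighted mean. The abnormal-AP threshold, RSS correction quantity, and RSSD priority weight of FDDC-WKNN are omitted here for brevity.

```python
# Sketch: classical WKNN fingerprint localization over an RSS database.
import numpy as np

def wknn_locate(rss: np.ndarray, db_rss: np.ndarray,
                db_pos: np.ndarray, k: int = 4):
    fd = np.linalg.norm(db_rss - rss, axis=1)     # fingerprint distances (FDs)
    idx = np.argsort(fd)[:k]                      # K most similar samples
    w = 1.0 / (fd[idx] + 1e-6)                    # inverse-FD weights
    return (db_pos[idx] * w[:, None]).sum(0) / w.sum()

db_rss = np.random.uniform(-90, -30, (100, 6))    # 100 samples x 6 APs (dBm)
db_pos = np.random.uniform(0, 50, (100, 2))       # metre coordinates
online = db_rss[7] + np.random.normal(0, 2, 6)    # noisy reading near sample 7
print("estimate:", wknn_locate(online, db_rss, db_pos).round(2),
      "truth:", db_pos[7].round(2))
```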

34 pages, 30845 KiB  
Article
Semantic Visual SLAM Algorithm Based on Improved DeepLabV3+ Model and LK Optical Flow
by Yiming Li, Yize Wang, Liuwei Lu, Yiran Guo and Qi An
Appl. Sci. 2024, 14(13), 5792; https://doi.org/10.3390/app14135792 - 2 Jul 2024
Cited by 2 | Viewed by 1686
Abstract
Dynamic targets in indoor environments lead to low accuracy and large errors in the localization and pose estimation of visual SLAM systems and prevent the construction of maps containing semantic information. To address these problems, a semantic visual SLAM algorithm based on the semantic segmentation network DeepLabV3+ and LK optical flow is proposed on top of the ORB-SLAM2 system. First, dynamic target feature points are detected and rejected based on the lightweight semantic segmentation network DeepLabV3+ and the LK optical flow method. Second, the static environment occluded by dynamic targets is repaired using a time-weighted multi-frame fusion background repair technique. Lastly, the filtered static feature points are used for feature matching and position calculation. Meanwhile, the semantic labeling information of static objects obtained from the lightweight DeepLabV3+ network is fused with the static environment information after background repair to generate dense point cloud maps containing semantic information, and these semantic dense point cloud maps are transformed into semantic octree maps using the octree spatial segmentation data structure. The localization accuracy of the visual SLAM system and the construction of the semantic maps are verified using the widely used TUM RGB-D dataset and real scene data, respectively. The experimental results show that the proposed semantic visual SLAM algorithm can effectively reduce the influence of dynamic targets on the system and, compared with other advanced algorithms such as DynaSLAM, achieves the best localization accuracy and runtime in indoor dynamic environments. In addition, semantic maps can be constructed so that the robot can better understand and adapt to the indoor dynamic environment.
(This article belongs to the Section Robotics and Automation)
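The segmentation front end can be sketched with the stock torchvision DeepLabV3 + MobileNetV3-Large model (not the paper's re-trained lightweight variant): pixels of a priori dynamic VOC classes, e.g. class 15 (person), form the mask used to reject feature points.

```python
# Sketch: per-pixel semantic labels from torchvision's DeepLabV3-MobileNetV3,
# reduced to a binary mask of the dynamic "person" class.
import torch
from torchvision.models.segmentation import (
    deeplabv3_mobilenet_v3_large, DeepLabV3_MobileNet_V3_Large_Weights)

weights = DeepLabV3_MobileNet_V3_Large_Weights.DEFAULT
model = deeplabv3_mobilenet_v3_large(weights=weights).eval()
preprocess = weights.transforms()

@torch.no_grad()
def person_mask(img: torch.Tensor) -> torch.Tensor:   # img: (3, H, W) in [0, 1]
    logits = model(preprocess(img).unsqueeze(0))["out"]
    labels = logits.argmax(1)[0]                      # (h, w) class map
    return labels == 15                               # VOC class 15 = person

mask = person_mask(torch.rand(3, 480, 640))
print(mask.shape, mask.sum().item(), "dynamic pixels")
```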
