Robust Autonomous Perception for Indoor Service Machines via Geometry-Aware RGB-D SLAM and Probabilistic Dynamic Modeling
Abstract
1. Introduction
- A geometry-aware perception backbone for indoor service machines. A unified RGB-D perception framework is developed that jointly exploits point features, parallel-line structures, and planar regions. By integrating these multi-granularity geometric representations, the backbone improves geometric observability and pose stability in weak-texture and structurally repetitive indoor environments (a minimal feature-extraction sketch is given after this list).
- A feature-level probabilistic dynamic modeling mechanism for reliability-aware perception. A probabilistic dynamic model is introduced to explicitly characterize the reliability of point and line features under environmental motion. Dynamic probabilities are initialized from object detection and continuously refined through temporal consistency analysis, spatial neighborhood propagation, and multi-view geometric verification, while large-scale planar structures serve as stable anchors in dynamic scenes (an illustrative probability-update sketch also follows this list).
- A semantic–geometric dynamic observation handling strategy for back-end optimization. A structure-aware observation handling mechanism is designed by jointly considering near-frame motion residuals, multi-keyframe projection consistency, and epipolar-geometry-based constraints. This strategy enables adaptive weighting and suppression of unreliable observations caused by motion and occlusion during pose optimization.
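As a concrete illustration of the first contribution, the following sketch shows how point, line, and plane primitives could be extracted from a single RGB-D frame. It is a minimal sketch assuming OpenCV and NumPy; the function name, parameter values, and the simple RANSAC plane fit are illustrative choices and do not reproduce the paper's implementation.

```python
# Minimal illustration of multi-granularity feature extraction from one RGB-D
# frame. Function name, parameter values, and the simple RANSAC plane fit are
# illustrative assumptions, not the paper's implementation.
import cv2
import numpy as np

def extract_multigranularity_features(rgb, depth, K,
                                      n_plane_iters=200, plane_tol=0.02):
    """rgb: HxWx3 BGR image; depth: HxW depth in meters; K: 3x3 intrinsics."""
    gray = cv2.cvtColor(rgb, cv2.COLOR_BGR2GRAY)

    # 1) Point features: ORB keypoints and descriptors.
    orb = cv2.ORB_create(nfeatures=1000)
    keypoints, descriptors = orb.detectAndCompute(gray, None)

    # 2) Line features: LSD segments (requires an OpenCV build with LSD enabled).
    lsd = cv2.createLineSegmentDetector()
    lines = lsd.detect(gray)[0]  # (N, 1, 4) array of x1, y1, x2, y2 endpoints

    # 3) Planar region: RANSAC fit of a dominant plane on back-projected depth.
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    v, u = np.nonzero(depth > 0)
    z = depth[v, u]
    pts = np.stack([(u - cx) * z / fx, (v - cy) * z / fy, z], axis=1)

    best_plane, best_inliers = None, 0
    rng = np.random.default_rng(0)
    for _ in range(n_plane_iters if len(pts) >= 3 else 0):
        p0, p1, p2 = pts[rng.choice(len(pts), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        if np.linalg.norm(n) < 1e-8:
            continue
        n /= np.linalg.norm(n)
        d = -n.dot(p0)
        inliers = int(np.sum(np.abs(pts @ n + d) < plane_tol))
        if inliers > best_inliers:
            best_plane, best_inliers = np.append(n, d), inliers

    return keypoints, descriptors, lines, best_plane
```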
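The second contribution can likewise be pictured as a per-feature dynamic probability that is seeded by object detection, smoothed over time, and diffused to nearby features. The sketch below is an assumed formulation of that idea only: the exponential smoothing rule, the 0.5 propagation factor, and the function signature are not taken from the paper, although the default smoothing factor (0.7) and diffusion radius (0.15 m) match the parameter table reported later. The reprojection and epipolar cues are illustrated separately in the sketch that follows that table.

```python
# Hedged sketch of a per-feature dynamic-probability update: detection-based
# seeding, temporal smoothing, and neighborhood diffusion. Not the paper's
# exact formulation.
import numpy as np

def update_dynamic_prob(p_prev, in_dynamic_box, pts_3d, alpha=0.7, radius=0.15):
    """p_prev: (N,) prior dynamic probabilities of the tracked features.
    in_dynamic_box: (N,) bool, True if a feature lies in a detected dynamic object.
    pts_3d: (N, 3) feature positions in the current camera frame, in meters."""
    # 1) Detection cue: 1 inside a dynamic bounding box, 0 otherwise.
    obs = in_dynamic_box.astype(float)

    # 2) Temporal smoothing with factor alpha (0.7 in the parameter table).
    p = alpha * p_prev + (1.0 - alpha) * obs

    # 3) Spatial propagation: a feature within the diffusion radius of a
    #    high-probability neighbour inherits part of that probability
    #    (the 0.5 factor is an illustrative choice).
    dists = np.linalg.norm(pts_3d[:, None, :] - pts_3d[None, :, :], axis=2)
    near = dists < radius
    np.fill_diagonal(near, False)
    neighbour_max = np.where(near, p[None, :], 0.0).max(axis=1)
    p = np.maximum(p, 0.5 * neighbour_max)

    return np.clip(p, 0.0, 1.0)


# Example: one feature inside a person detection, one 5 cm away, one far away.
# The result is roughly [0.37, 0.19, 0.07]: the detected feature rises, its
# neighbour inherits part of that probability, the distant feature is unaffected.
print(update_dynamic_prob(np.array([0.1, 0.1, 0.1]),
                          np.array([True, False, False]),
                          np.array([[0.0, 0.0, 1.0],
                                    [0.05, 0.0, 1.0],
                                    [1.0, 0.0, 2.0]])))
```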
2. Related Work
2.1. Geometry-Aware SLAM for Robust Indoor Perception
2.2. Dynamic Scene Handling in Visual SLAM
3. Algorithm
3.1. System Overview
3.2. Multi-Granularity Feature Extraction
3.3. Dynamic Object Detection
3.4. Probabilistic Dynamic Modeling
3.5. Semantic–Geometric Observation Likelihood Construction
3.6. Tightly Coupled Pose Optimization with Multi-Granularity Geometric Fusion
4. Results
4.1. Experimental Setup
4.2. Quantitative Evaluation of Pose Accuracy on Dynamic RGB-D Sequences
4.3. Effectiveness of the Dynamic Feature Removal Strategy
4.4. Evaluation in Complex Large-Scale Dynamic Indoor Environments
4.5. Algorithm Runtime Analysis
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Li, L.; Li, L.; Li, M.; Liang, K. AI-Driven Robotics: Innovations in Design, Perception, and Decision-Making. Machines 2025, 13, 615.
- Noomwongs, N.; Kiataramgul, K.; Chantranuwathana, S.; Phanomchoeng, G. GNSS-RTK-Based Navigation with Real-Time Obstacle Avoidance for Low-Speed Micro Electric Vehicles. Machines 2025, 13, 471.
- Li, X.; Li, T.; Zhang, Y.; Li, Z.; Ban, L.; Ning, Y. GL-VSLAM: A General Lightweight Visual SLAM Approach for RGB-D and Stereo Cameras. Sensors 2025, 25, 7467.
- Zhou, L.; Huang, G.; Mao, Y.; Wang, S.; Kaess, M. EDPLVO: Efficient Direct Point-Line Visual Odometry. In Proceedings of the 2022 IEEE International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA, 23–27 May 2022; pp. 7559–7565.
- Mur-Artal, R.; Montiel, J.M.M.; Tardós, J.D. ORB-SLAM: A Versatile and Accurate Monocular SLAM System. IEEE Trans. Robot. 2015, 31, 1147–1163.
- Mur-Artal, R.; Tardós, J.D. ORB-SLAM2: An Open-Source SLAM System for Monocular, Stereo, and RGB-D Cameras. IEEE Trans. Robot. 2017, 33, 1255–1262.
- Campos, C.; Elvira, R.; Rodríguez, J.J.G.; Montiel, J.M.M.; Tardós, J.D. ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual-Inertial, and Multimap SLAM. IEEE Trans. Robot. 2021, 37, 1874–1890.
- Qin, T.; Li, P.; Shen, S. VINS-Mono: A Robust and Versatile Monocular Visual-Inertial State Estimator. IEEE Trans. Robot. 2018, 34, 1004–1020.
- Pumarola, A.; Vakhitov, A.; Agudo, A.; Sanfeliu, A.; Moreno-Noguer, F. PL-SLAM: Real-Time Monocular Visual SLAM with Points and Lines. In Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore, 29 May–3 June 2017; pp. 4503–4508.
- Gomez-Ojeda, R.; Moreno, F.-A.; Zuñiga-Noël, D.; Scaramuzza, D.; Gonzalez-Jimenez, J. PL-SLAM: A Stereo SLAM System through the Combination of Points and Line Segments. IEEE Trans. Robot. 2019, 35, 734–746.
- Shu, F.; Wang, J.; Pagani, A.; Stricker, D. Structure PLP-SLAM: Efficient Sparse Mapping and Localization Using Point, Line and Plane for Monocular, RGB-D and Stereo Cameras. In Proceedings of the 2023 IEEE International Conference on Robotics and Automation (ICRA), London, UK, 29 May–2 June 2023; pp. 2105–2112.
- Xu, L.; Yin, H.; Shi, T.; Jiang, D.; Huang, B. EPLF-VINS: Real-Time Monocular Visual-Inertial SLAM with Efficient Point-Line Flow Features. IEEE Robot. Autom. Lett. 2023, 8, 752–759.
- Gao, R.; Wu, S.; Zhang, L.; Pan, L.; Zhang, G.; Wang, H.; Wang, X. PLFG-SLAM: A Visual SLAM for Indoor Weak-Texture Environments with Point-Line Feature Glue Matching. Measurement 2025, 256, 118435.
- Zhou, L.; Wang, S.; Kaess, M. DPLVO: Direct Point-Line Monocular Visual Odometry. IEEE Robot. Autom. Lett. 2021, 6, 7113–7120.
- Lim, H.; Jeon, J.; Myung, H. UV-SLAM: Unconstrained Line-Based SLAM Using Vanishing Points for Structural Mapping. IEEE Robot. Autom. Lett. 2022, 7, 1518–1525.
- Jiang, H.; Qian, R.; Du, L.; Pu, J.; Feng, J. UL-SLAM: A Universal Monocular Line-Based SLAM via Unifying Structural and Non-Structural Constraints. IEEE Trans. Autom. Sci. Eng. 2025, 22, 2682–2699.
- Wang, Z.; Ding, W.; Zhang, Y.; Hua, C. PPL-SLAM: RGB-D-Based Point and Parallel-Line Structural Constraints for Enhanced Pose Estimation. Measurement 2026, 260, 119853.
- Zhang, X.; Wang, W.; Qi, X.; Liao, Z.; Wei, R. Point–Plane SLAM Using Supposed Planes for Indoor Environments. Sensors 2019, 19, 3795.
- Sun, Q.; Yuan, J.; Zhang, X.; Duan, F. Plane-Edge-SLAM: Seamless Fusion of Planes and Edges for SLAM in Indoor Environments. IEEE Trans. Autom. Sci. Eng. 2021, 18, 2061–2075.
- Yang, H.; Yuan, J.; Gao, Y.; Sun, X.; Zhang, X. UPLP-SLAM: Unified Point–Line–Plane Feature Fusion for RGB-D Visual SLAM. Inf. Fusion 2023, 96, 51–65.
- Kim, P.; Coltin, B.; Kim, H.J. Linear RGB-D SLAM for Planar Environments. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 350–366.
- Li, Y.; Yunus, R.; Brasch, N.; Navab, N.; Tombari, F. RGB-D SLAM with Structural Regularities. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May–5 June 2021; pp. 11581–11587.
- Yunus, R.; Li, Y.; Tombari, F. ManhattanSLAM: Robust Planar Tracking and Mapping Leveraging Mixture of Manhattan Frames. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May–5 June 2021; pp. 6687–6693.
- Wang, Z.; Ding, W.; Zhang, Y.; Hua, C. OTPS-VO: Enhanced RGB-D Odometry for Indoor Service Robots Leveraging Structural Features. Expert Syst. Appl. 2026, 298, 129704.
- Li, S.; Lee, D. RGB-D SLAM in Dynamic Environments Using Static Point Weighting. IEEE Robot. Autom. Lett. 2017, 2, 2263–2270.
- Song, S.; Lim, H.; Lee, A.J.; Myung, H. DynaVINS++: Robust Visual-Inertial State Estimator in Dynamic Environments by Adaptive Truncated Least Squares and Stable State Recovery. IEEE Robot. Autom. Lett. 2024, 9, 9127–9134.
- Zhang, C.; Zhang, R.; Jin, S.; Yi, X. PFD-SLAM: A New RGB-D SLAM for Dynamic Indoor Environments Based on Non-Prior Semantic Segmentation. Remote Sens. 2022, 14, 2445.
- Xue, C.; Huang, Y.; Zhao, C.; Li, X.; Mihaylova, L.; Li, Y.; Chambers, J.A. A Gaussian–Generalized-Inverse-Gaussian Joint-Distribution-Based Adaptive MSCKF for Visual-Inertial Odometry Navigation. IEEE Trans. Aerosp. Electron. Syst. 2023, 59, 2307–2328.
- Yu, C.; Liu, Z.; Liu, X.-J.; Xie, F.; Yang, Y.; Wei, Q.; Fei, Q. DS-SLAM: A Semantic Visual SLAM toward Dynamic Environments. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 1–5 October 2018; pp. 1168–1174.
- Scona, R.; Jaimez, M.; Petillot, Y.R.; Fallon, M.; Cremers, D. StaticFusion: Background Reconstruction for Dense RGB-D SLAM in Dynamic Environments. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, QLD, Australia, 21–25 May 2018; pp. 3849–3856.
- Cheng, S.; Sun, C.; Zhang, S.; Zhang, D. SG-SLAM: A Real-Time RGB-D Visual SLAM toward Dynamic Scenes with Semantic and Geometric Information. IEEE Trans. Instrum. Meas. 2023, 72, 7501012.
- Liu, J.; Li, X.; Liu, Y.; Chen, H. RGB-D Inertial Odometry for a Resource-Restricted Robot in Dynamic Environments. IEEE Robot. Autom. Lett. 2022, 7, 9573–9580.
- Wang, Y.; Xu, K.; Tian, Y.; Ding, X. DRG-SLAM: A Semantic RGB-D SLAM Using Geometric Features for Indoor Dynamic Scene. In Proceedings of the 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Kyoto, Japan, 23–27 October 2022; pp. 1352–1359.
- Zhu, Z.; Peng, S.; Larsson, V.; Xu, W.; Bao, H.; Cui, Z.; Oswald, M.R.; Pollefeys, M. NICE-SLAM: Neural Implicit Scalable Encoding for SLAM. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 12776–12786.
- Johari, M.M.; Carta, C.; Fleuret, F. ESLAM: Efficient Dense SLAM System Based on Hybrid Representation of Signed Distance Fields. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 18–22 June 2023; pp. 17408–17419.
- Xu, Z.; Niu, J.; Li, Q.; Ren, T.; Chen, C. NID-SLAM: Neural Implicit Representation-Based RGB-D SLAM in Dynamic Environments. In Proceedings of the 2024 IEEE International Conference on Multimedia and Expo (ICME), Niagara Falls, ON, Canada, 15–19 July 2024; pp. 1–6.
- Ruan, C.; Zang, Q.; Zhang, K.; Huang, K. DN-SLAM: A Visual SLAM with ORB Features and NeRF Mapping in Dynamic Environments. IEEE Sens. J. 2024, 24, 5279–5287.
- Jiang, H.; Xu, Y.; Li, K.; Feng, J.; Zhang, L. RoDyn-SLAM: Robust Dynamic Dense RGB-D SLAM with Neural Radiance Fields. IEEE Robot. Autom. Lett. 2024, 9, 7509–7516.
- Li, M.; Guo, Z.; Deng, T.; Zhou, Y.; Ren, Y.; Wang, H. DDN-SLAM: Real-Time Dense Dynamic Neural Implicit SLAM. IEEE Robot. Autom. Lett. 2025, 10, 4300–4307.
- von Gioi, R.G.; Jakubowicz, J.; Morel, J.-M.; Randall, G. LSD: A Line Segment Detector. Image Process. Line 2012, 2, 35–55.
- Akinlar, C.; Topal, C. EDLines: A Real-Time Line Segment Detector with a False Detection Control. Pattern Recognit. Lett. 2011, 32, 1633–1642.
- Ding, W.; Wang, Z.; Hu, S. OTPL: A Novel Measurement Method of Structural Parallelism Based on Orientation Transformation and Geometric Constraints. Signal Process. Image Commun. 2025, 137, 117310.
- Feng, C.; Taguchi, Y.; Kamat, V.R. Fast Plane Extraction in Organized Point Clouds Using Agglomerative Hierarchical Clustering. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, China, 31 May–7 June 2014; pp. 6218–6225.
- Ultralytics. YOLOv11: Ultralytics Object Detection Framework. Available online: https://github.com/ultralytics/ultralytics (accessed on 3 August 2025).
- Lin, T.-Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common Objects in Context. In Proceedings of the European Conference on Computer Vision (ECCV), Zurich, Switzerland, 6–12 September 2014; pp. 740–755.
- Sturm, J.; Engelhard, N.; Endres, F.; Burgard, W.; Cremers, D. A Benchmark for the Evaluation of RGB-D SLAM Systems. In Proceedings of the 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vilamoura-Algarve, Portugal, 7–12 October 2012; pp. 573–580.
- Tang, Y.-F.; Tai, C.; Chen, F.-X.; Zhang, W.-T.; Zhang, T.; Liu, X.-P.; Liu, Y.-J.; Zeng, L. Mobile Robot Oriented Large-Scale Indoor Dataset for Dynamic Scene Understanding. In Proceedings of the 2024 IEEE International Conference on Robotics and Automation (ICRA), Yokohama, Japan, 13–17 May 2024; pp. 613–620.
- Grupp, M. evo: Python Package for the Evaluation of Odometry and SLAM. Available online: https://github.com/MichaelGrupp/evo (accessed on 20 October 2025).














| Parameter | Value |
|---|---|
| Temporal smoothing factor | 0.7 |
| Descriptor matching threshold | 0.6 |
| Diffusion radius (r) | 0.15 m |
| Velocity magnitude threshold | 3 |
| Velocity direction threshold | 2 |
| Reprojection error threshold | 2 |
| Epipolar distance threshold | 2 |
| Likelihood fusion weights | 0.4, 0.4, 0.2 |
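To illustrate how a subset of these settings might interact, the sketch below fuses a reprojection cue, an epipolar cue, and the feature's dynamic probability into a single observation weight, using the tabulated fusion weights (0.4, 0.4, 0.2) and the two geometric thresholds (assumed here to be in pixels). The Gaussian-shaped likelihood terms and the 0.2 suppression floor are illustrative assumptions, not the paper's exact formulation of the likelihood in Section 3.5.

```python
# Hedged sketch: combining reprojection, epipolar, and dynamic-probability cues
# into one per-observation weight with the tabulated fusion weights.
import numpy as np

def observation_weight(reproj_err, epi_dist, p_dynamic,
                       tau_reproj=2.0, tau_epi=2.0, weights=(0.4, 0.4, 0.2)):
    """reproj_err: reprojection residual of the observation (assumed pixels).
    epi_dist:    point-to-epipolar-line distance in a second view (assumed pixels).
    p_dynamic:   current dynamic probability of the feature, in [0, 1]."""
    # Turn each cue into a "static likelihood" in [0, 1] (Gaussian-shaped, assumed).
    l_reproj = np.exp(-0.5 * (reproj_err / tau_reproj) ** 2)
    l_epi = np.exp(-0.5 * (epi_dist / tau_epi) ** 2)
    l_prob = 1.0 - p_dynamic

    # Weighted fusion with the tabulated weights (0.4, 0.4, 0.2).
    w1, w2, w3 = weights
    fused = w1 * l_reproj + w2 * l_epi + w3 * l_prob

    # Observations below a confidence floor are suppressed entirely
    # (the 0.2 floor is an illustrative choice).
    return 0.0 if fused < 0.2 else float(fused)


# A consistent static feature keeps nearly full weight (about 0.97) ...
print(observation_weight(0.5, 0.5, 0.05))
# ... while a large-residual feature on a moving object is suppressed (0.0).
print(observation_weight(4.0, 5.0, 0.9))
```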
| Sequence | ORB-SLAM3 | SG-SLAM | DRG-SLAM | Dynamic-VINS | DS-SLAM | MGD-SLAM |
|---|---|---|---|---|---|---|
| s_half | 0.019 | 0.357 | 0.073 | 0.079 | 0.015 | 0.034 |
| s_rpy | 0.031 | 0.454 | 0.032 | 0.098 | 0.029 | 0.042 |
| s_static | 0.009 | 0.163 | 0.007 | 0.011 | 0.012 | 0.009 |
| s_xyz | 0.031 | 0.316 | 0.014 | 0.033 | 0.031 | 0.021 |
| w_half | 0.427 | 0.521 | 0.567 | 0.059 | 0.032 | 0.022 |
| w_rpy | 0.829 | 0.641 | 0.042 | 0.167 | 0.446 | 0.038 |
| w_static | 0.144 | 0.349 | 0.011 | 0.049 | 0.007 | 0.007 |
| w_xyz | 0.501 | 0.551 | 0.022 | 0.050 | 0.032 | 0.013 |
| Avg. | 0.249 | 0.419 | 0.096 | 0.068 | 0.076 | 0.023 |
| Sequence | NICE-SLAM | ESLAM | NID-SLAM | RoDyn-SLAM | DDN-SLAM | MGD-SLAM |
|---|---|---|---|---|---|---|
| w_half | 0.134 | 0.617 | 0.070 | 0.056 | 0.023 | 0.022 |
| w_rpy | 0.734 | 0.573 | 0.678 | 0.045 | 0.039 | 0.038 |
| w_static | 0.491 | 0.075 | 0.063 | 0.017 | 0.010 | 0.007 |
| w_xyz | 0.421 | 0.435 | 0.072 | 0.083 | 0.013 | 0.013 |
| Avg. | 0.445 | 0.425 | 0.221 | 0.050 | 0.021 | 0.020 |
| Sequence | ORB-SLAM3 | MWSLAM | MGD-SLAM-nD | MGD-SLAM |
|---|---|---|---|---|
| w_half | 0.427 | 0.293 | 0.340 | 0.022 |
| w_rpy | 0.829 | 0.164 | 0.148 | 0.038 |
| w_static | 0.144 | 0.019 | 0.020 | 0.007 |
| w_xyz | 0.501 | 0.291 | 0.279 | 0.013 |
| Avg. | 0.475 | 0.192 | 0.197 | 0.020 |
| Sequence | ORB-SLAM3 | MWSLAM | MGD-SLAM-nD | MGD-SLAM |
|---|---|---|---|---|
| w_half | 0.521 | 1.268 | 0.705 | 0.424 |
| w_rpy | 0.641 | 0.878 | 0.832 | 0.595 |
| w_static | 0.349 | 0.585 | 0.551 | 0.184 |
| w_xyz | 0.551 | 0.813 | 0.612 | 0.396 |
| Avg. | 0.516 | 0.886 | 0.675 | 0.400 |
| Module | Average Time (ms) |
|---|---|
| Multi-granularity feature extraction | 31 |
| Dynamic feature removal | 9 |
| Semantic detection (asynchronous) | 12 |
| Total tracking time | 46 |

