Article

Deep Reinforcement Learning for Navigation via Multi-Modal Belief State Representation from LiDAR and Depth Sensors

School of Automation, Central South University, Changsha 410083, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2026, 16(8), 3787; https://doi.org/10.3390/app16083787
Submission received: 13 March 2026 / Revised: 3 April 2026 / Accepted: 7 April 2026 / Published: 13 April 2026
(This article belongs to the Special Issue AI Applications in Modern Industrial Systems)

Abstract

This paper presents a deep reinforcement learning framework for autonomous navigation based on a multi-modal belief state representation learned from LiDAR and depth sensors. To address the challenges posed by partial observability and sensor-specific uncertainty, we propose a probabilistic representation module that models belief states as Gaussian distributions over latent environmental features. Sensor-specific encoders extract structured features from raw LiDAR and depth inputs, which are fused using a Q-value-guided weighting scheme derived from the policy critic. A motion-prediction pretraining strategy and a cross-modal coherence loss are introduced to enhance the alignment and reliability of the learned belief states. The resulting representation is integrated into a Soft Actor–Critic (SAC) framework to enable policy-driven decision-making under uncertainty. Extensive experiments in simulated environments demonstrate that the proposed method improves success rate, navigation efficiency, and generalization relative to baseline methods. Real-world experiments further validate these findings, with the proposed method outperforming a classical navigation baseline by reducing average travel time by 16% and path length by 4%. These results support the use of probabilistic multi-modal belief modeling for autonomous navigation under partial observability.
Keywords: autonomous navigation; deep reinforcement learning; belief state; multi-sensor fusion
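
The abstract outlines several mechanisms: per-sensor encoders that produce Gaussian belief states over latent features, a Q-value-guided fusion of the modalities, and a cross-modal coherence loss. The PyTorch sketch below illustrates one way such a pipeline could be wired together. It is a minimal illustration only: every class name, layer size, the per-modality critic-score interface, and the specific form of the coherence loss are assumptions made for exposition, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SensorEncoder(nn.Module):
    """Maps one raw sensor stream (e.g., a flattened LiDAR scan or depth
    image) to the mean and log-variance of a Gaussian latent belief."""
    def __init__(self, input_dim: int, latent_dim: int):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(input_dim, 256), nn.ReLU(),
            nn.Linear(256, 128), nn.ReLU(),
        )
        self.mu_head = nn.Linear(128, latent_dim)
        self.logvar_head = nn.Linear(128, latent_dim)

    def forward(self, x):
        h = self.backbone(x)
        return self.mu_head(h), self.logvar_head(h)

class BeliefFusion(nn.Module):
    """Fuses per-sensor Gaussian latents with weights derived from the
    critic: here, a softmax over hypothetical per-modality Q estimates."""
    def forward(self, mus, logvars, q_values):
        # q_values: (batch, num_modalities) critic scores, one per sensor.
        w = F.softmax(q_values, dim=-1).unsqueeze(-1)        # (B, M, 1)
        mu = (w * torch.stack(mus, dim=1)).sum(dim=1)        # (B, D)
        # Simple weighted average of variances (ignores the spread
        # between the per-modality means; an approximation for brevity).
        var = (w * torch.stack([lv.exp() for lv in logvars], dim=1)).sum(dim=1)
        return mu, var.log()

def reparameterize(mu, logvar):
    """Sample a belief state with the reparameterization trick."""
    return mu + torch.randn_like(mu) * (0.5 * logvar).exp()

def coherence_loss(mu_lidar, mu_depth):
    """One plausible cross-modal coherence term: pull the two sensors'
    latent means together for the same scene (an assumption; the exact
    loss is not specified in the abstract)."""
    return F.mse_loss(mu_lidar, mu_depth)

# Illustrative usage with made-up input shapes.
lidar_enc = SensorEncoder(input_dim=360, latent_dim=32)    # 360-beam scan
depth_enc = SensorEncoder(input_dim=64 * 64, latent_dim=32)
fusion = BeliefFusion()

scan = torch.randn(8, 360)
depth = torch.randn(8, 64 * 64)
mu_l, lv_l = lidar_enc(scan)
mu_d, lv_d = depth_enc(depth)
q = torch.randn(8, 2)  # stand-in for per-modality critic scores
mu_b, lv_b = fusion([mu_l, mu_d], [lv_l, lv_d], q)
belief = reparameterize(mu_b, lv_b)   # would be fed to the SAC policy
loss_c = coherence_loss(mu_l, mu_d)
```

In this sketch the fused belief sample would serve as the state input to a standard SAC actor and critic; the motion-prediction pretraining stage mentioned in the abstract is omitted here for brevity.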

