LiDAR Dreamer: Efficient World Model for Autonomous Racing with Cartesian-Polar Encoding and Lightweight State-Space Cells
Abstract
1. Introduction
- Utilizing a Cartesian Polar Bar Chart as the observation encoding exposes more of the raw LiDAR input than existing Dreamer-based autonomous driving algorithms, while keeping the input representation format consistent across maps; it yields better results on tracks with gradual curves than on those with sharp corners (a sketch of such an encoding follows this list).
- The proposed Light Structured State-Space Cell (LS3C) reduces model size while improving performance on LiDAR-based image inputs.
- Replacing the KL divergence of the baseline Dreamer with the Displacement Covariance Distance reduces the probability of model collapse.
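To make the first contribution concrete, the sketch below rasterizes a 1-D LiDAR scan into a bar-chart image drawn in Cartesian axes: each beam becomes a vertical bar whose column encodes the beam angle and whose height encodes the normalized range. The function name, image size, and maximum range are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

def scan_to_bar_chart(ranges: np.ndarray,
                      img_size: int = 64,
                      max_range: float = 15.0) -> np.ndarray:
    """Rasterize a 1-D LiDAR scan into a binary bar-chart image.

    Column position encodes the beam angle (Cartesian x-axis) and bar
    height encodes the clipped, normalized range (Cartesian y-axis).
    """
    ranges = np.clip(ranges, 0.0, max_range) / max_range      # scale to [0, 1]
    # Map each beam to an image column (several beams may share a column).
    cols = np.linspace(0, img_size - 1, num=len(ranges)).astype(int)
    img = np.zeros((img_size, img_size), dtype=np.float32)
    for col, r in zip(cols, ranges):
        height = int(r * (img_size - 1))
        img[img_size - 1 - height:, col] = 1.0                # fill the bar bottom-up
    return img

# Example: a 1080-beam scan rendered as a 64x64 observation image.
scan = np.random.uniform(0.5, 15.0, size=1080)
obs = scan_to_bar_chart(scan)
print(obs.shape)  # (64, 64)
```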
2. Related Works
2.1. Model-Free Reinforcement Learning
2.2. Model-Based Reinforcement Learning
2.3. Dreamer
2.4. Dreamer-Based Autonomous Driving
2.5. Dreamer-Based Autonomous Racing
3. LiDAR-Dreamer
3.1. Cartesian Polar Bar Chart
3.2. Light Structured State-Space Cell
3.3. Displacement Covariance Distance
3.4. Synergy of the Three Components
4. Experimental Results
4.1. Input Plot—Cartesian Polar Bar Chart
4.2. Recurrent Neural Network—Light Structured State-Space Cell
4.3. Divergence—Displacement Covariance Distance
4.4. Comparison with Other Reinforcement Learning Models
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Lyu, C.; Lu, D.; Xiong, C.; Hu, R.; Jin, Y.; Wang, J.; Zeng, Z.; Lian, L. Toward a gliding hybrid aerial underwater vehicle: Design, fabrication, and experiments. J. Field Robot. 2022, 39, 543–556.
- Kabzan, J.; Valls, M.I.; Reijgwart, V.J.; Hendrikx, H.F.; Ehmke, C.; Prajapat, M.; Bühler, A.; Gosala, N.; Gupta, M.; Sivanesan, R.; et al. AMZ driverless: The full autonomous racing system. J. Field Robot. 2020, 37, 1267–1294.
- Law, C.K.; Dalal, D.; Shearrow, S. Robust model predictive control for autonomous vehicles/self driving cars. arXiv 2018, arXiv:1805.08551.
- Rosolia, U.; Borrelli, F. Learning how to autonomously race a car: A predictive control approach. IEEE Trans. Control Syst. Technol. 2019, 28, 2713–2719.
- Sezer, V.; Gokasan, M. A novel obstacle avoidance algorithm: “follow the gap method”. Robot. Auton. Syst. 2012, 60, 1123–1134.
- Otterness, N. Disparity Extender. Available online: https://www.nathanotterness.com/2019/04/the-disparity-extender-algorithm-and.html (accessed on 9 October 2025).
- Scharf, L.L.; Harthill, W.P.; Moose, P.H. A comparison of expected flight times for intercept and pure pursuit missiles. IEEE Trans. Aerosp. Electron. Syst. 1969, AES-5, 672–673.
- Wit, J.; Crane, C.D., III; Armstrong, D. Autonomous ground vehicle path tracking. J. Robot. Syst. 2004, 21, 439–449.
- Garcia, C.E.; Prett, D.M.; Morari, M. Model predictive control: Theory and practice—A survey. Automatica 1989, 25, 335–348.
- Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction; MIT Press: Cambridge, MA, USA, 1998; Volume 1, pp. 9–11.
- Silver, D.; Hubert, T.; Schrittwieser, J.; Antonoglou, I.; Lai, M.; Guez, A.; Lanctot, M.; Sifre, L.; Kumaran, D.; Graepel, T.; et al. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science 2018, 362, 1140–1144.
- Koh, J.Y.; Lee, H.; Yang, Y.; Baldridge, J.; Anderson, P. Pathdreamer: A world model for indoor navigation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 14738–14748.
- Li, Q.; Jia, X.; Wang, S.; Yan, J. Think2Drive: Efficient Reinforcement Learning by Thinking with Latent World Model for Autonomous Driving (in CARLA-V2). In Proceedings of the European Conference on Computer Vision, Milan, Italy, 29 September–4 October 2024; Springer Nature: Cham, Switzerland, 2024; pp. 142–158.
- Seo, Y.; Kim, J.; James, S.; Lee, K.; Shin, J.; Abbeel, P. Multi-view masked world models for visual robotic manipulation. In Proceedings of the International Conference on Machine Learning, Honolulu, HI, USA, 23–29 July 2023; PMLR: New York, NY, USA, 2023; pp. 30613–30632.
- Wu, P.; Escontrela, A.; Hafner, D.; Abbeel, P.; Goldberg, K. DayDreamer: World models for physical robot learning. In Proceedings of the Conference on Robot Learning, Auckland, New Zealand, 14–18 December 2022; PMLR: New York, NY, USA, 2023; pp. 2226–2240.
- Barth-Maron, G.; Hoffman, M.W.; Budden, D.; Dabney, W.; Horgan, D.; Tb, D.; Muldal, A.; Heess, N.; Lillicrap, T. Distributed distributional deterministic policy gradients. arXiv 2018, arXiv:1804.08617.
- Abdolmaleki, A.; Springenberg, J.T.; Tassa, Y.; Munos, R.; Heess, N.; Riedmiller, M. Maximum a posteriori policy optimisation. arXiv 2018, arXiv:1806.06920.
- Schulman, J.; Wolski, F.; Dhariwal, P.; Radford, A.; Klimov, O. Proximal policy optimization algorithms. arXiv 2017, arXiv:1707.06347.
- Haarnoja, T.; Zhou, A.; Abbeel, P.; Levine, S. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. In Proceedings of the International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; PMLR: New York, NY, USA, 2018; pp. 1861–1870.
- Deisenroth, M.; Rasmussen, C.E. PILCO: A model-based and data-efficient approach to policy search. In Proceedings of the 28th International Conference on Machine Learning (ICML-11), Bellevue, WA, USA, 28 June–2 July 2011; pp. 465–472.
- Janner, M.; Fu, J.; Zhang, M.; Levine, S. When to trust your model: Model-based policy optimization. In Proceedings of the 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, BC, Canada, 8–14 December 2019; Volume 32. Available online: https://proceedings.neurips.cc/paper_files/paper/2019/file/5faf461eff3099671ad63c6f3f094f7f-Paper.pdf (accessed on 9 October 2025).
- Hafner, D.; Lillicrap, T.; Fischer, I.; Villegas, R.; Ha, D.; Lee, H.; Davidson, J. Learning latent dynamics for planning from pixels. In Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; PMLR: New York, NY, USA, 2019; pp. 2555–2565.
- Hafner, D.; Lillicrap, T.; Ba, J.; Norouzi, M. Dream to control: Learning behaviors by latent imagination. arXiv 2019, arXiv:1912.01603.
- Hafner, D.; Lillicrap, T.; Norouzi, M.; Ba, J. Mastering Atari with discrete world models. arXiv 2020, arXiv:2010.02193.
- Hafner, D.; Pasukonis, J.; Ba, J.; Lillicrap, T. Mastering diverse domains through world models. arXiv 2023, arXiv:2301.04104.
- Brunnbauer, A.; Berducci, L.; Brandstätter, A.; Lechner, M.; Hasani, R.; Rus, D.; Grosu, R. Latent imagination facilitates zero-shot transfer in autonomous racing. In Proceedings of the 2022 International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA, 23–27 May 2022; IEEE: New York, NY, USA, 2022; pp. 7513–7520.
- Akiba, T.; Sano, S.; Yanase, T.; Ohta, T.; Koyama, M. Optuna: A next-generation hyperparameter optimization framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 2623–2631.
| Mathematical Symbol | Meaning | Mathematical Symbol | Meaning |
|---|---|---|---|
| $z_t$ | Update gate | $W_z$ | Weight for the input to the gate |
| $x_t$ | Input vector at time $t$ | $U_z$ | Weight for the previous state to the gate |
| $h_{t-1}$ | Previous cell state | $W_h$ | Weight for the input to the candidate state |
| $\tilde{h}_t$ | Candidate cell state | $U_h$ | Weight for the previous state to the candidate state |
| $h_t$ | Current cell state | $A$ | Structured state transition matrix |
| Step | GRU Cell | LS3 Cell | Key Difference |
|---|---|---|---|
| Update gate | $z_t = \sigma(W_z x_t + U_z h_{t-1})$ | Same | Same |
| Reset/modulation gate | $r_t = \sigma(W_r x_t + U_r h_{t-1})$ | (none) | r-gate removed → fewer parameters and computations |
| Candidate state | $\tilde{h}_t = \tanh(W_h x_t + U_h (r_t \odot h_{t-1}))$ | $\tilde{h}_t = \tanh(W_h x_t + U_h h_{t-1})$ | Without the r-gate, the entire previous state feeds into the candidate |
| Past-state transformation | None (implicitly an identity matrix) | $\bar{h}_{t-1} = A h_{t-1}$ | Extra linear transform introduced to capture sequential structure |
| Final state | $h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$ | $h_t = (1 - z_t) \odot \bar{h}_{t-1} + z_t \odot \tilde{h}_t$ | Uses $\bar{h}_{t-1}$ instead of raw $h_{t-1}$ |
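Read as code, the right-hand column of the table amounts to the following single-step recurrence. This is a minimal NumPy sketch of the reconstructed equations; bias terms are omitted, and the identity initialization of the structured transition matrix $A$ is an assumption rather than the authors' parameterization.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LS3Cell:
    """One recurrence step of the Light Structured State-Space Cell.

    Relative to a GRU: the reset gate is removed, and the previous state
    is passed through a structured transition matrix A before the final mix.
    """

    def __init__(self, input_dim: int, hidden_dim: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        s = 1.0 / np.sqrt(hidden_dim)
        self.W_z = rng.uniform(-s, s, (hidden_dim, input_dim))   # input -> update gate
        self.U_z = rng.uniform(-s, s, (hidden_dim, hidden_dim))  # prev. state -> update gate
        self.W_h = rng.uniform(-s, s, (hidden_dim, input_dim))   # input -> candidate
        self.U_h = rng.uniform(-s, s, (hidden_dim, hidden_dim))  # prev. state -> candidate
        self.A = np.eye(hidden_dim)   # structured state transition matrix (assumed init)

    def step(self, x_t: np.ndarray, h_prev: np.ndarray) -> np.ndarray:
        z_t = sigmoid(self.W_z @ x_t + self.U_z @ h_prev)        # update gate
        h_cand = np.tanh(self.W_h @ x_t + self.U_h @ h_prev)     # candidate, no r-gate
        h_bar = self.A @ h_prev                                  # past-state transformation
        return (1.0 - z_t) * h_bar + z_t * h_cand                # final state

# Roll the cell over a short random input sequence.
cell = LS3Cell(input_dim=8, hidden_dim=16)
h = np.zeros(16)
for x in np.random.default_rng(1).normal(size=(5, 8)):
    h = cell.step(x, h)
print(h.shape)  # (16,)
```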
| Mathematical Symbol | Meaning |
|---|---|
| $P, Q$ | Two probability distributions on $\mathbb{R}^n$ |
| $\gamma^*$ | Optimal transport plan |
| $\Pi(P, Q)$ | The set of all possible joint distributions of $P$ and $Q$ |
| $\gamma$ | A specific transport plan |
| $\mathbb{R}^n$ | Real $n$-dimensional space |
| $x, y$ | Realized values of samples drawn from the distributions |
| $x - y$ | Displacement |
| $\mu_P, \mu_Q$ | The mean of each distribution |
| $\Sigma_{PQ}$ | Cross-covariance |
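For intuition, the quantities above can be assembled into a sample-based distance as sketched below. The decomposition $\mathbb{E}\|x - y\|^2 = \|\mu_P - \mu_Q\|^2 + \mathrm{Tr}(\Sigma_P) + \mathrm{Tr}(\Sigma_Q) - 2\,\mathrm{Tr}(\Sigma_{PQ})$ holds for any paired coupling; whether the paper's Displacement Covariance Distance takes exactly this additive form is an assumption here, not a claim about its definition.

```python
import numpy as np

def displacement_covariance_distance(x: np.ndarray, y: np.ndarray) -> float:
    """Empirical E||x - y||^2 for paired samples, decomposed into squared
    mean displacement plus variance and cross-covariance trace terms.

    Assumes x and y have the same number of rows (a paired transport plan);
    the combination rule is illustrative, not the paper's exact formula.
    """
    n = len(x)
    mu_x, mu_y = x.mean(axis=0), y.mean(axis=0)
    disp = mu_x - mu_y                           # displacement of the means
    xc, yc = x - mu_x, y - mu_y
    var_x = np.trace(xc.T @ xc) / n              # Tr(Sigma_P)
    var_y = np.trace(yc.T @ yc) / n              # Tr(Sigma_Q)
    cross = np.trace(xc.T @ yc) / n              # Tr(Sigma_PQ), cross-covariance term
    return float(disp @ disp + var_x + var_y - 2.0 * cross)

rng = np.random.default_rng(0)
p = rng.normal(0.0, 1.0, size=(2000, 4))         # samples from P
q = rng.normal(0.5, 1.0, size=(2000, 4))         # samples from Q
print(displacement_covariance_distance(p, q))    # ~ 4*0.25 + 8 = 9 for independent samples
```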
| Track | Min. Width (m) | Track Length (m) | Min. Radius (m) |
|---|---|---|---|
| Austria | 1.86 | 79.45 | 2.78 |
| Columbia | 3.53 | 61.20 | 7.68 |
| Treitlstrasse | 0.89 | 51.65 | 3.55 |
| Model | Austria: Step at Peak | Austria: Progress at Peak | Columbia: Step at Peak | Columbia: Progress at Peak | Treitlstrasse: Step at Peak | Treitlstrasse: Progress at Peak |
|---|---|---|---|---|---|---|
| dreamerv1+lidar | 2,963,040 | 1.1557 | 2,873,880 | 1.4029 | 2,255,480 | 1.2099 |
| dreamerv1+lidaroccupancy | 2,955,520 | 1.2144 | 2,778,560 | 1.4805 | 1,255,160 | 1.3090 |
| dreamerv2+lidar | 2,856,880 | 1.0671 | 2,160,680 | 1.6327 | 2,952,920 | 1.3595 |
| dreamerv2+lidaroccupancy | 1,565,320 | 1.2321 | 2,762,720 | 1.7101 | 2,955,120 | 1.6414 |
| dreamerv3+lidar | 1,555,400 | 1.2742 | 2,971,120 | 1.4764 | 2,352,760 | 1.2849 |
| dreamerv3+lidaroccupancy | 2,955,920 | 1.1753 | 2,470,880 | 1.6361 | 2,952,040 | 1.3658 |
| model-based; mbpo | 1,355,400 | 0.9791 | 260,280 | 1.1416 | 152,240 | 0.9868 |
| model-based; pilco | 956,040 | 0.9849 | 1,161,680 | 0.9975 | 351,640 | 0.9836 |
| model-free; d4pg | 0 | 0.0302 | 2,423,400 | 0.2125 | 0 | 0.0692 |
| model-free; mpo | 0 | 0.0717 | 2,960,560 | 1.9109 | 0 | 0.1153 |
| model-free; ppo | 860,160 | 0.3600 | 1,167,360 | 2.0939 | 2,478,080 | 0.3234 |
| model-free; sac | 2,083,080 | 0.2875 | 769,860 | 2.0048 | 1,057,490 | 0.1270 |
| proposed method | 1,466,400 | 1.1428 | 2,867,320 | 1.5649 | 2,251,320 | 1.1604 |