HBEVOcc: Height-Aware Bird’s-Eye-View Representation for 3D Occupancy Prediction from Multi-Camera Images

Lyu, Chuandong; Li, Wenkai; Liao, Iman Yi; Ding, Fengqian; Liu, Han; Zhou, Hongchao

doi:10.3390/s26030934

This is an early access version, the complete PDF, HTML, and XML versions will be available soon.

Open AccessArticle

HBEVOcc: Height-Aware Bird’s-Eye-View Representation for 3D Occupancy Prediction from Multi-Camera Images

by

Chuandong Lyu

¹

,

Wenkai Li

¹,

Iman Yi Liao

²

,

Fengqian Ding

¹

,

Han Liu

¹

and

Hongchao Zhou

^1,*

¹

School of Information Science and Engineering, Shandong University, Qingdao 266237, China

²

School of Computer Science, University of Nottingham Malaysia, Semenyih 43500, Malaysia

^*

Author to whom correspondence should be addressed.

Sensors 2026, 26(3), 934; https://doi.org/10.3390/s26030934 (registering DOI)

Submission received: 26 December 2025 / Revised: 20 January 2026 / Accepted: 30 January 2026 / Published: 1 February 2026

(This article belongs to the Section Sensing and Imaging)

Download Versions Notes

Abstract

Due to the ability to perceive fine-grained 3D scenes and recognize objects of arbitrary shapes, 3D occupancy prediction plays a crucial role in vision-centric autonomous driving and robotics. However, most existing methods rely on voxel-based methods, which inevitably demand a large amount of memory and computing resources. To address this challenge and facilitate more efficient 3D occupancy prediction, we propose HBEVOcc, a Bird’s-Eye-View based method for 3D scene representation with a novel height-aware deformable attention module, which can effectively leverage latent height information within BEV framework to compensate for lack of height dimension, significantly reducing computing resource consumption while enhancing the performance. Specifically, our method first extracts multi-camera image features and lifts these 2D features into 3D BEV occupancy features via explicit and implicit view transformations. The BEV features are then further processed by a BEV feature extraction network and height-aware deformable attention module, with the final 3D occupancy prediction results obtained through a prediction head. To further enhance voxel supervision along the height axis, we introduce a height-aware voxel loss with adaptive vertical weighting. Extensive experiments on the Occ3D-nuScenes and OpenOcc dataset demonstrate that HBEVOcc can achieve state-of-the-art results in terms of both mIoU and RayIoU metrics with less training memory (even when trained on 2080Ti).

Keywords: autonomous driving; 3D occupancy prediction; multi-camera; BEV representation

Share and Cite

MDPI and ACS Style

Lyu, C.; Li, W.; Liao, I.Y.; Ding, F.; Liu, H.; Zhou, H. HBEVOcc: Height-Aware Bird’s-Eye-View Representation for 3D Occupancy Prediction from Multi-Camera Images. Sensors 2026, 26, 934. https://doi.org/10.3390/s26030934

AMA Style

Lyu C, Li W, Liao IY, Ding F, Liu H, Zhou H. HBEVOcc: Height-Aware Bird’s-Eye-View Representation for 3D Occupancy Prediction from Multi-Camera Images. Sensors. 2026; 26(3):934. https://doi.org/10.3390/s26030934

Chicago/Turabian Style

Lyu, Chuandong, Wenkai Li, Iman Yi Liao, Fengqian Ding, Han Liu, and Hongchao Zhou. 2026. "HBEVOcc: Height-Aware Bird’s-Eye-View Representation for 3D Occupancy Prediction from Multi-Camera Images" Sensors 26, no. 3: 934. https://doi.org/10.3390/s26030934

APA Style

Lyu, C., Li, W., Liao, I. Y., Ding, F., Liu, H., & Zhou, H. (2026). HBEVOcc: Height-Aware Bird’s-Eye-View Representation for 3D Occupancy Prediction from Multi-Camera Images. Sensors, 26(3), 934. https://doi.org/10.3390/s26030934

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

HBEVOcc: Height-Aware Bird’s-Eye-View Representation for 3D Occupancy Prediction from Multi-Camera Images

Abstract

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI