1. Introduction
Aging buildings represent a major challenge in the global construction landscape, with a substantial share urgently requiring retrofitting and upgrades to comply with current performance and safety standards [1]. A significant portion of the built environment lacks reliable as-built documentation, a problem especially prevalent in aging residential and commercial buildings where original design drawings are often missing, inaccurate, or outdated [2]. The absence of precise as-built information creates serious challenges for building simulations, cost estimations, and renovation planning. To address this, more accurate and up-to-date digital models can be created. The practical applications of these models include energy retrofitting, facility management, post-disaster reconstruction, façade renovation, solar panel placement, and interior redesign, all of which rely on sufficiently detailed digital representations of existing buildings. These models not only enhance operational efficiency in facility management but also play a vital role in improving energy performance and supporting sustainability goals [3]. Without them, these processes become time-consuming, error-prone, and costly, often requiring extensive on-site measurements and manual reconstruction of building geometry.
In addition, the construction sector accounts for nearly 40% of global energy use and carbon emissions, and retrofitting the existing building stock is essential to achieving decarbonization and sustainability targets [4]. However, many aging buildings remain undocumented or poorly represented in digital form, limiting the ability of owners and facility managers to plan retrofits, monitor performance, and implement data-driven maintenance. Automating the generation of building models from imagery offers a practical pathway to accelerate digital transformation across the industry. It enables small firms and public agencies, often constrained by cost and technical expertise, to access up-to-date spatial data for energy analysis, facility management, and retrofit and renovation planning. This underscores the importance of digital workflows, especially when integrated with Artificial Intelligence (AI) techniques, which contribute not only to technical innovation but also to societal goals of reducing emissions, extending building life cycles, and improving safety and occupant comfort through better-informed decision-making [5].
To address these challenges and enhance the accessibility of digital modeling solutions, Scan-to-BIM has emerged as a workflow for capturing the geometry of physical structures using 3D scanning technologies and creating intelligent models within BIM software [6,7]. This approach is valuable for generating as-built BIM models of aging or repurposed buildings that lack original design documents [8]. The conventional Scan-to-BIM workflow consists of acquiring 3D point cloud (PC) data, analyzing and classifying building elements, and manually converting the segmented data into as-built BIM models [9,10,11]. However, despite its widespread adoption, this process remains labor-intensive and difficult to standardize [12]. As a result, recent research has increasingly focused on automating the segmentation and classification of PCs using Deep Neural Network (DNN) algorithms to extract semantically segmented features, along with their corresponding geometry, toward a fully automated Scan-to-BIM workflow [12].
Although these approaches have advanced the Scan-to-BIM reconstruction process, several challenges remain. First, the degree of automation remains constrained, as the generation of high-quality 3D semantic and BIM models still depends on manual or semi-automated tasks in existing approaches [10,11,13]. Second, current modeling techniques do not fully utilize the semantic information embedded in PCs, as segmentation is performed after PC generation, missing valuable priors that could improve integration and geometric consistency [14,15]. Third, PCs, an essential data source for developing BIM models [16], are typically obtained through range-based techniques such as laser scanning (LiDAR) or image-based methods such as digital photogrammetry [17]. Despite their effectiveness, these methods present challenges in terms of cost, efficiency, and the need for post-processing [18,19]. At their current stage, both methods require post-processing of the generated point cloud to perform semantic segmentation of building elements and extract the geometric information necessary for BIM modeling.
With the advancement of AI techniques, post-processing of PCs has improved classification and segmentation accuracy; however, it continues to pose challenges to automation in both range-based and image-based methods. Common issues include noise in raw data, structural incompleteness due to occlusions in scans [20], and difficulties detecting components on reflective or textureless surfaces [21]. In image-based methods, projection-based label transfer can lead to misalignment errors [22]. Moreover, both approaches involve high computational effort and manual intervention. A further limitation lies in the strong dependency on the quality and domain relevance of training data. Widely used 3D segmentation datasets such as S3DIS [23] exhibit non-uniform sampling, noise, and missing regions, which, despite enabling high point-wise semantic accuracy, can degrade object-level reconstruction fidelity [24]. Likewise, although popular for general segmentation tasks, 2D datasets such as ADE20K [25] and NYU Depth V2 [26] lack the architectural specificity and spatial coherence necessary for reliable BIM modeling [27].
For this reason, and to maximize the potential of PCs, this study introduces a novel method that integrates Neural Radiance Fields (NeRF) and vision-language models to generate structured, segmented, and color-labeled PCs directly from images, after which the geometry of building elements is extracted and the BIM model is generated automatically. The proposed method addresses the identified challenges by (1) automating the workflow, (2) embedding semantic labels during reconstruction to eliminate misalignment, avoid post-processing, reduce computational overhead, and mitigate scan-related issues, (3) removing dependence on LiDAR or photogrammetry to reduce cost and setup complexity, and (4) bypassing the limitations of domain-specific 3D and 2D datasets by leveraging direct image-based labeling through large pretrained models. The proposed method is also experimentally evaluated using metrics that assess spatial precision, geometric consistency, and reconstruction performance across both interior and exterior datasets to validate its effectiveness for automated BIM generation.
This paper is organized as follows. Section 2 reviews existing studies and identifies the research gaps. Section 3, Methodology, outlines the integrated workflow for combining image segmentation, NeRF-based 3D reconstruction, and automated BIM modeling. Section 4, Results, provides quantitative evaluations of segmentation accuracy, spatial alignment, and reconstruction quality using multiple NeRF models. Section 5, Discussion, reflects on the implications of the findings, highlights the methodological contributions, and identifies remaining challenges and future directions. Finally, Section 6, Conclusion, summarizes the key outcomes and practical advantages of the proposed method.
2. Literature Review
The automated Scan-to-BIM process has significantly evolved through the integration of DNN-based methods over the past decade. This evolution has been driven by the need to process large-scale PCs efficiently, enhance semantic segmentation, and improve topological consistency in as-built BIM reconstruction. The following sections review the existing studies and identify research gaps, as well as key contributions that have advanced the Scan-to-BIM workflow through improvements in geometry-based, hybrid, and deep learning approaches.
2.1. Traditional Geometry-Based Approaches
Early Scan-to-BIM automation research focused on geometry-based techniques, using planar segmentation, topology rules, and shape detection to extract structural elements from PCs. Tang et al. [28] provided an early comprehensive review by classifying methods into geometric modeling, object recognition, and relationship modeling, while also identifying challenges such as manual intervention, noise, and occlusions. Xiong et al. [29] extended these efforts with a two-phase methodology that combined planar segmentation and voxelization to detect walls, floors, and ceilings, with visibility reasoning and shape estimation for semantic enrichment of detections.
As BIM adoption increased, semi-automated approaches emerged to reduce manual effort by combining rule-based methods with user intervention. Jung et al. [8] introduced a hybrid approach that integrated Random Sample Consensus (RANSAC)-based segmentation, grid-based filtering, and boundary tracing to extract architectural elements with subsequent human refinement. Volk et al. [10] reviewed the technological limitations of BIM adoption and highlighted that while laser scanning, photogrammetry, and automated object recognition have improved PC processing, the conversion of unstructured PC data into semantically rich BIM models remains a challenge.
Subsequent methods have integrated machine learning or optimization algorithms to improve segmentation and topology in automated Scan-to-BIM workflows. For example, Croce et al. [30] proposed a semi-automatic approach that applies machine learning techniques, specifically Random Forest, for semantic segmentation and classification of architectural elements in 3D point clouds; they utilized Rhino and Grasshopper to reconstruct parametric models for H-BIM applications. Ochmann et al. [31] proposed a volumetric multi-story reconstruction method formulated as an integer linear programming problem, using RANSAC-based plane detection to identify structural surfaces and Markov clustering to group spatially related elements while enforcing geometric and topological constraints. Bassier and Vergauwen [32] proposed an unsupervised method that detects different wall axis types (straight, curved, and polyline-based) and reconstructs wall connections using clustering, geometric feature extraction, and topology reconstruction to generate IFC-compliant BIM models. Rausch and Haas [33] proposed an automated parametric approach for updating the shape and pose of BIM elements using PCs; their dyna-BIM method applies genetic algorithms and simulated annealing to align as-designed BIMs with as-built data. Additionally, Perez-Perez et al. [34] enhanced segmentation in complex environments by integrating Support Vector Machines for semantic classification with AdaBoost for geometric labeling and probabilistic graphical models. Their method refined the detection of planar and non-planar features while preserving semantic consistency throughout the automated reconstruction.
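Several of the approaches above use RANSAC-style plane fitting as their core geometric primitive. The following is a minimal, generic sketch of that step in Python (not any cited author's implementation): each iteration hypothesizes a plane from three random points and keeps the hypothesis that explains the most points within a distance threshold.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float, float]

def fit_plane_ransac(points: List[Point], n_iters: int = 200,
                     dist_thresh: float = 0.02, seed: int = 0):
    """Return (a, b, c, d) for the plane ax + by + cz + d = 0 with the
    most inliers, plus the inlier indices."""
    rng = random.Random(seed)
    best_plane, best_inliers = None, []
    for _ in range(n_iters):
        p1, p2, p3 = rng.sample(points, 3)
        # Two edge vectors spanning the candidate plane
        u = tuple(p2[i] - p1[i] for i in range(3))
        v = tuple(p3[i] - p1[i] for i in range(3))
        # Plane normal = u x v
        n = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
        norm = (n[0] ** 2 + n[1] ** 2 + n[2] ** 2) ** 0.5
        if norm < 1e-9:  # degenerate (collinear) sample, skip
            continue
        a, b, c = n[0] / norm, n[1] / norm, n[2] / norm
        d = -(a * p1[0] + b * p1[1] + c * p1[2])
        inliers = [i for i, p in enumerate(points)
                   if abs(a * p[0] + b * p[1] + c * p[2] + d) < dist_thresh]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = (a, b, c, d), inliers
    return best_plane, best_inliers
```

In a Scan-to-BIM context, the segmented wall or floor points would be passed in as `points`, and the fitting would be repeated on the remaining points to peel off successive planar surfaces.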
2.2. Deep Learning-Based Approaches
Recent advancements in PC processing have greatly improved the accuracy and efficiency of 3D reconstruction and segmentation, benefiting Scan-to-BIM applications. Large-scale annotated datasets such as ScanNet [35] have enabled better deep learning models for scene understanding. Methods such as PointNet [36], PointCNN [37], and PointNeXt [38] have refined feature extraction and classification, SEGCloud [39] improved segmentation accuracy, and PointNetLK [40] advanced PC registration. These developments laid the foundation for deep learning in Scan-to-BIM workflows to reduce manual effort and increase automation in as-built modeling.
Huan et al. [41] proposed GeoRec, a DNN for geometry-enhanced semantic 3D reconstruction. The model integrates a geometry extractor with deep learning to improve layout estimation, camera pose recovery, and object detection via three modules: room layout, object detection, and object reconstruction. Tang et al. [42] developed a hybrid approach combining deep learning with morphological operations and RANSAC-based plane detection. Their method classifies PCs into thirteen semantic categories, refines spatial relationships through Markov Random Field optimization, and uses grammar-based modeling to generate IFC-compliant BIM models.
Recent studies have further advanced Scan-to-BIM automation by enhancing DNN models, segmentation, and connectivity detection. Campagnolo et al. [43] developed a fully automated pipeline using DNN-based instance segmentation with BIM-Net++, a lightweight voxel-based Convolutional Neural Network (CNN) designed for semantic segmentation that identifies architectural elements such as walls, floors, and roofs. Their method refines segmentation via RANSAC for planar elements and Density-Based Spatial Clustering of Applications with Noise (DBSCAN) for non-planar elements before BIM reconstruction. Wu et al. [44] presented FLKPP, a prototype that integrates neural networks with architectonic grammar for improved segmentation and reconstruction. Their method uses KPConv for 3D semantic segmentation, floor-layer preprocessing, and DNN-based line detection to generate 2D floor plan grids before reconstructing BIM elements such as walls, doors, and columns. Drobnyi et al. [45] proposed a deep geometric neural network for connectivity detection, modeling spatial relationships through planar segmentation, region growing, and proximity-based clustering with a PointNeXt-based model for edge classification to automate digital twin construction. Mahmoud et al. [46] introduced a framework that first applies a deep learning model for PC segmentation, followed by room clustering using DBSCAN. Line detection via RANSAC extracts walls and major structural elements, while a Dynamo-based algorithm automates the parametric reconstruction of structured components such as walls and floors. The framework achieved high accuracy in semantic segmentation and geometric reconstruction, which enhances the automation of Scan-to-BIM workflows.
2.3. Image-Based Approaches
In addition to the range-based Scan-to-BIM methods previously discussed, recent studies have investigated image-based workflows as cost-effective and accessible alternatives to LiDAR-based approaches. These methods typically utilize 2D RGB images captured with handheld devices or Unmanned Aerial Vehicles (UAVs). Semantic segmentation is performed using DNNs, after which 3D PCs are generated using photogrammetric techniques such as Structure-from-Motion (SfM) or Multi-View Stereo (MVS).
Han et al. [47] developed an indoor reconstruction method using LiDAR sensors and image-based MVS data. They applied DeepLabv3 for semantic segmentation, trained on the Cityscapes dataset and their own annotations, and their system reconstructed walls, floors, and ceilings. Similarly, Pantoja-Rosero et al. [20] aimed to automate the generation of Level of Detail 3 (LOD3) building models, focusing on masonry structures, using SfM and DNN-based semantic segmentation. Their method combined SfM, to generate sparse PCs and camera poses, with TernausNet deep learning models to segment façade openings. PolyFit was then used to create LOD2 models, which were upgraded to LOD3 by triangulating segmented openings into 3D space. The findings demonstrate that the pipeline effectively reconstructs LOD3 models.
In infrastructure-focused work, Saovana et al. [48] introduced Point cloud Classification based on image-based Instance Segmentation (PCIS). This approach utilizes digital images processed through CNNs to generate 2D masks, which are transformed into 3D masks using camera parameters from SfM. These masks classify PCs by projecting rays from camera positions through the masks to the PCs. The findings demonstrate that PCIS achieves high accuracy, with an F1-score of 0.96 for one-class classification and 0.83 for six-class classification. Puliti et al. [49] introduced a method combining infrared thermography and SfM to detect subsurface defects in building envelopes. They applied a segmentation algorithm using temperature-based sliding windows and edge detection to identify thermal anomalies. The method achieved an IoU of 78% for the handheld IR camera test and 83% for the UAV-based IR camera test, along with an F1-score of 0.87 for both tests, demonstrating high accuracy in automated damage detection.
Studies have also addressed both exterior and interior BIM reconstruction. For example, Yang et al. [50] proposed an image-based approach for automatic as-built BIM generation by reconstructing 3D facades and identifying surface materials. Using images from uncalibrated cameras, PCs are created via SfM, segmented with RANSAC, and analyzed through semantic reasoning to detect elements such as walls and windows. Wong et al. [22] proposed an image-based Scan-to-BIM method for interior building reconstruction using handheld phone imagery. Their method integrates photogrammetry, semantic segmentation, projection, and geometry-based refinement. Frames were processed using SfM and MVS to create dense PCs, followed by RANSAC and DBSCAN for structural surface detection. YOLOv8, trained on the HBD dataset, was used to segment building components in 2D images. These semantic masks were projected onto the 3D PC using a pinhole camera model. Weighted voting improved label consistency, and boundary refinements addressed windows and doors. The final data were exported to Revit for automated BIM generation, achieving 100% recognition accuracy and a geometric error of 0.056 m.
2.4. Neural Radiance Field 3D Reconstruction
While existing image-based Scan-to-BIM methods have shown promising results, they typically depend on photogrammetry to reconstruct point clouds and then apply semantic labels through projection or manual mapping. This two-step process is often vulnerable to issues such as occlusions, noisy or incomplete geometry, and inaccurate label alignment, especially in complex indoor scenes with clutter, reflective surfaces, or textureless regions. To address these limitations, this research explores NeRF, which eliminates the need for intermediate point clouds and projection by directly learning a continuous volumetric representation from posed 2D images. NeRF enables a more integrated and robust workflow for view synthesis and geometry reconstruction and offers higher fidelity and resilience in the challenging environments common to indoor BIM applications.
NeRF, introduced by Mildenhall et al. [51], marked a significant breakthrough in photorealistic view synthesis and 3D scene reconstruction. NeRF represents a static scene as a continuous volumetric function that maps five-dimensional inputs, comprising a 3D spatial location and a 2D viewing direction, to a color and volume density. This function is learned through a fully connected Multilayer Perceptron (MLP), which estimates view-dependent radiance and differential opacity at each sampled coordinate. By casting rays from virtual cameras through the scene and integrating predicted values using differentiable volume rendering, NeRF can generate novel views with remarkable visual fidelity from a sparse set of posed RGB images. It is capable of reconstructing fine geometry and complex lighting effects more efficiently than traditional mesh-based or voxel-based techniques. Unlike prior approaches that rely on discretized representations or require ground-truth 3D geometry, NeRF can be trained using only 2D images and camera intrinsics, which makes it a versatile tool for image-based scene reconstruction.
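The rendering model of [51] can be stated compactly: the expected color C(r) of a camera ray r(t) = o + td is the transmittance-weighted integral of the predicted radiance c and density σ along the ray:

```latex
C(\mathbf{r}) = \int_{t_n}^{t_f} T(t)\,\sigma(\mathbf{r}(t))\,\mathbf{c}(\mathbf{r}(t), \mathbf{d})\,dt,
\qquad
T(t) = \exp\!\left(-\int_{t_n}^{t} \sigma(\mathbf{r}(s))\,ds\right)
```

In practice this integral is approximated by numerical quadrature over stratified samples along each ray, which is what keeps the whole pipeline end-to-end differentiable with respect to the MLP weights.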
Building upon this foundation, advancements have been made to adapt NeRF for practical applications. Instant-NGP, proposed by Müller et al. [52], reduces training time and memory usage through multi-resolution hash encoding and a compact, fully fused MLP, enabling near-real-time reconstruction. To improve robustness in real-world scenes with camera pose noise and variable lighting conditions, Nerfacto was introduced as part of the Nerfstudio framework by Tancik et al. [53]. It combines pose refinement, proposal sampling, and hash-based density fields to achieve stable reconstructions even on noisy or incomplete datasets. Based on the 3D Gaussian Splatting technique introduced by Kerbl et al. [54], Splatfacto replaces volumetric grids and MLPs with differentiable 3D Gaussians. This representation supports direct optimization over Gaussian attributes such as position, opacity, and anisotropy, achieves competitive visual quality, and supports real-time rendering through tile-based rasterization.
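The rendering formulation of [54] helps explain this efficiency: 3D Gaussian Splatting replaces per-ray volume integration with front-to-back alpha blending of the projected, depth-sorted Gaussians overlapping a pixel, so each pixel color is:

```latex
C = \sum_{i \in \mathcal{N}} \mathbf{c}_i\, \alpha_i \prod_{j=1}^{i-1} \left(1 - \alpha_j\right)
```

where each α_i combines a Gaussian's learned opacity with its projected 2D footprint. Because this is a rasterization rather than a per-ray integral, it maps directly onto tile-based GPU pipelines, which underlies the rendering-speed results reported in Section 4.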
These advancements highlight NeRF’s emerging potential as a foundation for the next generation of Scan-to-BIM methods, offering an alternative to conventional workflows that rely on physical scanning devices for geometric data acquisition. With recent improvements in speed, robustness, and output quality, NeRF models such as Instant-NGP, Nerfacto, and Splatfacto may facilitate BIM generation based on image-derived information. However, despite its technical potential, the integration of NeRF into Scan-to-BIM workflows remains largely unexplored.
Despite ongoing advancements, current Scan-to-BIM studies still follow a two-stage process: first generating PCs using scanning devices such as LiDAR or photogrammetry, then applying semantic segmentation in a separate post-processing step using deep learning models on the PCs. The multiple tasks involved fragment the process, increasing processing time, adding complexity, and constraining the level of automation achievable. This study aims to offer a workflow that avoids or minimizes these limitations through an automated image-to-BIM method, presented in the following sections.
4. Results
This section presents the simulation results of the proposed image-to-BIM method, covering both the performance of the NeRF models and the accuracy of BIM generation. Subsections highlight key findings on camera pose recovery, computational efficiency, training time, and rendering performance. The final stages focus on PC quality for geometry extraction and the precision of the generated BIM models. Results are structured to reflect technical performance and practical utility across interior and exterior datasets.
4.1. Camera Pose Recovery Analysis
The impact of image data size on camera pose estimation was examined to determine the minimum data volume required for stable NeRF initialization. As shown in Figure 11, the percentage of successfully recovered poses increases significantly with the number of input frames. With 50 frames, COLMAP recovered just 4% of camera poses. The recovery rate gradually improves, reaching 36% at 100 frames and 55.5% at 200. A sharp increase occurs between 200 and 250 frames, where pose recovery exceeds 92%. Beyond this point, performance plateaus, with recovery stabilizing between 94% and 98% from 250 to 550 frames. Notably, Nerfstudio's default setting for frame extraction from video datasets is approximately 300 frames, which, based on these results, offers sufficiently high pose coverage for reliable NeRF reconstruction in many cases. In this study, 550 frames were used to ensure near-complete pose recovery for the exterior scene and to support the generation of dense and accurate 3D models.
This trend emphasizes the importance of spatial image overlap in achieving robust feature matching. Visual gaps between consecutive views at lower frame counts limit COLMAP's ability to identify consistent key points and reduce pose estimation success. As the number of frames increases, the likelihood of overlapping content improves. The observed jump in performance beyond 250 frames suggests that spatial coverage became sufficient for this particular case. However, the optimal threshold may vary depending on scene complexity, camera motion, and lighting conditions.
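The recovery percentages above can be computed directly from COLMAP's text export: in `images.txt`, comment lines start with `#` and every registered image occupies two lines (a pose line followed by a 2D-point line). The helper below (the function name is ours, not part of COLMAP) counts registered images against the number of extracted frames:

```python
def pose_recovery_rate(images_txt: str, n_extracted_frames: int) -> float:
    """Percentage of extracted frames that COLMAP registered.

    `images_txt` is the content of COLMAP's images.txt: '#' lines are
    comments, and each registered image contributes exactly two data
    lines (pose line: IMAGE_ID QW QX QY QZ TX TY TZ CAMERA_ID NAME,
    then a line of its observed 2D points).
    """
    data_lines = [ln for ln in images_txt.splitlines()
                  if ln.strip() and not ln.lstrip().startswith("#")]
    n_registered = len(data_lines) // 2
    return 100.0 * n_registered / n_extracted_frames
```

For example, a run where 2 of 50 extracted frames were registered would report a 4% recovery rate, matching the lowest data point in Figure 11.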
4.2. Computational Resource Efficiency
System resource usage was analyzed to compare the computational demands of the evaluated NeRF models during training. Figure 12 shows that CPU usage was highest for Nerfacto, which utilized an average of 52.9% of the CPU. Instant-NGP showed moderate usage at 31.4%, while Splatfacto maintained the lowest CPU load with an average of 5.3%. This highlights that Nerfacto's volumetric pipeline and scene encoding require more CPU resources, whereas Splatfacto, driven primarily by GPU-based rasterization, places minimal demand on the CPU.
GPU utilization patterns, shown in Figure 13, paint a different picture. Splatfacto demonstrated the highest average GPU usage at 81.7%, maintaining a high level of utilization throughout training. Instant-NGP followed with 66.9%, while Nerfacto averaged 45.8%. These values reflect the models' varying reliance on GPU-intensive operations, with Splatfacto showing strong dependence on GPU rendering via Gaussian Splatting.
Figure 14 combines both RAM and GPU memory usage across the training period. GPU memory consumption was lowest for Splatfacto, averaging 2.5 GB, followed by Instant-NGP at 3.3 GB and Nerfacto at 5.6 GB. In terms of RAM usage, Splatfacto again showed the smallest footprint with an average of 3.7 GB. Instant-NGP used approximately 6.7 GB, while Nerfacto maintained the highest RAM usage at around 11 GB throughout the training process. These observations highlight Splatfacto’s system and GPU memory efficiency, making it well-suited for resource-constrained environments.
The results show that Splatfacto is the most efficient model, consistently combining high GPU utilization with the lowest CPU, system RAM, and GPU memory demands. Its high GPU load reflects efficient exploitation of modern hardware optimized for parallel processing and real-time rendering. This efficiency stems from its GPU-based Gaussian splatting approach, which offloads the bulk of computation to the GPU and minimizes CPU and memory overhead. In contrast, Nerfacto delivers high-quality volumetric reconstructions but places the greatest strain on CPU and system memory, consistent with the complexity of its architecture. Instant-NGP offers a balanced profile, achieving faster training with moderate usage of all system resources. These findings inform the selection of a NeRF model, considering hardware availability, training efficiency, and application-specific objectives. A deeper exploration of how these computational trade-offs relate to reconstruction accuracy and practical deployment is presented in Section 4.4.
4.3. Training Time Analysis
Training duration was benchmarked across NeRF models to assess time efficiency under consistent hardware and data conditions. As illustrated in Figure 15, Splatfacto consistently achieved the fastest training times across all tested configurations. Its training duration ranged from 410 to 439 s, averaging 425.8 s overall. This performance is directly attributed to its reliance on GPU-accelerated Gaussian splatting, which optimizes the rendering process. In contrast, Nerfacto required longer to complete training, with times ranging from 682 to 694 s and an average of 688.8 s. The higher training time reflects the complexity of Nerfacto's volumetric rendering pipeline and its greater reliance on CPU and memory resources, as previously discussed in Section 4.2. Instant-NGP, while recognized for its fast convergence and efficiency in the NeRF literature, showed the longest training times in this evaluation. With durations ranging from 736 to 762 s and an average of 749.4 s, it lagged behind Splatfacto and Nerfacto under identical hardware.
Overall, the comparison highlights Splatfacto as the most time-efficient model in this setup, offering rapid training regardless of input size. Interestingly, the number of input frames had minimal impact on the overall training time for any of the evaluated models; each model maintained a consistent training duration despite varying dataset sizes.
4.4. Reconstruction Quality and Rendering Performance
Reconstruction fidelity and rendering speed were evaluated to compare the visual quality and throughput efficiency of the NeRF models. As shown in Figure 16, Splatfacto consistently achieves the highest PSNR values across all frame counts, ranging from 17.79 at 150 frames to 22.88 at 550 frames. Instant-NGP follows with values between 17.16 and 18.95, while Nerfacto performs the weakest, improving gradually from 15.63 to 18.52. This indicates that Splatfacto produces sharper and less noisy reconstructions, especially as scene coverage increases.
Similarly, Figure 17 shows that Splatfacto leads in SSI, progressing from 0.69 to 0.86 as the number of frames increases. Instant-NGP improves steadily from 0.65 to 0.73, while Nerfacto again lags behind, rising from 0.62 to 0.68. These results highlight Splatfacto's stronger ability to preserve structural fidelity and perceptual quality in the rendered outputs.
The LPIPS evaluation, presented in Figure 18, shows that Splatfacto achieves the lowest perceptual error across all input sizes. As frames increase from 150 to 550, LPIPS values for Splatfacto consistently decrease from 0.31 to 0.15, reflecting a closer perceptual match between the reconstructed and reference images. Instant-NGP shows moderate performance, with values decreasing from 0.44 to 0.34. Nerfacto shows relatively higher LPIPS values, starting at 0.47 and decreasing to 0.35 as the number of frames increases, ultimately approaching the performance of Instant-NGP at larger input sizes. These results indicate that Splatfacto produces more visually faithful reconstructions, especially in scenarios with richer input data.
Beyond visual quality, the rendering performance results in Table 1 show a clear computational advantage for Splatfacto. It processes approximately 71,295,630 rays per second and generates 137.924 frames per second, significantly outperforming Nerfacto, which reaches 838,423 rays and 1.616 frames per second, and Instant-NGP, which handles 341,203 rays and 0.654 frames per second. This high efficiency comes from Splatfacto's Gaussian splatting architecture, which avoids volumetric sampling and neural field queries, allowing much faster rendering.
The results highlight the trade-offs among the evaluated models. Splatfacto delivers the highest reconstruction quality and rendering speed, making it a strong choice for scenarios that prioritize both fidelity and efficiency. Nerfacto produces reasonably good visual quality but requires more CPU and memory resources and renders more slowly due to its volumetric processing pipeline. Instant-NGP shows moderate visual quality and computational demand, but falls short of Splatfacto in rendering speed and output precision.
4.5. Practical PC Quality for Geometry Extraction
To evaluate each model's suitability for geometric feature extraction, the structural clarity and consistency of the reconstructed PCs were analyzed. Figure 19 presents the PCs extracted from the reconstructed interior scene using each NeRF model, illustrating both front and rear views of the classroom. These color-labeled outputs visually demonstrate each reconstruction's structural completeness and spatial clarity. Among the three models, the PC generated by Nerfacto exhibits the most consistent geometry and cleanest surface representation. Wall boundaries, furniture contours, and door/window regions appear sharp and continuous, providing a stable foundation for geometric segmentation and facilitating accurate alignment in downstream processing. The Instant-NGP output retains a recognizable scene layout but introduces soft edges and localized noise, particularly near object boundaries. While core architectural features remain identifiable, the reduced edge sharpness may introduce spatial ambiguity during plane fitting or clustering, affecting the precision of wall, window, and door extraction.
Although Splatfacto excels in reconstruction quality, training speed, and rendering efficiency, its integration into geometry extraction workflows presents practical limitations. Since Splatfacto reconstructs scenes using 3D Gaussian splats and lacks native PC export in Nerfstudio, this study employed the 3DGS-to-PC framework by Stuart and Pound [
60] to convert splats into PCs. This framework samples points from Gaussians based on their volume, filters outliers using Mahalanobis distance, and recalculates colors from rendered images to improve visual accuracy. However, the converted outputs, shown in
Figure 19, contain structural noise, such as floating clusters, blurred edges, and irregular densities that obscure object boundaries and disrupt spatial consistency. Although Splatfacto achieves high PSNR and SSI scores, these irregularities in the PCs hinder accurate wall, window, and door classification and extraction within the image-to-BIM workflow.
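The volume-based sampling and Mahalanobis-distance filtering described above can be illustrated with a short sketch that samples points from a single splat and discards low-probability tails. This is a simplified stand-in for the 3DGS-to-PC framework, not its actual implementation; the threshold and sample count are assumed values:

```python
import numpy as np

def sample_gaussian_points(mean, cov, n_points=100, max_mahalanobis=2.0):
    """Sample points from one 3D Gaussian splat and discard outliers
    whose Mahalanobis distance from the mean exceeds the threshold."""
    rng = np.random.default_rng(0)
    pts = rng.multivariate_normal(mean, cov, size=n_points)
    cov_inv = np.linalg.inv(cov)
    diff = pts - mean
    # Squared Mahalanobis distance of each sampled point
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return pts[np.sqrt(d2) <= max_mahalanobis]

# Example: an elongated splat stretched along the x-axis
mean = np.array([0.0, 0.0, 0.0])
cov = np.diag([0.04, 0.01, 0.01])
kept = sample_gaussian_points(mean, cov, n_points=500)
print(len(kept))  # most samples survive; extreme tails are removed
```

Scaling `n_points` with the Gaussian's volume, as the framework does, concentrates points on large splats; the floating clusters noted above arise when splats themselves sit away from true surfaces, which per-splat filtering cannot correct.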
4.6. Automated BIM Creation Results and Evaluation
The final phase of the proposed method was assessed through the generation of BIM models from reconstructed scenes. This subsection presents the results on semantic accuracy, spatial alignment, and geometric precision, based on comparisons between the generated output and ground-truth annotations. As shown in
Figure 6, the reconstruction process using segmented images enables precise label transfer and results in a successfully reconstructed 3D scene in which building elements such as walls, windows, and doors are preserved across the scene with their predefined colors. Building on this labeled reconstruction,
Figure 7 displays the exported color-labeled PCs, which serve as the input for subsequent processing. Geometric features are then extracted through wall plane fitting and object clustering, as illustrated in
Figure 8. These extracted elements are automatically serialized and imported into Autodesk Revit to generate the final BIM model, shown in
Figure 10. Together, these stages demonstrate the method’s ability to maintain semantic integrity, capture accurate geometry, and produce a structured and automation-ready BIM output directly from 2D imagery.
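The wall plane fitting step in this pipeline can be pictured with a minimal RANSAC sketch over a labeled point cluster. This is an illustrative stand-in, not the study’s actual extraction code; the distance threshold and iteration count are assumed values:

```python
import numpy as np

def fit_plane_ransac(points, n_iters=200, dist_thresh=0.02, seed=0):
    """Fit a dominant plane (e.g., a wall) to a point cluster via RANSAC.
    Returns (unit normal n, offset d) with n . p + d ~ 0 for inliers."""
    rng = np.random.default_rng(seed)
    best_inliers, best_model = None, None
    for _ in range(n_iters):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(normal)
        if norm < 1e-9:          # degenerate (collinear) sample
            continue
        normal /= norm
        d = -normal @ p0
        dist = np.abs(points @ normal + d)
        inliers = dist < dist_thresh
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers, best_model = inliers, (normal, d)
    return best_model, best_inliers

# Synthetic wall: points near the plane x = 1, plus random clutter
rng = np.random.default_rng(1)
wall = np.column_stack([np.full(200, 1.0) + rng.normal(0, 0.005, 200),
                        rng.uniform(0, 5, 200), rng.uniform(0, 3, 200)])
clutter = rng.uniform(0, 5, (40, 3))
model, inliers = fit_plane_ransac(np.vstack([wall, clutter]))
```

Because the PCs here are already color-labeled by class, plane fitting can be run per class cluster, which is what makes the subsequent serialization step straightforward.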
The workflow’s performance is evaluated using three critical criteria: (1) semantic accuracy in detecting windows and doors, (2) geometric precision evaluated using perimeter comparisons, and (3) spatial fidelity assessed through area-based overlap metrics.
Table 2 presents the door and window counts from the semantic feature extraction analysis of both datasets. The proposed method correctly identified all architectural elements: 248 windows and 4 doors in Dataset 1, and 4 windows and 2 doors in Dataset 2. This perfect match with the ground truth highlights the reliability of the element detection process and confirms the method’s ability to maintain semantic precision across different scene types.
To contextualize these results,
Table 3 compares door and window extraction performance against previous Scan-to-BIM studies that employed PC-based and image-based approaches. Across references [
22,
46,
61,
62], reported accuracies for door and window instance detection range between 96 and 100 percent. Because “accuracy” is defined differently across these studies, it should be clarified that here detection accuracy refers to the successful identification of an element. The proposed image-to-BIM method achieves results consistent with these benchmarks, matching the highest reported accuracies in element detection without reliance on laser scanning or PC input. In these benchmark studies, the number of detected windows and doors ranged from 0 to 28, whereas the proposed method successfully detected up to 248 windows without any error.
Further geometric evaluation was conducted using perimeter accuracy. As shown in
Table 4, the proposed method demonstrated a strong ability to maintain geometric consistency between the reconstructed models and ground-truth buildings. For Dataset 1, the predicted perimeter was 100.38 m, compared to the actual perimeter of 100.32 m, resulting in a minimal deviation of 0.06 m and an accuracy of 99.94%. For Dataset 2, the predicted perimeter was 37.83 m versus an actual value of 37.27 m, with an error of 0.56 m and an accuracy of 98.49%.
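These perimeter figures follow from a simple relation, accuracy = 100 × (1 − |predicted − actual| / actual), which can be checked directly (the Dataset 2 value differs from the reported 98.49% only in the final rounding):

```python
def perimeter_accuracy(predicted, actual):
    """Percent accuracy from the absolute perimeter deviation (metres)."""
    deviation = abs(predicted - actual)
    return deviation, 100.0 * (1.0 - deviation / actual)

dev1, acc1 = perimeter_accuracy(100.38, 100.32)  # Dataset 1
dev2, acc2 = perimeter_accuracy(37.83, 37.27)    # Dataset 2
print(round(dev1, 2), round(acc1, 2))  # 0.06 99.94
print(round(dev2, 2), round(acc2, 2))  # 0.56 98.5 (reported as 98.49%)
```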
Table 5 compares the proposed method’s perimeter accuracy with previously published results. In the benchmark study [
46], perimeter deviations ranged from 0.019 to 0.929 m, corresponding to accuracies between 99.30 and 99.90 percent. The proposed image-to-BIM approach achieves comparable geometric precision.
Table 6 presents the spatial accuracy evaluation based on area comparisons. For Dataset 1, the predicted building footprint aligns closely with the reference geometry, achieving a precision of 0.994, a recall of 0.997, an F1 score of 0.996, and an IoU of 0.992. The Dataset 2 reconstruction yields a predicted area of 83.06 m² and an intersection of 80.88 m², resulting in a precision of 0.974, a recall of 0.995, an F1 score of 0.984, and an IoU of 0.969. These metrics confirm the pipeline’s ability to preserve spatial accuracy and generate BIM models that consistently reflect real-world dimensions.
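These area-based metrics can be reproduced from the reported areas. Note that the ground-truth footprint area for Dataset 2 is not stated in the text, so the value used below (81.29 m²) is back-calculated from the published recall and should be treated as an assumption:

```python
def area_metrics(pred_area, gt_area, intersection):
    """Precision, recall, F1, and IoU computed from footprint areas (m^2)."""
    precision = intersection / pred_area
    recall = intersection / gt_area
    f1 = 2 * precision * recall / (precision + recall)
    iou = intersection / (pred_area + gt_area - intersection)
    return precision, recall, f1, iou

# Dataset 2: predicted 83.06 m^2, intersection 80.88 m^2.
# The 81.29 m^2 ground-truth area is back-calculated from the
# reported recall of 0.995 (an assumption, not a published value).
p, r, f1, iou = area_metrics(83.06, 81.29, 80.88)
print(round(p, 3), round(r, 3), round(f1, 3), round(iou, 3))
# 0.974 0.995 0.984 0.969
```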
4.7. Comparative Analysis with Scan-to-BIM Literature
To contextualize the performance of the proposed method, a comparative analysis was conducted against several recent AI-enhanced Scan-to-BIM approaches, as summarized in
Table 7. The comparison demonstrates that the proposed method achieves state-of-the-art performance on both Dataset 1 and Dataset 2. Across the selected benchmark studies, evaluated metric values fall within the range of 0.93–0.999, and the proposed method performs at a comparable or higher level of spatial accuracy than traditional Scan-to-BIM approaches. The slightly higher accuracy observed on Dataset 1 can be attributed to the superior input quality provided by the UAV’s stabilized 4K camera and structured flight path.
Unlike conventional Scan-to-BIM studies, the present method provides a more efficient and less expensive solution. The conventional approach relies on costly LiDAR or laser-scanning hardware as well as PC processing and CNN-based PC segmentation networks such as RandLA-Net or Mask R-CNN. In contrast, the proposed method performs detection and segmentation directly on RGB imagery using open-source vision–language models (YOLO and SAM) combined with NeRF or Gaussian Splatting for 3D reconstruction, which is cost-free and removes the need for PC segmentation.
4.8. Comparative Analysis with NeRF-Based Methods
Recent NeRF-based reconstruction studies have introduced varying levels of automation; however, their workflows remain distinct from the method presented in this study. NeRF-to-BIM [
65] demonstrated an initial attempt to couple NeRF reconstruction with BIM creation, where Instant-NeRF produced PCs later segmented by PointNeXt to classify structural elements such as beams and columns. Although this work verified the potential of NeRF-generated data for BIM applications, it required intensive post-processing and manual refinement. Harbingers of NeRF-to-BIM [
66] extended this to a three-stage process combining NeRF reconstruction, fine-tuned PointNeXt segmentation on synthetic BIM-derived datasets, and BIM generation; however, it still relied on labeled PCs and separate training steps. Similarly, Li et al. [
67] applied Nerfacto to reconstruct building façades from UAV imagery, followed by PointNet++ segmentation of the NeRF-derived PCs. Montas-Laracuente et al. [
68] adopted Gaussian Splatting to generate high-fidelity meshes for heritage documentation but without semantic or BIM automation. In contrast, the proposed method explicitly advances this line of research by bypassing PC post-processing, embedding semantic information directly at the image level before NeRF training. This early-stage semantic integration allows class labels to propagate throughout the 3D reconstruction, resulting in PCs that are already color-segmented and labeled during generation. Consequently, precise geometry extraction can be performed without additional segmentation or manual refinement, thereby achieving a higher degree of automation and practical efficiency than previous NeRF-based approaches.
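The early-stage semantic integration described above amounts to overwriting segmented pixels with fixed class colors before NeRF training, so that labels propagate through reconstruction as appearance. A minimal sketch of this labeling step follows; the class colors and the `embed_labels` helper are illustrative assumptions, not the study’s actual code:

```python
import numpy as np

# Fixed RGB code per class; labels survive reconstruction as color.
CLASS_COLORS = {"wall": (255, 0, 0), "window": (0, 0, 255), "door": (0, 255, 0)}

def embed_labels(image, masks):
    """Overwrite segmented pixels with their class color.
    image: (H, W, 3) uint8 array; masks: {class_name: (H, W) bool}."""
    labeled = image.copy()
    for cls, mask in masks.items():
        labeled[mask] = CLASS_COLORS[cls]
    return labeled

# Toy 4x4 image with a 2x2 "window" region
img = np.zeros((4, 4, 3), dtype=np.uint8)
win = np.zeros((4, 4), dtype=bool)
win[1:3, 1:3] = True
out = embed_labels(img, {"window": win})
```

Because every training view carries the same color coding, the radiance field renders those colors consistently in 3D, and the exported PC is already class-segmented by color.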
4.9. Comparative Analysis of Evaluated NeRF Models
Since the proposed method relies on NeRF-based reconstruction, selecting an appropriate model is important.
Table 8 presents a detailed comparison of three NeRF models, evaluating them across visual fidelity, computational performance, and PC suitability for BIM extraction. Among them, Splatfacto delivers the highest reconstruction quality and speed, far surpassing the performance of Instant-NGP and Nerfacto. Despite its speed and visual quality, Splatfacto has practical limitations for Scan-to-BIM workflows. It does not natively export PCs, requiring an external conversion step that introduces structural noise and blurs object boundaries. As a result, the generated PC lacks the consistency needed for object extraction.
In contrast, while Nerfacto’s visual metrics are slightly lower than those of Splatfacto, it produces the most structured and spatially coherent PC outputs, making it more suitable for geometry extraction. Instant-NGP, while known for its speed in NeRF literature, shows longer training times in this evaluation and delivers moderate reconstruction quality. Its PCs exhibit soft edges and localized noise, which may hinder precise segmentation and alignment in BIM generation tasks. Nonetheless, it offers a balanced computational profile, with moderate resource efficiency. Overall, these findings suggest that Nerfacto is the most suitable model for geometry-aware applications, such as Scan-to-BIM, where spatial consistency is crucial.
5. Discussion
5.1. Key Contributions and Implications
The primary contribution of this work lies in offering a method for automated BIM generation from images using accessible hardware and open-source tools. It bridges critical gaps in the Scan-to-BIM literature by (1) demonstrating that BIM models can be generated from image data alone, eliminating reliance on scanning equipment by producing PCs directly from RGB images using NeRF volumetric rendering. This is achieved by integrating object-level semantic segmentation with NeRF-based 3D reconstruction in a unified method, allowing the system to learn spatial structure directly from labeled RGB inputs rather than relying on traditional photogrammetry or LiDAR-based scans; (2) demonstrating, for the first time, how semantic identity can be embedded into NeRF-based volumetric reconstruction and subsequently extracted as structured geometric data for BIM modeling. This removes the need for post-reconstruction segmentation by embedding semantic labels at the pixel level during preprocessing using computer vision-language techniques. These labels are preserved throughout the rendering process, eliminating projection-based mapping that might lead to misalignment and inconsistency, and ensuring that each building element retains its identity from image input to final BIM output; and (3) developing the workflow from image acquisition to BIM output without requiring domain-specific training data. The extracted geometry is serialized in a structured JSON format and directly imported into Revit, enabling automation of BIM model creation based on the building elements’ embedded semantic and geometric attributes.
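To make the serialization hand-off to Revit concrete, it might look like the following sketch; the JSON schema, field names, and values shown here are hypothetical illustrations, not the study’s actual format:

```python
import json

# Hypothetical serialization of extracted elements; the field names
# below are illustrative assumptions, not the study's actual schema.
elements = {
    "walls": [
        {"start": [0.0, 0.0], "end": [10.2, 0.0], "height": 3.0, "thickness": 0.2}
    ],
    "windows": [
        {"host_wall": 0, "offset": 2.5, "width": 1.2, "height": 1.5, "sill": 0.9}
    ],
    "doors": [
        {"host_wall": 0, "offset": 7.0, "width": 0.9, "height": 2.1}
    ],
}
payload = json.dumps(elements, indent=2)
# A Revit-side script (e.g., via the Revit API or Dynamo) would read this
# payload and place native wall, window, and door families accordingly.
```

Hosting openings by wall index and offset, as sketched here, mirrors how Revit treats windows and doors as wall-hosted families, which is what allows the import to produce native elements rather than generic geometry.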
Additionally, this study performs a comparative evaluation of three reconstruction methods, namely Instant-NGP, Nerfacto, and Splatfacto, in the context of Scan-to-BIM automation. It establishes criteria for selecting suitable models based on PC quality, rendering efficiency, and spatial fidelity, identifying Nerfacto as the most effective for geometry extraction. This may advance the application of NeRF models using the criteria beyond view synthesis and introduces them as viable tools for geometry-aware BIM modeling.
In practical terms, the proposed method transforms how existing buildings are digitized by eliminating the need for terrestrial scanning devices or tedious manual on-site surveying. Tasks that were once labor-intensive, equipment-dependent, and costly, such as site surveying and digital modeling, can benefit from widely accessible camera devices, expanding the practicality and scalability of BIM-based planning across the construction and facilities sector.
The framework is especially impactful for aging, undocumented, or under-maintained buildings, where original drawings are often unavailable. It facilitates rapid creation of digital twins for a wide range of practical applications, including renovation planning, space optimization, and exterior façade restoration. The resulting BIM model is generated directly within Revit using native building elements, which enables compatibility with existing BIM workflows. It functions not only as a geometric record but also as an operational asset for facility management tasks such as quantity take-offs, material tracking, and renovation cost estimation performed directly within the Revit environment. Furthermore, the model can be exported from Revit to open formats such as IFC, supporting interoperability with facility management and asset information systems commonly used in practice. In energy retrofit scenarios, the method provides reliable building geometry for tasks such as insulation upgrades, envelope improvements, and solar panel placement, where models can be used directly within Revit for energy analyses or exported to IDF or gbXML formats for integration with tools like EnergyPlus and OpenStudio. However, all these applications need to be tested and evaluated in future studies. The accessibility and speed of this approach could make it a compelling solution for practitioners seeking to modernize building data acquisition workflows without the burden of costly hardware or extensive manual effort.
5.2. Limitations and Future Research Direction
Despite its strengths, the study also has limitations. First, NeRF reconstruction quality remains sensitive to image completeness and clarity, and occlusions or inadequate image coverage can compromise model accuracy and completeness. Moreover, lighting variation, textureless or reflective surfaces, and the presence of dynamic objects during image capture may further reduce reconstruction fidelity by introducing inconsistencies in color, depth, and surface definition, which can ultimately lead to reconstruction failure. Second, the performance of the proposed image-to-BIM workflow was validated on two scenes, one exterior and one interior, as a proof of concept. Broader validation on buildings featuring intricate or unconventional architectural styles remains untested, which may affect generalizability. Future research should extend the evaluation to a wider range of typologies, including heritage buildings and complex geometries, to assess the scalability and robustness of the method across diverse architectural conditions.
Third, the present system focuses exclusively on core building elements such as walls, doors, and windows. To enhance the versatility and completeness of this method, future investigations should expand semantic classification and geometric reconstruction capabilities to other building components, including floors, ceilings, columns, stairs, and complex internal spatial layouts. Fourth, while this study successfully automates several fragmented components of the image-to-BIM workflow, including segmentation, reconstruction, element extraction, and BIM creation, future work should focus on developing an end-to-end unified framework that seamlessly integrates all stages into a single automated pipeline. Fifth, while Gaussian splatting methods such as Splatfacto showed the highest potential in reconstruction quality, the lack of native PC export in current implementations remains a limitation. Future work should address this gap to unlock the full utility of Gaussian-based reconstructions in geometry-driven BIM applications. Sixth, incorporating adaptive parameter adjustments within geometric algorithms could improve performance and flexibility across architectural typologies and conditions.
Seventh, a comprehensive ablation study should be conducted to evaluate the contribution of each component, including YOLO for detection, SAM for segmentation, NeRF and Gaussian Splatting for reconstruction, and to optimize their interaction within the framework. This will clarify each module’s impact on accuracy and efficiency and guide future improvements in the image-to-BIM workflow. Eighth, all experiments were conducted on a high-end RTX 4090 GPU, which significantly reduced training time. Future research should evaluate the framework’s scalability on mid-range GPUs and cloud-based computing platforms to ensure broader accessibility and assess performance trade-offs under different hardware configurations. Lastly, while the proposed method effectively creates BIM models for undocumented buildings, future research should focus on extending the framework to enable continuous maintenance and digital twin updates by allowing new image inputs to be reprocessed to detect and integrate physical changes within the existing building model.
6. Conclusions
This study introduced an automated image-to-BIM method that integrates vision-language segmentation, NeRF-based 3D reconstruction, and structured geometric modeling to generate semantically rich and spatially accurate BIM models from ordinary 2D images. Experimental validation demonstrates that the proposed method achieves strong semantic, spatial, and geometric performance while eliminating the need for traditional scanning tools such as LiDAR or photogrammetry. Its key novelty lies in unifying semantic segmentation and 3D reconstruction in a single workflow by embedding object labels during NeRF processing, thereby removing the need for post-processing of PCs and enabling a more efficient method for automatically generating BIM models.
Experimental validation confirms the method’s high performance across both semantic and geometric dimensions, within the limitations noted above. The system successfully detected all windows and doors across two distinct datasets: an exterior scene captured via UAV and an interior scene recorded with a handheld smartphone. In area-based evaluations, the method achieved a precision of 0.994 and an IoU of 0.992 for the exterior case, and a precision of 0.974 with an IoU of 0.969 for the interior case. Perimeter comparisons confirmed geometric consistency, with deviations of just 0.06 m and 0.56 m for the exterior and interior datasets, yielding accuracies of 99.94% and 98.49%, respectively. The proposed method consistently achieved superior performance in comparative evaluations against recent deep learning-based reconstruction methods, recording higher precision (up to 0.994), F1 scores (up to 0.996), and IoU values (up to 0.992) and outperforming existing approaches.
In evaluating three NeRF variants, Splatfacto demonstrated the highest image-based reconstruction quality, with PSNR of 22.88, SSI of 0.86, and LPIPS of 0.15, alongside industry-leading rendering speeds of 137.9 FPS and 71M rays per second. However, due to structural artifacts and density inconsistencies introduced during PC conversion, Splatfacto proved less effective for downstream geometric extraction. In contrast, Nerfacto, while slightly lower in visual quality, generated cleaner, more structured PCs that enabled more reliable element detection, making it a more suitable choice for geometry-aware BIM modeling within the proposed method.
By removing reliance on expensive scanning hardware and minimizing technical barriers, this method presents a scalable and accessible alternative for automated BIM generation. Its applicability spans a wide range of real-world use cases, including architectural documentation, renovation planning, energy retrofitting, facility management, and digital twin development. This work not only demonstrates a practical, low-cost pathway for digitizing existing buildings but also contributes to the theoretical advancement of image-driven reconstruction by establishing a new benchmark for semantic-aware NeRF-based Scan-to-BIM automation.