A Multimodal Model- and Retrieval-Guided Framework for BIM Model Cost Estimation

Al-Derham, Hassan; Chaudhari, Ruchika Jagannath; Gao, Lu; Senouci, Ahmed

doi:10.3390/buildings16112103

¹ Department of Civil and Environmental Engineering, Qatar University, Doha 2713, Qatar

² Department of Civil and Environmental Engineering, University of Houston, Houston, TX 77004, USA

* Author to whom correspondence should be addressed.

Buildings 2026, 16(11), 2103; https://doi.org/10.3390/buildings16112103

Submission received: 10 April 2026 / Revised: 5 May 2026 / Accepted: 18 May 2026 / Published: 25 May 2026

(This article belongs to the Special Issue Digital Technologies in Construction and Built Environment)

Abstract

BIM model-based construction cost estimation requires reliable linkage between model-derived building information and estimator-facing cost records. However, BIM models and structured cost databases use different descriptive logics: BIM model data primarily describe what a building component is in the model, whereas cost records primarily describe how that component is constructed, measured, and priced. When BIM model names are non-standard or properties are incomplete, this mismatch may lead to ambiguous cost item selection, particularly when candidate records differ in unit basis, material assembly, thickness, finish, fire rating, or performance requirements. To address this problem, this study proposes a multimodal model- and retrieval-guided framework for BIM model-based cost estimation. The framework converts BIM model content into standardized estimator-readable descriptions, retrieves cost database candidate entries, applies rule-based checks for unit, material, thickness, finish, and fire rating consistency, and produces reviewable cost item selections for database-based cost calculation. The method uses a multimodal model to supplement and standardize component information, while cost records remain the authority for unit prices rather than being replaced by model-generated estimates. The framework was evaluated using a BIM example containing 7374 building elements across 21 model element types, together with a structured cost database containing approximately 11,500 pricing records. The full workflow reduced unmatched categories and improved pricing coverage relative to direct cost item retrieval. The results indicated that the proposed method can improve the technical appropriateness and coverage of cost item selection. 
The study contributes a reviewable workflow that integrates BIM model content, multimodal description standardization, cost database candidate retrieval, rule-based specification filtering, and database-grounded cost synthesis for selecting justified cost items under practical estimating ambiguity.

Keywords:

BIM; BIM model; cost database; automated cost estimation; quantity takeoff; multimodal language model; assembly-aware matching; retrieval-guided matching; construction informatics; 5D BIM

1. Introduction

Construction cost estimation is an important component of design evaluation, procurement planning, and scope control. In practice, many estimating workflows still depend on manually maintained cost databases, estimator interpretation, and repeated revision of quantity and price information. Previous studies indicate that mappings between building model objects and cost information are difficult to formalize and maintain, that quantity-surveying practice continues to require expert interpretation, and that cost database-centered takeoff remains reviewable but slow to update [1,2,3]. Cost database-based estimation is also sensitive to inconsistency in quantity measurement and item selection [4].

A central difficulty is that BIM models and structured cost databases describe the built environment according to different representational logics. BIM models primarily identify what a component is in the building model, including its class, object identity, geometry, type name, property sets, and sometimes material or performance attributes. Cost records primarily specify how that component is constructed, measured, and priced, including unit basis, assembly condition, material build-up, thickness, finish, fire rating, installation context, and unit-price logic. Existing BIM-based costing research has addressed object semantics, ontology-supported cost estimation, quantity-takeoff readiness, and formal links between model entities and cost items [5,6,7,8,9]. Although these studies strengthen BIM-based costing, the practical mismatch between model-side component descriptions and estimator-facing cost record descriptions remains difficult to resolve.

Existing research has addressed several components of the BIM model-to-cost workflow, including open-BIM quantity takeoff, authoring tool-independent extraction, 5D BIM, BIM-BOQ synchronization, retrieval grounding for structured construction information, and LLM-assisted linking between cost descriptions and model objects [10,11,12,13,14,15,16]. However, a key gap remains in resolving mismatches between BIM model items and cost database items. These mismatches occur because BIM model descriptions are often incomplete, non-standard, or design-oriented, whereas cost database items are organized by construction scope, measurement unit, material assembly, dimensions, thickness, finish, fire rating, and installation conditions. As a result, one BIM item may correspond to several plausible cost database records, making accurate cost item selection difficult.

To address this problem, this paper formulates BIM model-to-cost automation as a cost item matching task rather than a generative pricing task. The proposed workflow converts BIM model content into standardized estimator-readable descriptions, retrieves candidate cost database items, checks unit basis, material assembly, thickness, finish, fire rating, and performance requirements through rule-based screening, and calculates final costs from extracted quantities and accepted database unit rates. The main contribution of this study is a multimodal, rule-constrained workflow for matching BIM model items with appropriate cost database items when model descriptions are incomplete, non-standard, or ambiguous. The proposed workflow was evaluated through a case study using a BIM model and a structured cost database.

The remainder of this paper is organized as follows. Section 2 reviews prior work on BIM-based quantity takeoff, BIM model-based estimating, model-to-cost mapping, and multimodal language model support in construction informatics. Section 3 presents the methodology and the four workflow phases. Section 4 presents the case study. Section 5 discusses implications, limitations, and reproducibility. Section 6 concludes the paper.

2. Literature Review

2.1. BIM Model-Based Quantity Takeoff

Quantity takeoff is one of the most developed areas of BIM-enabled cost estimation. Choi et al. [10] proposed an open BIM-based QTO system for schematic frame estimation and showed that early design takeoff becomes more reliable when level-of-detail assumptions and model checks are formalized. Monteiro and Martins [11] emphasized that quantity extraction depends on QTO-oriented modeling guidelines, not only on the availability of geometry. Akanbi and Zhang [12] further showed that BIM model-based QTO algorithms can remain stable across authoring environments when extraction logic is explicitly encoded.

A second group of studies has focused on validation, reproducibility, and domain-specific QTO extensions. Olsen and Taylor [17] identified practical limiting factors in model-based QTO. Moreira et al. [7] developed a BIM model-based workflow for checking whether exported models satisfy quantity takeoff requirements for regulated cost estimation. Furstenberg et al. [18] showed that automated QTO in road projects depends on explicit classification logic and reusable cost breakdown structures. Valinejadshoubi et al. [19,20] showed that automated QTO can be improved when extraction is combined with validation and quantity precision checking. Alathamneh et al. [21] similarly concluded that automated QTO is increasingly capable but remains sensitive to model quality, exchange consistency, and rule completeness.

Recent work has also connected QTO to downstream planning, auditing, and cost-oriented model enrichment. Pham et al. [22] linked BIM-based quantity extraction to daily concrete and formwork planning. Liu et al. [23] used a knowledge model-based BIM framework for automatic model auditing and quantity takeoff. Ergen and Bettemir [24] developed ontological algorithms for exact QTO of reinforced-concrete items. Marcinkeviciute et al. [25] showed that semantic enrichment and simplification can recover more cost-relevant information from low-detail BIM models. These studies provide a strong basis for BIM model quantity extraction, but they mainly address whether quantities can be extracted and checked. They do not by themselves determine which cost database item should be selected after the quantity is available.

2.2. BIM-Based Cost Estimation and 5D BIM Workflows

The broader BIM-based cost-estimation literature addresses the transition from model quantities to cost information more directly. Pishdad and Onungwa [13] reviewed cost estimation, cost control, and payment applications in 5D BIM and found that preconstruction functions are relatively mature, while lifecycle integration remains fragmented. Cassandro et al. [4] reached a related conclusion by showing that traditional 5D practice and BIM model-based cost workflows often coexist without a fully explicit mapping logic between model information and pricing records.

Other studies have attempted to make BIM-based cost estimation more interoperable and explicit within model-centered workflows. Akanbi [26] developed interoperable BIM-based systems that combined ontology-based costing with NLP-supported information extraction. Cassandro et al. [8] defined a structured cost architecture within open-BIM workflows, and Cassandro et al. [9] connected that logic to a structural model through formal cost entities. Gholamzadehmir et al. [27] extended model-linked costing toward retrofit and energy analysis. These studies show that cost information can be represented more explicitly when cost entities and model entities are formally connected.

A related branch of research has focused on cost-management, lifecycle reasoning, and prediction tasks. Chong et al. [14] studied BIM-BOQ synchronization, while Solla et al. [28] linked BIM outputs to price guideline workflows. Rostamiasl and Jrade [29] and Moradabadi et al. [30] demonstrated BIM-supported life cycle cost reasoning. Park and Yun [31], Wang and Qiao [32], and Cheng et al. [33] used machine learning and deep learning models for cost prediction. Sadikoglu and Demirkesen [34] reviewed this area and concluded that AI-based cost prediction is expanding where historical datasets are available. These studies clarify how BIM can support 5D workflows, formal cost entities, lifecycle reasoning, and predictive estimation. However, they do not fully address the entry-level problem of selecting one cost database item when model-side descriptions are incomplete and several technically plausible records coexist.

2.3. Model-to-Cost Mapping and Structured Cost Databases

Another line of research has addressed the estimating problem by structuring the cost domain instead of relying only on loosely organized text. Lawrence et al. [1] showed that mapping logic must adapt to different representational structures rather than assume one fixed model-to-cost correspondence. Lee et al. [6] developed an ontology-based approach to building cost estimation. Akanbi [26] broadened this direction through BIM-based methods that combine extraction, classification, and costing logic. The value of this line of work is that it shifts model-to-cost mapping away from simple text similarity and toward explicit semantic and structural correspondence.

Recent studies have made this mapping agenda more concrete through cost item structures, validation logic, and knowledge-supported model interrogation. Cassandro et al. [8] defined a structured cost architecture within open-BIM workflows, explained how cost items can relate to geometric entities, and proposed semi-automated validation using IDS-informed requirements. Cassandro et al. [9] then connected cost entities to a public price list in a structural model workflow. Liu et al. [23] used a knowledge model-based BIM framework to support automatic auditing and quantity reasoning. Marcinkeviciute et al. [25] showed that semantic enrichment and simplification can make more cost-relevant information available from low-detail BIM models. These approaches improve the representational basis for model-to-cost linkage before final pricing decisions are made.

Adjacent work on BOQ synchronization, semantic enrichment, and machine learning-integrated 5D informatics reinforces the same direction. Chong et al. [14] aimed to make BIM-BOQ linkage more coherent and queryable. Zhang et al. [35] showed that semantic enrichment of open-BIM models can improve cost linkage. Banihashemi et al. [36] integrated machine learning into 5D BIM workflows. Moreira et al. [7] are also relevant because they show that mapping quality depends on whether BIM model data are complete and verifiable enough to support downstream pricing decisions. This body of work improves the semantic basis for model-to-cost linkage, but the remaining practical difficulty is still the selection of a specific cost database entry under unit, material, thickness, finish, fire rating, and assembly ambiguity.

2.4. AI, NLP, and Multimodal Language Model Support in Construction Informatics

AI, NLP, and LLM research in construction informatics has expanded rapidly. Akanbi [26] showed that NLP-supported information extraction can help transform construction text into cost-relevant structured data. Chiarello et al. [37] similarly demonstrated the potential of generative LLMs for reducing information-handling effort in construction workflows. Multimodal language models extend this role by allowing visual BIM input, such as assembly views or labeled component images, to supplement text-based model attributes. However, language or visual interpretation alone does not automatically provide reviewable pricing decisions.

Several recent studies have examined retrieval-grounded or BIM-facing LLM workflows. Uhm et al. [15] showed that retrieval grounding can improve access to construction safety information without relying on unrestricted generation. Lee et al. [38] proposed an LLM-supported BIM interface for natural language access to model data, and Liu et al. [39] examined agentic BIM querying through an MCP-based workflow. Broader AI-BIM reviews by Valdebenito and Forcael [40], Du et al. [41], and Kampelopoulos et al. [42] emphasize integration patterns, deployment constraints, data readiness, and building-oriented retrieval interfaces. Li and Wang [43] further illustrate how the BuildingGPT system can connect language models with structured building information through retrieval-oriented building interfaces. These studies suggest that LLMs are most useful when they are coupled to structured data, visual or textual input, retrieval context, and execution constraints.

A smaller group of studies has applied language models more directly to cost-estimation tasks. Gatto et al. [16] examined prompt-based links between cost domain descriptions and BIM model objects. Studies on GPT-supported construction workflows and LLM-supported estimating systems further show growing interest in this area [44,45]. However, a consensus method for reviewable cost item matching has not yet been established. In cost estimation, this issue is important because an incorrect item selection can silently propagate into an incorrect unit rate, and unconstrained generation can weaken reviewability even when the generated explanation appears plausible.

The reviewed AI and LLM literature therefore supports the use of language models for extraction, retrieval support, interface design, semantic bridging, and visual input interpretation. For cost estimation, the remaining challenge is to use multimodal models to supplement and standardize component information while keeping cost records, unit checks, and specification rules as the decision backbone.

2.5. Unresolved Gap Addressed in This Study

The reviewed literature indicates that BIM model-based quantity extraction, model-readiness validation, structured model-to-cost representation, and AI-assisted semantic support are all advancing. These capabilities are necessary for BIM model-based estimating, but they do not by themselves resolve the descriptive mismatch between BIM models and cost databases. BIM model data primarily identify what an element is in the model, whereas cost databases primarily encode how the element is constructed, measured, and priced. When model names are non-standard, properties are incomplete, or assembly context is represented more clearly in visual input than in text fields, the selection of an appropriate cost item remains difficult.

The unresolved issue is therefore not whether BIM model data can contribute to cost estimation, but how BIM model content can be converted into standardized, estimator-readable information for selecting justified cost items under reviewable controls. This study addresses that issue by developing a workflow in which a multimodal model supplements and standardizes component descriptions, cost database candidates are retrieved, rule-based checks screen unit, material, thickness, finish, and fire rating consistency, and final costs are calculated only from accepted database unit rates while retaining object-level source identifiers and pricing records.

3. Methodology

3.1. Overall Workflow

The proposed method translates BIM model content into records that can be matched with a cost database. Figure 1 summarizes the four-phase process. The workflow begins with automated quantity takeoff from the BIM model, where element classes, measurable quantities, and relevant specification cues are extracted. The extracted records are then processed through multimodal description standardization, in which model-side wording and assembly cues are converted into estimator-readable descriptions. These descriptions are used in retrieval-guided cost record matching, which combines semantic retrieval, rule-based filtering, lexical refinement, and bounded reranking to select a supported cost database record under the present case study conditions. The selected unit rate is then multiplied by the extracted quantity to generate component-level and project-level costs.

3.2. Phase I: Automated Quantity Takeoff from the BIM Model

Phase I defines the quantity and specification base used by the subsequent matching and costing phases. For each estimating-relevant element, the workflow records the element class, source identifier, quantity basis, quantity value, quantity source, and available descriptive cues such as type name, material, finish, fire rating, and selected property-set values. Native model quantities are used when available, whereas geometry- or metadata-derived quantities are used when native quantities are absent or insufficient. The quantity basis is assigned according to the construction meaning of the element, with discrete objects measured by count, planar systems measured by area, and linear elements measured by length.
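
The Phase I record structure and quantity-basis rule can be illustrated with a minimal Python sketch. The class-to-basis table, the field names, and the `take_off` helper are assumptions introduced for illustration; the paper does not publish its actual mapping or extraction code.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical class-to-basis table (assumption): discrete objects by count,
# planar systems by area, linear elements by length.
BASIS_BY_CLASS = {
    "IfcDoor": "count",
    "IfcWindow": "count",
    "IfcWall": "area",
    "IfcSlab": "area",
    "IfcCovering": "area",
    "IfcBeam": "length",
    "IfcRailing": "length",
}


@dataclass
class QuantityRecord:
    source_id: str
    element_class: str
    quantity_basis: str
    quantity_value: float
    quantity_source: str  # "native" or "derived"


def take_off(source_id: str, element_class: str,
             native_qty: Optional[float],
             derived_qty: Optional[float]) -> Optional[QuantityRecord]:
    """Prefer the native model quantity; fall back to a derived one."""
    basis = BASIS_BY_CLASS.get(element_class)
    if basis is None:
        return None  # class is outside the illustrative table
    if native_qty is not None and native_qty > 0:
        return QuantityRecord(source_id, element_class, basis, native_qty, "native")
    if derived_qty is not None and derived_qty > 0:
        return QuantityRecord(source_id, element_class, basis, derived_qty, "derived")
    return None  # no usable quantity, so nothing is passed to matching
```

Recording `quantity_source` alongside the value keeps it visible to a reviewer whether a line item rests on a native model quantity or on a geometry- or metadata-derived one.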

3.3. Phase II: Multimodal Description Standardization for Estimator-Readable Matching

Phase II addresses the descriptive gap between BIM model-derived component information and the construction-and-pricing terminology used in structured unit-price cost databases. Although BIM model records often contain technically meaningful information, their labels are rarely written in a form that directly supports reliable cost item retrieval. In practice, model-side descriptions may include abbreviations, exporter-specific naming conventions, fragmented material expressions, or incomplete wording, whereas cost database entries are organized around construction descriptions, measurement units, assemblies, finishes, and unit price assumptions.

To reduce this gap, the workflow applies multimodal description standardization before retrieval. In this study, the standardization component uses an instruction-tuned language model for BIM model text fields and a vision language model for available BIM visual input. Its role is limited to improving retrieval compatibility by converting BIM model-derived descriptive fields and visual assembly cues into concise, estimator-readable descriptions.

To demonstrate the LLM standardization step, an interior partition-wall record is used as an example. Figure 2, Figure 3 and Figure 4 present the standardization prompt, the structured YAML input from the BIM model, and the resulting JSON output. The input is supplied as structured YAML so that the model receives explicit fields, including wall type, quantity basis, material cues, nominal thickness, and fire rating, rather than an unconstrained paragraph.

The LLM output is constrained to JSON so that downstream retrieval receives a predictable field structure.
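
One way to enforce a predictable field structure is to validate the returned JSON before it reaches retrieval. The following is a minimal sketch under stated assumptions: the required field names are hypothetical and do not reproduce the schema shown in the paper's figures.

```python
import json
from typing import Optional

# Hypothetical required fields (assumption); the real schema may differ.
REQUIRED_FIELDS = {
    "standardized_description": str,
    "quantity_basis": str,
    "material": str,
}


def parse_standardized(raw: str) -> Optional[dict]:
    """Return the parsed record, or None when the output is not usable."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    for name, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(name), expected_type):
            return None
    return data
```

Rejecting malformed output at this point, rather than repairing it, is consistent with the conservative design of the later phases, where an unsupported record leads to abstention.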

The visual example uses a window object as a separate Phase II demonstration. The window image in Figure 5 was used as supplementary visual input together with the Phase I text record. The purpose of the VLM check (VLM prompt shown in Figure 6) was to identify visible component cues that could support later retrieval, such as framed opening condition, panel arrangement, glazing presence, and whether the object appears to be a window rather than a wall panel or opening void. The VLM output was therefore treated as supplementary descriptive input and was not used to generate costs, assign quantities, or override the cost database.

The summarized VLM output identified the object as a framed window-like component with a rectangular frame, visible glazing or panel surface, and a vertical mullion or meeting rail. For cost matching, these cues support retrieval toward window records rather than generic openings, wall panels, or other envelope elements.

At the end of Phase II, the LLM output and VLM output are merged only at the same-element level. The LLM-derived standardized description remains the primary retrieval query, while BIM quantity basis, material fields, thickness, finish, and performance-related attributes are copied from the structured Phase I record. VLM-derived cues are appended as auxiliary retrieval terms only when they are visually supported, and VLM limitations are retained as restrictions on information that cannot be inferred from the image. The merged Phase II record is then passed to Phase III, where the standardized description supports dense retrieval and the structured fields support rule-based unit, material, thickness, finish, assembly, and fire rating filters.
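
The same-element merge rule can be sketched as follows. The dictionary keys (`source_id`, `cues`, `visually_supported`, `limitations`, and the structured field names) are assumptions made for illustration, not the authors' data model.

```python
from typing import Optional

# Structured fields copied from the Phase I record (names are assumptions).
STRUCTURED_FIELDS = ("quantity_basis", "material", "thickness_mm",
                     "finish", "fire_rating")


def merge_phase2(phase1: dict, llm_out: dict,
                 vlm_out: Optional[dict] = None) -> dict:
    """Merge LLM and VLM outputs for one element into a Phase III query record."""
    merged = {
        "source_id": phase1["source_id"],
        # The LLM-derived description stays the primary retrieval query.
        "query": llm_out["standardized_description"],
        "aux_terms": [],
        "vlm_limitations": [],
    }
    # Structured attributes come from Phase I, never from model output.
    for field in STRUCTURED_FIELDS:
        merged[field] = phase1.get(field)
    # VLM output is used only when it refers to the same element.
    if vlm_out is not None and vlm_out.get("source_id") == phase1["source_id"]:
        merged["aux_terms"] = [
            cue["term"] for cue in vlm_out.get("cues", [])
            if cue.get("visually_supported")
        ]
        merged["vlm_limitations"] = list(vlm_out.get("limitations", []))
    return merged
```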

3.4. Phase III: Retrieval-Guided Cost Matching and Record Selection

Phase III is implemented as a constraint-guided record selection process that maps each standardized BIM model record either to one cost database item or to a conservative abstention outcome. Let $i$ index BIM model-derived query records and let $j$ index cost database entries. For query $i$, the Phase II output is a standardized descriptor $q_i$, and each database entry is represented by a retrieval text $c_j$. Dense retrieval supplies candidate recall, while compatibility constraints, lexical refinement, shortlist retention, and bounded reranking preserve pricing validity.

Let $e(\cdot)$ denote the sentence embedding function. The semantic similarity between query $q_i$ and database entry $c_j$ is computed by cosine similarity,

$$ s_{ij} = \frac{e(q_i)^{\top} e(c_j)}{\lVert e(q_i) \rVert_2 \, \lVert e(c_j) \rVert_2} \tag{1} $$

and dense retrieval returns the top-$K$ candidate pool

$$ C_i^{(K)} = \{\, j : s_{ij} \text{ is among the top-}K \text{ similarity values for query } i \,\} \tag{2} $$

The set $C_i^{(K)}$ is the retrieval–recall pool rather than a final decision set.
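
Equations (1) and (2) can be illustrated with a small pure-Python sketch. It performs an exhaustive search over toy vectors rather than the approximate nearest-neighbor index used in the case study, and the function names are assumptions.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity s_ij of Equation (1); 0.0 for a zero vector."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k(query_vec: Sequence[float],
          entry_vecs: Dict[str, Sequence[float]], k: int) -> List[str]:
    """Candidate pool C_i^(K) of Equation (2), by exhaustive search."""
    scored = [(cosine(query_vec, vec), j) for j, vec in entry_vecs.items()]
    # Highest similarity first; ties broken by entry id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [j for _, j in scored[:k]]


entries = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
pool = top_k([1.0, 0.1], entries, k=2)  # -> ["a", "c"]
```

The returned pool is only a recall set: every member still has to pass the compatibility screening before it can be priced.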

Each candidate $j \in C_i^{(K)}$ is then screened by compatibility constraints. Let $u_{ij} \in \{0, 1\}$ denote hard unit-basis compatibility, where $u_{ij} = 1$ only when the query quantity basis is dimensionally compatible with the database entry unit. Let $d_{ij} \in \{0, 1\}$ denote the non-unit hard-feasibility indicator, where $d_{ij} = 1$ only when no rule-based exclusion is triggered by available class anchors, clearly contradictory dimensions, or contradictory specification-bearing cues such as material, finish, support context, and performance attributes. The feasible candidate set is therefore

$$ F_i = \{\, j \in C_i^{(K)} : u_{ij} = 1,\ d_{ij} = 1 \,\}. \tag{3} $$
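
A minimal sketch of the hard screening in Equation (3) is given below. The unit table, the compared fields, and the rule that a missing value never triggers an exclusion are assumptions chosen for illustration; the paper's actual exclusion rules are richer (class anchors, dimensions, support context).

```python
from typing import Dict, List

# Hypothetical mapping from database unit codes to quantity bases (assumption).
UNIT_TO_BASIS = {"EA": "count", "SF": "area", "M2": "area",
                 "LF": "length", "M": "length"}


def unit_ok(query: dict, entry: dict) -> bool:
    """u_ij: the entry unit must be compatible with the query quantity basis."""
    return UNIT_TO_BASIS.get(entry["unit"].upper()) == query["quantity_basis"]


def no_contradiction(query: dict, entry: dict,
                     fields=("material", "finish", "fire_rating")) -> bool:
    """d_ij: exclude only when both sides state a value and the values differ."""
    for field in fields:
        q_val, c_val = query.get(field), entry.get(field)
        if q_val and c_val and q_val.lower() != c_val.lower():
            return False
    return True


def feasible(query: dict, pool: List[str], entries: Dict[str, dict]) -> List[str]:
    """F_i: pool members that pass both hard checks, in pool order."""
    return [j for j in pool
            if unit_ok(query, entries[j]) and no_contradiction(query, entries[j])]
```

Treating missing attributes as non-contradictory keeps incomplete BIM records in play, so that the soft scoring stage, not the hard filter, decides between them.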

For each feasible candidate, lexical refinement and rule-based information are combined in a unified score. The ranking indicators below are soft preferences applied only after hard feasibility screening. Let $a_{ij}^{\mathrm{cat}}, a_{ij}^{\mathrm{unit}}, a_{ij}^{\mathrm{mat1}}, a_{ij}^{\mathrm{mat2}}, a_{ij}^{\mathrm{fin}}, a_{ij}^{\mathrm{perf}} \in \{0, 1\}$ denote agreement indicators for category, exact unit-form agreement within the already unit-compatible set, primary material, secondary material, finish, and performance-related cues, respectively. Let $\ell_{ij}^{\mathrm{desc}} \in [0, 1]$ be the normalized lexical similarity between the Phase II descriptor and the database search text, and let $\ell_{ij}^{\mathrm{name}} \in [0, 1]$ be the normalized lexical similarity between the source model name and the database wording; both lexical terms are implemented using RapidFuzz sequence similarity. Let $b_{ij}^{\mathrm{thk}}$ denote the graded thickness bonus,

$$ b_{ij}^{\mathrm{thk}} = \begin{cases} \beta_1, & \text{if } |\Delta t_{ij}| \le \delta_1, \\ \beta_2, & \text{if } \delta_1 < |\Delta t_{ij}| \le \delta_2, \\ \beta_3, & \text{if } \delta_2 < |\Delta t_{ij}| \le \delta_3, \\ 0, & \text{otherwise or if no comparable thickness is available,} \end{cases} \tag{4} $$

where $\Delta t_{ij}$ is the thickness difference between query-side and entry-side information when both are available. The unified rule-based score is then

$$ \begin{aligned} r_{ij} = {} & w_{\mathrm{cat}}\, a_{ij}^{\mathrm{cat}} + w_{\mathrm{unit}}\, a_{ij}^{\mathrm{unit}} + w_{\mathrm{desc}}\, \ell_{ij}^{\mathrm{desc}} + w_{\mathrm{name}}\, \ell_{ij}^{\mathrm{name}} \\ & + w_{\mathrm{mat1}}\, a_{ij}^{\mathrm{mat1}} + w_{\mathrm{mat2}}\, a_{ij}^{\mathrm{mat2}} + w_{\mathrm{fin}}\, a_{ij}^{\mathrm{fin}} + w_{\mathrm{perf}}\, a_{ij}^{\mathrm{perf}} + b_{ij}^{\mathrm{thk}}, \end{aligned} \tag{5} $$

for all $j \in F_i$. The additive score $r_{ij}$ is an implementation-level heuristic ranking score rather than a probability or calibrated utility. Because the weighted terms and thickness bonus are summed directly, $r_{ij}$ is not constrained to $[0, 1]$; the cutoffs below therefore act as case study decision thresholds rather than confidence levels. The weights $w_{(\cdot)}$, thickness bonuses $\beta_{(\cdot)}$, and thickness breakpoints $\delta_{(\cdot)}$ are preset implementation parameters for the present case study.
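
Equations (4) and (5) can be sketched in a few lines of Python. All numeric values below (weights, thickness breakpoints in millimetres, bonuses) are placeholders chosen for illustration, since the case study parameters are not reproduced here. The lexical term uses the standard-library `difflib.SequenceMatcher` as a stand-in for RapidFuzz, so its values will differ from those of the actual implementation.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional, Tuple

# Placeholder parameters (assumptions, not the case study values).
WEIGHTS = {"cat": 2.0, "unit": 1.0, "desc": 2.0, "name": 1.0,
           "mat1": 1.5, "mat2": 0.5, "fin": 0.5, "perf": 1.0}
THK_STEPS = ((5.0, 1.0), (15.0, 0.5), (30.0, 0.25))  # (delta_k, beta_k)


def thickness_bonus(t_query: Optional[float], t_entry: Optional[float]) -> float:
    """Graded bonus b_ij^thk of Equation (4)."""
    if t_query is None or t_entry is None:
        return 0.0  # no comparable thickness available
    gap = abs(t_query - t_entry)
    for delta, beta in THK_STEPS:
        if gap <= delta:
            return beta
    return 0.0


def lexical(a: str, b: str) -> float:
    """Normalized lexical similarity in [0, 1] (difflib stand-in)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def rule_score(agree: Dict[str, int], desc_pair: Tuple[str, str],
               name_pair: Tuple[str, str],
               t_query: Optional[float] = None,
               t_entry: Optional[float] = None) -> float:
    """Unified additive score r_ij of Equation (5)."""
    score = sum(WEIGHTS[key] * float(agree.get(key, 0))
                for key in ("cat", "unit", "mat1", "mat2", "fin", "perf"))
    score += WEIGHTS["desc"] * lexical(*desc_pair)
    score += WEIGHTS["name"] * lexical(*name_pair)
    return score + thickness_bonus(t_query, t_entry)
```

With these placeholder values the maximum attainable score is 10.5, which illustrates why $r_{ij}$ should be read as a ranking score and not as a confidence level.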

Only candidates with sufficient rule-based support are retained. Let $\tau_{\mathrm{keep}}$ denote the minimum retention threshold and let $M$ denote the shortlist cap. The retained shortlist is

$$ S_i = \{\, j \in F_i : r_{ij} \ge \tau_{\mathrm{keep}},\ r_{ij} \text{ is among the top-}M \text{ rule scores in } F_i \,\} \tag{6} $$

Equation (6) matches the implementation rule that only candidates scoring at least $\tau_{\mathrm{keep}}$ are retained and that at most $M$ explicit cost database entries are passed forward to the decision phase.
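
As a brief illustration of Equation (6), the shortlist can be built by thresholding and then truncating. The function name and the tie-breaking by entry identifier are assumptions.

```python
from typing import Dict, List


def shortlist(scores: Dict[str, float], tau_keep: float, m: int) -> List[str]:
    """S_i: feasible entries scoring at least tau_keep, capped at the top m."""
    kept = [(score, j) for j, score in scores.items() if score >= tau_keep]
    # Highest rule score first; ties broken by entry id for determinism.
    kept.sort(key=lambda pair: (-pair[0], pair[1]))
    return [j for _, j in kept[:m]]
```

Because every entry at or above the threshold outranks every entry below it, filtering first and truncating second yields the same set as intersecting the threshold condition with the top-$M$ condition, apart from tie handling.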

The final entry decision is made by either the rule-based phase or bounded reranking. Let

$$ \bar{r}_i = \max_{j \in S_i} r_{ij} \tag{7} $$

denote the strongest surviving rule-based score when $S_i \neq \emptyset$. When $S_i = \emptyset$, $\bar{r}_i$ is left undefined and the decision rule defaults directly to abstention. Let $g_{\mathrm{LLM}}(q_i, c_j)$ denote the ordinal preference score induced by selection-only LLM reranking over the explicit shortlist candidates only, with larger values preferred. The selection rule is

$$ j_i^{\star} = \begin{cases} \arg\max_{j \in S_i} r_{ij}, & \text{if } S_i \neq \emptyset \text{ and } \bar{r}_i \ge \tau_{\mathrm{LLM}}, \\ \arg\max_{j \in S_i} g_{\mathrm{LLM}}(q_i, c_j), & \text{if } S_i \neq \emptyset \text{ and } \bar{r}_i < \tau_{\mathrm{LLM}}, \\ \emptyset, & \text{if } S_i = \emptyset, \end{cases} \tag{8} $$

where $\emptyset$ denotes abstention. The LLM is therefore invoked only when the strongest surviving rule-based candidate remains below $\tau_{\mathrm{LLM}}$.

The fallback policy preserves the same conservative logic. If bounded reranking does not return a valid shortlist selection, the workflow falls back to the strongest surviving rule-based candidate when $S_i \neq \emptyset$ and otherwise abstains:

$$ \tilde{j}_i = \begin{cases} j_i^{\star}, & \text{if bounded reranking returns a valid candidate,} \\ \arg\max_{j \in S_i} r_{ij}, & \text{if } S_i \neq \emptyset \text{ but reranking fails,} \\ \emptyset, & \text{if } S_i = \emptyset. \end{cases} \tag{9} $$

In all non-abstaining cases, the final unit rate is read directly from the selected cost database entry $\tilde{j}_i$; no unit rate is generated by the language model. Phase III is therefore a constraint-guided item-selection process in which retrieval supplies recall, rule-based filters enforce feasibility, bounded reranking resolves residual ambiguity, and abstention is preferred over unsupported pricing.
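
The decision and fallback logic of Equations (7)–(9) can be sketched as a single function. The `rerank` callable is a hypothetical stand-in for the selection-only LLM reranker: it receives the shortlist identifiers and is assumed to return one identifier or `None`. Treating an exception raised by the reranker as a failed reranking is also an assumption.

```python
from typing import Callable, Dict, List, Optional


def select_entry(shortlist_scores: Dict[str, float], tau_llm: float,
                 rerank: Optional[Callable[[List[str]], Optional[str]]] = None
                 ) -> Optional[str]:
    """Return the selected entry id, or None for abstention."""
    if not shortlist_scores:
        return None  # S_i is empty: abstain
    # Strongest surviving rule-based candidate (Equation 7).
    best_rule = max(shortlist_scores, key=shortlist_scores.get)
    if shortlist_scores[best_rule] >= tau_llm:
        return best_rule  # rule-based decision, reranker not invoked
    choice = None
    if rerank is not None:
        try:
            choice = rerank(list(shortlist_scores))
        except Exception:
            choice = None  # treated as a failed reranking
    if choice in shortlist_scores:
        return choice  # valid selection from the explicit shortlist
    return best_rule  # fallback to the strongest rule-based candidate
```

Because the reranker can only return a member of the shortlist, any other answer is discarded, and the unit rate is always read from an existing database entry.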

Figure 7 provides a step-by-step summary of this selection logic. It is included to make the operational sequence explicit while preserving the mathematical definitions in Equations (1)–(9).

Figure 8 illustrates the same Phase III sequence from dense semantic search through rule-based filtering to the final accepted cost database item. It should be read as a compact visual counterpart to Equations (1)–(9) and Figure 7. The figure does not add new algorithmic logic; it summarizes the same sequence of candidate recall, rule-based screening, weighted scoring, shortlist retention, bounded reranking, and final item acceptance from the cost database.

3.5. Phase IV: Database-Based Cost Synthesis and Reporting

Phase IV converts accepted quantity-by-rate pairs into component-level and project-level cost outputs. For each matched item, the extracted quantity is multiplied by the selected database unit rate:

$$ C_i = Q_i \times R_i \tag{10} $$

where $Q_i$ is the extracted quantity and $R_i$ is the unit price read from the selected cost record. The project total is then the sum of all accepted line-item costs:

$$ C_{\mathrm{total}} = \sum_{i} C_i \tag{11} $$

This phase keeps the final estimate tied to measurable quantities and explicit cost database records. The reporting output retains the model element class, source identifier, quantity source, standardized description, matched cost record reference, unit basis, selected unit price, and final cost contribution.
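
Equations (10) and (11) amount to a multiply-and-sum over accepted line items. The sketch below assumes that each line is a dictionary with `quantity` and `unit_rate` keys and that an abstained item carries `unit_rate = None`; these field names are illustrative.

```python
from typing import Dict, List, Tuple


def synthesize(lines: List[Dict]) -> Tuple[List[Dict], float]:
    """Return the priced line items and the project total."""
    report = []
    total = 0.0
    for line in lines:
        if line.get("unit_rate") is None:
            continue  # abstained items are not priced
        cost = line["quantity"] * line["unit_rate"]  # Equation (10)
        report.append({**line, "cost": cost})
        total += cost  # Equation (11)
    return report, total


lines = [
    {"source_id": "w-001", "quantity": 10.0, "unit_rate": 2.5},
    {"source_id": "d-001", "quantity": 3.0, "unit_rate": 100.0},
    {"source_id": "x-001", "quantity": 7.0, "unit_rate": None},
]
report, total = synthesize(lines)  # total is 325.0; x-001 is left unpriced
```

Keeping the original fields in each report row preserves the link from a cost contribution back to its source identifier and matched record.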

4. Case Study

4.1. Case Study Data

The proposed framework was assessed using the Snowdon Towers sample BIM model as a full-building case study. The model contains 7374 extractable building elements across 21 active IFC entity types, and the entry-level pricing reference consists of an RSMeans-based cost database containing approximately 11,500 unit-price records distributed across 23 trade-specific categories. A separate Snowdon-specific total-cost reference database is used only as supplementary project-level context. The case study demonstrates how the full workflow operates on a realistic BIM dataset and makes each phase visible for review, from quantity takeoff through multimodal standardization, RSMeans item selection, and project-level cost reporting.

Figure 9 establishes the physical scope of the case study before the phase-wise results are introduced. It shows the complete three-dimensional building model together with a project-scale structural and system overview derived from the same IFC file. These panels indicate that the workflow was applied to the full building rather than to a limited sample of elements, and they show the project-wide extent of the measurable scope that later enters the quantity takeoff and cost-estimation process. They also relate the subsequent tabular outputs and cost results to a complete building representation rather than to disconnected cost database inputs.

4.2. Implementation Environment

All experiments were conducted in a GPU-enabled cloud environment to provide a consistent implementation setting. The implementation used an open-source software stack so that the methodology could be reproduced without relying on proprietary BIM authoring or commercial estimating software. Within this environment, IfcOpenShell was used for IFC parsing, entity traversal, property-set access, and quantity-related geometric extraction. Structured table handling, aggregation, and report generation were implemented using pandas and openpyxl. For retrieval, dense text embeddings represented standardized descriptions and cost database entries in a semantic vector space, while an approximate nearest-neighbor index supported top-k candidate generation. RapidFuzz was used to refine candidates through lexical comparison. The runtime environment used Python 3.12, PyTorch 2.10.0 with CUDA 12.8, Transformers 4.51.3, and Accelerate 1.6.0 on an NVIDIA GPU instance.
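The two-stage candidate generation described above can be sketched in a dependency-free form. This sketch makes several assumptions: it uses exact brute-force cosine search in place of an approximate nearest-neighbor index, Python's standard `difflib` as a stand-in for RapidFuzz, and toy two-dimensional vectors with invented descriptions instead of real embeddings and cost records.

```python
import math
from difflib import SequenceMatcher


def normalize(vec):
    """Scale a non-zero vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def top_k(query_vec, entry_vecs, k):
    """Exact (brute-force) cosine top-k; returns (similarity, entry_index) pairs."""
    q = normalize(query_vec)
    scored = [
        (sum(a * b for a, b in zip(q, normalize(vec))), idx)
        for idx, vec in enumerate(entry_vecs)
    ]
    scored.sort(reverse=True)
    return scored[:k]


def lexical_score(a, b):
    """Character-level similarity in [0, 1]; a stand-in for a RapidFuzz ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


# Toy vectors and invented descriptions, not real embeddings or cost records.
entries = ["exterior metal panel wall", "flush wood door", "metal stud partition wall"]
entry_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

pool = top_k([1.0, 0.0], entry_vecs, k=2)
query_text = "exterior rainscreen wall with metal panels"
refined = sorted(
    pool,
    key=lambda pair: lexical_score(query_text, entries[pair[1]]),
    reverse=True,
)
```

The semantic stage narrows the database to a small pool, and the lexical stage only reorders that pool, so a lexically similar but semantically distant record cannot enter the shortlist at the second stage.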

Language model components were incorporated only in constrained and predefined roles. Rather than allowing unrestricted generation for pricing decisions, the multimodal language model component was used to standardize IFC-derived descriptions and available BIM visual cues into estimator-readable wording, and the language model reranking component was used only when ambiguity remained within a small shortlist. The BIM visual-input examples were processed with Qwen3-VL-Flash as the vision language model, which was used only to describe visible component cues for later matching. The final cost decision therefore remained grounded in explicit cost database records and retained unit prices.

For cost-line retrieval, dense semantic embeddings were generated using the sentence-transformer model all-MiniLM-L6-v2 and indexed in a FAISS approximate-nearest-neighbor structure configured for cosine similarity over normalized embeddings. For each query, the retriever returned a top-K candidate pool for subsequent rule-based filtering and lexical refinement. Phase II description standardization used an instruction-tuned language model for IFC text fields and a vision language model for available BIM visual input. Phase III selection-only reranking, when invoked, used fixed decoding settings, an output cap, and disabled sampling to reduce output variance. The shortlist passed to the reranker was capped at M candidates, the minimum rule score for retaining a candidate in the shortlist was τ_{keep}, and language model reranking was skipped when the best rule-based candidate already scored at least τ_{LLM}.
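The shortlist and reranking gate described above can be sketched as follows. The function name, the returned labels, and the fallback to the best rule-based candidate are assumptions made for illustration, and the numeric values in the example are placeholders, not the Table 1 settings.

```python
def select_with_gate(scored, m, tau_keep, tau_llm, rerank):
    """Pick one candidate from (candidate_id, rule_score) pairs.

    m        : shortlist cap (M)
    tau_keep : minimum rule score for staying in the shortlist
    tau_llm  : rule score at or above which reranking is skipped
    rerank   : callable that takes shortlist ids and returns one id
    """
    shortlist = sorted(
        (c for c in scored if c[1] >= tau_keep),
        key=lambda c: c[1],
        reverse=True,
    )[:m]
    if not shortlist:
        return None, "unmatched"

    best_id, best_score = shortlist[0]
    if best_score >= tau_llm:
        return best_id, "rules"  # confident rule-based match, reranker skipped

    allowed = [cid for cid, _ in shortlist]
    chosen = rerank(allowed)
    if chosen not in allowed:
        # Assumed conservative fallback: never accept an id outside the shortlist.
        return best_id, "fallback"
    return chosen, "reranker"


# Placeholder scores and thresholds for illustration only.
scored = [("rec-a", 0.9), ("rec-b", 0.5), ("rec-c", 0.2)]
result = select_with_gate(scored, m=2, tau_keep=0.3, tau_llm=0.8, rerank=lambda ids: ids[-1])
```

Because the reranker can only return an identifier from the shortlist, the selected unit rate always comes from an explicit database record.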

The implementation settings include the field mapping, rule weights, thresholds, shortlist size, fallback policy, and prompt constraints used in the workflow. These constants are implementation settings for the present case study rather than universal estimating parameters. Broader archival release would require multi-project evaluation together with representative IFC inputs and cost database extracts. This boundary is stated to improve procedural transparency while keeping the reported reproducibility within the demonstrated scope.

The Phase III parameter values used in the case study workflow are summarized in Table 1. These values were selected after repeated pilot tests on the case study dataset because they provided a practical balance among candidate recall, rule-based filtering strength, shortlist interpretability, and reranking stability.

For the full Snowdon Towers case study, processing 7374 extracted elements required approximately 68 min in the reported cloud GPU environment. Table 2 summarizes the stage-level processing time. The runtime was dominated by BIM text normalization and bounded LLM reranking, whereas IFC parsing, retrieval, rule-based filtering, and cost synthesis required substantially less time.

4.3. Phase-Wise Application of the Workflow

The Snowdon Towers model was processed through all four phases of the proposed methodology. The case study is presented in this phase-wise form because the framework is not a one-step mapping from IFC object names to prices. Each phase produces an intermediate output that constrains the next phase. This section therefore reports the phase outputs, workflow comparison results, and project-level cost results in the order in which they are generated.

4.3.1. Phase I: Automated Quantity Takeoff from the IFC Model

Phase I begins with IFC parsing and quantity extraction. The output of this phase is an element-level quantity-takeoff inventory in which each record retains the IFC class, source identifier, geometry-related information, and descriptive cues such as type names, material associations, and selected property set attributes. A class-aware quantity basis is then assigned to each object according to its construction meaning. Planar systems such as walls, slabs, roofs, and coverings are measured by area; linear systems such as railings and columns are measured by length; and discrete objects such as doors, windows, flow terminals, and transport elements are measured by count. Table 3 reports the project-scale extraction summary and indicates that the workflow was applied across a broad set of building-element categories rather than to a narrow subset of components.
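The class-aware quantity basis can be sketched as a lookup keyed by IFC entity name. The mapping below follows the groupings stated in the paragraph above; the dictionary name and the "review" default for unlisted classes are assumptions for illustration, not the authors' field mapping.

```python
# Quantity basis per IFC class, following the groupings described in the text.
QUANTITY_BASIS = {
    # planar systems, measured by area
    "IfcWall": "area",
    "IfcWallStandardCase": "area",
    "IfcSlab": "area",
    "IfcRoof": "area",
    "IfcCovering": "area",
    # linear systems, measured by length
    "IfcRailing": "length",
    "IfcColumn": "length",
    # discrete objects, measured by count
    "IfcDoor": "count",
    "IfcWindow": "count",
    "IfcFlowTerminal": "count",
    "IfcTransportElement": "count",
}


def quantity_basis(ifc_class: str) -> str:
    """Return the quantity basis, or 'review' for classes not listed (assumed default)."""
    return QUANTITY_BASIS.get(ifc_class, "review")


basis_for_wall = quantity_basis("IfcWallStandardCase")
basis_for_door = quantity_basis("IfcDoor")
```

Routing unlisted classes to manual review instead of a default unit avoids silently pricing an element on the wrong basis.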

Table 4 reports a representative wall net-area sample to illustrate the role of opening deductions in quantity-based pricing. This sample is included because wall costs may be overstated when gross surface area is priced without deducting doors, windows, and other openings.

The wall sample indicates that quantity takeoff functions as both a geometric operation and an estimating control. The largest example in the table decreases from 518.64 m² gross area to 210.03 m² net area after opening deductions, illustrating why wall quantities must remain linked to source measurements before any RSMeans item is selected. Figure 10 presents representative isolated element views to make the Phase I quantity-basis logic visible at the object level. These examples indicate why different IFC objects cannot be measured with a single universal rule. A flow terminal is interpreted as count-based, whereas wall, roof, door, window, slab, and column objects require estimating interpretation because their cost significance depends on structural role, enclosure function, and assembly type. The roof assembly panel further illustrates why assembly-bearing information should be considered before pricing.
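The opening deduction behind the reported wall figures can be illustrated with a short sketch. The gross and net areas are the values quoted above; the total deduction of 308.61 m² is implied by their difference, and the per-opening split is not reported, so a single aggregated value is used here. The function name is hypothetical.

```python
def net_wall_area(gross_area, opening_areas):
    """Net area = gross area minus the summed areas of doors, windows, and other openings."""
    deducted = sum(opening_areas)
    if deducted > gross_area:
        raise ValueError("opening deductions exceed gross wall area")
    return gross_area - deducted


gross = 518.64       # m², reported gross area of the largest wall example
openings = [308.61]  # m², implied total deduction (518.64 - 210.03)
net = net_wall_area(gross, openings)
overstatement = gross / net  # factor by which gross-area pricing would overstate quantity
```

For this wall, pricing the gross surface would overstate the priced quantity by a factor of roughly 2.5, which is why the net area must remain linked to its source measurements.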

4.3.2. Phase II: Multimodal Description Standardization into Estimator-Readable Language

Phase II converts raw IFC-derived records and available BIM visual input into standardized descriptions that are more compatible with the structured language of RSMeans. This phase is necessary because software-generated names often contain colon-separated type strings, abbreviations, and fragmented descriptions that are meaningful in the BIM environment but not directly usable for cost retrieval. The multimodal model does not change measurable information; it restates IFC text, property cues, and visual assembly cues in a clearer and more pricing-oriented form while retaining numeric values and source fields.
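One way to check that standardization retains numeric values is a simple guard that compares the numeric tokens of the raw and rewritten descriptions. This is a hypothetical check added for illustration and is not described in the source as part of the workflow; the standardized strings below are invented wording, not model output.

```python
import re

# Matches integers, decimals, and simple fractions such as "1/8".
NUMBER = re.compile(r"\d+(?:\.\d+)?(?:/\d+)?")


def missing_numbers(raw: str, standardized: str):
    """Return numeric tokens present in the raw text but absent from the rewrite."""
    kept = set(NUMBER.findall(standardized))
    return [tok for tok in NUMBER.findall(raw) if tok not in kept]


raw = "Basic Wall:Interior - 6 1/8 in. Partition (2-hr)"
good = "Interior non-load-bearing partition wall, 6 1/8 in. thick, 2-hour fire rating"
bad = "Interior partition wall, 6 1/8 in. thick"

ok_result = missing_numbers(raw, good)
flagged = missing_numbers(raw, bad)
```

A guard of this kind is deliberately strict: a rewrite that converts units would also be flagged, which may be acceptable when flagged records are sent to review and not rejected outright.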

Phase II outputs exhibit a consistent pattern across the sampled wall, door, fixture, and roof records. Colon-separated or abbreviated IFC labels are rewritten into estimator-readable descriptions that preserve specification-bearing cues such as fire rating, internal or external condition, thickness, swing type, membrane type, and major material or assembly terms. For example, “Basic Wall:Interior—6 1/8 in. Partition (2-hr)” is rewritten as an interior non-load-bearing partition wall with explicit thickness and 2-h fire rating; “Basic Wall:Exterior—12 5/8 in. Rainscreen w Insulation on Metal Stud” is rewritten as an exterior rainscreen wall with insulation, metal stud support, and fire rating cues preserved; “M_Single-Flush:0915 × 2134 mm” becomes a single-leaf flush door with stated swing and dimensional context; and “Basic Roof:Insulation on Metal Deck epdm” becomes an exterior roof assembly with EPDM membrane over rigid insulation on metal deck. The roof example also indicates how BIM visual input can reinforce assembly information when IFC text is incomplete. These examples are used to characterize the standardization behavior documented in the comparison materials without adding a separate text-heavy table to the manuscript.

4.3.3. Phase III: Retrieval-Guided Cost Matching and Entry Selection

Phase III converts measured IFC information and standardized multimodal descriptions into an explicit RSMeans item choice. Standardized descriptions are compared with the structured cost database using retrieval-guided matching. Semantic retrieval first identifies plausible candidates, and rule-based filtering then removes entries that fail unit, material, thickness, finish, fire rating, or specification consistency. Lexical refinement improves local ranking among technically similar entries. If ambiguity remains, bounded reranking is applied over a small set of explicit RSMeans candidates. The accepted unit rate is always read directly from the selected entry rather than generated by the language model.
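The rule-based filtering step can be sketched as a set of hard consistency checks applied to each retrieved candidate. The sketch covers only unit basis and fire rating; the class names, the unit-to-basis table, and the record identifiers are hypothetical, and the example mirrors the unit mismatch discussed in Case 1 (Section 4.4.1).

```python
from dataclasses import dataclass
from typing import Optional

# Assumed mapping from cost-record units to quantity basis.
UNIT_TO_BASIS = {"S.F.": "area", "L.F.": "length", "Ea.": "count"}


@dataclass(frozen=True)
class Query:
    unit_basis: str                        # "area", "length", or "count"
    fire_rating_hr: Optional[float] = None


@dataclass(frozen=True)
class Candidate:
    record_id: str                         # hypothetical identifier
    description: str
    unit: str
    fire_rating_hr: Optional[float] = None


def passes_rules(query: Query, cand: Candidate) -> bool:
    """Reject candidates whose unit basis or stated fire rating conflicts with the query."""
    if UNIT_TO_BASIS.get(cand.unit) != query.unit_basis:
        return False
    both_rated = query.fire_rating_hr is not None and cand.fire_rating_hr is not None
    if both_rated and cand.fire_rating_hr != query.fire_rating_hr:
        return False
    return True


wall = Query(unit_basis="area", fire_rating_hr=2.0)
candidates = [
    Candidate("rec-1", "load-bearing metal stud framing", "L.F."),
    Candidate("rec-2", "metal panel wall cladding", "S.F."),
    Candidate("rec-3", "1-hour rated partition", "S.F.", fire_rating_hr=1.0),
]
kept = [c.record_id for c in candidates if passes_rules(wall, c)]
```

In this sketch a candidate with no stated fire rating is kept for later review, which is an assumed policy; a stricter variant could require the rating to be present.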

Phase III is central to the case study because many RSMeans records are semantically similar but differ materially in fire rating, thickness, finish, assembly type, or unit basis. Its function is therefore not limited to candidate retrieval; it also provides a structured basis for cost review. This phase keeps final prices tied to explicit RSMeans entries rather than free-form model output. At record level, the retained review record associates each accepted cost with IFC class, source identifier, quantity basis, standardized description, visual input when used, and selected RSMeans entry.

4.3.4. Phase IV: Database-Based Cost Synthesis and Reporting

Phase IV multiplies extracted quantities by the selected database unit rates to generate component-level and project-level cost outputs. The results are then aggregated into category summaries and total-building costs. This final phase keeps the estimate database-based by deriving it from measurable quantities and explicit cost records rather than from open-ended text generation.

4.4. Comparative Case Studies of Cost-Item Matching

To demonstrate the entry-level matching logic of the proposed workflow, four cases were selected to examine unit-basis consistency, construction-scope alignment, assembly coverage, and thickness-sensitive item selection. Direct cost item retrieval uses the raw IFC description to retrieve the top-ranked RSMeans pricing record, whereas the proposed method standardizes BIM model content before applying unit, material, thickness, finish, and assembly checks.

4.4.1. Case 1: Direct Retrieval vs. Proposed Method for Measurement Unit Selection

Case 1 examines a measurement-unit mismatch for an exterior rainscreen wall assembly. The selected IfcWallStandardCase has the raw IFC name Basic Wall:Exterior—12 5/8” Rainscreen w Insultation on Metal Stud:676281. The IFC input identifies an external, non-load-bearing, 2-h rated wall with metal panels, an air cavity, rigid insulation, gypsum sheathing, metal studs, and gypsum wall board. The extracted face area is approximately 5.859 m² (63.07 ft²). Figure 11 represents the wall as a layered assembly because the cost-relevant issue is the assembly composition and area-based measurement rather than only the solid geometry.

Direct cost item retrieval returned a load-bearing metal stud framing item measured by linear foot. This result is textually plausible because the IFC name contains “metal stud,” but it conflicts with the wall’s area-based quantity and non-load-bearing property. The proposed method removes the linear-foot candidate and retains area-compatible wall backup and metal panel entries, producing a reviewable assembly decomposition rather than forcing a one-to-one match that is not supported by the cost database. Table 5 summarizes the comparison.

4.4.2. Case 2: Direct Retrieval vs. Proposed Method for Scope Selection

Case 2 examines a scope mismatch for a single flush passage door. The selected IfcDoor has the raw IFC name Door-Passage-Single-Flush:36” × 84”:631497. The IFC input indicates a non-fire-rated, single-swing left passage door with a nominal 36 in. by 84 in. size and birch/wood material information. Figure 12 shows the selected door object. The matching issue in this case is whether the selected cost item represents the door itself rather than an accessory system.

Direct cost item retrieval returned an automatic door operator item after non-price guide pages were excluded. The phrase “single-swing door” makes this item semantically close, but its construction scope does not correspond to a flush passage door. The proposed method uses the door type, size, material information, and each-based quantity basis to select a flush wood door pricing record, with an integrated passage door assembly retained as a secondary candidate. Table 6 reports the resulting entries.

4.4.3. Case 3: Direct Retrieval vs. Proposed Method for Assembly Selection

Case 3 examines assembly preservation for an EPDM roof system. The selected IfcSlab roof component is aggregated under an IfcRoof and has the raw IFC name Basic Roof:Insulation on Metal Deck—EPDM:759489. The IFC input gives an area of approximately 227.68 S.F. and identifies EPDM membrane, rigid insulation, concrete, and metal deck layers. Figure 13 represents this roof as an assembly because the relevant issue is whether the selected RSMeans item retains the insulation and substrate information.

Direct cost item retrieval returned an EPDM membrane-only item. This is a reasonable semantic match for the term “EPDM,” but it under-represents the IFC assembly because the model explicitly includes insulation on metal deck. The proposed method retains area-compatible roofing and insulation entries and treats the result as an assembly decomposition, while keeping the metal deck as substrate information rather than adding it to the roof subtotal. This case also explains the ablation result for removing assembly-aware cues: without assembly information, the workflow tends to select a membrane-only item, whereas the full workflow uses the roof assembly information to retain both membrane- and insulation-related cost items. Table 7 gives the direct retrieval and proposed matches.
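The difference between a membrane-only match and an assembly decomposition can be sketched numerically. The roof area is the value reported above; the unit rates, the component labels, and the `role` field are hypothetical placeholders and do not correspond to actual cost records.

```python
def assembly_subtotal(area_sf, components):
    """Sum area-based costs of priced components; substrate entries are kept but not priced."""
    return sum(
        area_sf * comp["rate_per_sf"]
        for comp in components
        if comp["role"] == "priced"
    )


area_sf = 227.68  # S.F., roof area reported for the selected component

# Placeholder rates for illustration only.
roof_assembly = [
    {"name": "EPDM membrane", "rate_per_sf": 2.00, "role": "priced"},
    {"name": "rigid insulation", "rate_per_sf": 1.50, "role": "priced"},
    {"name": "metal deck", "rate_per_sf": 0.00, "role": "substrate"},
]

membrane_only = assembly_subtotal(area_sf, roof_assembly[:1])
full_assembly = assembly_subtotal(area_sf, roof_assembly)
```

Whatever the actual rates are, the membrane-only subtotal omits the insulation line entirely, so the gap between the two results grows in proportion to roof area.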

4.4.4. Case 4: Full Workflow vs. No Thickness Check for Wall Item Selection

Case 4 examines whether thickness information affects wall cost item selection. The selected IfcWallStandardCase record is an interior gypsum-board partition wall on metal studs with the raw IFC name Basic Wall:Interior—6 1/8” Partition (2-h):2330888. The IFC input gives an area-based quantity, a nominal thickness of approximately 6 1/8 in., gypsum wall board and metal stud layers, and a 2 h fire rating. Figure 14 shows the layer-based thickness interpretation used in the comparison. Two workflow variants were compared: the full workflow retained the thickness check, whereas the ablation variant removed only the thickness condition while retaining description standardization and the unit, material, assembly, and fire rating checks.

Without the thickness check, the workflow selected a technically similar 2 h metal stud partition item with a 6 in. stud width. This candidate preserved the unit, material, and fire rating cues, but its implied wall thickness was larger than the BIM-reported 6 1/8 in. wall. With the full workflow, the selected item used 3 5/8 in. metal studs with two layers of 5/8 in. gypsum board on each side, which is more consistent with the BIM-reported nominal wall thickness and 2 h fire rating information. The result indicates that thickness information can affect candidate ranking among similar wall items, although the final selection also depends on material, fire rating, assembly, and unit checks. Table 8 summarizes the comparison.
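The thickness comparison in this case can be verified with exact fraction arithmetic. The stud widths, board thickness, and layer count for the full-workflow item are taken from the text; the board configuration of the 6 in. stud item is not stated, so a single 5/8 in. layer per side is assumed here purely as a lower-bound illustration.

```python
from fractions import Fraction


def wall_thickness(stud_width, board_thickness, layers_per_side):
    """Nominal thickness = stud width + board layers on both faces (inches)."""
    return stud_width + 2 * layers_per_side * board_thickness


bim_thickness = Fraction(6) + Fraction(1, 8)  # 6 1/8 in., reported by the BIM model

# Full workflow: 3 5/8 in. studs, two layers of 5/8 in. gypsum board on each side.
full_workflow = wall_thickness(Fraction(3) + Fraction(5, 8), Fraction(5, 8), 2)

# No thickness check: 6 in. studs; one 5/8 in. layer per side is an assumption.
no_check = wall_thickness(Fraction(6), Fraction(5, 8), 1)

deviation_full = abs(full_workflow - bim_thickness)
deviation_no_check = abs(no_check - bim_thickness)
```

Under these inputs the full-workflow item reproduces the 6 1/8 in. nominal thickness exactly, whereas the 6 in. stud item exceeds it even with the thinnest assumed board configuration.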

These cases indicate that the proposed method can improve entry-level selection by checking whether the retrieved RSMeans item has the correct unit basis, construction scope, assembly coverage, and thickness consistency. Its value is therefore not limited to increasing project-level pricing coverage; it also provides a more reviewable basis for cost item selection under IFC-to-RSMeans semantic mismatch.

4.5. Workflow Comparison: Category-Level Costs

The component-level total cost comparison in Table 9 reports estimates across direct cost item retrieval and three language model variants. The comparison reflects the combined effect of Phase III entry-selection behavior and Phase IV cost projection.

Figure 15 presents the project total cost comparison across workflow variants.

The comparison indicates that the workflow improves completeness by recovering categories that direct cost item retrieval leaves unpriced, most notably IfcBuildingElementProxy (USD 250K), IfcFlowTerminal (USD 178K), and IfcMember (USD 644K), which collectively account for over USD 1.07 million. Compact reranking models perform similarly on most categories when the candidate space has already been narrowed through workflow controls. The lower Qwen 3B total reflects entry-selection failures rather than quantity differences: that variant returned zero for IfcBuildingElementProxy and IfcRailing because reranking failed to retain a technically acceptable entry after the shortlist was formed. This behavior suggests that workflow controls and conservative fallback policies may be more important than model size alone under the reported conditions. A supplementary project-level check against the Snowdon total-cost reference places the full workflow closer to that reference than direct cost item retrieval, but that comparison is secondary to the workflow findings reported here.

Figure 16 complements the cost comparison by indicating that the workflow variants retain accepted pricing in 17 IFC categories, whereas direct cost item retrieval leaves 3 additional categories unmatched. Figure 17 summarizes the share of comparison records whose top-1 RSMeans suggestion changes relative to the full heuristic stack. Together, these graphics indicate that the workflow improves pricing coverage at building scale and changes candidate behavior inside the internal comparison set. Removing multimodal standardization substantially reshapes candidate selection, while suppressing assembly-aware cues creates the next-largest disruption and concentrates that disruption inside the assembly-sensitive subset.

5. Discussion

This paper presents a case study of a BIM model-to-cost database workflow in which quantity takeoff, multimodal description standardization, item selection, and cost synthesis remain reviewable and database-based. The application demonstrates how the workflow operates on one BIM model and one structured cost database.

The case study results clarify how the workflow affects entry-level cost item selection. The four diagnostic cases show that direct retrieval can return semantically related but technically inappropriate records when measurement unit, construction scope, assembly composition, or thickness information is not checked. In the wall case, the proposed method retains the area-based quantity basis instead of accepting a linear framing item. In the door case, it selects a door item rather than an accessory or operator item. In the roof case, it retains assembly-level information instead of reducing the match to a membrane-only record. In the thickness case, the full workflow selects a wall item that is more consistent with the reported partition thickness than the no-thickness-check variant. These findings support a narrower claim: the workflow improves the technical screening of candidate cost database items, while the language model remains limited to description standardization and bounded reranking rather than unit-rate generation.

In practical estimating workflows, the proposed method can be used after BIM-based quantity extraction and before estimator review. The workflow can export matched cost items, quantities, unit rates, and source identifiers in spreadsheet form so that estimators can review, revise, or reject suggested matches. The current implementation is better suited to office or cloud-based processing than to site-level execution. It does not require proprietary BIM authoring software, but integration with commercial estimating platforms would require additional import/export mapping. The workflow may reduce manual search and screening effort, but this study does not empirically measure estimator workload reduction.

6. Conclusions

This study presented a four-phase multimodal and rule-based framework for linking BIM model information with structured cost database records. The main contribution is a cost item selection workflow that addresses the descriptive mismatch between BIM model content and estimator-facing cost records. In the proposed workflow, BIM-derived quantities and component attributes are extracted, model-side descriptions and visual cues are standardized into estimator-readable language, candidate cost database items are retrieved, and rule-based checks are applied to verify unit basis, material assembly, thickness, finish, fire rating, and construction scope before a database item is accepted. In the case study, the framework processed 7374 extracted BIM elements and generated component-level and project-level costs from accepted database unit rates. Four diagnostic matching cases further showed how the workflow reduces common matching errors involving measurement unit, construction scope, assembly coverage, and thickness-sensitive item selection. These results indicate that the proposed framework can support feasible and traceable BIM model-to-cost database matching under the reported case study conditions.

Several limitations remain. The study is limited to one BIM model and one structured cost database. The four diagnostic cases illustrate entry-level matching behavior, but they do not replace an independently adjudicated benchmark with estimator-validated correct and acceptable alternative cost items. The study also does not empirically validate traceability through estimator review time or industry user evaluation. Such validation is left for future work. Broader sensitivity analysis across retrieval depth, thresholds, embeddings, multimodal prompts, and additional disciplines is also outside the present scope.

Future research should extend this case study demonstration in two directions. First, formal module-level ablations and sensitivity analyses should be conducted, including separate tests of LLM-based standardization, visual-input standardization, and rule-constrained item selection. Second, the workflow should be evaluated across larger cost libraries and across structural, MEP, and multidisciplinary BIM models.

Author Contributions

Conceptualization, H.A.-D. and L.G.; methodology, H.A.-D., R.J.C. and L.G.; software, H.A.-D. and R.J.C.; validation, H.A.-D., R.J.C., L.G. and A.S.; formal analysis, H.A.-D. and R.J.C.; investigation, H.A.-D. and R.J.C.; resources, L.G. and A.S.; data curation, H.A.-D. and R.J.C.; writing—original draft preparation, H.A.-D., R.J.C. and L.G.; writing—review and editing, L.G., A.S., H.A.-D. and R.J.C.; visualization, H.A.-D. and R.J.C.; supervision, L.G. and A.S.; project administration, L.G.; funding acquisition, L.G. and A.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Lawrence, A.; Pottinger, R.; Staub-French, M.; Nepal, M.P. Creating flexible mappings between building information models and cost information. Autom. Constr. 2014, 43, 108–117. [Google Scholar] [CrossRef][Green Version]
Babatunde, S.O.; Perera, S.; Ekundayo, D.; Adeleye, T.E. An investigation into BIM-based detailed cost estimating and drivers to the adoption of BIM in quantity surveying practices. J. Financ. Manag. Prop. Constr. 2020, 25, 61–81. [Google Scholar] [CrossRef]
Wahab, A.; Wang, J. Factors-driven comparison between BIM-based and traditional 2D quantity takeoff in construction cost estimation. Eng. Constr. Archit. Manag. 2022, 29, 702–715. [Google Scholar] [CrossRef]
Cassandro, J.; Farina, A.; Mirarchi, C.; Pavan, A. Cost estimation practices: A comparative analysis of traditional 5D BIM and IFC methods. In Proceedings of the 41st International CIB W78 Conference, Marrakesh, Morocco, 1–3 October 2024. [Google Scholar]
Lee, G.; Sacks, R.; Eastman, C.M. Specifying parametric building object behavior (bob) for a building information modeling system. Autom. Constr. 2006, 15, 758–776. [Google Scholar] [CrossRef]
Lee, S.; Yu, J.; Jeong, J. BIM and ontology-based approach for building cost estimation. Autom. Constr. 2014, 41, 96–105. [Google Scholar] [CrossRef]
da Paz Moreira, G.; Carvalho, M.T.M.; Junior, E.R. Ifc-Based Automated Validation of Quantity Takeoff Requirements for Cost Estimation. SSRN 2025. [Google Scholar] [CrossRef]
Cassandro, J.; Mirarchi, C.; Zanchetta, C.; Pavan, A. Enhancing accuracy in cost estimation: Structured cost data integration and model validation. J. Inf. Technol. Constr. 2024, 29, 1293–1325. [Google Scholar] [CrossRef]
Cassandro, J.; Mirarchi, C.; Pavan, A. IFC-based cost estimation: Application to a structural model. In Proceedings of the European Conference on Computing in Construction, Crete, Greece, 14–17 July 2024. [Google Scholar]
Choi, J.; Kim, H.; Kim, I. Open BIM-based quantity take-off system for schematic estimation of building frame in early design stage. J. Comput. Des. Eng. 2015, 2, 16–25. [Google Scholar] [CrossRef]
Monteiro, A.; ao Poças Martins, J. A survey on modeling guidelines for quantity takeoff-oriented BIM-based design. Autom. Constr. 2013, 35, 238–253. [Google Scholar] [CrossRef]
Akanbi, T.; Zhang, J. IFC-based algorithms for automated quantity takeoff from architectural model: Case study on residential development project. J. Archit. Eng. 2023, 29, 04023026. [Google Scholar] [CrossRef]
Pishdad, P.; Onungwa, I.O. Analysis of 5D BIM for cost estimation, cost control, and payments. J. Inf. Technol. Constr. 2024, 29, 525–548. [Google Scholar] [CrossRef]
Chong, H.Y.; Zhang, Y.; Lee, C.Y.; Wang, F.; Zhang, Y. Synchronizing BIM cost models and bills of quantities for lifecycle audit trail cost management. Eng. Constr. Archit. Manag. 2025, 32, 6566–6592. [Google Scholar] [CrossRef]
Uhm, M.; Kim, J.; Ahn, S.; Jeong, H.; Kim, H. Effectiveness of retrieval augmented generation-based large language models for generating construction safety information. Autom. Constr. 2025, 170, 105926. [Google Scholar] [CrossRef]
Gatto, C.; Cassandro, J.; Mirarchi, C.; Pavan, A. LLM-based automatic relation between cost-domain descriptions and IFC objects. In Proceedings of the 41st International Conference of CIB W78; ITC Digital Library: Marrakech, Morocco, 2024. [Google Scholar]
Olsen, D.; Taylor, J.M. Quantity take-off using building information modeling (BIM), and its limiting factors. Procedia Eng. 2017, 196, 1098–1105. [Google Scholar] [CrossRef]
Furstenberg, D.; Hjelseth, E.; Klakegg, O.J.; Wikstrom, L.; Laedre, O. Automated quantity take-off in a Norwegian road project. Sci. Rep. 2024, 14, 458. [Google Scholar] [CrossRef]
Valinejadshoubi, M.; Moselhi, O.; Iordanova, I.; Valdivieso, F.; Bagchi, A.; Corneau-Gauvin, C.; Kaptué, A. Automated system for high-accuracy quantity takeoff using BIM. Autom. Constr. 2024, 157, 105155. [Google Scholar] [CrossRef]
Valinejadshoubi, M.; Moselhi, O.; Iordanova, I.; Valdivieso, F.; Bagchi, A.; Corneau-Gauvin, C.; Kaptué, A. A cloud-driven framework for automated BIM quantity takeoff and quality control: Case study insights. Buildings 2025, 15, 3942. [Google Scholar] [CrossRef]
Alathamneh, S.; Collins, W.; Azhar, S. BIM-based quantity takeoff: Current state and future opportunities. Autom. Constr. 2024, 165, 105549. [Google Scholar] [CrossRef]
Pham, V.H.; Chen, P.H.; Nguyen, Q.; Duong, D.T. BIM-based automatic extraction of daily concrete and formwork requirements for site work planning. Buildings 2024, 14, 4021. [Google Scholar] [CrossRef]
Liu, H.; Cheng, J.C.P.; Gan, V.J.L.; Zhou, S. A knowledge model-based BIM framework for automatic code-compliant quantity take-off. Autom. Constr. 2022, 133, 104024. [Google Scholar] [CrossRef]
Ergen, F.; Bettemir, Ö.H. Development of ontological algorithms for exact QTO of reinforced concrete construction items. Structures 2024, 66, 105907. [Google Scholar] [CrossRef]
Marcinkeviciute, D.; Schildknecht, L.; Huber, M.; Pancera, M.; Gschwind, J.; Badertscher, M.; Crevillen, J. Simplification and enrichment of IFC models for cost estimation. In Proceedings of the Creative Construction Conference, Zadar, Croatia, 14–17 June 2025. [Google Scholar]
Akanbi, T. IFC-Based Systems and Methods to Support Construction Cost Sstimation. Ph.D. Thesis, Purdue University, West Lafayette, IN, USA, 2021. [Google Scholar]
Gholamzadehmir, M.; Cassandro, J.; Mirarchi, C.; Pavan, A. Advancing cost estimation through BIM development: Focus on energy-related data associated with IFC element. Appl. Sci. 2025, 15, 7814. [Google Scholar] [CrossRef]
Solla, M.; Derbi, D.; Alosta, M.; Kazee, M.F.A.; Hayder, G.; Milad, A. Integrating price guideline and building information modeling for construction project cost estimation: A case study. Discov. Civ. Eng. 2025, 2, 151. [Google Scholar] [CrossRef]
Rostamiasl, V.; Jrade, A. Integrating building information modeling (BIM) and life cycle cost analysis (LCCA) to evaluate the economic benefits of designing aging-in-place homes at the conceptual stage. Sustainability 2024, 16, 5743. [Google Scholar] [CrossRef]
Moradabadi, B.; Noorzai, E.; Abbasi, S. BIM-based optimization approach to reduce life cycle costs by focusing on the integration of construction and operation phases in office-commercial buildings. J. Build. Eng. 2024, 90, 111126. [Google Scholar] [CrossRef]
Park, D.; Yun, S. Construction cost prediction using deep learning with BIM properties in the schematic design phase. Appl. Sci. 2023, 13, 7207. [Google Scholar] [CrossRef]
Wang, C.; Qiao, J. Construction project cost prediction method based on improved BiLSTM. Appl. Sci. 2024, 14, 978. [Google Scholar] [CrossRef]
Cheng, M.Y.; Vu, Q.T.; Gosal, F.E. Hybrid deep learning model for accurate cost and schedule estimation in construction projects using sequential and non-sequential data. Autom. Constr. 2025, 171, 105904. [Google Scholar] [CrossRef]
Sadikoglu, E.; Demirkesen, S. Review of machine learning and artificial intelligence use for cost estimation in construction projects, 2025. In Proceedings of the European Conference on Computing in Construction, Porto, Portugal, 14–17 July 2025. [Google Scholar]
Zhang, S.; Zhang, S.; Liu, H.; Wang, C.; Zhao, Z.; Wang, X. Semantic enrichment of BIM models for construction cost estimation in pumped storage hydropower using Industry Foundation Classes and interconnected data dictionaries. Adv. Eng. Inform. 2025, 68, 103670. [Google Scholar] [CrossRef]
Banihashemi, S.; Khalili, S.; Sheikhkhoshkar, M.; Fazeli, A. Machine learning-integrated 5D BIM informatics: Building materials costs data classification and prototype development. Innov. Infrastruct. Solut. 2022, 7, 271. [Google Scholar] [CrossRef]
Chiarello, F.; Barandoni, S.; Škec, M.M.; Fantoni, G. Generative large language models in engineering design: Opportunities and challenges. Proc. Des. Soc. 2024, 4, 1959–1968. [Google Scholar] [CrossRef]
Lee, G.; Jang, S.; Hyun, S. A generalized LLM-augmented BIM framework: Application to a speech-to-BIM system. arXiv 2024, arXiv:2409.18345. presented at CIB W78 2024. [Google Scholar]
Liu, D.; Zhou, X.; Li, Y. Enhancing natural language retrieval of BIM data through integration of large language models with multi-agent systems. In Proceedings of the 30th Conference on Computer Aided Architectural Design Research in Asia (CAADRIA), Tokyo, Japan, 22–29 March 2025; Volume 3, pp. 91–100.
Valdebenito, R.; Forcael, E. Integrating artificial intelligence and BIM in construction: Systematic review and quantitative comparative analysis. Appl. Sci. 2025, 15, 12470.
Du, S.; Hou, L.; Zhang, G.; Tan, Y.; Mao, P. BIM and IFC data readiness for AI integration in the construction industry: A review approach. Buildings 2024, 14, 3305.
Kampelopoulos, D.; Tsanousa, A.; Vrochidis, S.; Kompatsiaris, I. A review of LLMs and their applications in the architecture, engineering and construction industry. Artif. Intell. Rev. 2025, 58.
Li, M.; Wang, Z. BuildingGPT: Query building semantic data using large language models and vector-graph retrieval-augmented generation. Build. Environ. 2026, 287, 113855.
Saka, A.; Taiwo, R.; Saka, N.; Salami, B.A.; Ajayi, S.; Akande, K.; Kazemi, H. GPT models in construction industry: Opportunities, limitations, and a use case validation. Dev. Built Environ. 2024, 17, 100300.
Abdelsalam, M.; Ashmawi, A.; Nguyen, P.H.D. AI-driven automation of construction cost estimation: Integrating BIM with large language models. Buildings 2026, 16, 485.

Figure 1. Overall four-phase workflow for multimodal BIM model-to-cost estimation.

Figure 2. LLM standardization prompt.

Figure 3. Structured input from the BIM model for the LLM example.

Figure 4. JSON output from the LLM standardization example.

Figure 5. Window object used as visual input for the Phase II VLM standardization example.

Figure 6. VLM visual-input prompt.

Figure 7. Constraint-guided cost item selection.

Figure 8. Phase III retrieval-centered decision flow from BIM model query to accepted cost database item.

Figure 9. Project-scale views of the BIM model used in the case study.

Figure 10. Representative isolated IFC elements illustrating the class-aware quantity basis assignment logic for Phase I.

Figure 11. Layered wall assembly view for the exterior rainscreen wall used in the unit-basis matching comparison. Thickness is exaggerated for visualization; the layer labels preserve the IFC material-layer information.

Figure 12. Three-dimensional view of the single flush passage door used in the scope-mismatch matching comparison. The frame is shown only as contextual geometry; the selected RSMeans match prices the door leaf rather than an automatic operator or guide record.

Figure 13. Three-dimensional assembly view for the EPDM roof case. The IFC material-layer information identifies an EPDM membrane, rigid insulation, concrete, and metal deck substrate; the layer thicknesses are exaggerated to make the assembly structure visible.

Figure 14. Layered wall view for the 6 1/8 in. interior fire-rated partition used in the thickness-check comparison. The visualized layers correspond to two 5/8 in. gypsum-board layers on each side of a 3-5/8 in. metal stud layer.

Figure 15. Project-level total cost across four workflow variants.

Figure 16. Category-level pricing coverage by workflow variant.

Figure 17. Share of comparison records whose top-1 RSMeans suggestion changes relative to the full heuristic stack.

Table 1. Case study implementation parameters for Phase III cost item selection.

Parameter	Value Used	Role in the Case Study Workflow
Top-K candidate pool	$K = 80$	Maintains a broad semantic retrieval pool before rule-based filtering.
Shortlist cap	$M = 3$	Limits the candidate set passed to bounded reranking.
Retention threshold	$\tau_{keep} = 0.30$	Removes weakly supported candidates from the retained shortlist.
Reranking threshold	$\tau_{LLM} = 0.60$	Skips LLM reranking when the top rule-based score is already strong.
Rule-score weights	$w_{cat} = 0.30$, $w_{unit} = 0.20$, $w_{desc} = 0.25$, $w_{name} = 0.10$, $w_{mat1} = 0.18$, $w_{mat2} = 0.08$, $w_{fin} = 0.08$, $w_{perf} = 0.06$	Prioritizes class and unit consistency, followed by description similarity, material cues, finish, and performance-related cues.
Thickness bonus	$\beta_{1} = 0.12$, $\beta_{2} = 0.08$, $\beta_{3} = 0.03$	Gives a secondary ranking bonus to thickness-consistent candidates.
Thickness breakpoints	$\delta_{1} = 5$ mm, $\delta_{2} = 15$ mm, $\delta_{3} = 40$ mm	Defines close, moderate, and weak thickness agreement among comparable candidates.
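
The parameters in Table 1 can be read as a weighted rule score with a tiered thickness bonus, followed by a thresholded shortlist. The Python sketch below is only one plausible reading of the table, not the authors' implementation: it assumes each cue is an agreement value in [0, 1], that the score is a plain weighted sum, that the thickness breakpoints are inclusive upper bounds on the absolute gap, and that reranking is triggered only when the top score falls below $\tau_{LLM}$. All function names are hypothetical.

```python
# Values copied from Table 1; the way they are combined is an assumption.
WEIGHTS = {
    "cat": 0.30, "unit": 0.20, "desc": 0.25, "name": 0.10,
    "mat1": 0.18, "mat2": 0.08, "fin": 0.08, "perf": 0.06,
}
# (maximum thickness gap in mm, bonus) pairs, ordered from close to weak agreement.
THICKNESS_TIERS = [(5.0, 0.12), (15.0, 0.08), (40.0, 0.03)]
TAU_KEEP = 0.30      # retention threshold
TAU_LLM = 0.60       # reranking threshold
SHORTLIST_CAP = 3    # M


def thickness_bonus(bim_mm, candidate_mm):
    """Return the tier bonus for the absolute thickness gap, or 0.0 if unknown."""
    if bim_mm is None or candidate_mm is None:
        return 0.0
    gap = abs(bim_mm - candidate_mm)
    for max_gap, bonus in THICKNESS_TIERS:
        if gap <= max_gap:
            return bonus
    return 0.0


def rule_score(cues, bim_mm=None, candidate_mm=None):
    """Weighted sum of cue agreements (each in [0, 1]) plus the thickness bonus."""
    base = sum(weight * cues.get(name, 0.0) for name, weight in WEIGHTS.items())
    return base + thickness_bonus(bim_mm, candidate_mm)


def shortlist(scored):
    """Keep (item, score) pairs at or above TAU_KEEP, best first, capped at M."""
    kept = [pair for pair in scored if pair[1] >= TAU_KEEP]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:SHORTLIST_CAP]


def needs_llm_rerank(top_score):
    """Rerank only when the best rule-based score is below TAU_LLM."""
    return top_score < TAU_LLM


scored = [("a", 0.20), ("b", 0.90), ("c", 0.50), ("d", 0.40), ("e", 0.31)]
top3 = shortlist(scored)
```

Under these assumptions, a candidate that agrees only on class and unit scores 0.50, which survives the retention threshold but would still be passed to bounded reranking.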

Table 2. Estimated processing time for the Snowdon Towers case study workflow.

Stage	Estimated Time
IFC parsing and quantity extraction	3 min
BIM text normalization/LLM standardization	35 min
VLM processing for representative visual cases	5 min
Embedding generation and top-K retrieval	2 min
Rule-based filtering, scoring, and shortlist generation	4 min
Bounded LLM reranking	18 min
Cost synthesis and spreadsheet export	1 min
Total processing time	68 min

Table 3. IFC element extraction summary from the Snowdon Towers model.

IFC Class	Count	IFC Class	Count
IfcWall	1078	IfcRoof	20
IfcWallStandardCase	902	IfcStair	31
IfcSlab	230	IfcStairFlight	49
IfcColumn	118	IfcRailing	142
IfcBeam	0	IfcRamp	2
IfcMember	1625	IfcRampFlight	2
IfcDoor	132	IfcCurtainWall	42
IfcWindow	68	IfcPlate	480
IfcCovering	68	IfcFlowTerminal	607
IfcFurnishingElement	345	IfcOpeningElement	436
IfcTransportElement	2	IfcBuildingElementProxy	995
Total extracted elements			7374

Table 4. Wall case study sample showing gross area, opening deductions, and computed net area for representative Phase I wall instances.

Sr.	Raw IFC Name	Gross Area (m²)	Opening Area (m²)	Net Area (m²)
1	Exterior—13 5/8” Rainscreen (Two Sides)	80.03	0.00	80.03
2	Foundation—24” Concrete	26.37	0.00	26.37
3	Foundation—24” Concrete	11.11	0.00	11.11
4	Chase—GWB & Metal Stud 4 1/4”	330.56	0.00	330.56
5	Core—Concrete 18”	21.55	0.00	21.55
6	Core—Concrete 18”	87.10	48.89	38.21
7	Core—Concrete 12”	284.09	122.22	161.87
8	Party—6 1/8” (2-hr)	12.32	0.00	12.32
9	Exterior—12 5/8” Rainscreen w Insulation	7.33	0.00	7.33
10	Core – Concrete 10”	108.24	0.00	108.24
11	Foundation—24” Concrete	29.98	0.00	29.98
12	Chase—GWB & Metal Stud 6 5/8”	6.23	0.00	6.23
13	Foundation—24” Concrete	11.15	0.00	11.15
14	Exterior—14 5/8” Rainscreen w Insulation	135.31	24.44	110.86
15	Foundation—24” Concrete	18.35	0.00	18.35
16	Exterior—12 5/8” Rainscreen w Insulation	518.64	308.61	210.03
17	Party—6 1/8” (2-hr)	22.72	0.00	22.72
18	Exterior—12 5/8” Rainscreen w Insulation	65.28	0.00	65.28
19	Exterior—12 5/8” Rainscreen w Insulation	9.92	0.00	9.92
20	Core—Concrete 10”	13.72	0.00	13.72
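
The net area column in Table 4 follows from subtracting the opening deductions from the gross area. The short Python sketch below reproduces that arithmetic for the rows with non-zero openings; the function name is hypothetical, and rounding to two decimals is an assumption made to match the table's precision.

```python
def net_wall_area(gross_m2, opening_m2):
    """Net wall area in m2 after deducting openings, rounded to 2 decimals."""
    if opening_m2 > gross_m2:
        raise ValueError("opening area exceeds gross area")
    return round(gross_m2 - opening_m2, 2)


# (row number, gross area, opening area) copied from Table 4.
rows_with_openings = [
    (6, 87.10, 48.89),
    (7, 284.09, 122.22),
    (14, 135.31, 24.44),
    (16, 518.64, 308.61),
]

net_by_row = {sr: net_wall_area(gross, opening)
              for sr, gross, opening in rows_with_openings}
```

Rows 6, 7, and 16 reproduce the tabulated values (38.21, 161.87, and 210.03 m²). Row 14 gives 110.87 m² from the rounded inputs, against 110.86 m² in the table; the 0.01 m² difference is presumably because the reported value was computed from unrounded areas.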

Table 5. Case 1 comparison of direct cost item retrieval and proposed RSMeans matching for the exterior wall assembly.

Workflow Output	RSMeans Entry	Unit and Cost
Direct cost item retrieval	05 41 13 Load-Bearing Metal Stud Framing	L.F.; USD 39.50/L.F.
Proposed method: wall backup match	09 21 16 Gypsum Board Assemblies: exterior gypsum sheathing, interior gypsum, insulation, and metal studs	S.F.; USD 5.75/S.F.
Proposed method: cladding match	07 42 13 Metal Wall Panels: steel siding with 2 in. insulation and baked-enamel exterior	S.F.; USD 19.65/S.F.

Table 6. Case 2 comparison of direct cost item retrieval and proposed RSMeans matching for the single flush passage door.

Workflow Output	RSMeans Entry	Unit and Cost
Direct cost item retrieval	08 71 13 Automatic Door Operators: commercial automatic opener for single-swing door	Per opening; USD 6475
Proposed method: final match	08 14 16 Flush Wood Doors: smooth birch-face wood door, 1-3/8 in., 3 ft by 7 ft	Ea.; USD 191
Proposed method: secondary assembly candidate	08 17 23 Integrated Wood Door Opening Assemblies: interior passage door with solid jamb and birch flush solid-core door	Ea.; USD 418.50

Table 7. Case 3 comparison of direct cost item retrieval and proposed RSMeans matching for the EPDM roof assembly.

Workflow Output	RSMeans Entry	Unit and Cost
Direct cost item retrieval	07 53 23.20 EPDM membrane only, 45 mil	Sq.; USD 55/Sq.
Proposed method: membrane match	07 53 23.20 Ethylene-propylene-diene-monomer roofing, 45 mil, mechanically attached	Sq.; USD 161/Sq.
Proposed method: insulation match	07 22 16.10 Roof deck insulation, polyisocyanurate, 3 in. thick	S.F.; USD 1.12/S.F.
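
The entries in Table 7 mix two unit bases: the membrane records are priced per roofing square (Sq.) and the insulation record per square foot (S.F.). The Python sketch below puts them on a common S.F. basis. It assumes the standard convention that one roofing square equals 100 S.F., and it assumes, purely for illustration, that the membrane and insulation lines apply to the same roof area; the combined rate is not a figure reported in the paper.

```python
# Conversion to square feet; 1 Sq. = 100 S.F. is the usual roofing convention.
SF_PER_UNIT = {"S.F.": 1.0, "Sq.": 100.0}


def rate_per_sf(rate_usd, unit):
    """Convert a unit rate to USD per square foot."""
    return rate_usd / SF_PER_UNIT[unit]


# Unit rates copied from Table 7.
direct = rate_per_sf(55.0, "Sq.")                 # membrane only
membrane = rate_per_sf(161.0, "Sq.")              # mechanically attached EPDM
insulation = rate_per_sf(1.12, "S.F.")            # 3 in. polyisocyanurate
proposed_combined = membrane + insulation         # illustrative sum, same area assumed
```

On this basis the direct retrieval result corresponds to USD 0.55/S.F., while the two proposed lines correspond to USD 1.61/S.F. and USD 1.12/S.F., which shows why unit normalization has to precede any comparison between candidates.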

Table 8. Case 4 comparison of the full workflow and no-thickness-check variant for wall item selection.

Workflow Variant	Thickness Check	Selected RSMeans Entry	Interpretation
No thickness check	Removed	09 21 16.33-6600 Partition wall: fire resistant, 2 layers, 2 h., metal studs, 6 in. wide, 16 in. O.C.; S.F.; USD 5.22/S.F.	Thickness was not used to narrow similar fire-rated metal stud partition records.
Full workflow	Retained	09 21 16.33-6400 Partition wall: fire resistant, 2 layers, 2 h., metal studs, 3-5/8 in. wide, 16 in. O.C.; S.F.; USD 4.84/S.F.	The selected item better matches the BIM thickness and fire rating cues.
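
The comparison in Table 8 depends on comparing like with like: the BIM name reports the overall partition thickness (6 1/8 in.), whereas the RSMeans records report the stud width. Using the layer build-up described in the Figure 14 caption, the worked check below derives the implied stud width and measures each candidate against it. This is a minimal Python sketch with hypothetical helper names, not the paper's thickness check.

```python
from fractions import Fraction


def stud_width_in(total_in, layers_per_side, board_in):
    """Stud width implied by overall thickness and symmetric board layers."""
    return total_in - 2 * layers_per_side * board_in


total = Fraction(6) + Fraction(1, 8)      # 6 1/8 in. overall partition
board = Fraction(5, 8)                    # 5/8 in. gypsum board
implied = stud_width_in(total, 2, board)  # 49/8 - 20/8 = 29/8 in.

# Stud widths stated in the two candidate records of Table 8.
candidates = {
    "09 21 16.33-6400": Fraction(3) + Fraction(5, 8),
    "09 21 16.33-6600": Fraction(6),
}

# Absolute gap between each candidate's stud width and the implied one, in mm.
gap_mm = {code: float(abs(width - implied)) * 25.4
          for code, width in candidates.items()}
best = min(gap_mm, key=gap_mm.get)
```

The implied stud width is 3 5/8 in., so record 6400 has a zero gap, while record 6600 is about 60.3 mm off, which lies beyond the widest 40 mm breakpoint in Table 1 and would therefore earn no thickness bonus under that table's tiers.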

Table 9. Component-level total cost estimates (USD) across matching strategies.

IFC Entity	Count	Direct Cost Item Retrieval	Qwen 1.5B	Qwen 3B	Phi-4-mini
IfcBuildingElementProxy	995	0	250,076	0	250,076
IfcColumn	118	357,265	357,265	357,265	357,265
IfcCovering	68	205,401	205,401	205,401	205,401
IfcDoor	132	99,067	98,950	98,950	98,950
IfcFlowTerminal	607	0	178,164	178,164	178,164
IfcMember	1625	0	643,950	643,950	643,950
IfcPlate	480	70,269	69,721	69,721	69,721
IfcRailing	142	53,144	53,144	0	53,144
IfcRampFlight	2	1908	1667	1667	1667
IfcRoof	20	302,374	302,374	302,374	302,374
IfcSlab	230	2,757,970	2,757,970	2,757,970	2,757,970
IfcStair	31	320	320	320	320
IfcStairFlight	49	33,490	35,623	35,623	35,623
IfcTransportElement	2	15,152	16,932	16,932	16,932
IfcWall	1078	1,376,953	1,374,638	1,374,638	1,374,638
IfcWallStandardCase	902	793,907	793,941	793,941	793,941
IfcWindow	68	45,231	44,970	44,970	44,970
Total (USD)		6,112,448	7,270,121	6,966,901	7,270,121
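
One way to read pricing coverage from Table 9 is to count, for each workflow variant, the listed IFC categories that received a non-zero cost. The Python sketch below does this for the 17 rows shown in the table. Treating a zero total as "unpriced" is an assumption made here, and the paper's own coverage measure may be defined differently; the function name is hypothetical.

```python
# Category totals in USD copied from Table 9, in the column order of VARIANTS.
VARIANTS = ("Direct retrieval", "Qwen 1.5B", "Qwen 3B", "Phi-4-mini")
TABLE_9 = {
    "IfcBuildingElementProxy": (0, 250076, 0, 250076),
    "IfcColumn": (357265, 357265, 357265, 357265),
    "IfcCovering": (205401, 205401, 205401, 205401),
    "IfcDoor": (99067, 98950, 98950, 98950),
    "IfcFlowTerminal": (0, 178164, 178164, 178164),
    "IfcMember": (0, 643950, 643950, 643950),
    "IfcPlate": (70269, 69721, 69721, 69721),
    "IfcRailing": (53144, 53144, 0, 53144),
    "IfcRampFlight": (1908, 1667, 1667, 1667),
    "IfcRoof": (302374, 302374, 302374, 302374),
    "IfcSlab": (2757970, 2757970, 2757970, 2757970),
    "IfcStair": (320, 320, 320, 320),
    "IfcStairFlight": (33490, 35623, 35623, 35623),
    "IfcTransportElement": (15152, 16932, 16932, 16932),
    "IfcWall": (1376953, 1374638, 1374638, 1374638),
    "IfcWallStandardCase": (793907, 793941, 793941, 793941),
    "IfcWindow": (45231, 44970, 44970, 44970),
}


def unpriced_categories(table, variants):
    """Map each variant to the sorted list of categories with a zero total."""
    result = {name: [] for name in variants}
    for category, costs in table.items():
        for name, cost in zip(variants, costs):
            if cost == 0:
                result[name].append(category)
    return {name: sorted(cats) for name, cats in result.items()}


unpriced = unpriced_categories(TABLE_9, VARIANTS)
priced_count = {name: len(TABLE_9) - len(cats) for name, cats in unpriced.items()}
```

Under this reading, direct retrieval prices 14 of the 17 listed categories (IfcBuildingElementProxy, IfcFlowTerminal, and IfcMember remain unpriced), Qwen 3B prices 15, and Qwen 1.5B and Phi-4-mini price all 17.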

