1. Introduction
Smart home building data management has become a global concern as cities pursue low-carbon operation, resilient housing, and accountable information flows across the building lifecycle. A smart home here refers to a residential building where sensors, devices, and data services are systematically embedded to provide data-driven support for occupant experience, energy efficiency, and operations and maintenance [1,2]. Recent reviews and practice reports highlight digital-twin adoption—especially for operations and energy analytics—but evidence from the design stage remains limited, particularly on object-level identity governance, consumption-side handover, and auditable processes [3,4,5,6]. This study responds by focusing on design-stage digital-twin assets and game engine interoperability in smart home projects and by proposing a traceability-oriented governance workflow based on Industry Foundation Classes Globally Unique Identifier-22 (IfcGUID-22) compliance to keep cross-system data chains auditable from the design model to downstream visualization and game engine environments.
In this context, digital-twin assets denote object-level digital representations on the consumption side—such as visualization and game engines and operations-and-maintenance platforms—that can be used directly for rendering, interaction, and business linkage and that act as data carriers connecting the design-stage model with downstream applications [3,4]. In practice, once the consumption side (for example, the representative game engine Unity) accumulates non-standard data—object identifiers that are inconsistent, missing, or invalid—cross-system reconciliation and traceability are quickly blocked, leading to rework and a broken chain of evidence. This research therefore targets the non-standard data that emerge when digital-twin assets are generated at the design stage of smart home projects and proposes and validates a practical “non-standard to standard” workflow. The aim is to convert consumption-side non-standard data into standard-compliant, auditable data streams already at the design stage, serving Building Information Modeling (BIM) and building data management in the Common Data Environment (CDE), as well as visualization and engine teams and owners and auditors, by reducing data chain breakage and rework at the source. Industry studies likewise show that the cost of such “bad data” is substantial [7,8].
Building data management for smart homes concerns the organization, quality control, and traceability-oriented governance of lifecycle models, metadata, and versions. Its management anchor is the CDE: under ISO 19650 [9] and the UK BIM Framework [10], each information container should have a unique identifier, follow agreed naming conventions, and pass through controlled processes for creation, review, and release [9,10]. In this study, we implement traceability-oriented governance in the CDE by registering paths, timestamps, and checksums for key inputs and outputs, measurement configurations, and results so that every reported number along the design-stage digital-twin asset workflow can be traced back to a concrete file and command [11].
Cross-system consistent identification in building data management depends on robust object-level primary keys. At the open-standards layer, buildingSMART’s Industry Foundation Classes (IFC, ISO 16739-1) provide the international standard for cross-platform exchange and sharing [12]. Each IfcRoot object carries an object key, IfcGUID, with a fixed 22-character exchange representation, and toolchains are expected to convert this consistently to and from the underlying globally unique identifier (GUID), a property we refer to as IfcGUID compliance [12,13]. In this study, IfcGUID compliance covers four aspects: existence, format validity, uniqueness, and cross-version stability. If any of these fail, cross-system comparison and reuse collapse into pseudo-comparisons of “the same object with different identifiers (IDs)” [14,15].
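The fixed 22-character exchange representation and its round trip to the underlying GUID can be sketched in a few lines. The helpers below are our own illustrative implementation (production toolchains typically rely on equivalent compress/expand helpers in ifcopenshell's guid module); they use the 64-character alphabet defined by the IFC specification, and the format check exploits the fact that 22 six-bit digits hold 132 bits, so the first digit can only take the values 0–3.

```python
import uuid

# 64-character alphabet defined by the IFC specification for the
# 22-character GlobalId exchange representation.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_$"


def is_valid_ifcguid(s: str) -> bool:
    """Format check: length 22, legal alphabet, first digit <= 3
    (22 x 6 bits = 132 bits, only 2 bits fit in the first digit)."""
    return (
        isinstance(s, str)
        and len(s) == 22
        and all(c in ALPHABET for c in s)
        and ALPHABET.index(s[0]) < 4
    )


def uuid_to_ifcguid(u: uuid.UUID) -> str:
    """Encode the 128-bit GUID as a big-endian base-64 number with
    22 digits; this matches the byte-group encoding in the IFC spec."""
    n = u.int
    digits = []
    for _ in range(22):
        digits.append(ALPHABET[n % 64])
        n //= 64
    return "".join(reversed(digits))


def ifcguid_to_uuid(s: str) -> uuid.UUID:
    """Decode a 22-character IfcGUID back to the underlying GUID."""
    if not is_valid_ifcguid(s):
        raise ValueError(f"not a valid IfcGUID-22: {s!r}")
    n = 0
    for c in s:
        n = n * 64 + ALPHABET.index(c)
    return uuid.UUID(int=n)
```

A toolchain is IfcGUID-compliant in the format sense when every GlobalId it emits round-trips losslessly through such a codec.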
On the consumption side of smart homes, game engines act as real-time 3D platforms that integrate multi-source data such as BIM and IoT for visualization, simulation, and interactive presentation to support collaborative decision-making; Unity is a representative engine widely used in digital-twin integration scenarios [16,17]. Game engine interoperability is therefore a critical part of smart home digital-twin assets. However, engine-side identifiers are typically project-internal: for example, Unity writes an asset’s unique identifier (ID) into a local meta file, and if that file is lost or recreated, a new GUID is generated, which makes it inherently unsuitable as a minimal cross-platform mutual recognition key [18,19]. It is thus necessary to bridge engine assets to IfcGUID-22 and to register that mapping, together with its context, in the CDE so that references remain consistent and replayable along the “IFC to engine assets to audit and reconciliation” chain [9,12,13,18].
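As an illustration of why the engine-side identifier needs bridging, the snippet below reads the project-local GUID that Unity records in an asset's .meta file (a YAML text file containing a `guid:` line) and pairs it with the stable IfcGUID-22 key. The record layout is a hypothetical sketch of one row of such a bridge table, not a fixed schema from this study.

```python
import re
from pathlib import Path

# Unity stores a 32-hex-digit project-local GUID in each asset's .meta
# file; if the .meta file is lost, Unity regenerates a new GUID, which
# is why this value cannot serve as a cross-platform primary key.
GUID_LINE = re.compile(r"^guid:\s*([0-9a-f]{32})\s*$", re.MULTILINE)


def read_unity_guid(meta_path: Path):
    """Return the asset GUID recorded in a Unity .meta file, if any."""
    match = GUID_LINE.search(meta_path.read_text(encoding="utf-8"))
    return match.group(1) if match else None


def bridge_record(ifc_guid: str, meta_path: Path) -> dict:
    """One illustrative row of an IFC -> Unity bridge table to be
    registered in the CDE (field names are our own)."""
    return {
        "ifc_guid": ifc_guid,                    # stable IfcGUID-22 key
        "unity_guid": read_unity_guid(meta_path),  # volatile engine key
        "unity_meta": str(meta_path),
    }
```

Registering such rows makes the IfcGUID, not the regenerable Unity GUID, the primary key along the chain.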
Existing reviews and practice largely emphasize digital-twin research and evidence on the operations and energy side; integrated mechanisms that connect object key governance, consumption-side handover, and evidencing at the design stage are under-reported [3,4,5,6]. BIM–engine integration studies also tend to focus on visualization and interaction rather than on quantitative, auditable ID–mapping–verification pipelines and measurement conventions [16,17]. Accordingly, our entry point is to embed IfcGUID compliance and traceability-oriented governance at the design stage and to close the loop via engine-side handover and cross-system alignment, turning “source-side compliance (IFC and IfcGUID) to consumption-side acceptance (engine and assets) to cross-system reconciliation (CDE and audit)” into executable processes with evidence [9,11,12,13,18].
Based on the above, this study addresses two research questions:
RQ1 (governance): At the design stage, how can multi-source data be incorporated into a controlled BIM-standard project environment and managed with unified, stable object identity across teams, systems, and versions so that they can be aligned and compared and support accountable traceability, thereby reducing data chain breakage and rework caused by non-standard data? [9,10]
RQ2 (engineering): How can a verifiable, low-cost, and transferable process convert consumption-side non-standard data into compliant data streams, create a complete chain of evidence, and support sustained reuse across projects and version evolution? [11,13,14,18]
To answer these questions, we build a virtual experimental platform aligned with design-stage building data management requirements for smart homes. We freeze data assets, lock environments and execution entry points, and register hashes for key executables, configurations, and all inputs/outputs. Using a unified denominator U0, we define and compute four integrity metrics—completeness, validity, uniqueness, and stability—to form the pre-test baseline. We then implement an IfcGUID↔engine bridging and minimal repair mechanism and conduct same-convention post-tests with paired reruns, reporting changes in the metrics as absolute percentage-point differences under a stable denominator. This end-to-end, evidence-oriented pipeline is validated in a virtual smart home setting and is designed to be reproducible within existing toolchains.
3. Materials and Methods
In terms of methodology, we adopt a staged design that reflects how design-stage digital-twin assets are created and managed in smart homes. First, based on the use scenarios defined earlier, we construct a virtual smart home experimental object and build its design-stage digital twin using a mainstream game engine (Unity), achieving a unified and controllable representation of both the BIM model and the digital-twin assets. Second, we freeze all BIM and digital-twin assets produced at this stage and treat them as a unified denominator U0, providing a fixed object set for subsequent measurement and comparison. Third, in the pre-test, we apply four IFC-based integrity metrics—completeness, validity, uniqueness, and stability—under the unified denominator U0 to quantify data chain breaks and identify weaknesses in identifier management. Based on these pre-test findings, we then propose, in the governance workflow section, an IfcGUID-oriented bridging and management process to carry out targeted governance at the design stage. Finally, in a separate post-test, we rerun the same metrics and interoperability checks under the same denominator U0 and the same measurement conventions to evaluate whether the workflow effectively reduces data chain break risks and improves identifier compliance and game engine interoperability. The research framework is shown in Figure 1.
3.1. Experimental Platform Setup and Metric Definition
This subsection describes how we construct the experimental platform and define the metrics used to evaluate IfcGUID-based compliance and interoperability at the design stage. First, we outline the scope and construction of a virtual smart home environment and its digital-twin assets so that the building model and the visualization/interaction side can be studied in a controlled setting (Section 3.1.1). Second, we explain how we freeze the export and consumption pipelines and designate IFC as the single source of truth, ensuring that design-stage BIM and digital-twin assets form a stable object set for pre- and post-tests (Section 3.1.2). Third, we define the unified denominator U0 and the associated metrics—completeness, validity, uniqueness, stability, and interoperability indicators (BRR, DR)—which serve as the quantitative basis for all subsequent analyses.
3.1.1. Scope Definition and Platform Construction
To support comparable IfcGUID measurement and re-testing, we build the virtual smart home platform in stages—defining requirements, fixing the scope, designing the layout, modeling, and generating digital-twin assets—while keeping each step feasible and reproducible.
First, we define the requirements. Based on the interaction tasks sampled in Section 2, we construct a virtual smart home scenario that can be verified in a closed loop at the design stage. The platform focuses on four systems—security and access, lighting, entertainment, and communication—to ensure the coverage of typical interactions while remaining implementable. HVAC, which depends more on operational IoT data and control strategies, is excluded at this stage.
Next, we fix the scope. To avoid denominator drift between the pre-test and post-test, objects are partitioned into a unified denominator U0, which enters the measurement and repair loop, and an extended set used for coverage statistics. Both sets are defined through an asset list and a boundary configuration stored in the Common Data Environment (CDE), and the same conventions are reused in Section 3.3 and Section 3.4.
After completing the floor layout and system placement, we select two representative spaces as the testbed: the Entrance and the Multimedia Room. The Entrance covers access control, security, and lighting with presence-based linkage; the Multimedia Room covers audio-visual, lighting, and communication linkages. Together they cover the main interaction combinations listed in Table 2 while keeping scale and complexity within implementable bounds.
Figure 2 illustrates the selection of the Entrance and Multimedia Room as typical spaces covering access, lighting, AV, and communication interactions, and Figure 3 shows the system design for these spaces in the virtual smart home experiment platform.
Finally, we build the BIM model and the digital-twin assets. After modeling in SketchUp with unified naming and hierarchy, we export IFC as the sole data source; in Unity, we construct the corresponding interactive scenes and object mappings to form the runtime environment for measurement and re-testing. Object references are registered in the CDE and reused under the same conventions in subsequent stages. The CDE directory map and registry roles are summarized in Appendix B.1.1 and Appendix B.1. Figure 4 and Figure 5 present the live digital-twin demonstration scene used to validate interactive behaviors and provenance logging under the frozen environment.
3.1.2. Export Pipeline and Asset Freezing
To ensure that measurements are comparable and auditable, we apply a one-time freeze to design-stage digital-twin assets. Project coordinates, units, and naming are first harmonized and then locked, and an asset inventory and a data boundary are exported to define the evaluation set U0. The export pipeline is fixed as SketchUp → IFC with a single, consistent strategy so that IFC serves as the sole data source. Under identical environment and configuration settings, two baselines (v1/v2) are exported to support subsequent checks on GlobalId stability. All write operations are confined to tool-generated containers, leaving the original project files unchanged. In a single controlled registration step, we record the baselines, inventories, environment snapshots, and checksums in the CDE to support reruns and audits. Key baselines and locks (coordinates/units, naming and hierarchy, export channel, environment snapshot, and CDE register) are summarized in Table 3, and full environment pins and checksums are provided in Table A1, Appendix A.2 and Appendix A.3.
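The freeze-and-register step described above can be sketched as a small routine that computes SHA-256 checksums for the frozen files and writes a manifest into a register file. Function names and manifest fields here are illustrative, not the project's actual registry schema; the essential property is that rerunning against the same files reproduces the same checksums.

```python
import hashlib
import json
import time
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream a file through SHA-256 to obtain its checksum."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def register_freeze(paths, registry: Path) -> dict:
    """Write a manifest of path/size/checksum entries for the frozen
    assets; identical inputs must yield identical checksums, which is
    the basis for audit and paired reruns."""
    manifest = {
        "frozen_at": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "entries": [
            {"path": str(p), "bytes": p.stat().st_size, "sha256": sha256_of(p)}
            for p in sorted(paths)
        ],
    }
    registry.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return manifest
```

In this study's terms, the registry file would itself live in a controlled CDE container alongside the environment snapshot.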
3.1.3. Metric Definitions and Statistics
To obtain comparable and reproducible conclusions in the pre-test and post-test, we define a unified denominator and a compact set of metrics that quantify IfcGUID-based integrity and interoperability in a consistent way. All metrics are reported as proportions in [0, 1]. Four core metrics describe the integrity of IfcGUID on the IFC side—completeness, validity, uniqueness, and stability—while two additional indicators summarize interoperability between IFC and Unity: the bridge recognition rate (BRR) and the disconnect rate (DR). The four integrity metrics follow the common data quality and identifier governance dimensions used in BIM/FM practice—completeness, validity, uniqueness, and stability—as summarized in Section 2, and they are here specialized to the semantics of IfcGUID.
We use a unified denominator U0 to represent the set of objects that fall within the evaluation scope (Section 3.1.1) and are expected to carry a valid identifier. All IFC-side integrity metrics are computed on U0. Interoperability metrics are computed on the set of object pairs that are intended to be bridged between IFC and Unity (the mapping set) and are reported in parallel.
R1 Completeness (C).
Meaning: The proportion of objects in U0 that have a resolvable GlobalId.
Definition: The number of objects in U0 whose GlobalId is present and can be parsed into a valid IfcGUID-22 exchange string, divided by the total number of objects in U0.
R2 Validity (V).
Meaning: The proportion of objects in U0 whose GlobalId satisfies basic format and standard checks.
Definition: The number of objects in U0 whose GlobalId has the correct length and character set and passes IfcOpenShell/IFC validation, divided by |U0|.
R3 Uniqueness (U).
Meaning: The proportion of objects in U0 that are not involved in any GlobalId duplication.
Definition: The number of objects in U0 whose GlobalId occurs exactly once within U0, divided by |U0|. When duplicate clusters exist, all members of a cluster are counted as non-unique.
R4 Stability (S).
Meaning: The proportion of objects in U0 whose GlobalId remains stable across exports.
Definition: Given two IFC baselines (v1 and v2) exported under identical settings, the number of objects in U0 whose GlobalId is identical in v1 and v2, divided by |U0|.
Interoperability is summarized over the explicit mapping between IFC objects and Unity assets. Each mapping entry links one IfcGUID to one Unity asset GUID. For this mapping set, we define the following:
BRR (Bridge Recognition Rate).
Meaning: The share of mapping entries where the Unity asset can be successfully resolved and recognized in the project.
Definition: The number of mapping entries whose Unity GUID can be resolved to a valid asset/scene reference, divided by the total number of entries in the mapping set.
DR (Disconnect Rate).
Meaning: The share of mapping entries where the Unity side is missing, i.e., a planned link from IFC to Unity is broken.
Definition: The number of mapping entries whose Unity asset is missing or cannot be resolved, divided by the total number of entries in the mapping set.
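Under these definitions, all six indicators reduce to a few set operations. The sketch below assumes each baseline is given as a mapping from a stable object key to its GlobalId string (None when missing) and each mapping entry carries a `resolved` flag from the engine-side check; these data shapes and names are our illustrative assumptions, not the study's actual file formats.

```python
from collections import Counter

ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_$"


def is_valid_ifcguid(s):
    """Basic format check for the 22-character exchange string."""
    return (isinstance(s, str) and len(s) == 22
            and all(c in ALPHABET for c in s) and ALPHABET.index(s[0]) < 4)


def integrity_metrics(v1, v2):
    """Compute C/V/U/S over the unified denominator U0, here taken as
    the keys of the v1 baseline; v2 is the paired re-export."""
    u0 = list(v1)
    n = len(u0)
    present = [k for k in u0 if v1[k]]                       # C numerator
    valid = [k for k in present if is_valid_ifcguid(v1[k])]  # V numerator
    counts = Counter(v1[k] for k in present)
    # Whole duplicate clusters count as non-unique, per R3.
    unique = [k for k in present if counts[v1[k]] == 1]
    stable = [k for k in u0 if v1[k] and v1[k] == v2.get(k)]
    return {"C": len(present) / n, "V": len(valid) / n,
            "U": len(unique) / n, "S": len(stable) / n}


def interoperability_metrics(mapping):
    """BRR and DR over the mapping set: each entry links one IfcGUID to
    a Unity GUID plus a `resolved` flag from the engine-side check."""
    total = len(mapping)
    hits = sum(1 for e in mapping if e["resolved"])
    return {"BRR": hits / total, "DR": (total - hits) / total}
```

Because both functions are pure and read-only, rerunning them on the frozen inputs necessarily reproduces the baseline values.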
For pre-/post-test comparison, we keep the denominator U0, the interpreter, and the measurement functions unchanged. Changes in C/V/U/S and interoperability metrics are reported as absolute percentage-point differences between Pre-test and Post-test, accompanied by the corresponding counts. The formal statements of metric invariants and paired rerun matching rules are given in Appendix C.
3.2. Pre-Test (Baseline)
This section establishes a baseline measurement without altering the source data. Under the unified denominator U0, we compute the four IfcGUID integrity metrics (C/V/U/S) and summarize interoperability, using a consistent workflow of extracting, validating, aligning, measuring, summarizing, and registering the results. This baseline is later used for pre–post comparison and for auditable reruns.
We first establish an IfcGUID compliance baseline without any repair. Within the frozen object scope defined in Section 3.1 and the controlled export pipeline, we generate two IFC baselines (v1 and v2) and, under the same denominator U0, compute completeness, validity, uniqueness, and stability while reporting an IFC-side comparator in parallel. This stage corresponds to the “pre” condition; the “post” recomputation in Section 3.4 follows the same conventions.
Execution in the pre-test is strictly read-only. All inputs are taken from controlled information containers in the CDE so that the statistical denominator, export settings, and runtime environment remain identical; no writes are made to the IFC or Unity sources. An automated routine loads the IFC baselines and the Unity inventories, detects missing and invalid identifiers, checks the reversibility of IfcGUID-22, identifies duplicate clusters to support the uniqueness metric, and aligns IFC objects with Unity assets to produce an object-level table. Summary-level metrics are then computed, and both input fingerprints and output indices are registered so that paired reruns remain auditable.
3.2.1. Pre-Test Pipeline
The pre-test pipeline implements five steps—Detect, Normalize, Align, Measure, and Register—in a single automated routine:
Detect. Load the two IFC baselines and the controlled lists (asset list, scene list, object boundary); parse IfcRoot.GlobalId, identify missing and invalid values, and locate duplicate clusters, outputting object-level flags.
Normalize. Standardize the representations of IfcGUID and engine-side GUIDs (e.g., case, separators, whitespace) so that set operations and matching follow a single convention and the unified denominator is fixed consistently.
Align. Perform explicit IfcGUID → Unity GUID alignment to obtain the hit set; verify that referenced assets and scenes exist and are resolvable; generate an object-level alignment table; and assign hit/missing reasons for the later interpretation of the BRR and DR.
Measure. Compute C/V/U/S on the unified denominator U0 and report the interoperability summaries in parallel; all calculations are read-only and do not modify the IFC or Unity sources.
Register. Record input/output manifests and hashes; attach timestamps, operator, and run configuration; update the CDE index; and output object-level tables and summary reports to support paired reruns and auditable reproduction.
A reproducibility checklist is provided in Appendix A.3, and field semantics are summarized in Appendix B.3. Acceptance gates and relevant IDS rule excerpts referenced in the pre-test are listed in Appendix C (Table A6). Figure 6 outlines the pre-test metric pipeline.
3.2.2. Execution and Outputs
The pre-test routine is executed within the CDE under a locked environment snapshot, using the frozen IFC baselines, Unity inventories, and configuration files as inputs. The routine runs the full “extract–validate–align–measure–register” sequence in a single pass and produces both object-level and summary-level artifacts. These outputs, together with their paths and checksums, are registered in the CDE so that the pre-test can be rerun under identical conditions. Core file types and their roles are documented in Appendix A, and the corresponding schemas and data dictionaries are provided in Appendix B.
3.3. Bridging Game Engine Digital-Twin Assets and Minimal Repair
Under the unchanged pre-test conventions, we design a self-check and batch repair workflow. The objective is to keep IfcRoot.GlobalId stable, complete the mapping keys and consistency metadata, and produce auditable repair artifacts and mapping lists, thereby supplying like-for-like inputs for the subsequent post-test.
3.3.1. Where the Chain Breaks: Pre-Test Insights
When a game engine is introduced to produce design-stage digital-twin assets for the smart home, the main data chain problems appear not in the IFC model itself but at the point where data are handed over to the engine.
The pre-test shows that most object-level identifiers on the IFC side are present and stable: IfcGUID is largely complete and consistent across the two IFC baselines. However, the chain from IFC to Unity is often broken. In practice, three typical situations dominate:
No explicit mapping from IFC objects to Unity assets: The engine-side project uses its own internal identifiers, but there is no stable key that tells the engine which asset corresponds to which IfcGUID.
Missing or moved Unity resources: Some assets referenced in the intended mapping are no longer present or no longer resolvable in the Unity project, even though the IFC objects themselves exist and are well-formed.
Unclear or unresolved paths: Asset and scene paths are stored in a way that cannot be resolved under the current environment, so links cannot be followed or verified.
In addition, the pre-test reveals a small number of non-standard identifier encodings and duplicate clusters on the IFC side. These records do not necessarily break the model, but they undermine the assumptions behind uniqueness and stability and therefore must be detected and clearly flagged. Importantly, all of this must be performed without changing the object primary key (IfcRoot.GlobalId); otherwise cross-version traceability would be lost.
From the perspective of smart home digital-twin design, these findings mean that the “data chain” breaks between the BIM model and the game engine: the model has IDs, and the engine has assets, but the bridge between the two is weak, partial, or undocumented. The goal of this section is therefore to construct a governance and repair workflow that (i) restores this bridge in a controlled way and (ii) does so without damaging the original IFC identifiers.
3.3.2. Governance and Minimal Repair Pipeline
To address the above problems, we implement a bridging and minimal repair workflow that treats the game engine as a consumer of design-stage data and adds a thin but explicit governance layer between IFC and Unity. The workflow has four practical objectives:
Standardize IfcGUID representations so that identifiers can be compared reliably across tools.
Detect and label duplicate clusters so that uniqueness problems are visible and can be managed.
Establish a verifiable IFC–Unity object mapping, making clear for each object whether it is recognized by the engine or not.
Record repair actions and mapping lists with CDE registration so that changes are auditable and can be rerun.
Concretely, the workflow proceeds in five steps:
Diagnose issues.
Under the frozen conventions of the pre-test, we rerun read-only checks on the IFC baselines and controlled lists, confirming for each object whether its identifier is missing, invalid, or duplicated and whether a planned link to Unity is present, missing, or unresolved. This step turns the pre-test findings into a concrete list of problem cases.
Normalize identifiers.
We then standardize the representation of IfcGUID to a fixed exchange form (e.g., consistent 22-character string, unified case, and whitespace handling). This does not change IfcRoot.GlobalId itself, but it ensures that any comparison or matching across software uses the same representation, reducing spurious mismatches caused by formatting differences.
Build the IFC–Unity bridge.
For objects in scope, we compile an explicit mapping between their IfcGUID and the corresponding Unity asset identifier, using the controlled asset and scene inventories. Each mapping entry is checked to see whether the referenced Unity asset actually exists and whether its path can be resolved. The result is a structured mapping table that records, for every IFC object intended to appear in the digital twin, whether the game engine recognizes it and, if not, why (e.g., no mapping, missing asset, bad path).
Apply minimal repair.
When the problem can be fixed by a clear, rule-based adjustment—for example, filling a missing mapping entry from an unambiguous match or normalizing a non-standard encoding—we apply the smallest possible repair and record it. In all other cases, we leave the source unchanged and keep the issue as a labeled exception. A repaired IFC copy and an external mapping list are produced, both preserving the original IfcRoot.GlobalId values but improving the ability of the game engine to recognize and track objects.
Register evidence.
Finally, we register the inputs, outputs, and run metadata (paths, checksums, timestamps, operator, and configuration) in the CDE. This step turns the bridging and repair actions into an auditable chain of evidence and ensures that the same process can be rerun under identical conditions for verification and for the post-test in Section 3.4.
The whole workflow runs under the controlled environment described in Section 3.1, with the original IFC and Unity projects kept read-only. From a smart home perspective, this means that design-stage digital-twin assets produced by the game engine are now anchored back to stable IFC identifiers through an explicit, documented bridge, reducing data chain breaks while respecting existing modeling practices. The workflow is shown in Figure 7.
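The minimal-repair rule in step 4 can be sketched as a single policy: fill a missing mapping entry only when the match is unambiguous, and leave everything else as a labeled exception. The function below is a hypothetical illustration (the field names such as `unity_guid` and the name-based matching rule are our assumptions, not the paper's exact implementation); note that the IfcGUID primary key is never modified and the input rows are never mutated.

```python
def minimal_repair(mapping, unity_assets_by_name):
    """Apply the smallest rule-based fix per entry: fill a missing Unity
    GUID only when exactly one asset matches the object's name. All
    other cases stay as labeled exceptions; IfcGUIDs are untouched."""
    repaired, exceptions, log = [], [], []
    for entry in mapping:
        entry = dict(entry)  # work on a copy; sources stay read-only
        if entry.get("unity_guid"):
            repaired.append(entry)  # already bridged, nothing to do
            continue
        candidates = unity_assets_by_name.get(entry.get("name"), [])
        if len(candidates) == 1:  # unambiguous match -> minimal repair
            entry["unity_guid"] = candidates[0]
            log.append({"ifc_guid": entry["ifc_guid"],
                        "action": "fill_mapping",
                        "unity_guid": candidates[0]})
            repaired.append(entry)
        else:  # no match or ambiguous match -> labeled exception
            entry["issue"] = "no_mapping" if not candidates else "ambiguous"
            exceptions.append(entry)
    return repaired, exceptions, log
```

The returned log is exactly the kind of repair artifact that, together with the mapping lists, would be registered in the CDE in step 5.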
3.4. Re-Test and Paired Rerun
This section evaluates the effectiveness of the bridging and minimal repair workflow by rerunning the metrics under identical conditions and comparing the pre–post results. The data scope, denominator U0, interpreter, and measurement functions are kept unchanged, and all metrics are recomputed on two input states: the originally frozen assets (pre) and the standardized assets produced in Section 3.3 (post). Any differences can therefore be attributed to the governance and repair workflow rather than to changes in environment or conventions.
The post-test uses the same IFC baselines (v1/v2), the controlled Unity inventories, and the explicit IFC–Unity mappings as inputs. Under the unified denominator U0, we recompute the four integrity metrics (C/V/U/S) and the interoperability indicators (BRR and DR) and then compare pre and post on a like-for-like basis. The results are reported both as absolute values and as percentage-point changes (post−pre), accompanied by the corresponding counts. If reruns under the same settings reproduce the same outputs, the process is considered stable; when differences arise, they can be traced back through the registered inputs, logs, and mapping tables.
Post-Test Pipeline
The post-test and paired comparison follow a lightweight, automated procedure that mirrors the pre-test:
Fix conventions.
We adopt exactly the same denominator U0, object scope, measurement definitions, and environment snapshot as in the pre-test and load both the pre-state (frozen assets) and post-state (standardized assets) as separate inputs.
Rerun measurements.
Using the same interpreter and measurement program, we run the full “extract–validate–align–measure” sequence for the pre- and post-states in turn. For each state, the routine aligns IFC objects with Unity assets using the explicit mappings, then computes C/V/U/S on U0 and derives the interoperability summaries in parallel.
Compute deltas.
For each metric, we compute the change as post minus pre on the same denominator and produce side-by-side tables that list pre values, post values, absolute percentage-point differences, and supporting counts. This provides a compact view of how completeness, validity, uniqueness, stability, and interoperability have improved (or not) under the proposed workflow.
Register outputs.
Finally, we register the pre–post comparison artifacts, together with their paths, checksums, timestamps, operator, and run configuration, in the CDE. This turns the re-test into an auditable, rerunnable step and closes the loop between baseline measurement, governance and repair, and post-test evaluation.
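The delta step above can be sketched as a small helper that, given pre and post metric dictionaries computed on the same denominator, emits side-by-side rows with changes in percentage points (names and the row layout are illustrative):

```python
def paired_deltas(pre, post):
    """Side-by-side pre/post comparison on the same denominator:
    report each metric as pre, post, and the absolute change in
    percentage points (post minus pre)."""
    rows = []
    for name in pre:
        rows.append({
            "metric": name,
            "pre_pct": round(pre[name] * 100.0, 2),
            "post_pct": round(post[name] * 100.0, 2),
            "delta_pp": round((post[name] - pre[name]) * 100.0, 2),
        })
    return rows
```

Because the helper is deterministic, registering its inputs and outputs suffices to make the comparison rerunnable.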
Pass/fail thresholds, acceptance gates, and pairing rules for pre–post comparison are summarized in Appendix C.1 and Appendix C.2.
Figure 8 illustrates the post-test and paired comparison workflow under the unified denominator.
3.5. Use of Generative AI (GenAI) Tools
During method development and manuscript preparation, the authors used ChatGPT (GPT-5 Thinking; OpenAI web app, accessed on 29 September 2025, UTC+8) in a limited way to (i) suggest code refactoring and minor bug fixes for preprocessing scripts and (ii) assist in the translation of technical phrases from Chinese to English for terminology harmonization. All code and text were reviewed and edited by the authors, who take full responsibility for the final content.
4. Results and Discussion
4.1. Design-Stage Compliance Workflow for IfcGUID-Oriented Digital-Twin Assets
Within a controlled information environment, this study establishes a design-stage compliance workflow that treats IfcGUID as the object-level primary key and uses the CDE as the single carrier of the chain of evidence. From a building data management perspective, the workflow gives BIM and CDE managers, game engine and visualization teams, and owners and auditors an executable pathway to assign object-level unique identifiers to digital-twin assets and to register them in an auditable way at the design stage. It operationalizes the ISO 19650/UK BIM Framework requirement for unique identifiers in the CDE, adopts the 22-character exchange representation of IfcGUID-22 as the cross-system key, and bridges it to Unity asset identifiers. In doing so, it reduces cross-system reconciliation breaks and rework caused by bad data and restores the traceability and reuse of smart home digital-twin assets at the design stage.
Workflow details are as follows:
Freezing assets and boundaries. Define the evaluation set U0 and the extended asset set; harmonize coordinates, units, and naming; and output an asset inventory that fixes the denominator and value domain.
Freezing export and consumption pipelines. Use IFC as the sole data source, export two baselines (v1 and v2) under identical settings for robustness and channel variance control, and lock the Unity environment, asset libraries, scene snapshots, and mapping templates.
Automated compliance checking. Under U0 and a single interpreter, load the two IFC baselines (v1 and v2) and the Unity inventories, and then run extract → validate → align → measure → summarize → register in a read-only manner, computing completeness, validity, uniqueness, and stability and recording bridge recognition rates and disconnection points.
Bridging and minimal repair. Normalize Unity identifiers to IfcGUID-22, verify round-trip reversibility, and apply minimal repairs without changing IfcRoot.GlobalId, generating a standardized post-test copy and bridge table and writing them back to the CDE.
Self-test closure. With the same interpreter, measurement program, object set, and denominator, rerun the workflow to obtain post–pre changes in the four metrics and to evaluate primary key stability and channel variance.
Evidence archiving. In a single controlled transaction, register paths, hashes, timestamps, operator, and stage for key inputs, outputs, and configurations so that, given the same inputs and program version, the workflow remains auditable and reproducible.
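As an illustration of the evidence-archiving step, the following minimal Python sketch registers an artifact’s path, SHA-256 checksum, timestamp, operator, and stage as one JSON line in an append-only log. The file names and record layout are illustrative assumptions, not the project’s actual CDE schema.

```python
import hashlib
import json
import time
from pathlib import Path

def sha256sum(path: Path) -> str:
    """Checksum proving a registered artifact is byte-identical on rerun."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def register(registry: Path, artifact: Path, operator: str, stage: str) -> dict:
    """Append one evidence record (path, hash, timestamp, operator, stage)."""
    record = {
        "path": str(artifact),
        "sha256": sha256sum(artifact),
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "operator": operator,
        "stage": stage,
    }
    with registry.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")  # one JSON line per artifact
    return record
```

In a controlled transaction, the same routine would be applied to every key input, output, and configuration file, so that a rerun with identical inputs and program version reproduces identical hashes.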
Figure 9 depicts the IfcGUID-oriented standardization workflow that turns consumption-side non-standard data into compliant, auditable data streams.
4.2. Pre-/Post-Test Data Results
4.2.1. Pre-Test Results
Under the unified denominator U0 = 820, we performed read-only extraction, validation, alignment, and aggregation to obtain the pre-test baseline for the four integrity metrics and the IFC-side comparator S_ifc. Inputs comprised the two IFC baselines and the Unity asset and scene inventories; outputs included object-level records, summary statistics, and interoperability indicators, all registered in the CDE with checksums.
Overall summary (U0 scope).
On the unified denominator U0 = 820, R1 Completeness = 0.0524 (5.24%), R2 Validity = 0.0524 (5.24%), R3 Uniqueness = 0.0512 (5.12%), and R4 Stability = 0.0524 (5.24%), with the IFC-side comparator S_ifc = 1.00. Within U0, all 43 IFC elements in the interoperability set M had a valid 22-character IfcGUID (43 of 43), yet model-level uniqueness over U0 remained low (about 5%). The overall pre-test metrics on the unified denominator U0 are summarized in Table 4.
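For readers who want to reproduce the four metrics, the following sketch shows one way to compute R1–R4 on a unified denominator. The record layout (`guid_v1`/`guid_v2` holding each object’s IfcGUID from the two frozen baselines) is a hypothetical simplification of the paper’s object-level tables, not its actual schema.

```python
import re

# 22-character IfcGUID exchange form: first char 0-3, then 21 chars of the IFC alphabet
GUID22_RE = re.compile(r"^[0-3][0-9A-Za-z_$]{21}$")

def metrics(rows):
    """Compute R1-R4 on the unified denominator U0 = len(rows).

    rows: object-level records with keys 'guid_v1' and 'guid_v2' holding the
    IfcGUID exported from the two frozen baselines (None if absent).
    """
    u0 = len(rows)
    present = [r for r in rows if r["guid_v1"]]                    # R1: key present
    valid = [r for r in present if GUID22_RE.match(r["guid_v1"])]  # R2: legal form
    counts = {}
    for r in valid:
        counts[r["guid_v1"]] = counts.get(r["guid_v1"], 0) + 1
    unique = [r for r in valid if counts[r["guid_v1"]] == 1]       # R3: no duplicates
    stable = [r for r in valid if r["guid_v1"] == r["guid_v2"]]    # R4: cross-version
    return {
        "R1_completeness": len(present) / u0,
        "R2_validity": len(valid) / u0,
        "R3_uniqueness": len(unique) / u0,
        "R4_stability": len(stable) / u0,
    }
```

Because every ratio divides by the same U0, pre- and post-test values computed this way remain directly comparable, which is the property the text relies on.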
Interoperability alignment.
No IFC–Unity pairs were recognized at this stage: total pairs |M| = 43, recognized pairs = 0, bridge recognition rate (BRR) = 0.00, and break rate DR = 1.00. All 43 breaks were classified as missingUnity, with no badPath cases.
Duplicates.
Within M, one duplicate member was detected (in IfcPropertySet; see the object-level table).
By type and by issue.
Among the 43 objects in M, the main IFC types were IfcWall (16), IfcDoor (5), IfcFurnishingElement (5), and IfcWindow (4). For each type within M, R1–R4 = 1.00 (complete, valid, unique, and stable across the two baselines), indicating good required-field quality on the IFC side. Issue composition was dominated by Not_aligned_to_Unity_lists = 43 (≈98% of classified issues; 5.24% of U0); Duplicate_cluster_members = 1 accounted for the remaining ≈2% (0.12% of U0). No illegal length/characters or cross-version instability appeared in the pre-test. The breakdown of these metrics by IFC object type within the interoperability set M is given in Table 5. Issue composition by category in the pre-test is summarized in Table 6.
In practical terms, the pre-test reveals a typical breakpoint: although the IFC side within M is present, valid, unique, and stable, it fails to pair with the Unity asset and scene inventories (BRR = 0.00; DR = 1.00), leaving the interoperability chain fully broken on the consumption side. This sets the priority for the bridging and minimal repair step: first restore Unity-side mapping and recognition, and then address the isolated duplicate member.
Controlled inputs and hash proofs are listed in
Appendix A.2.
4.2.2. Post-Test Results
Under the same denominator U0 and the same interpreter and rules as the pre-test, we recompute the four primary metrics, keeping IFC as the sole data source and maintaining read-only constraints. Relative to the pre-test, the post-test enables upgraded bridging and probes on the interoperability side, expanding the enumeration/recording of consumption-side object pairs and introducing finer recognition/break statistics and provenance/attribution outputs. This yields a more complete interoperability view without changing the definitions or conventions of R1 Completeness, R2 Validity, R3 Uniqueness, or R4 Stability, ensuring direct comparability with the pre-test.
The coverage of the interoperability set is expanded from the pre-test’s small subset to a full scan of consumption-side object pairs. Newly recorded are the bridging recognition rate, break rate, recognized-in-scenes count, and total pairs, together with object-level interoperability details, provenance summaries, and a Pareto list of issues. The four primary metrics continue to be reported on the unified denominator U0, while interoperability metrics are reported on the expanded set of object pairs.
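The expanded bridging pass can be sketched as follows: a Unity asset GUID is read from the asset’s `.meta` file (Unity stores it as a 32-character hex string), looked up against the expected IFC–Unity pairs, and BRR/DR are counted over the result. The pairing logic and key names below are illustrative assumptions; badPath detection is omitted for brevity.

```python
import re

# Unity .meta files carry the asset's GUID on a "guid:" line as 32 hex characters
META_GUID_RE = re.compile(r"^guid:\s*([0-9a-f]{32})\s*$", re.MULTILINE)

def unity_meta_guid(meta_text: str):
    """Extract the asset GUID from the text of a Unity .meta file (None if absent)."""
    m = META_GUID_RE.search(meta_text)
    return m.group(1) if m else None

def bridge_stats(pairs, unity_inventory):
    """Count recognized pairs and breaks over the expected IFC-Unity pairs.

    pairs: iterable of (ifcguid22, unity_guid_or_None) expected object pairs.
    unity_inventory: set of Unity asset GUIDs actually present in the project.
    badPath checks (unresolvable asset paths) are omitted in this sketch.
    """
    pairs = list(pairs)
    recognized = 0
    breaks = {"missingUnity": 0, "badPath": 0}
    for _ifc_guid, unity_guid in pairs:
        if unity_guid is not None and unity_guid in unity_inventory:
            recognized += 1
        else:
            breaks["missingUnity"] += 1
    total = len(pairs)
    return {
        "pairs_total": total,
        "recognized_pairs": recognized,
        "BRR": recognized / total if total else 0.0,
        "DR": sum(breaks.values()) / total if total else 0.0,
        "breaks": breaks,
    }
```

A full scan in this style yields exactly the quantities reported here: total pairs, recognized pairs, BRR, DR, and the break attribution by category.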
On the unified denominator U0 = 820, the post-test metrics are as follows:
R1 Completeness = 1.00; R2 Validity = 1.00; R3 Uniqueness = 0.98; R4 Stability = 1.00. The parallel IFC-side comparator S_ifc = 1.00. These overall post-test metrics on the unified denominator U
0 are summarized in
Table 7.
On the interoperability side, BRR ≈ 1.00 (777 of 778 pairs recognized); total pairs M = 778; DR ≈ 0.00; total breaks = 1 (missingUnity = 1; badPath = 0); and recognized_in_scenes = 18.
Duplicates are reduced to one remaining member; there are no illegal length/character cases and no cross-version instability.
Within the IFC object set I = 43, each type attains 1.00 on R1, R2, and R4. The only type not reaching 1.00 on R3 is IfcPropertySet (one duplicate member); all other types achieve R3 = 1.00. The breakdown of these post-test metrics by IFC class within I is given in
Table 8. The issue set converges to two categories, Not_aligned_to_Unity_lists = 1 and Duplicate_cluster_members = 1; all others are 0. The issue composition in the post-test is summarized in
Table 9.
Without changing conventions or the denominator, the post-test increases validity, completeness, and stability to full values and narrows residual uniqueness issues to a single member. The interoperability chain is almost fully restored, with only one remaining consumption-side break and effective scene-level recognition established. These results verify the substantive improvement delivered by the bridging and minimal repair strategy on consumption-side interoperability.
4.2.3. Change Metrics and Pass/Fail Determination
Under the same conventions and denominator as the pre-test (definitions and rules for the four primary metrics unchanged), we summarize post vs. pre deltas. All four primary metrics—R1 Completeness, R2 Validity, R3 Uniqueness, and R4 Stability—increase markedly:
R1/R2/R4 show absolute gains of +0.95 (computed as post − pre, likewise below), and R3 gains +0.93. Interoperability indicators also improve substantially: recognized pairs rise from 0 to 777 (scene recognitions to 18), while total breaks fall from 43 to 1 (missingUnity from 43 to 1; badPath remains 0).
Table 10 reports the post–pre delta summary for the four primary metrics,
Table 11 summarizes the interoperability deltas (post–pre), and
Table 12 lists the deltas of the main metrics.
Figure 4,
Figure 5 and
Figure 6 visualize the absolute gains for the four metrics (bar chart). A formal pass/fail check (e.g., thresholds C/V/S = 1.00, U ≥ 0.95, BRR ≥ 0.95, DR ≤ 0.05) can be annotated directly in
Table 10,
Table 11 and
Table 12; in this run, the post-test values meet these stringent gates, with uniqueness falling short of 1.00 only because of one historical duplicate while still exceeding common thresholds.
Under the unified denominator, the workflow increases completeness, validity, and stability to 1.00 and drives uniqueness close to 1.00 (with only one residual duplicate). Interoperability improves from BRR = 0/DR = 1 to approximately BRR ≈ 1/DR ≈ 0, leaving only one explainable breakpoint.
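The quoted gates can be encoded as a simple machine check that annotates the tables automatically. The threshold values follow the text; the dictionary keys and the function itself are illustrative, not part of the published tooling.

```python
def passes_gates(m, eps=1e-9):
    """Gates quoted in the text: C/V/S = 1.00, U >= 0.95, BRR >= 0.95, DR <= 0.05."""
    return (m["R1_completeness"] >= 1.0 - eps
            and m["R2_validity"] >= 1.0 - eps
            and m["R4_stability"] >= 1.0 - eps
            and m["R3_uniqueness"] >= 0.95
            and m["BRR"] >= 0.95
            and m["DR"] <= 0.05)

# Pre- and post-test values as reported in Section 4.2
pre = {"R1_completeness": 0.0524, "R2_validity": 0.0524,
       "R3_uniqueness": 0.0512, "R4_stability": 0.0524,
       "BRR": 0.0, "DR": 1.0}
post = {"R1_completeness": 1.0, "R2_validity": 1.0,
        "R3_uniqueness": 0.98, "R4_stability": 1.0,
        "BRR": 777 / 778, "DR": 1 / 778}
```

With these inputs the pre-test fails every gate and the post-test passes all of them, matching the pass/fail determination above.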
4.3. Improvement Results
From a building data management perspective, the proposed workflow brings design-stage digital-twin assets into a controlled CDE, using consistent container identifiers and provenance to support version reconciliation and evidence archiving and to form a traceable, rerunnable, and auditable chain. This yields concrete, verifiable benefits for different roles: BIM and CDE managers achieve the closed-loop governance of object primary keys at the design stage; engine and visualization teams use IfcGUID-22 as the minimal mutually recognized key to stably map Unity asset identifiers to object identity, enabling cross-system consistent referencing and replay; owners and auditors can compare pre- and post-states under the same conventions, with mapping details, baselines, and logs together constituting a verifiable registration chain that markedly reduces cross-system breaks and rework caused by bad data.
Under the same denominator
U0 = 820, rules, and interpreter as the pre-test, we compare post (after bridging and minimal repair) against pre item by item. Among the four integrity metrics, completeness, validity, and stability (R1/R2/R4) rise from about 0.05 to 1.00, while uniqueness (R3) reaches 0.98, leaving only one historical duplicate (
Table 10). On the interoperability side, the chain changes from fully broken to almost fully connected: the BRR increases from 0.00 to approximately 1.00, with 777 of 778 pairs correctly recognized, and the DR drops from 1.00 to approximately 0.00 (one missingUnity, zero badPath;
Table 11), while scene-level recognition is established for the first time (18 items;
Table 10,
Table 11 and
Table 12). Although the post-test extends the scan on the consumption side from
∣M∣ = 43 to
∣M∣ = 778, the four primary metrics are always computed on the same
U0, ensuring direct comparability with the pre-test (see
Table 10, “conventions” note).
Taken together, the evidence shows that, when Unity is the consumption side, once its asset identifiers are stably bridged to IfcGUID-22 and the chain of evidence is registered in the CDE, design-stage digital-twin assets can be transformed into compliant data streams that are comparable, traceable, and replayable (
Table 7). The main pre-test breakpoint stemmed from misalignment with Unity inventories, indicating that without bridging and templated mapping, even complete IFC-side primary keys are difficult for the consumption side to recognize as the same object. In the post-test, by freezing export settings and preserving the original GlobalId, cross-version consistency reaches 1.00, making the interpretation “version differences = entity changes” valid and removing ambiguity sources in differencing (see
Table 6, “Stability”). Residual issues on uniqueness converge to a minimal set of historical duplicates, which can be further reduced in the minimal repair step (see the duplicate cluster detail link associated with
Table 6).
4.4. Research Comparison
Regarding research object and stage, existing reviews and empirical work largely place digital-twin (DT) emphasis on operations and maintenance (O&M) and energy scenarios: for example, systematic reviews in Buildings and assessments in Energy Informatics summarize major applications as operational monitoring, anomaly detection, energy optimization, and predictive maintenance, with relatively sparse design-stage evidence [
5,
29]. In contrast, this study targets design-stage object-level primary key governance and traceable registration: rather than proposing new operational control or energy strategies, we address object identifier consistency and cross-platform traceability along the modeling–export–consumption chain under the unified denominator U
0, with pre–post and bridging statistics.
Regarding task focus in Building Information Modeling (BIM)–game engine integration, prior work mainly centers on visualization, interaction, and training. Typical studies emphasize geometry and material conversion, real-time rendering, and user experience improvements but seldom propose a verifiable workflow of “object primary-key governance to stable mapping to provenance logging” as an auditable process [
16,
17,
34,
35,
36]. Recent reviews likewise note that current BIM–engine contributions cluster around visualization, immersive applications, and process integration challenges, while data consistency and identifier persistence remain difficult [
17,
34].
Regarding data quality and delivery, facility management (FM)-oriented approaches such as Construction Operations Building Information Exchange (COBie) and FM-BIM stress list-based constraints on completeness and consistency and quality-control processes to provide usable data at handover [
23,
37]. However, these pipelines typically pivot on as-built and handover time points and pay limited attention to design-stage object-level primary key compliance, cross-version stability, and engine-side recognizability. Our contribution is to front-load primary key governance into executable artifacts at the design stage, through a six-step workflow, and to form an auditable, replayable chain of evidence using the unified denominator U
0 and pre–post Δmetrics.
Regarding problem evidence and engineering pain points, long-standing issues in the ecosystem include IfcGUID loss, duplication, and cross-tool inconsistency, for example missing and duplicate cases in Revit import and export and buildingSMART forum discussions on cross-platform GUID stability, which directly break cross-system reconciliation and traceability [
14,
15,
28]. Our work not only quantifies IfcGUID completeness, validity, uniqueness, and stability, but it also explicitly addresses the ambiguity “engine-internal ID change does not equal object change” by establishing a stable mapping from Unity meta unique IDs to IfcGUID-22 and recording mapping details, baselines, and logs in the Common Data Environment (CDE), turning these “do it right” conditions into a reusable minimal constraint [
11,
18,
27].
4.5. Threats to Validity
Statistical. Reporting percentages of percentages can exaggerate changes, especially when residual cases are rare. We mitigate this by keeping the unified denominator U0 throughout, reporting percentage-point differences (post − pre), and providing count metrics at the interoperability layer (for example, recognized_pairs and break_total). A residual risk is that relative changes on very small bases can still appear magnified.
Internal. Threats arise from export and consumption channel variance, tool upgrades, and unstable object primary keys. We control these by freezing modeling and export settings and the Unity environment, executing the post-test as a read-only rerun, preserving GlobalId, adopting IfcGUID-22 as the object key, and fixing the IfcOpenShell validation path and commands. A residual risk is that hidden defaults in third-party plug-ins may introduce minor deviations; to address this, we rely on Common Data Environment (CDE) path and checksum registration to support rerun auditing and issue tracing.
Construct. The key question is whether the four core metrics (completeness, validity, uniqueness, stability), plus the BRR and DR, adequately represent “IfcGUID-based compliance and consumption-side usability.” We control for this by treating IfcGUID-22 legality and uniqueness as hard criteria, enforcing unique identifiers and metadata governance at the container level, and materializing compliance as machine-checkable via ifcopenshell.validate and JSON reports. A residual limitation is that deeper semantic consistency is out of scope; therefore, we report IFC-side comparators and interoperability details in parallel to avoid over-interpreting the metrics.
External. The present context is limited to the design stage and the SketchUp to IFC to Unity toolchain. Our reuse premises are as follows: IfcGUID-22 compliance; unique identification and checksum registration of information containers in the CDE; and frozen export and consumption channels with bridging templates. Migration to other engines or toolchains will still require adapted mappings and probes; however, the reproducible artifacts and registrations enable independent verification and repeat experiments, aligning with current artifact review and reproducibility practices [
11].
Provenance artifacts and checksums are summarized in
Appendix A, and the exceptions policy is detailed in
Appendix C.4.
5. Conclusions
This study focuses on the risk of data chain breakage that arises in smart homes when game engines are introduced for digital-twin workflows at the design stage. In response, we propose and validate a “non-standard to standard” governance workflow: IfcGUID-22 is used as the minimal mutually recognized object-level primary key; container-level identification and provenance logging are enforced in the CDE; and Unity’s unique asset IDs (.meta) are stably bridged to IfcGUID. In this way, design-stage digital-twin assets are transformed into comparable, traceable, and replayable data streams, and traceability-oriented governance is concretized at the design stage as a practical pattern for game engine interoperability.
In a virtual smart home case with a unified denominator covering both BIM model assets and digital-twin assets, the workflow increases three of the four IFC-based traceability metrics (completeness, validity, and stability) from low baselines to 1.00 and raises uniqueness to about 0.98. On the interoperability side, the bridge recognition rate (BRR) approaches 1.00, with only one residual break and stable recognition across multiple scenes. All of these improvements are achieved without changing the primary keys in the source models. In practice, BIM/CDE managers can embed IfcGUID compliance and CDE registration as a design-stage quality gate to reduce the risk of bad data; engine/visualization teams can adopt the IfcGUID↔Unity mapping template as a standard step in digital-twin asset delivery to safeguard the traceability of smart home virtual assets; and owners and auditors obtain a verifiable chain that links object-level changes to specific IFC files, Unity assets, and log commands, supporting auditing and root-cause analysis. Together, these improvements help close building data management gaps created when new tools are introduced at the design stage of smart buildings and are expected to reduce rework and the likelihood of broken evidence chains caused by bad data.
The proposed workflow is not limited to a single project but can be extended to larger building portfolios. Because all metrics are defined at the level of individual objects and computed under the same denominator U0, portfolio managers can use a common set of integrity metrics (completeness, validity, uniqueness, stability) and interoperability indicators (BRR, DR) to compare the “readiness” of different design-stage models and their digital-twin assets. The same traceability-oriented governance pattern—using IfcGUID as the primary key, assigning clear identifiers to information containers in the CDE, and freezing export and consumption channels—can be reused across projects to support cross-building comparison and the planned reuse of digital-twin assets.
The study also provides a starting point for extending design-stage identifier governance into operational digital twins. Once IfcGUID-based identities are stably established and registered in the CDE and game engines, the same primary key can be carried into O&M platforms and IoT/BMS integrations, providing a consistent object spine for time-series analysis, fault detection, and energy optimization. Although detailed operational strategies are beyond the scope of this paper, the results suggest that robust identifier governance at the design stage can noticeably reduce integration friction later and provide a more reliable data foundation for full-lifecycle digital-twin assets.
Naturally, this study has limitations. The evidence comes from a single context (design-stage smart homes), a single export/consumption toolchain (SketchUp → IFC → Unity), and one pre/post recomputation within a single project. Even with a unified denominator, read-only reruns, and CDE registration to mitigate internal threats, the generalizability of the findings remains limited. Our metrics focus on four IfcGUID-based core indicators and interoperability recognition; deeper semantic consistency, cross-toolchain differences, and quantified time/cost impacts are not analyzed and should not be over-interpreted.
Future research can proceed in three directions. First, replicate and evaluate the workflow in more real projects and larger smart home portfolios, covering different community types, development stages, and technology stacks, to test robustness and transferability under more complex interoperability conditions. Second, analyze the relationship between the quality of design-stage identifier governance and operational outcomes such as energy performance, fault response, and occupant comfort, in order to evaluate the practical benefits of “cleaner” design-stage data for later decision-making. Third, investigate how the workflow can be embedded into project governance and business processes—for example, in development, procurement, auditing, and regulation—so as to form actionable guidelines or contractual clauses that support consistent management and knowledge reuse for smart homes at the estate, district, and city scales.