Voxel-Based Digital Twin Framework for Earthwork Construction

Khan, Muhammad Shoaib; Cho, Hyuk Soo; Seo, Jongwon

doi:10.3390/app15147899


by Muhammad Shoaib Khan, Hyuk Soo Cho * and Jongwon Seo *

Department of Civil and Environmental Engineering, Hanyang University, Seoul 04763, Republic of Korea

* Authors to whom correspondence should be addressed.

Appl. Sci. 2025, 15(14), 7899; https://doi.org/10.3390/app15147899

Submission received: 5 June 2025 / Revised: 9 July 2025 / Accepted: 14 July 2025 / Published: 15 July 2025

(This article belongs to the Section Civil Engineering)


Abstract

Earthwork construction presents significant challenges due to its unique characteristics, including irregular topography, inhomogeneous geotechnical properties, dynamic operations involving heavy equipment, and continuous terrain updates over time. Existing methods often fail to capture these complexities accurately, support semantic attributes, simulate realistic equipment–environment interactions, or update the model dynamically during construction. Moreover, most current digital solutions lack an integrated framework capable of linking geotechnical semantics with construction progress in a continuously evolving terrain. This study introduces a novel voxel-based digital twin framework tailored for earthwork construction. Unlike previous studies that relied on surface, mesh, or layer-based representations, our approach leverages semantically enriched voxelization to encode spatial, material, and behavioral attributes at a high resolution. The proposed framework connects the physical and digital representations of the earthwork environment and is structured into five modules. The data acquisition module gathers terrain, geotechnical, design, and construction data. The virtual model creation module builds the as-planned and as-built earthwork models. The digital twin core module utilizes voxels to create a realistic earthwork environment that integrates the as-planned and as-built models, facilitating model–equipment interaction and updating models for progress monitoring. The visualization and simulation module enables model–equipment interaction based on evolving as-built conditions. Finally, the monitoring and analysis module provides volumetric progress insights, semantic material information, and excavation tracking. The key innovation of this framework lies in multi-resolution voxel modeling, semantic mapping of geotechnical properties, and support for dynamic updates during ongoing construction, enabling model–equipment interaction and material-specific construction progress monitoring. The framework is validated through real-world case studies, demonstrating its effectiveness in providing realistic representations, model–equipment interactions, progress information, and operational insights.

Keywords: digital twin; progress monitoring; operational-level simulation; earthwork construction; voxelization; game engines; as-built BIM modeling; visualization; Unity 3D

1. Introduction

Earthwork operations are critical components of most construction projects and involve significant volumetric modifications of the terrain. These operations are inherently cyclic, as multiple pieces of equipment, such as excavators and dump trucks, are utilized to excavate, transport, and fill earthwork materials. Earthwork construction is dynamic and has unique characteristics: an unstructured work environment with irregular terrain and inhomogeneous material properties; the presence of utilities, unstable soil, or unexpected rock layers; large-scale operations involving volumetric designs and significant material quantities; difficult scene perception, particularly in underground construction; and limited visibility for machine operators [1]. During construction, two significant challenges arise: (1) simulating the interaction between an unstructured environment and equipment during earthwork activities, and (2) integrating all the information and dynamically updating the model to reflect the as-built status. These challenges often create safety issues and reduce the productivity and quality of earthwork construction. Hence, a system is needed that integrates all information in a unified environment, represents earthwork characteristics realistically, and supports real-time simulation and dynamic monitoring for efficient earthwork management.

The digital twin connects the physical and virtual assets/processes, enabling seamless integration and control. It uses sensing technologies to capture as-built information and utilizes advanced modeling, visualization, and simulation techniques to accurately represent, analyze, and optimize the system during the operation [2]. At its core, the digital twin serves as a manipulable virtual replica, allowing users to gain insights into the behavior and performance of the physical system [3]. To be effective, the virtual twin must be built with a high degree of fidelity, incorporating the spatial, material, and temporal dynamics of its physical counterpart [4]. Additionally, the virtual twin and physical twin are linked in a bidirectional way, meaning that the changes in the physical environment, such as excavation volume and materials, must be reflected in the virtual twin. Equally, insights from the virtual twin should be used to guide the project stakeholders for accurate and timely decisions [5]. Therefore, in the context of earthwork, a realistic and data-rich virtual twin is fundamental, as it could represent dynamic earthwork processes, including the representation of heterogeneous earthwork materials, interactions, and the volumetric changes that occur during the construction.

Several review articles provide definitions, challenges, opportunities, and current methods for data acquisition, data integration, data linkage between physical and digital components, twin representations, and advanced tools for improving systems and making decisions effectively [4,6,7,8,9,10,11]. Most of the proposed definitions and frameworks share three fundamental components of a digital twin: the physical twin, the digital twin, and the connections between them [12]. Tao et al. [13] proposed a generalized framework of five components: physical components, virtual components, data, connection, and services. In recent years, digital twin applications have gained substantial attention in the construction industry, particularly for monitoring, simulation, and optimization of physical processes. For example, a digital twin framework was developed for construction equipment monitoring, progress monitoring, and visualization; it provides equipment tracking information using Internet of Things (IoT) data and dashboard visualization [14]. Drone-based photogrammetry and computer-aided design (CAD) data were integrated to track road base construction in a digital twin environment [15]. A scan-to–building information modeling (BIM) approach was utilized for bridge construction monitoring, leveraging laser scanning data and BIM models to enhance progress monitoring [16]. Another framework uses vision data and IoT data to collect progress and resource data for construction status assessment, visualization, and prediction [6].

These studies underscore the growing relevance and versatility of digital twins in improving the efficiency, accuracy, and visualization of the construction stage. Despite these advancements, existing digital twin approaches in construction face several limitations, particularly when applied to earthwork construction. For example, previous studies relied on static BIM or CAD models for virtual twin development to represent the as-planned model while collecting data from physical objects [17]. These models neither reflect earthwork properties, which are based on inhomogeneous geotechnical characteristics, nor support the volumetric changes that occur during excavation and filling [18]. The modeling needed to develop a virtual twin for earthwork is fundamentally different. Structural components typically possess homogeneous characteristics and can be represented as discrete, object-based elements within a BIM environment; however, earthwork elements such as terrain and soils are irregular and heterogeneous, characterized by uncertain properties that vary spatially and temporally [18]. Additionally, unlike structural components that are installed or constructed according to static as-planned designs, earthwork is inherently operational, involving dynamic volumetric changes to the terrain due to filling and excavation activities [19]. Capturing these dynamic characteristics of earthwork requires specialized geometric modeling that can represent its complexity in a spatially accurate and semantically rich manner, enabling the virtual twin to reflect realistic conditions and adapt to the evolving physical twin [20].

Apart from this, recently proposed vision-based methods for digital twin creation provide excellent capabilities to capture dynamic changes in the construction environment; however, the earthwork digital twin requires additional information, such as material properties, for semantically specific simulations and progress monitoring [21]. Moreover, they capture point-based information, such as point clouds, which are converted into surfaces and do not support volumetric reasoning, which is crucial for managing cut-and-fill operations [22]. Most importantly, integration between geometric and semantic twin sources (such as design information, construction dataset monitored from the physical twin, and geotechnical parameters) is typically ad hoc [23]. A holistic system that integrates the information in a unified platform with an update function, not only for the geometric information but also for the earthwork semantics, is needed and can contribute significantly.

Voxel-based representation offers a robust solution to address these inherent challenges, making it highly suitable for the development of a realistic earthwork digital twin [24]. The advantages of voxels in this context are as follows: (1) Voxels are volumetric units created by discretizing the as-planned or as-built model into three-dimensional (3D) cells/cubes, capturing realistic representations of earthwork characteristics. (2) Voxels are scalable and their size can be refined, allowing for the detailed representation of uncertain earthwork properties [25]. (3) Each voxel can integrate semantic information such as soil type, cohesion, strength, moisture content, unit cost, and excavation time, enabling detailed monitoring of earthwork conditions. This embedded semantic information not only supports material-specific simulation during model–equipment interaction but also allows for tracking which materials have been excavated or filled during construction. (4) Voxels can be systematically updated when new sensing data becomes available, supporting real-time tracking and productivity assessment at an operational level. Despite these advantages, the practical implementation of voxel-based models for earthwork digital twins, particularly their efficient creation, visualization, updating, and management in a constantly changing environment, remains a significant research challenge [4,26].

This article proposes a voxel-based digital twin framework for earthwork construction that addresses the critical gaps. The framework is designed to integrate the earthwork information (including terrain, geotechnical, design, and construction data) and uses voxels for the as-planned and as-built modeling, enabling volumetric representation and allowing for real-time simulation and progress tracking. The proposed framework has been validated through real-world case studies, demonstrating the system’s effectiveness in providing a high-fidelity, operationally insightful digital twin for earthwork. Moreover, the scope of this study is primarily focused on the virtual twin creation for earthwork, the development of graphical simulations that facilitate model–equipment interaction, and the monitoring of progress through the dynamic integration of construction data from the physical twin to the digital twin. The primary objective of the proposed study is to achieve a voxel-based digital twin framework that overcomes the limitations of the existing representation schemes. The particular objectives are as follows: (1) to integrate terrain, geotechnical, and construction data, with an update function, into a unified voxel-based environment; (2) to represent heterogeneous earthwork semantics through attribute-rich voxels; (3) to enable volumetric modeling of as-planned and as-built earthwork conditions; (4) to facilitate earthwork model–equipment interaction and progress monitoring with semantic tracking during construction. By achieving these particular goals, the proposed system bridges the gap between static modeling practices and dynamic earthwork management needs, contributing to improved automation, visualization, and decision-making in a complex construction environment.

2. Related Research

2.1. Earthwork Twin Modeling

Capturing the as-is conditions and simulating earthwork processes demand a context-sensitive modeling approach that accounts for irregular topography, material heterogeneity, and continuous spatial changes. In digital practice, earthwork environments are represented by surface and solid-based models [27]. Triangular irregular network (TIN) surfaces have been used in earthwork terrain modeling; however, they lack volumetric information. Solid data modeling, including boundary representation (B-rep) and decomposition models, was used to create a solid-based volumetric environment [28]. B-rep uses the boundary information to represent a 3D volume, whereas decomposition models segment a 3D volume into smaller units [29]. However, B-rep is a homogeneous model and is not designed to incorporate the varying properties of the earthwork within a model [30].

A decomposition model can efficiently address this issue and represent inhomogeneous earthwork properties. The tetrahedral network (TEN) is a decomposition model that uses the faces of TIN surfaces to create units, into which information such as material, cost, and quantity is assimilated [31]. However, TEN has limitations in terms of adaptability and refinement to smaller units (Figure 1). For instance, when more detailed information is needed, or when the unit size is larger than an equipment-specific dimension such as the bucket size, the units cannot be refined further because of TEN's fixed structure [28].

Voxel-based representation offers an efficient solution to address the aforementioned limitations. It divides the model into volumetric, 3D voxel units, providing a consistent data structure. It can represent the varying properties of earthworks by using different voxel sizes that can additionally be manipulated with equipment [24]. For example, the underground construction environment is represented by voxels that offer the incorporation of attribute information at the entity level, enabling the efficient management of uncertain spaces for simulation [4]. This reflects the advantages of voxels in precisely representing material characteristics, such as changes in material type (clay or sand) or properties (densities or moisture level), enabling the model to account for different material properties during manipulation [25].
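The refinement idea described above can be illustrated with a minimal sketch. It assumes an octree-style subdivision in which a coarse voxel is split into eight children until no voxel is larger than an equipment-specific dimension such as the bucket size; the `Voxel` class, the `refine` function, and all numeric values are hypothetical and are not taken from the cited implementations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Voxel:
    """Axis-aligned cubic voxel (hypothetical structure for illustration)."""
    x: float          # minimum-corner coordinates, in metres
    y: float
    z: float
    size: float       # edge length, in metres
    material: str     # semantic attribute inherited by child voxels


def subdivide(v):
    """Split one voxel into eight equal children (octree-style)."""
    h = v.size / 2.0
    return [
        Voxel(v.x + i * h, v.y + j * h, v.z + k * h, h, v.material)
        for i in (0, 1) for j in (0, 1) for k in (0, 1)
    ]


def refine(v, max_size):
    """Recursively subdivide until no voxel edge exceeds max_size."""
    if v.size <= max_size:
        return [v]
    refined = []
    for child in subdivide(v):
        refined.extend(refine(child, max_size))
    return refined


# Assumed values: a 4 m clay voxel refined to match a 1 m bucket width.
coarse = Voxel(0.0, 0.0, 0.0, 4.0, "clay")
fine = refine(coarse, max_size=1.0)
total_volume = sum(v.size ** 3 for v in fine)
```

Because each child inherits the parent's attributes, the total volume and the material label are preserved while the resolution increases only where it is needed.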

In addition to these geometries, point-based representation methods are also used, especially for representing the as-built information collected through photogrammetry and light detection and ranging (LiDAR) [32]. The point data is processed or enriched with classification labels to extract meaningful information and interpreted using a mesh reconstruction algorithm [19]. However, point-based models lack inherent topology when used directly, which hinders structural simulation and material-specific progress tracking. Hybrid modeling strategies have also been used in recent years. For instance, homogeneous regions are modeled using surface or B-rep models, while TEN and voxels are utilized to represent inhomogeneous regions [24,28]. In other words, boundary information is represented using surface-based models, while volumetric information is modeled through TEN or voxels [33].

2.2. Earthwork Operational Simulation and Monitoring

While voxel-based modeling provides a robust foundation for representing earthwork environments with high spatial and semantic fidelity, the practical value of such detailed models is verified by their use in decision-making. Earthwork activities require different levels of planning and control due to their complexity and scale, from high-level process coordination to detailed real-time operational management. In this context, it is essential to classify them into different hierarchies. For instance, earthwork management is classified into two hierarchical levels, namely, activity level and operation level [34]. The activity level focuses on the overarching elements of earthwork activities (i.e., excavation, transportation, and deposition) and on improving and optimizing broad-scale operations. Operational-level earthwork management focuses on the detailed execution of specific activities and automated field operations. For example, excavation tasks emphasize granular information such as detailed volumetric representation, material properties, geotechnical and spatial characteristics, and equipment interactions with the environment. Automating operational-level tasks can significantly enhance the efficiency of earthwork.

The application of simulation and monitoring techniques through the digital twin has the potential to greatly improve these operational-level tasks. Several simulation tools have been proposed for earthwork operations, serving diverse purposes such as simulating the earthwork process, training operators, evaluating alternatives, enhancing visualization, and providing a user interface that enables human–machine interaction [35]. Early simulation development often emphasized photorealistic methods, providing high-quality visualization of the earthwork environment to support planning and decision making. For example, a virtual reality (VR) simulator for hydraulic excavators was developed to train equipment operators and improve the efficiency of standard earthwork operations [36]. Although effective, such systems often oversimplify the environmental details and model–equipment interactions because they rely on static datasets, such as OpenStreetMap, and on interactions based on photorealism [37]. In this context, physics-realistic approaches have been proposed, utilizing computational methods to simulate model–equipment interactions, such as bucket and material interactions. Computational methods such as the finite element method (FEM), material point method (MPM), and discrete element method (DEM) have been used to model soil deformation, simulate the interaction, enhance equipment design, and evaluate alternatives [38,39]. Physics-realistic simulations are more accurate; however, their focus on micro-level details and their high computational loads limit their scalability, particularly for large-scale earthwork operations. A simple and efficient method based on ray casting has been proposed, which uses rays from the edges of an excavator bucket to estimate the soil volume and weight, simulating earthwork operations and thereby providing an alternative method with sufficient realism [40]. Furthermore, volumetric modifications are estimated using surface meshes for training simulators, enabling interactive visualization and improving situational awareness for operators [41]. Moreover, a contextually realistic simulation environment was developed, providing an efficient platform for training [37].

Despite the increasing popularity of these simulators, several shortcomings exist. For instance, existing systems often use oversimplified data such as OpenStreetMap (OSM), which lacks the required level of precision, subsurface detail, and material information. Although these simulators provide advantages, they are designed for specific environments and are not sufficiently flexible to adopt custom design data or incorporate earthwork information that changes from site to site. For example, the designs, material properties, and other characteristics of earthwork projects may vary [42]. Even within a single construction site, the material properties vary in all three dimensions [43]. This highlights the necessity of using accurate design and construction data with advanced data collection methods such as 3D geotechnical investigation, unmanned aerial vehicle (UAV)-based mapping, and laser scanning.

On the monitoring side, tracking construction progress involves the acquisition of operational as-built data, processing it into the required format for visualization and analysis, and comparing it with the as-planned data. Conventionally, these tasks were performed manually by surveyors, which is a time-consuming process. In recent years, advanced data-capturing tools and techniques such as laser scanning and photogrammetry have been proposed for construction monitoring [6]. For the comparison of as-planned and as-built models to identify the progress, multiple methods have been proposed, including image vs. model, point cloud vs. model, and model vs. model [44].

The as-built time-lapsed photograph has been used and compared with the BIM-based as-planned model to monitor the construction of a building project [45]. While effective for progress visualization, such methods struggle with quantitative analysis, such as volumetric computation. To address this, the construction images were processed into a point cloud and compared with the as-planned BIM model in an augmented reality (AR) environment for progress quantification [46]. Moreover, the point cloud was converted into an as-built BIM model that enhances model visualization and provides progress quantification [47]. However, such methods still rely on homogeneous surfaces or solid BIM models, which often fail to reflect the actual material condition, particularly for the earthwork construction site. The lack of geotechnical variation and real-time volumetric updates limits the accuracy and applicability of these methods in earthwork construction.

3. Methodology

3.1. Proposed Voxel-Based Digital Twin Framework

The proposed voxel-based digital twin framework for earthwork construction is presented in Figure 2. The proposed framework digitally replicates and manages earthwork construction through the following five components: physical parts, digital parts, data, connections, and services. At its core, the physical parts represent real-world entities on construction sites, including terrain, excavation and filling zones, geotechnical properties, and heavy equipment such as excavators and dump trucks. These physical parts are actively engaged in earthwork activities. For instance, the terrain changes volumetrically, which is captured through sensing technologies such as photogrammetry or laser scanning. Equipment operations at the construction site continuously generate data that reflects the evolution and serves as a data source for the updated digital twin.

The digital part of the framework involves the representation of each physical part in a 3D environment, such as through BIM models. The terrain, geotechnical, design, and construction data are modeled and embedded with semantic information. Additionally, this includes a simulation of the earthwork operation, where equipment and models interact based on the as-built data. The digital counterpart facilitates both real-time model updating and simulation, enabling stakeholders to analyze the current state, simulate future scenarios, and visualize previous activity sequences.

The data, connectivity, and service components of the framework contribute to developing the operational layer of the system. The data component integrates earthwork construction-related information such as geotechnical parameters, terrain information, equipment specification, design models, and as-built scans. The connection layer provides seamless communication between the physical and digital parts, ensuring volumetric updates and feedback. The service layer delivers application functions, including model–equipment interaction simulation, material tracking, and volumetric analysis. These components are linked together, creating a highly integrated and context-aware digital twin system tailored for earthwork construction.

3.2. Proposed Digital Twin Architecture

The proposed framework for earthwork construction is structured around five interconnected modules that cumulatively allow the creation, synchronization, visualization, simulation, and monitoring of the dynamic earthwork process (Figure 3). The modules are as follows: (1) data acquisition, (2) virtual model creation, (3) digital twin core, (4) visualization and simulation, and (5) monitoring and analysis.

3.2.1. Data Acquisition Module

This module collects multi-source information, which forms the foundation of the digital twin. The data sources are broadly categorized into construction site data and equipment data. Construction site data consist of subsurface, surface, and design data, such as target models, cut/fill zones, and as-built construction data collected during operation. Subsurface data represent the underground material conditions and their characteristics, which can be collected through geotechnical investigations. Boreholes provide data on the earthwork material at different depths, along with other in situ and laboratory test results, to achieve geotechnical stratigraphy and properties such as material type, strength, cohesion, friction, and ground level.

Surface data, which represent the existing ground conditions, can be obtained from publicly available sources such as the National Geographic Information Institute (NGII) of Korea, the United States Geological Survey, OpenStreetMap, and other GIS datasets. However, these datasets are often not accurate enough for earthwork simulation because of scale changes and outdated formats, and they do not reflect the current state of the site, which is needed for realistic simulations. Advanced technologies, such as laser scanning and photogrammetry, can be used to capture current surface data with millimeter-level accuracy. In addition, earthwork design data, including BIM models, CAD data, and earthwork quantity reports, provide targets and cut-and-fill information.

Construction data is acquired regularly using cameras or laser scanners, ensuring that both static and dynamic field data are captured for downstream use. Another critical data source is the equipment used for model manipulation. Equipment data are obtained either by scanning onsite or from equipment specifications, such as equipment-specific dimensions, capacity, and mobility parameters that influence the simulation performance. In addition, kinematic and dynamic equipment data can be obtained from the onboard sensors and telemetry systems. Moreover, automated machine guidance (AMG) can provide significant information about the equipment at a construction site.

3.2.2. Virtual Model Creation Module

After collecting essential data on the construction site and equipment, the next step is to create virtual simulation models that have similar characteristics to the physical models. At this stage, it is assumed that the collected data are sufficiently accurate to be used for virtualization.

This module is responsible for generating the as-planned and as-built models. The as-planned model represents the design volumes that are scheduled to be excavated or filled. Creating it involves terrain preprocessing and modeling, subsurface modeling, design target modeling, and conversion of the design model into a voxel-based model. It also involves assigning semantic attributes to each voxel, such as material type, strength, moisture content, excavation time, cost, and construction status (a detailed process of voxelization is discussed in Section 3.3.3). This information is attached to each voxel for use in the downstream stages. For example, the material information attached to each voxel is used for material tracking. Moreover, the equipment used for model–equipment simulation is configured within the virtual environment. The as-built model, on the other hand, is acquired through sensing techniques and continuously updated to reflect the as-built condition. The as-built model is integrated with the as-planned model to reflect the current state of the site.
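As a rough illustration of attaching semantics during voxelization, the sketch below fills an axis-aligned design box with uniform voxels and labels each one by the geotechnical layer that contains its center. This is a simplified assumption for illustration only, not the voxelization procedure of Section 3.3.3; the `Voxel` fields, the `voxelize_box` function, and the layer elevations are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Voxel:
    """Voxel carrying geometry plus semantic attributes (hypothetical fields)."""
    index: tuple             # (i, j, k) grid index
    center: tuple            # (x, y, z) center coordinates, in metres
    size: float              # edge length, in metres
    material: str            # semantic attribute from the geotechnical model
    status: str = "planned"  # construction status, updated during monitoring


def material_at(z, layers):
    """Return the material of the layer whose elevation range contains z."""
    for top, bottom, material in layers:
        if bottom <= z < top:
            return material
    return "unknown"


def voxelize_box(origin, dims, size, layers):
    """Fill an axis-aligned design box with uniform, labeled voxels."""
    nx, ny, nz = (int(round(d / size)) for d in dims)
    voxels = []
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                center = (
                    origin[0] + (i + 0.5) * size,
                    origin[1] + (j + 0.5) * size,
                    origin[2] + (k + 0.5) * size,
                )
                material = material_at(center[2], layers)
                voxels.append(Voxel((i, j, k), center, size, material))
    return voxels


# Assumed stratigraphy: clay from 0 m to -2 m, sand from -2 m to -4 m.
layers = [(0.0, -2.0, "clay"), (-2.0, -4.0, "sand")]
voxels = voxelize_box(origin=(0.0, 0.0, -4.0), dims=(4.0, 2.0, 4.0),
                      size=1.0, layers=layers)
clay_count = sum(1 for v in voxels if v.material == "clay")
sand_count = sum(1 for v in voxels if v.material == "sand")
```

Further attributes such as strength, moisture content, cost, or excavation time could be added as extra fields in the same way.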

3.2.3. Digital Twin Core Module

At the center of the proposed framework, the digital twin core module links the physical and digital twins in an integrated environment. It serves as an integration layer that connects the as-planned and as-built models. This module ensures that the physical and digital twins are synchronized, provides the simulation and progress information according to the as-is condition, and evolves continuously with the construction.

3.2.4. Visualization and Simulation Module

This module provides a graphical simulation interface for interaction with the model and equipment in an interactive and immersive environment. It integrates the as-is condition information, including ground, target, and design or as-built voxels, along with the construction equipment, such as an excavator and dumper. The equipment is utilized to simulate a voxel-based earthwork environment according to its semantics, enhancing the situational awareness of construction site information, as well as facilitating scenario analysis and effective decision-making.
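The kind of model–equipment interaction this module supports can be sketched in a strongly simplified form. The example below assumes the bucket is approximated by an axis-aligned box and that a voxel is excavated when its center lies inside that box, up to the bucket capacity. It does not reproduce the game-engine implementation; the `excavate` function and all dimensions are hypothetical.

```python
def excavate(voxels, bucket_min, bucket_max, capacity_m3, voxel_volume):
    """Remove voxels whose centers fall inside the bucket box.

    voxels maps a center (x, y, z) to a material label and is modified
    in place. Voxels are visited from the top down, and removal stops
    once another voxel would exceed the bucket capacity.
    Returns a list of (center, material) pairs for the removed voxels.
    """
    removed = []
    load = 0.0
    for center in sorted(voxels, key=lambda c: -c[2]):
        if load + voxel_volume > capacity_m3:
            break
        inside = all(lo <= c <= hi
                     for c, lo, hi in zip(center, bucket_min, bucket_max))
        if inside:
            removed.append((center, voxels[center]))
            load += voxel_volume
    for center, _ in removed:
        del voxels[center]
    return removed


# Assumed ground block: 3 x 3 x 2 voxels of 1 m, clay over sand.
ground = {}
for i in range(3):
    for j in range(3):
        for k in range(2):
            material = "clay" if k == 0 else "sand"
            ground[(i + 0.5, j + 0.5, -(k + 0.5))] = material

scoop = excavate(ground, bucket_min=(0.0, 0.0, -1.0),
                 bucket_max=(2.0, 1.0, 0.0),
                 capacity_m3=2.0, voxel_volume=1.0)
```

Since each removed voxel carries its material label, the returned list indicates both how much was excavated in a scoop and which materials it contained.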

3.2.5. Monitoring and Analysis Module

The monitoring and analysis module uses the as-planned and as-built voxels to support real-time progress tracking. It enables the productivity of earthwork construction to be monitored at the voxel level. The key characteristic of this module is that it tracks not only volumetric information but also material information. Additionally, it continuously provides earthwork quality information by visualizing the deviation between as-planned and as-built voxels. This module helps project managers make data-driven decisions, ensuring that the project is completed according to the planned objectives.
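Material-specific progress tracking of this kind can be sketched as a comparison between the planned voxels and the set of voxels detected as excavated. The example assumes uniform voxels identified by grid index; the function name, the dictionary layout, and the quantities are hypothetical.

```python
from collections import defaultdict


def progress_by_material(planned, excavated, voxel_volume):
    """Summarize planned and excavated volume per material.

    planned maps a voxel index to its material label; excavated is the
    set of voxel indices reported as removed by the as-built update.
    """
    total = defaultdict(float)
    done = defaultdict(float)
    for index, material in planned.items():
        total[material] += voxel_volume
        if index in excavated:
            done[material] += voxel_volume
    return {
        material: {
            "planned_m3": total[material],
            "excavated_m3": done[material],
            "percent": 100.0 * done[material] / total[material],
        }
        for material in total
    }


# Assumed as-planned model: 2 x 2 x 2 voxels of 0.5 m, clay over sand.
planned = {
    (i, j, k): ("clay" if k == 1 else "sand")
    for i in range(2) for j in range(2) for k in range(2)
}
# Assumed as-built update: three of the four clay voxels are gone.
excavated = {(0, 0, 1), (0, 1, 1), (1, 0, 1)}
report = progress_by_material(planned, excavated, voxel_volume=0.125)
```

In this example the report shows 75% of the clay and none of the sand excavated, which is the material-level view that a purely surface-based comparison cannot provide.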

These five connected modules are linked together to form a robust and intelligent digital twin tailored for earthwork construction. Integrating voxel-based modeling, game-engine–supported model–equipment simulation, semantic data modeling, and real-time sensing data forms an advanced platform for earthwork construction visualization, simulation, and monitoring.

3.3. Earthwork Virtual Model Construction

The purpose of virtual models in earthwork construction is to develop a virtual environment containing a digital representation of the physical entities and their characteristics. The realism, accuracy, and computation of the simulation are highly dependent on the model’s level of detail (LOD), which is directly correlated with its representation. For example, the as-planned model represents excavation materials to be removed, which requires a detailed representation with a higher LOD. Each earthwork model employs distinct geometric modeling to balance computational efficiency with fidelity to actual earthwork conditions.

3.3.1. Surface and Geotechnical Model

The original ground model digitally represents the terrain topography, slopes, design zones, and their properties. Several representations were reviewed to select the ground model type. The ground model provides a static environment for equipment mobility and is not directly manipulated by the equipment. Hence, a simplified and accurate TIN-based surface modeling approach was used to represent the ground conditions. TIN constructs a mesh model of irregular triangles from drone surveys or LIDAR scans to represent complex and large-scale geographical conditions (Figure 4a).

The geotechnical model represents the variation in geological profiles, along with characteristics such as earthwork materials, bearing strength, and moisture content. Investigation reports and borehole data were used to generate a subsurface model. B-rep was used to represent the subsurface volumetric information by defining the geometries using the boundary surfaces. Before modeling, the borehole coordinates were transformed from the local coordinate system to the standard Universal Transverse Mercator (UTM) system to obtain georeferenced data, using the transformation in Equation (1).

For a borehole point p = [x, y, z]^⊤ in the local coordinate system, its transformed position is

P_{\text{unified coordinate}} = R \cdot P_{\text{local coordinate}} + t

(1)

where R is a rotation matrix and t is a translation vector.

R = R_{x} \cdot R_{y} \cdot R_{z}

R_{x} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos \theta_{x} & -\sin \theta_{x} \\ 0 & \sin \theta_{x} & \cos \theta_{x} \end{bmatrix}

(2)

R_{y} = \begin{bmatrix} \cos \theta_{y} & 0 & \sin \theta_{y} \\ 0 & 1 & 0 \\ -\sin \theta_{y} & 0 & \cos \theta_{y} \end{bmatrix}

(3)

R_{z} = \begin{bmatrix} \cos \theta_{z} & -\sin \theta_{z} & 0 \\ \sin \theta_{z} & \cos \theta_{z} & 0 \\ 0 & 0 & 1 \end{bmatrix}

(4)

Assuming the boreholes are created vertically, their coordinates are determined through the following equation:

P_{\text{unified coordinate}} = t

(5)

where t = \begin{bmatrix} t_{x} & t_{y} & t_{z} \end{bmatrix}^{\top}, and t_x, t_y, and t_z are translations in the x-, y-, and z-directions, respectively.
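The transformation in Equations (1)–(5) can be illustrated with a minimal sketch in plain Python. It assumes the standard elementary rotations about the x-, y-, and z-axes composed in the order R = Rx·Ry·Rz; the function names and the numerical values (a 90° rotation about z and a UTM-like offset) are hypothetical and are not taken from the study.

```python
import math

def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation_matrix(theta_x, theta_y, theta_z):
    """R = Rx * Ry * Rz built from elementary rotations; angles in radians."""
    cx, sx = math.cos(theta_x), math.sin(theta_x)
    cy, sy = math.cos(theta_y), math.sin(theta_y)
    cz, sz = math.cos(theta_z), math.sin(theta_z)
    rx = [[1, 0, 0], [0, cx, -sx], [0, sx, cx]]
    ry = [[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]]
    rz = [[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]]
    return matmul(matmul(rx, ry), rz)

def to_unified(p_local, rotation, translation):
    """Equation (1): P_unified = R * P_local + t."""
    return [sum(rotation[i][k] * p_local[k] for k in range(3)) + translation[i]
            for i in range(3)]

# Hypothetical values: rotate 90 degrees about z, then shift by t.
t = [1000.0, 2000.0, 50.0]
p = to_unified([1.0, 0.0, 0.0], rotation_matrix(0.0, 0.0, math.pi / 2), t)

# With no rotation and a local origin of zero, the result reduces to t,
# which matches the form of Equation (5).
p_vertical = to_unified([0.0, 0.0, 0.0], rotation_matrix(0.0, 0.0, 0.0), t)
```

In this example the local point (1, 0, 0) is rotated onto the y-axis before the offset is added, so `p` is approximately (1000, 2001, 50).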

After determining the coordinate values for each borehole, they were integrated with the ground surface and modeled. First, boreholes were created from the point data at different depths obtained through a geotechnical investigation (Figure 4b). Layers were created at each depth to obtain surface information for each material type (Figure 4c,d). The top and bottom layers of each material were used to develop the volumetric B-rep model (Figure 4e,f). A detailed method for creating the geotechnical model can be found in our previous article [24].

3.3.2. Target Model

In earthwork operations, the equipment operates in a continuously changing environment because the materials are either removed or filled. The target model presents the grade line or final information to equipment operators or managers. As this model only provides a target to assist the operator, a TIN-based surface is used to represent the target model. Figure 5a shows the target model design, considering the earthwork cut or excavation scenario.

3.3.3. Voxel-Based As-Planned/As-Built Model Development

The proposed method uses a voxel-based representation for both the design model and the as-built model. The design model represents the volume between the target model and the ground model. Both models are converted through voxelization, which turns geometric models into cubes, each of which can be assigned different properties and semantics. The voxelization process of the earthwork model is explained in detail in our previous article [25]. The method uses a three-step procedure, as follows: (1) solid extraction between the target and ground models, (2) conversion of the solid model into a mesh model, and (3) voxel generation inside the mesh. The process is depicted for an excavation example in Figure 5.

The solid was extracted in the first step using the ground and target surfaces. The triangles of the TIN surfaces were utilized to extract the solid that represents the homogeneous volumetric details of the cut or fill, providing clear boundaries with topological information. The extracted solids were converted into meshes. The conversion of the design model into meshes retains surface discontinuities and topology, ensuring that geometric fidelity is preserved. The meshes were composed of several triangular facets. The generated meshes were used as inputs for the voxelization algorithm to create a voxel-based as-planned model. The final step involved the creation of voxels inside the cut-and-fill mesh. Voxel creation inside the mesh consisted of the following steps: (i) creation of a bounding box around the design mesh, (ii) voxel size definition, (iii) voxel grid definition and voxel generation in the bounding box, and (iv) ray casting for voxel occupancy detection and classification.

An axis-aligned bounding box (AABB) was created around the selected cut or fill mesh to define a simple enclosing boundary. It is calculated by iterating over all vertices of the design mesh and recording the minimum and maximum coordinates, (xmin, ymin, zmin) and (xmax, ymax, zmax), in the Cartesian system. The design models are generally available in a surface-based representation and are directly used for voxelization. The as-built models, however, are collected through sensing techniques such as cameras or laser scanners. The resulting dataset is a point cloud, which is processed to develop a surface model that is then used as input for the voxelization. The AABB enclosed the extent of the geometry, and voxels were created in alignment with the AABB, ensuring that the exact geometry was voxelized.
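As an illustration of the bounding-box step, the following minimal sketch computes an AABB from mesh vertices and lays a regular grid of cell centers over it. The regular grid is a simplifying assumption made here for brevity; it is not the diagonal-vector construction used in the study, and the function names and vertex values are hypothetical.

```python
import math

def aabb(vertices):
    """Axis-aligned bounding box: per-axis minimum and maximum over all vertices."""
    xs, ys, zs = zip(*vertices)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))

def grid_centers(lo, hi, size):
    """Centers of cubic cells of edge `size` that tile the box starting at `lo`."""
    counts = [max(1, math.ceil((hi[i] - lo[i]) / size)) for i in range(3)]
    return [(lo[0] + (i + 0.5) * size,
             lo[1] + (j + 0.5) * size,
             lo[2] + (k + 0.5) * size)
            for i in range(counts[0])
            for j in range(counts[1])
            for k in range(counts[2])]

# Hypothetical mesh vertices (metres).
mesh_vertices = [(0.0, 0.0, 0.0), (2.0, 1.0, 0.0), (1.0, 3.0, -1.0)]
lo, hi = aabb(mesh_vertices)
centers = grid_centers(lo, hi, 1.0)
```

For these vertices the box spans 2 m × 3 m × 1 m, so a 1 m voxel size yields six candidate cells whose occupancy would still need to be tested against the mesh.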

Voxel size selection is a critical factor for realistic simulations, as it is directly correlated with site geotechnical conditions, volumetric accuracy and computation, equipment bucket size, and their interactions during the simulation. Depending on the geotechnical conditions of the site, granular materials, such as sand or gravel, can be represented by large voxels, whereas fine materials, such as clay or silt, require smaller voxels to capture realistic site conditions. This method uses actual site condition data from geotechnical investigations and laboratory testing data, and the voxel size is selected based on the condition and material type. After the material is identified, its standard size can be used for voxel size selection. To ensure computational efficiency, a scaling factor is applied to the size of the geotechnical material. The voxel size is calculated as follows:

\text{Voxel size for material type} = \Delta_{material} = \text{material size} \times \text{scaling factor}

(6)

The voxel size is constrained by the allowable error that occurs when the voxels surpass the design surface boundary or the as-built surface boundary (the as-built surface obtained after the point cloud is converted into a surface). Smaller voxels provide greater accuracy; however, they require additional computation for model creation and simulation. For the earthwork scenario, an error of 2–3% is allowable [25]. For earthwork design volume, V_d, and the allowable error range, E_allow, the voxel size for volume accuracy can be determined as follows:

\text{Voxel size for accuracy} = \Delta_{accuracy} = \left( V_{d} \times E_{allow} \right)^{\frac{1}{3}}

(7)

Within the allowable error range, the voxel size is directly correlated with computational efficiency. A voxel size that balances the accuracy and computational burden should be selected. The bucket size also influences voxel resolution. A suitable voxel size is calculated by dividing the bucket dimension by the resolution factor (n). For bucket width B_w and resolution factor n, the voxel size can be determined as follows:

\text{Voxel size for bucket width} = \Delta_{bucketsize} = \frac{\text{Bucket width } (B_{w})}{\text{resolution factor } (n)}

(8)

For larger bucket sizes, coarser voxels provide the required resolution for model–equipment manipulation. In addition, the contact area between the bucket and voxel affects the voxel size because it can affect the penetration depth. Larger buckets have greater widths and penetration depths, and require coarser voxels. The voxel size for penetration depth d_p is calculated as follows:

\text{Voxel size for interaction} = \Delta_{interaction} = \frac{\text{Bucket width } (B_{w}) \times \text{Depth of bucket teeth}}{\text{Contact resolution factor } (n)}

(9)
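Equations (6)–(9) each give a candidate voxel size. The following minimal sketch evaluates them side by side and keeps the smallest candidate as the governing size. All input values are hypothetical and chosen only to make the arithmetic easy to follow, and Equation (9) is implemented exactly as written above.

```python
def size_for_material(material_size, scaling_factor):
    """Equation (6): material size times scaling factor."""
    return material_size * scaling_factor

def size_for_accuracy(design_volume, allowable_error):
    """Equation (7): cube root of design volume times allowable error."""
    return (design_volume * allowable_error) ** (1.0 / 3.0)

def size_for_bucket(bucket_width, resolution_factor):
    """Equation (8): bucket width divided by resolution factor."""
    return bucket_width / resolution_factor

def size_for_interaction(bucket_width, teeth_depth, contact_resolution_factor):
    """Equation (9): bucket width times teeth depth over contact resolution factor."""
    return bucket_width * teeth_depth / contact_resolution_factor

# Hypothetical inputs (not from the study).
candidates = {
    "material": size_for_material(0.002, 50),        # 2 mm particle, factor 50
    "accuracy": size_for_accuracy(1000.0, 0.02),     # 1000 m^3 design volume, 2% error
    "bucketsize": size_for_bucket(1.2, 10),          # 1.2 m bucket, n = 10
    "interaction": size_for_interaction(1.2, 0.3, 4),
}
governing = min(candidates, key=candidates.get)
voxel_size = candidates[governing]
```

With these inputs the candidates are roughly 0.10, 2.71, 0.12, and 0.09, so the interaction criterion governs.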

The optimal voxel size is selected as min(Δ_{material}, Δ_{accuracy}, Δ_{bucketsize}, Δ_{interaction}). After the voxel size is selected, voxel grids are generated. For the voxel grids, a diagonal vector is generated in the bounding box. As the centers of the bounding box and mesh are aligned in the AABB, the diagonal vector passes through the mesh geometry. The diagonal vector is divided at a predefined interval, which creates grids in the local x-direction. Voxels are generated over the grids in the x-direction; see Figure 5c,d,f. Each voxel in the bounding box is indexed and stores its center position. Initiating voxelization from the center position enables a uniform distribution over the bounding box. After voxelization of the bounding box, a ray-casting algorithm is applied to determine whether each voxel is inside the design mesh, intersects it, or is outside. The outside voxels are detected and discarded, since only voxels belonging to the design or as-built models are used. Additionally, each voxel is embedded with semantic information, as presented in Figure 5g.
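The occupancy test can be sketched with a parity-based ray cast: a ray is shot from a voxel center, and an odd number of triangle crossings means the center lies inside a closed mesh. The sketch below uses the Moller-Trumbore ray/triangle test and classifies by voxel center only, which is a simplification assumed here; it does not distinguish voxels that intersect the boundary, and the function names and the tetrahedral test mesh are hypothetical.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ray_hits_triangle(origin, direction, tri, eps=1e-12):
    """Moller-Trumbore test; True if the ray hits the triangle in front of origin."""
    v0, v1, v2 = tri
    e1, e2 = sub(v1, v0), sub(v2, v0)
    h = cross(direction, e2)
    a = dot(e1, h)
    if abs(a) < eps:               # ray is parallel to the triangle plane
        return False
    f = 1.0 / a
    s = sub(origin, v0)
    u = f * dot(s, h)
    if u < 0.0 or u > 1.0:
        return False
    q = cross(s, e1)
    v = f * dot(direction, q)
    if v < 0.0 or u + v > 1.0:
        return False
    return f * dot(e2, q) > eps    # keep only intersections ahead of the origin

def center_inside(center, triangles, direction=(1.0, 0.3, 0.2)):
    """Odd crossing count means the point is inside a closed mesh."""
    hits = sum(ray_hits_triangle(center, direction, t) for t in triangles)
    return hits % 2 == 1

# Hypothetical closed mesh: a unit tetrahedron.
A, B, C, D = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
tetra = [(A, B, C), (A, B, D), (A, C, D), (B, C, D)]
inside = center_inside((0.1, 0.1, 0.1), tetra)
outside = center_inside((1.0, 1.0, 1.0), tetra)
```

The skewed default ray direction is chosen to reduce the chance of grazing a shared edge or vertex, which would otherwise corrupt the parity count; a production implementation would need explicit handling of such degenerate cases.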

3.4. Equipment Configuration in the Virtual Environment

Equipment is configured within the virtual environment for the operational simulation. The equipment configuration involves the development of equipment components in a virtual environment and building a hierarchy. This study focused on excavation and filling, and the equipment required for this process, including an excavator and dumper, was configured. Building the hierarchy of the virtual components of the equipment is crucial for capturing the actual equipment kinematics.

For instance, an excavator consists of multiple components such as an upper body, lower body, arm, boom, and bucket. The components are linked hierarchically according to the parent–child relationship. Each component of the equipment has a geographical position, rotation, and scale. Excavator activities include digging/loading, swinging to dump/hauling, dumping/unloading, and returning to the dig. The excavator can translate along and rotate about the X, Y, and Z axes using the assigned degrees of freedom (DOFs). Similarly, excavator components have DOFs; for example, the bucket can roll in or roll out to represent the dynamic behavior of the virtual model.

Figure 6 shows the components of the excavator that were built to model the equipment accurately. The equipment cabins, doors, mechanical parts, vehicle parts, and wheels are at the same hierarchical level, indicating that they can rotate independently during the simulation. The most important components are the mechanical parts, including swing frames, stabilizers, cylinders, booms, arms, levers, and buckets. The swing frame is the parent of the boom, arm, bucket, and cylinders, such that when the frame swings, these components move together. In this example, the location information of each joint is attributed as data, which provides the motion of the components during the simulation.
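The effect of a parent–child hierarchy on kinematics can be illustrated with a simplified planar (side-view) sketch: each part stores its angle relative to its parent, so rotating a parent carries all of its descendants with it. The `Part` class, link lengths, and angles below are hypothetical and far simpler than the full 3D hierarchy in Figure 6.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Part:
    name: str
    length: float                 # distance from this joint to the next joint (m)
    angle: float = 0.0            # rotation relative to the parent (rad)
    children: list = field(default_factory=list)

def world_joints(part, origin=(0.0, 0.0), parent_angle=0.0, out=None):
    """Return {part name: world position of its joint} by walking the hierarchy."""
    out = {} if out is None else out
    angle = parent_angle + part.angle      # child inherits the parent's rotation
    out[part.name] = origin
    tip = (origin[0] + part.length * math.cos(angle),
           origin[1] + part.length * math.sin(angle))
    for child in part.children:
        world_joints(child, tip, angle, out)
    return out

# Hypothetical chain: boom -> arm -> bucket.
bucket = Part("bucket", 0.5)
arm = Part("arm", 1.0, angle=-math.pi / 2, children=[bucket])
boom = Part("boom", 2.0, angle=math.pi / 2, children=[arm])

raised = world_joints(boom)
boom.angle = 0.0                  # rotating the parent moves every descendant
lowered = world_joints(boom)
```

Only the boom angle is changed between the two calls, yet the arm and bucket joints both move, which is the behavior the hierarchy is meant to capture.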

3.5. Interaction Between the Model and Equipment for Graphical Simulation in Unity 3D

To create a simulation of the earthwork operational environment and model–equipment interaction at any given time during construction, interaction between voxels and equipment is necessary. This simulation was achieved in the Unity game engine. The Unity game engine was selected because it can import 3D data from design tools, provides a robust physics system, simulates model–equipment contact forces, allows simulation enhancements through scripts, and supports cross-platform deployment and interactive user interfaces (UIs), where the operator or manager can interact with the system. Considering the complex nature of the earthwork process, the interaction is categorized into two main components, as follows: (i) model vs. model, and (ii) model vs. equipment. Since the target, ground, and as-planned voxel models are used in the proposed graphical simulation, each model is created separately. A surface-based representation is used to create the target and ground models, while voxels are used to represent the as-planned earthwork model. These models interact with each other during earthwork operations in the graphical simulation.

Each component is assigned Unity physics components, which provide functions that behave realistically (Table 1). The model vs. model interaction was achieved using the defined Unity physics components. For example, the excavator falls under gravity because of the Rigidbody component, and the ground Mesh Collider supports the equipment’s navigation around the construction site during the graphical simulation.

On the other hand, the model vs. equipment interaction is the most important and complex aspect of manipulating earthwork models with equipment. This has been achieved using Unity physics components and scripting. When an excavator performs activities such as digging, loading, and moving back and forth for material handling, it interacts with the environment. The excavator interacts with the environment in two ways, as follows: (i) interaction with the environment not directly handled by the excavator bucket, i.e., the ground and target models, and (ii) interaction with the work environment, i.e., the voxel-based as-planned model.

To reduce the computational effort, the environment interactions (equipment vs. target model and equipment vs. ground model) are achieved through Unity physics, which provides exact collision detection. The excavator vs. as-planned model is a complex interaction because the excavator has to deal with the voxels to perform the earthwork activities. In the context of construction management, project managers and operators are mainly concerned with volumetric deformation, quantities, and realistic visualization for making informed decisions. Simplified but efficient algorithms have been developed based on collision detection to create interaction between buckets and voxels, allowing excavation and earth loading to be performed. The algorithms allow earthwork activities to be performed and the model to be updated in real time, which would otherwise be too complex and require high computational loads due to the strongly nonlinear behavior of the dynamic equations [36].

The algorithms are scripted in C# and attached to each voxel and to the excavator bucket to determine the position of the bucket. When the excavator bucket enters the trigger zone of the voxels, the interaction is determined based on the Box Colliders assigned to each voxel. The algorithm detects the event and triggers the interaction. Because the voxels proposed in this research are identically shaped, collisions with the bucket Mesh Collider can be detected easily and efficiently. Similarly, when the excavator bucket exits the voxel trigger zone, the trigger exit function is activated. As the excavator bucket moves inside the voxels, the algorithm continuously monitors the bucket movement to ensure precise alignment with the voxels. By dynamically adjusting the bucket position according to user input and earthwork conditions, the algorithm replicates the complex interaction between the excavator and voxels. The bucket penetrates the voxels based on the collision detected between the excavator and voxels by Unity’s physics, and voxels are removed according to the bucket adjustment. The voxel position is updated based on the movement of the excavator bucket. When the excavator loads the voxels and unloads them in the dumper, the system monitors the number and type (material type) of voxels and visualizes them in the user interface. This approach enables users to excavate the as-planned voxels, which represent volumetric changes as the bucket loads them, mirroring actual earthwork construction with high fidelity.
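The enter and exit events described above can be mimicked outside a game engine with a minimal sketch that tests axis-aligned box overlap each frame and reports which voxels the bucket has just entered or left. Unity exposes comparable events through collider trigger callbacks such as `OnTriggerEnter` and `OnTriggerExit`; the source does not state which callbacks its C# scripts use, so the `TriggerTracker` class, the box representation, and the coordinates below are illustrative assumptions written in Python rather than the authors' implementation.

```python
def boxes_overlap(a_min, a_max, b_min, b_max):
    """True if two axis-aligned boxes overlap on all three axes."""
    return all(a_min[i] <= b_max[i] and b_min[i] <= a_max[i] for i in range(3))

class TriggerTracker:
    """Tracks which voxel boxes the bucket box currently overlaps."""

    def __init__(self):
        self.inside = set()

    def update(self, bucket_box, voxel_boxes):
        """Return (entered, exited) voxel IDs for this frame."""
        b_min, b_max = bucket_box
        now = {vid for vid, (lo, hi) in voxel_boxes.items()
               if boxes_overlap(b_min, b_max, lo, hi)}
        entered, exited = now - self.inside, self.inside - now
        self.inside = now
        return entered, exited

# Hypothetical scene: two unit voxels side by side along x.
voxels = {
    "V1": ((0.0, 0.0, 0.0), (1.0, 1.0, 1.0)),
    "V2": ((1.0, 0.0, 0.0), (2.0, 1.0, 1.0)),
}
tracker = TriggerTracker()
frame1 = tracker.update(((0.2, 0.2, 0.2), (0.8, 0.8, 0.8)), voxels)
frame2 = tracker.update(((1.2, 0.2, 0.2), (1.8, 0.8, 0.8)), voxels)
```

Because every voxel is the same axis-aligned cube, the overlap test reduces to six comparisons per voxel, which is one reason uniformly shaped voxels are inexpensive to check.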

Excavator unloading is performed similarly. When the excavator bucket is rolled out, the voxels fall into the dumper due to the gravity function of Unity physics. The dumper detects the voxels and shows them in the user interface with properties such as volume, material type, number of voxels, etc. The advantage of the proposed method is that it represents an as-planned model with voxels that provide realistic simulation and volume calculation, performing better than previous methods where volume is estimated based on approximations.

3.6. Digital Twin Update

The proposed framework evolves continuously as new sensing data enter the system. It uses a voxel-to-voxel comparison to monitor volumetric information and to track productivity, progress, and materials. When new data is acquired with cameras or laser scanning, it is processed into a point cloud [48]. The TIN surface is generated from the point cloud and extruded relative to the ground surface to create a solid model. The solid is voxelized according to our previous method [25].

The as-planned voxels, obtained by voxelizing the design surface, and the as-built voxels, obtained from the sensing data, are spatially registered by aligning their global coordinates. Since, in the context of this research application, both the as-planned and as-built models are created in the same coordinate system, coordinate transformation is not required. Consequently, each voxel in the as-planned model is compared with the as-built voxels using centroid proximity. Centroid proximity ensures that corresponding voxels in both models are spatially aligned.

After the correspondence between as-planned and as-built voxels is established, the as-planned voxel status is updated according to the presence of as-built voxels at the same location within the digital twin environment. The status of the as-planned voxels is changed from planned to excavated when as-built voxels are found at a given coordinate. After the status is changed, the excavated voxels are hidden in the visualization to show the as-built status of earthwork construction. Additionally, the volume of the excavated voxels is used for volumetric quantification, productivity calculation, and monitoring. Moreover, the material information, along with other semantics, is extracted from the excavated voxels for material tracking. The voxel-to-voxel comparison provides not only productivity and progress information but also material information, which helps project managers assess what kinds of materials have been excavated, since these materials can be used for filling or other activities. It allows a fine-grained representation of construction progress, ensuring that the digital twin evolves accurately according to the current state of the construction.
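A minimal sketch of the voxel-to-voxel update follows. It assumes that both voxel sets lie on the same grid anchored at the coordinate origin, so that two centroids correspond when they fall in the same grid cell; this is one simple way to realize centroid proximity and is not necessarily the matching rule used in the study. The dictionary fields and sample values are hypothetical.

```python
import math
from collections import Counter

def grid_key(centroid, voxel_size):
    """Index of the grid cell that contains a centroid."""
    return tuple(math.floor(c / voxel_size) for c in centroid)

def update_progress(as_planned, as_built_centroids, voxel_size):
    """Mark matched as-planned voxels as excavated; return (volume, material tally)."""
    occupied = {grid_key(c, voxel_size) for c in as_built_centroids}
    materials = Counter()
    for voxel in as_planned:
        if (voxel["status"] == "planned"
                and grid_key(voxel["centroid"], voxel_size) in occupied):
            voxel["status"] = "excavated"
        if voxel["status"] == "excavated":
            materials[voxel["material"]] += 1
    volume = sum(materials.values()) * voxel_size ** 3
    return volume, materials

# Hypothetical data: three 1 m as-planned voxels and two as-built centroids.
planned = [
    {"id": "V1", "centroid": (0.5, 0.5, 0.5), "material": "clay", "status": "planned"},
    {"id": "V2", "centroid": (1.5, 0.5, 0.5), "material": "sand", "status": "planned"},
    {"id": "V3", "centroid": (2.5, 0.5, 0.5), "material": "sand", "status": "planned"},
]
built = [(0.5, 0.5, 0.5), (1.5001, 0.5, 0.5)]
excavated_volume, excavated_materials = update_progress(planned, built, 1.0)
```

Indexing the as-built centroids in a set keeps the comparison linear in the number of voxels, which matters when the models contain many small voxels.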

4. Implementation of the Proposed System

This study used excavation sites as examples to verify the developed framework and system capabilities for earthwork construction. The system architecture is shown in Figure 7. First, the system collects all related information, including surface terrain data, subsurface information, design BIM models, equipment data, and scans collected during the construction stage, to represent the current state of information. In the next phase, all these data are brought into a virtual platform to create the earthwork digital replica, which can update the digital models when new data from the physical twin is available. The as-is condition data are converted into a graphical simulation environment for model–equipment interaction. Moreover, the as-planned and as-built data are continuously compared for progress monitoring, providing practitioners with visual and quantitative insights into progress.

The system is implemented using commercial tools such as Autodesk Civil 3D, Rhino 3D, and Unity 3D. For instance, the ground, geotechnical, and design models are developed using Autodesk Civil 3D 2023, while Rhino 3D version 8 is used for the design voxelization and integration of all models to achieve the as-planned model. The point cloud data acquired during the construction stage is preprocessed in CloudCompare, converted into a TIN surface in Autodesk Civil 3D to represent the current state of construction, and later integrated with the as-planned model in Rhino 3D. The as-planned model is updated accordingly, and progress is visualized and quantified inside Rhino 3D. Meanwhile, the as-planned model in the preconstruction stage, or the as-is condition information at any stage of construction, is linked with Unity 3D for graphical simulation and model–equipment interaction. The C# scripts for model–equipment interaction are also developed within Unity 3D. All these models, adopted from the physical twin through data acquisition techniques, are processed, virtualized, simulated, analyzed, and updated in near-real time whenever new data is available.

Considering the implementation scenarios of the proposed digital twin system, two different case studies were selected. A building pad excavation project was used for the implementation of the graphical simulation environment using the digital twin data. Moreover, a trench excavation site was used to implement progress monitoring. However, the study framework is general and can be applied to all earthwork cases. The selected case studies, a building pad and trench, represent two of the most commonly encountered earthwork conditions in infrastructure and site development projects. The building pad scenario is typical of large-scale cut-and-fill operations requiring broad volumetric balancing and ground preparation for structural foundations. On the other hand, the trench excavation is representative of narrower and linear excavations frequently used for utility installations or road excavation, making it suitable for assessing detailed boundary conditions and model–equipment interaction at a fine resolution. These two case studies together offer balanced and realistic representations of earthwork conditions that span different operational requirements and geometric constraints.

4.1. Graphical Simulation Environment for Model–Equipment Interaction

The first implementation scenario of the proposed digital twin is an application for a graphical simulation environment. The development was divided into the following steps: (i) data collection and accuracy evaluation, (ii) creation of virtual or digital twin models, (iii) platform for model integration, and (iv) scenario development for model manipulation.

4.1.1. Data Collection and Processing

The construction site was scanned using a DJI Phantom 4 Real-time Kinematic (RTK) (DJI, Shenzhen, China) drone according to the earthwork management plan. First, ground control points (GCPs) were selected at different reference points on the site, and their ground coordinates were measured using a Leica RTK-Global Positioning System (RTK-GPS) (Leica, Wetzlar, Germany). Next, the drone was flown at an altitude of 100 m, and site images were captured. The accuracy of the data was verified by comparing the coordinates of unknown points selected on the construction site, as obtained from the total station and from the drone survey. The Leica Total Station (TS), TS-16, was used to measure the coordinates of the reference points for accuracy evaluation. Errors of 2, 2, and 2.4 cm were found for this specific site in the x-, y-, and z-directions, respectively, which are considered within the allowable range for earthwork. Next, subsurface information was collected through boreholes. The borehole coordinates were determined manually using a total station. Geotechnical information was obtained through laboratory tests.
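The accuracy check can be illustrated with a short sketch that compares drone-derived coordinates against total-station coordinates point by point. The source does not state which error metric was used, so the mean absolute difference per axis is assumed here purely for illustration, and the coordinates are hypothetical rather than measured values from the site.

```python
def mean_abs_error_per_axis(drone_points, station_points):
    """Mean absolute coordinate difference along x, y, and z (same units as input)."""
    n = len(drone_points)
    return tuple(
        sum(abs(d[axis] - s[axis]) for d, s in zip(drone_points, station_points)) / n
        for axis in range(3)
    )

# Hypothetical check points (metres): drone survey vs. total station.
drone = [(10.02, 20.01, 5.03), (30.00, 40.03, 7.99)]
station = [(10.00, 20.00, 5.00), (30.02, 40.00, 8.01)]
err_x, err_y, err_z = mean_abs_error_per_axis(drone, station)
```

For these sample points the per-axis errors are about 0.020 m, 0.020 m, and 0.025 m; a root-mean-square error would be an equally plausible choice of metric.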

4.1.2. Digital Twin Models Creation

The digital twin models for the graphical simulation were based on the as-is condition. The ground, geotechnical, design, and the current state of the data were processed to develop the twin models. The procedure for creating the digital twin models is presented in Figure 8.

Firstly, a ground model was generated, representing the original ground condition before construction, and serving as a base for the model registration in the downstream stages. A 3D point cloud was generated by processing images through image registration, which involved merging and aligning the images using feature extraction, feature matching, and bundle adjustment. A TIN surface was created from the point cloud data. Then, geotechnical modeling was performed using borehole investigation data, using the algorithm proposed in a previous study [24].

The next step was to create an as-is condition model. Since the scenario was created before construction started on site, the design model, along with the ground and geotechnical models, was used to represent the as-is condition. It was modeled using Autodesk Civil 3D, providing target information and quantities. The ground and target models are shown in Figure 9a. A solid was extracted using the ground and target TIN surfaces, which shows the as-planned excavation model. The earthwork models from Autodesk Civil 3D were imported into Rhino 3D software using the Drawing (dwg) format. The extracted solid was converted into a mesh, which was then converted into voxels. Before converting the design mesh into voxels, the voxel size was determined based on the allowable accuracy and computation, site material, and bucket information, such as width. A voxel size of 1 cm was selected, considering the site material and bucket size, as it resulted in minimal error and efficient computation. The voxel-based as-planned model is shown in Figure 9b.

For equipment modeling, a customizable library was used in the implementation. Users can manipulate the equipment in the simulation, such as translating the machine, rotating mechanical parts, and digging voxels with a bucket. Keyboard keys were used to control the equipment commands. Each mechanical component had a lever attached to perform its functions. The lever had an input setting controlled by keyboard commands. The details of the input settings for the excavator and dump truck are described in Table 2.

4.1.3. Model Integration in Platform

A platform was developed to implement the proposed system. The Unity game engine was selected because it can import 3D data from design tools, provides a robust physics system, and simulates model–equipment contact forces, while allowing the simulation to be enhanced through scripts. It also supports cross-platform deployment and interactive user interfaces (UIs), where the operator or manager can interact with the system.

Using Unity poses three challenges that must be overcome for successful platform implementation. First, Unity manipulates data in mesh format, which requires the conversion of earthwork data, such as surface models, into mesh models. Second, the manipulation of voxel data in Unity requires an adaptive tree structure. Third, metadata or attributes attached to each voxel must be preserved.

The earthwork geometrical data were converted into meshes inside Rhino 3D. The meshes can be imported into Unity 3D in Object (Obj), Drawing Exchange Format (DXF), or Filmbox (Fbx) formats. The problem with the Obj and DXF formats is that they export the selected model as an integrated mesh; for example, voxels are exported as a single mesh that cannot be manipulated in Unity 3D. Furthermore, exporting each voxel separately and integrating it into Unity 3D is time-consuming. Hence, the Fbx format was selected to convert the earthwork models into meshes along with the material, color, and texture information.

The ground and target models were assigned different layers with defined materials, colors, and unique names. The voxels were divided into different layers based on the soil type and exported in the Fbx format. This has the advantage that each layer can be assigned a name and tag in the Unity platform, which makes it recognizable. The attribute conversion was achieved through the following two steps: exporting the attribute information from Rhino into a comma-separated values (CSV) file and importing it into Unity. A custom script was designed to iterate through each voxel in the design tool, extract key parameters, and save them in a CSV file. The next step reads the exported attributes and integrates them into the voxels. The script first identifies each voxel based on its ID and position, and then maps its respective data accordingly, ensuring that the data is assigned correctly to each voxel. The pseudocode for the script is presented in Algorithm 1: Conversion of voxel physical properties from Rhino 3D to Unity 3D.

Algorithm 1 Algorithm for Voxel Property Conversion

Input: voxels, ID, position, metadata
Output: voxels with metadata in Unity

Function export_voxel_attribute_data():
  Open “voxel_attributes.csv” for writing
  Write “Voxel_ID, x, y, z, volume, material, mass, density, friction, strength, cohesion, resistance” to file
  For each voxel:
    If voxel is valid:
      Voxel_ID = Get voxel unique identifier
      x, y, z = Get spatial coordinates of voxel center
      volume = Get cubic volume
      material = Get material type (user-defined or default)
      mass = Get voxel mass
      parameters (i…n) = Get voxel parameters value
      Write “Voxel_ID, x, y, z, volume, material, mass, parameters value” to file
  Close file

Function import_voxel_attribute_data():
  Open “voxel_attributes.csv” for reading
  For each line in file:
    “Voxel_ID, x, y, z, volume, material, mass, parameters value” = Parse line
    // Find voxel in Unity project
    voxel = Find voxel using Voxel_ID
    If voxel exists:
      Set voxel spatial position to (x, y, z)
      Resize volume based on volume attribute
      Assign metainformation to the voxel:
        Voxel.material = material
        Voxel.mass = mass
        Voxel.parameters(i…n) = parameters value
  Close file
  Return voxel model with metainformation in Unity 3D
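The round trip in Algorithm 1 can be sketched with Python's standard `csv` module. This is an illustrative stand-in, not the Rhino or Unity script used in the study: voxels are plain dictionaries, an in-memory text stream replaces the CSV file, only a subset of the attribute columns is carried, and the `valid` flag and sample values are hypothetical.

```python
import csv
import io

FIELDS = ["Voxel_ID", "x", "y", "z", "volume", "material", "mass"]

def export_voxel_attributes(voxels, stream):
    """Write one CSV row per valid voxel, with a header row first."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    for voxel in voxels:
        if voxel.get("valid", True):
            writer.writerow({key: voxel[key] for key in FIELDS})

def import_voxel_attributes(stream, scene_voxels):
    """Attach attributes to scene voxels looked up by Voxel_ID; skip unknown IDs."""
    for row in csv.DictReader(stream):
        target = scene_voxels.get(row["Voxel_ID"])
        if target is None:
            continue
        target["position"] = (float(row["x"]), float(row["y"]), float(row["z"]))
        target["volume"] = float(row["volume"])
        target["material"] = row["material"]
        target["mass"] = float(row["mass"])
    return scene_voxels

# Hypothetical voxels from the design tool; V2 is flagged invalid and skipped.
design_voxels = [
    {"Voxel_ID": "V1", "x": 0.5, "y": 0.5, "z": -0.5, "volume": 1.0,
     "material": "clay", "mass": 1800.0},
    {"Voxel_ID": "V2", "x": 1.5, "y": 0.5, "z": -0.5, "volume": 1.0,
     "material": "sand", "mass": 1600.0, "valid": False},
]
buffer = io.StringIO()
export_voxel_attributes(design_voxels, buffer)
buffer.seek(0)
scene = import_voxel_attributes(buffer, {"V1": {}, "V2": {}})
```

Keying the import on the voxel ID keeps the geometry export (Fbx) and the attribute export (CSV) independent, so either can be regenerated without touching the other.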

4.1.4. Modeling Model–Equipment Interaction

The core implementation to enable model–equipment interaction requires models with physical properties assigned, interaction detection capability, and voxel excavation based on the bucket force. With physical properties assigned to each component, Rigidbody and Collider components are also assigned. A Rigidbody enables the earthwork model to avoid deformation under external forces and to react to these forces. For example, the excavator falls under gravity because of its Rigidbody, and the ground Mesh Collider supports equipment navigation around the construction site. Colliders are used to detect collisions between the models during the simulation.

The Rigidbody component is configured with a realistic mass value for every voxel derived from the volume and density. In addition, all voxels are assigned colliders that surround the voxel boundary and detect collisions when equipment buckets trigger the buffer zone. The voxels respond to the bucket forces based on their physical properties and Box Collider-based collision detection. When the excavator bucket enters the trigger zone of the voxels, a collision is detected. As the excavator bucket moves inside the voxels, its movement is continuously monitored because it interacts with the voxels to ensure precise alignment with the voxels. The applied force is compared with the physical properties of the voxel.

When the applied force exceeds the resistive force, the voxels are excavated and added to the bucket payload. The voxel position is updated based on the movement of the excavator bucket. When the excavator loads the voxels and unloads them in the dumper, the system monitors the number and type (material type) of voxels and visualizes them at the UI. Moreover, when the excavator bucket is rolled out, the voxels fall into the dumper owing to gravity. The dumper detects the voxels and shows them in the UI with properties such as volume, material type, and number of voxels.
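The excavation rule above reduces to a per-voxel threshold test, sketched below. The source does not specify how the applied and resistive forces are computed, so each voxel is simply given a hypothetical `resistance` value and the applied force is passed in as a single number; the field names and magnitudes are assumptions for illustration only.

```python
from collections import Counter

def excavate(voxels, applied_force):
    """Split voxels into bucket payload and undisturbed material."""
    payload = [v for v in voxels if applied_force > v["resistance"]]
    remaining = [v for v in voxels if applied_force <= v["resistance"]]
    return payload, remaining

def payload_summary(payload, voxel_size):
    """Count, volume, and per-material tally, as a UI panel might display them."""
    return {
        "count": len(payload),
        "volume": len(payload) * voxel_size ** 3,
        "by_material": dict(Counter(v["material"] for v in payload)),
    }

# Hypothetical voxels in contact with the bucket (forces in newtons).
bank = [
    {"id": "V1", "material": "sand", "resistance": 100.0},
    {"id": "V2", "material": "clay", "resistance": 500.0},
    {"id": "V3", "material": "sand", "resistance": 250.0},
]
payload, remaining = excavate(bank, applied_force=300.0)
summary = payload_summary(payload, voxel_size=0.5)
```

In this example the 300 N force removes the two sand voxels and leaves the stiffer clay voxel in place.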

4.1.5. User Interface Development

The integrated georeferenced earthwork model is shown in Figure 10 along with the equipment and other details of the proposed system. The imported models are shown in the asset module and can be used at any time while the game objects in the hierarchy are manipulated. They include earthwork models, UI components, cameras, lighting, and equipment models. Users interact with the model through an interactive UI developed using the Unity platform. In equipment selection, the excavator and dumper are linked to UI buttons, where users select and switch between equipment types. The equipment is linked and activated through UI buttons with scripts having the ExcavatorActive and DumperActive functions. The ExcavatorActive function activates or deactivates all relevant components or activities of the excavator in the game.

The calculation UI displays information about earthwork quantities in terms of voxel volume, type, and number, along with time metrics and equipment information, such as excavator and dumper productivity. The voxel information is monitored at the excavation site, dump truck, and dump site. The information is categorized according to the soil type. The UI tracks the times for different activities, such as loading, hauling, returning, and dumping. These periods are associated with the equipment activities for each cycle. The systems calculate the productivity of the equipment based on the number of voxels excavated and the time period.
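The quantities shown in the calculation UI can be illustrated with a short sketch. It assumes identically sized cubic voxels and a single cycle time per summary; the function name `summarize_cycle` and the example values are hypothetical and do not come from the study.

```python
from collections import defaultdict


def summarize_cycle(excavated_materials, voxel_size_m, cycle_time_s):
    """Summarize one cycle.

    excavated_materials: one material label per excavated voxel.
    Returns (voxel count per material, volume per material in m^3,
    productivity in m^3/h).
    """
    voxel_volume = voxel_size_m ** 3
    counts = defaultdict(int)
    for material in excavated_materials:
        counts[material] += 1
    volume_by_material = {m: n * voxel_volume for m, n in counts.items()}
    total_volume = len(excavated_materials) * voxel_volume
    # Productivity = excavated volume per unit time, converted to hours.
    productivity_m3_per_h = total_volume * 3600.0 / cycle_time_s
    return dict(counts), volume_by_material, productivity_m3_per_h


counts, volumes, productivity = summarize_cycle(
    ["gravel"] * 6 + ["silt"] * 2, voxel_size_m=0.5, cycle_time_s=90.0
)
```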

4.1.6. Excavation Cycle Visualization

An earthwork cyclic activity scenario was selected to demonstrate the functionality of the proposed system. The excavation activities were performed in repetitive cycles and involved digging, loading, hauling, unloading, dumping, and returning. For model–equipment manipulation, users first selected equipment according to the activity. The excavation in the simulation was initiated when the excavator bucket interacted with the voxels. As the bucket penetrated the voxels, a collision was detected. The bucket penetrated the voxels to depth d based on the applied force, which was calculated from the bucket speed, penetration depth, and machine power. Once the voxels were dug and loaded into the bucket, they were classified as excavated voxels and added to the bucket payload. Figure 11 shows the excavation cycle, allowing users to follow the process from digging, which involves bucket–voxel interaction, to unloading the voxels at the dump site.

After voxel excavation, the loaded voxels were collected in the bucket against the force of gravity, and the total number of voxels was monitored, depending on the voxel size and bucket size. The excavator hauled the voxel materials following constraints such as rotational angle, arm limit, movement, or velocity constraints. The unloading process was followed by placing the voxels from the bucket in the dump truck under gravitational acceleration and interactions with neighboring voxels. Finally, the dump truck was driven to the dump site, where the voxels from the dump truck container were unloaded under gravity at the dump site and settled after interacting with the rigid terrain.

To improve operational performance, performance metrics were used to display operational information, including voxel volume excavated, dumped volume, cycle time, and productivity. This information was displayed on the UI captured through the scripts attached to the components (Figure 11). The results of the simulation cycle show that the proposed system provides an effective visualization of the earthwork operational process, where the equipment interacts and manipulates the as-planned model.

4.1.7. System Evaluation

The volumetric changes in the simulation were compared with the volumetric changes at the actual construction site, along with the time taken by the equipment for the excavation activities. Volume accuracy was calculated by monitoring the voxel volume from the virtual environment and the earthwork material volume excavated from a real construction site. Accuracy was calculated from the volumetric differences between the virtual and actual environments using the following equation:

$$\text{Volume accuracy (\%)} = \left(1 - \frac{\left| V_{\text{simulated}} - V_{\text{real}} \right|}{V_{\text{real}}}\right) \times 100 \tag{10}$$

The earthwork volume at the actual construction site was calculated by scanning the excavation site and calculating the physical site volume. The proposed method uses identically sized voxels and calculates the volume using the voxel size and the number of voxels. For a robust evaluation, three excavation volume measurements were taken, and the average of the three measurements was used for each material.
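As a worked illustration of Equation (10), the sketch below computes the simulated volume from the voxel count and voxel size, averages three site measurements, and evaluates the accuracy. The function names and all numeric values are placeholders chosen for the example, not results from the study.

```python
def voxel_volume_m3(n_voxels, voxel_size_m):
    """Volume of n identically sized cubic voxels."""
    return n_voxels * voxel_size_m ** 3


def volume_accuracy_pct(v_simulated, v_real):
    """Equation (10): accuracy from the relative volumetric difference."""
    return (1.0 - abs(v_simulated - v_real) / v_real) * 100.0


def mean(values):
    return sum(values) / len(values)


# Placeholder inputs: 96 voxels of 0.5 m and three scanned site volumes (m^3).
v_sim = voxel_volume_m3(96, 0.5)
v_real = mean([12.4, 12.5, 12.6])
accuracy = volume_accuracy_pct(v_sim, v_real)
```

With these placeholder inputs, the simulated volume is 12 m³ against a mean measured volume of about 12.5 m³, which gives an accuracy of roughly 96%.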

The simulated vs. real volume data showed a close correspondence for each material (Figure 12). Gravel materials had higher accuracy because they consist of discrete particles and can accurately represent the contact forces and resistance between the model and equipment. The proposed method uses a larger voxel size, which creates an additional error between the simulated and real volumes. Smaller voxels can improve accuracy, particularly for silts, because of their fine material and characteristics; however, this significantly increases the computational cost. The proposed system balances accuracy and computational efficiency, providing more accurate results for granular materials.

The time taken by each activity was similar between the virtual and real equipment (Figure 12b). The time taken by the dump truck to move from the excavation site to the dump site was the longest in the virtual earthwork cycle because the distance was long, and the movement was greatly affected by the ground topography. The results were similar to those of the real environment because the topography was developed using scanned data from the photogrammetry technique, which captures accurate as-built information. The proposed method uses a voxel-based representation that accurately captures earthwork information, utilizes Unity 3D physics, and makes the system suitable for large-scale operations.

Beyond material properties and realistic models developed using data collected from real sites, several other factors contribute to the discrepancies observed between simulated and real excavation volume and time. One important cause is the simplification of equipment behavior in the virtual environment. In the simulation, equipment follows idealized, repeatable paths with fixed motion parameters, while in real-world operations, the skill level and behavior of human operators introduce variability in bucket angles, scooping depth, and dumping motions, which significantly affect the exact volume excavated during cycle time. Additionally, the model is simplified, and several key contributing factors, such as spillage, partial bucket loads, and soil sticking to equipment surfaces, are not explicitly modeled in the simulation. This simplification also causes over- or underestimation. Moreover, the system is implemented in Unity 3D, which does not account for the simulation of complex soil–tool interactions, such as deformation and breakage. These factors also impact realism. These simplifications, while needed for maintaining computational performance, collectively contribute to the observed differences and suggest areas for future enhancement using more advanced simulation techniques.

4.2. Operational-Level Voxel-Based Progress Monitoring

In this scenario, the earthwork operational-level progress is monitored using a voxel-to-voxel comparison. The demonstration steps are as follows: (i) data collection, (ii) twin model creation, (iii) alignment of as-planned and as-built models, and (iv) progress visualization and quantification.

4.2.1. As-Planned and As-Built Model Creation

The digital twin models include both as-planned and as-built models. Both of these models are created in a voxel-based representation. The as-planned models are created using original ground data, geotechnical data, and design data. On the other hand, as-built models are created using original ground data and current state data. These models are regularly aligned and compared for progress visualization and quantification.

For the demonstration, the coordinate system was set up first. Site calibration was performed using total station survey equipment, and the site coordinate system was established from four nationally authorized reference points near the competition field. The coordinate system was calibrated to the GRS80 (Korean 2000/Central Belt 2010) and the Geoid KN13. The earthwork site, measuring 105 m × 35 m, was then scanned with a camera mounted on a UAV, as shown in Figure 13a. Autodesk Civil 3D was used to convert the scanned data into a TIN surface. At the same time, geotechnical investigations, such as boreholes, were conducted to obtain the site’s geotechnical information. Additionally, the design trench model was developed and voxelized accordingly (Figure 13b,c).

To generate the TIN surface, the point cloud data after pre-processing was imported into Autodesk Civil 3D. The Surface Generation from Points function was used to create a TIN surface of the ground. This workflow inside Autodesk Civil 3D, while not fully automated through application programming interface (API), utilizes semi-automated native functions. Additionally, the geotechnical module was used for the creation of geotechnical boreholes, geotechnical surfaces, and solids. The design model was also created using Autodesk Civil 3D native functions.

After the voxel-based as-planned model was developed, the model was divided into different sections for construction planning, and as-built data was captured accordingly (Figure 13d). During the construction stage, the planned voxels were removed, and the model was updated to represent the current state of information. Figure 13e represents the earthwork status after the data was collected at Time-1. Similarly, the data was collected at different intervals, and the model was updated accordingly to represent the current state of information. Figure 13f represents the final model, depicting the as-planned target and as-built target model.

4.2.2. Progress Visualization and Quantification

The models were integrated for the visualization and quantitative analysis during construction. The as-built model was compared with the as-planned model at different times, while the target model was shown to the operator and project managers throughout construction. The ground and target models are represented using surfaces, and the as-planned vs. as-built models are represented with voxels.

For progress monitoring, the as-built voxel model was noted at different times, and the quantity of excavated voxels was determined. The actual excavated voxel quantities were compared with the as-planned excavation. In this way, earthwork progress was monitored using voxel excavation. On the other hand, productivity was calculated using the following: (i) the excavated voxel data and (ii) the time taken for excavation.
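A voxel-to-voxel comparison of this kind can be sketched with plain sets of voxel indices. The sketch assumes that both models share the same aligned voxel grid and that each voxel is identified by an integer index triple; the function name `progress_report` and the example values are hypothetical.

```python
def progress_report(planned, remaining, voxel_size_m, elapsed_h):
    """Compare as-planned and as-built voxel sets.

    planned:   set of (i, j, k) indices scheduled for excavation.
    remaining: set of (i, j, k) indices still occupied in the as-built model.
    """
    excavated = planned - remaining  # planned voxels that are no longer present
    voxel_volume = voxel_size_m ** 3
    excavated_volume = len(excavated) * voxel_volume
    return {
        "excavated_voxels": len(excavated),
        "excavated_volume_m3": excavated_volume,
        "progress_pct": 100.0 * len(excavated) / len(planned),
        "productivity_m3_per_h": excavated_volume / elapsed_h,
    }


planned = {(i, j, 0) for i in range(4) for j in range(2)}  # 8 planned voxels
remaining = {(3, 0, 0), (3, 1, 0)}                         # 2 not yet excavated
report = progress_report(planned, remaining, voxel_size_m=1.0, elapsed_h=2.0)
```

Because the comparison is a set difference, the same inputs also yield the list of planned voxels that still have to be removed, which is simply `planned & remaining`.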

The total quantity of the earthwork to be removed was 65 m³. The excavated volume design is shown using a target surface representation, and schedules are embedded into the model for the as-planned state using voxels. After finishing the work in actuality on the construction site, the site images were taken, and a point cloud was generated. The point cloud was converted into a surface model, which was then voxelized to achieve an as-built voxel model. The experiments performed in this scenario are explained in Table 3, along with the progress and productivity information.

4.2.3. Accuracy Evaluation

The proposed method uses point cloud data for as-built voxel generation. However, while UAV-based or laser scanners provide data with sub-centimeter accuracy, raw data typically contains noise, incomplete regions, and redundant points, particularly in areas of steep ground. Additionally, the data is converted into voxels, and this process introduces some inaccuracy, which is inherent in the modeling process. For example, the sources of the as-built data collection (UAVs and laser scanners) may introduce some errors. Furthermore, converting the point cloud data into a TIN surface incorporates interpolation errors, especially in areas where point data is missing or in complex terrains. Moreover, creating voxels from this data introduces discretization errors because there are voxels that are outside of the as-built TIN surface. Hence, to ensure geometric fidelity and reliability of the voxel results, the raw data must undergo a structured preprocessing workflow prior to conversion into the as-built model.

First, point cloud preprocessing begins with noise filtering, where statistical outlier removal is applied to eliminate sparse points that lie significantly outside the neighboring points. This step normalizes the data points, reducing as-built surface roughness and preventing voxel misclassification, which makes the comparison of as-planned and as-built voxels more efficient. To reduce the computational burden and manage data scalability, point cloud decimation/subsampling is applied using a distance-based method. The distance-based method uses a predefined minimum distance between 3D points. A point is retained only once the minimum distance from the previously retained points is reached, and the points in between are excluded [49]. For example, a sampling percentage of 20% corresponds to a large inter-point distance, resulting in a sparser point cloud, whereas a higher sampling percentage (e.g., 70%) corresponds to a denser point cloud with smaller inter-point distances. The method was selected for its simplicity; other methods, such as random point or cell-based approaches, can also be adopted to obtain a sparser point cloud [50]. The method is applied inside CloudCompare, an open-source tool that allows users to manipulate point cloud datasets. It enforces a roughly uniform spacing between the retained points and reduces the computational burden of the model, resulting in a sparser but topologically consistent point cloud.
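The two preprocessing steps can be illustrated with a small, brute-force sketch. It is not the CloudCompare implementation. The neighbor count `k`, the threshold multiplier `std_ratio`, and the greedy retention order are assumptions made for the example, and the quadratic loops are suitable only for small point sets.

```python
import math
import statistics


def statistical_outlier_removal(points, k=3, std_ratio=1.0):
    """Drop points whose mean distance to their k nearest neighbors is more
    than std_ratio standard deviations above the mean over all points."""
    mean_dists = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        mean_dists.append(sum(dists[:k]) / k)
    threshold = statistics.fmean(mean_dists) + std_ratio * statistics.pstdev(mean_dists)
    return [p for p, d in zip(points, mean_dists) if d <= threshold]


def distance_subsample(points, min_dist):
    """Greedy distance-based subsampling: keep a point only if it is at least
    min_dist away from every point that has already been kept."""
    kept = []
    for p in points:
        if all(math.dist(p, q) >= min_dist for q in kept):
            kept.append(p)
    return kept


# Synthetic 5 x 5 grid with 1 m spacing plus one far-away noise point.
grid = [(float(i), float(j), 0.0) for i in range(5) for j in range(5)]
cloud = grid + [(50.0, 50.0, 50.0)]

filtered = statistical_outlier_removal(cloud)
sparse = distance_subsample(filtered, min_dist=1.5)
```

A larger `min_dist` plays the role of a lower sampling percentage in the text: fewer points survive, and the spacing between them grows.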

Once filtered and downsampled, the point cloud is converted into a TIN surface representing the current state of the site. This current-state TIN surface is compared with the ground TIN surface, and the solid between them is extracted. The extracted solid is used as input for voxelization, producing the as-built voxels. However, the transformation from point cloud to TIN introduces minor interpolation errors, particularly in regions with low point density or complex topography. These errors can propagate into the voxel model and must be calculated and controlled. Most importantly, the accuracy of the proposed method must align with construction tolerances; for example, the error of the as-built volume (cut or fill) should not exceed the allowable range. For most earthwork projects, an error deviation of 1–3% is typically considered acceptable. Modeling accuracy is also greatly affected by the size of the voxel: a smaller voxel size fits the as-built volume more closely, reducing error and enhancing model accuracy.
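The effect of voxel size on discretization error can be shown with a synthetic example. The sketch below voxelizes a wedge-shaped cut whose depth grows linearly across its width and compares the voxel volume with the exact volume. The geometry, the meter-scale voxel sizes, and the conservative rule of keeping every voxel that touches the solid at the column center are assumptions chosen for illustration and do not reproduce the experiment in this study.

```python
import math


def voxelized_cut_volume(width_m, length_m, max_depth_m, voxel_size_m):
    """Voxel volume of a wedge-shaped cut (depth 0 at x = 0, max_depth_m at
    x = width_m). In each column, every voxel that overlaps the cut depth
    sampled at the column center is kept, so the result overestimates."""
    s = voxel_size_m
    n_cols = round(width_m / s)
    n_rows = round(length_m / s)
    n_voxels = 0
    for i in range(n_cols):
        depth = max_depth_m * ((i + 0.5) * s) / width_m  # depth at column center
        n_voxels += math.ceil(depth / s) * n_rows
    return n_voxels * s ** 3


def relative_error_pct(width_m, length_m, max_depth_m, voxel_size_m):
    exact = 0.5 * width_m * max_depth_m * length_m  # exact wedge volume
    approx = voxelized_cut_volume(width_m, length_m, max_depth_m, voxel_size_m)
    return 100.0 * abs(approx - exact) / exact


errors = {s: relative_error_pct(12.0, 4.0, 6.0, s) for s in (2.0, 1.0, 0.5)}
```

For this synthetic wedge the overestimate shrinks each time the voxel size is halved, because the voxels that protrude beyond the cut surface become thinner.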

To assess the impact of voxel size and point cloud resolution on model accuracy, multiple experiments were conducted. The raw point cloud was subsampled at varying levels (20%, 30%, 40%, 50%, 60%, and 70%) using a distance-based approach. Similarly, voxel sizes of 3 mm, 6 mm, and 9 mm were considered. The error was calculated by changing the voxel size, and model accuracy was compared with the allowable range.

Figure 14 presents the sampling of the point cloud data and the effect of voxel size on model accuracy. The error was calculated against the ground truth point cloud obtained using a laser scanner with mm level accuracy. It was found that smaller voxels can tolerate the error and enhance model accuracy even for a large, sampled dataset. For example, a voxel size of 3 mm provides less than 3% error (allowable range) when the data is subsampled by 60%. On the other hand, a voxel size of 9 mm tolerates the same error when the data is sampled at 40%.

Even though smaller voxels provide more accuracy, they increase the computational effort for model creation and for model updating when new data is added to the system. Hence, in addition to accuracy, the computational performance of model creation and updating was analyzed. Figure 15 presents the performance of model creation and updating for each voxel size. It can be seen that a smaller voxel size increases the computation time for both model creation and updating. An optimal voxel size can therefore be selected based on both accuracy and computational efficiency.

A voxel size of 9 mm provides an error of 2.9% when the data is sampled at 40%. The time values for model creation and model updating are 14 s and 60 s, respectively, which are suitable for the trench excavation case study. The proposed approach thus provides a way to select an optimal voxel size, considering data quality, accuracy, and computational efficiency. For an earthwork project that is sensitive to accuracy, smaller voxels can be selected to obtain higher accuracy. The results demonstrate that the voxel size and point cloud resolution must be jointly optimized to achieve a balance between modeling accuracy and real-time performance. The framework offers flexibility in selecting voxel parameters based on project-specific accuracy requirements and available data quality, making it adaptable to various earthwork construction scenarios.
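The selection logic can be expressed as a simple filter over measured configurations. The sketch below is one possible formulation, not the procedure used in the study: it keeps the configurations that satisfy an error tolerance and an update-time budget and, as an assumed tie-breaking rule, prefers the fastest update. The names `VoxelConfig` and `select_voxel_config` are hypothetical. The only data point used is the 9 mm case reported above.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class VoxelConfig:
    voxel_size_mm: float
    sampling_pct: float
    error_pct: float
    create_time_s: float
    update_time_s: float


def select_voxel_config(
    candidates: Iterable[VoxelConfig],
    max_error_pct: float,
    max_update_time_s: float,
) -> Optional[VoxelConfig]:
    """Return the feasible configuration with the shortest update time
    (ties broken by lower error), or None if nothing is feasible."""
    feasible = [
        c for c in candidates
        if c.error_pct <= max_error_pct and c.update_time_s <= max_update_time_s
    ]
    if not feasible:
        return None
    return min(feasible, key=lambda c: (c.update_time_s, c.error_pct))


# The configuration reported in the text for the trench case study.
reported = VoxelConfig(voxel_size_mm=9, sampling_pct=40, error_pct=2.9,
                       create_time_s=14, update_time_s=60)
choice = select_voxel_config([reported], max_error_pct=3.0, max_update_time_s=60)
```

An accuracy-sensitive project would tighten `max_error_pct`, which excludes coarse configurations and shifts the choice toward smaller voxels at the cost of longer update times.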

5. Conclusions

Due to the unique characteristics of earthwork construction, such as inhomogeneous earthwork material information, volumetric modeling, irregular terrain, and the complex nature of earthwork resources, such as model–equipment interaction, developing an earthwork digital twin is a challenging task. This study proposes a digital twin framework that connects the physical earthwork twin with its digital counterpart using a voxel-based representation for earthwork operational simulation, as well as for the visualization and quantification of earthwork progress. The key contribution of this study is the proposed development of a framework consisting of five interconnected modules that generate voxel-based as-planned and as-built models, embedding geotechnical properties of the earthwork site. Furthermore, a structured voxel-to-voxel comparison method is proposed that integrates the as-planned model with automated updating using as-built data derived through photogrammetry or laser scanning. Additionally, the voxels’ semantics are used, which provides semantic tracking along with volumetric information. Moreover, the comparison of the as-planned and as-built models provides quality information about the excavation at any instant of the construction.

The proposed digital twin framework was validated with real-world case studies, and its effectiveness was demonstrated through two different scenarios. The first scenario integrates the digital twin environment with earthwork equipment for the simulation of model–equipment manipulations. Unity 3D was utilized, which enabled efficient interaction and manipulation between voxels and equipment through different excavation activities, such as digging, loading, hauling, unloading, and visualizing volumetric deformation at the excavation site. The accuracy of the system was validated by comparing the excavation volumes and activity durations of the real and simulated models. Three different material semantics were used, and the system provided an average accuracy of more than 95%. This shows that the system is reliable for use in different earthwork conditions. Additionally, the gravel material achieved a higher accuracy of 97.8% by using a Rigidbody interaction mechanism from the Unity 3D platform, which simulates granular interaction better than cohesion. Simulation activity duration showed a deviation of less than 6% from the field data. In addition, the developed interactive UI provides valuable insights into project dynamics, assisting earthwork construction managers and equipment operators in interacting with the system and manipulating the work environment and equipment to improve construction site planning, understanding, and management. The proposed system offers a training platform for earthwork practitioners in simulations and can improve earthwork quality and productivity.

On the other hand, the second scenario uses the digital twin models for earthwork progress monitoring. The trench excavation site was used as a demonstration example. The as-planned and as-built models were regularly integrated and updated during construction, facilitating operational-level monitoring and bridging the gap between physical and digital construction sites. The case study demonstrates that the proposed system is capable of integrating multi-source earthwork data, developing 3D as-planned and as-built models, and updating the model regularly, providing progress visualization and quantification information. Additionally, material information and quality assessment can be achieved from the developed digital twin. The accuracy of the system is greatly affected by the voxel size. However, the proposed system provides a flexible algorithm, and the size of the voxel can be selected according to the allowable error. For instance, the proposed system achieved a progress quantification error below 3% using 60% point cloud subsampling, demonstrating its robustness under reduced data density. Similarly, the size also affects the time required for voxel creation when new data arrives during construction and for model updating. The proposed system generates and integrates updated voxels in less than 60 s, confirming the system’s suitability for near real-time operational use.

6. Limitations and Future Work

Despite the benefits, the proposed system has several limitations. The accuracy and reliability of the digital twin depend on the accuracy of the input data. For instance, in our proposed system, the geotechnical information is collected from boreholes, and construction data is obtained through scans using laser scanners and photogrammetry. However, sparsely distributed boreholes can introduce uncertainties that lead to inaccuracies in voxel semantics or missed subsurface features. High-fidelity voxel modeling requires dense and well-structured data, which may not be available or affordable in real-world projects. Data-driven techniques can be integrated in the future to enhance the accuracy of the proposed system [43]. Additionally, scans acquired through laser scanners and UAVs are sometimes incomplete due to limited data capture or occlusions, or they carry uncertainties introduced by the subsampling used to reduce the computational load. These issues can further affect the quality of the voxel-based construction models. Hybrid data collection methods, such as cameras and laser scanning from multiple locations, will advance operational data collection. Additionally, sensor technologies integrated with construction equipment can significantly enhance the equipment’s behavior. Integrating the method with interface technologies such as machine guidance, augmented reality, and virtual reality can advance the system by providing a more granular and detailed work environment using voxels, helping stakeholders with rich information, facilitating decision-making, and leading to improved productivity, work quality, and safety.

In our proposed system, the voxel size can be changed according to the site conditions to achieve the required accuracy (for example, smaller voxels can be used to represent inhomogeneous ground conditions or detailed features of the design models); however, this increases the number of data points, leading to significant computational and storage demands. This creates a trade-off between accuracy and efficiency, which may limit the scalability of the framework, especially for large-scale earthwork projects, unless optimized voxelization and storage techniques are applied. Efficient voxel representations and cloud-based storage could be explored to enhance scalability. Moreover, the model–equipment interaction is driven by geometric and rule-based methods rather than by simulation-based methods such as FEM or DEM [39]. Future integration with physics-realistic simulation engines could enhance predictive capabilities. Additionally, the system was implemented in Unity 3D, where physics properties such as colliders were assigned; this approach has system limitations [51,52].

The current proposed system provides efficient visualization in an integrated environment and updates the system when new data is available. However, in the current state, it serves as a monitoring, visualization, and analysis tool, providing design and construction information at any stage. Advanced analysis, such as predictive or prescriptive analysis, is not supported, which can significantly improve decision support and achieve a more proactive system [53]. The proposed framework serves as a foundation for the advancement in the field of automated earthwork construction monitoring techniques and model–equipment interaction. These applications are not limited to surface excavation, but can also greatly support deep excavation and underground excavation, such as drill-and-blast construction [54]. Currently, the system is applied and validated through two common and representative cases, namely, building pad and trench excavation. Additional validation is needed for more complex and specialized earthwork construction. Each scenario presents specific and unique challenges that need to be explored, such as voxel resolution, data integration, and visualization strategies.

Author Contributions

Conceptualization, M.S.K. and J.S.; methodology, M.S.K., H.S.C., and J.S.; software, M.S.K.; validation, M.S.K., H.S.C., and J.S.; formal analysis, M.S.K.; investigation, M.S.K., H.S.C., and J.S.; resources, J.S.; data curation, M.S.K., H.S.C., and J.S.; writing—original draft preparation, M.S.K.; writing—review and editing, M.S.K., H.S.C., and J.S.; visualization, M.S.K.; supervision, J.S.; project administration, J.S.; funding acquisition, J.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Research Foundation of Korea (NRF) grant funded by the Korean Government (MEST), grant number RS-2024-00356995, and the Korea Agency for Infrastructure Technology Advancement (KAIA) grant funded by the Ministry of Land, Infrastructure, and Transport, grant number RS-2020-KA157089.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All the 3D models used in this article can be provided upon request.

Acknowledgments

During the preparation of this manuscript/study, the authors used ChatGPT 4o for the purposes of English improvement. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

TIN	Triangular irregular network
TEN	Tetrahedral network
3D	Three-dimensional
FEM	Finite element method
DEM	Discrete element method
MPM	Material point method
LIDAR	Light detection and ranging
IoT	Internet of Things
BIM	Building information modeling
UTM	Universal Transverse Mercator
CAD	Computer-aided design
OSM	OpenStreetMap
UAV	Unmanned aerial vehicle
AMG	Automated machine guidance
VR	Virtual reality
AR	Augmented reality
GIS	Geographic information system
AABB	Axis-aligned bounding box
NGII	National Geographic Information Institute
B-rep	Boundary representation
LOD	Level of detail
GUI	Graphical user interface
TS	Total station
GPS	Global positioning system
RTK	Real-time kinematic
HMI	Human–machine interface
DT	Digital twin
DXF	Drawing exchange format
Obj	Object
FBX	Filmbox
Dwg	Drawing

References

Seo, J.; Haas, C.T.; Saidi, K.; Sreenivasan, S.V. Graphical Control Interface for Construction and Maintenance Equipment. J. Constr. Eng. Manag. 2000, 126, 210–218.
Reja, V.K.; Varghese, K.; Ha, Q.P. Computer vision-based construction progress monitoring. Autom. Constr. 2022, 138, 104245.
Honghong, S.; Gang, Y.; Haijiang, L.; Tian, Z.; Annan, J. Digital twin enhanced BIM to shape full life cycle digital transformation for bridge engineering. Autom. Constr. 2023, 147, 104736.
Wu, H.; Zhu, Q.; Guo, Y.; Zheng, W.; Zhang, L.; Wang, Q.; Zhou, R.; Ding, Y.; Wang, W.; Pirasteh, S.; et al. Multi-level voxel representations for digital twin models of tunnel geological environment. Int. J. Appl. Earth Obs. Geoinf. 2022, 112, 102887.
Lyu, B.; Wang, Y. Immersive visualization of 3D subsurface ground model developed from sparse boreholes using virtual reality (VR). Undergr. Space 2024, 17, 188–206.
Pal, A.; Lin, J.J.; Hsieh, S.-H.; Golparvar-Fard, M. Automated vision-based construction progress monitoring in built environment through digital twin. Dev. Built Environ. 2023, 16, 100247.
Li, T.; Li, X.; Rui, Y.; Ling, J.; Zhao, S.; Zhu, H. Digital twin for intelligent tunnel construction. Autom. Constr. 2024, 158, 105210.
Lehtola, V.V.; Koeva, M.; Elberink, S.O.; Raposo, P.; Virtanen, J.-P.; Vahdatikhaki, F.; Borsci, S. Digital twin of a city: Review of technology serving city needs. Int. J. Appl. Earth Obs. Geoinf. 2022, 114, 102915.
Broo, D.G.; Bravo-Haro, M.; Schooling, J. Design and implementation of a smart infrastructure digital twin. Autom. Constr. 2022, 136, 104171.
Liu, Y.; Feng, J.; Lu, J.; Zhou, S. A review of digital twin capabilities, technologies, and applications based on the maturity model. Adv. Eng. Inform. 2024, 62, 102592.
Madubuike, O.C.; Anumba, C.J.; Khallaf, R. A review of digital twin applications in construction. J. Inf. Technol. Constr. 2022, 27, 145–172.
Grieves, M.; Vickers, J. Digital Twin: Mitigating Unpredictable, Undesirable Emergent Behavior in Complex Systems. In Transdisciplinary Perspectives on Complex Systems; Kahlen, F.-J., Flumerfelt, S., Alves, A., Eds.; Springer International Publishing: Cham, Switzerland, 2017; pp. 85–113.
Tao, F.; Zhang, H.; Liu, A.; Nee, A.Y.C. Digital Twin in Industry: State-of-the-Art. IEEE Trans. Ind. Inform. 2018, 15, 2405–2415.
Rogage, K.; Mahamedi, E.; Brilakis, I.; Kassem, M. Beyond digital shadows: Digital Twin used for monitoring earthwork operation in large infrastructure projects. AI Civ. Eng. 2022, 1, 7.
Lo, Y.; Zhang, C.; Ye, Z.; Cui, C. Monitoring road base course construction progress by photogrammetry-based 3D reconstruction. Int. J. Constr. Manag. 2022, 23, 2087–2101.
Jiang, Y.; Li, M.; Li, M.; Liu, X.; Zhong, R.Y.; Pan, W.; Huang, G.Q. Digital twin-enabled real-time synchronization for planning, scheduling, and execution in precast on-site assembly. Autom. Constr. 2022, 141, 104397.
You, K.; Zhou, C.; Ding, L.; Chen, W.; Zhang, R.; Xu, J.; Wu, Z.; Huang, C. Earthwork digital twin for teleoperation of an automated bulldozer in edge dumping. J. Field Robot. 2023, 40, 1945–1963.
Babanagar, N.; Sheil, B.; Ninić, J.; Zhang, Q.; Hardy, S. Digital twins for urban underground space. Tunn. Undergr. Space Technol. 2025, 155, 106140.
Lee, S.B.; Song, M.; Kim, S.; Won, J.-H. Change monitoring at expressway infrastructure construction sites using drone. Sensors Mater. 2020, 32, 3923–3933.
Borngrund, C.; Sandin, F.; Bodin, U. Deep-learning-based vision for earth-moving automation. Autom. Constr. 2022, 133, 104013.
Davletshina, D.; Reja, V.K.; Brilakis, I. Automating maintenance of road Geometric Digital Twins through single scan instance aware point cloud change retrieval. Adv. Eng. Inform. 2025, 67, 103476.
Fang, W.; Chen, W.; Love, P.E.D.; Luo, H.; Zhu, H.; Liu, J. A status digital twin approach for physically monitoring over-and-under excavation in large tunnels. Adv. Eng. Inform. 2024, 62, 102648.
Ninić, J.; Koch, C.; Stascheit, J. An integrated platform for design and numerical analysis of shield tunnelling processes on different levels of detail. Adv. Eng. Softw. 2017, 112, 165–179.
Khan, M.S.; Kim, I.S.; Seo, J. A boundary and voxel-based 3D geological data management system leveraging BIM and GIS. Int. J. Appl. Earth Obs. Geoinf. 2023, 118, 103277.
Khan, M.S.; Kim, J.; Park, S.; Lee, S.; Seo, J. Methodology for Voxel-Based Earthwork Modeling. J. Constr. Eng. Manag. 2021, 147, 04021111.
Xu, Y.; Tong, X.; Stilla, U. Voxel-based representation of 3D point clouds: Methods, applications, and its potential use in the construction industry. Autom. Constr. 2021, 126, 103675.
Khan, M.S.; Park, J.; Seo, J. Geotechnical Property Modeling and Construction Safety Zoning Based on GIS and BIM Integration. Appl. Sci. 2021, 11, 4004.
Hegemann, F.; Manickam, P.; Lehner, K.; Koch, C.; König, M. Hybrid ground data model for interacting simulations in mechanized tunneling. J. Comput. Civ. Eng. 2013, 27, 708–718.
Koch, C.; Vonthron, A.; König, M. A tunnel information modelling framework to support management, simulations and visualisations in mechanised tunnelling projects. Autom. Constr. 2017, 83, 78–90.
Ninić, J.; Bui, H.G.; Koch, C.; Meschke, G. Computationally Efficient Simulation in Urban Mechanized Tunneling Based on Multilevel BIM Models. J. Comput. Civ. Eng. 2019, 33, 04019007.
Lee, S.S.; Kim, K.T.; Tanoli, W.A.; Seo, J.W. Flexible 3D Model Partitioning System for nD-Based BIM Implementation of Alignment-Based Civil Infrastructure. J. Manag. Eng. 2020, 36, 04019037.
Siebert, S.; Teizer, J. Mobile 3D mapping for surveying earthwork projects using an Unmanned Aerial Vehicle (UAV) system. Autom. Constr. 2014, 41, 1–14.
Gong, H.; Su, D.; Zeng, S.; Chen, X. Advancements in digital twin modeling for underground spaces and lightweight geometric modeling technologies. Autom. Constr. 2024, 165, 105578.
Kim, B.; Kim, C.; Kim, H. Interactive Modeler for Construction Equipment Operation Using Augmented Reality. J. Comput. Civ. Eng. 2012, 26, 331–341. [Google Scholar] [CrossRef]
Saunier, L.; Hoffmann, N.; Preda, M.; Fetita, C. Virtual Reality Interface Evaluation for Earthwork Teleoperation. Electronics 2023, 12, 4151. [Google Scholar] [CrossRef]
Morosi, F.; Caruso, G. Configuring a VR simulator for the evaluation of advanced human–machine interfaces for hydraulic excavators. Virtual Real. 2022, 26, 801–816. [Google Scholar] [CrossRef]
Vahdatikhaki, F.; El Ammari, K.; Langroodi, A.K.; Miller, S.; Hammad, A.; Doree, A. Beyond data visualization: A context-realistic construction equipment training simulators. Autom. Constr. 2019, 106, 102853. [Google Scholar] [CrossRef]
Xu, H.; Ohkita, J.; Tamai, Y.; Benten, H.; Ito, S. A 3D Physics-Based Hydraulic Excavator Simulator. Adv. Mater. 2015, 27, 5868–5874. [Google Scholar] [CrossRef] [PubMed]
Liu, B.; Gan, J.; Xu, J.; Ellis, D.; Zou, R.; Yu, A.; Zhou, Z. Numerical simulation of operations of hydraulic excavators for polydisperse bulk materials and different configurated buckets. Autom. Constr. 2024, 157, 105154. [Google Scholar] [CrossRef]
Dopico, D.; Luaces, A.; González, M. A soil model for a hydraulic simulator excavator based on real-time multibody dynamics. In Proceedings of the 5th Asian Conference on Multibody Dynamics 2010, Kyoto, Japan, 23–26 August 2010; pp. 325–333. [Google Scholar]
Hammad, A.; Vahdatikhaki, F.; El-Ammari, K. Conceptual Framework of Training Simulator for Heavy Construction Equipment Integrating Sensory Data, Actual Spatial Model, and Multi-Agent System. In Proceedings of the 16th International Conference on Computing in Civil and Building Engineering, Osaka, Japan, 6–8 July 2016; pp. 1623–1630. Available online: http://www.see.eng.osaka-u.ac.jp/seeit/icccbe2016/Proceedings/Full_Papers/205-149.pdf (accessed on 4 June 2025).
Chae, M.J.; Lee, G.W.; Kim, J.Y.; Park, J.W.; Cho, M.Y. A 3D surface modeling system for intelligent excavation system. Autom. Constr. 2011, 20, 808–817.
Shi, C.; Wang, Y. Data-driven construction of Three-dimensional subsurface geological models from limited Site-specific boreholes and prior geological knowledge for underground digital twin. Tunn. Undergr. Space Technol. 2022, 126, 104493.
Golparvar-Fard, M.; Peña-Mora, F.; Arboleda, C.A.; Lee, S. Visualization of Construction Progress Monitoring with 4D Simulation Model Overlaid on Time-Lapsed Photographs. J. Comput. Civ. Eng. 2009, 23, 391–404.
Golparvar-Fard, M.; Peña-Mora, F.; Savarese, S. Automated Progress Monitoring Using Unordered Daily Construction Photographs and IFC-Based Building Information Models. J. Comput. Civ. Eng. 2015, 29, 04014025.
Mani, G.F.; Feniosky, P.M.; Savarese, S. D4AR-A 4-dimensional augmented reality model for automating construction progress monitoring data collection, processing and communication. Electron. J. Inf. Technol. Constr. 2009, 14, 129–153.
Tao, B.; Bosché, F.; Li, J. Mixed Reality-based MEP construction progress monitoring: Evaluation of methods for mesh-to-mesh comparison. Autom. Constr. 2024, 168, 105852.
Sharafat, A.; Khan, M.S.; Latif, K.; Tanoli, W.A.; Park, W.; Seo, J. BIM-GIS-Based Integrated Framework for Underground Utility Management System for Earthwork Operations. Appl. Sci. 2021, 11, 5721.
Mahmood, B.; Han, S.U.; Lee, D.E. BIM-based registration and localization of 3D point clouds of indoor scenes using geometric features for augmented reality. Remote Sens. 2020, 12, 2302.
Ekanayake, B.; Wong, J.K.W.; Fini, A.A.F.; Smith, P. Computer vision-based interior construction progress monitoring: A literature review and future research directions. Autom. Constr. 2021, 127, 103705.
Rehman, S.U.; Kim, I.; Hwang, K.-E. Advancing BIM and game engine integration in the AEC industry: Innovations, challenges, and future directions. J. Comput. Des. Eng. 2025, 12, 26–54.
Khan, M.S.; Park, S.; Seo, J. Earthwork Model Visualization in Game Engine: From Civil 3D to Unity 3D. 2024, pp. 19–20. Available online: https://www.dbpia.co.kr/journal/articleDetail?nodeId=NODE12088515 (accessed on 4 June 2025).
Cotoarbă, D.; Straub, D.; Smith, I.F. Probabilistic digital twins for geotechnical design and construction. Data Centric Eng. 2025, 6, e30.
Sharafat, A.; Khan, M.S.; Latif, K.; Seo, J. BIM-Based Tunnel Information Modeling Framework for Visualization, Management, and Simulation of Drill-and-Blast Tunneling Projects. J. Comput. Civ. Eng. 2021, 35, 04020068.

Figure 1. Properties of the voxel-based representation and comparison with other geometric models.

Figure 2. Proposed voxel-based digital twin framework.

Figure 3. Digital twin development process for earthwork construction.

Figure 4. The process of creating and integrating surface and geotechnical models.

Figure 5. Voxelization of an excavation site.

Figure 6. A 3D model of the equipment with a hierarchical structure.

Figure 7. Demonstration steps of the proposed system.

Figure 8. Process of creating voxel-based as-planned and as-built models.

Figure 9. Ground, target, and voxel-based as-planned model.

Figure 10. Integrated models in Unity and a user interface for user interaction with the system.

Figure 11. Excavation cycle activities.

Figure 12. Comparison results between virtual and actual earthwork entities. (a) Volume comparison, and (b) activities comparison.

Figure 13. (a) Ground model, (b) target model, (c) voxelized model, (d) voxel-based planning for earthwork construction, (e) model update at Time-1, and (f) model update at Time-n (adopted from [25]).

Figure 14. Subsampling of point cloud data and the effect of voxel size on model accuracy.

Figure 15. Performance of model creation and model updating.

Table 1. Defined Unity physics components for models and types of equipment.

Model	Unity Physics Components	Remarks
Ground	Mesh Collider	This collider enables realistic behavior of the ground, so an excavator or dump truck can move over the ground.
Target	Mesh Collider	It enables users to interact with equipment on the target surface and collide with the as-planned voxels so as not to pass through them.
As-planned voxels	Box Collider, Rigidbody, script	Voxels can interact with the ground, target, and dump site models using a Box Collider. Also, voxels can be detected by equipment through a collider; they can be monitored and shown to the users on the GUI. The Rigidbody component ensures that the voxels respond realistically to the forces applied during manipulation.
Excavator body	Rigidbody	The excavator behaves as expected under gravity, with consideration for its mass, resulting in natural-looking movements as it navigates the construction site. It also conserves excavator momentum.
Excavator bucket	Mesh Collider, script	The Mesh Collider on the excavator bucket ensures precise interactions with individual voxels. As the bucket scoops, lifts, and deposits voxels, the detailed Mesh Collider allows for accurate collision detection, contributing to a convincing excavation experience.
Dumper body	Mesh Collider, Rigidbody	As the dumper moves across the ground, the collider ensures that collisions are detected appropriately, allowing the dumper to respond to changes in the landscape. The Rigidbody component allows both kinematic and dynamic interactions. While the dumper is driven by player input, it can also be affected by dynamic forces, creating a flexible and realistic simulation of a moving vehicle.
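
The collider roles listed in Table 1 can be illustrated with a minimal, engine-independent sketch. It is not the authors' Unity implementation and does not use any Unity API: it approximates the bucket by an axis-aligned box (the table specifies a Mesh Collider) and tests which voxel boxes it overlaps. The names `Box`, `overlaps`, `voxel_box`, and `voxels_touched`, the 0.5 m voxel size, and the sample coordinates are all hypothetical.

```python
from dataclasses import dataclass

Vec3 = tuple[float, float, float]
Index = tuple[int, int, int]


@dataclass(frozen=True)
class Box:
    """Axis-aligned box described by its minimum and maximum corners (metres)."""
    lo: Vec3
    hi: Vec3


def overlaps(a: Box, b: Box) -> bool:
    """True when the boxes share volume, i.e. their extents overlap on every axis."""
    return all(a.lo[i] < b.hi[i] and b.lo[i] < a.hi[i] for i in range(3))


def voxel_box(index: Index, size: float) -> Box:
    """Box occupied by the voxel at integer grid position `index`."""
    lo = tuple(i * size for i in index)
    hi = tuple((i + 1) * size for i in index)
    return Box(lo, hi)


def voxels_touched(bucket: Box, indices: list[Index], size: float) -> list[Index]:
    """Grid indices of the voxels whose boxes overlap the bucket box."""
    return [idx for idx in indices if overlaps(bucket, voxel_box(idx, size))]


# Hypothetical row of three 0.5 m voxels and a bucket straddling the first two.
grid = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
bucket = Box((0.4, 0.1, 0.1), (0.6, 0.3, 0.3))
touched = voxels_touched(bucket, grid, size=0.5)
```

In a game engine the physics system performs this kind of test internally and reports contacts to scripts; the sketch only shows the geometric idea behind detecting which voxels a bucket reaches.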

Table 2. Keyboard button configurations for moving equipment and its parts.

Model/Asset	Action	Input Setting
Excavator	Arm forward, arm backward	K, J
Excavator	Boom forward, boom backward	M, N
Excavator	Bucket up, bucket down	O, P
Excavator	Swing frame left, swing frame right	Keypad 7, 8
Excavator	Move forward, move backward, brakes	W, S, Space
Excavator	Turn right, turn left	D, A
Dumper	Move forward	Arrow up
Dumper	Move backward	Arrow down
Dumper	Turn right	Arrow right
Dumper	Turn left	Arrow left
Dumper	Brake	Space
Dumper	Unload	Unload button

Table 3. Progress visualization and quantification using an integrated model.

Time	Model Quantification	Remarks
Time-0	Total volume = 65 m³; Quality = under/over excavation	Before construction
Time-1	Total volume = 65 m³; As-planned voxels = 65 m³; As-built = 0 m³; Tracked material = silty voxels	On schedule
Time-2	As-planned voxels = 51.55 m³; As-built voxels = 14.45 m³; Tracked material = silty voxels	On schedule
Time-3	As-planned voxels = 39.13 m³; As-built voxels = 12.87 m³; Tracked material = silty voxels	Behind schedule
Time-4	As-planned voxels = 51.55 m³; As-built voxels = 14.45 m³; Tracked material = silty voxels	Behind schedule
Time-5	As-planned voxels = 18.75 m³; As-built voxels = 21.38 m³; Tracked material = silty voxels	Behind schedule
Time-6	As-planned voxels = 0 m³; As-built voxels = 63.15 m³; Tracked material = silty voxels	Behind schedule
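
The quantities in Table 3 follow from counting voxels and multiplying by the volume of one voxel. The sketch below shows that arithmetic under stated assumptions. The function names, the 0.5 m voxel edge, and the voxel counts are hypothetical; they were chosen only so that the total equals the 65 m³ reported in the table and are not taken from the paper.

```python
def voxel_volume(voxel_size_m: float) -> float:
    """Volume of one cubic voxel in cubic metres."""
    return voxel_size_m ** 3


def progress_report(total_voxels: int, excavated_voxels: int, voxel_size_m: float) -> dict:
    """Summarise remaining (as-planned) and excavated (as-built) volume from voxel counts."""
    if not 0 <= excavated_voxels <= total_voxels:
        raise ValueError("excavated_voxels must lie between 0 and total_voxels")
    unit = voxel_volume(voxel_size_m)
    percent = 100.0 * excavated_voxels / total_voxels if total_voxels else 0.0
    return {
        "total_m3": total_voxels * unit,
        "as_planned_m3": (total_voxels - excavated_voxels) * unit,
        "as_built_m3": excavated_voxels * unit,
        "percent_complete": percent,
    }


# Hypothetical example: 520 voxels with a 0.5 m edge give a 65 m³ excavation.
report = progress_report(total_voxels=520, excavated_voxels=130, voxel_size_m=0.5)
```

Here `report["as_built_m3"]` is 16.25 and `report["as_planned_m3"]` is 48.75, so the two always sum to the total volume. A schedule status such as "On schedule" or "Behind schedule" would additionally need a planned volume per time step, which this sketch does not model.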

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
