Digital Twin-Enabled Framework for Intelligent Monitoring and Anomaly Detection in Multi-Zone Building Systems

Faeze Hodavand; Issa Ramaji; Naimeh Sadeghi; Sarmad Zandi Goharrizi

doi:10.3390/buildings15224030

¹ School of Engineering, Construction, and Management, Khajeh Nasir Toosi University of Technology, Tehran 1996715433, Iran

² School of Engineering, Construction, and Computing, Roger Williams University, SE1117, One Old Ferry Road, Bristol, RI 02809, USA

³ School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran 1439957131, Iran

* Author to whom correspondence should be addressed.

Buildings 2025, 15(22), 4030; https://doi.org/10.3390/buildings15224030

This article belongs to the Special Issue Digital Technologies in Buildings and Critical Infrastructure: Transforming Design, Construction, and Operations

Abstract

The growing complexity of modern building systems requires advanced monitoring frameworks to improve fault detection, energy efficiency, and operational resilience. Digital Twin (DT) technology, which integrates real-time data with virtual models of physical systems, has emerged as a promising enabler for predictive diagnostics. Despite growing interest, key challenges remain, including the neglect of short- and long-term forecasting across different scenarios, insufficiently robust data preparation, and the scarcity of model validation on multi-zone buildings over extended test periods. To address these gaps, this study presents a comprehensive DT-enabled framework for predictive monitoring and anomaly detection, validated in a multi-zone educational building in Rhode Island, USA, using a full year of operational data. The proposed framework integrates a robust data processing pipeline and a comparative analysis of machine learning models, including LSTM, RNN, GRU, ANN, XGBoost, and RF, to forecast short-term (1 h) and long-term (24 h) indoor temperature variations. The LSTM model consistently outperformed the other methods, achieving R² > 0.98 and RMSE < 0.55 °C for all tested rooms. For real-time anomaly detection, we applied a hybrid LSTM–Interquartile Range (IQR) method to one-step-ahead residuals, which successfully identified anomalous deviations from expected patterns. The model’s predictions remained within a ±1 °C error margin for over 90% of the test data, providing reliable forecasting up to 16 h ahead. This study contributes a validated, generalizable DT methodology that addresses key research gaps, offering practical tools for predictive maintenance and operational optimization in complex building environments.

Keywords: digital twin technology; anomaly detection; predictive maintenance; machine learning; smart building management; deep learning

1. Introduction

Efficient monitoring and timely anomaly detection are essential for optimizing building performance, minimizing energy consumption, and ensuring occupant comfort. As urban populations grow and people spend more time indoors [,], the demand for sustainable building systems has increased significantly, resulting in a substantial rise in energy consumption. In developed regions, such as Europe, buildings account for approximately 40% of total energy consumption and 28% of carbon dioxide emissions, a trend also observed in other industrialized nations [,].

Despite the widespread adoption of Building Management Systems (BMS), these platforms often fail to detect subtle or interacting faults in real time or to adapt to dynamic building conditions. For example, a gradual decline in HVAC efficiency, combined with changing occupancy patterns, may go unnoticed by rule-based alarms, resulting in energy waste and reduced occupant comfort [,]. Continuous monitoring and early anomaly detection are therefore essential for proactive maintenance and efficient operation.

Recent advancements in the Internet of Things (IoT) have enabled the collection of high-resolution data on key building parameters. However, converting this raw data into actionable insights remains challenging, especially in multi-zone, high-occupancy buildings with complex operations.

Digital twin technology addresses this challenge by creating dynamic virtual replicas of physical systems, supporting real-time monitoring, predictive anomaly detection, and data-driven decision-making. Unlike static Building Information Modeling (BIM) [,,], digital twins combine real-time sensor data with machine learning algorithms to simulate building behavior and detect deviations from expected performance. By providing early warnings of potential faults, digital twins facilitate efficient maintenance, reduce energy waste, and enhance occupant comfort and safety [,].

This paper presents a digital twin-enabled framework to enhance anomaly detection in building systems. Unlike single-zone studies that model the thermal dynamics of a single enclosed space, this research addresses a multi-zone educational building, where each classroom represents a thermally distinct zone with unique occupancy and environmental conditions. Modeling multiple zones simultaneously captures inter-zone variations caused by factors such as floor level, orientation, solar exposure, and occupancy schedules. This multi-zone configuration is crucial for evaluating the scalability and generalizability of digital-twin frameworks to real buildings that consist of many interacting spaces.

By integrating IoT data, predictive models, and real-time feedback, the framework aims to:

(1) Design and implement a robust data pipeline to manage real-world sensor imperfections.
(2) Perform a comparative analysis of shallow and deep learning models for short-term (real-time) and long-term (proactive) forecasting.
(3) Validate the framework through a year-long deployment in an operational educational building, demonstrating its feasibility and adaptability in a multi-zone environment with varied occupancy and usage patterns.

2. Literature Review

The concept of digital twins originated during NASA’s Apollo program in the 1960s [], though the term was formally coined by Grieves in 2002 at the University of Michigan []. Grieves and Vickers [] later refined it as a virtual representation of physical products across scales, integrating real-time physical data. Building on this, Van der Hohen et al. [] reviewed 46 definitions and described DTs as dynamic virtual models synchronized with their physical counterparts through continuous data exchange. This evolution transformed DTs from conceptual tools into powerful, data-driven platforms for monitoring, prediction, and decision support across asset lifecycles [,].

2.1. Digital Twin in the AEC Sector

In recent decades, DT adoption has expanded beyond manufacturing into the Architecture, Engineering, and Construction (AEC) sector []. Within this domain, DTs have demonstrated potential to improve building design [], construction processes [], and operational management []. Benefits include process optimization, reduced rework costs, and the reuse of performance data in future projects []. Unlike Building Information Modeling (BIM), which provides static models, DTs integrate real-time IoT sensor data [,], enabling dynamic management of building systems []. Applications span energy optimization [,,], predictive maintenance [], and thermal comfort monitoring [,].

For example, Arowoiya et al. [] reviewed DT applications and highlighted improvements in both thermal comfort and energy efficiency. Augustinelli et al. [] demonstrated how IoT and artificial intelligence (AI) integration within DTs enabled residential buildings to achieve near-zero-energy standards. Similarly, Tahmasbnia et al. [] focused on DTs in building energy management, emphasizing the importance of advanced machine learning techniques for enhancing prediction accuracy and system performance.

Despite such progress, most studies emphasize isolated use cases or specific lifecycle phases, leaving complex, multi-zone operational environments—such as educational buildings with varying occupancy patterns—relatively unexplored.

2.2. Predictive Modeling in Buildings

Accurate prediction of building parameters (e.g., indoor temperature, air quality, energy consumption) is crucial for proactive management. Early approaches in the 2000s relied on statistical models such as autoregressive integrated moving average (ARIMA) models [] and Holt-Winters exponential smoothing [], valued for simplicity but limited in addressing nonlinearities and multivariate inputs. With the rise of shallow machine learning techniques—including Random Forests [], Support Vector Machines (SVM) [], Multi-Layer Perceptrons (MLP), and Extreme Gradient Boosting (XGBoost) []—researchers began leveraging richer datasets and external variables to improve predictive accuracy [,].

These methods, however, show context-dependent performance. For instance, Ramadan et al. [] found ANN and Extra Trees models outperforming SVMs and thermal gray-box models in controlled laboratory temperature prediction but observed significant performance degradation in real-world scenarios. Boesgaard et al. [] used XGBoost to predict relative humidity in cultural heritage buildings, achieving high accuracy in stable storage environments but reduced reliability in dynamic spaces such as churches.

Recent efforts have attempted to address these weaknesses through ensemble methods [] and feature selection techniques tailored to complex building datasets [,]. Nonetheless, the limitations of shallow models [] reinforce the need for more advanced time-series approaches [,], including Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM) networks, and Gated Recurrent Units (GRUs).

2.3. Advanced Deep Learning Approaches

Deep learning models have shown consistent improvements in predictive performance. Cui et al. [] employed a hybrid LSTM approach for multi-zone temperature forecasting, outperforming gray-box models. In another study, Xing et al. [] applied GRUs for short-term HVAC predictions, leveraging fine-grained occupancy and environmental data. Jiang et al. [] advanced this by developing a hybrid LSTM-GRU model with encoder–decoder and attention mechanisms, achieving a correlation coefficient of 0.9 for 90 min forecasts, though stability declined for one-step-ahead predictions.

Other hybrid approaches, such as the CNN-LSTM model by Elmaz et al. [], integrated spatial and temporal features and achieved a correlation coefficient of 0.9 for 120 min forecasts. Park et al. [] showed that Multilayer Perceptron (MLP) models can predict short-term indoor air temperature, achieving an RMSE of 0.77 °C at 3600 s intervals, although performance declined over longer periods. For multi-zone scenarios, Fang et al. [] applied a sequence-to-sequence LSTM model across eight zones in an educational building, yielding an RMSE of 0.45 °C but still struggling with long-term stability.

Similarly, Norouzi et al. [] compared multiple algorithms, finding deep networks achieved superior performance (RMSE = 0.16 °C) in educational building contexts. Yu et al. [] developed a deep ensemble framework using data from 96 sensors across 25 buildings, outperforming RF, SVM, and LSTM models. Despite these advances, challenges remain around validation in diverse environments, data quality issues (noise, incompleteness), and generalizability.

2.4. Anomaly Detection in Buildings

Machine learning has also become central to anomaly detection in building operations. Borda et al. [] applied supervised and semi-supervised models to detect HVAC faults, optimizing energy use and occupant comfort. Hodavand et al. [] emphasized that integrating Digital Twin (DT) technology with data-driven methods—such as supervised classification, regression, and unsupervised learning—provides an effective approach to smart facility management, facilitating real-time data analysis, enhancing occupant comfort, and supporting sustainable operations.

Zhang et al. [] introduced a symbolic AI (SAX)-based DT framework for identifying fault-relevant sensor data, reducing computational load by tagging real-time data streams with ontology-based fault labels. Cicero et al. [] proposed a lightweight deep learning method (Sparse U-Net Autoencoder) for IoT-based anomaly detection, targeting events such as fires, gas leaks, and intrusions.

Cybersecurity in IoT-enabled environments is another emerging focus. Mirdula and Roopa [] developed a deep learning-based framework integrating user behavior analytics with Manufacturer Usage Description (MUD) profiles to detect intrusions at the network level in smart buildings. Song et al. [] developed an unsupervised ensemble approach that integrates Local Outlier Factor, Deep Isolation Forest, and Anomaly Transformer, achieving high accuracy in detecting energy anomalies in industrial buildings.

Shahid et al. [] employed transfer learning with LSTMs for real-time school energy consumption anomaly detection, reducing false positives with limited data. Abdollah et al. [] introduced a self-supervised Transformer for HVAC fault detection, reconstructing masked inputs and applying dynamic thresholding to detect point and sequential faults in unlabeled datasets.

2.5. IoT, BIM, and Big Data in DTs

IoT integration plays a pivotal role in DT implementation by enabling real-time sensing. Floris et al. [] presented an IoT-based DT framework for environmental monitoring, achieving 93% accuracy in energy savings and comfort optimization. Eneyew et al. [] demonstrated a BIM–IoT integrated DT framework enabling real-time synchronization between physical and virtual components. Al-Rowady et al. [] combined IoT and BIM to develop a five-step platform for thermal comfort management, achieving correlation coefficients above 0.9 using Prophet and Neural Prophet algorithms.

Big data analytics further enhances DTs by supporting large-scale data processing and predictive accuracy. Ratur et al. [] emphasized the role of big data and artificial intelligence in enriching digital twin systems, enabling precise performance monitoring and predictive maintenance in buildings. Zhang et al. [] highlighted that high-quality data extraction is crucial for reliable digital twin models, ensuring precise predictions and effective decision-making in smart building management. However, despite these advancements, the practical implementation of digital twins remains resource-intensive, requiring substantial computational power, robust data infrastructure, and interdisciplinary expertise.

According to the literature, several research gaps remain unaddressed in the application of digital twin technology to building systems. Existing studies primarily focus on either short-term or long-term predictions, often neglecting the trade-offs between these two approaches. Additionally, educational buildings receive less attention despite their unique occupancy patterns and energy demands. Accurate data preparation is crucial but often overlooked, while IoT sensor challenges, such as noise and missing values, remain unresolved.

Moreover, many proposed frameworks lack rigorous validation using extensive real-world datasets, which limits their generalizability and practical applicability in diverse contexts. These gaps underscore the need for customized digital twin frameworks that can effectively address the complexities of real-world building systems.

This study introduces a digital twin-enabled framework designed for real-time monitoring and anomaly detection. Leveraging advanced machine learning models, it predicts indoor conditions over both short and long-term periods. A comprehensive data preparation process addresses IoT sensor challenges, ensuring the reliability of data. One year of real-world building data was employed for validation, allowing the framework to be tested across seasonal cycles and operational changes. By addressing these critical gaps, this study aims to promote the broader adoption of digital twin technology in the built environment, enhancing monitoring, predictive capabilities, and anomaly detection.

3. Methodology

This research develops a digital twin framework to enable real-time monitoring and anomaly detection in building systems, using an educational building as a case study. This section outlines the systematic methodology for the framework's design and implementation, including the digital twin architecture, data acquisition and processing, predictive model development, and the strategies for anomaly detection and evaluation.

3.1. Digital Twin Architecture

The proposed framework is organized into four layers: Physical Building and IoT Edge, Digital Twin Core and Cloud Infrastructure, Analytics and Intelligent Layer, and User Interface, as shown in Figure 1. Within these layers, four core components—IoT Sensor Network, Data Preprocessing Module, Cloud-Based Infrastructure, and Predictive Models—work together to support real-time monitoring, predictive analytics, and anomaly detection in building operations.

Figure 1. Digital Twin Architecture.

At the Physical Building and IoT Edge layer, the IoT Sensor Network comprises a system of strategically deployed sensors that capture environmental and operational parameters. Data is collected continuously and passed through gateways for local cleaning and contextualization before being transferred to the next layer. The Digital Twin Core and Cloud Infrastructure host the Data Preprocessing Module, which is responsible for validation, data cleaning, feature engineering, and feature selection. This module ensures that only reliable, high-quality data are stored and managed within the scalable cloud-based infrastructure, which also integrates with BIM to provide spatial context.

Within the Analytics and Intelligent Layer, Predictive Models—trained on historical data and updated in real-time—perform both short-term and long-term forecasting while supporting anomaly detection. Finally, the User Interface delivers results through an interactive, web-based dashboard that integrates real-time indoor building conditions with predictive insights, allowing stakeholders to access, interpret, and act upon system performance data effectively.

3.2. Data Acquisition and Case Study Description

3.2.1. Case Study Overview

To validate the framework, an educational building (a lower school in Rhode Island, USA) was selected as a case study. The dataset spans three years, from January 2021 to January 2024, capturing seasonal variations and long-term patterns, and draws on two data sources. The first data source consists of IoT sensors developed by dataArrows, installed throughout the building to provide real-time information on the indoor environment.

The sensors measure variables such as temperature, humidity, carbon dioxide, and particulate matter (PM10 and PM2.5) across classrooms, libraries, hallways, and other spaces. In this study, two classrooms—Room 128 on the first floor and Room 201 on the second floor—were selected due to their diverse occupancy patterns and environmental characteristics, as illustrated in Figure 2. The second data source consists of outdoor weather information. This includes 20 variables, such as air temperature, relative humidity, dew point, apparent air temperature, precipitation, rain rate, snowfall rate, sea level and ground air pressure, cloud cover, wind speed at different altitudes, wind direction, and sunlight intensity (day or night).

Figure 2. Location of classrooms 128 and 201 on the right and left side of the first and second floors of the educational building, respectively.

3.2.2. Data Collection and Initial Preprocessing

Maintaining time-ordered and consistent time-series data is crucial for reliability in predictive modeling. IoT sensors across different building zones continuously transmit readings to cloud storage, where initial preprocessing enforces proper temporal ordering and uniform intervals. The data is then stored in an SQL database. The database organizes readings by building zone, enabling efficient queries and comprehensive analysis.

During implementation, the high dimensionality of the features and the substantial data volume—approximately 3,207,919 records collected at 15 min intervals over a three-year period—presented considerable computational challenges. Furthermore, a key constraint was the availability of external weather data, which was limited to hourly intervals. To address this, the internal building data was resampled to a matching hourly frequency, ensuring temporal consistency across all data sources.
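The hourly alignment described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the source does not state how the 15 min readings were aggregated, so averaging within each clock hour is an assumption, and `resample_hourly` and the sample readings are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def resample_hourly(readings):
    """Average (timestamp, value) readings into hourly bins.

    Each reading is assigned to the clock hour in which it was recorded
    (timestamp floored to HH:00); values inside a bin are averaged.
    Mean aggregation is an assumption, not stated in the paper.
    """
    bins = defaultdict(list)
    for ts, value in readings:
        hour_start = ts.replace(minute=0, second=0, microsecond=0)
        bins[hour_start].append(value)
    # Return the bins in chronological order.
    return [(hour, mean(values)) for hour, values in sorted(bins.items())]


# Hypothetical 15 min temperature readings (degrees Celsius) for one room.
start = datetime(2021, 1, 15, 8, 0)
readings = [(start + timedelta(minutes=15 * i), 20.0 + 0.1 * i) for i in range(8)]

hourly = resample_hourly(readings)  # two hourly rows: 08:00 and 09:00
```

Eight quarter-hour readings collapse into two hourly rows, which can then be joined to the hourly weather records on the shared timestamp.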

Among the five sensor parameters, indoor temperature was identified as the most critical for occupant comfort and energy consumption [], so this study focuses on monitoring and predicting it. Accurate models for indoor temperature prediction can play a pivotal role in optimizing the performance of HVAC systems, ultimately leading to significant energy savings []. While the focus of this study is on indoor temperature, the developed methodology is extensible to other parameters.

3.3. Data Preprocessing and Feature Engineering

3.3.1. Data Preprocessing Pipeline

The preprocessing pipeline was designed to systematically address challenges inherent in real-world sensor data. The process began with a quality assessment to verify sensor calibration and signal stability, including cross-validation with redundant sensors. Timestamps were standardized to the YYYY-MM-DDTHH:mm:ss.sss format for system-wide compatibility.

A significant challenge was data integrity, with 2734 missing values identified in internal sensor data, including a continuous gap of 24 days. To manage this, a hierarchical approach was employed. The primary method utilized valid data from corresponding time periods in adjacent years to preserve seasonal patterns. When this was not feasible, a combination of forward and backward filling techniques was implemented. This methodology was selected to preserve temporal patterns and minimize prediction errors, as shown in Figure 3. Finally, of the two scaling options considered (Min-Max normalization and standardization), Min-Max normalization was applied to scale all features to a [0, 1] range, a step crucial for enhancing the performance and convergence of deep learning models by mitigating feature magnitude bias.

Figure 3. The left image shows the temperature data containing missing values, while the right image presents the reconstructed data after applying the missing-value filling techniques.
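The hierarchical gap-filling strategy can be illustrated with a short sketch. It assumes an evenly spaced series in which `None` marks a missing reading, and it treats "corresponding time periods in adjacent years" as the same position one `period` earlier or later (8760 steps for hourly data). The function name `fill_missing` and the toy values are hypothetical.

```python
def fill_missing(values, period):
    """Fill None gaps: seasonal donor first, then forward/backward fill.

    values : list of floats, with None marking missing readings
    period : number of steps separating corresponding periods
             (e.g. 8760 for hourly data one year apart; an assumption)
    """
    filled = list(values)
    n = len(filled)
    # Stage 1: copy an observed value from the same position in an
    # adjacent period (previous period first, then the next one).
    for i, v in enumerate(values):
        if v is None:
            for j in (i - period, i + period):
                if 0 <= j < n and values[j] is not None:
                    filled[i] = values[j]
                    break
    # Stage 2a: forward fill whatever is still missing.
    for i in range(1, n):
        if filled[i] is None:
            filled[i] = filled[i - 1]
    # Stage 2b: backward fill any gap left at the start of the series.
    for i in range(n - 2, -1, -1):
        if filled[i] is None:
            filled[i] = filled[i + 1]
    return filled


# Toy series with a 4-step "year" so the example stays readable.
raw = [20.0, 21.0, None, 21.5, 20.4, None, None, 21.7]
clean = fill_missing(raw, period=4)
```

In the toy series, index 5 is filled from its seasonal counterpart at index 1, while indices 2 and 6 have no valid counterpart and fall back to forward filling.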

3.3.2. Feature Engineering and Selection

A rigorous feature-engineering and selection pipeline was implemented to capture the building’s complex, multiscale thermal dynamics across multiple zones. The pipeline combines physics-informed temporal encodings, additive time-series decomposition, and multiscale lagged variables, followed by a staged, XGBoost-based selection and correlation-pruning workflow to produce an optimized, non-redundant feature set used for modeling.

For feature engineering, we implemented a three-tiered approach focusing on temporal feature extraction, time series decomposition, and lagged variable integration. The process began with the extraction of fundamental temporal components, where we systematically incorporated 13 time-based features, including day, month, year, season, weekday, day of year, and month of year. These features were encoded using sinusoidal transformations (sine and cosine functions) rather than traditional numerical representations, facilitating the machine learning models’ ability to recognize and learn cyclical patterns in temperature variations.
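A brief sketch of the sinusoidal encoding idea follows. It maps a periodic quantity onto the unit circle so that, for example, December and January end up close together. The feature names and the choice of month, weekday, and day of year are illustrative assumptions, not the paper's exact set of 13 features.

```python
import math
from datetime import datetime


def cyclical_encode(value, period):
    """Map a periodic quantity onto the unit circle as (sin, cos)."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def temporal_features(ts):
    """Return sine/cosine encodings for month, weekday, and day of year."""
    feats = {}
    # Month and day of year are 1-based; shift to 0-based before encoding.
    feats["month_sin"], feats["month_cos"] = cyclical_encode(ts.month - 1, 12)
    feats["weekday_sin"], feats["weekday_cos"] = cyclical_encode(ts.weekday(), 7)
    feats["doy_sin"], feats["doy_cos"] = cyclical_encode(
        ts.timetuple().tm_yday - 1, 365.25
    )
    return feats


def circle_distance(a, b, period):
    """Euclidean distance between two encoded values."""
    (sa, ca), (sb, cb) = cyclical_encode(a, period), cyclical_encode(b, period)
    return math.hypot(sa - sb, ca - cb)


feats = temporal_features(datetime(2023, 1, 1))
dec_to_jan = circle_distance(11, 0, 12)  # adjacent months
jun_to_jan = circle_distance(5, 0, 12)   # roughly half a year apart
```

With a plain integer encoding, December (12) and January (1) would sit at opposite ends of the scale; on the circle, `dec_to_jan` is much smaller than `jun_to_jan`.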

Second, seasonal–trend decomposition was applied using an additive model, $y(t) = \mathrm{Trend}(t) + \mathrm{Seasonal}(t) + \varepsilon(t)$, to extract periodic patterns across multiple temporal scales. Three decomposition periods were defined: daily (24 h) to capture diurnal temperature cycles, weekly (168 h) to represent occupancy-related variations, and yearly (8760 h) to capture long-term seasonal trends. The resulting components provided interpretable features for analyzing building thermal dynamics and were subsequently used as model inputs and for residual analysis.
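The additive model can be made concrete with a classical decomposition sketch. The source does not name the decomposition algorithm, so the centered moving-average trend and per-phase seasonal means used here are an assumption, and `additive_decompose` and the synthetic series are hypothetical.

```python
import numpy as np


def additive_decompose(y, period):
    """Classical additive decomposition: y = trend + seasonal + residual."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Centered moving average spanning one full period. For an even
    # period the two end points get half weight (a 2 x period filter).
    if period % 2 == 0:
        weights = np.r_[0.5, np.ones(period - 1), 0.5] / period
    else:
        weights = np.ones(period) / period
    half = len(weights) // 2
    trend = np.full(n, np.nan)  # edges stay NaN: not enough neighbours
    trend[half:n - half] = np.convolve(y, weights, mode="valid")
    # Seasonal component: mean detrended value at each phase of the cycle,
    # shifted so the seasonal pattern averages to zero.
    detrended = y - trend
    phase_means = np.array(
        [np.nanmean(detrended[p::period]) for p in range(period)]
    )
    phase_means -= phase_means.mean()
    seasonal = phase_means[np.arange(n) % period]
    residual = y - trend - seasonal
    return trend, seasonal, residual


# Synthetic hourly temperature: slow drift plus a daily cycle (14 days).
t = np.arange(24 * 14)
y = 21.0 + 0.01 * t + 2.0 * np.sin(2 * np.pi * t / 24)
trend, seasonal, residual = additive_decompose(y, period=24)
```

Calling the same function with `period=168` or `period=8760` would give the weekly and yearly components, provided the series covers at least two full periods.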

Third, a multi-scale system of lagged variables for internal temperature was introduced: short-term (1 h), medium-term (24 h), and long-term (168–336 h) lags. This captures immediate thermal responses, daily occupancy effects, and longer-term seasonal variations.
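A minimal sketch of multi-scale lag construction is shown below. The specific lags (1, 24, 168, and 336 h) are taken from the ranges mentioned above, but whether these are the four lags ultimately selected is an assumption. `add_lags` is a hypothetical helper.

```python
def add_lags(series, lags):
    """Build one row per time step with the target and its lagged values.

    Rows start only once the longest lag is available, so no row
    needs a placeholder for history that does not exist.
    """
    max_lag = max(lags)
    rows = []
    for t in range(max_lag, len(series)):
        row = {"t": t, "target": series[t]}
        for k in lags:
            row[f"lag_{k}h"] = series[t - k]  # value k hours earlier
        rows.append(row)
    return rows


# Hypothetical hourly indoor temperature; the index doubles as the value
# so that the lag arithmetic is easy to check by eye.
series = [float(i) for i in range(400)]
rows = add_lags(series, lags=(1, 24, 168, 336))
```

The first usable row is at hour 336, which shows the cost of long lags: every additional week of lag removes a week of training rows from the start of the dataset.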

In the next step, to identify the most influential features, we employed a five-stage selection process using the XGBoost model. This process systematically integrated external weather data, temporal features, seasonal patterns, and lagged variables to build a comprehensive predictive framework. In the first stage, external weather data were combined with classroom-specific measurements, and XGBoost was applied to identify the two most significant features correlated with indoor temperature. This was followed by a temporal feature analysis stage, where ten time-based features with the strongest predictive relationships were selected and incorporated into the model.

Subsequently, we enhanced our dataset by integrating seasonal and daily patterns, carefully selecting ten additional high-impact features while excluding previously selected temporal characteristics. This systematic approach was further refined through the incorporation of lagged data features, where we identified four critical time-delay parameters that effectively captured the temporal dependencies of temperature variations across different time scales. The final stage involved conducting a comprehensive correlation analysis of all selected features, as illustrated in Figure A1 through a correlation matrix, to identify and eliminate redundancy. This resulted in an optimized set of twelve features chosen for their high predictive power and minimal intercorrelation. The final selected features are detailed in Table 1.

Table 1. Feature selection results.
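The final redundancy-removal stage can be sketched as a greedy correlation filter. This is only an illustration: the importance scores would come from XGBoost in the study but are hard-coded here, and the 0.9 threshold, the feature names, and `prune_correlated` are assumptions.

```python
import numpy as np


def prune_correlated(X, names, importance, threshold=0.9):
    """Keep features in order of importance, skipping near-duplicates.

    A feature is dropped when its absolute Pearson correlation with any
    already-kept feature exceeds `threshold`.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    order = np.argsort(importance)[::-1]  # most important first
    kept = []
    for idx in order:
        if all(corr[idx, k] <= threshold for k in kept):
            kept.append(idx)
    return [names[i] for i in kept]


# Hypothetical candidate features: two nearly identical temperature
# signals and one unrelated humidity signal.
rng = np.random.default_rng(0)
outdoor_temp = rng.normal(size=500)
apparent_temp = outdoor_temp + 0.05 * rng.normal(size=500)
humidity = rng.normal(size=500)

X = np.column_stack([outdoor_temp, apparent_temp, humidity])
names = ["outdoor_temp", "apparent_temp", "humidity"]
importance = np.array([0.5, 0.3, 0.2])  # placeholder scores

selected = prune_correlated(X, names, importance)
```

Here `apparent_temp` is discarded because it carries almost the same information as the higher-ranked `outdoor_temp`, while `humidity` survives.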

3.4. Prediction Model Development

3.4.1. Model Selection Strategy

Robust predictive capabilities are critical for effective digital twin implementation in building systems, enabling multi-parameter monitoring and anomaly detection across diverse conditions. While temperature prediction is the primary case study in this research, the predictive architecture is designed to be versatile, adapting to various building systems and environmental factors. Model selection criteria emphasized three key capabilities: capturing temporal dependencies, handling multivariate inputs, and adapting to dynamic building states. This rigorous process ensures the selected models effectively address the complexities of building system monitoring.

A comparative analysis was conducted between traditional shallow learning models (e.g., Artificial Neural Networks, linear regression) and Recurrent Neural Networks (RNNs). While many models are effective at detecting abrupt anomalies (e.g., sensor failures or sudden HVAC shutdowns), they often struggle with detecting gradual degradation, such as the slow decline in air conditioning performance. Addressing this challenge requires models capable of learning long-range dependencies within time-series data.

Standard RNNs excel at processing sequential data but are hindered by the vanishing gradient problem, which limits their ability to capture long-term patterns effectively. Similarly, simpler models like Artificial Neural Networks (ANNs) struggle to identify long-term trends without extensive feature engineering of historical data, failing to autonomously learn temporal dynamics.

Consequently, the Long Short-Term Memory (LSTM) architecture was selected as the core predictive engine for both short-term (1 h ahead) and long-term (24 h ahead) forecasting. For short-term predictions, the LSTM enables real-time anomaly detection, effectively capturing both abrupt faults and subtle, gradual issues.

LSTM networks are explicitly designed to remember information over extended periods, making them highly suitable for modeling building thermal dynamics. The core of the LSTM unit is the cell state, which acts as a conveyor belt for information, regulated by three gating mechanisms: the forget gate, the input gate, and the output gate. Based on Figure 4, the key equations governing the LSTM unit at time step t are as follows:

Figure 4. The architecture of a Long Short-Term Memory (LSTM) unit.

Forget Gate ( $f_{t}$ ): This gate determines which information from the previous cell state ( $C_{t-1}$ ) should be discarded.

$f_{t} = \sigma (W_{f} \cdot [h_{t-1}, x_{t}] + b_{f})$

Input Gate ( $i_{t}$ ): This gate decides which new information from the current input ( $x_{t}$ ) to store in the cell state.

$i_{t} = \sigma (W_{i} \cdot [h_{t-1}, x_{t}] + b_{i})$

Candidate Cell State ( $\tilde{C}_{t}$ ): A new candidate vector is created using the current input and previous hidden state.

$\tilde{C}_{t} = \tanh (W_{c} \cdot [h_{t-1}, x_{t}] + b_{c})$

Cell State Update ( $C_{t}$ ): The cell state is updated by a combination of the previous cell state (scaled by the forget gate) and the new candidate values (scaled by the input gate).

$C_{t} = f_{t} \odot C_{t-1} + i_{t} \odot \tilde{C}_{t}$

Output Gate ( $o_{t}$ ) and Hidden State ( $h_{t}$ ): The output gate controls which parts of the cell state will be used to generate the final hidden state ( $h_{t}$ ).

$o_{t} = \sigma (W_{o} \cdot [h_{t-1}, x_{t}] + b_{o})$

$h_{t} = o_{t} \odot \tanh (C_{t})$

In these equations, $x_{t}$ is the input vector at time t, $h_{t-1}$ is the hidden state from the previous time step, $\sigma$ represents the sigmoid function, $\odot$ denotes element-wise multiplication, and W and b denote the weight matrices and bias vectors, respectively. This architecture enables the model to effectively learn both short-term fluctuations and long-term dependencies from the feature-rich time-series data.
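To make the gate equations concrete, the following sketch performs a single forward step of an LSTM unit in NumPy, with the hidden state computed through the hyperbolic tangent of the cell state as in the standard LSTM formulation. It is a didactic illustration rather than the TensorFlow model used in the study. The dictionary layout of `W` and `b` and the layer sizes are assumptions.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, b):
    """One forward step of an LSTM unit.

    W[g] has shape (hidden, hidden + inputs) and b[g] has shape (hidden,)
    for each gate g in {"f", "i", "c", "o"}.
    """
    z = np.concatenate([h_prev, x_t])        # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])       # forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])       # input gate
    c_tilde = np.tanh(W["c"] @ z + b["c"])   # candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde       # cell state update
    o_t = sigmoid(W["o"] @ z + b["o"])       # output gate
    h_t = o_t * np.tanh(c_t)                 # hidden state
    return h_t, c_t


# Hypothetical sizes: 3 input features, 4 hidden units.
n_in, n_hidden = 3, 4
rng = np.random.default_rng(0)
W = {g: rng.normal(scale=0.1, size=(n_hidden, n_hidden + n_in)) for g in "fico"}
b = {g: np.zeros(n_hidden) for g in "fico"}

h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x_t in rng.normal(size=(24, n_in)):      # a 24-step input sequence
    h, c = lstm_step(x_t, h, c, W, b)
```

Because the cell state is updated additively and only rescaled by the forget gate, information can persist across many steps, which is the property the text relies on for long-range thermal patterns.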

3.4.2. Model Training Workflow

The models were trained using a structured workflow to ensure robustness and optimal performance, encompassing sequence transformation, data partitioning, and hyperparameter optimization.

To prepare continuous sensor data for time-series prediction models, particularly those using LSTM architectures, it is crucial to transform it into structured sequences of input–output pairs. This involves segmenting the data into fixed-length sequences, where each sequence contains a series of historical observations and their corresponding target values. This structured approach is essential for enabling the model to learn temporal dependencies and dynamic patterns across different time scales.

RNNs and their variants such as LSTMs are well suited to extracting features from these sequences and handling multivariate inputs, which makes them a natural choice for multi-step forecasting. To train the LSTM models, a sliding window approach was used to create these supervised learning sequences. For short-term, one-step-ahead predictions, a historical sequence of inputs (e.g., 24 h) was used to predict the value at the next step. For long-term, multi-step-ahead predictions, a direct multi-horizon approach was used, where the same input structure forecasted a sequence of future values. This data preparation process is visually represented in Figure 5.

Figure 5. Data preparation for short-term and long-term forecasting.
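The sliding-window transformation for both forecasting modes can be sketched as follows. The 24 h lookback follows the example given in the text. The function `make_windows`, the array shapes, and the toy data are assumptions for illustration.

```python
import numpy as np


def make_windows(features, target, lookback, horizon):
    """Slice a multivariate series into supervised (X, y) pairs.

    X[k] holds `lookback` consecutive feature rows; y[k] holds the next
    `horizon` target values (horizon=1 gives one-step-ahead targets,
    horizon=24 gives a direct multi-horizon target vector).
    """
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    n_samples = len(target) - lookback - horizon + 1
    X = np.stack([features[k:k + lookback] for k in range(n_samples)])
    y = np.stack(
        [target[k + lookback:k + lookback + horizon] for k in range(n_samples)]
    )
    return X, y


# Hypothetical data: 100 hourly steps, 3 features; the target is feature 0.
features = np.column_stack(
    [np.arange(100.0), np.zeros(100), np.ones(100)]
)
target = features[:, 0]

X_short, y_short = make_windows(features, target, lookback=24, horizon=1)
X_long, y_long = make_windows(features, target, lookback=24, horizon=24)
```

`X_short` has shape (76, 24, 3) with one target per window, while `X_long` has shape (53, 24, 3) with a 24-value target vector per window, matching the (samples, time steps, features) layout that recurrent layers typically expect.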

Data partitioning was performed to ensure robust model development and unbiased performance evaluation. The three-year dataset was split chronologically: the first two years (15 January 2021–15 January 2023) were used for training; the following six months (January–July 2023) were reserved for validation and hyperparameter tuning; and the final six months (July 2023–January 2024) were held out as a completely unseen test set to assess model generalization.
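A chronological split of this kind can be expressed in a few lines. The 15 January 2023 training cutoff comes from the text. The 15 July 2023 boundary between validation and test is an assumption, since only the months are given. `chronological_split` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def chronological_split(timestamps, train_end, val_end):
    """Return index lists for train / validation / test in time order.

    No shuffling is applied, so every validation and test sample lies
    strictly after the training period.
    """
    train = [i for i, ts in enumerate(timestamps) if ts < train_end]
    val = [i for i, ts in enumerate(timestamps) if train_end <= ts < val_end]
    test = [i for i, ts in enumerate(timestamps) if ts >= val_end]
    return train, val, test


# Hourly timestamps covering 15 January 2021 up to 15 January 2024.
start, end = datetime(2021, 1, 15), datetime(2024, 1, 15)
n_hours = int((end - start).total_seconds() // 3600)
timestamps = [start + timedelta(hours=h) for h in range(n_hours)]

train_idx, val_idx, test_idx = chronological_split(
    timestamps,
    train_end=datetime(2023, 1, 15),
    val_end=datetime(2023, 7, 15),   # assumed boundary
)
```

Splitting by time rather than at random matters for time series: a random split would let the model see observations adjacent to the test points and would overstate its accuracy.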

Hyperparameter optimization was conducted using the Optuna framework, which employs a Bayesian optimization approach to efficiently search for suitable model configurations []. The search targeted key hyperparameters of the LSTM and XGBoost models, and the Optuna median pruner terminated underperforming trials early. Early stopping based on validation loss was implemented to prevent overfitting. The search spaces and selected values for the short-term LSTM, long-term LSTM, and XGBoost models are summarized in Table 2. The final models were developed in TensorFlow with the Adam optimizer and Mean Squared Error (MSE) as the loss function.

Table 2. Tuned hyperparameters for models.

3.5. Anomaly Detection Framework

In the proposed Digital Twin-enabled framework, anomaly detection is crucial for real-time monitoring and decision-making in building systems. Anomalies are data points or events that deviate from the digital twin’s expected behavior. These deviations may signal sensor degradation, HVAC malfunctions, or unusual environmental conditions.

The framework employs a hybrid approach that combines the predictive power of the LSTM model with the statistical robustness of the Interquartile Range (IQR) method. While advanced unsupervised methods like autoencoders or one-class SVMs are powerful for anomaly detection, they often present challenges in real-world deployment. These models can be computationally expensive for real-time applications and may function as “black boxes,” making it difficult for facility managers to interpret the reason for an alert.

The chosen LSTM-IQR approach offers a pragmatic balance. The LSTM model provides highly accurate one-step-ahead predictions, establishing a dynamic, context-aware baseline of expected system behavior. The IQR method then operates on the residuals (the difference between predicted and actual values), providing a simple, computationally lightweight, and highly interpretable method for setting anomaly thresholds. This strategy is effective for identifying both abrupt faults and gradual performance degradation while remaining transparent and efficient for real-time implementation.

The IQR method is a non-parametric statistical technique known for its robustness to outliers. It defines anomalies based on the spread of the prediction residuals. The IQR is calculated as the difference between the third quartile (Q3) and the first quartile (Q1) of a given dataset:

IQR = Q3 − Q1

Based on this range, dynamic upper and lower thresholds are defined to identify outliers:

Upper Bound = Q3 + k × IQR

Lower Bound = Q1 − k × IQR

where k is a sensitivity constant, typically set to 1.5 for standard outliers or 3.0 for extreme outliers. Within the digital twin framework, this method enables the system to continuously compare the residual between each real-time sensor reading and its predicted value against these statistically derived thresholds. Any residual falling outside this range is automatically flagged as a potential anomaly.
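The thresholding rule can be illustrated with a short Python sketch that uses only the standard library. The function names and the toy residuals are hypothetical, and the quartile interpolation method (`inclusive`) is an assumption, since the text does not state which quartile convention was used.

```python
from statistics import quantiles


def iqr_bounds(residuals, k=1.5):
    """Return (lower, upper) anomaly thresholds from residual quartiles."""
    q1, _, q3 = quantiles(residuals, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_anomalies(residuals, k=1.5):
    """Mark each residual that falls outside the IQR-based bounds."""
    lower, upper = iqr_bounds(residuals, k)
    return [r < lower or r > upper for r in residuals]


# Toy residuals in degrees Celsius (actual minus predicted); the last value
# imitates an abrupt deviation.
residuals = [-0.2, -0.1, 0.0, 0.1, 0.2, 0.1, -0.1, 0.0, 3.0]
lower, upper = iqr_bounds(residuals)
flags = flag_anomalies(residuals)
```

For these values Q1 = -0.1 and Q3 = 0.1, so the bounds are roughly ±0.4 °C and only the final residual is flagged. Because the quartiles ignore the magnitude of extreme values, a single large residual does not widen the thresholds.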

3.6. Model Evaluation

To assess the accuracy and reliability of the forecasting models, five evaluation metrics were employed. Each metric captures a different aspect of model performance, providing a comprehensive evaluation.

Mean Squared Error (MSE) measures the average squared difference between predicted and actual values, heavily penalizing large errors. Root Mean Squared Error (RMSE) is the square root of the MSE and is therefore expressed in the same unit as the temperature (°C).

\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2}

\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2}}

Mean Absolute Error (MAE) reflects the average magnitude of absolute errors, providing an interpretable measure of typical prediction error.

\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_{i} - \hat{y}_{i}|

Coefficient of Determination (R²) measures the proportion of variance in the observed data that is explained by the model, indicating goodness of fit. Here, \bar{y} denotes the mean of the observed values.

R^{2} = 1 - \frac{\sum_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2}}{\sum_{i=1}^{n} (y_{i} - \bar{y})^{2}}

Percentage of Predictions Within 1 °C Error: designed to assess practical utility, this metric quantifies the proportion of forecasts that fall within a 1 °C tolerance of the actual temperature. It is calculated as:

\mathrm{Accuracy}_{1\,^{\circ}\mathrm{C}} = \frac{100}{n} \sum_{i=1}^{n} \mathbf{1}\left( |y_{i} - \hat{y}_{i}| \le 1\,^{\circ}\mathrm{C} \right)

where y_{i} is the actual temperature, \hat{y}_{i} is the predicted temperature, \mathbf{1}(\cdot) is the indicator function (equal to 1 when the condition holds and 0 otherwise), and n is the total number of predictions. This combination of metrics ensures a balanced assessment of both statistical accuracy and practical applicability.
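For reference, the five metrics can be computed together as in the following minimal Python sketch. The function name `evaluate`, the dictionary keys, and the toy temperatures are hypothetical, and treating an error of exactly 1 °C as "within tolerance" is an assumption.

```python
import math


def evaluate(actual, predicted, tolerance=1.0):
    """Compute MSE, RMSE, MAE, R2 and the share of errors within `tolerance`."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_actual = sum(actual) / n
    # Total sum of squares around the mean of the observed values.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - (mse * n) / ss_tot
    # Percentage of predictions whose absolute error is at most `tolerance`.
    within = 100.0 * sum(abs(e) <= tolerance for e in errors) / n
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": mae,
        "R2": r2,
        "within_1C_pct": within,
    }


# Toy example in degrees Celsius (hypothetical values).
actual = [20.0, 21.0, 22.0, 23.0]
predicted = [20.5, 21.0, 21.0, 25.0]
scores = evaluate(actual, predicted)
```

In this toy case one large miss (2 °C) dominates the MSE and pushes R² slightly below zero, while the within-1 °C share still reports 75%, which illustrates why the tolerance-based metric complements the squared-error metrics.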

4. Results and Discussion

This section presents the performance outcomes of different machine learning prediction models. Their performance was evaluated for two classrooms (128 and 201) using multiple error metrics, including MAE, MSE, RMSE, R², and the percentage of predictions within a 1 °C error margin.

A simple baseline model was added to provide a benchmark. It was defined as the historical average of hourly measurements from the previous 24 h. While many studies compare neural networks with advanced methods such as Random Forests [] and gradient boosting [,], they often overlook baseline models []. Yet, in some cases, baselines can perform surprisingly well []. In this study, the baseline achieved a mean absolute error of 0.88 °C for Classroom 128 and 1.65 °C for Classroom 201.
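A baseline of this type can be sketched as follows, assuming hourly readings and a prediction equal to the mean of the preceding 24 values. The function name and the toy series are hypothetical and serve only to show the mechanics.

```python
def rolling_mean_baseline(hourly_temps, window=24):
    """Predict each hour as the mean of the previous `window` hourly readings."""
    preds = []
    for t in range(window, len(hourly_temps)):
        past = hourly_temps[t - window:t]
        preds.append(sum(past) / window)
    return preds


# Toy series (hypothetical): a flat day followed by a 2 degree step change.
temps = [20.0] * 24 + [22.0, 22.0]
baseline = rolling_mean_baseline(temps)
```

The baseline reacts slowly to the step change because each new reading contributes only 1/24 of the average, which is the kind of lag that a learned sequence model is expected to reduce.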

The predictive performance of the models was evaluated using a six-month test dataset. Among the RNNs, LSTM achieved the highest accuracy for both classrooms, as shown in Figure 6. Based on this result, LSTM was selected for comparison with the other shallow models.

Figure 6. Comparison of RNN models for Classroom 128 and 201.

4.1. Short-Term Prediction

Short-term prediction began with a comparative analysis of shallow learning models (ANN, XGBoost, and RF) and the LSTM network to evaluate their ability to capture temporal dependencies and handle multivariate inputs. As shown in Table 3, LSTM achieved the best performance, with MAE, MSE, and RMSE about 50% lower than those of shallow models. More than 97% of its predictions were within 1 °C, with an R² above 0.98. This comparison confirms that LSTM captures complex temporal dynamics more effectively than tree-based and feedforward architectures.

Table 3. Short-term prediction results.

The predicted results for Classrooms 128 and 201 demonstrate that ANN, XGBoost, and RF models also performed well, achieving MAE values below 0.5 °C and R² above 0.95. However, their accuracy was slightly lower, with 90–98% of predictions within 1 °C. These models were less effective at learning long-range dependencies without additional feature engineering, highlighting the strength of the LSTM in modeling complex temporal dynamics. The robust data preparation process enhanced the performance of all models, particularly benefiting the LSTM by enabling accurate learning of temporal patterns.

4.2. Long-Term Prediction

Long-term prediction up to 24 h ahead was conducted using the LSTM model to evaluate its reliability in identifying the optimal forecast horizon and capturing energy patterns. Across both classrooms, the average MAE over the 24 h horizon was below 0.65 °C, and the R² values exceeded 0.87, as shown in Table 4.

Table 4. The LSTM 24 h prediction result.

More than 83% of predictions within this horizon exhibited errors of less than 1 °C, as shown in Figure 7. The model also produced highly accurate forecasts up to 16 h ahead for both classrooms, with R² values exceeding 0.92 and over 90% of predictions falling within 1 °C of the actual temperature. As a result, the 16 h horizon emerged as the optimal forecast window, consistently providing high accuracy and reliability.

Figure 7. The left figure illustrates the R² performance for prediction horizons up to 24 h ahead, whereas the right figure presents the corresponding prediction accuracy (percentage) across the same forecast horizons.

The consistent performance across two zones with different profiles demonstrates the model’s robustness for multi-zone applications and its effectiveness for both short- and long-term indoor condition prediction.

4.3. Anomaly Detection

The hybrid LSTM-IQR framework was evaluated for its ability to detect both abrupt faults and gradual performance degradation in building systems while ensuring computational efficiency and interpretability. The IQR method was applied with a sensitivity constant k, which was tuned during validation to achieve a practical alert rate (approximately 3–5 alarms per zone per day). This approach balanced sensitivity and specificity, ensuring that detected anomalies reflected meaningful deviations rather than frequent false positives, thereby maintaining operational stability.

To maintain adaptability over time, the IQR thresholds were computed on a rolling two-week window of recent residuals rather than the entire history. This dynamic recalibration allowed the framework to adjust to slow drifts and seasonal variations in building conditions, reducing the risk of false alarms from long-term data shifts.
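The rolling recalibration can be sketched with a fixed-length buffer of recent residuals. This is an illustrative sketch under several assumptions: hourly residuals (so that two weeks correspond to 336 samples), a minimum history before any alert is raised, and flagged residuals being kept in the buffer. The class name `RollingIQRDetector` and its parameters are hypothetical.

```python
from collections import deque
from statistics import quantiles


class RollingIQRDetector:
    """Flag residuals against IQR bounds computed on a rolling window."""

    def __init__(self, window=14 * 24, k=1.5, min_history=24):
        # deque(maxlen=...) silently discards the oldest residual when full.
        self.history = deque(maxlen=window)
        self.k = k
        self.min_history = min_history

    def update(self, residual):
        """Check `residual` against the current bounds, then store it."""
        is_anomaly = False
        if len(self.history) >= self.min_history:
            q1, _, q3 = quantiles(self.history, n=4, method="inclusive")
            spread = q3 - q1
            lower = q1 - self.k * spread
            upper = q3 + self.k * spread
            is_anomaly = residual < lower or residual > upper
        self.history.append(residual)
        return is_anomaly


# Toy stream (hypothetical): small alternating residuals, one spike, then normal.
stream = [0.1, -0.1] * 24 + [2.5, 0.05]
detector = RollingIQRDetector(window=48, k=1.5, min_history=24)
flags = [detector.update(r) for r in stream]
```

Only the spike is flagged in this stream. Keeping flagged residuals in the buffer lets the thresholds follow a persistent shift; excluding them instead would keep the thresholds tighter after a fault, and the text does not say which choice was made.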

An unsupervised validation strategy was adopted to assess the reliability of detected anomalies, focusing on the detector’s behavioral consistency, plausibility, and robustness. The alert workload was characterized by alarm frequency across classrooms, confirming that detection rates remained operationally manageable.

Detected anomalies were further analyzed through deviations between actual and predicted temperatures, where larger residuals indicated more significant fault events. Over a six-month test period, the LSTM-IQR framework effectively identified anomalies in Classrooms 128 and 201, as illustrated in Figure 8 and Figure 9.

Figure 8. Anomaly Detection Results for Classroom 128.

Figure 9. Anomaly Detection Results for Classroom 201.

The most critical anomalies detected during a two-month subset, summarized in Table 5, were primarily associated with sensor shutdowns, unexpected environmental fluctuations (e.g., abrupt outdoor temperature changes), and occupant-driven disturbances (e.g., prolonged window openings). Although the absence of ground-truth fault labels precludes definitive verification of each detected event, the anomalies exhibited strong temporal coherence and operational plausibility. This consistency supports the reliability and interpretive value of the proposed framework despite the inherent limitations of unsupervised validation.

Table 5. Two-Month Anomaly Detection Results for Classroom 128 and 201.

The results of anomaly detection primarily serve as an early warning mechanism to identify potential faults rather than a complete diagnostic tool. The proposed LSTM–IQR framework effectively identifies deviations from expected thermal behavior, helping to flag potential system irregularities before performance degradation occurs. However, determining the root cause of detected anomalies requires integration with additional contextual data sources—such as energy consumption patterns, HVAC damper positions, occupancy schedules, and equipment-level control signals. For instance, an anomaly may arise from a sensor drift, stuck damper, or occupancy-related disturbance, each requiring a distinct operational response. These results demonstrate the framework’s potential to support facility operators in prioritizing maintenance actions, optimizing operational efficiency, and reducing downtime through early fault detection.

The empirical results from this study reaffirm the effectiveness of the proposed digital twin-enabled framework in enhancing predictive monitoring and anomaly detection within building systems. The LSTM model demonstrated superior performance in indoor temperature forecasting, achieving R² values above 0.98 and RMSE below 0.55 °C across multiple zones. These results highlight the model’s robustness in capturing complex temporal dependencies and nonlinear thermal interactions inherent in multi-zone environments.

A comparative assessment with recent studies further highlights the superior performance of the proposed LSTM model for short-term prediction. In this study, the LSTM outperformed other machine learning models commonly applied in HVAC demand response and indoor temperature prediction tasks, such as ANN, SVM, and XGBoost []. Specifically, prior research reported RMSE = 0.88 for ANN [], R² = 0.91 for DEML [], and R² = 0.93 for LSTM [], whereas the proposed model achieved even higher accuracy across the tested zones.

The enhanced accuracy in this study can be attributed to the integration of an optimized hyperparameter tuning process and a robust data preprocessing pipeline that effectively handled missing and noisy IoT sensor data. Overall, the findings reinforce the reliability of LSTM architectures for short-term temperature forecasting and demonstrate their potential for deployment in real-time digital twin environments for predictive building management.

While most existing studies have focused primarily on short-term forecasting horizons, the proposed framework maintained high predictive accuracy—achieving a ±1 °C error margin in over 90% of test cases—while providing reliable forecasts extending up to 16 h ahead. This extended prediction horizon highlights the framework’s capability for real-time operational deployment and proactive building management. Model performance was further strengthened through systematic hyperparameter optimization using the Optuna framework, which employed an automated Bayesian search with pruning to identify optimal configurations for both LSTM and XGBoost models.

In the context of anomaly detection, the hybrid LSTM-IQR approach effectively identified deviations in one-step-ahead residuals, enabling proactive fault identification that could mitigate operational inefficiencies. This method’s performance resonates with recent advancements in digital twin-based anomaly detection, where deep learning techniques have achieved detection accuracies up to 98% on industrial IoT datasets, with significant reductions in false positives [].

Comparative benchmarks further support this, showing LSTM-integrated frameworks yielding precision and recall improvements over traditional models in building environments, particularly for time-series anomalies []. The integration of a robust data preprocessing pipeline to address IoT sensor imperfections, such as noise and missing values, not only enhanced model stability but also addressed a common gap in prior research, where data quality issues often degrade forecasting reliability in multi-zone settings.

The framework’s strengths lie in its comprehensive validation using a full year of operational data from a real-world educational building, incorporating varying occupancy patterns and external weather influences. This long-term evaluation covers both short-term and long-term forecasting horizons, addressing a limitation of many existing studies, which focus predominantly on short-term predictions or simulated environments. By leveraging digital twin technology, the approach facilitates enhanced energy efficiency and occupant comfort through predictive analytics, potentially reducing energy consumption in HVAC systems by enabling optimized control strategies.

Although the proposed LSTM-based framework was developed and validated using data from a single educational building, the underlying methodology is designed to be independent of specific architectural or mechanical characteristics. The model primarily relies on environmental variables, and the data preprocessing and model training pipelines are structured to function independently of building layout or HVAC configuration. This design enables the framework to be easily retrained or fine-tuned using locally collected sensor data, supporting adaptability and broad applicability across diverse geographical regions and operational contexts.

These findings contribute to the Architecture, Engineering, and Construction (AEC) sector by providing a generalizable methodology that supports sustainable building management, aligning with broader goals of reducing carbon emissions in urban infrastructures.

However, the results should be interpreted with consideration of the study’s specific context. While the LSTM model’s performance is competitive, its computational intensity may limit scalability in resource-constrained deployments, a challenge echoed in the literature on edge-based anomaly detection. Future iterations could explore lightweight alternatives or hybrid models to balance accuracy and efficiency, further advancing the practical deployment of digital twins in diverse building typologies.

5. Conclusions

This study introduced a comprehensive digital twin-enabled framework for real-time monitoring and anomaly detection in building systems, addressing key gaps in predictive diagnostics, robust data preparation, and long-term validation in multi-zone environments. Validated using a full year of operational data from a multi-zone educational building in Rhode Island, USA, the framework integrated IoT sensor networks with advanced machine learning models, including LSTM, RNN, GRU, ANN, XGBoost, and RF, for short-term (1 h) and long-term (24 h) indoor temperature forecasting.

The LSTM model demonstrated superior performance, achieving R² values exceeding 0.98 and RMSE below 0.55 °C across tested rooms, with predictions maintaining a ±1 °C error margin for over 90% of the test data and reliability up to 16 h ahead. The use of Bayesian hyperparameter tuning with the Optuna framework further reinforces the robustness of the proposed framework, ensuring that model performance reflects true learning capability rather than parameter sensitivity or manual bias.

For anomaly detection, the hybrid LSTM-Interquartile Range (IQR) method effectively identified deviations in one-step-ahead residuals, enabling proactive fault identification and operational optimization. By incorporating a robust data preprocessing pipeline to handle IoT imperfections such as noise and missing values, the framework enhances energy efficiency, occupant comfort, and system resilience in complex building operations.

The contributions of this work include a generalizable digital twin methodology that bridges short- and long-term forecasting horizons, validated in a real-world multi-zone educational building with varying occupancy patterns. This advances the application of digital twins in the AEC sector, offering practical tools for predictive maintenance and sustainable building management. Ultimately, the proposed framework supports the transition toward smarter, more energy-efficient buildings, reducing energy consumption and carbon emissions while promoting broader adoption of intelligent technologies in urban environments.

Despite these promising results, several limitations should be acknowledged. First, validation was conducted on a single multi-zone educational building in a temperate climate. Broader validation across diverse building types and climatic regions is necessary to evaluate recalibration needs and confirm model generalizability. Second, the study primarily focused on temperature and weather parameters, excluding other influential variables such as occupant behavior, HVAC component performance, and energy consumption. Integrating these contextual factors would enable a more holistic representation of building dynamics. Third, anomaly detection was performed without ground-truth fault labels, limiting the ability to quantitatively validate anomaly accuracy. Finally, the computational demands of deep learning models may restrict their scalability on low-resource edge devices.

Future research should expand the framework to multi-site datasets encompassing varied building typologies and climates, incorporating contextual data such as occupancy patterns, equipment-level logs, and energy metrics. Exploring advanced architectures—such as transformers and attention-based hybrid models—may further enhance adaptability for both short- and long-term forecasting. To improve interpretability, future transformer-based extensions will employ explainable AI (XAI) techniques, enabling facility managers to understand model reasoning and ensure transparent decision-making within the digital twin environment.

Moreover, strengthening IoT cybersecurity through encrypted communication, secure gateways, and access control will be essential for protecting multi-building systems. Furthermore, to enable large-scale implementation and enhance interoperability, future research should focus on integrating standardized communication protocols to facilitate seamless interaction among heterogeneous sensors and building management systems. This approach would reduce implementation costs and support broader, multi-building deployment of the proposed framework.

In parallel, adopting lightweight edge-computing approaches can improve data privacy by processing sensitive information locally before cloud synchronization. Ensuring interoperability with BIM and BMS platforms through standardized data exchange formats will be critical for achieving scalable, secure, and efficient digital twin deployment.

Overall, this research establishes a robust foundation for integrating AI-driven digital twin technologies into the AEC/FM sectors, paving the way toward more intelligent, energy-efficient, and resilient building environments.

Author Contributions

Conceptualization, F.H. and I.R.; methodology, F.H.; software, F.H. and S.Z.G.; validation, F.H., I.R. and S.Z.G.; formal analysis, F.H. and S.Z.G.; investigation, I.R.; resources, I.R.; data curation, F.H.; writing—original draft preparation, F.H.; writing—review and editing, S.Z.G. and N.S.; visualization, F.H., S.Z.G.; supervision, I.R. and N.S.; project administration, F.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data used in this study were obtained from a commercial building and contain operational information that cannot be publicly shared due to confidentiality restrictions. No personal or sensitive data were collected. Therefore, institutional ethics approval was not required for this research.

Acknowledgments

During the preparation of this work, the authors used Generative Artificial Intelligence (GPT, OpenAI) in order to improve the readability and language of the manuscript. After using this tool, the authors reviewed and edited the content as needed. The authors consent to this acknowledgment and take full responsibility for the content of the published article.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

DT	Digital Twin
LSTM	Long Short-Term Memory
RNN	Recurrent Neural Network
GRU	Gated Recurrent Unit
ANN	Artificial Neural Network
XGBoost	Extreme Gradient Boosting
RF	Random Forest
IQR	Interquartile Range
BIM	Building Information Modeling
IoT	Internet of Things
MSE	Mean Squared Error
RMSE	Root Mean Squared Error
MAE	Mean Absolute Error
$R^{2}$	Coefficient of Determination

Appendix A

Figure A1. Pearson correlation matrix for feature selection.

References

Klepeis, N.E.; Nelson, W.C.; Ott, W.R.; Robinson, J.P.; Tsang, A.M.; Switzer, P.; Behar, J.V.; Hern, S.C.; Engelmann, W.H. The National Human Activity Pattern Survey (NHAPS): A resource for assessing exposure to environmental pollutants. J. Expo. Sci. Environ. Epidemiol. 2001, 11, 231–252.
Xie, X.; Merino, J.; Moretti, N.; Pauwels, P.; Chang, J.Y.; Parlikad, A. Digital twin enabled fault detection and diagnosis process for building HVAC systems. Autom. Constr. 2023, 146, 104695.
Hong, T.; Koo, C.; Kim, J.; Lee, M.; Jeong, K. A review on sustainable construction management strategies for monitoring, diagnosing, and retrofitting the building’s dynamic energy performance: Focused on the operation and maintenance phase. Appl. Energy 2015, 155, 671–707.
Seyedzadeh, S.; Rahimian, F.P.; Glesk, I.; Roper, M. Machine learning for estimation of building energy consumption and performance: A review. Vis. Eng. 2018, 6, 5.
Idahosa, L.O.; Akotey, J.O. A social constructionist approach to managing HVAC energy consumption using social norms—A randomised field experiment. Energy Policy 2021, 154, 112293.
Papadopoulos, S.; Kontokosta, C.E.; Vlachokostas, A.; Azar, E. Rethinking HVAC temperature setpoints in commercial buildings: The potential for zero-cost energy savings and comfort improvement in different climates. Build. Environ. 2019, 155, 350–359.
Marocco, M.; Garofolo, I. Integrating disruptive technologies with facilities management: A literature review and future research directions. Autom. Constr. 2021, 131, 103917.
Motamedi, A.; Hammad, A.; Asen, Y. Knowledge-assisted BIM-based visual analytics for failure root cause detection in facilities management. Autom. Constr. 2014, 43, 73–83.
Volk, R.; Stengel, J.; Schultmann, F. Building Information Modeling (BIM) for existing buildings—Literature review and future needs. Autom. Constr. 2014, 38, 109–127.
Prušková, K. BIM technology and changes in traditional design process, reliability of data from related registers. In IOP Conference Series: Materials Science and Engineering; IOP Publishing: Bristol, UK, 2020; Volume 960, p. 032049.
Lu, Q.; Xie, X.; Parlikad, A.K.; Schooling, J.M. Digital twin-enabled anomaly detection for built asset monitoring in operation and maintenance. Autom. Constr. 2020, 118, 103277.
Shafto, M.; Conroy, M.; Doyle, R.; Glaessgen, E.; Kemp, C.; LeMoigne, J.; Wang, L. Draft modeling, simulation, information technology & processing roadmap. Technol. Area 2010, 11, 1–32.
Grieves, M.W. Product lifecycle management: The new paradigm for enterprises. Int. J. Prod. Dev. 2005, 2, 71–84. [Google Scholar] [CrossRef]
Grieves, M.; Vickers, J. Digital twin: Mitigating unpredictable, undesirable emergent behavior in complex systems. In Transdisciplinary Perspectives on Complex Systems: New Findings and Approaches; Florida Inst. Technol.: Melbourne, FL, USA, 2017; pp. 85–113. [Google Scholar]
VanDerHorn, E.; Mahadevan, S. Digital Twin: Generalization, characterization and implementation. Decis. Support Syst. 2021, 145, 113524. [Google Scholar] [CrossRef]
Boukaf, M.; Fadli, F.; Meskin, N. A Comprehensive Review of Digital Twin Technology in Building Energy Consumption Forecasting. IEEE Access 2024, 12, 187778–187799. [Google Scholar] [CrossRef]
Arsecularatne, B.; Rodrigo, N.; Chang, R. Digital Twins for Reducing Energy Consumption in Buildings: A Review. Sustainability 2024, 16, 9275. [Google Scholar] [CrossRef]
Zahedi, F.; Alavi, H.; Sardroud, J.M.; Dang, H. Digital Twins in the Sustainable Construction Industry. Buildings 2024, 14, 3613. [Google Scholar] [CrossRef]
Huang, J.; Wu, P.; Li, W.; Zhang, J.; Xu, Y. Exploring the Applications of Digital Twin Technology in Enhancing Sustainability in Civil Engineering: A Review. Struct. Durab. Health Monit. (SDHM) 2024, 18. [Google Scholar] [CrossRef]
Zhang, Z.; Wei, Z.; Court, S.; Yang, L.; Wang, S.; Thirunavukarasu, A.; Zhao, Y. A Review of Digital Twin Technologies for Enhanced Sustainability in the Construction Industry. Buildings 2024, 14, 1113. [Google Scholar] [CrossRef]
Liu, Z.; Lu, Y.; Shen, M.; Peh, L.C. Transition from building information modeling (BIM) to integrated digital delivery (IDD) in sustainable building management: A knowledge discovery approach based review. J. Clean. Prod. 2021, 291, 125223. [Google Scholar] [CrossRef]
Boje, C.; Guerriero, A.; Kubicki, S.; Rezgui, Y. Towards a semantic Construction Digital Twin: Directions for future research. Autom. Constr. 2020, 114, 103179. [Google Scholar] [CrossRef]
Iliuţă, M.-E.; Moisescu, M.-A.; Pop, E.; Ionita, A.-D.; Caramihai, S.-I.; Mitulescu, T.-C. Digital Twin—A Review of the Evolution from Concept to Technology and Its Analytical Perspectives on Applications in Various Fields. Appl. Sci. 2024, 14, 5454. [Google Scholar] [CrossRef]
Yeom, S.; Kim, J.; Kang, H.; Jung, S.; Hong, T. Digital twin (DT) and extended reality (XR) for building energy management. Energy Build. 2024, 323, 114746. [Google Scholar] [CrossRef]
Francisco, A.; Mohammadi, N.; Taylor, J.E. Smart city digital twin–enabled energy management: Toward real-time urban building energy benchmarking. J. Manag. Eng. 2020, 36, 04019045. [Google Scholar] [CrossRef]
Hwang, J.; Kim, J.; Yoon, S. DT-BEMS: Digital twin-enabled building energy management system for information fusion and energy efficiency. Energy 2025, 326, 136162. [Google Scholar] [CrossRef]
Hosamo, H.H.; Nielsen, H.K.; Kraniotis, D.; Svennevig, P.R.; Svidt, K. Improving building occupant comfort through a digital twin approach: A Bayesian network model and predictive maintenance method. Energy Build. 2023, 288, 112992. [Google Scholar] [CrossRef]
Benfer, R.; Müller, J. Semantic digital twin creation of building systems through time series based metadata inference—A review. Energy Build. 2024, 321, 114637. [Google Scholar] [CrossRef]
Iqbal, F.; Mirzabeigi, S. Digital Twin-Enabled Building Information Modeling–Internet of Things (BIM-IoT) Framework for Optimizing Indoor Thermal Comfort Using Machine Learning. Buildings 2025, 15, 1584. [Google Scholar] [CrossRef]
Arowoiya, V.A.; Moehler, R.C.; Fang, Y. Digital twin technology for thermal comfort and energy efficiency in buildings: A state-of-the-art and future directions. Energy Built Environ. 2024, 5, 641–656. [Google Scholar] [CrossRef]
Agostinelli, S.; Cumo, F.; Guidi, G.; Tomazzoli, C. Cyber-physical systems improving building energy management: Digital twin and artificial intelligence. Energies 2021, 14, 2338.
Tahmasebinia, F.; Lin, L.; Wu, S.; Kang, Y.; Sepasgozar, S. Exploring the benefits and limitations of digital twin technology in building energy. Appl. Sci. 2023, 13, 8814.
Jeong, K.; Koo, C.; Hong, T. An estimation model for determining the annual energy cost budget in educational facilities using SARIMA (seasonal autoregressive integrated moving average) and ANN (artificial neural network). Energy 2014, 71, 71–79.
Arora, S.; Taylor, J.W. Short-term forecasting of anomalous load using rule-based triple seasonal methods. IEEE Trans. Power Syst. 2013, 28, 3235–3242.
Smarra, F.; Jain, A.; De Rubeis, T.; Ambrosini, D.; D’Innocenzo, A.; Mangharam, R. Data-driven model predictive control using random forests for building energy optimization and climate control. Appl. Energy 2018, 226, 1252–1272.
Martínez-Comesaña, M.; Eguía-Oller, P.; Martínez-Torres, J.; Febrero-Garrido, L.; Granada-Álvarez, E. Optimisation of thermal comfort and indoor air quality estimations applied to in-use buildings combining NSGA-III and XGBoost. Sustain. Cities Soc. 2022, 80, 103723.
Huang, Y.; Miles, H.; Zhang, P. A Sequential Modelling Approach for Indoor Temperature Prediction and Heating Control in Smart Buildings. arXiv 2020, arXiv:2009.09847.
Magnier, L.; Haghighat, F. Multiobjective optimization of building design using TRNSYS simulations, genetic algorithm, and Artificial Neural Network. Build. Environ. 2010, 45, 739–746.
Ferreira, P.; Ruano, A.; Silva, S.; Conceicao, E. Neural networks based predictive control for thermal comfort and energy savings in public buildings. Energy Build. 2012, 55, 238–251.
Ramadan, L.; Shahrour, I.; Mroueh, H.; Chehade, F.H. Use of machine learning methods for indoor temperature forecasting. Future Internet 2021, 13, 242.
Boesgaard, C.; Hansen, B.V.; Kejser, U.B.; Mollerup, S.H.; Ryhl-Svendsen, M.; Torp-Smith, N. Prediction of the indoor climate in cultural heritage buildings through machine learning: First results from two field tests. Herit. Sci. 2022, 10, 176.
Ma, C.; Pan, S.; Cui, T.; Liu, Y.; Cui, Y.; Wang, H.; Wan, T. Energy consumption prediction for office buildings: Performance evaluation and application of ensemble machine learning techniques. J. Build. Eng. 2025, 102, 112021.
Zhang, L. Data-driven building energy modeling with feature selection and active learning for data predictive control. Energy Build. 2021, 252, 111436.
Matetić, I.; Štajduhar, I.; Wolf, I.; Ljubic, S. A review of data-driven approaches and techniques for fault detection and diagnosis in HVAC systems. Sensors 2022, 23, 1.
Arwansyah; Suryani; Sy, H.; Faizal; Alam, S.; Piu, S.; Usman; Tamsir, N.; Djafar, I. Deep sequence models for time series data: A comparative study and parameter fine-tuning approach. In Proceedings of the 2024 11th International Conference on Electrical Engineering, Computer Science and Informatics (EECSI), Yogyakarta, Indonesia, 26–27 September 2024; pp. 703–709.
Somu, N.; MR, G.R.; Ramamritham, K. A hybrid model for building energy consumption forecasting using long short term memory networks. Appl. Energy 2020, 261, 114131.
Kim, T.-Y.; Cho, S.-B. Predicting residential energy consumption using CNN-LSTM neural networks. Energy 2019, 182, 72–81.
Cui, B.; Im, P.; Bhandari, M.; Lee, S. Performance analysis and comparison of data-driven models for predicting indoor temperature in multi-zone commercial buildings. Energy Build. 2023, 298, 113499.
Xing, T.; Sun, K.; Zhao, Q. MITP-Net: A deep-learning framework for short-term indoor temperature predictions in multi-zone buildings. Build. Environ. 2023, 239, 110388.
Jiang, B.; Gong, H.; Qin, H.; Zhu, M. Attention-LSTM architecture combined with Bayesian hyperparameter optimization for indoor temperature prediction. Build. Environ. 2022, 224, 109536.
Elmaz, F.; Eyckerman, R.; Casteels, W.; Latré, S.; Hellinckx, P. CNN-LSTM architecture for predictive indoor temperature modeling. Build. Environ. 2021, 206, 108327.
Park, B.K.; Kim, C.-J. Short-term prediction for indoor temperature control using artificial neural network. Energies 2023, 16, 7724.
Fang, Z.; Crimier, N.; Scanu, L.; Midelet, A.; Alyafi, A.; Delinchant, B. Multi-zone indoor temperature prediction with LSTM-based sequence to sequence model. Energy Build. 2021, 245, 111053.
Norouzi, P.; Maalej, S.; Mora, R. Applicability of deep learning algorithms for predicting indoor temperatures: Towards the development of digital twin HVAC systems. Buildings 2023, 13, 1542.
Yu, W.; Nakisa, B.; Ali, E.; Loke, S.W.; Stevanovic, S.; Guo, Y. Sensor-based indoor air temperature prediction using deep ensemble machine learning: An Australian urban environment case study. Urban Clim. 2023, 51, 101599.
Borda, D.; Bergagio, M.; Amerio, M.; Masoero, M.C.; Borchiellini, R.; Papurello, D. Development of anomaly detectors for HVAC systems using machine learning. Processes 2023, 11, 535.
Hodavand, F.; Ramaji, I.J.; Sadeghi, N. Digital twin for fault detection and diagnosis of building operations: A systematic review. Buildings 2023, 13, 1426.
Cicero, S.; Guarascio, M.; Guerrieri, A.; Mungari, S. A deep anomaly detection system for IoT-based smart buildings. Sensors 2023, 23, 9331.
Mirdula, S.; Roopa, M. MUD enabled deep learning framework for anomaly detection in IoT integrated smart building. e-Prime-Adv. Electr. Eng. Electron. Energy 2023, 5, 100186.
Song, Y.; Kuang, S.; Huang, J.; Zhang, D. Unsupervised anomaly detection of industrial building energy consumption. Energy Built Environ. 2024.
Shahid, Z.K.; Saguna, S.; Ahlund, C.; Mitra, K. Anomaly Detection using Transfer Learning for Electricity Consumption in School Buildings: A Case of Northern Sweden. Energy Build. 2025, 346, 116129.
Abdollah, M.A.F.; Scoccia, R.; Aprile, M. Transformer encoder based self-supervised learning for HVAC fault detection with unlabeled data. Build. Environ. 2024, 258, 111568.
Floris, A.; Porcu, S.; Girau, R.; Atzori, L. An IoT-based smart building solution for indoor environment management and occupants prediction. Energies 2021, 14, 2959.
Eneyew, D.D.; Capretz, M.A.; Bitsuamlak, G.T. Toward smart-building digital twins: BIM and IoT data integration. IEEE Access 2022, 10, 130487–130506.
ElArwady, Z.; Kandil, A.; Afiffy, M.; Marzouk, M. Modeling Indoor Thermal Comfort in Buildings using Digital Twin and Machine Learning. Dev. Built Environ. 2024, 19, 100480.
Rathore, M.M.; Shah, S.A.; Shukla, D.; Bentafat, E.; Bakiras, S. The role of AI, machine learning, and big data in digital twinning: A systematic literature review, challenges, and opportunities. IEEE Access 2021, 9, 32030–32052.
Zhang, M.; Tao, F.; Huang, B.; Liu, A.; Wang, L.; Anwer, N.; Nee, A. Digital twin data: Methods and key technologies. Digit. Twin 2022, 1, 2.
Afroz, Z.; Urmee, T.; Shafiullah, G.; Higgins, G. Real-time prediction model for indoor temperature in a commercial building. Appl. Energy 2018, 231, 29–53.
Afroz, Z.; Shafiullah, G.; Urmee, T.; Higgins, G. Modeling techniques used in building HVAC control systems: A review. Renew. Sustain. Energy Rev. 2018, 83, 64–84.
Ahmad, M.W.; Mourshed, M.; Rezgui, Y. Trees vs. Neurons: Comparison between random forest and ANN for high-resolution prediction of building energy consumption. Energy Build. 2017, 147, 77–89.
Attoue, N.; Shahrour, I.; Younes, R. Smart building: Use of the artificial neural network approach for indoor temperature forecasting. Energies 2018, 11, 395.
Bandara, K.; Shi, P.; Bergmeir, C.; Hewamalage, H.; Tran, Q.; Seaman, B. Sales Demand Forecast in E-commerce Using a Long Short-Term Memory Neural Network Methodology. Neural Inf. Process. 2019, 462–474.
Shakhovska, N.; Mochurad, L.; Caro, R.; Argyroudis, S. Innovative machine learning approaches for indoor air temperature forecasting in smart infrastructure. Sci. Rep. 2025, 15, 47.
Shrestha, R.; Mohammadi, M.; Sinaei, S.; Salcines, A.; Pampliega, D.; Clemente, R.; Sanz, A.L.; Nowroozi, E.; Lindgren, A. Anomaly detection based on LSTM and autoencoders using federated learning in smart electric grid. J. Parallel Distrib. Comput. 2024, 193, 104951.
Noh, S.-H.; Moon, H.J. Anomaly detection based on LSTM learning in IoT-based dormitory for indoor environment control. Buildings 2023, 13, 2886.

Figure 1. Digital Twin Architecture.

Figure 2. Location of classrooms 128 and 201 on the right and left side of the first and second floors of the educational building, respectively.

Figure 3. The left image shows the temperature data containing missing values, while the right image presents the reconstructed data after applying the missing-value filling techniques.

Figure 4. The architecture of a Long Short-Term Memory (LSTM) unit.

Figure 5. Data preparation for short-term and long-term forecasting.

Figure 6. Comparison of RNN models for Classrooms 128 and 201.

Figure 7. The left figure illustrates the R² performance for prediction horizons up to 24 h ahead, whereas the right figure presents the corresponding prediction accuracy (percentage) across the same forecast horizons.

Figure 8. Anomaly Detection Results for Classroom 128.

Figure 9. Anomaly Detection Results for Classroom 201.

Table 1. Feature selection results.

Feature	Description
Temperature	Temperature, in °C.
Humidity	Relative humidity, as a percentage.
Temperature	This feature is likely the same as temperature, but to avoid repetition, the name “Temperature” is used.
Temperature 1 h ago	Temperature 1 h earlier, in °C.
Temperature 24 h ago	Temperature 24 h earlier, in °C.
Temperature 168 h ago	Temperature 168 h (7 days) earlier, in °C.
Temperature 336 h ago	Temperature 336 h (14 days) earlier, in °C.
Weekly Temperature Trend	Trend of temperature changes over a week, in °C.
Daily Temperature Trend	Trend of temperature changes over a day, in °C.
Seasonal Temperature	Seasonal temperature, in °C.
External Seasonal Temperature	The settings applied to this seasonal temperature feature are explained in the relevant documents; readers are referred to the data sources. Unit: °C.
Seasonal Humidity	Seasonal humidity, as a percentage.
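
The lagged temperature features in Table 1 (1, 24, 168, and 336 h) can be illustrated with a short sketch. It assumes an hourly series with no gaps, so that an offset of k positions equals k hours; the function name `build_lag_rows`, the column names, and the synthetic readings are hypothetical and are not taken from the study. The trend and seasonal features are not reproduced here because the table does not define how they are computed.

```python
from typing import Dict, List, Sequence

LAG_HOURS = (1, 24, 168, 336)  # lags listed in Table 1


def build_lag_rows(
    temps: Sequence[float], lags: Sequence[int] = LAG_HOURS
) -> List[Dict[str, float]]:
    """Build one feature row per hour that has a full lag history.

    Assumes `temps` is sampled hourly without gaps, so an offset of
    k positions corresponds to k hours.
    """
    longest = max(lags)
    rows = []
    # Hours earlier than the longest lag lack history and are skipped.
    for t in range(longest, len(temps)):
        row = {"temperature": temps[t]}
        for lag in lags:
            row[f"temperature_{lag}h_ago"] = temps[t - lag]
        rows.append(row)
    return rows


# Synthetic hourly readings for illustration only (three weeks of data).
readings = [20.0 + 0.01 * hour for hour in range(24 * 21)]
rows = build_lag_rows(readings)
```

With three weeks of hourly input, the first two weeks are consumed as history for the 336 h lag, so only the final week yields complete rows.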

Table 2. Tuned hyperparameters for models.

Model	Hyperparameter	Search Space	Selected Value
LSTM (short-term prediction)	Layers	[1, 2, 3]	2
	Units	[32, 64, 100, 128]	100
	Dropout rate	[0.1, 0.2, 0.3, 0.4, 0.5]	0.4
	Learning rate	1 × 10⁻⁶ to 1 × 10⁻¹ (logarithmic scale)	0.001
	Batch	[32, 64, 128]	128
	n-step-in	[24, 48, 72, 96, 120, 144, 168]	168
	Epoch	[20, 50, 100]	100
LSTM (long-term prediction)	Layers	[1, 2, 3]	2
	Units	[32, 64, 128]	128
	Dropout rate	[0.2, 0.3, 0.5]	0.2
	Learning rate	1 × 10⁻⁶ to 1 × 10⁻¹ (logarithmic scale)	0.0001
	Batch	[32, 64, 128]	128
	n-step-in	[24, 48, 72, 96, 120, 144, 168]	48
	n-step-out	[24, 48, 72, 96, 120, 144, 168]	24
	Epoch	[20, 50, 100]	100
XGBoost	colsample_bytree	[0.5, 1.0]	0.9301
	gamma	[0.0, 0.2]	0.0178
	learning_rate	[0.01, 0.3]	0.1673
	max_depth	[3, 10] (integer values only)	5
	n_estimators	[10, 200] (integer values only)	10
	subsample	[0.5, 1.0]	0.7871
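
To make the "logarithmic scale" entry for the learning rate concrete, the sketch below draws one candidate configuration from the short-term LSTM search space in Table 2. Several points are assumptions: the table does not state which tuning algorithm was used, so plain random sampling stands in purely for illustration; the Epoch row is read as belonging to the short-term group; and the names `SHORT_TERM_LSTM_SPACE`, `LEARNING_RATE_RANGE`, and `sample_config` are hypothetical.

```python
import math
import random
from typing import Any, Dict

# Discrete choices for the short-term LSTM, as listed in Table 2.
# Assigning the Epoch row to this model is an assumption about the table layout.
SHORT_TERM_LSTM_SPACE = {
    "layers": [1, 2, 3],
    "units": [32, 64, 100, 128],
    "dropout_rate": [0.1, 0.2, 0.3, 0.4, 0.5],
    "batch_size": [32, 64, 128],
    "n_step_in": [24, 48, 72, 96, 120, 144, 168],
    "epochs": [20, 50, 100],
}
LEARNING_RATE_RANGE = (1e-6, 1e-1)  # searched on a logarithmic scale


def sample_config(rng: random.Random) -> Dict[str, Any]:
    """Draw one candidate configuration (illustrative random search)."""
    config = {name: rng.choice(options)
              for name, options in SHORT_TERM_LSTM_SPACE.items()}
    low, high = LEARNING_RATE_RANGE
    # Sample the exponent uniformly so every decade is equally likely.
    exponent = rng.uniform(math.log10(low), math.log10(high))
    config["learning_rate"] = 10 ** exponent
    return config


config = sample_config(random.Random(0))
```

Sampling the exponent rather than the rate itself gives values near 1 × 10⁻⁵ the same chance as values near 1 × 10⁻², which a uniform draw over the raw range would not.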

Table 3. Short-term prediction results.

Model	Classroom	MAE (°C)	RMSE (°C)	R²	Predictions Within 1 °C (%)
LSTM	128	0.11	0.17	0.99	99.74
LSTM	201	0.24	0.42	0.986	97.98
XGBoost	128	0.2	0.57	0.96	98.98
XGBoost	201	0.42	0.82	0.96	93.6
RF	128	0.25	0.59	0.96	98.3
RF	201	0.41	0.83	0.96	93.36
ANN	128	0.19	0.3	0.96	98.85
ANN	201	0.53	0.6	0.95	90
Naive	128	0.88
Naive	201	1.65
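
The four metrics reported in Table 3 can be computed from paired actual and predicted values as sketched below. This is a minimal illustration using the standard definitions of MAE, RMSE, and R²; treating "under 1 degree" as a strict absolute error below 1 °C is an assumption, and the function name `forecast_metrics` and the sample values are hypothetical rather than taken from the study.

```python
import math
from typing import Dict, Sequence


def forecast_metrics(
    actual: Sequence[float], predicted: Sequence[float], tolerance: float = 1.0
) -> Dict[str, float]:
    """Return MAE, RMSE, R² and the share of predictions within `tolerance`."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_actual = sum(actual) / n
    # Total sum of squares; zero only if all actual values are identical.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - (mse * n) / ss_tot
    # Assumption: "under 1 degree" means |error| strictly below the tolerance.
    within = 100.0 * sum(abs(e) < tolerance for e in errors) / n
    return {"mae": mae, "rmse": math.sqrt(mse), "r2": r2, "pct_within": within}


# Made-up values for illustration only.
metrics = forecast_metrics([20.0, 21.0, 22.0, 23.0], [20.5, 21.0, 21.0, 25.0])
```

A negative R², as this toy input produces, indicates a forecast that fits worse than predicting the mean of the actual values.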

Table 4. The LSTM 24 h prediction result.

Classroom	MAE (°C)	MSE (°C²)	RMSE (°C)	R²
128	0.29	0.2	0.45	0.92
201	0.65	1	1	0.87

Table 5. Two-Month Anomaly Detection Results for Classrooms 128 and 201.

Classroom	Actual (°C)	Prediction (°C)	Anomaly	Time
128	23.59	24.38	TRUE	8/6/2023 3:00
128	23.24	24.82	TRUE	8/6/2023 4:00
128	28.96	25.7	TRUE	8/6/2023 5:00
128	22.73	24.52	TRUE	8/6/2023 6:00
128	23.04	23.97	TRUE	8/8/2023 5:00
128	22.82	21.63	TRUE	8/8/2023 23:00
128	21.98	22.56	TRUE	8/18/2023 16:00
128	21.4	20.48	TRUE	8/23/2023 13:00
128	21.68	22.61	TRUE	8/23/2023 14:00
128	22.22	23.03	TRUE	8/30/2023 13:00
128	24.53	23.37	TRUE	8/30/2023 14:00
128	22.16	23.08	TRUE	8/30/2023 15:00
128	19.15	19.78	TRUE	9/4/2023 11:00
128	20.81	20.34	TRUE	9/5/2023 10:00
201	24.56	27.11	TRUE	7/17/2023 3:00
201	25.32	28.61	TRUE	7/24/2023 11:00
201	23.62	26.15	TRUE	7/24/2023 12:00
201	28.59	31.74	TRUE	7/31/2023 2:00
201	24.54	27.09	TRUE	8/14/2023 11:00
201	23.71	26.17	TRUE	8/21/2023 11:00
201	26.75	24.31	TRUE	8/24/2023 0:00
201	23.43	26	TRUE	9/4/2023 11:00
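
The flags in Table 5 result from applying an IQR rule to one-step-ahead residuals. A minimal sketch of such a rule follows, assuming the residual is defined as actual minus predicted and the conventional multiplier k = 1.5; neither the sign convention nor the multiplier is stated in this table. The function names `iqr_bounds` and `flag_anomalies` and all sample values are hypothetical.

```python
from statistics import quantiles
from typing import List, Sequence, Tuple


def iqr_bounds(residuals: Sequence[float], k: float = 1.5) -> Tuple[float, float]:
    """Return (lower, upper) limits as Q1 - k*IQR and Q3 + k*IQR."""
    q1, _median, q3 = quantiles(residuals, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_anomalies(
    actual: Sequence[float],
    predicted: Sequence[float],
    bounds: Tuple[float, float],
) -> List[bool]:
    """Flag each time step whose residual falls outside the IQR limits."""
    low, high = bounds
    # Assumption: residual = actual - predicted.
    return [not (low <= a - p <= high) for a, p in zip(actual, predicted)]


# Reference residuals from a period of normal operation (made-up values).
reference_residuals = [-0.3, -0.2, -0.1, 0.0, 0.0, 0.1, 0.1, 0.2, 0.3]
bounds = iqr_bounds(reference_residuals)

# New observations paired with one-step-ahead forecasts (made-up values).
flags = flag_anomalies([22.0, 22.1, 25.0], [21.9, 22.3, 22.0], bounds)
```

Estimating the limits on a reference window and then applying them to new residuals keeps a single large deviation from widening its own threshold; whether the study fixes the limits this way or recomputes them over time is not stated here.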

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
