Review

Integrating Machine Learning and Digital Twins for Enhanced Smart Building Operation and Energy Management: A Systematic Review

1
INESC TEC—Institute for Systems and Computer Engineering, Technology and Science, Faculty of Engineering, University of Porto, 4200-465 Porto, Portugal
2
CONSTRUCT—Gequaltec, Faculty of Engineering, University of Porto, 4200-465 Porto, Portugal
3
LIACC—Artificial Intelligence and Computer Science Laboratory, Faculty of Engineering, University of Porto, 4200-465 Porto, Portugal
*
Author to whom correspondence should be addressed.
Urban Sci. 2025, 9(6), 202; https://doi.org/10.3390/urbansci9060202
Submission received: 14 March 2025 / Revised: 16 May 2025 / Accepted: 23 May 2025 / Published: 2 June 2025

Abstract

Artificial Intelligence has recently expanded across various applications. Machine Learning, a subset of Artificial Intelligence, is a powerful technique for identifying patterns in data to support decision making and managing the increasing volume of information. Simultaneously, Digital Twins have been applied in several fields. In this context, combining Digital Twins, Machine Learning, and Smart Buildings offers significant potential to improve energy efficiency and operational effectiveness in building management. This review aims to identify and analyze studies that explore the application of Machine Learning and Digital Twins for operation and energy management in Smart Buildings, providing an updated perspective on these rapidly evolving topics. The methodology follows the PRISMA guidelines for systematic reviews, using Scopus and Web of Science databases. This review identifies the main concepts, objectives, and trends emerging from the literature. Furthermore, the findings confirm the recent growth in research combining Machine Learning and Digital Twins for building management, revealing diverse approaches, tools, methods, and challenges. Finally, this paper highlights existing research gaps and outlines opportunities for future investigation.

1. Introduction

Smart Buildings are gaining prominence owing to the growing volume of accessible data and the development of enabling technologies such as Digital Twins, the Internet of Things (IoT), Cyber-physical Systems (CPS), Wireless Sensor Networks (WSN), Building Information Modeling (BIM), Artificial Intelligence (AI), Machine Learning (ML), and smart meters. Within this framework, these intelligent features, together with faster communication and information exchange, make improving energy efficiency management in buildings a relevant task for managers.
According to the United Nations’ World Population Prospects 2024, rural-to-urban migration continues to rise, with the global urban population projected to reach approximately 6.6 billion by 2050 and 8.2 billion by 2100, representing 68% and 80% of the total population, respectively [1]. Consequently, cities’ energy consumption is increasing, with a crucial impact on climate change and carbon emissions [2,3]. At the same time, the amount of available data is growing with the greater use of mobile devices and the expansion of the IoT, which raises expectations for implementing Smart Cities and makes their dynamics more practically applicable [4]. Furthermore, adaptation is necessary to manage resources better and improve sustainability, avoiding resource shortages or other bottlenecks caused by urban population growth [5,6].
In this sense, buildings, as a fundamental part of cities, are a significant driver of global energy consumption [7,8,9,10]. Moreover, buildings are a major contributor to carbon emissions [9,11,12]. This relevance makes buildings a prime target in the search for sustainable cities and in the mitigation of climate change.
The main objective of this systematic review is to identify studies that address the use of Machine Learning and Digital Twins for energy management in Smart Buildings. This work seeks to explore pathways for transforming existing buildings into smart, energy-efficient spaces. Specifically, this review analyzes the current approaches to integrating sensor networks, real-time data, Machine Learning models, and Digital Twins. Moreover, it aims to identify research gaps and suggest potential directions for future investigation. To achieve these goals, this systematic review is guided by the following primary research questions:
  • RQ1: How can Machine Learning improve energy management in Smart Buildings?
  • RQ2: How can Machine Learning help find the building patterns or load profiles?
  • RQ3: How can Machine Learning and Data Science be used in Digital Twins to improve energy management?
  • RQ4: How can Machine Learning and Data Science build a Digital Twin representative of the energy management of a building?
  • RQ5: How does monitoring with the sensors of the Smart Building help provide better building management?
  • RQ6: How could a Smart Building contribute to a Smart Urban District or a Smart City?
The remainder of this paper is organized as follows. Section 2 describes the primary research methodology and methods used in this study. Section 3 presents the main findings and key insights from the papers identified in the systematic review. Section 4 discusses the main findings, highlights the research gaps identified, and outlines potential opportunities for future work. Finally, Section 5 summarizes the main conclusions.

2. Methods

The initial step involved conducting exploratory searches to understand the available literature related to the overall objectives of this review. These preliminary searches provided a broader perspective and allowed the identification of trends, keywords, and areas requiring further research. The VOSviewer [13] software was used for this analysis. With this macro view established, a final search string was developed to locate studies closely related to the research focus. Additionally, to ensure methodological rigor, transparency, and reproducibility, the methodology follows the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [14] guidelines for conducting a systematic review.
The first search focused specifically on the perspective of patterns or load profiles in Smart Buildings, using the following search string in the Scopus database:
TITLE-ABS-KEY ((“smart building” OR “intelligent building”) AND “machine learning” AND (“load profile” OR “load profiling” OR “consumption pattern” OR “consumption profile” OR “consumption pattern” OR “consumption profiling”)).
The search returned only 31 results, indicating the need for new search strategies to find more articles. Several strings were then tested, and the respective keyword clusters were analyzed. All searches were initially conducted in the Scopus database, targeting the titles, abstracts, and keywords of the articles. The clusters built from the main keywords indicated opportunities to improve the search. This process also revealed possible research gaps, such as the need for more load disaggregation studies.
The final search aimed to find studies that combine all perspectives of this review. Accordingly, the final string combined the main keywords to find papers integrating Machine Learning and Digital Twins for energy management in Smart Buildings. In addition, synonyms and similar terms were included to broaden the results. Furthermore, the words “smart” and “management” were removed to avoid unduly restricting the set of retrieved papers. The final string was as follows:
TITLE-ABS-KEY ((“building”) AND (“machine learning” OR “data-driven” OR “deep learning” OR “neural network”) AND “energy” AND (“load profile” OR “load profiling” OR “consumption pattern” OR “consumption profile” OR “consumption pattern” OR “consumption profiling” OR “forecast” OR “prediction” OR “predictive” OR “Load Disaggregation” OR “non-intrusive load monitoring” OR “NILM”) AND (“digital twin” OR “digital counterpart” OR “virtual twin” OR “virtual building” OR “digital building”)).
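The structure of the final string (one OR-block per concept, joined with AND) can be illustrated with a minimal Python sketch. The helper names `or_block` and `build_query` are hypothetical and used only for illustration; the keywords are those quoted above, and the output is a plain string, not a call to any database API.

```python
# Keyword groups copied from the final search string quoted above
# (the repeated "consumption pattern" term is listed once).
KEYWORD_GROUPS = [
    ["building"],
    ["machine learning", "data-driven", "deep learning", "neural network"],
    ["energy"],
    ["load profile", "load profiling", "consumption pattern",
     "consumption profile", "consumption profiling", "forecast",
     "prediction", "predictive", "Load Disaggregation",
     "non-intrusive load monitoring", "NILM"],
    ["digital twin", "digital counterpart", "virtual twin",
     "virtual building", "digital building"],
]


def or_block(terms):
    """Quote each term and join the alternatives with OR (hypothetical helper)."""
    quoted = [f'"{term}"' for term in terms]
    if len(quoted) == 1:
        return quoted[0]
    return "(" + " OR ".join(quoted) + ")"


def build_query(groups, field="TITLE-ABS-KEY"):
    """AND together one OR-block per keyword group (hypothetical helper)."""
    return f"{field} (" + " AND ".join(or_block(g) for g in groups) + ")"


query = build_query(KEYWORD_GROUPS)
print(query)
```

Keeping each concept in its own group makes it easy to see which synonyms widen a single concept (OR) and which concepts must all be present (AND).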
The search string returned 79 results from Scopus, which were used to generate the clusters presented in Figure 1, highlighting the primary keywords and major application areas identified in the literature.
Figure 2 shows that related studies are relatively recent and have grown in number in recent years, highlighting new research trends and the demand for further studies.
The same string was then used in Web of Science to find more studies, resulting in 57 papers. The software EndNote [16] was used to organize the results, facilitating tasks such as removing duplicates or classifying article types. Combining the two sources resulted in 136 papers. Figure 3 presents a flowchart based on the PRISMA [14] guidelines that represents the entire systematic review process.
In the screening stage, 40 duplicate papers were removed, leaving 96. Afterward, eight records were found to be complete conference proceedings; these were removed because the aim was to find individual studies that combine the main keywords jointly, not just one or some of them, resulting in 88 papers.
In the eligibility stage, 34 papers considered outside the scope of this study were removed. The main reason these papers were retrieved in the first place was the use of the word “building” as a verb, which matched studies with similar approaches but in different application domains. Additionally, three book chapters were not openly available, so their full text could not be accessed. These were also removed, resulting in 51 records.
The final count of 51 records includes 26 journal articles, 12 conference articles, two book chapters, nine journal reviews, and three conference reviews.
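The record flow through the identification, screening, and eligibility stages described above can be retraced with a few lines of arithmetic. The sketch below only restates the counts reported in the text; the variable names are illustrative.

```python
# Identification: records returned by each database (from the text above).
scopus_records = 79
web_of_science_records = 57
identified = scopus_records + web_of_science_records

# Screening: duplicates and complete conference proceedings removed.
after_duplicates = identified - 40
after_proceedings = after_duplicates - 8

# Eligibility: off-topic papers and inaccessible book chapters removed.
after_off_topic = after_proceedings - 34
included = after_off_topic - 3

print(identified, after_duplicates, after_proceedings, included)
```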

3. Results

This section presents the main findings and key ideas from the reviewed papers. The main objective of this section is to elucidate the general concepts and aims of the studies. Furthermore, this section identifies the key trends, tools, techniques, and methods employed in the development of the studies.

3.1. White-Box, Black-Box, and Gray-Box

There are typically three approaches to modeling a building’s patterns in order to predict energy consumption: white-box, black-box, and gray-box models. White-box models rely on the detailed characteristics of the building; the idea is to create a representative model to simulate future scenarios. Black-box models, in contrast, do not require precise building characteristics; instead, the model is derived from the analysis of observed data on the building’s behavior. Finally, gray-box models, also known as hybrid models, blend the other two types: a model built from limited information about the building’s characteristics is combined with black-box techniques [7,8,9,17,18,19,20,21,22,23,24,25,26,27,28,29,30]. Figure 4 illustrates the relations between white-box, black-box, and gray-box models in the context of Digital Twins, Machine Learning, BIM, IoT, and an actual physical building.
The application of black-box models (also known as data-driven models) is an alternative to white-box models (the physical modeling approach), which typically need more detailed data to simulate the real characteristics of the building and often present poor results [8,18,19,20,21,22,24,25,26,31,32,33]. Furthermore, the physical modeling approach is usually time-consuming [9,17,18,19,20,22,24,25,26,30,32,34,35]. In that regard, the application of Machine Learning models to predict energy consumption in buildings has grown in recent years [3,10,21,29,32,34,36].
From the perspective of the white-box model, Zhou and Zheng [35] emphasize the relevance of the building envelope characteristics for the building’s energy performance [35]. Kalkhorani and Clark [32] built a white-box model with EnergyPlus and a black-box model for a real university campus, testing three Machine Learning algorithms: neural network, Gradient Boosting, and Support Vector Regressor. The main idea was to compare physics-based and data-driven models in terms of computational cost and predictive performance. The study confirms that white-box models require more information and effort to produce accurate results [32].
Another perspective is combined approaches, which can be used to build gray-box models or to compare black-box models with white-box models to validate their performance. Jain et al. [24] developed a Python package (compatible with Python 2.7) to integrate Machine Learning models implemented in Python with white-box models from EnergyPlus. A similar library for this task is MLE+ [37], which pursues the same goal but integrates EnergyPlus with MATLAB Simulink [24]. In a parallel perspective, Manfren et al. [27] emphasize the relevance of this kind of gray-box approach, highlighting it as a promising direction since it combines the main physical characteristics with advanced Machine Learning and Artificial Intelligence models [27]. Likewise, from the perspective of Smart Cities, the gray-box approach could be the bridge between white-box and black-box approaches. Furthermore, the gray-box method could address some issues that arise when several buildings are considered together in urban planning, such as the wide diversity of building operational details or the limited availability of large historical datasets [9,38].
Aruta et al. [17] use an Artificial Neural Network (ANN) to predict energy consumption in one house for two climates. The study uses DesignBuilder® to define the physical characteristics of the building, such as the envelope materials. The resulting data are then processed in EnergyPlus to simulate the building’s use and thermal energy demand. The study shows that ANNs can predict energy loads in buildings, especially for space heating. However, the methodology has limitations, such as a reduced set of inputs [17]. This points to an opportunity to include additional endogenous and exogenous features to build more accurate models. El-Gohary et al. [22] also use an ANN, in this case to design Digital Twins of Smart Residential Buildings in Lebanon. The ANN was developed in MATLAB Simulink. The study uses the Quick Energy Simulation Tool (eQuest) software to simulate 1540 results with variations in building envelope characteristics, such as insulation thickness, conductivity values, and window types. The main objective was to forecast energy consumption in buildings with different physical characteristics. In addition, the work aims to build a model capable of supporting the design phase of projects by helping to choose the best envelope characteristics in terms of energy efficiency [22].
Ni et al. [29] applied seven deep learning models to address multi-horizon building energy forecasting. The main idea of the study is to integrate future information about the real building, such as exogenous factors or operational data, into the model to increase its performance [29]. This type of approach could be valuable in the context of Digital Twins and Smart Buildings, where real-time improvements are expected throughout the day-to-day life cycle of the building.
Bjornskov and Jradi [20] developed an energy modeling framework with a data-driven approach to simulate components and systems of the building to represent a dynamic Digital Twin environment. The work was produced in Python and validated in a single zone of a building, as well as its respective components and systems [20]. The idea was mainly inspired by the Digital Twin concept of Boje et al. [39] and Grieves [40], who define a Digital Twin in terms of three main components: a physical system, a virtual system, and a flow of data connecting both [20,39,40]. The authors believe that this approach to constructing Digital Twins is crucial to increasing building efficiency and, beyond that, is one of the foundations of the Smart Building [20]. A different approach was taken by Wang et al. [41], focusing on occupant positioning behavior in historic traditional Chinese residences in Wufu, Fujian Province. The study uses Bluetooth beacons and users’ smartphones to extract data about the position of the users in the spaces. The Machine Learning model is then applied and integrated into a Digital Twin platform with a BIM model to make predictions related to energy management and to provide a better visualization of the occupants’ positions. The main idea is an occupant-centric solution for reducing emissions, analyzing the impact of energy management on the users’ behavior, specifically occupant movement. The work presents a roadmap for BLE-based indoor positioning in spaces [41].
Chamari et al. [42] highlight the interoperability challenge in Smart Building systems: adequate architectures are needed to manage the flow of information [42]. Indeed, extracting data from local sensors and returning answers to stakeholders can be done through several approaches. Addressing this challenge, Kannari et al. [43] developed a Digital Twin of an office building, integrating a BIM model, energy measurements, and neural networks to forecast energy consumption. In addition, an online platform shows the study’s results and promotes better visualization of data analytics and predictions [43]. Chamari et al. [42] seek to integrate tools to address the same interoperability challenge. The study presents an architecture for data-driven Smart Buildings and implements it in three case studies at the Eindhoven University of Technology campus. Like Kannari et al. [43], the work integrates Digital Twins, sensor data, a BIM model, and real-time dashboards, such as Grafana dashboards. The work does not include Model Predictive Control (MPC) [42].
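To make the gray-box idea concrete, the following minimal sketch assumes a first-order thermal model, T[k+1] = T[k] + a·(Tout[k] − T[k]) + b·Q[k], whose structure comes from physics while the coefficients a and b are estimated from data by least squares. The model form, the function names, and the synthetic values are assumptions chosen for illustration; they are not taken from any of the reviewed studies.

```python
import math


def simulate(t_start, t_out, q_heat, a, b):
    """Roll the model forward: T[k+1] = T[k] + a*(Tout[k] - T[k]) + b*Q[k]."""
    temps = [t_start]
    for t_o, q in zip(t_out, q_heat):
        t = temps[-1]
        temps.append(t + a * (t_o - t) + b * q)
    return temps


def fit_gray_box(temps, t_out, q_heat):
    """Estimate a and b by least squares (2x2 normal equations)."""
    s11 = s12 = s22 = r1 = r2 = 0.0
    for k in range(len(t_out)):
        x1 = t_out[k] - temps[k]      # driving temperature difference
        x2 = q_heat[k]                # heating input
        y = temps[k + 1] - temps[k]   # observed temperature change
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        r1 += x1 * y
        r2 += x2 * y
    det = s11 * s22 - s12 * s12
    a = (r1 * s22 - r2 * s12) / det
    b = (s11 * r2 - s12 * r1) / det
    return a, b


# Synthetic "measurements" (hypothetical values, noise-free for clarity).
t_out = [5.0 + 5.0 * math.sin(k / 4.0) for k in range(48)]
q_heat = [1.0 if k % 6 < 3 else 0.0 for k in range(48)]
measured = simulate(20.0, t_out, q_heat, a=0.1, b=0.5)

a_hat, b_hat = fit_gray_box(measured, t_out, q_heat)
print(a_hat, b_hat)
```

A pure white-box model would derive a and b from envelope properties, and a pure black-box model would drop the physical structure altogether; the sketch sits between the two.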
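One way to picture how future information enters a multi-horizon forecaster is in the construction of the training samples: each input combines past load values with exogenous values already known for the forecast window (for example, a weather forecast), and the target is the load over that window. The sketch below is an assumption-laden illustration with a hypothetical helper `make_samples`; it is not the pipeline of Ni et al. [29].

```python
def make_samples(load, future_exog, lags, horizon):
    """Build (features, target) pairs for multi-horizon forecasting.

    features = last `lags` load values + exogenous values for the next
               `horizon` steps (assumed known in advance)
    target   = load over the next `horizon` steps
    """
    samples = []
    for t in range(lags, len(load) - horizon + 1):
        past_load = load[t - lags:t]
        known_future = future_exog[t:t + horizon]
        target = load[t:t + horizon]
        samples.append((past_load + known_future, target))
    return samples


# Toy series (hypothetical values): hourly load and outdoor temperature forecast.
load = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
temp_forecast = [100, 101, 102, 103, 104, 105, 106, 107, 108, 109]

samples = make_samples(load, temp_forecast, lags=3, horizon=2)
print(len(samples), samples[0])
```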

3.2. Sensor-Based Monitoring in Buildings

Recently, there has been an increase in both the number of available sensors and the technologies developed for sensor integration in buildings, enabling real-time monitoring of energy consumption and other indoor environmental variables [44,45]. Therefore, it is possible to extract more building data from different aspects and systems. On the one hand, increased sensor deployment in buildings enhances the applicability of black-box modeling approaches. On the other hand, integrating sensors and digital technologies in buildings enables the use of gray-box models, which leverage both real-time data and the building’s intrinsic characteristics. This approach can lead to a better understanding of the spaces and visualization of the building conditions. In addition, with the help of a dashboard that integrates Machine Learning and Artificial Intelligence models, it is possible to achieve better decision making when planning the day-to-day tasks of the building, for example, the maintenance of the systems and equipment [44]. Furthermore, the connection of data and the response of the models in real time are closely related to the construction of the Digital Twin of the building energy management system.
Beyond the global analyses of the whole building, it is possible to analyze each system separately to achieve a better overall outcome for the building. Accordingly, Digital Twins and Machine Learning models can be developed for critical elements only. In this context, some studies focus on building Digital Twins and Machine Learning for specific parts of a building, whether a system or a particular aspect, as summarized in Table 1.

3.3. Real-World Applications

Some studies did not rely on real-world use cases or data from actual buildings, using laboratory prototypes or synthetic data instead [33]. In contrast, other studies have applied models in real buildings, and the same model has been applied in virtual buildings to compare results and performance [26]. Finally, regarding real-world applications, the types of buildings vary from study to study, as represented in Table 2.
Although real-world applications are varied, most studies concentrate on just three types: offices, residential buildings, and educational institutions. This concentration highlights the need for studies of other typologies. Indeed, each type of infrastructure and use of space behaves differently and, consequently, has specific demands and results.
For instance, commercial buildings are very complex in terms of understanding their dynamic behavior, especially in HVAC systems, since they have several distinct zones with different behaviors. Furthermore, commercial buildings typically have many layers of air distribution for HVAC systems, such as air handling units, fans, and ducts. In addition, commercial buildings usually have a wide range of water-based systems, such as cooling towers, chillers, and boilers [25]. This complexity in terms of zones and systems increases the challenge of developing models to improve energy management.

3.4. Model Predictive Control

MPC has received substantial attention for optimizing building operations, including predicting the behavior of HVAC systems [25]. Jain et al. [24] present an end-to-end architecture to link a data-driven MPC Digital Twin with the existing Supervisory Control and Data Acquisition (SCADA) systems or the Building Energy Management Systems (BEMS) [24]. Lee and Heo [25] developed an MPC framework based on a data-driven model with exogenous inputs for predicting indoor temperatures to reduce heating energy. The results show a reduction in energy consumption of around 12% with less than 0.5 °C impact on comfort [25]. Schmitt et al. [30] propose integrating the MPC approach with data-driven model error compensation to achieve more accurate modeling for energy management systems in complex buildings [30].
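The receding-horizon principle behind MPC can be sketched in a few lines: at each step, a model predicts the indoor temperature for every candidate heating plan over a short horizon, the cheapest plan is selected, and only its first action is applied. The first-order model, its coefficients, the cost weights, and the on/off heater are all assumptions made for this illustration; the sketch does not reproduce the controllers of the studies cited above.

```python
from itertools import product


def predict(temp, t_out, u, a=0.1, b=1.0):
    """One-step temperature prediction with a hypothetical first-order model."""
    return temp + a * (t_out - temp) + b * u


def mpc_step(temp, t_out_forecast, t_min=20.0,
             energy_weight=1.0, comfort_weight=10.0):
    """Return the first action (0 = off, 1 = on) of the cheapest plan."""
    best_cost, best_plan = float("inf"), None
    # Enumerate every on/off plan over the horizon (fine for short horizons).
    for plan in product((0, 1), repeat=len(t_out_forecast)):
        t, cost = temp, 0.0
        for u, t_out in zip(plan, t_out_forecast):
            t = predict(t, t_out, u)
            # Energy use plus a quadratic penalty for dropping below t_min.
            cost += energy_weight * u + comfort_weight * max(0.0, t_min - t) ** 2
        if cost < best_cost:
            best_cost, best_plan = cost, plan
    return best_plan[0]


# Closed loop: re-plan at every step and apply only the first action.
temp = 18.0
for step in range(3):
    action = mpc_step(temp, [5.0] * 4)
    temp = predict(temp, 5.0, action)
    print(step, action, round(temp, 2))
```

Exhaustive enumeration is used here only to keep the example self-contained; practical controllers would rely on a proper optimizer and a validated building model.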

3.5. Operation and Maintenance

Some studies address the operation and maintenance (O&M) of building systems and infrastructure [7,8,11,19,23,34,44,53,54,55]. For instance, Gao and Pishdad-Bozorgi [54] developed a step-by-step architecture to build Machine Learning models for facility life-cycle cost (LCC) analysis. The model takes the initial cost, the utility cost, and the O&M cost as inputs and was applied on a university campus [54]. Zhao et al. [58] proposed a framework for Digital Twins in the operation and maintenance of buildings that combines two main components: the Digital Twin and the Machine Learning algorithm (the predictions are based on an ANN). Finally, considering the unique characteristics of each building, the study defines three key components of the Digital Twin for O&M: structure, equipment, and energy consumption [58].
Jiao et al. [11] proposed a sustainable Digital Twin model for O&M of buildings. The study integrates Bayesian network (BN) and Random Forest (RF) models applied in a real gymnasium to forecast the energy consumption of the infrastructure. Finally, the model checks for anomalous consumption to verify possible opportunities to reduce waste and energy consumption [11].
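The idea of checking consumption against a forecast can be illustrated with a simple residual test: the difference between measured and predicted values is compared with a robust spread estimate, and unusually large deviations are flagged. The sketch below uses a median and median-absolute-deviation rule as a stand-in; it is not the BN and RF pipeline of Jiao et al. [11], and the readings are hypothetical.

```python
from statistics import median


def flag_anomalies(actual, forecast, threshold=3.0):
    """Return indices whose residual deviates strongly from the typical residual.

    Uses the median and the median absolute deviation (MAD), scaled by 1.4826
    so that it is comparable to a standard deviation for normal data.
    """
    residuals = [a - f for a, f in zip(actual, forecast)]
    center = median(residuals)
    mad = median(abs(r - center) for r in residuals)
    scale = 1.4826 * mad
    if scale == 0:
        return []  # no spread to compare against
    return [i for i, r in enumerate(residuals)
            if abs(r - center) > threshold * scale]


# Hypothetical hourly readings (kWh) against a flat forecast.
forecast = [10.0] * 8
actual = [10.5, 9.8, 10.2, 9.9, 10.1, 15.0, 10.0, 9.7]

print(flag_anomalies(actual, forecast))
```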
A relevant aspect addressed by some studies is automatic fault detection, with some papers focusing on a global perspective of the building and other studies focusing on a specific system, such as HVAC [23,52,53,57]. In this aspect, Hosamo et al. [57] proposed a framework for a Digital Twin integrating BIM and Machine Learning models for automatic fault detection and support of the predictive maintenance schedule [57]. In a similar approach, Hosamo et al. [53] developed a Digital Twin that integrates real-time sensors in two non-residential buildings in Norway with a BIM visualization platform to represent the output of nine Machine Learning models for predictive maintenance of HVAC systems. The main idea is to identify possible malfunctions with an automatic fault detection process to increase the maintenance plan’s efficiency. Furthermore, the study aims to predict when the equipment is most likely to fail and to use this prediction to update the maintenance plan schedule [53].

3.6. Heating, Ventilation, and Air Conditioning Systems

From the perspective of buildings, the most relevant driver of energy consumption is HVAC systems [8,18,19,28,35,48,64]. This underscores the importance of HVAC systems for achieving energy efficiency in buildings, given their impact on energy consumption and CO2 emissions [28,34,36,62]. Indeed, adopting Machine Learning models and developing Digital Twins for these systems could be essential for predicting energy consumption and detecting anomalies in HVAC systems [8,18,19,34,36,42].
Agouzoul et al. [18] extracted the characteristics of a BIM model of an actual building to build a Digital Twin. The simulation was carried out with EnergyPlus and generated the data needed to train a neural network. The work focuses on reducing HVAC consumption through a control strategy capable of identifying anomalies in the structure of the building and supporting more reliable planning of the predictive maintenance required for the building’s installations. The results show a reduction in energy consumption of 23.74% for cooling and 39.02% for heating. Likewise, thermal comfort was guaranteed since the temperature variation around the set point was reduced [19]. A similar approach by Agouzoul et al. [19] shows a reduction in energy consumption of 37.8% for cooling and 40.8% for heating in the simulation of the year 2006, and of 28.8% for cooling and 25.6% for heating in the simulation of the year 2017. Additionally, the study uses the Predicted Mean Vote (PMV) to confirm adequate thermal comfort for the users [19].
Chakrabarty et al. [31] developed a Gaussian process-based Bayesian optimization (GP-OP) method that uses two weeks of building data to calibrate and integrate the HVAC and building simulations. The model is suitable for white-box, black-box, or gray-box approaches, since it assumes that only limited information about building operations is available, as is often the case, which demonstrates its versatility across scenarios. The work uses Modelica [67] to simulate the building envelope, which represents the digital part of the model. Afterward, the resulting model is exported and imported into Python to take advantage of the Machine Learning tools available in that language [31]. Norouzi et al. [48] focus on deep-learning approaches for a digital HVAC system to predict indoor temperatures. However, the study did not test less complex Machine Learning models as a performance baseline [48].
Hosamo et al. [56] developed a Digital Twin of the HVAC system (HVACDT) based on the BIM model and then simulated it in MATLAB Simulink. Additionally, the model applies an ANN to validate the Digital Twin and a multiobjective genetic algorithm (MOGA) to find the best balance between energy consumption and thermal comfort. The study’s primary goal is therefore to build a model that reduces energy consumption and increases the users’ thermal comfort. The study uses about two years of data from a Norwegian office building [56]. In a similar approach, Hosamo et al. [53] focus on user feedback on comfort and seek to link HVAC faults to that feedback in order to find which issues may affect the comfort of the occupants [53]. Hu et al. [52] also focus on the relationship between air quality and fault detection. The work uses robotic scans of several buildings in Singapore to reconstruct BIM models and visualize air quality on an online platform through real-time sensors. The real-time sensors provide data for AI models, such as long short-term memory (LSTM) and Autoencoder, to automatically detect faults. The methodology includes failure prediction and the issuing of alerts together with suggested solutions [52].
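Balancing energy consumption against thermal comfort is a multiobjective problem: no single setting is best on both criteria, so the useful candidates are those not dominated by any other. The sketch below shows only this non-dominated (Pareto) filtering step on hypothetical candidate settings; it is not a genetic algorithm and does not reproduce the method of Hosamo et al. [56].

```python
def pareto_front(candidates):
    """Keep candidates not dominated by any other.

    Each candidate is (energy_kwh, discomfort); lower is better for both.
    """
    front = []
    for i, (e_i, d_i) in enumerate(candidates):
        dominated = any(
            e_j <= e_i and d_j <= d_i and (e_j < e_i or d_j < d_i)
            for j, (e_j, d_j) in enumerate(candidates)
            if j != i
        )
        if not dominated:
            front.append((e_i, d_i))
    return front


# Hypothetical HVAC settings evaluated on energy use and a discomfort index.
candidates = [(100, 0.9), (120, 0.5), (150, 0.2), (130, 0.6), (160, 0.2)]
print(pareto_front(candidates))
```

A multiobjective optimizer searches for candidates that push this front outward; the final choice along the front remains a decision for the building operator.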

3.7. Solar Photovoltaic (PV)

Some studies focus more specifically on solar power plants in the context of buildings, with the aim of reaching Net Zero Energy Building goals [49,62,68]. Al-Isawi et al. [68] developed a Digital Twin of solar PV in a building, using sensors to collect data in real time and validating the system with MATLAB Simulink. In addition, the study uses an LSTM model to forecast solar PV production 15 min ahead from historical data. The accuracy results show that the Digital Twin adequately represents the real asset: in terms of the correlation coefficient (R), it achieves R-values of 0.99893 for simulation and 0.99427 for forecast. Finally, the work presents a dashboard built in MATLAB App Designer to provide better visualization for the building operator. The dashboard is intended to make it easier for the building manager to plan the balance between the building’s consumption and production [68]. Castilla et al. [62] also explore solar energy potential with a different target, focusing on a Digital Twin for a flat plate solar collector field. The model uses MATLAB to develop an ANN prediction model, which is trained, tested, and validated on a real building dataset covering about one year. The Digital Twin is based on a real-time connection to the building, extracting more data to feed the ANN model and enabling a continuous feedback loop of data extraction and decision support. Finally, the study presents a web page to better visualize the Digital Twin and the model results and predictions [62].
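For reference, the correlation coefficient R used to compare a Digital Twin’s output with the real asset is the Pearson coefficient between the two series. The sketch below computes it from first principles on hypothetical PV readings; the values are illustrative and unrelated to the results reported in [68].

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical PV power (kW): measured values against Digital Twin output.
measured = [0.0, 1.2, 2.9, 4.1, 3.8, 2.0, 0.4]
simulated = [0.1, 1.0, 3.0, 4.0, 3.9, 2.2, 0.3]
print(round(pearson_r(measured, simulated), 4))
```

A value close to 1 indicates that the two series move together, although R alone does not reveal a constant offset or scaling error, so it is usually read alongside an error metric.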
Surplus electricity generated by individual building PV systems can contribute significantly to the resilience and energy sharing within the district community. The development of energy forecasting frameworks based on Machine Learning techniques is critical for optimizing energy management, resource allocation, and grid stability at the city scale [69,70]. In parallel, the development of optimization strategies that allocate on-site PV generation according to building demand flexibility is crucial for minimizing surplus electricity sent to the grid [71,72].

3.8. Smart City and Urban District

Looking beyond a single building, the energy supplier can also rely on its own Digital Twin to optimize production and supply processes. When each building and the energy supplier have corresponding Digital Twins, drawing on all available data and optimizing processes in a more synchronized way, better energy performance can be achieved for the Smart City as a whole.
It is worth noting that there is no universally accepted definition of a Smart City. The development of a unified conceptual framework remains ongoing and requires further studies and research to establish a robust formulation. Nevertheless, after analyzing several definitions [5,73,74,75,76], common concepts and core ideas can be identified. Similarly, key terms frequently found across Smart City definitions are presented in Figure 5.
Smart City initiatives typically emerge from the need for strategic urban development policies by local governments, combined with technological innovations aimed at improving local services. These projects often involve collaboration between governments and companies specializing in smart technologies. According to Caragliu et al. [5], local conditions must be considered to maximize Smart City benefits and justify investments in high-tech solutions [76]. Moreover, Smart Cities must prioritize the needs of people and communities [73].
The accumulation and processing of large datasets in Smart Cities drive improved decision making, community benefits, and innovative services. Reusing data adds value to public services and enhances citizens’ quality of life [4].
Thus, Smart Cities integrate the physical and digital worlds, enabling real-time information management: first by efficiently archiving data, and second by dynamically providing responses to stakeholders and citizens. Advancing energy management in buildings is key to improving not only individual energy performance but also community-wide energy development within Smart Cities and urban energy districts.
Fathy et al. [77] analyzed a two-year IoT dataset on household energy use, demonstrating energy demand reductions of over 20% and significant savings on residential bills. Zhou et al. [78] proposed a Digital Twin (DT) framework tested on a large-scale Chinese grid model, enabling real-time monitoring and highlighting the benefits of multiple digital twins. Ruohomäki et al. [79] developed a Smart City Digital Twin, showcasing how 3D city models and open-access datasets, such as Helsinki’s, can optimize energy use and promote citizen participation.
O’Dwyer et al. [3] presented a Digital Twin with smart energy management supported by ML tools, such as K-means, Gradient Boosting, and ANN [80]. The study applied its proposal to a Smart Cities project in the Greenwich neighborhood of London. K-means is used to examine daily energy use patterns, identify habits and trends, and group days with similar consumption into clusters, for example separating weekdays from holidays. This approach can also help eliminate outliers or mitigate issues in low-quality datasets. The paper also compares different Machine Learning models applied to the data available through the Digital Twin to forecast important outputs, such as heat drawn from the heat network [3].
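The clustering step can be sketched as follows. This is a minimal, self-contained illustration of K-means on daily load profiles, not the implementation used in [3]: the profile values, the four-period daily resolution, and the deterministic farthest-point initialisation are all assumptions made here so that the example is reproducible.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(profiles, k):
    """Deterministic farthest-point initialisation (an assumption of this sketch)."""
    centroids = [list(profiles[0])]
    while len(centroids) < k:
        farthest = max(profiles, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    return centroids


def kmeans(profiles, k, iterations=20):
    """Plain Lloyd's algorithm on equal-length daily load profiles."""
    centroids = init_centroids(profiles, k)
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: each day joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in profiles
        ]
        # Update step: each centroid becomes the mean profile of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical daily profiles in kWh: night, morning, afternoon, evening.
days = {
    "Mon": [2.0, 9.0, 10.0, 4.0],
    "Tue": [2.1, 9.4, 10.2, 4.1],
    "Wed": [1.9, 8.8, 9.7, 3.9],
    "Thu": [2.0, 9.1, 10.1, 4.2],
    "Fri": [2.2, 8.9, 9.5, 3.8],
    "Sat": [1.8, 3.0, 3.2, 2.1],
    "Sun": [1.9, 2.8, 3.0, 2.0],
}
labels, centroids = kmeans(list(days.values()), k=2)
```

With these illustrative values, the five working days fall into one cluster and the two weekend days into the other. In practice, a library implementation such as scikit-learn's `KMeans` would normally be preferred over a hand-written loop.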
Odeh and de Wilde [2] emphasize the potential of Digital Twins on a district scale. The study analyzes several urban building energy methods (UBEM) to mitigate the impacts of uncertainties related to forecasting the energy consumption of a set of buildings [2]. Regarding a set of buildings, Roda-Sanchez et al. [45] developed a Digital Twin of 23 buildings on the University of Murcia campus, Spain. The work utilized the Scorpio broker by FIWARE and other tools that support the NGSI-LD open standard data format. The Digital Twin of the Smart Campus enhances analysis and forecasting of energy consumption, energy generation (solar PV), and occupancy predictions [45]. Finally, this type of work could be a prototype of a Smart District, representing the potential of the Smart City application.

3.9. Privacy Issues

In the context of Smart Buildings in Smart Cities, concerns about confidential data and privacy become more evident, as building owners want to avoid exposing internal management to outsiders. Accordingly, Niavis et al. [81] emphasize that critical privacy and trust issues can arise when information about buildings is shared. The study proposed a Trusted Digital Building Logbook (DBL) to secure stakeholders’ data privacy. At the same time, the work promotes transparency so that local government can analyze a macro view of the city and the performance of several buildings, supporting the idea of a Smart City in terms of energy efficiency [81].

3.10. Trends, Techniques and Tools

As previously discussed in this paper, a variety of tools and techniques have been employed across the studies reviewed. This subsection summarizes the specific trends in Machine Learning applications and Digital Twin technologies identified from the literature.
Table 3 presents examples of the main Machine Learning and Deep Learning algorithms used in the studies.
The algorithms are primarily developed using the Python programming language [84], which offers a wide range of libraries and packages to support the deployment of Machine Learning and Deep Learning models, such as Scikit-learn [85], PyTorch [86], and TensorFlow [87,88]. Nevertheless, some studies have employed other programming languages, such as R [89], with packages such as Keras [90].
Regarding Digital Twins, the number of tools available is vast and continually expanding. The approach and objective of each study often determine the selection of specific methods. Table 4 presents some of the main tools identified for the development of Digital Twins.

4. Discussion, Research Gaps, and Opportunities

This section discusses the main findings of this review, particularly in relation to the research questions outlined in the Introduction. It also summarizes the principal research gaps identified in the literature. Finally, it suggests opportunities for future research based on the findings of this study.

4.1. Discussion

The answer to RQ1 is addressed first, based on the wide range of critical building elements elucidated in Section 3.2. It was shown how Machine Learning models have been applied to specific parts or infrastructures within buildings, offering several pathways to reduce energy consumption. These models, combined with the growing deployment of sensors in Smart Buildings, enhance energy management by leveraging real-time data to train algorithms and generate actionable insights for stakeholders.
The analysis of building elements revealed a higher concentration of studies focused on HVAC systems, underscoring their dominant role in building energy consumption across various building types and geographic locations. In the case of HVAC, ML has been applied in different ways, including the following:
  • Predictive models for power demand, temperature, on/off control, setpoints, inverter frequencies, and other HVAC commands [12,28,48,49,63,64];
  • Heating and cooling control strategies [19,64];
  • Calibration of multiple parameters in coupled building/HVAC models [31];
  • Digital Twin validation comparing energy consumption and thermal comfort [56];
  • Predictive maintenance models to detect system failures and recommend servicing [53,57].
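As an illustration of how predictive outputs can support fault detection, one generic approach is to compare a twin's prediction with the measured value and flag residuals that stay large for several consecutive samples. The sketch below shows that idea only; it is not the method of [53,57], and the function name, threshold, persistence window, and temperature values are hypothetical.

```python
def flag_faults(measured, predicted, threshold, min_consecutive=3):
    """Return sample indices at which |measured - predicted| has exceeded
    the threshold for at least min_consecutive samples in a row."""
    flagged = []
    run = 0  # length of the current run of large residuals
    for i, (m, p) in enumerate(zip(measured, predicted)):
        if abs(m - p) > threshold:
            run += 1
        else:
            run = 0
        if run >= min_consecutive:
            flagged.append(i)
    return flagged


# Hypothetical supply-air temperatures in degrees Celsius.
predicted_c = [18.0, 18.1, 18.0, 17.9, 18.0, 18.1, 18.0, 18.0]
measured_c = [18.1, 18.0, 19.6, 18.0, 19.8, 20.1, 20.4, 20.6]
alarms = flag_faults(measured_c, predicted_c, threshold=1.0)
```

Requiring persistence keeps a single spike (index 2 in the sample data) from raising an alarm, while the sustained drift at the end of the series is flagged.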
Beyond HVAC, ML techniques have been applied to other smart building components, such as data centers, solar chimneys, solar PV systems, and plate solar collector fields. These studies targeted distinct objectives, including energy consumption forecasting, fault detection, and proactive maintenance to mitigate system failure risks.
In summary, ML has proven to be a powerful tool for improving building energy management. Furthermore, ML can be integrated at various stages within model architectures—at the beginning, middle, or end. Supervised models can be combined for different tasks within the same framework (e.g., using regression models to forecast building parameters alongside classification models to detect operational patterns). This observation also contributes to answering RQ2. In addition, unsupervised ML models can be employed to identify patterns and load profiles by clustering historical data, allowing for the analysis of seasonality and the identification of critical energy-consuming systems.
Regarding RQ3, the review demonstrated how Data Science and ML techniques can utilize Digital Twins as an additional data source. Digital Twins offer simulation capabilities that can generate new datasets for model training and scenario analysis. For instance, the impact of installing photovoltaic systems on building performance can be simulated and assessed.
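A scenario of this kind can be sketched with a simple energy balance. The example below compares grid import with and without a PV system for a building without storage; the function name `pv_scenario` and the hourly demand and generation values are assumptions chosen for illustration, and a real Digital Twin would supply simulated or measured series instead.

```python
def pv_scenario(demand_kwh, pv_kwh):
    """Per-interval energy balance for a building with on-site PV and no storage."""
    grid_import = 0.0
    surplus_export = 0.0
    self_consumed = 0.0
    for demand, pv in zip(demand_kwh, pv_kwh):
        used_on_site = min(demand, pv)  # PV used directly by the building
        self_consumed += used_on_site
        grid_import += demand - used_on_site
        surplus_export += pv - used_on_site
    return {
        "grid_import": grid_import,
        "surplus_export": surplus_export,
        "self_consumed": self_consumed,
    }


# Hypothetical hourly values in kWh for part of one day.
demand = [3.0, 3.0, 4.0, 6.0, 8.0, 8.0, 6.0, 4.0]
pv = [0.0, 0.0, 2.0, 5.0, 9.0, 10.0, 4.0, 0.0]

baseline = pv_scenario(demand, [0.0] * len(demand))  # no PV installed
with_pv = pv_scenario(demand, pv)
self_consumption_ratio = with_pv["self_consumed"] / sum(pv)
```

With these values, grid import falls from 42 kWh to 15 kWh, and 90% of the PV generation is consumed on site, with the remaining 3 kWh exported as surplus.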
The answer to RQ4 emphasizes the black-box approach. This method reduces the need for costly BIM model development by relying on Data Science and ML techniques to create Digital Twins directly from sensor data. This approach aligns with Directive (EU) 2024/1275, which defines a digital building twin as a simulation of real-time building behavior, supported by smart meters and other sensors, to enhance building management [91]. Similarly, gray-box models integrate Data Science and ML techniques to develop Digital Twins based on a mix of empirical data and simplified physical modeling.
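The gray-box idea can be illustrated with a first-order thermal model whose structure comes from physics and whose parameters are estimated from data. The sketch below assumes the discrete form T[k+1] = T[k] + a (T_out[k] - T[k]) + b Q[k], where `a` and `b` are lumped parameters fitted by least squares. The model form, names, and synthetic data are assumptions of this example and are not taken from the reviewed studies.

```python
def simulate(t0, t_out, heat_kw, a, b):
    """Simulate indoor temperature with a first-order (1R1C-style) model."""
    temps = [t0]
    for outdoor, q in zip(t_out, heat_kw):
        t = temps[-1]
        temps.append(t + a * (outdoor - t) + b * q)
    return temps


def fit_gray_box(temps, t_out, heat_kw):
    """Estimate a and b by ordinary least squares (2x2 normal equations)."""
    s11 = s12 = s22 = s1y = s2y = 0.0
    for k in range(len(t_out)):
        x1 = t_out[k] - temps[k]      # driving temperature difference
        x2 = heat_kw[k]               # heating input
        y = temps[k + 1] - temps[k]   # observed temperature change
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        s1y += x1 * y
        s2y += x2 * y
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("inputs are collinear; parameters are not identifiable")
    a = (s1y * s22 - s12 * s2y) / det
    b = (s11 * s2y - s12 * s1y) / det
    return a, b


# Synthetic "sensor" data generated from known parameters (illustrative only).
t_out = [5.0, 4.0, 4.0, 6.0, 8.0, 10.0, 9.0, 7.0]
heat_kw = [2.0, 2.0, 0.0, 0.0, 1.0, 0.0, 2.0, 1.0]
temps = simulate(20.0, t_out, heat_kw, a=0.1, b=0.5)
a_hat, b_hat = fit_gray_box(temps, t_out, heat_kw)
```

Because the synthetic data are noise-free, the fit recovers the generating parameters. With real sensor data, the estimates would carry uncertainty, and a purely black-box alternative would replace the physical structure with a learned model.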
RQ5 integrates insights from the previous questions. The growing presence of sensors in buildings enables the collection of detailed consumption patterns, which are fundamental for deploying ML and Digital Twin models to enhance building energy management.
The answer to RQ6 highlights the role of Smart Buildings as key drivers of energy efficiency at the district level. Integrating the models and insights developed at the building scale can significantly enhance the resilience of urban districts. However, privacy concerns emerge when aggregating building-level data, emphasizing the need for methodologies that protect privacy while enabling data-driven decision making for district and city management.
Overall, the answers to all research questions demonstrate strong interconnections. They show that ML and Digital Twin technologies can be integrated in multiple ways—for example, using ML to validate a Digital Twin or developing Digital Twins via black-box or gray-box approaches. At the urban scale, while the complexity of integrating multiple models increases, so does the potential impact on district resilience. Nonetheless, this review identifies several research gaps and opportunities for future investigation in the context of Smart Buildings and urban energy systems.

4.2. Research Gaps and Opportunities

From the perspective of buildings, especially in terms of energy management, further efforts are still required to develop Machine Learning models that improve the performance of Digital Twins [7]. Additionally, in the context of energy management in buildings, few studies apply deep learning approaches and compare their performance with that of traditional Machine Learning models [8,17,23,29,82,83,92,93]. Therefore, more studies comparing algorithms for energy consumption forecasting in buildings are needed [36].
Regarding energy disaggregation, most studies of non-intrusive load monitoring (NILM) are developed in the laboratory, indicating the necessity of studying real building data to build models with real-world applications [94].
Concerning building typology, studies covering a wider variety of building types are needed, since most studies focus only on residential, office, and educational buildings. Each building typology has distinct characteristics, and the complexity of each building’s dynamics reinforces the challenge of applying models to real data to improve energy management in real-world buildings.
From a macro perspective, there is a need for further studies on building energy management that consider the broader impact at the district or the Smart City levels.
Finally, despite the overall growth of studies on Machine Learning, Digital Twins, Smart Buildings, and energy efficiency, the literature still lacks research that addresses these topics in an integrated way [10,23,33,34,36,43].
In summary, opportunities related to the gaps identified in the literature are illustrated in Figure 6.

5. Conclusions

The increasing availability of data can transform decision-making processes, and it is essential to study how these data can be used to achieve real improvements in building management. At the same time, with the growth of IoT devices, data from Smart Buildings and smart meters are increasingly available in real time. In this sense, building energy management can evolve to reduce costs and the overload of energy networks. Furthermore, the application of Machine Learning and Digital Twin technologies can directly and indirectly enhance the urban energy district. This impact can occur at the individual building level, through the development of Machine Learning and Digital Twins, or at a broader scale, by aggregating outputs from multiple buildings to optimize energy demand and/or power grid stability at the city level.
The challenge of improving energy efficiency in buildings is fundamental for a more sustainable future. More than that, the evolution of end-user behavior in energy consumption can be more robust with the support of fast and real-time feedback from techniques and technologies, such as Machine Learning, Artificial Intelligence, and Digital Twins.
Digital Twins in Industry 4.0 and manufacturing applications suggest that including a 3D model of the physical object is vital or even mandatory. Nonetheless, this type of visualization is not always needed [7,22]. Despite the growing relevance of BIM in the construction industry, BIM data are not a mandatory requirement for developing Digital Twins for building energy management. For instance, BIM models could be used only to extract specific data to enrich the results [7]. In this view, existing studies compare the approaches and evaluate the advantages of black-box or gray-box approaches over white-box ones, since more simulations can be performed in much less time [8,22].
Building an adequate Digital Twin of a building or system requires analyzing the tasks involved in each situation. When the aim is to integrate Machine Learning models and the task mainly concerns extracting and analyzing system data and building behavior, a 3D representation may be merely a design addition that does not contribute to the results or performance of the models. Nevertheless, in some cases it can improve data visualization, for example through heat maps.
This paper provided a review of Machine Learning and Digital Twins in the context of building energy management. It contributes to knowledge of the current state of the art in applying Digital Twins and Machine Learning to Smart Building operation and energy management, presenting the main ideas, concepts, methods, and challenges. Furthermore, research gaps were identified and presented, indicating opportunities for future studies.
Nonetheless, a limitation of this review is that it relied on only two databases (Scopus and Web of Science). In addition, some book chapters that were not open-access were excluded from the analyses. Future work could explore additional databases and expand the statistical analysis of the selected articles. Furthermore, updating the review after one year could be valuable to assess whether the trends identified in this study persist.

Author Contributions

Conceptualization, B.P., J.P.M., H.B. and R.R.; methodology, B.P., J.P.M., H.B. and R.R.; software, B.P.; validation, B.P., J.P.M., H.B. and R.R.; investigation, B.P.; data curation, B.P.; writing—original draft preparation, B.P.; writing—review and editing, B.P., J.P.M., H.B. and R.R.; visualization, B.P. All authors have read and agreed to the published version of the manuscript.

Funding

This work was partially funded by Component 5—Capitalization and Business Innovation, integrated into the Resilience Dimension of the Recovery and Resilience Plan under the Recovery and Resilience Mechanism (RRM) of the European Union (EU), framed in Next Generation EU, for the period 2021–2026, through the ATE project, with reference 56. It was also supported by Base Funding—UIDB/04708/2020 with DOI 10.54499/UIDB/04708/2020 of the CONSTRUCT—Instituto de I&D em Estruturas e Construções—funded by national funds through the FCT/MCTES (PIDDAC).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AI          Artificial Intelligence
ANN         Artificial Neural Network
BEMS        Building Energy Management Systems
BIM         Building Information Modeling
BN          Bayesian network
CO2         Carbon dioxide
CONSTRUCT   Institute of R&D in Structures and Construction
CPS         Cyber-physical Systems
DBL         Digital Building Logbook
DNN         Deep Neural Network
SVR         Support Vector Regressor
CNN         Convolutional neural network
eQuest      Quick Energy Simulation Tool
FEUP        Faculty of Engineering—University of Porto
GP-OP       Gaussian process-based Bayesian optimization
HVAC        Heating, Ventilation, and Air Conditioning
HVACDT      Digital Twin of the HVAC system
INESC TEC   Institute for Systems and Computer Engineering, Technology and Science
IoT         Internet of Things
LCC         Life-cycle cost
LIACC       Artificial Intelligence and Computer Science Laboratory
LSTM        Long short-term memory
ML          Machine Learning
MOGA        Multiobjective optimization algorithm
MPC         Model Predictive Control
NILM        Non-intrusive load monitoring
O&M         Operations and Maintenance
PMV         Predicted Mean Vote
PRISMA      Preferred Reporting Items for Systematic Reviews and Meta-Analyses
PV          Photovoltaic
R           Correlation coefficient
RF          Random Forest
SCADA       Supervisory Control and Data Acquisition
UBEM        Urban building energy methods
UN          United Nations
WSN         Wireless Sensor Networks

References

  1. United Nations, Department of Economic and Social Affairs, Population Division. World Population Prospects 2024. Available online: https://population.un.org/wpp/ (accessed on 15 May 2025).
  2. Odeh, K.; de Wilde, P. Exploring the Potential of Digital Twins at the District Scale: A Framework for Investigation. In Proceedings of the Building Simulation Conference Proceedings, Shanghai, China, 4–6 September 2023; pp. 2445–2452.
  3. O’Dwyer, E.; Pan, I.; Charlesworth, R.; Butler, S.; Shah, N. Integration of an energy management tool and digital twin for coordination and control of multi-vector smart energy systems. Sustain. Cities Soc. 2020, 62, 102412.
  4. Abella, A.; Ortiz-de-Urbina-Criado, M.; De-Pablos-Heredero, C. A model for the analysis of data-driven innovation and value generation in smart cities’ ecosystems. Cities 2017, 64, 47–53.
  5. Caragliu, A.; Del Bo, C.; Nijkamp, P. Smart Cities in Europe. J. Urban Technol. 2011, 18, 65–82.
  6. Pärn, E.A.; de Soto, B.G. Cyber threats and actors confronting the Construction 4.0; Sawhney, A., Riley, M., Irizarry, J., Eds.; Routledge: London, UK, 2020; pp. 441–459.
  7. Cespedes-Cubides, A.S.; Jradi, M. A review of building digital twins to improve energy efficiency in the building operational stage. Energy Inform. 2024, 7, 11.
  8. Hodavand, F.; Ramaji, I.J.; Sadeghi, N. Digital Twin for Fault Detection and Diagnosis of Building Operations: A Systematic Review. Buildings 2023, 13, 1426.
  9. Pan, Y.; Zhu, M.; Lv, Y.; Yang, Y.; Liang, Y.; Yin, R.; Yang, Y.; Jia, X.; Wang, X.; Zeng, F.; et al. Building energy simulation and its application for building performance optimization: A review of methods, tools, and case studies. Adv. Appl. Energy 2023, 10, 100135.
  10. Tahmasebinia, F.; Lin, L.; Wu, S.; Kang, Y.; Sepasgozar, S. Exploring the Benefits and Limitations of Digital Twin Technology in Building Energy. Appl. Sci. 2023, 13, 8814.
  11. Jiao, Z.; Du, X.; Liu, Z.; Liu, L.; Sun, Z.; Shi, G. Sustainable Operation and Maintenance Modeling and Application of Building Infrastructures Combined with Digital Twin Framework. Sensors 2023, 23, 4182.
  12. Zhang, J.; Ma, T.; Xu, K.; Chen, Z.; Xiao, F.; Ho, J.; Leung, C.; Yeung, S. Smart Data-Driven Building Management Framework and Demonstration. In Proceedings of the 3rd Energy-Informatics-Academy Conference (EI.A), Campinas, Brazil, 6–8 December 2023; pp. 168–178.
  13. Van Eck, N.; Waltman, L. Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics 2010, 84, 523–538.
  14. Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 2021, 372, n71.
  15. Scopus. Analyze Search Results. Available online: www.scopus.com (accessed on 9 July 2024).
  16. The EndNote Team. EndNote 21; Clarivate: Philadelphia, PA, USA, 2013.
  17. Aruta, G.; Ascione, F.; Boettcher, O.; De Masi, R.F.; Mauro, G.M.; Vanoli, G.P. Machine learning to predict building energy performance in different climates. In Proceedings of the IOP Conference Series: Earth and Environmental Science, Berlin, Germany, 20–23 September 2022.
  18. Agouzoul, A.; Simeu, E.; Tabaa, M. Enhancement of Building Energy Consumption Using a Digital Twin based Neural Network Model Predictive Control. In Proceedings of the 2023 International Conference on Control, Automation and Diagnosis, ICCAD 2023, Rome, Italy, 10–12 May 2023.
  19. Agouzoul, A.; Simeu, E.; Tabaa, M. Advancing Sustainable Building Practices: Intelligent Methods for Enhancing Heating and Cooling Energy Efficiency. Sustainability 2024, 16, 2879.
  20. Bjornskov, J.; Jradi, M. An ontology-based innovative energy modeling framework for scalable and adaptable building digital twins. Energy Build. 2023, 292, 113146.
  21. de Wilde, P. Building performance simulation in the brave new world of artificial intelligence and digital twins: A systematic review. Energy Build. 2023, 292, 113171.
  22. El-Gohary, M.; El-Abed, R.; Omar, O. Prediction of an Efficient Energy-Consumption Model for Existing Residential Buildings in Lebanon Using an Artificial Neural Network as a Digital Twin in the Era of Climate Change. Buildings 2023, 13, 3074.
  23. Hosamo, H.H.; Nielsen, H.K.; Alnmr, A.N.; Svennevig, P.R.; Svidt, K. A review of the Digital Twin technology for fault detection in buildings. Front. Built Environ. 2022, 8, 1013196.
  24. Jain, A.; Nong, D.; Nghiem, T.X.; Mangharam, R. Digital twins for efficient modeling and control of buildings an integrated solution with scada systems. In Proceedings of the ASHRAE and IBPSA-USA Building Simulation Conference, Chicago, IL, USA, 26–28 September 2018.
  25. Lee, H.; Heo, Y. Simplified data-driven models for model predictive control of residential buildings. Energy Build. 2022, 265, 112067.
  26. Li, X.; Wen, J. System identification and data fusion for on-line adaptive energy forecasting in virtual and real commercial buildings. Energy Build. 2016, 129, 227–237.
  27. Manfren, M.; Gonzalez-Carreon, K.M.; James, P.A.B. Interpretable Data-Driven Methods for Building Energy Modelling—A Review of Critical Connections and Gaps. Energies 2024, 17, 881.
  28. Matsuda, Y.; Ooka, R. Development of a prediction model tuning method with a dual-structured optimization framework for an entire heating, ventilation and air-conditioning system. Sustain. Cities Soc. 2022, 79, 103667.
  29. Ni, Z.; Zhang, C.; Karlsson, M.; Gong, S. A study of deep learning-based multi-horizon building energy forecasting. Energy Build. 2024, 303, 113810.
  30. Schmitt, T.; Engel, J.; Rodemann, T. Regression-Based Model Error Compensation for a Hierarchical MPC Building Energy Management System. In Proceedings of the 2023 IEEE Conference on Control Technology and Applications, CCTA 2023, Bridgetown, Barbados, 16–18 August 2023; pp. 1–8.
  31. Chakrabarty, A.; Maddalena, E.; Qiao, H.; Laughman, C. Scalable Bayesian optimization for model calibration: Case study on coupled building and HVAC dynamics. Energy Build. 2021, 253, 111460.
  32. Kalkhorani, V.A.; Clark, J.D. Creating an Energy Model of an Entire University Campus-Part 1: Preliminary Assessment of Building Modeling Techniques. In Proceedings of the ASHRAE Virtual Winter Conference, Electr Network, Virtual, 9–11 February 2021; pp. 400–408.
  33. Kotha, R.; Lédée, F.; Shamsi, M.H.; Evins, R. Time-Resolved Neural Network Surrogate Models as Digital Twins. In Proceedings of the 5th International Conference on Building Energy and Environment, Montreal, QC, Canada, 25–29 July 2023; pp. 1519–1528.
  34. Corticos, N.D.; Duarte, C.C. Artificial Inteligence Impact on Buildings Energy Efficiency. In Proceedings of the Proceedings—2023 7th International Conference on Computer, Software and Modeling, ICCSM 2023, Paris, France, 21–23 July 2023; pp. 56–61.
  35. Zhou, Y.; Zheng, S. A co-simulated material-component-system-district framework for climate-adaption and sustainability transition. Renew. Sustain. Energy Rev. 2024, 192, 114184.
  36. Arowoiya, V.A.; Moehler, R.C.; Fang, Y. Digital twin technology for thermal comfort and energy efficiency in buildings: A state-of-the-art and future directions. Energy Built Environ. 2024, 5, 641–656.
  37. Bernal, W.; Behl, M.; Nghiem, T.X.; Mangharam, R. MLE+ a tool for integrated design and deployment of energy efficient building controls. In Proceedings of the Fourth ACM Workshop on Embedded Sensing Systems for Energy-Efficiency in Buildings, Toronto, ON, Canada, 6 November 2012; pp. 123–130.
  38. Mondal, N.; Anand, P.; Khan, A.; Deb, C.; Cheong, D.; Sekhar, C.; Niyogi, D.; Santamouris, M. Systematic review of the efficacy of data-driven urban building energy models during extreme heat in cities: Current trends and future outlook. Build. Simul. 2024, 17, 695–722.
  39. Boje, C.; Guerriero, A.; Kubicki, S.; Rezgui, Y. Towards a semantic Construction Digital Twin: Directions for future research. Autom. Constr. 2020, 114, 103179.
  40. Grieves, M. Digital twin: Manufacturing excellence through virtual factory replication. White Pap. 2014, 1, 1–7.
  41. Wang, H.; Qian, Y.; Kuang, Y.; Leng, J.; Yang, Y.; Zhang, H. How occupant positioning systems can be applied to help historic residences manage energy consumption: A case study in China. Build. Environ. 2024, 249, 111110.
  42. Chamari, L.; Petrova, E.; Pauwels, P. An End-to-End Implementation of a Service-Oriented Architecture for Data-Driven Smart Buildings. IEEE Access 2023, 11, 117261–117281.
  43. Kannari, L.; Piira, K.; Bistrom, H.; Vainio, T. Energy-data-related digital twin for office building and data centre complex. In Proceedings of the DCIS 2022—37th Conference on Design of Circuits and Integrated Systems, Pamplona, Spain, 16–18 November 2022.
  44. Brunone, F.; Cucuzza, M.; Imperadori, M.; Vanossi, A. From Cognitive Buildings to Digital Twin: The Frontier of Digitalization for the Management of the Built Environment. In Springer Tracts in Civil Engineering; Springer: Cham, Switzerland, 2021; pp. 81–95.
  45. Roda-Sanchez, L.; Cirillo, F.; Solmaz, G.; Jacobs, T.; Garrido-Hidalgo, C.; Olivares, T.; Kovacs, E. Building a Smart Campus Digital Twin: System, Analytics, and Lessons Learned from a Real-World Project. IEEE Internet Things J. 2024, 11, 4614–4627.
  46. Cao, Z.; Wang, R.; Zhou, X.; Wen, Y. Reducio: Model Reduction for Data Center Predictive Digital Twins via Physics-Guided Machine Learning. In Proceedings of the BuildSys 2022—9th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation, Boston, MA, USA, 9–10 November 2022; pp. 1–10.
  47. Liu, Z.; Zhang, H.; Wang, Y.; Fan, X.; You, S.; Li, A. Data-driven predictive model for feedback control of supply temperature in buildings with radiator heating system. Energy 2023, 280, 128248.
  48. Norouzi, P.; Maalej, S.; Mora, R. Applicability of Deep Learning Algorithms for Predicting Indoor Temperatures: Towards the Development of Digital Twin HVAC Systems. Buildings 2023, 13, 1542.
  49. Zhao, C.; Wu, X.; Hao, P.; Wang, Y.; Zhou, X. Machine learning for optimal net-zero energy consumption in smart buildings. Sustain. Energy Technol. Assess. 2024, 64, 103664.
  50. Zhou, X.; Guo, Q.; Han, J.; Wang, J.; Lu, Y.; Shi, J.; Kou, M. Real-time prediction of indoor humidity with limited sensors using cross-sample learning. Build. Environ. 2022, 215, 108964.
  51. Dai, X.; Shang, W.; Liu, J.; Xue, M.; Wang, C. Achieving better indoor air quality with IoT systems for future buildings: Opportunities and challenges. Sci. Total Environ. 2023, 895, 164858.
  52. Hu, W.; Wang, X.; Tan, K.; Cai, Y. Digital twin-enhanced predictive maintenance for indoor climate: A parallel LSTM-autoencoder failure prediction approach. Energy Build. 2023, 301, 113738.
  53. Hosamo, H.H.; Nielsen, H.K.; Kraniotis, D.; Svennevig, P.R.; Svidt, K. Improving building occupant comfort through a digital twin approach: A Bayesian network model and predictive maintenance method. Energy Build. 2023, 288, 112992.
  54. Gao, X.; Pishdad-Bozorgi, P. A framework of developing machine learning models for facility life-cycle cost analysis. Build. Res. Inf. 2020, 48, 501–525.
  55. Gispert, D.E.; Yitmen, I.; Sadri, H.; Taheri, A. Development of an ontology-based asset information model for predictive maintenance in building facilities. Smart Sustain. Built Environ. 2025, 14, 740–757.
  56. Hosamo, H.; Hosamo, M.H.; Nielsen, H.K.; Svennevig, P.R.; Svidt, K. Digital Twin of HVAC system (HVACDT) for multiobjective optimization of energy consumption and thermal comfort based on BIM framework with ANN-MOGA. Adv. Build. Energy Res. 2023, 17, 125–171.
  57. Hosamo, H.H.; Svennevig, P.R.; Svidt, K.; Han, D.; Nielsen, H.K. A Digital Twin predictive maintenance framework of air handling units based on automatic fault detection and diagnostics. Energy Build. 2022, 261, 111988.
  58. Zhao, Y.; Wang, N.; Liu, Z.; Mu, E. Construction Theory for a Building Intelligent Operation and Maintenance System Based on Digital Twins and Machine Learning. Buildings 2022, 12, 87.
  59. Blume, C.; Blume, S.; Thiede, S.; Herrmann, C. Data-driven digital twins for technical building services operation in factories: A cooling tower case study. J. Manuf. Mater. Process. 2020, 4, 97.
  60. Tariq, R.; Torres-Aguilar, C.E.; Sheikh, N.A.; Ahmad, T.; Xamán, J.; Bassam, A. Data engineering for digital twining and optimization of naturally ventilated solar façade with phase changing material under global projection scenarios. Renew. Energy 2022, 187, 1184–1203.
  61. Wang, Y.; Qi, Y.; Li, J.; Huan, L.; Li, Y.; Xie, B.; Wang, Y. The Wind and Photovoltaic Power Forecasting Method Based on Digital Twins. Appl. Sci. 2023, 13, 8374.
  62. Castilla, M.; Redondo, J.L.; Martínez, A.; Álvarez, J.D. Artificial Neural Network-based digital twin for a flat plate solar collector field. Eng. Appl. Artif. Intell. 2024, 133, 108387.
  63. Roth-Dietrich, G.; Gerten, R. Machine Learning for Energy Management Optimization. In Apply Data Science: Introduction, Applications and Projects; Springer Vieweg: Wiesbaden, Germany, 2023; pp. 159–179.
  64. Seo, B.; Yoon, Y.; Lee, K.H.; Cho, S. Comparative Analysis of ANN and LSTM Prediction Accuracy and Cooling Energy Savings through AHU-DAT Control in an Office Building. Buildings 2023, 13, 1434.
  65. Oulefki, A.; Amira, A.; Kurugollu, F.; Alshoweky, M. Twining Buildings: A Methodological Framework for Design and Implementation using Home Assistant Technology. In Proceedings of the 4th International Conference on Electrical, Communication and Computer Engineering, ICECCE 2023, Dubai, United Arab Emirates, 30–31 December 2023.
  66. Nakai, M.; Ooka, R.; Ikeda, S. Study of power demand forecasting of a hospital by ensemble machine learning. J. Phys. Conf. Ser. 2021, 2069, 012147.
  67. Wetter, M.; Zuo, W.; Nouidui, T.S.; Pang, X. Modelica Buildings library. J. Build. Perform. Simul. 2014, 7, 253–270.
  68. Al-Isawi, O.A.; Amirah, L.H.; Al-Mufti, O.A.; Ghenai, C. Digital Twinning and LSTM-based Forecasting Model of Solar PV Power Output. In Proceedings of the 2023 Advances in Science and Engineering Technology International Conferences, ASET 2023, Dubai, United Arab Emirates, 20–23 February 2023.
  69. Ahmed, W.; Ansari, H.; Khan, B.; Ullah, Z.; Ali, S.M.; Mehmood, C.A.A.; Qureshi, M.B.; Hussain, I.; Jawad, M.; Khan, M.U.S.; et al. Machine Learning Based Energy Management Model for Smart Grid and Renewable Energy Districts. IEEE Access 2020, 8, 185059–185078.
  70. Al-Janabi, S.; Mohammed, G. An intelligent returned energy model of cell and grid using a gain sharing knowledge enhanced long short-term memory neural network. J. Supercomput. 2024, 80, 5756–5814.
  71. Sorour, A.; Fazeli, M.; Monfared, M.; Fahmy, A.A.; Searle, J.R.; Lewis, R.P. MILP Optimized Management of Domestic PV-Battery Using Two Days-Ahead Forecasts. IEEE Access 2022, 10, 29357–29366.
  72. Martins, R.; Musilek, P.; Hesse, H.C. Optimization of photovoltaic power self-consumption using linear programming. In Proceedings of the 2016 IEEE 16th International Conference on Environment and Electrical Engineering (EEEIC), Florence, Italy, 7–10 June 2016; pp. 1–5.
  73. Albino, V.; Berardi, U.; Dangelico, R.M. Smart Cities: Definitions, Dimensions, Performance, and Initiatives. J. Urban Technol. 2015, 22, 3–21. [Google Scholar] [CrossRef]
  74. Angelidou, M. Smart cities: A conjuncture of four forces. Cities 2015, 47, 95–106. [Google Scholar] [CrossRef]
  75. Neirotti, P.; De Marco, A.; Cagliano, A.C.; Mangano, G.; Scorrano, F. Current trends in Smart City initiatives: Some stylised facts. Cities 2014, 38, 25–36. [Google Scholar] [CrossRef]
  76. Caragliu, A.; Del Bo, C.F. Smart innovative cities: The impact of Smart City policies on urban innovation. Technol. Forecast. Soc. Change 2019, 142, 373–383. [Google Scholar] [CrossRef]
  77. Fathy, Y.; Jaber, M.; Nadeem, Z. Digital Twin-Driven Decision Making and Planning for Energy Consumption. J. Sens. Actuator Netw. 2021, 10, 37. [Google Scholar] [CrossRef]
  78. Zhou, M.; Yan, J.; Feng, D. Digital twin framework and its application to power grid online analysis. CSEE J. Power Energy Syst. 2019, 5, 391–398. [Google Scholar] [CrossRef]
  79. Ruohomäki, T.; Airaksinen, E.; Huuska, P.; Kesäniemi, O.; Martikka, M.; Suomisto, J. Smart City Platform Enabling Digital Twin. In Proceedings of the 2018 International Conference on Intelligent Systems (IS), Funchal, Madeira, Portugal, 25–27 September 2018; pp. 155–161. [Google Scholar]
  80. Onile, A.E.; Machlev, R.; Petlenkov, E.; Levron, Y.; Belikov, J. Uses of the digital twins concept for energy services, intelligent recommendation systems, and demand side management: A review. Energy Rep. 2021, 7, 997–1015. [Google Scholar] [CrossRef]
  81. Niavis, H.; Laskari, M.; Fergadiotou, I. Trusted DBL: A Blockchain-based Digital Twin for Sustainable and Interoperable Building Performance Evaluation. In Proceedings of the 2022 7th International Conference on Smart and Sustainable Technologies, SpliTech 2022, Split/Bol, Croatia, 5–8 July 2022. [Google Scholar]
  82. Fan, C.; Sun, Y.; Zhao, Y.; Song, M.; Wang, J. Deep learning-based feature engineering methods for improved building energy prediction. Appl. Energy 2019, 240, 35–45. [Google Scholar] [CrossRef]
  83. Fan, C.; Xiao, F.; Zhao, Y. A short-term building cooling load prediction method using deep learning algorithms. Appl. Energy 2017, 195, 222–233. [Google Scholar] [CrossRef]
  84. Van Rossum, G.; Drake, F.L. Python Reference Manual; Centrum voor Wiskunde en Informatica: Amsterdam, The Netherlands, 1995; Volume 111. [Google Scholar]
  85. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. [Google Scholar]
  86. Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L. Pytorch: An imperative style, high-performance deep learning library. Adv. Neural Inf. Process. Syst. 2019, 32. [Google Scholar] [CrossRef]
  87. Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Corrado, G.S.; Davis, A.; Dean, J.; Devin, M. Tensorflow: Large-scale machine learning on heterogeneous distributed systems. arXiv 2016. [Google Scholar] [CrossRef]
  88. Abadi, M.; Barham, P.; Chen, J.; Chen, Z.; Davis, A.; Dean, J.; Devin, M.; Ghemawat, S.; Irving, G.; Isard, M. TensorFlow: A system for large-scale machine learning. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), Savannah, GA, USA, 2–4 November 2016; pp. 265–283. [Google Scholar]
  89. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2025; Available online: https://www.R-project.org/ (accessed on 26 May 2025).
  90. Kalinowski, T.; Falbel, D.; Allaire, J.J.; Chollet, F.; RStudio; Google; Tang, Y.; Van Der Bijl, W.; Studer, M.; Keydana, S. R Package Version 2.15.0. 2024. Available online: https://CRAN.R-project.org/package=keras (accessed on 26 May 2025). [CrossRef]
  91. European Parliament and Council. Directive (EU) 2024/1275 of 24 April 2024 on the Energy Performance of Buildings (Recast). Official Journal of the European Union, L 2024/1275. 8 May 2024. Available online: https://eur-lex.europa.eu/eli/dir/2024/1275/oj (accessed on 26 May 2025).
  92. Amasyali, K.; El-Gohary, N.M. A review of data-driven building energy consumption prediction studies. Renew. Sustain. Energy Rev. 2018, 81, 1192–1205. [Google Scholar] [CrossRef]
  93. Mocanu, E.; Nguyen, P.H.; Gibescu, M.; Kling, W.L. Deep learning for estimating building energy consumption. Sustain. Energy Grids Netw. 2016, 6, 91–99. [Google Scholar] [CrossRef]
  94. Dash, S.; Sahoo, N.C. Electric energy disaggregation via non-intrusive load monitoring: A state-of-the-art systematic review. Electr. Power Syst. Res. 2022, 213, 108673. [Google Scholar] [CrossRef]
Figure 1. Clusters of keywords with at least three appearances.
Figure 2. Documents by year in the Scopus database [15].
Figure 3. Flow diagram for systematic review based on PRISMA [14].
Figure 4. Digital twin for building energy management through three different approaches.
Figure 5. Keywords or similar ideas in most definitions of Smart Cities.
Figure 6. Opportunities and research directions.
Table 1. Target of the studies.

| Target | Studies |
| --- | --- |
| Data centers | [46] |
| Indoor temperature | [25,46,47,48,49] |
| Indoor humidity | [50] |
| Air quality | [51,52] |
| Occupant comfort | [47,53] |
| Occupant behavior | [41] |
| Operation and maintenance (O&M) | [7,8,11,19,23,34,44,52,53,54,55,56,57,58] |
| Cooling towers | [59] |
| Solar chimney | [60] |
| Solar PV | [49,61] |
| Flat plate solar collector field | [62] |
| Heating, ventilation, and air conditioning (HVAC) | [8,12,28,31,47,48,49,53,56,57,63,64] |
Table 2. Building types addressed by studies with real-world applications.

| Typology of Building | Studies |
| --- | --- |
| Office buildings | [26,28,30,43,64] |
| Residential buildings | [25,41,47,49,63,65] |
| Educational institutions | [12,32,45,48] |
| Hospital | [66] |
| Museum | [29] |
| Theater | [29] |
| Gymnasium | [11] |
Table 3. Main Machine Learning and Deep Learning algorithms identified.

| Algorithm | Application |
| --- | --- |
| Deep Neural Network (DNN) | Forecast indoor temperatures [48,49]; building power demand forecasting [10,29,49,66,82]; building cooling load prediction [83]; optimize energy consumption [49] |
| Support Vector Regressor (SVR) | Power demand forecasting [45,56]; building occupancy forecasting [45]; predicted percentage of dissatisfied (PPD) [56]; failure prediction of indoor climate maintenance [52] |
| Random Forest | Building occupancy forecasting [45]; power demand forecasting [11,56]; forecast indoor temperatures [48]; predicted percentage of dissatisfied (PPD) [56]; forecasting heat demand [3]; forecasting PV generation [3]; failure prediction of indoor climate maintenance [52] |
| Tree-Based Algorithms | Building occupancy forecasting [45]; forecasting PV generation [3,45]; forecast indoor temperatures [48]; forecasting heat demand [3] |
| Extra Trees | Forecast indoor temperatures [48] |
| Artificial Neural Network (ANN) | Power demand forecasting [17,22,49,56]; Digital Twin (black-box approach) [22,56,58,62]; forecast temperature for a flat plate solar collector [62]; finding optimal set points and operation sequences [56]; optimize energy consumption [56]; predicted percentage of dissatisfied (PPD) [56]; building cooling load prediction [64]; forecasting heat demand [3]; forecasting PV generation [3]; failure prediction of indoor climate maintenance [52] |
| Convolutional Neural Network (CNN) | Forecast indoor temperatures [48]; building power demand forecasting [29] |
| Long Short-Term Memory (LSTM) | Forecast indoor temperatures [48]; building cooling load prediction [64]; building power demand forecasting [29]; forecasting PV generation [68]; failure prediction of indoor climate maintenance [52] |
Table 4. Main Digital Twin tools and techniques identified.

| Tool | Application | Approach |
| --- | --- | --- |
| MATLAB | Digital Twin, dashboard and data visualization [25,62,68] | Black-box |
| Simulink in MATLAB | Digital Twins (Smart Residential Buildings, HVAC system, and solar PV) [22,24,56,68] | Black-box, Gray-box |
| TRNSYS | Digital Twins with detailed physical characteristics [9,10,24,25,30] | White-box |
| EnergyPlus | Digital Twins with detailed physical characteristics [9,10,24,30,64] | White-box |
| DesignBuilder | Digital Twins with detailed physical characteristics; data acquisition concerning the physical characteristics of the building [9,10,17] | White-box |
| Dymola | Digital Twins with detailed physical characteristics [9] | White-box |
| DOE-2 | Digital Twins with detailed physical characteristics [9] | White-box |
| eQUEST | Simulation of building envelope characteristics [22] | Gray-box, White-box |
| Modelica | Digital Twins with detailed physical characteristics [31,67] | White-box |
| FMPy | Python package used to import data from Modelica into Python [31] | Gray-box |
| MLE+ | Link between EnergyPlus and Python [24,37] | Gray-box |
| pyEp | Link between EnergyPlus and Python [24] | Gray-box |
| SCADA | Data acquisition [24] | Black-box |
| BEMS | Data acquisition [24,42,56] | Black-box |
| Autodesk Tandem | Digital Twins with detailed physical characteristics [7,42] | White-box |
| Autodesk Revit | Digital Twins with detailed physical characteristics [10,22,41,42,56,57] | White-box |
| Dynamo | Historical sensor data integration [42,53] | Gray-box, White-box |
| Unity 3D | Digital Twins with detailed physical characteristics [11] | White-box |
| FIWARE | Supports the Urban Digital Twins of a Smart City or district and is compatible with NGSI-LD, an open standard data format [45] | Black-box, Gray-box |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Palley, B.; Poças Martins, J.; Bernardo, H.; Rossetti, R. Integrating Machine Learning and Digital Twins for Enhanced Smart Building Operation and Energy Management: A Systematic Review. Urban Sci. 2025, 9, 202. https://doi.org/10.3390/urbansci9060202

