
Interpretable Data-Driven Methods for Building Energy Modelling—A Review of Critical Connections and Gaps

Massimiliano Manfren
Karla M. Gonzalez-Carreon
Patrick A. B. James
Faculty of Engineering and Physical Sciences, University of Southampton, Boldrewood Innovation Campus, Burgess Rd., Southampton SO16 7QF, UK
Author to whom correspondence should be addressed.
Energies 2024, 17(4), 881
Submission received: 30 December 2023 / Revised: 22 January 2024 / Accepted: 9 February 2024 / Published: 14 February 2024
(This article belongs to the Section G: Energy and Buildings)


Technological improvements are crucial for achieving decarbonisation targets and addressing the impacts of climate change in the built environment via mitigation and adaptation measures. Data-driven methods for building performance prediction are particularly important in this regard. Nevertheless, the deployment of these technologies faces challenges, particularly in the domains of artificial intelligence (AI) ethics, interpretability and explainability of machine learning (ML) algorithms. The challenges encountered in applications for the built environment are amplified, particularly when data-driven solutions need to be applied throughout all the stages of the building life cycle and to address problems from a socio-technical perspective, where human behaviour needs to be considered. This requires a consistent use of analytics to assess the performance of a building, ideally by employing a digital twin (DT) approach, which involves the creation of a digital counterpart of the building for continuous analysis and improvement. This paper presents an in-depth review of the critical connections between data-driven methods, AI ethics, interpretability and their implementation in the built environment, acknowledging the complex and interconnected nature of these topics. The review is organised into three distinct analytical levels: The first level explores key issues of the current research on the interpretability of machine learning methods. The second level considers the adoption of interpretable data-driven methods for building energy modelling and the problem of establishing a link with the third level, which examines physics-driven grey-box modelling techniques, in order to provide integrated modelling solutions. 
The review’s findings highlight how the interpretability concept is relevant in multiple contexts pertaining to energy and the built environment and how some of the current knowledge gaps can be addressed by further research in the broad area of data-driven methods.

1. Introduction

The transition to sustainable energy systems in the built environment is not only a technological challenge but also a multi-faceted socio-technical endeavour. The complex relationship between society and technology in the context of low-carbon transitions is thoroughly explored in recent research, which depicts it as a “Great Reconfiguration” [1]. Multiple research issues have emerged in recent years due to the necessity to make the “Great Reconfiguration” happen in practice at an accelerated pace. Examples include the issue of creating a participatory energy transition considering energy democracy and energy citizenship problems [2], the potential conflict between a rapid and an inclusive energy transition process [3] as well as the need for skills deployment to enable a “just” transition [4], and the socio-technical problem of designing energy communities [5], which need to operate differently from traditional solutions [6] and can be used as a means to foster technological innovation [7], while considering the issue of techno-economic optimisation [8] of potential design solutions. At the same time, the use of empirically grounded data-driven technology forecasts [9] and energy analytics [10] for planning and policies is becoming increasingly necessary to turn data into actions.
The built environment, both in Europe and worldwide, has a crucial role in the transition as it is at the intersection of the large carbon emitters: energy, transportation and building sectors [11]; a multi-scale perspective is needed to understand the actual impact of energy consumption and emissions by buildings [12,13]. Further, the carbon emission of the construction industry on a global scale has been analysed by Huang et al. [14], indicating an impact of around 23% of the total carbon emission determined by economic activities on a global scale. At the EU policy level, the European Green Deal [15], the Renovation Wave [16] and Energy Roadmap 2050 [17] underline the importance of the strict implementation of the Energy Performance of Buildings Directive (EPBD) [18] requirements, as well as the appropriate use of cost-optimal analysis principles [19,20].
In parallel, the shift from a “linear” to a “circular” economic model has significant implications for the built environment and the construction industry. This is because buildings, being long-term assets, consume a substantial amount of energy and consequently contribute a significant quantity of carbon dioxide emissions, as previously mentioned. This occurs during the production of materials, as well as throughout the construction, operation and dismantling phases [21]. It is important to acknowledge the technical limitations and constraints of a circular economic paradigm [22]. Nonetheless, the growing focus on understanding building performance throughout its entire life cycle, from “cradle to grave” [23] or even from “cradle to cradle” in a circular economy perspective [24], is crucial for promoting systemic innovation.
In this framework, digitalisation can be leveraged to develop solutions, specifically customised for the built environment and construction industry, to be able to cope with the social and economic challenges outlined previously, as well as with the critical nature of climate evolution. Regarding the digitalisation of building assets, building information modelling (BIM) has developed over time as a multi-faceted concept [25] that presents open challenges from a cognitive [26] and knowledge management standpoint. The recent developments from BIM to digital twins (DTs) [27], involving multiple domains of knowledge within the building sector and across life cycle phases (from “cradle to cradle” [28]), are promising when seen from the perspective of building performance modelling [29], and data-driven methods are becoming crucial instruments [29,30] for improving energy efficiency [31], fostering renewable energy source penetration and enhancing energy flexibility.
Overall, the benefits of technology improvements in the building sector go beyond direct ones such as primary energy, carbon emission and cost reduction and include co-benefits such as health [32] and productivity, among others. In this sense, innovative modelling tools, employing cutting-edge data-driven approaches, have the potential to fundamentally change the way buildings are designed and operated (within a whole life cycle perspective, as explained earlier). However, the effectiveness of these innovative paradigms depends to a good extent on how easily the underlying data-driven methodologies can be understood and interpreted in human terms. It is essential to prioritise transparency and adopt a “human-in-the-loop” approach in machine learning to foster the trustworthiness and acceptance of these technologies.
Challenges to potential future artificial intelligence (AI) and machine learning (ML) applications in the built environment are posed by the concept of “right to explanation”, derived from Article 22 of the EU GDPR, dealing with the topic of “Automated individual decision-making, including profiling”, and the statement in Article 13 of the Artificial Intelligence (AI) Act that “High-risk AI systems shall be designed and developed in such a way to ensure that their operation is sufficiently transparent to enable users to interpret the system’s output and use it appropriately.” The risk classification pyramid, as proposed within the framework of EU legislation on AI at present [33], encompasses four distinct categories of risk: unacceptable risk, high risk, limited risk, and minimal or no risk. High-risk AI systems encompass, among other things, important infrastructures that could put the lives and health of citizens at risk and essential private and public services. Consequently, numerous uses of data-driven techniques in the fields of the built environment and energy can be categorised as “high risk” in principle.
This paper aims to explore the topic of interpretability and the related topic of explainability in the context of data-driven methods for building energy modelling, investigating its significance in relation to neighbouring research areas (i.e., critical connections) and identifying current knowledge gaps.

2. Methods and Tools

The energy transition in the built environment and in the construction industry poses several challenges which have been briefly outlined in the introduction. Emerging paradigms related to the use of data-driven analytics can be leveraged in multiple applications during the entire building life cycle, contributing to the acceleration of the energy transition process. However, the impact of artificial intelligence (AI) and machine learning (ML) innovations may be highly disruptive, and issues such as AI ethics and the “right to explanation”, introduced in the previous section, need to be carefully considered. At the same time, consolidated knowledge in the energy and built environment domain can provide solid foundations and benchmarks to be used in the development of innovative solutions based on AI/ML. For these reasons, the review process is articulated in three analytical levels, as summarised in Table 1.
The first analytical level considered in the review corresponds to the necessity of introducing some relevant concepts in the evolution of AI/ML research and trying to understand how they can impact applications in the built environment, in relation to emerging paradigms enabled by digitalisation. The second analytical level is provided to illustrate how some of the methods and tools that have already proven to be effective in field applications can be used for the development of future research in the broad area of data-driven methods in buildings. Finally, the third analytical level is included because of possible simplifications of building energy simulation models that can be employed while providing adequate accuracy and retaining the physical structure of models.
Overall, the objective of the proposed literature review methodology is to establish a connection between the key research issues pertaining to the different analytical levels considered and indicate knowledge gaps and potential future research directions.
Figure 1 provides a graphical diagram (a mind-map) of the three analytical levels considered in the research and the articulation of some of the key topics discussed later in Section 3. Many of the topics discussed are closely related and connected across the different analytical levels chosen. This structure was chosen to establish a link between the problem of interpretability of data-driven methods (Level 1) and the consolidated knowledge and recent advances in the domain of building energy performance modelling (Level 3), while developing research at the intersection of the two (Level 2), focused on data-driven methods in buildings and energy domains. This choice corresponds to the necessity of providing solid foundations that can be used in the development of further research, considering the goal of contributing to an acceleration of the energy transition.
The first analytical level starts from the problem of defining interpretability, briefly illustrated by Molnar et al. [34]; the taxonomy by Molnar [35] is used as well, distinguishing between ante hoc (intrinsic) and post hoc interpretability. In brief, a data-driven method in which the changes in output determined by a change in the input or in the algorithmic parameters can be understood in human terms can be defined as intrinsically interpretable. If this is not possible and the model requires post hoc techniques to become understandable in human terms, the term explainable is used in this paper, to mark a clear distinction. As will be discussed in Section 3.1, there is no consensus on the concepts of interpretability and explainability at present; the point of view adopted in this research derives from the extensive research conducted by Rudin et al. [36] on the challenges of interpretability and the related conceptual problems, highlighted by Watson [37].
Many of the challenges identified for interpretable machine learning (ML) by Rudin et al. [36] have direct implications in the building and energy domains. Among them, for example, are sparse logical models (decision trees, lists, sets), scoring systems, generalised additive models (GAMs), case-based reasoning, dimension reduction for data visualisation and ML models that incorporate physics and other generative or causal constraints. A combination of these techniques, which can be formulated in an intrinsically interpretable way, can be exploited in multiple applications in buildings; some of them are discussed in Sections 3.2 and 3.3. Finally, Section 3.4 discusses the limitations, summarises findings and provides indications for future research.
Regarding Level 3, the problem of classifying the large variety of modelling approaches used to estimate and predict building energy performance has been extensively reviewed by Kavgic et al. [38], Foucquier et al. [39], Fumo [40] and, more recently, Chen et al. [41]. These reviews illustrate important concepts such as “top-down” [42] and “bottom-up” [43] modelling and “forward” and “inverse” modelling and describe the difference between “white-box”, “grey-box” and “black-box” modelling. White-box models are, in this context, detailed physics-driven models used for building performance simulation. Grey-box models are hybrid physical–statistical approaches frequently encountered in the form of lumped parameter resistance–capacitance (RC) models. Black-box models are purely data-driven models based on statistics and ML. One of the challenges identified by Rudin et al. [36] is incorporating physics and other generative or causal constraints in ML, and this appears to be crucial in the built environment and energy domains, in light of the multiplicity of possible approaches to energy performance estimation and prediction. A discussion around this problem is reported in Section 3.3.
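To make the notion of a lumped parameter RC grey-box model more concrete, the following is a minimal sketch, assuming the simplest possible structure: a single thermal resistance and a single capacitance (often called 1R1C). The function name, the explicit Euler integration scheme and all parameter values are illustrative assumptions and are not taken from the reviewed papers. In practice, the resistance and capacitance would be estimated from measured data rather than fixed by hand.

```python
def simulate_1r1c(t_in0, t_out, q_heat, r, c, dt):
    """Explicit-Euler simulation of C * dT_in/dt = (T_out - T_in) / R + Q_heat.

    t_in0  : initial indoor temperature [degC]
    t_out  : sequence of outdoor temperatures, one per time step [degC]
    q_heat : sequence of heat inputs, one per time step [W]
    r      : lumped thermal resistance of the envelope [K/W]
    c      : lumped thermal capacitance of the building [J/K]
    dt     : time step [s]; should be well below r * c for numerical stability
    """
    temps = [t_in0]
    for t_o, q in zip(t_out, q_heat):
        t_in = temps[-1]
        # Heat balance on the single thermal node, advanced by one time step.
        temps.append(t_in + dt * ((t_o - t_in) / r + q) / c)
    return temps


# Hypothetical parameters: the time constant r * c is 50,000 s (about 14 h).
R, C, DT = 0.005, 1.0e7, 3600.0
hours = 500
trajectory = simulate_1r1c(10.0, [0.0] * hours, [4000.0] * hours, R, C, DT)
print(round(trajectory[-1], 3))
```

With constant boundary conditions, the simulated indoor temperature tends towards the steady-state value T_out + R · Q_heat, which is 20 °C for the values assumed here. Both parameters retain a direct physical meaning, which is the property that makes this class of models attractive from an interpretability standpoint.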
Finally, Level 2 (bridging Level 1 and Level 3) analyses the concept of interpretability in relation to data-driven methods in buildings and energy domains. In this case, the search for papers was conducted using keywords in Scopus and the Web of Science (WoS) dataset. Keyword searches can make use of AND and OR operators, among others. However, only the AND operator has been used in this case to limit the number of results and to consider only the articles that actually address interpretability in the context of energy and buildings. A summary of the searches and results is reported in Table 2.
The choice of keywords such as “Regression”, “Deep Learning”, “Random Forest” and “Optimal trees” derives from the necessity of considering both intrinsically interpretable methods, such as the ones based on multivariate regression, and explainable ML algorithms. The keyword “Bayesian” was used to provide results that reflect a Bayesian approach rather than a classical/frequentist one. Finally, the keywords “M&V”, an acronym for measurement and verification, and “Degree-Days” were included because of the empirically grounded nature of the techniques used for M&V and the importance of variable-base degree-days (VBDDs) for energy performance normalisation. While the VBDD concept is quite consolidated for both heating (using heating degree-days, HDDs) and cooling (using cooling degree-days, CDDs) performance normalisation, recent advances [44] could enable more sophisticated modelling strategies while retaining interpretability due to the physical definition of this quantity.
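As an illustration of the variable-base degree-day concept, the sketch below computes HDDs and CDDs from daily mean outdoor temperatures for an arbitrary base temperature. It assumes the simple daily-mean definition of degree-days; other definitions exist (for example, those based on hourly data or on daily minimum and maximum temperatures). The function name and the sample temperatures are hypothetical.

```python
def degree_days(daily_mean_temps, base_temp, mode="heating"):
    """Sum of daily degree-days for a given base temperature [degC * day].

    Heating: positive part of (base_temp - daily mean temperature).
    Cooling: positive part of (daily mean temperature - base_temp).
    """
    if mode == "heating":
        return sum(max(base_temp - t, 0.0) for t in daily_mean_temps)
    if mode == "cooling":
        return sum(max(t - base_temp, 0.0) for t in daily_mean_temps)
    raise ValueError("mode must be 'heating' or 'cooling'")


# Hypothetical daily mean outdoor temperatures [degC].
temps = [10.0, 15.0, 20.0, 25.0]

# "Variable base" means that the base temperature is itself a parameter
# that can be scanned or fitted for each building.
hdd_by_base = {base: degree_days(temps, base) for base in (14.0, 16.0, 18.0)}
cdd_22 = degree_days(temps, 22.0, mode="cooling")
print(hdd_by_base, cdd_22)
```

Because the base temperature can be read as the outdoor temperature below (or above) which heating (or cooling) is required, it remains a physically interpretable quantity even when it is estimated from data.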

3. Literature Review

The literature review presented in this section follows the methodology presented in Section 2. The first analytical level, involving the interpretability and explainability concepts, the ethics of artificial intelligence (AI) and machine learning (ML) and the emerging digital twin (DT) paradigm, is presented in Section 3.1. After that, the second analytical level is presented in Section 3.2, which explores the application of the DT paradigm in buildings, measurement and verification (M&V) and interpretable data-driven methods for energy modelling in buildings. Section 3.3 introduces some advances related to physics-informed ML and then illustrates key features and applications of grey-box energy modelling techniques in buildings. Finally, Section 3.4 discusses limitations, presents a summary of findings (critical connections and gaps in knowledge) and proposes some directions for future research.

3.1. Interpretability Concept, Ethics of AI/ML and Digital Twin Paradigm

The ongoing debate around the concept of “right to explanation”, derived from Article 22 of the EU GDPR (dealing with the topic of “Automated individual decision-making, including profiling”), involves multiple aspects with both legal and technical implications, inherently connected to a potential “right to technical explainability”, discussed in detail by Gallese [45] in the context of the EU AI Act for the regulation of artificial intelligence (AI). Panigutti et al. [46] discuss in detail the problem of explainable AI (XAI) in high-risk AI systems (Article 13 of the proposed AI Act). Their article presents some of the key points currently debated and affirms that the AI Act neither mandates XAI, which is the subject of intense scientific research at present and still presents technical limitations, nor bans the use of black-box AI systems. Instead, the AI Act aims to achieve its stated policy objectives with a focus on transparency (including documentation) and human oversight. The focus on transparency and the need for human oversight do not exclude a priori black-box modelling but do not resolve the problem of how to deal with explainability and interpretability from a practical perspective, as indicated by Gryz and Rojszczak [47].
In this paper, the distinction between interpretability and explainability given by Rudin et al. [36,48] is used. In particular, explainability occurs when a “second (post hoc) model is created to explain the first black-box model”. In this case, two models are used rather than one which is ante hoc (intrinsically) interpretable, following the taxonomy proposed by Molnar [35]. In the literature, multiple definitions can be found, as shown by Flora et al. [49], and the two terms are frequently used interchangeably; however, the choice to use the term “interpretable” for intrinsically interpretable and “explainable” for post hoc methods in this research derives from the practical necessity of drawing a clear distinction between the methods that are intelligible in human terms without the use of additional techniques (intrinsic, ante hoc) and the methods that require specific techniques, such as SHAP (Shapley Additive Explanations) and LIME (Local Interpretable Model-agnostic Explanations). SHAP is based on game theory and provides a way to explain the output of any model by calculating the contribution of each feature to the prediction. It assigns each feature an importance value for a particular prediction. LIME, on the other hand, approximates complex models with simpler, local ones that are understandable to humans. It perturbs the input data and observes the changes in predictions to understand which features significantly influence the output. LIME is model-agnostic, meaning it can be used with any machine learning model.
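The game-theoretic idea underlying SHAP can be illustrated with a minimal sketch that computes exact Shapley values by enumerating all feature coalitions. This is not the SHAP library and does not reproduce its algorithms: it assumes that "absent" features are replaced by the values of a single reference (baseline) point, whereas practical implementations typically average over background data and rely on approximations. The toy model, its coefficients and the feature names are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of each feature for the prediction model(x).

    Features outside a coalition take their value from `baseline`.
    The cost grows exponentially with the number of features.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight of a coalition of this size that excludes feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_energy_model(z):
    """Hypothetical model with an interaction term; z = [hdd, occupancy]."""
    hdd, occupancy = z
    return 2.0 * hdd + 0.5 * occupancy + 0.1 * hdd * occupancy


x = [10.0, 4.0]
baseline = [0.0, 0.0]
phi = shapley_values(toy_energy_model, x, baseline)
print(phi)
```

The contributions sum to the difference between the prediction for the instance and the prediction for the baseline, and the interaction term is shared equally between the two features. This also shows why such explanations depend on the choice of baseline, one of the reasons post hoc explanations need careful interpretation.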
One of the crucial problems is determined by the fact that black-box techniques such as deep learning or random forest (among others) are typically more accurate (therefore preferred based on their performance) than other models, but much less transparent [49]. However, as pointed out by Rudin and Radin [50], in many cases, interpretable models (such as regressions and trees) might be nearly as accurate as black-box techniques, or simply accurate enough for a certain use. A criterion to choose between intrinsically interpretable and black-box explainable methods is provided by Petch et al. [51]: “[...] data scientists should train models using both interpretable and black-box methods to assess whether there is, in fact, an accuracy vs interpretability trade-off in the specific cases on which they are working. If there is no meaningful difference in accuracy between an interpretable model and a black-box, an interpretable method should be used”. In the same paper, the authors discuss the problem of models used in the context of high-stake decisions, where sacrificing interpretability could make sense due to improvement in accuracy, but state that practitioners (clinicians in the specific case) should be aware of limitations and avoid overinterpreting results. The position taken by Rudin [48] is clear-cut, and the use of interpretable methods is advocated for high-stake decisions, due to the difficulty of interpreting results with post hoc methods, which can lead to fallacies.
Going back to policy-related issues, in recent years, different countries have developed numerous artificial intelligence (AI) plans with the aim of maximising benefits and reducing risks, determining a debate around ethical concerns related to AI/ML [52]. According to comparative research, policy plays a crucial role, and there are considerable differences, for example, between the EU and the US [52]. Concurrently, various organisations have initiated a wide range of efforts to establish ethical guidelines for the use of socially beneficial artificial intelligence. In this case, the proliferation of principles poses a risk of overwhelming and confounding. Floridi and Cowls [53] examine this issue and propose five basic principles with the aim of synthesising the numerous emerging instances in this area of research. According to the authors, four of the five main principles are the same as the ones in the field of bioethics, namely beneficence, nonmaleficence, autonomy and justice. The fifth principle, unique to AI, is explicability. This encompasses both intelligibility, addressing the question “how does it work?”, and accountability, addressing the question “who is responsible for the way it works?”. These aspects pertain to the epistemological and ethical dimensions, respectively. Achieving explicability in practice requires an appropriate design of AI/ML applications that can be inspected and audited to be trusted. For this reason, technical notions such as explainability and interpretability become crucial as they could enable or hinder the development of socially beneficial artificial intelligence.
Finally, an emerging paradigm in the digital domain that is receiving a lot of attention at present is that of digital twins (DTs), broadly defined as digital counterparts of physical assets that are connected and synchronised with them. The DT paradigm is reviewed systematically by Semeraro et al. [54] and by Dalibor et al. [55], using a cross-domain approach that aims to tackle the inherent complexity. Specific declinations of the DT concept in the building domain are discussed later in Section 3.2.
What appears interesting, in relation to the challenges outlined by Rudin et al. [36] for interpretable ML, is the possibility of creating models that incorporate physics and/or causal constraints, because they can be more effective in the representation of physical assets across their life cycle. As clearly discussed by Wright and Davidson [56] with examples, there is a difference between a plain model and a DT. Overall, the DT notion combines various well-established technologies and has the potential to yield significant advantages even though further research is needed. Nonetheless, a successful implementation of a DT necessitates trust in the model, trust in the data and trust in the process used to update models based on data. Therefore, the research around technical concepts like interpretability and explainability, together with human oversight and explicability of AI/ML applications (i.e., intelligibility and accountability), is becoming fundamental for driving socially beneficial innovation through digitalisation.

3.2. Digital Twins in Buildings, M&V and Interpretable Data-Driven Methods

The emergence of new digital domains of knowledge in the built environment has brought about novel terminology and ideas, reshaping our understanding of established technologies [29]. This trend is especially noticeable in the domain of machine learning models for energy in buildings, which are frequently developed as “black boxes”. Buildings that are equipped with advanced digital controls have transformed into cyber–physical systems [57], which combine digital computation with physical processes. Of all these advancements, digital twins (DTs) are particularly notable for their ability to enable the comparison between simulated (and/or predicted) and measured data in buildings, providing insights on the actual performance of physical assets and enabling a continuous monitoring and, at least potentially, improvement process. This ability to learn from “gaps” between predicted and actual performance is particularly interesting with respect to the debated question of “performance gap” in buildings [58,59,60] and can act as a catalyst for innovation within the building research community. However, an effective implementation of the DT concept in buildings is challenging due to the need to incorporate multiple data streams from building information modelling (BIM), building performance simulation, machine learning tools/platforms, etc. BIM is, in itself, an evolving concept with respect to which there are different points of view, narrower or broader [25], and poses the challenge of proposing a unique definition [61].
In this regard, Opoku et al. [62] analysed the challenges inherent to the use of DT in the construction industry, which is characterised by low productivity, a modest level of investments in research and development and poor technology advancements. They identified six areas of application: BIM, structural system integrity, facilities management, monitoring, logistics processes and energy simulation. In a similar way, Deng et al. examine the progression from BIM to DTs [27] in relation to the different stages of the building life cycle and the diverse knowledge domains and applications involved.
The definition proposed by Borth et al. [63], characterising a DT as a digital replica version of physical assets that is connected and synchronised, encompassing the elements and dynamics of system and device operations within their environment and lifecycle, is challenging to achieve in practice in the built environment. Nonetheless, ontologies such as SAREF and others can help in developing scalable and adaptable DT applications for energy modelling [64]. The difficulties of using the DT paradigm in practice for data-driven analytics and turning data into actions in the building sector are discussed by Ammar et al. [65].
Even if the DT focus is put on a single domain and a single problem, namely energy and operation, the current research landscape appears to be quite diversified, as discussed by Ye et al. for energy management in industrial applications [66]. Being able to leverage both short- and long-term operational energy data in buildings is not a trivial problem [67], and Chen et al. review the problem of interpretability in relation to building energy management [68], highlighting the lack of a unified consensus regarding interpretability and explainability concepts and the limitations of post hoc techniques such as SHAP and LIME.
In this context, the ability to interpret ML results is essential for users to understand and trust the technology. Although previous research has classified and examined different interpretable ML methods, there are still obstacles to overcome, specifically regarding the ambiguity of terminology and limitations in the interpretability of widely used techniques like SHAP (Shapley Additive Explanations) and LIME (Local Interpretable Model-agnostic Explanations). While powerful, SHAP and LIME do not completely resolve the challenges related to the interpretability and comparability of model explanations, even though SHAP can highlight key features and LIME can provide local linear approximations. This discrepancy underlines the necessity for future studies to concentrate on improving the interpretability of “black-box” models (post hoc) on the one hand and on using methods that are interpretable by design (or ante hoc, intrinsic) on the other hand, where possible. Rudin [48] takes a strong stance on this matter, advocating for the use of data-driven methodologies that are inherently interpretable for high-stake decisions, instead of relying on post hoc explanations of the model. As discussed in the introduction, decision-making processes in the building sector can have significant and long-lasting consequences. It is crucial to evaluate whether judgements based on building performance modelling and simulation can be classified as high-stake decisions and, consequently, whether interpretable data-driven methods should be prioritised where possible in the built environment. For this reason, a series of papers is presented hereafter which make use of ante hoc (intrinsically) interpretable modelling techniques for baseline energy modelling, statistical energy modelling, retrofit analysis and other related problems.
Predicting building energy consumption is a multi-faceted task that encompasses numerous aspects. Various methodologies can be employed to analyse energy baseline models, as discussed by Qaisar and Zhao [69], which are essential for accurately assessing energy demand and potential savings. Along the same line, the study conducted by Afroz et al. [70] compares four techniques and demonstrates that the NARX modelling technique has greater accuracy and fit. However, change point models (due to their ante hoc/intrinsic interpretability) provide valuable operational insights in addition to calculating energy savings, which are crucial for securing financial investments, enabled by cost savings. Interpretability often leads to a preference for practical and intuitive models over more sophisticated ones. Nevertheless, existing techniques tend to underestimate uncertainty in energy efficiency projects, as shown by Grillone et al. [71].
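The intrinsic interpretability of change point models can be seen in a minimal sketch of a three-parameter heating model, in which energy use equals a base load plus a slope multiplied by the positive part of the difference between a change point temperature and the outdoor temperature. The fitting strategy assumed here (a grid search over candidate change points with ordinary least squares at each candidate) is only one of several possible approaches, and the function names and data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0.0:
        # All x values are equal: fall back to a constant model.
        return mean_y, 0.0
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    return mean_y - slope * mean_x, slope


def fit_change_point_heating(temps, energy, candidates):
    """Fit E = base_load + slope * max(t_cp - T, 0) by grid search over t_cp."""
    best = None
    for t_cp in candidates:
        x = [max(t_cp - t, 0.0) for t in temps]
        base_load, slope = fit_line(x, energy)
        sse = sum((base_load + slope * xi - yi) ** 2 for xi, yi in zip(x, energy))
        if best is None or sse < best["sse"]:
            best = {"t_cp": t_cp, "base_load": base_load, "slope": slope, "sse": sse}
    return best


# Synthetic data generated with t_cp = 16 degC, base load = 50, slope = 3.
temps = [0.0, 4.0, 8.0, 12.0, 16.0, 20.0, 24.0]
energy = [50.0 + 3.0 * max(16.0 - t, 0.0) for t in temps]
model = fit_change_point_heating(temps, energy, [float(c) for c in range(10, 23)])
print(model)
```

Each fitted parameter has an operational reading: the base load is the weather-independent consumption, the slope is the sensitivity of consumption to outdoor temperature, and the change point approximates the outdoor temperature at which heating starts.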
For this reason, measurement and verification (M&V) approaches play a crucial role in facilitating investments by offering accurate performance estimates and verification. As shown by Fu et al. [72], univariate regression models are widely used by practitioners due to their simplicity and effectiveness. M&V methodologies are also incorporated in protocols for analysing and certifying building performance proposed by ASHRAE, CIBSE and USGBC. These protocols encompass several domains such as energy, water, comfort, air quality, lighting and acoustics. In this respect, Kim and Haberl demonstrated the implementation of energy consumption measurement and modelling techniques at various levels of complexity, ranging from basic to medium [73] and advanced [74], through the use of case studies. Field testing requires gathering data including electricity consumption, natural gas consumption and weather data, among others. As indicated by Kim and Haberl, less complex procedures are frequently more informative and convenient to implement.
Considering the problem of energy-efficient retrofitting, which is essential for decreasing the consumption of buildings, there is a lack of simple but robust evaluation procedures with respect to uncertainty [75]. Alrobaie and Krarti emphasise the importance of using data-driven techniques such as linear regressions, decision trees, ensemble approaches and deep learning in order to develop effective strategies for M&V [76]. They advocate the use of a flexible M&V framework to determine the most effective approaches based on the specific building type and retrofit procedures. Grillone et al. presented a novel approach employing the generalised additive model (GAM) technique [77], evaluated on both synthetic and real-world datasets of retrofitted buildings, which demonstrates the potential to outperform other techniques while retaining intrinsic interpretability.
Further, empirically grounded techniques derived from M&V approaches can provide support for decision-making processes across building life cycle phases [78]. Implementing a standardised approach to processing and reporting energy consumption data is important for tackling a reduction in both operational and embodied energy. Energy analytics enabled by data-driven methods, open and synthetic data, digitisation and innovative business models [79] are essential for accelerating the energy transition process in the building sector [80] and retaining interpretability, for the reasons outlined at the beginning of this section and in Section 3.1. The studies previously reported demonstrated the feasibility of performing various tasks, such as energy baseline modelling, benchmarking and estimating energy savings, using data-driven methodologies that are interpretable and have been proven in the field, leveraging measurement and verification (M&V) protocols. Software tools for M&V such as ECAM [81], OpenEEmeter/Caltrack [82], RMV2.0 [83] and NMEC-R [84], to name a few, can serve as a reference and starting point for future investigation and might accelerate advancements in this area and address, to some extent, the issue of interpretability highlighted in the paper by Chen et al. [68]. However, it is important to note that research in this field is progressing rapidly. For this reason, papers selected from the group identified using the keyword search method outlined in Section 2 and described in Table 1 and Table 2 are analysed in Table 3. The selection of papers was based on two criteria: the year of publication, with a preference for more recent work, and the relevance to the study, focusing on papers that extensively presented the performance of a single approach, compared multiple data-driven methods, or provided an overview of their usage. 
The analysis considers various factors, including interpretability (ante hoc, post hoc), the time scale of analysis (yearly, monthly, daily, hourly), the spatial scale of analysis (individual building systems, whole building, building stock, urban) and finally causality (counterfactual, physical constraints in interpretation).
The chosen studies demonstrate a broad spectrum of issues that can be addressed by data-driven approaches, encompassing various energy carriers (electricity, fuels, thermal energy, etc.), different components of the energy balance, different temporal resolutions of analysis and multiple spatial scales. M&V principles possess sufficient flexibility to be applied to a wide range of challenges, all the while offering a rigorous approach in the form of a protocol and making use of innovative data-driven methods. The papers associated with keyword search 1 in Table 3 use regression-based techniques, sometimes in conjunction with other techniques, and predominantly leverage ante hoc (intrinsic) interpretability, while the papers found with search number 5 use deep learning and make use of post hoc interpretability by means of techniques such as SHAP and LIME. A great deal of effort is being devoted to the development of modelling tools based on deep learning, because of its high performance for model fitting, but this technique is intrinsically not interpretable. Nonetheless, many interesting concepts are emerging from the applications of deep learning and complex neural network architectures, such as the “attention mechanism”, “transformers” and physics-informed neural networks (PINNs).
The “attention mechanism” in deep neural networks is a technique that allows the model to focus on specific parts of the input data that are relevant to a given task. It is akin to how human attention works: just as we focus on particular aspects of our environment while ignoring others, attention mechanisms enable neural networks to prioritise certain parts of the input data. A “transformer” in neural networks is a model component designed to handle sequences of data, like text, by using self-attention mechanisms. These mechanisms allow the model to weigh the importance of different parts of the input data, making efficient use of the available information. A physics-informed neural network (PINN) integrates physical laws, often in the form of differential equations, into the training of neural networks. This approach imposes coherency with known physical principles in the model, improving accuracy, especially in scenarios with limited data.
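As a minimal sketch of the weighting idea described above, the following code implements scaled dot-product self-attention in plain Python: each query is compared with every key, the scores are normalised with a softmax, and the output is the corresponding weighted average of the values. It is a simplification under stated assumptions; the learned projections and multiple heads used in practical transformer architectures are omitted, and the three two-dimensional input vectors are hypothetical.

```python
from math import exp, sqrt


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention; returns (outputs, weights)."""
    d = len(keys[0])
    n_out = len(values[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query with every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / sqrt(d) for k in keys]
        w = softmax(scores)
        # Weighted average of the value vectors.
        out = [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(n_out)]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


# Self-attention: the same (hypothetical) sequence acts as queries, keys and values.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(x, x, x)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of weights is non-negative and sums to one, so it shows how much each input position contributes to a given output position.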
While PINNs and, more generally, physics-informed ML are emerging solutions (some of the current developments and open challenges are illustrated in Section 3.3), simpler and intrinsically interpretable solutions may be effective as well, even for dynamic modelling at hourly and sub-hourly resolutions. Kazmi et al. [109] discuss the problem of data-driven modelling and forecasting of operational energy demand at both building and urban scales. In their research, the authors highlighted the problem of regularity and stability of energy consumption at different aggregation levels, from single users to large aggregations of users, the latter being in general more stable and easier to predict; further, they showed that the type of day (weekday/weekend or day of the week) and ambient temperature are the most relevant variables for load profiles. With a similar approach, Manfren and Nastasi [110] proposed a reformulation of the Time Of Week and Temperature (TOWT) algorithm where the time of week component of the model (day type and hour of the day) and the ambient temperature component (outdoor air temperature) are analysed through the lens of interpretability to achieve a better understanding of user behaviour and schedule (time of week component) and building thermal behaviour (outdoor air temperature response). As subsequently shown by Manfren et al. [111], it is possible to integrate multiple interpretable regression-based models (including TOWT) in order to provide a robust data-driven solution for DTs aimed at energy management. The techniques employed are derived from previous research [112] and are quite closely related, in principle, to the approach proposed by Staffell et al. [113] for a global model of hourly space heating and cooling demand at multiple spatial scales. This global model has been validated against demand from around 5000 buildings and 43 regions across four continents.
The model includes a modified formulation of heating and cooling degree-days (HDDs/CDDs) with additional variables such as solar radiation, wind speed and relative humidity and a smoothing factor to deal with the dynamic behaviour of the buildings. Despite these additions, the model remains intrinsically interpretable, as it is based on multiple piecewise linear regressions (used to determine the daily energy demand for heating and cooling) and factors to shape the hourly load profiles (intra-daily variation). Further, all the quantities involved in the calculations are, in practice, physically constrained and transparent to the user. The characteristics of these models [110,111,113,114] confirm the possibility of creating effective representations of the dynamic building energy behaviour while retaining an intrinsically interpretable formulation.
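The general structure of a degree-day model with smoothing and piecewise linear demand can be sketched as below. This is an illustrative simplification and not the formulation of [113]: the exponential smoothing rule, the threshold temperatures, the coefficients and the flat hourly profile are all assumptions, and the additional weather variables (solar radiation, wind speed, relative humidity) are omitted.

```python
def smooth(temps, alpha=0.5):
    """Exponentially weighted temperature, carrying a 'memory' of past days."""
    out = [temps[0]]
    for t in temps[1:]:
        out.append(alpha * out[-1] + (1.0 - alpha) * t)
    return out


def daily_demand(temps, t_heat=14.0, t_cool=20.0,
                 p_base=10.0, p_heat=2.0, p_cool=1.5):
    """Piecewise linear daily demand from smoothed heating/cooling degree-days."""
    demand = []
    for t in smooth(temps):
        hdd = max(0.0, t_heat - t)   # heating degree-days for this day
        cdd = max(0.0, t - t_cool)   # cooling degree-days for this day
        demand.append(p_base + p_heat * hdd + p_cool * cdd)
    return demand


def hourly_profile(day_total, shares):
    """Distribute a daily total over the hours using shares that sum to one."""
    return [day_total * s for s in shares]


# Hypothetical daily mean outdoor temperatures (°C).
temps = [4.0, 6.0, 16.0, 24.0, 28.0]
demand = daily_demand(temps)
flat_shares = [1.0 / 24.0] * 24  # placeholder intra-daily shape
first_day_hourly = hourly_profile(demand[0], flat_shares)
print(demand)
```

Every parameter has a direct physical reading (threshold temperatures, base demand, demand per degree-day), which is what keeps this type of formulation transparent to the user.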

3.3. Physics-Informed ML and Grey-Box Modelling in Buildings

Physics-informed machine learning represents a very active research field at present and can contribute to addressing one of the challenges for interpretability presented by Rudin et al. [36], that of incorporating physics and other generative or causal constraints in models. Karniadakis et al. [115] provide an overview of the ongoing research, showing how physics-informed ML can seamlessly integrate data and physical models even in uncertain and high-dimensional problems. The authors discuss how physics-informed neural network architectures can handle ill-posed and inverse problems and the emerging need for new frameworks for computational methods and standardised benchmarks. In general, physics-informed ML employs “first principles” from physics and integrates them into data-driven modelling techniques. In relation to this problem, Bradley et al. [116] state that hybrid methods combining first principles and data generally outperform purely data-driven models, but the interpretation and extrapolation of modelling results require particular attention. Gunnell et al. [117] present the current state and future directions for open-source software for equation-based and data-driven modelling and highlight an emerging trend of integration of data-driven and principles-based (i.e., physics-informed) tools.
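As a generic illustration of how first principles can enter a data-driven model, a physics-informed training objective is commonly written as a data misfit plus a weighted penalty on the residual of a governing equation. The expression below assumes, purely for illustration, a single-node thermal balance of a building as the governing equation; the symbols are assumptions and are not taken from the works cited: \hat{T}_\theta is the network prediction of indoor temperature, T_i are measurements, T_out is the outdoor temperature, R and C are a lumped thermal resistance and capacitance, Q is the heat input and \lambda is a weighting factor.

```latex
\mathcal{L}(\theta) =
\underbrace{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{T}_\theta(t_i) - T_i\right)^2}_{\text{data misfit}}
+ \lambda\,
\underbrace{\frac{1}{M}\sum_{j=1}^{M}\left(
C\,\frac{d\hat{T}_\theta}{dt}(t_j)
- \frac{T_{\mathrm{out}}(t_j) - \hat{T}_\theta(t_j)}{R}
- Q(t_j)\right)^2}_{\text{physics residual}}
```

The second term is evaluated at M collocation points that need not coincide with the N measurement times, which is one reason why such formulations can help when monitored data are scarce.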
In the specific context of building energy performance prediction, the problem of classifying the large variety of modelling approaches available has been described in extensive literature reviews [38,39,40,41], considering both “top-down” and “bottom-up” perspectives, as indicated in Section 3. The necessity of employing both “forward” and “inverse” modelling approaches in an integrated way has been discussed both by Tian et al. [118] and Tronchin et al. [119], respectively, in relation to the problem of uncertainty in building performance simulation and the issue of energy efficiency, demand side management and storage in the built environment. In Figure 2, a simple example of “forward” and “inverse” modelling integration is presented. Inverse modelling can exploit typically purely data-driven methods or hybrid physical–statistical approaches and constitutes a point of contact with the problems illustrated in Section 3.2.
Indeed, energy modelling in the built environment involves evaluations at multiple scales, as highlighted for example in UBEM research [120] in which data-driven modelling and forecasting [109] assume an increasingly important role. Another example in this sense is the problem of “zoning” in energy models. Building energy modelling tools, crucial for assessing energy efficiency, face difficulties in achieving precise thermal zoning (i.e., a rational subdivision of detailed building models into parts), a gap highlighted by Shin and Haberl [121]. There is currently a lack of information regarding the accuracy and reliability of zoning algorithms for HVAC design engineers. This highlights the necessity for future studies to improve and fine-tune existing methods for detailed energy modelling, in light of potential simplifications.
Regarding thermal models’ simplifications, Dogan and Reinhart suggest the use of “shoe-box” modelling techniques [122], which offer faster simulation results without compromising accuracy (within 5–10% with respect to models compliant with ASHRAE guidelines). These methods are particularly effective for modelling at the neighbourhood and urban scale. The shoe-boxing algorithm developed by Battini et al. [123], which has been tested in urban settings in three climatic conditions, has high accuracy in predicting thermal loads and indoor air temperatures and provides a considerable reduction in simulation time. Moreover, uncertainty in building energy modelling involves the data sources, the approaches used and the practical uses of simulation tools [118]. Chong et al. [124] highlight the difficulties involved in calibrating building simulations with actual energy measurements. Effectively handling the uncertainty in model inputs and model structure is essential when creating energy-related applications and highlights the importance of reproducible simulations. Among the recommendations for reproducibility, Chong et al. propose archetype generation with a segmentation of building typologies and characteristics.
Li et al. [125] discuss how grey-box modelling, employing RC formulations, provides benefits over black-box and white-box methods in different domains, such as district/urban-scale energy modelling and building–grid integration. The overview by Boodi et al. [126] demonstrates the versatility and effectiveness of creating thermal network models formulated according to the grey-box approach. While there is not a single standard approach for RC formulations for building energy modelling, two largely diffused approaches are the 5 Resistances and 1 Capacitance (5R1C) of International Standard ISO 13790 and the 7 Resistances and 2 Capacitances (7R2C) of the German Guideline VDI 6007. These approaches are compared by Vivian et al. [127], indicating how 7R2C is more accurate. Nonetheless, 5R1C can be improved compared to its original formulation [128,129] and validated with respect to benchmarks for dynamic simulation.
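To make the idea of an RC formulation concrete, the sketch below simulates the simplest possible thermal network, with one resistance and one capacitance (1R1C), using an explicit Euler scheme. It is deliberately much simpler than the 5R1C and 7R2C networks discussed above, and the parameter values and boundary conditions are hypothetical.

```python
def simulate_1r1c(t_out, q_heat, r, c, t0, dt=3600.0):
    """Explicit Euler integration of C dT/dt = (T_out - T) / R + Q.

    r in K/W, c in J/K, q_heat in W, dt in seconds.
    Returns the indoor temperature series (one value more than the inputs).
    """
    temps = [t0]
    for t_ext, q in zip(t_out, q_heat):
        t_in = temps[-1]
        temps.append(t_in + dt / c * ((t_ext - t_in) / r + q))
    return temps


# Hypothetical building: R = 0.005 K/W (200 W/K), C = 2e7 J/K, so R*C is about 28 h.
hours = 48
t_out = [0.0] * hours
free_float = simulate_1r1c(t_out, [0.0] * hours, r=0.005, c=2.0e7, t0=20.0)
heated = simulate_1r1c(t_out, [4000.0] * hours, r=0.005, c=2.0e7, t0=20.0)

print(f"Free-floating after 48 h: {free_float[-1]:.2f} °C")
print(f"With 4 kW heating after 48 h: {heated[-1]:.2f} °C")
```

With a 4 kW input the heat gain balances the losses at a 20 K temperature difference, so the indoor temperature is held constant, while without heating it decays towards the outdoor temperature at a rate set by the product R*C. The explicit scheme is adequate here because the time step is small compared with R*C.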
De Rosa et al. [130] examine the trade-off between the accuracy of modelling and the computational requirements, proposing a method for iteratively decreasing the complexity of simulation models while preserving the RC physical structure. Kircher and Zhang [131] thoroughly analyse RC models, emphasising their constraints from the point of view of thermal models’ formulation. Serale et al. [132] propose a standardised terminology for building control and review model predictive control (MPC) algorithms for managing building and HVAC systems, including grey-box formulations. Further, the comprehensive framework developed by Drgoňa et al. [133] for model predictive control (MPC) in buildings includes the possible use of RC models. This framework provides valuable information on control systems and communication infrastructures. Finally, Andriamamonjy’s [134] automated grey-box implementation employing BIM showcases the ability of BIM to support the process of gathering data from building management systems. This, in turn, can improve facility management and operations by providing real-time monitored data. Based on the previous considerations regarding the potential to streamline energy modelling methodologies while maintaining the model’s physical structure and interpretation, specific studies were chosen and examined, as indicated in Table 4.
The selected articles all employ a grey-box RC model and, in numerous instances, use the ISO 13790 5R1C model, which has been tailored to address specific issues. The models presented in the selected papers are chosen based on their suitability for various scales of application, ranging from individual buildings to urban planning problems. They are capable of simulating building energy behaviour at an hourly or sub-hourly resolution, can be validated and standardised, and can be easily calibrated due to the use of lumped parameter formulations. Additionally, they are suitable for uncertainty analysis, allowing for the execution of multiple simulation scenarios under uncertain conditions in a limited time. From a practical standpoint, the planning, design and operation phases are all considered, in order to emphasise a possible continuity in the application of grey-box models, in accordance with the idea of integrating forward and inverse modelling as illustrated in Figure 2.
As can be seen in Table 4, RC grey-box models have been used for a variety of purposes in planning–design phases (urban scale planning, building stock modelling, early-stage optimisation) and in operation (energy management, MPC control, grid interaction and flexibility, environmental monitoring, calibration under uncertainty) for more than 15 years. In most of the papers selected, grey-box formulations are used as “forward” modelling tools for building performance prediction, but the last three papers presented in Table 4 discuss the problem of calibration under uncertainty, i.e., “inverse” modelling, where measured data are used to determine the actual value of the lumped parameters to be used in “forward” modelling. Low-order models [126] in particular are easy to calibrate in an “inverse” way.
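A minimal sketch of “inverse” calibration is given below, assuming a single-resistance, single-capacitance (1R1C) model, C dT/dt = (T_out - T) / R + Q. Once discretised, the temperature increment is linear in two coefficients, so the lumped parameters R and C can be recovered from monitored series by ordinary least squares. The synthetic, noise-free data and all names are assumptions for illustration; with real measurements, noise and unmodelled gains would make the estimates uncertain.

```python
from math import pi, sin


def calibrate_1r1c(t_in, t_out, q_heat, dt=3600.0):
    """Estimate R (K/W) and C (J/K) from measured series by least squares.

    Uses the discretised model T[k+1] - T[k] = a * (T_out[k] - T[k]) + b * Q[k]
    with a = dt / (R * C) and b = dt / C.
    """
    x1 = [te - ti for te, ti in zip(t_out, t_in[:-1])]
    x2 = list(q_heat)
    y = [t_in[k + 1] - t_in[k] for k in range(len(t_in) - 1)]
    # Normal equations for a two-regressor model without intercept.
    s11 = sum(u * u for u in x1)
    s12 = sum(u * v for u, v in zip(x1, x2))
    s22 = sum(v * v for v in x2)
    s1y = sum(u * w for u, w in zip(x1, y))
    s2y = sum(v * w for v, w in zip(x2, y))
    det = s11 * s22 - s12 * s12
    a = (s1y * s22 - s2y * s12) / det
    b = (s2y * s11 - s1y * s12) / det
    return b / a, dt / b  # R, C


# Synthetic "measurements" generated with known (hypothetical) parameters.
R_TRUE, C_TRUE, DT = 0.005, 2.0e7, 3600.0
t_out = [5.0 + 5.0 * sin(2.0 * pi * k / 24.0) for k in range(72)]
q_heat = [3000.0 if k % 24 < 8 else 0.0 for k in range(72)]
t_in = [18.0]
for te, q in zip(t_out, q_heat):
    t_in.append(t_in[-1] + DT / C_TRUE * ((te - t_in[-1]) / R_TRUE + q))

r_est, c_est = calibrate_1r1c(t_in, t_out, q_heat, DT)
print(f"R = {r_est:.5f} K/W, C = {c_est:.3e} J/K")
```

The calibrated values can then be used in “forward” mode for prediction, which is the sense in which low-order models allow forward and inverse modelling to be combined.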
Given the versatile nature of these modelling methods, the potential for additional validation and standardisation and the simplicity of calibration, they provide a solid foundation for further exploration in the cutting-edge field of physics-informed machine learning in the future. The preservation of a physical structure and interpretation of a model can become particularly powerful in relation to the principles outlined in Section 3.1. At the same time, the possibility of employing both open data [159] and synthetic data, where necessary, is opening new frontiers for the research considering interoperable [160] linked open data at multiple scales [161] and across different domains of knowledge in the built environment [162], including comfort [163] and user behaviour [164].

3.4. Limitations, Summary of Findings and Further Work

The limitations of this literature review are primarily determined by the challenge of conducting an analysis that encompasses various levels of the relationship between interpretability as a concept and its practical applications in the built environment for energy modelling purposes. These applications involve making use of interpretability across different temporal and spatial scales of analysis. Further, two aspects emerging from the literature review by Chen et al. [68] are particularly interesting: the lack of consensus around the definitions of interpretability and explainability (discussed in Section 3.1) and the limitations of post hoc techniques such as SHAP and LIME (discussed in Sections 3.1 and 3.2). At the same time, the importance of using intrinsically (ante hoc) interpretable techniques is highlighted in other reviews [72,75,76] on data-driven methods for energy modelling in buildings (discussed in Section 3.2). Additionally, recently developed research and software tools [113] demonstrate the possibility of applying intrinsically interpretable techniques for the calculation of heating and cooling demand at hourly resolution on a global scale and provide results comparable, at least in principle, to those of grey-box dynamic simulation tools (discussed in Section 3.3).
Indeed, the objective of the research was to highlight critical problems that are interconnected in the rapidly developing domain of data-driven methods and to identify knowledge gaps that can inspire further research. Moreover, the choice of papers in this review underscores the need to develop connections between fast-evolving fields of research and established procedures and tools that have been rigorously verified both conceptually and practically through field studies. The critical connections and knowledge gaps identified are summarised in Table 5 hereafter for all the analytical levels considered in the paper.
The connections and gaps identified indicate the possibility of leveraging consolidated knowledge even in a very fast-moving research landscape such as that of data-driven methods for energy applications. Future research can integrate concepts from both measurement and verification (M&V) and grey-box RC building models to accomplish multiple objectives while maintaining physical interpretation and transparency, to enable human oversight (intelligibility of models). This addresses one of the challenges identified by Rudin et al. [36] and could also become a normative requirement, as discussed in Section 3.1. One of the main issues is that black-box techniques like deep learning and random forest (among others) are often considered more accurate (and hence preferred in many cases) than other models, but they are less transparent. However, Rudin and Radin [50] noted that interpretable models like regressions and trees may be as accurate as black-box techniques in many cases. This clearly depends on the specific application considered. Following the advice provided by Petch et al. [51], it could make sense to use both intrinsically interpretable and black-box explainable methods, analysing the trade-off between accuracy and interpretability afterwards. In general, the combined use of multiple data-driven modelling approaches, ideally leveraging physical interpretation and constraints, could enable more robust performance estimates.

4. Conclusions

The energy transition’s acceleration poses numerous socio-technical challenges. To fulfil the increasing requirements for sustainability and efficiency in the built environment, it is essential to undertake both technological improvements and significant behavioural changes. Digital twins (DTs), enhanced with artificial intelligence (AI) and machine learning (ML) technologies, are becoming valuable tools in the energy transition, providing useful insights into the performance of buildings throughout different stages of their life cycle. Nevertheless, the deployment of these technologies entails ethical problems, in relation to the principle of explicability (implying fundamental questions such as “how does it work?” and “who is responsible for the results?”). From a technological perspective, the notions of interpretability and explainability are crucial and should be carefully considered in combination with domain-specific knowledge. This integration is vital to ensure that the selection of modelling methods is both appropriate and effective for the specific challenges which are present in the built environment. The literature review conducted in this paper is structured into three analytical levels, each one providing a different lens through which to view the challenges and opportunities presented by AI/ML in the built environment. In the first level, the broader implications of interpretability in machine learning within the existing research debate are examined. The fundamental understanding of this concept has substantial implications for the field of built environment, where the demand for AI/ML solutions that are both interpretable and ethical is becoming more and more relevant. The second level of study explores these implications and indicates the potential of interpretable data-driven methods to act as a connection between data-driven and physics-driven modelling approaches, which are analysed in the third level. 
Overall, the literature review illustrated the importance of employing intrinsically interpretable techniques and their applicability in energy calculations in buildings up to an hourly resolution. These techniques can provide results that are, in principle, compatible with grey-box dynamic simulation tools that combine physical and statistical principles. The synergy between data-driven and physics-driven methods is crucial for tackling practical applications in the energy transition, ideally exploiting an increasing integration of forward and inverse modelling practices, with the goal of continuous performance improvement. Additionally, the three-level analysis indicates prospective areas for further investigation. These areas of research show the potential for further development of practical AI/ML applications while providing a strong ethical foundation for them. Overall, this review underscores the importance of developing and implementing AI/ML solutions that are not only technically sound but also ethically responsible and interpretable. These features appear to be particularly valuable in relation to the problem of accelerating the energy transition in the built environment from present to future perspectives.

Author Contributions

Conceptualisation, M.M., K.M.G.-C. and P.A.B.J.; methodology, M.M. and K.M.G.-C.; investigation, M.M., K.M.G.-C. and P.A.B.J.; writing—original draft preparation, M.M.; writing—review and editing, M.M. and K.M.G.-C. All authors have read and agreed to the published version of the manuscript.


Funding

This research received no external funding.


Acknowledgments

This work is part of the activity of the Energy & Climate Change Division, Sustainable Energy Research Group at the University of Southampton: “ (accessed on 8 February 2024)”. The research is also part of the Sustainability Strategy 2020–2025 of the University of Southampton: “ (accessed on 8 February 2024)”.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.


  1. Geels, F.W.; Turnheim, B. The Great Reconfiguration—A Socio-Technical Analysis of Low-Carbon Transitions in UK Electricity, Heat, and Mobility Systems; Cambridge University Press: Cambridge, UK, 2022. [Google Scholar]
  2. Wahlund, M.; Palm, J. The Role of Energy Democracy and Energy Citizenship for Participatory Energy Transitions: A Comprehensive Review. Energy Res. Soc. Sci. 2022, 87, 102482. [Google Scholar] [CrossRef]
  3. Skjølsvold, T.M.; Coenen, L. Are Rapid and Inclusive Energy and Climate Transitions Oxymorons? Towards Principles of Responsible Acceleration. Energy Res. Soc. Sci. 2021, 79, 102164. [Google Scholar] [CrossRef]
  4. Bray, R.; Mejía Montero, A.; Ford, R. Skills Deployment for a ‘Just’ Net Zero Energy Transition. Environ. Innov. Soc. Transit. 2022, 42, 395–410. [Google Scholar] [CrossRef]
  5. Dall-Orsoletta, A.; Cunha, J.; Araújo, M.; Ferreira, P. A Systematic Review of Social Innovation and Community Energy Transitions. Energy Res. Soc. Sci. 2022, 88, 102625. [Google Scholar] [CrossRef]
  6. Nastasi, B.; Mazzoni, S. Renewable Hydrogen Energy Communities Layouts towards Off-Grid Operation. Energy Convers. Manag. 2023, 291, 117293. [Google Scholar] [CrossRef]
  7. Nastasi, B.; Markovska, N.; Puksec, T.; Duić, N.; Foley, A. Techniques and Technologies to Board on the Feasible Renewable and Sustainable Energy Systems. Renew. Sustain. Energy Rev. 2023, 182, 113428. [Google Scholar] [CrossRef]
  8. Bellocchi, S.; Colbertaldo, P.; Manno, M.; Nastasi, B. Assessing the Effectiveness of Hydrogen Pathways: A Techno-Economic Optimisation within an Integrated Energy System. Energy 2023, 263, 126017. [Google Scholar] [CrossRef]
  9. Way, R.; Ives, M.C.; Mealy, P.; Farmer, J.D. Empirically Grounded Technology Forecasts and the Energy Transition. Joule 2022, 6, 2057–2082. [Google Scholar] [CrossRef]
  10. Noussan, M.; Nastasi, B. Data Analysis of Heating Systems for Buildings—A Tool for Energy Planning, Policies and Systems Simulation. Energies 2018, 11, 233. [Google Scholar] [CrossRef]
  11. Fenner, A.E.; Kibert, C.J.; Woo, J.; Morque, S.; Razkenari, M.; Hakim, H.; Lu, X. The Carbon Footprint of Buildings: A Review of Methodologies and Applications. Renew. Sustain. Energy Rev. 2018, 94, 1142–1152. [Google Scholar] [CrossRef]
  12. Li, Y.L.; Han, M.Y.; Liu, S.Y.; Chen, G.Q. Energy Consumption and Greenhouse Gas Emissions by Buildings: A Multi-Scale Perspective. Build. Environ. 2019, 151, 240–250. [Google Scholar] [CrossRef]
  13. Berardi, U. A Cross-Country Comparison of the Building Energy Consumptions and Their Trends. Resour. Conserv. Recycl. 2017, 123, 230–241. [Google Scholar] [CrossRef]
  14. Huang, L.; Krigsvoll, G.; Johansen, F.; Liu, Y.; Zhang, X. Carbon Emission of Global Construction Sector. Renew. Sustain. Energy Rev. 2018, 81, 1906–1916. [Google Scholar] [CrossRef]
  15. A European Green Deal. Available online: (accessed on 8 February 2024).
  16. European Commission. A Renovation Wave for Europe-Greening Our Buildings, Creating Jobs, Improving Life. COM(2020)662. Available online: (accessed on 8 February 2024).
  17. European Commission. Energy Roadmap 2050-Impact Assessment and Scenario Analysis. 2012. Available online: (accessed on 8 February 2024).
  18. European Commission. Directive (EU) 2018/2001 of the European Parliament and of the Council of 11 December 2018 on the Promotion of the Use of Energy from Renewable Sources (Recast). 2018. Available online: (accessed on 8 February 2024).
  19. Ferrara, M.; Monetti, V.; Fabrizio, E. Cost-Optimal Analysis for Nearly Zero Energy Buildings Design and Optimization: A Critical Review. Energies 2018, 11, 1478. [Google Scholar] [CrossRef]
  20. Zangheri, P.; D’Agostino, D.; Armani, R.; Bertoldi, P. Review of the Cost-Optimal Methodology Implementation in Member States in Compliance with the Energy Performance of Buildings Directive. Buildings 2022, 12, 1482. [Google Scholar] [CrossRef]
  21. Akhimien, N.G.; Latif, E.; Hou, S.S. Application of Circular Economy Principles in Buildings: A Systematic Review. J. Build. Eng. 2021, 38, 102041. [Google Scholar] [CrossRef]
  22. Lehmann, H.; Hinske, C.; de Margerie, V.; Slaveikova Nikolova, A. The Impossibilities of the Circular Economy: Separating Aspirations from Reality; Taylor & Francis: London, UK, 2023. [Google Scholar]
  23. Seyedabadi, M.R.; Samareh Abolhassani, S.; Eicker, U. District Cradle to Grave LCA Including the Development of a Localized Embodied Carbon Database and a Detailed End-of-Life Carbon Emission Workflow. J. Build. Eng. 2023, 76, 107101. [Google Scholar] [CrossRef]
  24. Nebel, B. Cradle to Cradle, LCA and Circular Economy: A Love Triangle. NZ Manuf. Mag. 2020, 23. Available online: (accessed on 8 February 2024).
  25. Borkowski, A.S. A Literature Review of BIM Definitions: Narrow and Broad Views. Technologies 2023, 11, 176. [Google Scholar] [CrossRef]
  26. Borkowski, A.S. Evolution of BIM: Epistemology, Genesis and Division into Periods. J. Inf. Technol. Constr. 2023, 28, 646–661. [Google Scholar] [CrossRef]
  27. Deng, M.; Menassa, C.C.; Kamat, V.R. From BIM to Digital Twins: A Systematic Review of the Evolution of Intelligent Building Representations in the AEC-FM Industry. J. Inf. Technol. Constr. 2021, 26, 58–83. [Google Scholar] [CrossRef]
  28. Chen, Z.; Huang, L. Digital Twin in Circular Economy: Remanufacturing in Construction. IOP Conf. Ser. Earth Environ. Sci. 2020, 588, 32014. [Google Scholar] [CrossRef]
  29. de Wilde, P. Building Performance Simulation in the Brave New World of Artificial Intelligence and Digital Twins: A Systematic Review. Energy Build. 2023, 292, 113171. [Google Scholar] [CrossRef]
  30. IEA-EBC Data-Driven Smart Buildings: State-of-the-Art Review—Annex 81. 2023, pp. 1–103. Available online: (accessed on 8 February 2024).
  31. Rosenow, J.; Eyre, N. Reinventing Energy Efficiency for Net Zero. Energy Res. Soc. Sci. 2022, 90, 102602. [Google Scholar] [CrossRef]
  32. Baniassadi, A.; Heusinger, J.; Gonzalez, P.I.; Weber, S.; Samuelson, H.W. Co-Benefits of Energy Efficiency in Residential Buildings. Energy 2022, 238, 121768. [Google Scholar] [CrossRef]
  33. Regulatory Framework Proposal on Artificial Intelligence. Available online: (accessed on 30 December 2023).
  34. Molnar, C.; Casalicchio, G.; Bischl, B. Interpretable Machine Learning—A Brief History, State-of-the-Art and Challenges. In Communications in Computer and Information Science; Springer International Publishing: Berlin/Heidelberg, Germany, 2020; pp. 417–431. ISBN 9783030659653. [Google Scholar]
  35. Interpretable Machine Learning, Section 3.2 Taxonomy of Interpretability Methods, Christopher Molnar. Available online: (accessed on 24 May 2023).
  36. Rudin, C.; Chen, C.; Chen, Z.; Huang, H.; Semenova, L.; Zhong, C. Interpretable Machine Learning: Fundamental Principles and 10 Grand Challenges. Stat. Surv. 2022, 16, 1–85.
  37. Watson, D.S. Conceptual Challenges for Interpretable Machine Learning. Synthese 2022, 200, 65.
  38. Kavgic, M.; Mavrogianni, A.; Mumovic, D.; Summerfield, A.; Stevanovic, Z.; Djurovic-Petrovic, M. A Review of Bottom-up Building Stock Models for Energy Consumption in the Residential Sector. Build. Environ. 2010, 45, 1683–1697.
  39. Foucquier, A.; Robert, S.; Suard, F.; Stéphan, L.; Jay, A. State of the Art in Building Modelling and Energy Performances Prediction: A Review. Renew. Sustain. Energy Rev. 2013, 23, 272–288.
  40. Fumo, N. A Review on the Basics of Building Energy Estimation. Renew. Sustain. Energy Rev. 2014, 31, 53–60.
  41. Chen, Y.; Guo, M.; Chen, Z.; Chen, Z.; Ji, Y. Physical Energy and Data-Driven Models in Building Energy Prediction: A Review. Energy Rep. 2022, 8, 2656–2671.
  42. Hong, S.-M.; Paterson, G.; Burman, E.; Steadman, P.; Mumovic, D. A Comparative Study of Benchmarking Approaches for Non-Domestic Buildings: Part 1—Top-down Approach. Int. J. Sustain. Built Environ. 2013, 2, 119–130.
  43. Burman, E.; Hong, S.-M.; Paterson, G.; Kimpian, J.; Mumovic, D. A Comparative Study of Benchmarking Approaches for Non-Domestic Buildings: Part 2—Bottom-up Approach. Int. J. Sustain. Built Environ. 2014, 3, 247–261.
  44. Kheiri, F.; Haberl, J.S.; Baltazar, J.-C. Split-Degree Day Method: A Novel Degree Day Method for Improving Building Energy Performance Estimation. Energy Build. 2023, 289, 113034.
  45. Gallese, C. The AI Act Proposal: A New Right to Technical Interpretability? arXiv 2023, arXiv:2303.17558.
  46. Panigutti, C.; Hamon, R.; Hupont, I.; Fernandez Llorca, D.; Fano Yela, D.; Junklewitz, H.; Scalzo, S.; Mazzini, G.; Sanchez, I.; Soler Garrido, J.; et al. The Role of Explainable AI in the Context of the AI Act. In Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, Chicago, IL, USA, 12–15 June 2023; pp. 1139–1150.
  47. Gryz, J.; Rojszczak, M. Black Box Algorithms and the Rights of Individuals: No Easy Solution to the “Explainability” Problem. Internet Policy Rev. 2021, 10, 1–24.
  48. Rudin, C. Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead. Nat. Mach. Intell. 2019, 1, 206–215.
  49. Flora, M.; Potvin, C.; McGovern, A.; Handler, S. Comparing Explanation Methods for Traditional Machine Learning Models Part 1: An Overview of Current Methods and Quantifying Their Disagreement. arXiv 2022, arXiv:2211.08943.
  50. Rudin, C.; Radin, J. Why Are We Using Black Box Models in AI When We Don’t Need to? A Lesson from an Explainable AI Competition. Harv. Data Sci. Rev. 2019, 1, 1–9.
  51. Petch, J.; Di, S.; Nelson, W. Opening the Black Box: The Promise and Limitations of Explainable Machine Learning in Cardiology. Can. J. Cardiol. 2022, 38, 204–213.
  52. Roberts, H.; Cowls, J.; Hine, E.; Mazzi, F.; Tsamados, A.; Taddeo, M.; Floridi, L. Achieving a ‘Good AI Society’: Comparing the Aims and Progress of the EU and the US. Sci. Eng. Ethics 2021, 27, 68.
  53. Floridi, L.; Cowls, J. A Unified Framework of Five Principles for AI in Society. Harv. Data Sci. Rev. 2019, 1.
  54. Semeraro, C.; Lezoche, M.; Panetto, H.; Dassisti, M. Digital Twin Paradigm: A Systematic Literature Review. Comput. Ind. 2021, 130, 103469.
  55. Dalibor, M.; Jansen, N.; Rumpe, B.; Schmalzing, D.; Wachtmeister, L.; Wimmer, M.; Wortmann, A. A Cross-Domain Systematic Mapping Study on Software Engineering for Digital Twins. J. Syst. Softw. 2022, 193, 111361.
  56. Wright, L.; Davidson, S. How to Tell the Difference between a Model and a Digital Twin. Adv. Model. Simul. Eng. Sci. 2020, 7, 13.
  57. Schmidt, M.; Åhlund, C. Smart Buildings as Cyber-Physical Systems: Data-Driven Predictive Control Strategies for Energy Efficiency. Renew. Sustain. Energy Rev. 2018, 90, 742–756.
  58. de Wilde, P. The Gap between Predicted and Measured Energy Performance of Buildings: A Framework for Investigation. Autom. Constr. 2014, 41, 40–49.
  59. Imam, S.; Coley, D.A.; Walker, I. The Building Performance Gap: Are Modellers Literate? Build. Serv. Eng. Res. Technol. 2017, 38, 351–375.
  60. de Wilde, P. The Building Performance Gap: Are Modellers Literate? Build. Serv. Eng. Res. Technol. 2017, 38, 757–759.
  61. Doan, D.; Ghaffarianhoseini, A.; Naismith, N.; Zhang, T.; Tookey, T. What Is BIM?: A Need for a Unique BIM Definition. In Proceedings of the IConBEE2018: Inaugural International Conference on the Built Environment and Engineering, EDP Sciences, Johor, Malaysia, 29–30 October 2018; p. 88.
  62. Opoku, D.-G.J.; Perera, S.; Osei-Kyei, R.; Rashidi, M. Digital Twin Application in the Construction Industry: A Literature Review. J. Build. Eng. 2021, 40, 102726.
  63. Borth, M.; Verriet, J.; Muller, G. Digital Twin Strategies for SoS 4 Challenges and 4 Architecture Setups for Digital Twins of SoS. In Proceedings of the 2019 14th Annual Conference System of Systems Engineering (SoSE), Anchorage, AK, USA, 19–22 May 2019; pp. 164–169.
  64. Bjørnskov, J.; Jradi, M. An Ontology-Based Innovative Energy Modeling Framework for Scalable and Adaptable Building Digital Twins. Energy Build. 2023, 292, 113146.
  65. Ammar, A.; Nassereddine, H.; AbdulBaky, N.; AbouKansour, A.; Tannoury, J.; Urban, H.; Schranz, C. Digital Twins in the Construction Industry: A Perspective of Practitioners and Building Authority. Front. Built Environ. 2022, 8, 834671.
  66. Yu, W.; Patros, P.; Young, B.; Klinac, E.; Walmsley, T.G. Energy Digital Twin Technology for Industrial Energy Management: Classification, Challenges and Future. Renew. Sustain. Energy Rev. 2022, 161, 112407.
  67. Manfren, M.; Tagliabue, L.C.; Re Cecconi, F.; Ricci, M. Long-Term Techno-Economic Performance Monitoring to Promote Built Environment Decarbonisation and Digital Transformation—A Case Study. Sustainability 2022, 14, 644.
  68. Chen, Z.; Xiao, F.; Guo, F.; Yan, J. Interpretable Machine Learning for Building Energy Management: A State-of-the-Art Review. Adv. Appl. Energy 2023, 9, 100123.
  69. Qaisar, I.; Zhao, Q. Energy Baseline Prediction for Buildings: A Review. Results Control Optim. 2022, 7, 100129.
  70. Afroz, Z.; Burak Gunay, H.; O’Brien, W.; Newsham, G.; Wilton, I. An Inquiry into the Capabilities of Baseline Building Energy Modelling Approaches to Estimate Energy Savings. Energy Build. 2021, 244, 111054.
  71. Grillone, B.; Mor, G.; Danov, S.; Cipriano, J.; Lazzari, F.; Sumper, A. Baseline Energy Use Modeling and Characterization in Tertiary Buildings Using an Interpretable Bayesian Linear Regression Methodology. Energies 2021, 14, 5556.
  72. Fu, H.; Baltazar, J.-C.; Claridge, D.E. Review of Developments in Whole-Building Statistical Energy Consumption Models for Commercial Buildings. Renew. Sustain. Energy Rev. 2021, 147, 111248.
  73. Kim, H.; Haberl, J. Field-Test of the ASHRAE/CIBSE/USGBC Performance Measurement Protocols: Part I Intermediate Level Energy Protocols. Sci. Technol. Built Environ. 2018, 24, 281–297.
  74. Kim, H.; Haberl, J. Field-Test of the ASHRAE/CIBSE/USGBC Performance Measurement Protocols: Part II Advanced Level Energy Protocols. Sci. Technol. Built Environ. 2018, 24, 298–315.
  75. Grillone, B.; Danov, S.; Sumper, A.; Cipriano, J.; Mor, G. A Review of Deterministic and Data-Driven Methods to Quantify Energy Efficiency Savings and to Predict Retrofitting Scenarios in Buildings. Renew. Sustain. Energy Rev. 2020, 131, 110027.
  76. Alrobaie, A.; Krarti, M. A Review of Data-Driven Approaches for Measurement and Verification Analysis of Building Energy Retrofits. Energies 2022, 15, 7824.
  77. Grillone, B.; Mor, G.; Danov, S.; Cipriano, J.; Sumper, A. A Data-Driven Methodology for Enhanced Measurement and Verification of Energy Efficiency Savings in Commercial Buildings. Appl. Energy 2021, 301, 117502.
  78. Manfren, M.; Nastasi, B.; Tronchin, L. Linking Design and Operation Phase Energy Performance Analysis Through Regression-Based Approaches. Front. Energy Res. 2020, 8, 557649.
  79. Manfren, M.; Nastasi, B.; Tronchin, L.; Groppi, D.; Garcia, D.A. Techno-Economic Analysis and Energy Modelling as a Key Enablers for Smart Energy Services and Technologies in Buildings. Renew. Sustain. Energy Rev. 2021, 150, 111490.
  80. Manfren, M.; Sibilla, M.; Tronchin, L. Energy Modelling and Analytics in the Built Environment—A Review of Their Role for Energy Transitions in the Construction Sector. Energies 2021, 14, 679.
  81. ECAM 7.0. Available online: (accessed on 8 February 2024).
  82. CalTRACK CalTRACK Methods. Available online: (accessed on 30 December 2023).
  83. RMV2.0—LBNL M&V2.0 Tool. Available online: (accessed on 30 December 2023).
  84. NMECR. Available online: (accessed on 30 December 2023).
  85. Østergård, T.; Jensen, R.L.; Maagaard, S.E. A Comparison of Six Metamodeling Techniques Applied to Building Performance Simulations. Appl. Energy 2018, 211, 89–103.
  86. Li, D.H.W.; Chen, W.; Li, S.; Lou, S. Estimation of Hourly Global Solar Radiation Using Multivariate Adaptive Regression Spline (MARS)—A Case Study of Hong Kong. Energy 2019, 186, 115857.
  87. Wang, Z.; Chen, Y. Data-Driven Modeling of Building Thermal Dynamics: Methodology and State of the Art. Energy Build. 2019, 203, 109405.
  88. Khamma, T.R.; Zhang, Y.; Guerrier, S.; Boubekri, M. Generalized Additive Models: An Efficient Method for Short-Term Energy Prediction in Office Buildings. Energy 2020, 213, 118834.
  89. Li, K.; Sun, Y.; Robinson, D.; Ma, J.; Ma, Z. A New Strategy to Benchmark and Evaluate Building Electricity Usage Using Multiple Data Mining Technologies. Sustain. Energy Technol. Assess. 2020, 40, 100770.
  90. Feng, Y.; Duan, Q.; Chen, X.; Yakkali, S.S.; Wang, J. Space Cooling Energy Usage Prediction Based on Utility Data for Residential Buildings Using Machine Learning Methods. Appl. Energy 2021, 291, 116814.
  91. Zhang, C.; Tian, X.; Zhao, Y.; Li, T.; Zhou, Y.; Zhang, X. Causal Discovery-Based External Attention in Neural Networks for Accurate and Reliable Fault Detection and Diagnosis of Building Energy Systems. Build. Environ. 2022, 222, 109357.
  92. Ding, Y.; Zhang, D.; Lv, J. Comparison of the Applicability of City-Level Building Energy Consumption Quota Methods. Energy Build. 2022, 261, 111933.
  93. Chen, S.; Zhou, X.; Zhou, G.; Fan, C.; Ding, P.; Chen, Q. An Online Physical-Based Multiple Linear Regression Model for Building’s Hourly Cooling Load Prediction. Energy Build. 2022, 254, 111574.
  94. Liu, X.; Tang, H.; Ding, Y.; Yan, D. Investigating the Performance of Machine Learning Models Combined with Different Feature Selection Methods to Estimate the Energy Consumption of Buildings. Energy Build. 2022, 273, 112408.
  95. Yue, N.; Caini, M.; Li, L.; Zhao, Y.; Li, Y. A Comparison of Six Metamodeling Techniques Applied to Multi Building Performance Vectors Prediction on Gymnasiums under Multiple Climate Conditions. Appl. Energy 2023, 332, 120481.
  96. Manfren, M.; James, P.A.B.; Tronchin, L. Data-Driven Building Energy Modelling—An Analysis of the Potential for Generalisation through Interpretable Machine Learning. Renew. Sustain. Energy Rev. 2022, 167, 112686.
  97. Wang, E. Decomposing Core Energy Factor Structure of U.S. Residential Buildings through Principal Component Analysis with Variable Clustering on High-Dimensional Mixed Data. Appl. Energy 2017, 203, 858–873.
  98. Shen, Y.; Pan, Y. BIM-Supported Automatic Energy Performance Analysis for Green Building Design Using Explainable Machine Learning and Multi-Objective Optimization. Appl. Energy 2023, 333, 120575.
  99. Wang, J.; Mae, M.; Taniguchi, K. Uncertainty Modeling Method of Weather Elements Based on Deep Learning for Robust Solar Energy Generation of Building. Energy Build. 2022, 266, 112115.
  100. Chen, J.; Zhang, L.; Li, Y.; Shi, Y.; Gao, X.; Hu, Y. A Review of Computing-Based Automated Fault Detection and Diagnosis of Heating, Ventilation and Air Conditioning Systems. Renew. Sustain. Energy Rev. 2022, 161, 112395.
  101. Zhang, C.; Li, J.; Zhao, Y.; Li, T.; Chen, Q.; Zhang, X. A Hybrid Deep Learning-Based Method for Short-Term Building Energy Load Prediction Combined with an Interpretation Process. Energy Build. 2020, 225, 110301.
  102. Gao, Y.; Ruan, Y. Interpretable Deep Learning Model for Building Energy Consumption Prediction Based on Attention Mechanism. Energy Build. 2021, 252, 111379.
  103. Li, A.; Xiao, F.; Zhang, C.; Fan, C. Attention-Based Interpretable Neural Network for Building Cooling Load Prediction. Appl. Energy 2021, 299, 117238.
  104. Li, G.; Li, F.; Xu, C.; Fang, X. A Spatial-Temporal Layer-Wise Relevance Propagation Method for Improving Interpretability and Prediction Accuracy of LSTM Building Energy Prediction. Energy Build. 2022, 271, 112317.
  105. Gokhale, G.; Claessens, B.; Develder, C. Physics Informed Neural Networks for Control Oriented Thermal Modeling of Buildings. Appl. Energy 2022, 314, 118852.
  106. Lu, J.; Zhang, C.; Li, J.; Zhao, Y.; Qiu, W.; Li, T.; Zhou, K.; He, J. Graph Convolutional Networks-Based Method for Estimating Design Loads of Complex Buildings in the Preliminary Design Stage. Appl. Energy 2022, 322, 119478.
  107. Choi, S.Y.; Kim, S.H. Selection of a Transparent Meta-Model Algorithm for Feasibility Analysis Stage of Energy Efficient Building Design: Clustering vs. Tree. Energies 2022, 15, 6620.
  108. Wang, R.; Lu, S.; Li, Q. Multi-Criteria Comprehensive Study on Predictive Algorithm of Hourly Heating Energy Consumption for Residential Buildings. Sustain. Cities Soc. 2019, 49, 101623.
  109. Kazmi, H.; Fu, C.; Miller, C. Ten Questions Concerning Data-Driven Modelling and Forecasting of Operational Energy Demand at Building and Urban Scale. Build. Environ. 2023, 239, 110407.
  110. Manfren, M.; Nastasi, B. Interpretable Data-Driven Building Load Profiles Modelling for Measurement and Verification 2.0. Energy 2023, 283, 128490.
  111. Manfren, M.; James, P.A.B.; Aragon, V.; Tronchin, L. Lean and Interpretable Digital Twins for Building Energy Monitoring—A Case Study with Smart Thermostatic Radiator Valves and Gas Absorption Heat Pumps. Energy AI 2023, 14, 100304.
  112. Nastasi, B.; Manfren, M.; Groppi, D.; Lamagna, M.; Mancini, F.; Astiaso Garcia, D. Data-Driven Load Profile Modelling for Advanced Measurement and Verification (M&V) in a Fully Electrified Building. Build. Environ. 2022, 221, 109279.
  113. Staffell, I.; Pfenninger, S.; Johnson, N. A Global Model of Hourly Space Heating and Cooling Demand at Multiple Spatial Scales. Nat. Energy 2023, 8, 1328–1344.
  114. Manfren, M.; Tommasino, M.C.; Tronchin, L. Data-Driven Building Energy Modelling—Generalisation Potential of Energy Signatures through Interpretable Machine Learning. In Proceedings of the Building Simulation Applications—BSA 2022, Bozen-Bolzano, Italy, 29 June–1 July 2022; Available online: (accessed on 8 February 2024).
  115. Karniadakis, G.E.; Kevrekidis, I.G.; Lu, L.; Perdikaris, P.; Wang, S.; Yang, L. Physics-Informed Machine Learning. Nat. Rev. Phys. 2021, 3, 422–440.
  116. Bradley, W.; Kim, J.; Kilwein, Z.; Blakely, L.; Eydenberg, M.; Jalvin, J.; Laird, C.; Boukouvala, F. Perspectives on the Integration between First-Principles and Data-Driven Modeling. Comput. Chem. Eng. 2022, 166, 107898.
  117. Gunnell, L.; Nicholson, B.; Hedengren, J.D. Equation-Based and Data-Driven Modeling: Open-Source Software Current State and Future Directions. Comput. Chem. Eng. 2024, 181, 108521.
  118. Tian, W.; Heo, Y.; de Wilde, P.; Li, Z.; Yan, D.; Park, C.S.; Feng, X.; Augenbroe, G. A Review of Uncertainty Analysis in Building Energy Assessment. Renew. Sustain. Energy Rev. 2018, 93, 285–301.
  119. Tronchin, L.; Manfren, M.; Nastasi, B. Energy Efficiency, Demand Side Management and Energy Storage Technologies—A Critical Analysis of Possible Paths of Integration in the Built Environment. Renew. Sustain. Energy Rev. 2018, 95, 341–353.
  120. Hong, T.; Chen, Y.; Luo, X.; Luo, N.; Lee, S.H. Ten Questions on Urban Building Energy Modeling. Build. Environ. 2020, 168, 106508.
  121. Shin, M.; Haberl, J.S. Thermal Zoning for Building HVAC Design and Energy Simulation: A Literature Review. Energy Build. 2019, 203, 109429.
  122. Dogan, T.; Reinhart, C. Shoeboxer: An Algorithm for Abstracted Rapid Multi-Zone Urban Building Energy Model Generation and Simulation. Energy Build. 2017, 140, 140–153.
  123. Battini, F.; Pernigotto, G.; Gasparella, A. A Shoeboxing Algorithm for Urban Building Energy Modeling: Validation for Stand-Alone Buildings. Sustain. Cities Soc. 2023, 89, 104305.
  124. Chong, A.; Gu, Y.; Jia, H. Calibrating Building Energy Simulation Models: A Review of the Basics to Guide Future Work. Energy Build. 2021, 253, 111533.
  125. Li, Y.; O’Neill, Z.; Zhang, L.; Chen, J.; Im, P.; DeGraw, J. Grey-Box Modeling and Application for Building Energy Simulations—A Critical Review. Renew. Sustain. Energy Rev. 2021, 146, 111174.
  126. Boodi, A.; Beddiar, K.; Amirat, Y.; Benbouzid, M. Building Thermal-Network Models: A Comparative Analysis, Recommendations, and Perspectives. Energies 2022, 15, 1328.
  127. Vivian, J.; Zarrella, A.; Emmi, G.; De Carli, M. An Evaluation of the Suitability of Lumped-Capacitance Models in Calculating Energy Needs and Thermal Behaviour of Buildings. Energy Build. 2017, 150, 447–465.
  128. Michalak, P. The Development and Validation of the Linear Time Varying Simulink-Based Model for the Dynamic Simulation of the Thermal Performance of Buildings. Energy Build. 2017, 141, 333–340.
  129. Michalak, P. A Thermal Network Model for the Dynamic Simulation of the Energy Performance of Buildings with the Time Varying Ventilation Flow. Energy Build. 2019, 202, 109337.
  130. De Rosa, M.; Brennenstuhl, M.; Andrade Cabrera, C.; Eicker, U.; Finn, D.P. An Iterative Methodology for Model Complexity Reduction in Residential Building Simulation. Energies 2019, 12, 2448.
  131. Kircher, K.J.; Max Zhang, K. On the Lumped Capacitance Approximation Accuracy in RC Network Building Models. Energy Build. 2015, 108, 454–462.
  132. Serale, G.; Fiorentini, M.; Capozzoli, A.; Bernardini, D.; Bemporad, A. Model Predictive Control (MPC) for Enhancing Building and HVAC System Energy Efficiency: Problem Formulation, Applications and Opportunities. Energies 2018, 11, 631.
  133. Drgoňa, J.; Arroyo, J.; Cupeiro Figueroa, I.; Blum, D.; Arendt, K.; Kim, D.; Ollé, E.P.; Oravec, J.; Wetter, M.; Vrabie, D.L.; et al. All You Need to Know about Model Predictive Control for Buildings. Annu. Rev. Control 2020, 50, 190–232.
  134. Andriamamonjy, A.; Klein, R.; Saelens, D. Automated Grey Box Model Implementation Using BIM and Modelica. Energy Build. 2019, 188–189, 209–225.
  135. Kämpf, J.H.; Robinson, D. A Simplified Thermal Model to Support Analysis of Urban Resource Flows. Energy Build. 2007, 39, 445–453.
  136. Fonseca, J.A.; Schlueter, A. Integrated Model for Characterization of Spatiotemporal Building Energy Consumption Patterns in Neighborhoods and City Districts. Appl. Energy 2015, 142, 247–265.
  137. Prataviera, E.; Romano, P.; Carnieletto, L.; Pirotti, F.; Vivian, J.; Zarrella, A. EUReCA: An Open-Source Urban Building Energy Modelling Tool for the Efficient Evaluation of Cities Energy Demand. Renew. Energy 2021, 173, 544–560.
  138. Fischer, D.; Wolf, T.; Scherer, J.; Wille-Haussmann, B. A Stochastic Bottom-up Model for Space Heating and Domestic Hot Water Load Profiles for German Households. Energy Build. 2016, 124, 120–128.
  139. Koene, F.G.H.F.; Eslami-Mossallam, B.B. Space Heating Demand Profiles of Districts Considering Temporal Dispersion of Thermostat Settings in Individual Buildings. Build. Environ. 2023, 228, 109839.
  140. Schütz, T.; Schiffer, L.; Harb, H.; Fuchs, M.; Müller, D. Optimal Design of Energy Conversion Units and Envelopes for Residential Building Retrofits Using a Comprehensive MILP Model. Appl. Energy 2017, 185, 1–15.
  141. Schütz, T.; Schraven, M.H.; Remy, S.; Granacher, J.; Kemetmüller, D.; Fuchs, M.; Müller, D. Optimal Design of Energy Conversion Units for Residential Buildings Considering German Market Conditions. Energy 2017, 139, 895–915.
  142. Bianco, G.; Bracco, S.; Delfino, F.; Gambelli, L.; Robba, M.; Rossi, M. A Building Energy Management System Based on an Equivalent Electric Circuit Model. Energies 2020, 13, 1689.
  143. Zhang, Y.; Vand, B.; Baldi, S. A Review of Mathematical Models of Building Physics and Energy Technologies for Environmentally Friendly Integrated Energy Management Systems. Buildings 2022, 12, 238.
  144. Hazyuk, I.; Ghiaus, C.; Penhouet, D. Optimal Temperature Control of Intermittently Heated Buildings Using Model Predictive Control: Part I—Building Modeling. Build. Environ. 2012, 51, 379–387.
  145. Hazyuk, I.; Ghiaus, C.; Penhouet, D. Optimal Temperature Control of Intermittently Heated Buildings Using Model Predictive Control: Part II—Control Algorithm. Build. Environ. 2012, 51, 388–394.
  146. Oldewurtel, F.; Parisio, A.; Jones, C.N.; Gyalistras, D.; Gwerder, M.; Stauch, V.; Lehmann, B.; Morari, M. Use of Model Predictive Control and Weather Forecasts for Energy Efficient Building Climate Control. Energy Build. 2012, 45, 15–27.
  147. Lehmann, B.; Gyalistras, D.; Gwerder, M.; Wirth, K.; Carl, S. Intermediate Complexity Model for Model Predictive Control of Integrated Room Automation. Energy Build. 2013, 58, 250–262.
  148. Smith, R.S.; Behrunani, V.; Lygeros, J. Control of Multicarrier Energy Systems from Buildings to Networks. Annu. Rev. Control Robot. Auton. Syst. 2023, 6, 391–414.
  149. Fontenot, H.; Dong, B. Modeling and Control of Building-Integrated Microgrids for Optimal Energy Management—A Review. Appl. Energy 2019, 254, 113689.
  150. Oliveira Panão, M.J.N.; Mateus, N.M.; Carrilho da Graça, G. Measured and Modeled Performance of Internal Mass as a Thermal Energy Battery for Energy Flexible Residential Buildings. Appl. Energy 2019, 239, 252–267.
  151. Askeland, M.; Georges, L.; Korpås, M. Low-Parameter Linear Model to Activate the Flexibility of the Building Thermal Mass in Energy System Optimization. Smart Energy 2023, 9, 100094.
  152. Bueno, B.; Norford, L.; Pigeon, G.; Britter, R. A Resistance-Capacitance Network Model for the Analysis of the Interactions between the Energy Performance of Buildings and the Urban Climate. Build. Environ. 2012, 54, 116–125.
  153. Mosteiro-Romero, M.; Maiullari, D.; Pijpers-van Esch, M.; Schlueter, A. An Integrated Microclimate-Energy Demand Simulation Method for the Assessment of Urban Districts. Front. Built Environ. 2020, 6, 553946.
  154. Ramallo-González, A.P.; Eames, M.E.; Natarajan, S.; Fosas-de-Pando, D.; Coley, D.A. An Analytical Heat Wave Definition Based on the Impact on Buildings and Occupants. Energy Build. 2020, 216, 109923.
  155. Pfafferott, J.; Rißmann, S.; Halbig, G.; Schröder, F.; Saad, S. Towards a Generic Residential Building Model for Heat-Health Warning Systems. Int. J. Environ. Res. Public Health 2021, 18, 13050.
  156. Raillon, L.; Ghiaus, C. An Efficient Bayesian Experimental Calibration of Dynamic Thermal Models. Energy 2018, 152, 818–833.
  157. Rouchier, S. Solving Inverse Problems in Building Physics: An Overview of Guidelines for a Careful and Optimal Use of Data. Energy Build. 2018, 166, 178–195.
  158. Kristensen, M.H.; Hedegaard, R.E.; Petersen, S. Hierarchical Calibration of Archetypes for Urban Building Energy Modeling. Energy Build. 2018, 175, 219–234.
  159. Jin, X.; Zhang, C.; Xiao, F.; Li, A.; Miller, C. A Review and Reflection on Open Datasets of City-Level Building Energy Use and Their Applications. Energy Build. 2023, 285, 112911.
  160. Malhotra, A.; Bischof, J.; Nichersu, A.; Häfele, K.-H.; Exenberger, J.; Sood, D.; Allan, J.; Frisch, J.; van Treeck, C.; O’Donnell, J.; et al. Information Modelling for Urban Building Energy Simulation—A Taxonomic Review. Build. Environ. 2022, 208, 108552.
  161. Manfren, M.; Nastasi, B.; Groppi, D.; Astiaso Garcia, D. Open Data and Energy Analytics—An Analysis of Essential Information for Energy System Planning, Design and Operation. Energy 2020, 213, 118803.
  162. Hu, S.; Wang, J.; Hoare, C.; Li, Y.; Pauwels, P.; O’Donnell, J. Building Energy Performance Assessment Using Linked Data and Cross-Domain Semantic Reasoning. Autom. Constr. 2021, 124, 103580.
  163. Földváry Ličina, V.; Cheung, T.; Zhang, H.; de Dear, R.; Parkinson, T.; Arens, E.; Chun, C.; Schiavon, S.; Luo, M.; Brager, G.; et al. Development of the ASHRAE Global Thermal Comfort Database II. Build. Environ. 2018, 142, 502–512.
  164. Dong, B.; Liu, Y.; Mu, W.; Jiang, Z.; Pandey, P.; Hong, T.; Olesen, B.; Lawrence, T.; O’Neil, Z.; Andrews, C.; et al. A Global Building Occupant Behavior Database. Sci. Data 2022, 9, 369.
Figure 1. Hierarchical mind-map diagram representing the analytical levels of the literature review.
Figure 2. Forward and inverse modelling approach integration for continuous learning and improvement.
Table 1. Criteria for literature review, description and motivation of levels selected.
Level | Description | Motivation for Selection
1 | Interpretability of AI/ML, AI ethics, emerging paradigms enabled by digitalisation. | The rapidly evolving landscape of research in the broad area of AI/ML poses technical problems that need to be considered while developing applications in the energy and built environment domains.
2 | Digital twins in buildings, normalisation of energy statistics and benchmarking, interpretable data-driven methods for energy in buildings. | In the acceleration of energy transition, data-driven applications in buildings can use methods that have already proven to be effective in field applications.
3 | Simplification of detailed building energy simulation models while retaining physical interpretation of models. | Detailed building energy modelling at the state-of-the-art level can be simplified while providing adequate accuracy and retaining the physical structure of models.
Table 2. Keywords and criteria for literature selection, analytical Level 2.
Search Number | Keywords and Query | Scopus | Web of Science
1 | “Interpretability” AND “Energy” AND “Buildings” AND “Regression” | 32 | 40
2 | “Interpretability” AND “Energy” AND “Buildings” AND “M&V” | 2 | 2
3 | “Interpretability” AND “Energy” AND “Buildings” AND “Degree-Days” | 1 | 1
4 | “Interpretability” AND “Energy” AND “Buildings” AND “Bayesian” | 6 | 5
5 | “Interpretability” AND “Energy” AND “Buildings” AND “Deep Learning” | 21 | 27
6 | “Interpretability” AND “Energy” AND “Buildings” AND “Random Forest” | 9 | 9
7 | “Interpretability” AND “Energy” AND “Buildings” AND “Optimal trees” | 4 | 3
Table 3. Analysis of interpretability, temporal and spatial scalability and causality in modelling for selected literature.
Level 1 | Level 2 | Level 3
Interpretability | Temporal Scale | Spatial Scale | Causality
Source | Year | Search Number | Ante Hoc | Post Hoc | Yearly | Monthly | Daily | Hourly | Building Systems | Whole Building | Building Stock | Community | Counterfactual Analysis | Physical Constraints
Østergård et al. [85] | 2018 | 1
Li et al. [86] | 2019 | 1
Wang et al. [87] | 2019 | 1
Khamma et al. [88] | 2020 | 1
Li et al. [89] | 2020 | 1
Feng et al. [90] | 2021 | 1
Zhang et al. [91] | 2022 | 1
Ding et al. [92] | 2022 | 1
Chen et al. [93] | 2022 | 1
Liu et al. [94] | 2022 | 1
Yue et al. [95] | 2023 | 1
Manfren et al. [96] | 2022 | 2
E. Wang [97] | 2017 | 3
Shen et al. [98] | 2023 | 4
J. Wang et al. [99] | 2022 | 4
Chen et al. [100] | 2022 | 4
Zhang et al. [101] | 2020 | 5
Gao et al. [102] | 2021 | 5
Ao Li et al. [103] | 2021 | 5
G. Li et al. [104] | 2022 | 5
Gokhale et al. [105] | 2022 | 5
Jie Lu et al. [106] | 2022 | 6
Choi et al. [107] | 2022 | 6
Ran Wang et al. [108] | 2019 | 7
Table 4. Grey-box modelling approaches and applications (by building life-cycle phases).
Modelling Approach | Planning–Design | Operation
Source | Year | Forward | Inverse | Urban-Scale Planning | Building Stock Modelling | Early-Stage Optimisation | Energy Management | MPC Control | Grid Interaction and Flexibility | Environmental Monitoring | Calibration under Uncertainty
Kämpf et al. [135] | 2007
Fonseca et al. [136] | 2015
Prataviera et al. [137] | 2021
Fischer et al. [138] | 2016
Frans Koene et al. [139] | 2023
Schütz et al. [140] | 2017
Schütz et al. [141] | 2017
Bianco et al. [142] | 2020
Zhang et al. [143] | 2022
Hazyuk et al. [144] | 2012
Hazyuk et al. [145] | 2012
Oldewurtel et al. [146] | 2012
Lehmann et al. [147] | 2013
Smith et al. [148] | 2023
Fontenot et al. [149] | 2019
Oliveira Panão et al. [150] | 2019
Askeland et al. [151] | 2023
Bueno et al. [152] | 2012
Mosteiro-Romero et al. [153] | 2020
Ramallo-González et al. [154] | 2020
Pfafferott et al. [155] | 2021
Raillon et al. [156] | 2018
Rouchier et al. [157] | 2018
Kristensen et al. [158] | 2018
Table 5. Summary of findings indicating critical connections and gaps in knowledge.
Level | Critical Connections | Gaps in Knowledge
1 | Interpretability and explainability, human oversight and “human in the loop” approach, ethics of AI and explicability principles, implications of these concepts in practical implementations. | Ambiguity in the definition of interpretability and explainability, definition of high-stake/high-risk decisions where interpretability matters and users have a “right to explanations”.
2 | Interpretable data-driven methods in digital twins for building, counterfactual analysis and physical interpretation, M&V consolidated principles and practice. | Incorporation of M&V rigorous and standardised principles in interpretable data-driven methods, going beyond single models and providing integrated analytical workflows.
3 | Physics-informed ML and grey-box physical–statistical models for building energy performance modelling. | Leveraging grey-box modelling formulations already tested which can be standardised further and used to enhance physics-informed ML formulations.

Share and Cite

MDPI and ACS Style

Manfren, M.; Gonzalez-Carreon, K.M.; James, P.A.B. Interpretable Data-Driven Methods for Building Energy Modelling—A Review of Critical Connections and Gaps. Energies 2024, 17, 881.

