Machine Learning for Energy Management in Buildings: A Systematic Review on Real-World Applications

Panagiotis Michailidis; Federico Minelli; Iakovos Michailidis; Mehmet Kurucan; Hasan Huseyin Coban; Elias Kosmatopoulos

doi:10.3390/en19010219

¹ Centre for Research and Technology Hellas (CERTH), Information Technologies Institute, Thermi, 57001 Thessaloniki, Greece

² Department of Electrical and Computer Engineering, Democritus University of Thrace (DUTH), 67100 Xanthi, Greece

³ Department of Industrial Engineering, University of Naples “Federico II”, 80125 Naples, Italy

⁴ Department of Computer Engineering, Faculty of Computer and Informatics, Adana Alparslan Türkeş Science and Technology University, 74110 Adana, Türkiye

Energies 2026, 19(1), 219; https://doi.org/10.3390/en19010219
(registering DOI)

This article belongs to the Special Issue Advances in Artificial Intelligence for Energy Management and Smart Energy Systems

Abstract

Machine learning (ML) is becoming a key enabler in building energy management systems (BEMS), yet most existing reviews focus on simulations and fail to reflect the realities of real-world deployment. In response to this limitation, the present work provides a systematic review dedicated entirely to experimental, field-tested applications of ML in BEMS, covering systems such as Heating, Ventilation and Air-Conditioning (HVAC), Renewable Energy Systems (RES), Energy Storage Systems (ESS), Ground Heat Pumps (GHP), Domestic Hot Water (DHW), Electric Vehicle Charging Systems (EVCS), and Lighting Systems (LS). A total of 73 real-world deployments are analyzed, featuring techniques such as Model Predictive Control (MPC), Artificial Neural Networks (ANNs), Reinforcement Learning (RL), Fuzzy Logic Control (FLC), metaheuristics, and hybrid approaches. To cover both methodological and practical aspects, and to identify trends and potential challenges in the field, the review uses a unified framework. On the methodological side, it examines key attributes such as algorithm design, agent architectures, data requirements, baselines, and performance metrics. From a practical standpoint, it focuses on building typologies, deployment architectures, zone scalability, climate, location, and experimental duration. In this way, the review offers a holistic overview of the scientific landscape, outlining key trends and challenges in real-world ML applications for BEMS research. By focusing exclusively on real-world implementations, this study offers an evidence-based understanding of the strengths, limitations, and future potential of ML in building energy control, providing actionable insights for researchers, practitioners, and policymakers working toward smarter, grid-responsive buildings.
Findings reveal a maturing field with clear trends: MPC remains the most deployment-ready, ANNs provide efficient forecasting capabilities, RL is gaining traction through safer offline–online learning strategies, FLC offers simplicity and interpretability, and hybrid methods show strong performance in multi-energy setups.

Keywords:

machine learning; energy management; smart buildings; model predictive control; reinforcement learning; HVAC; RES; ESS; EVCS; control optimization

1. Introduction

1.1. General

Energy systems embedded within buildings form a central component of modern infrastructure, playing a decisive role in balancing occupant comfort with operational efficiency [1]. Subsystems such as HVACs, RES, ESS, DHW, and LS are highly energy-intensive, and their effective coordination is essential for achieving sustainability goals [2,3,4]. Given that buildings account for more than one-third of global final energy consumption and a substantial portion of CO₂ emissions, optimizing these systems is not merely a technical challenge but a broader societal imperative [5,6]. In this context, building energy systems are increasingly viewed as dynamic and complex environments in which advanced control strategies may substantially reduce energy consumption, improve comfort, and support the integration of renewable energy resources [7,8,9].

Addressing these challenges requires autonomous and robust control mechanisms capable of operating with minimal human intervention [10,11,12]. Traditional approaches such as manual scheduling or operator-based tuning, although common in early deployments, quickly proved inadequate for dealing with fluctuations in occupancy behavior, rapidly changing weather conditions, and dynamic energy prices [13,14]. The transition toward autonomous control was driven both by technological advances in sensing and computation and by policy initiatives aimed at reducing carbon emissions in the building sector [15,16,17,18].

The first automated control strategies adopted in buildings were relatively simple, relying on a predefined set of “if-then” rules to control a building’s energy consumption and generation [13,19,20]. Fixed-time schedules and rule-based control (RBC) were widely used, with HVAC systems typically operating during preset hours or responding to fixed temperature thresholds [21,22]. Such approaches were transparent, easy to implement, and required little computational effort. For many years, they formed the backbone of automation in both commercial and residential buildings, providing a standardized way to maintain acceptable comfort conditions [23,24].

However, the limitations of fixed-time schedules and RBC soon became apparent. Their lack of flexibility made them unsuitable for environments characterized by uncertainty and variability. RBC strategies could not respond effectively to unexpected occupancy changes, rapid weather fluctuations, or variable electricity tariffs [21,24]. This often resulted in unnecessary energy use, reduced occupant satisfaction, and limited effectiveness when integrating renewable energy sources [25]. Such shortcomings highlighted the need for control strategies capable of adapting intelligently to complex and uncertain operating conditions.

This need catalyzed the gradual introduction of more advanced control methods into real-world building applications. Model Predictive Control (MPC) emerged as one of the earliest and most influential techniques, offering an optimization-based framework that explicitly incorporates system dynamics and operational constraints [26,27,28]. By using mathematical models of building behavior, MPC enabled predictive scheduling and cost-aware decision-making. Yet, its reliance on accurate, high-fidelity models also exposed a key limitation: developing such models is time-consuming, requires expert knowledge, and does not scale easily across different building types [29,30]. This led researchers and practitioners to explore alternatives that reduce the modeling burden while preserving the ability to handle nonlinear building dynamics.

Consequently, data-driven and soft computing approaches, such as Artificial Neural Networks [31,32], Fuzzy Logic Controllers [33,34], and Evolutionary Algorithms [35,36,37], began to gain traction. These approaches offered greater flexibility by learning patterns directly from operational data or by incorporating expert knowledge into rule-based or bio-inspired algorithms. Their main advantage was the ability to model complex behaviors without relying on detailed physics-based representations, making them particularly attractive in settings characterized by uncertainty and variability [24].

With the growing availability of building instrumentation and sensor data, a new generation of learning-based control strategies emerged. Reinforcement Learning became particularly prominent due to its ability to learn control policies through interaction with the environment, eliminating the need for explicit system identification [9]. At the same time, broader machine learning techniques, including Deep Learning [38,39,40], Random Forests [41,42], Gradient Boosting Decision Trees (GBDT) [43,44], and hybrid physics-informed ML models [45], were increasingly adopted. These techniques leverage large volumes of operational and contextual data to create scalable and adaptive controllers capable of responding autonomously to changing occupancy, weather conditions, and market dynamics. Through this evolution, from model-driven optimization to soft computing and ultimately to fully data-driven or hybrid learning frameworks, building control systems have advanced from basic automation toward methods capable of continuous adaptation [29,46]. This trajectory reflects both the difficulty of developing accurate building models and the broader scientific effort to achieve more autonomous, scalable, and resilient energy management systems.

Although the evolution of intelligent building control has been thoroughly examined in the literature, much of this work remains confined to simulated environments [4,9,24,29]. Researchers often rely on detailed simulators, testbeds, or digital twins to evaluate MPC, ANN-based controllers, RL agents, and other advanced techniques. While simulation-based studies are valuable for controlled experimentation and rapid prototyping, real-world deployment poses additional challenges. Issues such as sensor noise, actuator delays, communication constraints, cybersecurity considerations, and occupant acceptance are difficult to replicate in simulation [47,48]. Consequently, the gap between promising laboratory results and widespread real-world adoption remains substantial, emphasizing the need for more studies conducted in operational buildings (see Figure 1).

Figure 1. Geographical distribution of real-world experiments for BEMS.

The current review therefore focuses explicitly on real-world applications of machine learning for controlling BEMS. By synthesizing studies that extend beyond simulation into field demonstrations on live building energy systems (e.g., HVAC, RES, ESS, TSS, DHW) or their integrated ecosystems, it aims to provide a comprehensive overview of current practices, identify the technical and practical trends encountered in deployment, and extract insights to guide future research. The ultimate goal is to explore the key aspects of machine learning applications (e.g., MPC, ANN, RL, FLC, and hybrids) deployed in real-world buildings (see Figure 2, left and right), offering a valuable reference for researchers and practitioners working toward the deployment of intelligent building control solutions.

Figure 2. Left: Occurrences of ML-based control applications in real-world BEMS; Right: Real-world ML-based applications for BEMS per year.

1.2. Previous Works

To help situate our work in the broader research landscape, the current subsection provides a brief summary of key existing review studies: Vásquez et al. [49] examined post-occupancy evaluation (POE) as a structured method for assessing buildings under real operating conditions. By linking technical, environmental, and social aspects, their review showed how occupant feedback and performance monitoring can reveal inefficiencies and guide design and operational improvements. They also noted persistent challenges, including implementation costs, unclear responsibilities, and the absence of standardized POE procedures. Abuimara et al. [50] proposed a data-driven workflow to enhance energy-efficient building operations. Their framework integrates four domains—metadata quality, automated fault detection, occupant-centric controls, and energy flow monitoring—into a single, interdependent sequence. Demonstrated through case studies in Canadian institutional buildings, the approach offers a practical, low-cost, and tool-agnostic roadmap for improving building performance. Khoa et al. [51] provided a comprehensive review of machine learning applications for HVAC optimization, control, and fault detection. They surveyed a wide range of algorithms including neural networks, support vector machines, and Reinforcement Learning, highlighting their potential to reduce energy use. At the same time, they identified key barriers such as limited data availability, poor model transferability across buildings, and slow industry adoption, underscoring the gap between research capabilities and practical deployment. More recently, Aghili et al. [52] delivered a systematic review of artificial intelligence techniques for HVAC energy management, covering both operational control and maintenance. Their work emphasized how advanced AI methods—such as deep Reinforcement Learning and generative models—can dynamically accommodate changing occupancy patterns and environmental conditions. 
Distinct from earlier surveys, the authors stressed the importance of bridging theoretical progress with real-world implementation, positioning AI as a central technology for enabling smart and low-carbon buildings.

Previous review papers in the field have contributed valuable overviews but are largely constrained by their focus on simulation-based studies or narrow system scopes, particularly HVAC control. Most lack detailed coverage of real-world implementations, making it difficult to assess how machine learning performs under actual operational constraints. Additionally, prior surveys often overlook important subsystems such as Renewable Energy Systems, Energy Storage Systems, Domestic Hot Water, or Electric Vehicle Charging, and rarely offer statistical synthesis across methodological and deployment factors.

1.3. Contribution and Novelty

In contrast to existing reviews, this work focuses exclusively on real-world field implementations of ML in BEMS. Seventy-three studies are systematically selected, each representing an ML-based approach validated under real operational conditions in actual building testbeds. This emphasis on real deployment aims to create a stronger evidence base, which has been largely missing from the literature. To this end, the current work follows a systematic methodology designed to include high-impact experimental studies from the past decade. Moreover, unlike the majority of reviews, which concentrate mainly on HVAC systems, the present study encompasses the full spectrum of building energy subsystems (HVAC, Heat Pumps, Renewable Energy Systems, Energy Storage, Lighting, and Domestic Hot Water), thereby offering a broader understanding of ML-enabled energy management.

Beyond compiling field studies, the current work provides systematic insights into trends, performance, and methodological patterns across the collected real-world implementations. Statistical analysis of the dataset is used to identify recurring behaviors and compare the effectiveness of different techniques under real operating constraints. To the authors’ knowledge, the current work represents the first comprehensive, systematic review to consolidate the majority of real-world ML applications for BEMS in a single work. As such, it aims to serve both as a practical reference for practitioners and as a research-oriented guide that highlights key challenges, emerging trends, and promising directions for future investigations. More specifically, the novel attributes of the current work may be summarized as follows: (a) it focuses exclusively on real-world ML-based BEMS deployments, representing the first systematic consolidation of real-world ML–BEMS work; (b) it examines a large number of applications, namely 73 high-impact, experimentally validated real-life research studies; (c) it covers all major building energy subsystems, such as HVAC, HP, RES, ESS, TSS, DHW, EVCS, and Lighting Systems (LS); (d) it identifies cross-study patterns in methodological design choices (such as algorithms, agent-based architectures, reward designs, datasets, baseline control approaches, and performance indexes) and practical design choices (such as IoT implementation types, building typologies, number of zones, location, and climate); and (e) it uses statistical analysis to reveal trends across real implementations and to inform future directions in the field of real-life machine learning applications for building energy management.

1.4. Paper Structure

The structure of this paper may be described as follows (see Figure 3):

Figure 3. Paper structure.

Section 1: Introduces the background and rationale for the study, briefly reviews existing surveys in the field, and clarifies how the present work differs in scope and novelty.
Section 2: Explains the research methodology, including literature search strategy, selection and screening process, data extraction, quality control, and synthesis procedure.
Section 3: Describes the main categories of Building Energy Systems (BES) and discusses the integration of ML approaches, along with typical barriers encountered in real implementations.
Section 4: Outlines the mathematical underpinnings of ML techniques and introduces the principal algorithm families (MPC, RL, evolutionary strategies, etc.), with particular attention to multi-agent frameworks.
Section 5: Compiles the selected influential real-world studies from 2015 to 2025 in a structured table, capturing the key attributes of each work.
Section 6: Analyzes the reviewed studies across methodological and practical aspects. Methodological key attributes cover dimensions such as algorithm design, agent structure, data types, baseline control, and performance indexes, while practical key attributes cover implementation features, building type, zone scalability, location, climate, and experimental period.
Section 7: Discusses and summarizes the emerging trends and knowledge gaps, contrasts them with prior reviews, and proposes future research directions to advance ML applications in BEMS.
Section 8: Closes the paper by highlighting the main contributions and overarching insights derived from the review.
Appendix A: Provides concise summaries of each selected work in tables, following the numbering of the key attribute tables presented in Section 5.

2. Methodology

2.1. Methodological Clarifications

To ensure transparency and make the review easy to follow and reproduce, this study follows the PRISMA guidelines and includes several clarifications on how the methodology was carried out. The literature search focused on peer-reviewed studies published between January 2015 and March 2025, with the final database query completed in March 2025. Only papers written in English were included, and eligible sources were limited to journal articles and full conference papers. Materials such as theses, dissertations, technical reports, workshop proceedings, and non-peer-reviewed documents were excluded. For a study to be included, it had to report on an ML-based control or optimization method that was tested in a real building, such as a field pilot, a living lab, or a long-term experimental setup. Simulation-only studies were not considered unless they included direct comparisons with real-world measurements. During the review process, data from each selected study were extracted using a structured template. Methodological information comprised the algorithm used, agent architecture, forecast horizon and control timestep, data used, baseline control methodology, and performance metrics. Practical information comprised the experimental setup, the type of energy system involved, the building typology, the zone scalability, the location, the climate, and the experimental duration. The extracted data were carefully checked for consistency and accuracy.

Since the reviewed studies varied widely in terms of building types, control goals, time horizons, and evaluation metrics, a formal meta-analysis was not possible. Instead, a descriptive quantitative synthesis was used, reporting on trends, frequencies, and comparisons across different methods, an approach that aligns well with PRISMA guidance for complex engineering reviews. Moreover, all the quantitative trends, percentages, and distributions shown in the figures and discussed throughout the paper were drawn directly from the curated set of 73 real-world studies included in the current review. Unless otherwise noted, percentages (such as the share of each algorithm type, deployment frequency, or regional distribution) were calculated as simple relative frequencies based on the total number of included studies. Geographic patterns were identified by counting the number of deployments per country or region, using the entries listed in the “Location” column of the attribute tables. No data imputation or weighting was applied. The summary tables in the Appendix are fully available and make it possible for others to independently verify and reproduce the quantitative insights and visualizations presented. More specifically, the work followed a systematic process consisting of the following steps:

2.2. Review Process

Article Search and Retrieval: A systematic search was carried out using major academic databases such as Scopus and Web of Science (WoS). Search queries combined broad ML terminology with keywords related to building energy systems, for example:
("Machine Learning" OR "Artificial Intelligence" OR "Reinforcement Learning" OR "Deep Learning") AND (Building OR HVAC OR "Building Energy Management" OR BEMS OR "Heat Pump" OR "thermal storage" OR RES OR "Domestic Hot Water" OR DHW OR Lighting OR "Energy Storage" OR ESS OR "Electric Vehicle Charging" OR EVCS)
The initial query returned over 400 publications. Titles and abstracts were screened for relevance, duplicates were removed, and only studies reporting real-world ML applications within building energy systems were retained for further evaluation.
Filtering and Selection Criteria: A second filtering stage ensured that only high-quality and practically relevant studies were included. More specifically, only studies that had accumulated at least 10 citations at the time of review were considered, ensuring a minimum level of academic recognition and peer validation. For very recent publications (2025), which have not yet had sufficient time to accrue citations, this criterion was relaxed, provided that the studies reported clear real-world experimental validation and were published in reputable peer-reviewed venues. Peer-reviewed journal articles and leading conference contributions were considered. Each selected work had to demonstrate a tangible real-world implementation, such as deployment in operational buildings, field pilots, or extensive experimental validation. Simulation-only studies were excluded unless the authors incorporated direct benchmarking with field measurements. The final dataset comprises 73 real-life applications.
Data Collection: For each selected publication, detailed information was extracted regarding the ML methodology (e.g., MPC, FLC, RL, ANNs, evolutionary approaches, hybrid models), targeted subsystem, control or optimization objective, baseline comparisons, and performance metrics (including energy savings, comfort indicators, and cost reductions). Information on the validation setup—building type, scale, and zone configuration—was also captured. Particular attention was paid to studies that benchmarked ML controllers against standard or rule-based control strategies.
Quality Assessment: All works were evaluated for methodological rigor, clarity in describing the ML workflow, and completeness of the reported results. Priority was given to studies published in reputable outlets (Elsevier, IEEE, MDPI, Springer, etc.) and authored by established researchers in energy systems and control. Preference was also given to contributions presenting full workflows, from model development and controller design to validation and performance evaluation.
Data Synthesis: The selected studies were grouped according to subsystem type, ML technique, control architecture, and scale of real-world validation. This categorization facilitated cross-comparison across methods and supported the generation of statistical charts illustrating trends in algorithm selection, performance outcomes, and building typologies. Through this synthesis, the review identifies promising approaches, recurring limitations, and emerging directions for applying ML in real operational BEMS environments.
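
The filtering rules and the descriptive synthesis described above can be illustrated with a short sketch. It is not the authors' tooling: the `Study` record, its field names, and the sample entries are hypothetical and serve only to show how the citation threshold, its relaxation for 2025 publications, and the simple relative frequencies could be computed.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Study:
    """Hypothetical record for one screened publication."""
    title: str
    year: int
    citations: int
    real_world: bool      # validated in an operational building or field pilot
    peer_reviewed: bool   # journal article or full conference paper
    method: str           # e.g. "MPC", "RL", "ANN", "FLC", "Hybrid"


def is_eligible(s: Study, min_citations: int = 10, recent_year: int = 2025) -> bool:
    """Inclusion rules: 2015-2025, peer-reviewed, real-world validation, and
    at least `min_citations` citations unless published in `recent_year`."""
    if not (2015 <= s.year <= 2025):
        return False
    if not (s.peer_reviewed and s.real_world):
        return False
    return s.citations >= min_citations or s.year == recent_year


def method_shares(studies: list[Study]) -> dict[str, float]:
    """Relative frequency (in percent) of each method among the given studies."""
    counts = Counter(s.method for s in studies)
    total = sum(counts.values())
    return {m: 100.0 * n / total for m, n in counts.items()} if total else {}


# Made-up records for illustration only (not taken from the reviewed corpus).
candidates = [
    Study("A", 2019, 42, True, True, "MPC"),
    Study("B", 2021, 3, True, True, "RL"),      # too few citations
    Study("C", 2025, 0, True, True, "RL"),      # recent: threshold relaxed
    Study("D", 2020, 55, False, True, "ANN"),   # simulation only
    Study("E", 2017, 18, True, True, "MPC"),
    Study("F", 2022, 12, True, True, "FLC"),
]
selected = [s for s in candidates if is_eligible(s)]
shares = method_shares(selected)
```

With these sample records, four studies pass the filter and the shares are computed over those four only, mirroring the statement that percentages are simple relative frequencies over the included studies.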

2.3. Reviewer Roles and Dispute Resolution

The systematic review process was conducted by multiple authors with complementary expertise in building energy management systems, machine learning, and control engineering. The search strategy, study screening, and eligibility assessment were primarily conducted by P.M. and I.M., following predefined inclusion and exclusion criteria. Data extraction and investigation were performed by P.M., F.M., M.K., and H.H.C. using a structured extraction template, while validation and formal analysis were carried out collaboratively by all authors to ensure consistency and accuracy.

In cases of disagreement regarding study inclusion, data interpretation, or methodological classification, differences were resolved through structured discussion among the involved reviewers. When consensus was not immediately achieved, final decisions were made under the supervision of senior authors (P.M. and E.K.). This consensus-based procedure ensured methodological rigor, minimized individual bias, and enhanced the reproducibility of the review process. The comprehensive methodology of the current review is illustrated in a PRISMA type diagram in Figure 4.

Figure 4. Methodology in a PRISMA type diagram.

3. Building Energy Management Systems

3.1. Primary Energy Systems

Modern buildings contain multiple energy systems, which account for most of a building’s energy use and strongly influence both operating costs and occupant comfort [53]. The most common equipment found at the building level includes [24] (see also Figure 5):

Figure 5. Reasons for controlling energy systems in the building environment.

HVAC: HVAC systems may consist of heat pumps, boilers, chillers, fans, pumps, and distribution networks that regulate indoor temperature and air quality [54]. Their purpose is to maintain comfortable and healthy indoor environments, requiring control to adapt operation to weather and occupancy while minimizing energy use [55].
DHW: Domestic Hot Water systems include heaters, storage tanks, heat exchangers, and circulation loops that deliver hot water for hygiene and daily needs [56,57]. DHW operation ensures reliable and efficient hot-water availability, requiring control to schedule heating, avoid thermal losses, and match operation to demand [56].
RES: Renewable Energy Systems may comprise PV panels, inverters, solar thermal collectors, wind turbines, and related sensors that generate on-site renewable electricity or heat [9,58,59]. They are used in buildings to harvest energy, maximize renewable use, and reduce grid dependency. Such systems require control to manage variability and coordinate with loads and energy storage [60].
ESS: Energy Storage Systems include batteries with BMS and inverters [61]. Such equipment is used to shift loads and increase flexibility, requiring control of charge/discharge timing to optimize cost, protect system health, and support renewable integration [60,62].
EVCS: Electric Vehicle Charging Systems consist of AC/DC chargers, smart controllers, and sometimes V2G/V2B interfaces supplying energy to EVs [63,64]. Their role in the building environment is to provide charging without causing peaks or strain [65]. Such systems require control to schedule charging, use low-tariff periods, and leverage EV flexibility [66,67,68].
GHP: Ground-source Heat Pumps extract or reject heat through underground loops connected to the stable-temperature ground, offering highly efficient year-round heating and cooling [69,70]. Such systems leverage the ground’s thermal inertia for superior efficiency, requiring control of flow rates, mode switching, and loop balancing to maximize savings and protect the ground loop from thermal drift [71].
LS and Appliances: Lighting Systems and Appliances may include LED fixtures, lamps, home devices, office equipment, computers, and servers that consume electricity during operation or standby [72]. Such equipment is commonly controlled to maintain comfort and reduce unnecessary consumption, through smart switching, dimming, and scheduling aligned with occupancy [73].
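
As a small illustration of the scheduling logic mentioned for ESS and EVCS (charging during low-tariff periods), the sketch below greedily selects the cheapest hourly slots before a departure time. It is a simplified assumption-laden example rather than a method from any reviewed study: the function name, the hourly price list, the fixed charger power, and the one-hour slot length are all hypothetical.

```python
import math


def schedule_charging(prices, energy_needed_kwh, charger_kw, departure_hour):
    """Pick the cheapest hourly slots before departure to deliver the energy.

    Returns the charging power (kW) for each hour in `prices`. Slots are
    assumed to last one hour, so kW in a slot equals kWh delivered.
    """
    hours_available = list(range(min(departure_hour, len(prices))))
    slots_needed = math.ceil(energy_needed_kwh / charger_kw)
    if slots_needed > len(hours_available):
        raise ValueError("not enough time before departure to deliver the energy")
    # Cheapest hours first; a partial slot, if any, lands on the priciest chosen hour.
    cheapest = sorted(hours_available, key=lambda h: prices[h])[:slots_needed]
    plan = [0.0] * len(prices)
    remaining = energy_needed_kwh
    for h in cheapest:
        plan[h] = min(charger_kw, remaining)
        remaining -= plan[h]
    return plan


# Illustrative tariff in currency units per kWh (made-up values).
prices = [0.30, 0.25, 0.10, 0.08, 0.12, 0.28]
plan = schedule_charging(prices, energy_needed_kwh=18.0, charger_kw=7.0, departure_hour=6)
# plan == [0.0, 0.0, 7.0, 7.0, 4.0, 0.0]
cost = sum(p * e for p, e in zip(prices, plan))
```

A greedy choice is sufficient here because slots are independent; once grid limits, battery degradation, or PV forecasts are included, the problem typically calls for the optimization-based or learning-based controllers discussed in this review.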

In real buildings, such energy systems do not operate in isolation; they continuously interact within the same environment to satisfy occupant needs and achieve overall performance targets [74]. Together, they form an integrated building energy management system (IBEMS) in which each subsystem influences the others: shading or plug loads affect HVAC demand, PV output affects battery and EV charging decisions, and thermal storage influences heating and cooling loads. This interdependence means that delivering comfort, reliability, and efficiency in real-life operation may require coordinated control rather than subsystem-level decisions [75].

3.2. BEMS Control Process

Modern BEMS operate through a structured, intelligent control loop that continuously transforms sensor data into optimized actions for energy efficiency and occupant comfort. This process involves sensing, prediction, decision-making, and execution—working together to create a responsive and adaptive environment [29,76].

At the heart of this system is a step-by-step control cycle that enables predictive and coordinated management of key building subsystems, such as HVAC, RES, ESS, GHP, EVCS, etc. Each component plays a role in helping the building operate more efficiently, flexibly, and sustainably. The complete step-by-step control workflow is outlined as follows (see also Figure 6):

Figure 6. Step-by-step control cycle of a modern building energy management system.

Sensing: The process starts with capturing real-time building and environmental data from sensors that monitor temperature, humidity, CO₂, occupancy, solar radiation, energy usage, and other key indicators. This provides a snapshot of the current state of the building.
Preprocessing: Raw data is cleaned, synchronized, and formatted to ensure it can be reliably used. Communication protocols such as BACnet/IP, Modbus, OPC-UA, or MQTT help ensure that data flows safely and accurately to the system.
Forecasting: Using machine learning or statistical models, the system predicts short-term future conditions such as energy demand, occupancy changes, or solar generation. These forecasts allow the system to plan ahead rather than simply react.
Optimization: Based on the forecasts and system constraints (e.g., comfort requirements, energy prices, equipment limits), optimization algorithms compute the best control actions. These might include HVAC setpoints, charge/discharge schedules for batteries, or EV charging rates.
Actuation: The control decisions are sent to physical devices through the building management system, programmable logic controllers (PLCs), or IoT platforms. This is the point where decisions become real-world actions.
Operation: Devices adjust according to the new settings. HVAC systems may change airflow or temperature, batteries may charge or discharge, and EV chargers may ramp up or down.
Condition Alteration: As the control actions are applied, the building’s internal conditions—temperature, humidity, energy use—change. These new states are what the system will monitor in the next control cycle.
Feedback: The system collects new sensor data and compares it to expected outcomes. This feedback allows it to adjust, learn, and improve performance over time, while also detecting anomalies or triggering safety rules if needed.

Thanks to this continuous loop, BEMS can dynamically adjust system settings—like HVAC temperature or battery charging—based on current conditions, user presence, and even external factors like energy tariffs. This ensures buildings run efficiently while maintaining comfort and responsiveness to occupant needs and grid conditions.
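
The control cycle described above can be sketched as a toy closed loop. Everything in this example is an illustrative assumption: the first-order room model with coefficients `A` and `B`, the persistence forecast, the plausibility bounds in `preprocess`, and all function names are hypothetical stand-ins for the sensing, preprocessing, forecasting, optimization, actuation, and feedback stages, not an implementation from any reviewed deployment.

```python
A, B = 0.1, 0.5  # assumed heat-loss and heating-gain coefficients per step


def sense(state):
    """Sensing: read the (simulated) indoor and outdoor temperatures."""
    return {"t_in": state["t_in"], "t_out": state["t_out"]}


def preprocess(raw, last_valid):
    """Preprocessing: keep the last valid value for missing or implausible readings."""
    clean = dict(last_valid)
    for key, value in raw.items():
        if value is not None and -40.0 <= value <= 60.0:
            clean[key] = value
    return clean


def forecast(history):
    """Forecasting: persistence model, next outdoor temperature equals the latest one."""
    return history[-1]["t_out"]


def optimize(t_in, t_out_forecast, setpoint, max_power_kw):
    """Optimization: smallest heating power that reaches the setpoint in the toy model."""
    needed = (setpoint - t_in - A * (t_out_forecast - t_in)) / B
    return min(max(needed, 0.0), max_power_kw)


def actuate(state, power_kw):
    """Actuation, operation, and condition alteration: apply the command to the room."""
    state["t_in"] += A * (state["t_out"] - state["t_in"]) + B * power_kw
    return state


def run(steps=24, setpoint=21.0, max_power_kw=4.0):
    state = {"t_in": 16.0, "t_out": 5.0}
    last_valid, history, log = dict(state), [], []
    for _ in range(steps):
        clean = preprocess(sense(state), last_valid)
        last_valid = clean
        history.append(clean)
        power = optimize(clean["t_in"], forecast(history), setpoint, max_power_kw)
        state = actuate(state, power)
        error = setpoint - state["t_in"]  # feedback: deviation seen by the next cycle
        log.append((power, state["t_in"], error))
    return log


log = run()
```

In this sketch the heater saturates at its maximum power while the room warms up, then settles at the power that balances the assumed heat loss. A real BEMS would replace each stage with protocol-specific data acquisition, learned forecasters, and constrained optimizers.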

4. Mathematical Concepts of ML-Based Methodologies

The development of advanced control strategies for building energy management systems (BEMS) has been shaped by different methodological philosophies, each reflecting a distinct view of how buildings interact with their environment and how energy performance can be improved. Model Predictive Control is rooted in optimization and predictive modeling [77,78]; Fuzzy Logic Control draws on human reasoning and linguistic rules [79]; ANNs and Deep Neural Networks (DNNs) rely on data-driven pattern recognition [80,81]; EAs mimic natural selection [82,83]; and RL focuses on adaptation through trial-and-error interaction [9,84,85]. The following subsections describe the conceptual motivation, mathematical principles, and practical strengths and weaknesses of each approach, particularly in the context of real-world BEMS applications.

4.1. Model Predictive Control

MPC is built on the idea of prediction and optimization. Assuming that building dynamics can be described with sufficient accuracy, MPC forecasts future system trajectories and computes optimal control actions by repeatedly solving a constrained optimization problem [29]. The general formulation minimizes a cost function over a prediction horizon:

\min_{u_{k}, \dots, u_{k+N-1}} \; \sum_{i=0}^{N-1} \ell(x_{k+i}, u_{k+i}) + \ell_{f}(x_{k+N}), \qquad (1)

subject to model dynamics and operational constraints, where x_{k+i} denotes the predicted state, u_{k+i} the control input, N the prediction horizon, \ell the stage cost, and \ell_{f} the terminal cost. MPC’s main advantages include explicit constraint handling, support for multi-objective formulation, and the ability to coordinate multiple interacting subsystems [29,78]. Its limitations stem from the need for accurate models, high computational demand, and sensitivity to disturbances not captured in the model [86]. In practice, these challenges complicate model calibration, scalability in multi-zone buildings, and ensuring sufficiently fast computations for real-time control [29].
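To make the receding-horizon idea of Equation (1) concrete, a minimal sketch is given below. It assumes a hypothetical first-order thermal model and a small set of discrete heating levels, and it replaces the QP/MILP solvers used in practice with exhaustive enumeration, which is only viable for very short horizons. All parameter values and function names are assumptions made for illustration.

```python
from itertools import product

# Assumed first-order zone model: x_next = A*x + B*u + (1 - A)*T_OUT
A, B = 0.9, 0.5
T_OUT, T_REF = 5.0, 21.0             # assumed outdoor and reference temperatures (degC)
LEVELS = (0.0, 1.0, 2.0, 3.0, 4.0)   # admissible heating powers (kW), an input constraint
R = 0.05                             # weight on control effort


def step(x, u):
    return A * x + B * u + (1.0 - A) * T_OUT


def stage_cost(x, u):
    # Stage cost l(x, u): comfort deviation plus energy use.
    return (x - T_REF) ** 2 + R * u ** 2


def terminal_cost(x):
    # Terminal cost l_f(x).
    return (x - T_REF) ** 2


def mpc_action(x_k, horizon=4, x_max=24.0):
    """Solve the finite-horizon problem by enumeration and return the first input."""
    best_cost, best_seq = float("inf"), None
    for seq in product(LEVELS, repeat=horizon):
        x, cost, feasible = x_k, 0.0, True
        for u in seq:
            cost += stage_cost(x, u)
            x = step(x, u)
            if x > x_max:            # state constraint: upper comfort bound
                feasible = False
                break
        if feasible and cost + terminal_cost(x) < best_cost:
            best_cost, best_seq = cost + terminal_cost(x), seq
    return best_seq[0]               # receding horizon: apply only the first move


def simulate(x0=18.0, steps=30):
    x, trajectory = x0, [x0]
    for _ in range(steps):
        x = step(x, mpc_action(x))   # re-optimize at every step
        trajectory.append(x)
    return trajectory


trajectory = simulate()
```

Only the first element of each optimized sequence is applied before the problem is solved again from the newly measured state, which is the defining feature of MPC. The enumeration stands in for a real solver and grows exponentially with the horizon.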

4.2. Artificial Neural Networks

ANNs are based on the idea of learning nonlinear relationships directly from data. By stacking layers of interconnected neurons, they approximate complex functions without requiring explicit physical models [87]. DNNs extend this capability through multiple hidden layers that enable hierarchical feature extraction. Mathematically, an ANN is expressed as [88]:

y = f(x; \theta) = \sigma_{L}(W_{L} \cdot \sigma_{L-1}(\dots \sigma_{1}(W_{1} x + b_{1}) \dots) + b_{L}), \qquad (2)

with W_{l} and b_{l} denoting weights and biases, and \sigma_{l} activation functions. ANNs are powerful universal approximators capable of capturing nonlinear building behavior using sensor data. Their drawbacks include the need for large, representative datasets, risks of overfitting, and limited interpretability [4]. For BEMS, this means that models may struggle with unusual conditions, incomplete datasets, or operator concerns regarding the transparency of black-box decision-making [4,31].
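The nested structure of Equation (2) corresponds to a simple loop over layers. The sketch below evaluates a small two-layer network in plain Python; the weights, biases, and the choice of ReLU and identity activations are arbitrary values assumed for illustration, not parameters from any reviewed study.

```python
def relu(z):
    return [max(0.0, v) for v in z]


def identity(z):
    return list(z)


def affine(W, x, b):
    # Computes W x + b for a weight matrix given as a list of rows.
    return [sum(w * v for w, v in zip(row, x)) + b_i for row, b_i in zip(W, b)]


def forward(x, layers):
    """Evaluate y = sigma_L(W_L ... sigma_1(W_1 x + b_1) ... + b_L)."""
    h = list(x)
    for W, b, sigma in layers:
        h = sigma(affine(W, h, b))
    return h


# Hypothetical network: 2 inputs -> 2 hidden units (ReLU) -> 1 output (identity).
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0], relu),
    ([[2.0, 1.0]], [0.5], identity),
]

prediction = forward([3.0, 1.0], layers)
```

Training would adjust the entries of each W_{l} and b_{l} to fit measured data; only inference is shown here.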

4.3. Reinforcement Learning

RL is based on adaptation through ongoing interaction with the environment. An agent learns a control policy by exploring actions and receiving rewards, aiming to maximize long-term performance [9]. Formally, RL optimizes the expected return [89]:

J(\pi) = \mathbb{E}\left[\sum_{t=0}^{\infty} \gamma^{t} r_{t} \;\middle|\; \pi\right], \qquad (3)

where \pi denotes the policy mapping states to actions, \gamma \in [0, 1) the discount factor, and r_{t} the reward received at step t. RL’s strengths include independence from explicit models, the ability to learn adaptive strategies in dynamic environments, and direct optimization of long-term objectives [9,90]. Its weaknesses involve high data requirements, potential instability during training, and safety concerns during exploration. In BEMS, these limitations pose challenges for direct deployment in occupied buildings, where unsafe actions may compromise comfort or equipment, and where policies trained in simulation often fail to transfer seamlessly to real-world environments [9].
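As a small worked illustration of Equation (3), the sketch below computes the discounted return of one finite reward sequence and applies a single tabular Q-learning update. Q-learning is used here only as a familiar example of a value-based update and is not claimed to be the method of any specific reviewed study; the table contents and parameter values are assumptions.

```python
def discounted_return(rewards, gamma):
    """Finite-horizon sample of sum_t gamma^t * r_t for one trajectory."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def q_learning_update(Q, s, a, r, s_next, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step: move Q[s][a] toward r + gamma * max_a' Q[s'][a']."""
    target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])
    return Q


# Hypothetical two-state, two-action value table.
Q = {0: [0.0, 0.0], 1: [1.0, 2.0]}
Q = q_learning_update(Q, s=0, a=1, r=1.0, s_next=1)
G = discounted_return([1.0, 1.0, 1.0], gamma=0.5)
```

The expectation in Equation (3) would be approximated by averaging such returns over many trajectories collected under the policy.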

4.4. Fuzzy Logic Control

FLC is grounded in approximate reasoning and relies on human expertise when precise models are unavailable. Inputs are translated into fuzzy sets (e.g., “high temperature”), and inference rules are combined to compute control actions [91]. A standard fuzzy control law is:

u = \frac{\sum_{i=1}^{M} \mu_{i}(x) \cdot u_{i}}{\sum_{i=1}^{M} \mu_{i}(x)}, \qquad (4)

where \mu_{i}(x) are membership functions and u_{i} rule outputs. FLC excels in interpretability, robustness to noise, and independence from explicit modeling [92,93]. However, rule bases can become complex and difficult to maintain, and classical FLC lacks automatic adaptation mechanisms. In BEMS, this limits scalability to large multi-zone environments and can reduce performance when operating conditions evolve and rules are no longer well aligned with real dynamics [24].
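Equation (4) is a weighted average of rule outputs, which can be illustrated with three rules on indoor temperature. In the sketch below, the triangular membership functions, their breakpoints, the rule outputs, and the fallback value used when no rule fires are all assumptions chosen for illustration.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


# Hypothetical rule base: (membership function, heating output u_i in kW).
RULES = [
    (lambda t: tri(t, 14.0, 17.0, 21.0), 3.0),  # IF temperature is cold THEN heat strongly
    (lambda t: tri(t, 18.0, 21.0, 24.0), 1.0),  # IF temperature is comfortable THEN heat lightly
    (lambda t: tri(t, 21.0, 25.0, 29.0), 0.0),  # IF temperature is warm THEN do not heat
]


def fuzzy_control(x, rules=RULES, default=0.0):
    """Weighted-average control law: sum(mu_i * u_i) / sum(mu_i)."""
    weights = [(mu(x), u_i) for mu, u_i in rules]
    denom = sum(w for w, _ in weights)
    if denom == 0.0:
        return default  # assumed fallback when x lies outside every fuzzy set
    return sum(w * u_i for w, u_i in weights) / denom


u_mid = fuzzy_control(19.5)   # partly "cold", partly "comfortable"
u_peak = fuzzy_control(21.0)  # fully "comfortable"
```

At 19.5 °C the "cold" and "comfortable" rules fire with degrees 0.375 and 0.5, so the output lies between their individual actions.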

4.5. Evolutionary Algorithms

Evolutionary algorithms (EAs) draw inspiration from biological evolution, using populations of candidate solutions that evolve through selection, crossover, and mutation. They do not rely on differentiability or convexity, making them effective for complex, multi-objective problems. In the case of genetic algorithms (GAs), evolutionary updates follow [94]:

x^{(t+1)} = \mathrm{Mutation}(\mathrm{Crossover}(x^{(t)}_{p_{1}}, x^{(t)}_{p_{2}})), \qquad (5)

with parents p_{1} and p_{2} chosen based on fitness. EAs offer robustness, flexibility, and reliable performance in noisy, multimodal optimization settings [94]. However, they often converge slowly, require significant computation, and can yield variable solutions across runs [95]. In BEMS, these characteristics limit their suitability for real-time operation and necessitate careful parameter tuning to ensure consistency and operational robustness [24,95].
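The update in Equation (5) can be sketched as a compact genetic algorithm. The example below searches for a six-step heating schedule close to a hypothetical target profile. Tournament selection, one-point crossover, Gaussian mutation, and elitism are common choices assumed here for illustration; the source does not prescribe them, and the target schedule and all parameters are invented.

```python
import random

TARGET = [1.0, 2.0, 3.0, 3.0, 2.0, 1.0]  # hypothetical ideal schedule (kW)


def fitness(x):
    # Higher is better: negative squared distance to the target schedule.
    return -sum((g - t) ** 2 for g, t in zip(x, TARGET))


def select_parent(pop, rng, k=3):
    # Tournament selection: the fittest of k random candidates becomes a parent.
    return max(rng.sample(pop, k), key=fitness)


def crossover(p1, p2, rng):
    cut = rng.randrange(1, len(p1))      # one-point crossover
    return p1[:cut] + p2[cut:]


def mutate(x, rng, rate=0.2, sigma=0.3, lo=0.0, hi=5.0):
    # Perturb each gene with probability `rate`, clamped to the feasible range.
    return [min(hi, max(lo, g + rng.gauss(0.0, sigma))) if rng.random() < rate else g
            for g in x]


def evolve(pop_size=30, generations=50, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(0.0, 5.0) for _ in TARGET] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        elite = max(pop, key=fitness)    # elitism: the best individual always survives
        history.append(fitness(elite))
        children = [
            mutate(crossover(select_parent(pop, rng), select_parent(pop, rng), rng), rng)
            for _ in range(pop_size - 1)
        ]
        pop = [elite] + children
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


best, history = evolve()
```

Because the best individual is carried over unchanged, the best fitness never decreases between generations, although, as noted above, different seeds can still yield different final solutions.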

5. Key Attributes of ML-Based Real-World Applications

This section presents the high-impact ML-based applications in BEMS from 2015 to 2025, offering a high-level overview of the MPC, ANN, RL, FLC, hybrid, and other real-world applications found in the literature. This approach allows readers to quickly identify relevant applications in the attribute tables and refer to the detailed summaries for deeper insights into their methodologies and findings. The general description of the tables is as follows:

Key-Attribute Tables: (Table 1, Table 2, Table 3, Table 4, Table 5 and Table 6) systematically illustrate each application according to key characteristics, ensuring a comprehensive understanding of the overall approach:

Table 1. Key attributes of MPC applications for real-life BEMS.

Table 2. Key attributes of ANN applications for real-life BEMS.

Table 3. Key attributes of RL applications for real-life BEMS.

Table 4. Key attributes of FLC applications for real-life BEMS.

Table 5. Key attributes of hybrid applications for real-life BEMS.

Table 6. Key attributes of other type applications for real-life BEMS.

Ref.: Lists the reference of each application in the first column;
Year: Provides the publication year for each research application;
Type: Contains the specific algorithmic type of the MPC, ANN, RL, FLC, hybrid, or other ML-based methodology applied in each application. In MPC applications, the solver is given in parentheses: SLP (Sequential Linear Programming), NLP (Nonlinear Programming), QP (Quadratic Programming), MILP (Mixed-Integer Linear Programming), MIQP (Mixed-Integer Quadratic Programming), SQP (Sequential Quadratic Programming), DP (Dynamic Programming). In ANN applications, the types include: MLP (Multilayer Perceptron), LSTM (Long Short-Term Memory Network), NARX (Nonlinear Autoregressive Network with Exogenous Inputs), and CNN (Convolutional Neural Network). In RL applications, the types include: FQI (Fitted Q-Iteration), BDQ (Batch Deep Q-Learning), DQN (Deep Q-Network), PPO (Proximal Policy Optimization), and SAC (Soft Actor–Critic);
Agent: Contains the agent type of the concerned methodology (“Single” for single-agent or “Multi” for multi-agent ML-based approach);
FH/TS: Illustrates the Forecast Horizon (FH) and the Timestep (TS) intervals for the ML-based control application, considering the BEMS;
Baseline: Illustrates the comparison methods used to evaluate the proposed ML-based approach (such as RBC, Fixed, or other ML-based strategies);
Equipment: Indicates the energy systems integrated in the BEMS framework (e.g., HVAC, RES, ESS, TSS, GHP, EVCS, LS, Appliances etc.);
Building: Describes the typology of the concerned real-world building (e.g., Residential, Office, Academic, Lab, etc.);
Zones: Illustrates the number of the controlled zones for each real-world building testbed;
Location: Illustrates the location of each real-world application. The country code is included in parentheses for each experiment;
Period: Illustrates the period when the experimental control implementation took place (Summer, Autumn, Winter, Spring). The exact duration of each experiment is given in parentheses, expressed in Days (D), Weeks (W), Months (M) and Years (Y).

It should be underlined that the summaries of the real-world applications covered by the current paper have been moved to Appendix A, along with the figures that illustrate the structures of the experimental building testbeds. The interested reader may find the summaries for MPC, ANN, RL, FLC, hybrid and other ML applications in Table A1, Table A2, Table A3, Table A4 and Table A5, respectively. Each of these tables is aligned with the key-attribute tables (Table 1, Table 2, Table 3, Table 4, Table 5 and Table 6) that this section describes in detail.

Table 7 provides a compact synthesis of the dominant characteristics observed across all real-world ML-based BEMS implementations from 2015 to 2025, aggregating common patterns in algorithmic types, agent structures, horizons, baselines, building contexts, and deployment durations.

Table 7. Cross-method summarization of dominant attributes in real-world ML-based BEMS (2015–2025).

6. Evaluation

The current review analyzes the different trends in methodological and practical aspects in depth, in an effort to holistically cover the field of ML-based control in real-world buildings. To this end, Section 6 analyzes the field across the following aspects:

Section 6.1 Methodological Key-Attributes describes how the ML-based control systems were designed and evaluated in real-world settings. More specifically, each paragraph analyzes:

Algorithmic Methodology: In Section 6.1.1
Agent Architecture: In Section 6.1.2
Forecast Horizon and Timestep: In Section 6.1.3
Data Utilization: In Section 6.1.4
Baseline Control: In Section 6.1.5
Performance Metrics: In Section 6.1.6

Section 6.2 Practical Key-Attributes reflects on how these ML-based approaches were implemented and operated in real building environments. More specifically, each paragraph analyzes:

Implementation Types: In Section 6.2.1
Equipment Types: In Section 6.2.2
Building Types: In Section 6.2.3
Zone Scalability: In Section 6.2.4
Location and Climate: In Section 6.2.5
Experimental Period: In Section 6.2.6

6.1. Methodological Key-Attributes

The methodological attributes (Section 6.1.1, Section 6.1.2, Section 6.1.3, Section 6.1.4, Section 6.1.5 and Section 6.1.6) focus on the internal logic and design of each control system, including the algorithmic approach (e.g., MPC, RL, ANN), agent architecture (single-agent vs. multi-agent), forecasting horizon and timestep, data utilization methods, baseline control strategies, and performance evaluation metrics. Such elements provide a structured lens through which to assess how ML techniques are applied, compared, and validated in real building environments.

6.1.1. Evaluation per Algorithmic Methodology

Among the reviewed real-world experimental studies, MPC in its classical form appears most frequently in academic field deployments (see Figure 7-Left and Right, and also Table 7) [96,111,117]; however, this reflects publication trends rather than a general claim about dominance or maturity across all building types or industry practice. Economic and distributed MPC further extend MPC utilization to tariff-driven and multi-zone applications [110,119]. ANNs, on the other hand, seem well suited to forecasting indoor temperature, CO₂, and HVAC loads, or to functioning as surrogate models, enabling predictive and personalized control [130,131,134]. Their compatibility with embedded systems and BMS infrastructures fosters this suitability for real-time prediction tasks [132]. By contrast, RL becomes suitable when building dynamics are nonlinear, uncertain, or hard to model. Field studies show that RL can adapt to stochastic occupancy and volatile conditions when combined with offline training and safety layers [137,138,142]. However, although RL proved highly adaptive, its real-world experimentation remained small-scale due to stability and data requirements [141] (see Figure 7-Left and Right, and also Table 7).

Figure 7. Left: Occurrence of ML types in real-world BEMS applications; Right: Percentage (%) of ML types in real-world BEMS applications.

FLC approaches excelled in environments with uncertainty, sparse sensing, or strong human-in-the-loop requirements. Their interpretability and robustness rendered them suitable for comfort and IAQ control, as shown across multiple real experiments [144,145,150]. Metaheuristics, such as GA and PSO, proved suitable for non-convex or multi-energy scheduling tasks, particularly when forecasting modules feed into higher-level optimization [152,162]. Their slow runtime, however, restricted them mostly to supervisory, day-ahead, or design-level tasks, and such implementations appeared primarily within hybrid algorithmic approaches. Moreover, in several deployments, hybrid architectures demonstrated the ability to combine forecasting, optimization, and constraint-handling within a single framework [135,158,163]; their performance advantages, however, were context-dependent and not uniformly superior to pure MPC or pure RL approaches. More specifically:

Model Predictive Control: Across the experimental real-world deployments reviewed, MPC was predominantly implemented using classical, physics-based models, which are frequently reported to integrate cleanly with existing BMS/BAS infrastructures and to offer predictable performance [96,112,117] (see Figure 8-Left). Calibrated RC or state-space models, combined with convex cost functions and constraint sets, enable transparent operation and consistent real-time behavior [105,111,114]. In situations with high uncertainty, such as fluctuating occupancy, outdoor conditions, or grid-interaction scenarios, robust and stochastic MPC formulations were used [102,124]. Such formulations embed uncertainty via chance constraints or conservative sets, reducing comfort violations with minimal increases in energy use [102,124]. When detailed physics-based modeling became costly or systems were highly complex, data-driven MPC was adopted. These formulations retain an optimization-friendly structure while improving prediction accuracy through ML models (e.g., residual regressors, neural networks) [107,113,118]. Such hybrid formulations are reported to lower prediction errors without compromising real-time feasibility [112,121,122].

Figure 8. Left: Occurrence of MPC types in real-world BEMS applications; Right: Occurrence of MPC solvers in real-world BEMS applications.

For buildings with dynamic tariffs or onsite RES generation, economic MPC (E-MPC) emerged as the dominant strategy [108,110]. By directly minimizing energy costs or maximizing PV self-consumption, E-MPC was frequently reported to deliver double-digit cost reductions and significant peak-load reductions compared with RBC or PI control [115,120,122]. Conversely, comfort-driven MPC remained common in academic testbeds and constant-tariff settings, where the primary objective is maintaining indoor comfort rather than reducing energy bills [111,114,125]. A smaller but notable share of real-life studies adopted distributed or hierarchical MPC [105,119]. Using ADMM or price-based coordination, such architectures efficiently managed multi-zone buildings or campus-scale systems by separating slower economic optimization from faster local comfort tracking, thereby maintaining both tractability and autonomy.
A consistent trend across real-life MPC studies was the preference for convex optimization. When comfort, actuator limits, and energy costs can be expressed linearly, MPC formulations rely on QP or MIQP solvers, ensuring reliable convergence and predictable computation [105,123,127] (see Figure 8-Right). Nonlinear solvers (NLP) were primarily used for PMV, humidity, or CO₂ constraints, and even then researchers employed efficient algorithms (IPOPT, fmincon, sequential linearization) to maintain real-time feasibility [99,113,114,117]. Moreover, discrete actuators introduced additional complexity. Even when the main optimization was convex, discrete equipment modes (compressors, valves, fans) were managed using MIQP formulations or dedicated post-processing logic to avoid short-cycling [110,112,120,126]. When system nonlinearities remained strong, metaheuristics such as GA or PSO were occasionally integrated, as observed in [104,115,129]; such methods, however, needed to be carefully bounded in order to preserve real-time operation. Comfort modeling also became increasingly sophisticated: numerous MPC approaches retained convexity by linearizing PMV or using temperature bands [96,111], while others implemented adaptive or occupant-centered models based on feedback or estimated thermal sensation [101,116]. Such personalized approaches proved adequate to reduce energy use while preserving (or even improving) perceived comfort [99,127].
In summary, the reviewed practical deployments indicate that MPC implementations tend to perform most reliably when designed pragmatically: using calibrated grey-box models [112], lightweight estimation [114,126], convex or mixed-integer solvers [105,127], and horizons tailored to building dynamics [96,111]. Across the reviewed real-world studies, reliability, explainability, and real-time feasibility are more frequently emphasized than theoretical optimality [116,117,125]. Under these conditions, MPC is able to deliver measurable savings, improved comfort, and enhanced flexibility in operational buildings [105,106,112,118].
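The post-processing logic for discrete actuators mentioned above can be illustrated with a minimum on/off-time filter. The sketch below is a hypothetical example of such logic rather than the scheme used in any cited study: it takes the on/off requests produced by an optimizer and holds the current state until an assumed minimum dwell time (in control steps) has elapsed.

```python
def enforce_min_dwell(requested, min_on=3, min_off=2, state=False, dwell=None):
    """Filter on/off requests so that the equipment never short-cycles.

    requested : iterable of bools, the raw on/off decisions per control step
    min_on    : minimum number of steps the unit must stay on once started
    min_off   : minimum number of steps the unit must stay off once stopped
    state     : initial equipment state
    dwell     : steps already spent in the initial state (assumed long enough
                to allow an immediate switch when not given)
    """
    if dwell is None:
        dwell = max(min_on, min_off)
    applied = []
    for want in requested:
        limit = min_on if state else min_off
        if want != state and dwell >= limit:
            state, dwell = want, 0       # switching is allowed: reset the timer
        dwell += 1                       # otherwise hold the current state
        applied.append(state)
    return applied


# An optimizer that toggles every step would wear out a compressor;
# the filter turns the request into longer on and off blocks.
raw = [True, False, True, False, True, False, True, False]
safe = enforce_min_dwell(raw)
```

Such a filter sits between the optimizer and the actuator, so the optimization itself can remain convex while the equipment constraint is enforced afterwards.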
Artificial Neural Networks: Real-world ANN implementations for building energy management seem to have evolved far beyond simple feedforward predictors, increasingly adopting deep, temporal, and hybrid architectures. The dominant pattern follows supervised learning with offline training on building data, followed by online adaptation or seamless BEMS integration for real-time operation. Early demonstrations, such as [130], used a Bayesian-regularized MLP trained on occupant feedback to personalize HVAC setpoints in a commercial office. Updated daily and interfaced directly with PID controllers, this study showed that even lightweight and interpretable ANNs were able to deliver real-time personalization while remaining fully compatible with legacy control structures. A major share of real-world ANN deployments employed recurrent or otherwise time-aware architectures (NARX networks, LSTMs, or hybrid temporal models) to capture thermal inertia, occupancy-driven variability, and indoor air quality dynamics (see Figure 9-Left). In [131], an integrated MLP–NARX model predicted indoor temperatures under varying occupancy with sub-degree accuracy across both short and long horizons. Similarly, ref. [134] embedded an LSTM within a BEMS to predict CO₂ concentrations 5 min ahead and control an ERV system. This recurrent design captured temporal correlations more effectively than DNN and GRU baselines, demonstrating the advantages of temporal learning for proactive HVAC and ventilation management.

Figure 9. Left: Occurrence of ANN types in real-world BEMS applications; Center: Occurrence of RL types in real-world BEMS applications; Right: Occurrence of FLC types in real-world BEMS applications.

A second emerging trend concerned embedded and edge-deployed ANN models for distributed control and monitoring. In [132,133], 1-D CNNs and LSTM predictors were implemented on low-cost microcontrollers for residential NILM. Processing aggregated power data at 8 s intervals, these networks achieved less than 12% disaggregation error, proving that compact deep models can operate efficiently on resource-constrained hardware. Such work reflected the growing movement toward edge AI, enabling decentralized analytics and low-latency decision-making within building systems. Another important direction concerns the deployment of ANNs as soft sensors and surrogates, reducing reliance on physics-based models that may be difficult or costly to develop. For example, in [135], researchers used a three-layer MLP to estimate natural ventilation airflow from readily available BMS variables (temperature differences, wind conditions, window geometry). Such an approach provided a scalable, low-cost alternative to continuous CFD or first-principles modeling, especially in retrofits where physical models are unavailable.
Across real-world case studies, interoperability remained essential. Most ANN modules interacted with existing BEMS layers via standard communication protocols (BACnet, EnOcean, Wi-Fi), supplying forecasts or supervisory setpoints without replacing core controllers [130,131,134]. Such modular integration preserved safety, supported retrofitting, and ensured compatibility with diverse hardware ecosystems. Architecturally, deployed ANNs tend to be compact, typically with one to three hidden layers, trained with standard optimizers such as Adam and employing ReLU, tanh, or softmax activations depending on task requirements [130,131,135]. Practical deployments favor computational feasibility: hidden layers typically hold 5–50 neurons, and prediction horizons typically span 1–15 min to maintain fast inference on embedded platforms [131,132,135].
A final trend involves contextual and personalized learning. ANN-based controllers increasingly integrated physiological or behavioral indicators, enabling comfort and IAQ adaptation at the individual level. The occupant-feedback-driven comfort adaptation in [130] and the comfort-voting mechanism from [138] (from the RL domain) illustrated how personalized inputs are extending into ANN-based decision loops. Likewise, the authors of [134] incorporated metabolic and anthropometric features as LSTM inputs, enabling personalized IAQ optimization in a working office environment.
Overall, the reviewed real-world evidence suggests that ANNs have matured into robust, deployable modules for prediction and supervisory control. Their advantages, such as interpretability, ease of integration, computational efficiency, and the ability to incorporate temporal and contextual information, render them well-suited solutions for live building operation. As reflected in the reviewed case studies, ANN-based methods extend beyond offline modeling tools to fully operational components capable of real-time adaptation, personalized comfort support, and cost-effective energy optimization in modern BEMS.
Reinforcement Learning: Unlike MPC, model-free RL algorithms learn control strategies directly from interaction data rather than from explicit physics-based models. This makes them naturally suited to buildings, where nonlinear thermal dynamics, stochastic occupancy, and volatile renewable generation create conditions that are difficult to capture analytically. The advantages of this ML type were clearly demonstrated in [139], where a DQN agent inferred HVAC setpoints solely from environmental states and tariff signals, and in [141], where a SAC agent controlling a real Thermally Activated Building System (TABS) installation using only sensor data achieved MPC-like predictive behavior without any plant model.
Compared with simulation-heavy literature, real-world RL deployments remain limited; however, the evaluation reveals a consistent methodological pattern centered on safety, adaptability, and stable field performance. The dominant strategy is offline pre-training followed by online fine-tuning, which reduces the risks of unsafe exploration in occupied buildings. Historical data allow a policy to train before interacting with the real system, thus ensuring safe initialization. This approach is clearly visible in [136], where an FQI agent was batch-trained on DHW–PV trajectories and then updated hourly online. Similarly, in [137] researchers employed a DDPG agent pretrained using an EnergyPlus–Modelica co-simulation, and ref. [138] initialized a BDQ RL agent through offline Modelica training prior to human-guided online refinement. Taken together, such studies consistently report offline–online hybrid learning as a practical and field-safe RL deployment strategy. Architectural patterns also converge across real implementations: the actor–critic framework has effectively become a default design due to its sample efficiency and stable policy convergence (see Figure 9-Center). Off-policy methods such as DDPG and SAC [137,141] dominated continuous-control tasks, whereas PPO variants were more common in mixed or discrete settings [140,142,143]. The actor computes continuous actions, while the critic stabilizes training by evaluating value functions. Several real deployments refined such algorithms further: ref. [140] introduced a dual-loop retraining scheme for robustness under non-stationarity, and ref. [142] incorporated invalid-action masking to block unsafe HVAC setpoints. Such adjustments demonstrated how real building constraints are shaping practical RL design.
Another complementary trend involved integrating imitation or human-in-the-loop learning to improve convergence speed and interpretability. In [143], a PPO controller was initialized via Behavioural Cloning from legacy rule-based data before online RL adaptation. In [138], occupant comfort feedback updated the BDQ agent’s reward and policy in real time. Such integrated RL schemes combined the stability of supervised learning with the adaptability of RL, reducing exploration time and aligning RL decisions with human comfort expectations. Finally, recent work places increasing emphasis on continual learning, safety monitoring, and trustworthy AI techniques. The PPO controller in [140] detected policy degradation via cumulative reward trends and triggered offline retraining with Elastic Weight Consolidation to prevent catastrophic forgetting. The Maskable PPO in [142] enforced comfort and safety constraints by dynamically filtering out unsafe actions during exploration. These mechanisms illustrate a broader shift toward RL architectures that are not only adaptive but also dependable during long-term, real-building operation.
Overall, real-world RL deployments in BEMS now follow a mature algorithmic blueprint: (i) offline pre-training for safety, (ii) actor–critic architectures for stable learning, (iii) imitation or human-in-the-loop adjustments for improved adaptability, and (iv) safety-enhancing mechanisms such as masking, reward shaping, or continual-learning loops. When combined with standard BEMS integration, such practices allow RL controllers to operate continuously in occupied buildings, maintain comfort, and adapt to uncertain environmental and occupancy dynamics.
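The invalid-action masking mentioned among the safety mechanisms can be sketched as follows. The example assumes a discrete set of candidate setpoints and a simple comfort band during occupancy. The setpoint grid, the band limits, and the function names are hypothetical and are not taken from the cited Maskable PPO deployment.

```python
import math

SETPOINTS = [16.0, 18.0, 20.0, 22.0, 24.0, 26.0]  # assumed discrete actions (degC)


def valid_mask(occupied, band=(20.0, 24.0)):
    """Allow every setpoint when the zone is empty, only the comfort band otherwise."""
    if not occupied:
        return [True] * len(SETPOINTS)
    lo, hi = band
    return [lo <= sp <= hi for sp in SETPOINTS]


def masked_policy(logits, valid):
    """Softmax over the policy logits with invalid actions forced to probability 0."""
    if not any(valid):
        raise ValueError("at least one action must remain valid")
    masked = [l if ok else -math.inf for l, ok in zip(logits, valid)]
    top = max(masked)
    exps = [math.exp(l - top) for l in masked]   # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]


# Even if the raw policy strongly prefers 26 degC, masking removes it while occupied.
logits = [2.0, 1.0, 0.0, 0.0, 0.0, 3.0]
probs = masked_policy(logits, valid_mask(occupied=True))
```

Because masked actions receive zero probability, they can be neither sampled during exploration nor selected greedily, which is what makes the mechanism attractive for occupied buildings.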
Fuzzy Logic Control: Real-world FLC applications have evolved from simple comfort heuristics to adaptive, multi-input, and hardware-embedded intelligent controllers. Early demonstrations, such as [144], showed how occupant-driven fuzzy systems could directly translate human thermal feedback into HVAC actions using a Mamdani inference structure. Here, subjective states (“too cold”, “neutral”, “too hot”) were fuzzified and converted into setpoint adjustments, forming a foundation for human-in-the-loop HVAC control that adapted continuously to individual preferences and uncertain occupant responses. Subsequent field studies extended FLC towards more autonomous and environment-aware thermal regulation. In [145], a Mamdani controller using indoor temperature, its derivative, and outdoor temperature delivered smooth, anticipatory heating control without any prediction model, showing that fuzzy reasoning can mimic MPC-like behavior at a fraction of the computational cost. Likewise, ref. [146] integrated wearable-sensor data (skin temperature, its rate of change, heart rate) into a Fuzzy Comprehensive Evaluation module, fusing multiple occupants’ sensations into a single comfort index for zone-level HVAC adaptation in a real office. Such works illustrated how fuzzy logic naturally accommodates uncertainty, heterogeneous feedback, and nonlinear thermal behavior.
A parallel line of development concerned hybrid fuzzy–PID control structures, designed to address multivariable, nonlinear building environments without sacrificing interpretability. In [148], a Multiple-Input Multiple-Output fuzzy–PID controller with 81 rules dynamically tuned PID gains to regulate temperature, humidity, and air quality simultaneously in a real poultry facility, achieving high comfort stability under volatile microclimatic conditions. Similarly, in [149] researchers integrated a Takagi–Sugeno fuzzy supervisor with an MLP–NARX model that captured occupancy-driven heating dynamics, enabling an adaptive, rule-based strategy integrated with EnOcean wireless sensing and PLC hardware in a multi-storey academic building. Recent FLC implementations increasingly emphasized embedded and edge-deployable fuzzy control. In [147], a Mamdani FLC was embedded directly into a smart-meter prototype, controlling PV–battery interactions every 60 s and managing stochastic load and PV fluctuations on low-power hardware. Likewise, in [150], a 49-rule Mamdani FLC for AHU temperature regulation was embedded onto an ESP32 microcontroller, demonstrating robust, real-time performance under hardware-in-the-loop testing. In the broader energy domain, ref. [151] deployed a Sugeno-type FLC on an edge PC to manage PV–battery power flows through Modbus-based communication with inverters and sensors, using 45 adaptive rules to balance load, state of charge, and electricity prices in real time. Such studies indicate that FLC can be computationally lightweight and highly compatible with IoT- and PLC-based BEMS architectures.
Across these real-world applications, fuzzy control retains two defining strengths: interpretability and robustness under uncertainty. Mamdani-type controllers, favored in comfort-oriented settings (See Figure 9-Right) [144,145,146], provide transparent rule bases that building operators can understand and adjust. In contrast, Sugeno or fuzzy–PID hybrids dominate energy management and grid-interactive applications [148,149,151] due to their computational simplicity and smooth output surfaces (See Figure 9). Architecturally, FLC designs remain compact—typically two to three inputs and 25–81 rules—yet fully compatible with common BEMS communication layers (LabVIEW, PLCs, Modbus, EnOcean), allowing safe, plug-and-play deployment. In summary, real-world FLC usage has expanded from occupant-centric comfort tuning [144] to adaptive, multi-input, hybrid, and embedded architectures [145,148,150,151]. These systems leverage fuzzy inference’s interpretability and resilience to uncertainty, offering reliable, low-cost, and scalable performance across HVAC, IAQ, PV–battery management, and broader smart-grid applications.
Hybrids: Real-world hybrid ML controllers seem to follow a practical pattern: they combine predictive intelligence—using statistical forecasts, learned models, or digital-twin surrogates—with an optimization or search layer that produces real-time setpoints. A common design uses time-series forecasting and state estimation to feed a metaheuristic optimizer, which then searches for suitable control sequences. For example, ref. [152] integrates demand forecasting, a hybrid thermal/Kalman estimator, and an ACO/Simulated Annealing search to select domestic hot-water reheat schedules over a 48 h/5 min horizon. Other variants simply swap the search strategy: ref. [153] uses a Random Neural Network trained with PSO and SQP for HVAC control in an IoT BEMS, while [162] combines Random Forest forecasts with a GA scheduler for day-ahead multi-vector energy management. Some studies incorporate knowledge-based reasoning as well—for instance, [155], where MLP occupancy predictions support case-based reasoning (CBR) selecting HVAC actions within a lightweight multi-agent framework.
A second major stream fused classical MPC structure with ML flexibility. One way was to adapt the MPC model online: ref. [156] updated linear zone models using RLS and then ran LP/QCQP MPC in a hierarchical, MQTT-based allocator across 85 zones. Another approach was to train ANNs to imitate MPC itself. In [160], a NARX-RNN learned MPC trajectories (60 min horizon, 5 min steps), replacing online optimization with fast policy inference while preserving MPC-like control. Several hybrids also separated slow planning from fast operation: ref. [158] used PSO to size storage for cost minimization, while MPC handled 24 h/5 min real-time heat-pump and battery control; ref. [159] also used PSO for day-ahead flexible-load scheduling and an RBC for stable real-time dispatch.
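The online model adaptation described above can be illustrated with a minimal recursive least squares (RLS) sketch. It is not the implementation of [156]: the first-order zone model, its parameter values, the forgetting factor, and the synthetic data are all assumptions chosen for illustration.

```python
import random

def rls_update(theta, P, x, y, forgetting=0.99):
    """One RLS step for the linear model y ~ theta . x (with forgetting)."""
    n = len(x)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = forgetting + sum(x[i] * Px[i] for i in range(n))
    gain = [v / denom for v in Px]
    error = y - sum(theta[i] * x[i] for i in range(n))
    theta = [theta[i] + gain[i] * error for i in range(n)]
    # P is symmetric, so x^T P equals (P x)^T and the update stays symmetric.
    P = [[(P[i][j] - gain[i] * Px[j]) / forgetting for j in range(n)]
         for i in range(n)]
    return theta, P

def identify_zone_model(n_steps=500, seed=0):
    """Fit T[k+1] = a*T[k] + b*T_out[k] + c*u[k] on synthetic data."""
    rng = random.Random(seed)
    a, b, c = 0.9, 0.1, 0.5  # hypothetical "true" zone parameters
    theta = [0.0, 0.0, 0.0]
    P = [[1000.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    t_zone = 21.0
    for _ in range(n_steps):
        t_out = rng.gauss(10.0, 5.0)   # synthetic outdoor temperature
        u = rng.uniform(0.0, 3.0)      # synthetic heating input
        t_next = a * t_zone + b * t_out + c * u + rng.gauss(0.0, 0.01)
        theta, P = rls_update(theta, P, [t_zone, t_out, u], t_next)
        t_zone = t_next
    return theta
```

In a deployment of this kind, the updated parameters would be passed to the MPC at each control step; the forgetting factor trades tracking speed against noise sensitivity.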
A third family utilized calibrated ML surrogates to enable safe learning and fast control. For instance, in [157], an EnergyPlus model was calibrated using GA and Bayesian methods, accelerated using a Random Forest PID surrogate, and then used to train an A3C policy that transfers reliably to a real HVAC system. NILM-oriented hybrids also appeared in the literature: ref. [161] employed RBFNN predictors optimized via a multi-objective GA to perform appliance disaggregation at 1 min sampling, supporting downstream control in an embedded HEMS. At the minimalist end, ref. [164] utilized an ALAMO-based surrogate to feed a predictive rule-based controller on a PLC, achieving accurate preheating without relying on cloud computation.
Recent hybrid controllers explicitly incorporated safety and explainability by pairing learning modules with constraint-handling or transparent fallback layers. In [163], a residential microgrid controller used TD3 alongside a MILP MPC planner and a PSO-trained decision tree for interpretable fallback control. In a multi-zone HVAC system, ref. [165] combined feature selection (ReliefF), deep sequence prediction (CNN–BiLSTM), and a Whale Optimization Algorithm that tuned the PID controller, creating a stable pipeline where ML handles forecasting and the PID ensures reliable actuation.
Across deployments, hybrid pipelines follow a common logic: forecast or identify the state/optimize or schedule/supervise or track, using horizons and timesteps that match the building’s physical dynamics (typically 12–48 h horizons with 1–15 min control). Metaheuristics or MILP/LP/QCQP solvers are used when equipment modes or discrete logic dominate, while learned surrogates or MPC-distilled policies are used to keep real-time computation light. Most computation is pushed toward the edge—via MQTT, BACnet, or REST—ensuring low latency, modularity, and fault isolation [152,153,155,156,157,158,159,160,161,162,163,164,165].
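The forecast/optimize/supervise logic can be sketched as three small functions. This is a toy under stated assumptions, not any reviewed controller: a persistence forecast stands in for the learned predictor, a greedy cheapest-hours selection stands in for the metaheuristic or MILP search, and the tank temperature limits are hypothetical.

```python
def forecast(history, horizon, period=24):
    """Persistence forecast: repeat the most recent daily profile."""
    start = len(history) - period
    return [history[start + (h % period)] for h in range(horizon)]

def optimize(prices, hours_needed):
    """Pick the cheapest hours to run a hot-water heater (greedy stand-in)."""
    ranked = sorted(range(len(prices)), key=lambda h: prices[h])
    chosen = set(ranked[:hours_needed])
    return [h in chosen for h in range(len(prices))]

def supervise(planned_on, tank_temp_c, min_temp_c=45.0, max_temp_c=65.0):
    """Rule layer: override the plan when the tank leaves its safe band."""
    if tank_temp_c < min_temp_c:
        return True
    if tank_temp_c >= max_temp_c:
        return False
    return planned_on

# Example: hourly prices with a cheap window at hours 6 and 7.
price_history = [30.0] * 6 + [10.0] * 2 + [30.0] * 16
plan = optimize(forecast(price_history, 24), hours_needed=2)
```

Keeping the three stages as separate functions mirrors the modularity noted above: any one stage can be replaced (for example, the greedy search by a GA) without touching the others.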
Reviewing all real-life hybrid BEMS deployments, three families consistently dominate (See Figure 10-Left): (i) MPC/Metaheuristic hybrids—where MPC is paired with PSO or GA—are frequently used for multi-energy scheduling and handling nonlinearities [158,161,162]. (ii) ANN/Heuristic or ANN/Surrogate hybrids use neural predictors or surrogate models to support rule-based or optimization layers with low computation cost [153,155,165]. (iii) RL/Optimization hybrids blend the adaptability of Reinforcement Learning with the safety of optimization, showing promising results in HVAC, RES, and storage coordination [152,157,163]. Overall, real-world practice favors combinations that merge accurate prediction with robust, explainable optimization—ensuring both adaptability and operational reliability in BEMS.

Figure 10. Left: Occurrence of hybrid schemes in real-world BEMS applications; Right: Occurrence of hybrid counterparts in real-world BEMS applications.

Finally, in terms of algorithmic schemes, real-world hybrid control strategies commonly combined metaheuristics (such as GA and PSO), ANN predictors (such as LSTM or NARX), and traditional MPC schemes (see Figure 10-Left). Such setups typically used ML models to forecast conditions or learn system behavior, while an optimization layer such as MPC handled real-time decision-making and constraint enforcement. This combination balances prediction accuracy with operational safety and reliability. As a result, hybrid approaches proved to be among the most flexible and deployment-ready control solutions, effectively managing HVAC, PV, storage, and hot water systems in real buildings across different climates.

6.1.2. Evaluation per Agent Architecture

Multi-agent control implementations in real-world settings were limited in number (See Figure 11-Left); however, this practice seems to be gradually evolving from experimental prototypes to mature coordination frameworks. According to the evaluation, three clear families of multi-agent strategies have emerged across real buildings. The first comprises cooperative distributed MPC architectures, where intelligence is divided among agents assigned to autonomous building zones that iteratively reach consensus through optimization-based coordination. Studies such as [105,119] showed that consensus mechanisms, either via ADMM or virtual-price dual decomposition, enable agents to maintain autonomy while converging to near-centralized performance. Such cooperative distributed MPCs emphasize convex coordination, peer-to-peer communication, and real-time feasibility, showing that full centralization is unnecessary for high performance. This scheme represented the most extensively validated form of optimization-based multi-agent control in real buildings [75] (See Figure 11-Right).
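The virtual-price idea can be illustrated with a toy dual-decomposition loop. This is a sketch under assumptions, not the scheme of [105,119]: each zone is given a hypothetical quadratic discomfort cost around a desired power, and a coordinator adjusts a single price until the zones together respect a shared power cap.

```python
def zone_response(desired_kw, price, max_kw=10.0):
    """Local agent: minimise 0.5*(p - desired)^2 + price*p over 0 <= p <= max_kw."""
    return min(max(desired_kw - price, 0.0), max_kw)

def coordinate(desired_kw, cap_kw, step=0.2, iterations=100):
    """Coordinator: projected subgradient ascent on the virtual price."""
    price = 0.0
    for _ in range(iterations):
        powers = [zone_response(d, price) for d in desired_kw]
        # Raise the price when the shared cap is exceeded, lower it otherwise.
        price = max(0.0, price + step * (sum(powers) - cap_kw))
    return price, [zone_response(d, price) for d in desired_kw]

# Three zones wanting 4, 3, and 5 kW under a 9 kW cap.
price, powers = coordinate([4.0, 3.0, 5.0], cap_kw=9.0)
```

Only the scalar price and each zone's power request are exchanged, so the agents keep their local models private. For this unit-weight toy the step size must stay below 2 divided by the number of zones for the price iteration to converge.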

Figure 11. Left: Percentage (%) of single-agent vs. multi-agent real-world applications for BEMS; Right: Occurrence of multi-agent strategies (cooperative, hierarchical, fully decentralized) in real-world applications for BEMS.

A second trend involves hierarchical multi-agent systems that distribute prediction and reasoning across functional layers. In these, local agents, corresponding to different building zones or energy systems, handle data-driven forecasting or comfort modeling at the edge, while a supervisory coordinator allocates energy resources. Such a pattern has been demonstrated in [133,155,156], combining local learning and global scheduling and resulting in reduced communication load, higher robustness, and easier integration of complex subsystems, such as HVAC, TSS, and PV. Such a hierarchy among agents also enabled privacy-preserving collaboration through federated or case-based reasoning, allowing each agent to learn independently but contribute collectively to system-wide optimization. Hierarchical (or layered) multi-agent systems are particularly suited to large, sensor-rich environments, reflecting a transition toward edge–cloud intelligence fusion in building control.

The third direction highlighted fully decentralized and model-free multi-agent strategies. Works like [166,167] illustrated that agents may cooperate without any global model or solver, relying only on local feedback or shared scalar performance indices. Such lightweight designs proved able to deliver plug-and-play scalability, resilience to communication loss, and minimal computational cost, marking a promising path for low-infrastructure retrofits. Similarly, occupant-centric fuzzy controllers such as [144] embed humans as agents, transforming comfort feedback into decentralized control signals and thus introducing social intelligence into the energy loop. Together, these approaches revealed a growing movement toward self-organizing, low-communication, and adaptive control paradigms capable of operating autonomously yet coherently (See Figure 11-Right).

Across these strategies, several patterns emerge. Cooperative distributed MPCs were applied to complex HVAC coordination due to their proven convergence and predictability; hierarchical multi-agent systems were applied to multi-zone or multi-building settings, where balancing autonomy and supervision was crucial; and decentralized model-free agents, which represent the most widely used multi-agent architecture (See Figure 11-Right), excelled in robustness and ease of deployment, though often at the expense of theoretical guarantees. The overall trend indicates that real-world multi-agent BEMS are shifting from centralized optimization towards distributed, learning-enhanced, and communication-light control, with each architecture reflecting a different balance between autonomy, coordination, and computational complexity. However, only a few studies have integrated such paradigms into unified frameworks. This suggests a gap in the literature, since the next generation of intelligent BEMS will rely on multi-agent systems that merge learning, negotiation, and distributed optimization under common communication protocols and shared performance metrics [75].

6.1.3. Evaluation per FH/TS

Across all real-life BEMS implementations reviewed, the forecast horizons and timesteps exhibit a clear methodological clustering that reflects the underlying control philosophy, computational feasibility, and physical dynamics of each ML category. MPC-based controllers typically adopted short to medium horizons ranging from 1 to 24 h with fine-grained timesteps between 2 and 15 min, as seen in [96,100,105,111,117,123,124], balancing computational load and prediction accuracy for HVAC thermal inertia. The most frequent configuration among MPC studies was 24 h/10–15 min, representing daily load optimization with real-time responsiveness to weather and occupancy fluctuations. Economic MPC approaches such as [106,110,126] often extended to 24–48 h in order to capture price and storage dynamics, while stochastic or data-driven MPC applications [107,113,121] usually employed shorter horizons (1–6 h) with denser sampling to offset model uncertainty and enhance adaptability. ANN-based real-world implementations exhibited either ultra-short timesteps (seconds to minutes) for fast sensor-driven adaptation [132,133], or medium granularities (5–15 min) for HVAC prediction and IAQ control [130,134,135]. In contrast, RL-based controllers generally operated at short intervals (5–15 min) but relied on implicit or rolling horizons derived from episode length rather than explicit forecasts, as reported in [137,138,141,142]. Such temporal granularity suited online learning and continuous adaptation but limited day-ahead optimization, so these controllers excelled mainly in highly dynamic indoor environments. Fuzzy real-world controllers generally used 1–30 min steps without explicit horizons due to their rule-based nature [145,147,151], prioritizing responsiveness over look-ahead.
Finally, hybrid approaches reveal wider temporal diversity: MPC-based hybrids (e.g., [156,158]) maintain multi-hour (4–24 h) horizons, while RL- or swarm-based hybrids (e.g., [152,165]) achieve minute-level resolution (1–5 min) for real-time adaptability. Such diversity underlines a broader trend—model-predictive and hybrid methods favor day-ahead optimization, while learning-based and fuzzy approaches emphasize short-term or reactive control. Overall, a convergence was evident toward sub-hourly timesteps (5–15 min) as a practical standard for balancing computational feasibility, data availability, and comfort sensitivity in real-life BEMS.
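The computational weight of a horizon/timestep pair is easiest to see as the number of control steps the optimizer must plan per variable. The helper below is an illustrative sketch; the function name is hypothetical, and the example configurations are those quoted in this review.

```python
def horizon_steps(horizon_h, timestep_min):
    """Number of control steps in a forecast horizon (must divide evenly)."""
    horizon_min = round(horizon_h * 60)
    if timestep_min <= 0 or horizon_min % timestep_min != 0:
        raise ValueError("timestep must be positive and divide the horizon")
    return horizon_min // timestep_min

# Configurations reported in the reviewed deployments.
typical_mpc = horizon_steps(24, 15)   # 24 h / 15 min
dhw_search = horizon_steps(48, 5)     # 48 h / 5 min, as in [152]
mpc_imitation = horizon_steps(1, 5)   # 60 min / 5 min, as in [160]
```

A 48 h/5 min problem has six times as many steps per decision variable as the typical 24 h/15 min MPC setup, which is one reason long, fine-grained horizons tend to be paired with metaheuristic search or learned policies rather than exact online optimization.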

6.1.4. Evaluation per Data Utilized

Across all reviewed real-world deployments, a consistent pattern emerged in the sensing and data architectures that enabled intelligent control. Almost every field implementation relied on a core “HVAC–weather” data stack consisting of zone-level thermal states (indoor or operative temperature), outdoor temperature, solar irradiance, and HVAC energy or power measurements. These were complemented by air- or water-side process variables such as supply and return temperatures, flow rates, and valve or damper positions. This foundational layer was nearly universal, as it aligned with standard BAS/BMS sensing capabilities and provided the observability required for thermal modeling and control actuation. Air-side systems (such as AHUs and VAV terminals) typically emphasized airflow, mixed/supply/return temperatures, and fan or compressor power [109,117,156], while hydronic or radiant equipment relied on water-side measurements such as mass flow, differential pressure, and loop-level power readings [119,157,166]. Together, these measurements formed the minimal but sufficient information set that nearly all ML controllers used to model heat transfer, identify system states, and compute safe HVAC commands. Table 8 summarizes the different data clusters and the data types they cover in real-world BEMS applications.

Table 8. Data clusters and types utilized in real-world building energy management.

Beyond this core stack, occupancy and indoor air quality (IAQ) data were the most common, and most impactful, extensions to the sensing layer (See Figure 12). Notably, CO₂ concentration was the dominant IAQ variable in practice, often combined with relative humidity, not only because it reflected ventilation needs but also because it acted as a reliable proxy for real-time occupancy. This dual-purpose role allowed ML controllers to reduce over-ventilation during low-occupancy periods while ensuring healthy indoor environments [109,134,156]. Some implementations enriched the signal further by incorporating passive infrared motion detection, plug-load measurements, or schedule-based occupancy tagging [155]. These data streams supported dynamic adjustments of setpoints, ventilation rates, or cooling loads. More user-centric frameworks went one step further: they integrated occupant preference models or thermal sensation estimators, enabling controllers to anticipate comfort rather than merely react to temperature deviations [130,160]. Overall, occupancy and IAQ sensing became the first and most effective enhancement beyond basic thermal monitoring, directly influencing the energy–comfort balance at the heart of HVAC operation.
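One common way to read CO₂ as an occupancy proxy is a steady-state mass balance, in which the CO₂ removed by ventilation equals the CO₂ generated by occupants. The sketch below is an illustration only and is not taken from the reviewed studies; the per-person generation rate of 0.018 m³/h is an assumed value for sedentary adults, and real zones are rarely at steady state.

```python
def estimate_occupancy(co2_indoor_ppm, co2_outdoor_ppm, airflow_m3_per_h,
                       generation_m3_per_h_person=0.018):
    """Steady-state occupancy estimate from a CO2 mass balance.

    N = Q * (C_in - C_out) / G, with concentrations converted from ppm
    to volume fractions. G is an assumed per-person CO2 generation rate.
    """
    delta_fraction = max(co2_indoor_ppm - co2_outdoor_ppm, 0.0) * 1e-6
    return airflow_m3_per_h * delta_fraction / generation_m3_per_h_person

# Example: 780 ppm indoors, 420 ppm outdoors, 500 m3/h of outdoor air.
people = estimate_occupancy(780.0, 420.0, 500.0)
```

Because CO₂ builds up and decays slowly, an estimate of this kind lags actual occupancy changes, which may be one reason some deployments combine it with motion or plug-load signals.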

Figure 12. Data cluster occurrence in real-world applications for BEMS.

A third major identified trend concerns the growing reliance on Weather data, which transforms controllers from reactive to proactive agents (See Figure 12). Most modern MPC and RL deployments integrated weather forecasts—including ambient temperature, solar radiation, and sometimes humidity—along with internal load and electricity price forecasts. Such predictive inputs enabled strategies such as pre-cooling and pre-heating, tariff-based load shifting, and optimal storage utilization. For example, MPC deployments for AHU systems used k-NN or LightGBM weather predictors to anticipate cooling requirements [117], while distributed hydronic MPCs relied on outdoor temperature and heat-demand forecasts to coordinate multi-zone plant operation [119]. Smart-home and microgrid pilots incorporated day-ahead electricity price and PV generation forecasts to execute cost-optimal scheduling [108,110,162,163]. Such predictive capabilities were instrumental in achieving the reported energy and cost savings and are essential for participation in demand response and time-of-use pricing programs.

As buildings increasingly function as multi-energy microgrids, the sensor landscape expands further to incorporate Renewable and Distributed Energy Resources (DER). Real-life BEMS studies now routinely integrate PV output, battery and EV state of charge, thermal and hydrogen storage states, and grid import/export measurements [108,120,129,162]. This broadened sensing layer enables holistic optimization across thermal comfort, cost, and renewable utilization. Retail and office pilots coordinated HVAC and refrigeration loads with BESS and PV systems to reduce peak demand and enhance self-consumption [120,163]. Residential implementations integrated DHW tank temperatures, Air Source Heat Pump data, and time-of-use prices to optimize heating schedules and domestic energy flows [152]. Larger, campus-scale demonstrations combined multiple vectors such as PV, EVs, hydrogen systems, and multi-zone HVAC in unified optimization frameworks [162].

Altogether, the reviewed works indicated a coherent data design philosophy in real-world ML-based building energy management: start with the BAS-native thermal and weather sensors, add occupancy and IAQ for comfort intelligence, extend with forecasts for anticipative decision-making, and integrate DER and economic signals to unlock flexibility and grid participation. The consistency of this pattern across different climates, control architectures, and building typologies—evident from field studies in Europe, Asia, and North America—illustrates a maturing consensus on what information is essential for reliable, scalable, and cost-effective ML control in buildings [108,109,110,117,119,120,128,134,135,152,156,157,162,163,166].

6.1.5. Evaluation per Baseline Control

In the vast majority of field tests, the baselines were controllers already embedded in buildings, namely rule-based control with thermostatic hysteresis, PI/PID loops, or the default logic of commercial building management systems (BMS); see Figure 13-Left and Right. This pattern was evident across both air-side and hydronic deployments, since these strategies represent the business-as-usual condition that operators trust and that can be realistically benchmarked on site. For instance, RBC and PI/PID schemes were the reference logic for adaptive control in VAV air-handling units and hydronic plants where supervisory ML-based control was layered on top of existing automation [99,114,126,141,145,167].

Figure 13. Left: Occurrence of baseline control methodologies in real-world applications for BEMS; Right: Percentage (%) of baseline control methodologies in real-world applications for BEMS.

A related family of baselines relied on fixed schedules or static setpoints, representing the pre-existing automation commonly found in office, residential, or commercial buildings. Such “static” comparators replicated the do-nothing-smart operation that most buildings still employ today. Fixed-schedule thermostats and constant HVAC setpoints were therefore often used as the reference for evaluating learning-based and predictive controllers [120,128,130,164] (see Figure 13-Left and Right). These examples illustrate how simple programmable or static baseline strategies continue to serve as the dominant comparator, since they reflect real operational practice in the field and provide a consistent point of reference for energy and comfort improvements. However, a noticeable trend concerned the use of standardized or strong baselines to ensure fairer evaluation. A notable example is the adoption of ASHRAE Guideline 36 supervisory logic as a reference point in recent RL and MPC field trials, which raised the baseline beyond simple hysteresis or PI control [140]. Likewise, studies implementing distributed and hierarchical MPC increasingly benchmarked their performance against centralized MPC (as an optimal reference) and decentralized MPC (lacking coordination) to measure both performance and coordination efficiency [119]. Such high-fidelity baselines helped differentiate genuine algorithmic advancements from gains that arise purely from weaker comparators, improving methodological rigor and reproducibility.

When direct comparison with existing building control was not possible, real-world studies relied on simulation-based or analytical baselines. Such comparisons were not intended to mimic real control operation but to quantify incremental improvements in forecasting or inference performance. For example, predictive enthalpy-based control and data-driven optimization frameworks were often benchmarked against “non-predictive” simulations of the same building, reflecting what the energy use would be under static or reactive control [98]. Additionally, in data-driven modeling, simpler ML models such as linear regression, SVM, or KNN were used as algorithmic baselines, focusing on estimation accuracy rather than operational efficiency [129,135]. In the case of distributed energy resource integration and grid-interactive pilots, baselines corresponded to historical or non-optimized operation, often characterized by exposure to demand charges, uncoordinated PV self-consumption, and absence of storage control. Baseline comparisons were therefore drawn against a building’s prior yearly schedule or default vendor strategy, capturing typical inefficiencies in resource coordination [120,158,162]. This baseline definition allowed quantification of the added value that an intelligent EMS contributes under realistic economic and operational constraints.

Overall, two significant insights emerge. First, reported performance gains scale inversely with baseline strength: large energy and cost savings (20–50%) were consistently observed when comparing ML controllers against weak baselines such as RBC or static schedules, whereas smaller differences (typically below 10%) occurred when compared against an optimal or centralized MPC reference. In the latter cases, improvements shifted from pure efficiency toward scalability, robustness, and computational feasibility, as distributed and hierarchical schemes achieved near-optimal performance with significantly reduced computational overhead [119,141]. Second, the growing adoption of standardized and multi-tier baselines—including ASHRAE Guideline 36, centralized MPC, and historical operation—marked a methodological maturation in real-world ML control. It prevents inflated claims, promotes reproducibility, and establishes a more transparent framework for cross-study comparison.
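The dependence of reported savings on baseline strength follows directly from how savings are computed. The sketch below uses hypothetical energy figures (not data from any reviewed study) to show the same controller evaluated against a weak and a strong comparator.

```python
def savings_percent(baseline_kwh, controlled_kwh):
    """Relative savings of a controller with respect to a chosen baseline."""
    if baseline_kwh <= 0:
        raise ValueError("baseline energy must be positive")
    return 100.0 * (baseline_kwh - controlled_kwh) / baseline_kwh

# Hypothetical daily energy use (kWh) for one building and one ML controller.
ml_controller = 70.0
rule_based_baseline = 100.0      # weak comparator (RBC / static schedule)
centralized_mpc_baseline = 75.0  # strong comparator

vs_weak = savings_percent(rule_based_baseline, ml_controller)
vs_strong = savings_percent(centralized_mpc_baseline, ml_controller)
```

With these illustrative numbers the identical controller shows 30% savings against the rule-based baseline but under 7% against centralized MPC, which is why the comparator should always be reported alongside the headline figure.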

6.1.6. Evaluation per Performance Index

Performance evaluation in real-world ML-based BEMS has progressed from single-value efficiency metrics toward a comprehensive, multi-index framework that captures how buildings actually operate. Across the literature, five major metric categories dominated field evaluations: (a) energy and cost performance; (b) comfort and IAQ quality; (c) computational feasibility; (d) robustness and stability; and (e) flexibility or grid interaction (See Figure 14). Collectively, these clusters reflected a matured understanding that intelligent control needs to perform well not only in theory, but also under the diverse constraints and uncertainties of real buildings.

Figure 14. Left: Performance metrics occurrences in real-world implementations for BEMS; Right: Performance metrics percentage (%) in real-world implementations for BEMS.

The first and still most common category concerned energy and cost performance (See Figure 14-Left and Right). Metrics such as energy savings, cost savings, and energy efficiency improvement remained the principal indicators of economic and environmental value [108,109,117,119,120]. Their universal presence across MPC, RL, FLC, and hybrid schemes demonstrated a converging emphasis on transparent, interpretable metrics that managers and operators can directly relate to. This widespread use also facilitated cross-comparison across buildings, climates, and control strategies, making energy and cost indicators the most stable common ground for benchmarking in ML-BEMS research (see Figure 14-Left).

A second major category—comfort and indoor air quality—reflected the growing emphasis on human-centered operation (See Figure 14-Left and Right). Most real-life studies evaluated thermal comfort deviation, PMV, TCI, or CO₂-based IAQ as explicit metrics [109,117,156,166]. Their consistent inclusion denoted a shift from simulation-only validation towards occupant-acceptable control. As highlighted in [168], the success of a BEMS was increasingly judged by its ability to maintain comfort along with efficiency, turning energy management into a dual-objective optimization problem rather than a purely technical one. A third and rapidly growing cluster was focused on computational and operational feasibility (See Figure 14-Left and Right). Real-world deployments commonly reported computation times, solver convergence, real-time viability, and scalability (e.g., solution time vs. number of zones) [119,141,160]. Such emphasis reflected a practical reality: field-ready controllers need to fit within BMS hardware constraints, network communication limits, and operational time windows. By reporting these metrics, studies demonstrated that proposed ML controllers were not only theoretically sound but also feasible within the latency and reliability constraints of actual building operation.
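As an illustration of a thermal comfort deviation metric, the sketch below accumulates degree-hours outside a comfort band. The band limits, sampling interval, and temperature trace are hypothetical, and individual studies may define their comfort indices differently (for example via PMV).

```python
def comfort_violation_degree_hours(temps_c, low_c, high_c, timestep_min):
    """Sum of temperature excursions outside [low_c, high_c], in kelvin-hours."""
    dt_h = timestep_min / 60.0
    total = 0.0
    for t in temps_c:
        # Distance to the nearest band edge; zero while inside the band.
        deviation = max(low_c - t, t - high_c, 0.0)
        total += deviation * dt_h
    return total

# Four 15 min samples against a 21-25 degC comfort band.
violation = comfort_violation_degree_hours(
    [20.0, 21.5, 24.5, 26.0], low_c=21.0, high_c=25.0, timestep_min=15
)
```

Reporting such a figure next to energy savings makes it visible when a controller saves energy simply by letting zones drift out of the comfort band.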

Another increasingly prominent category concerned robustness, stability, and reliability (See Figure 14-Left and Right). Such metrics evaluated how controllers handled disturbances, model inaccuracies, sensor noise, and unexpected occupancy patterns. Examples include actuator smoothness, fallback behavior, stability margins, and resilience under uncertainty [108,109,119]. The growing presence of such indices signals a methodological shift: real-world ML research now prioritizes safe, trustworthy, and long-term operation, addressing key barriers that historically limited the adoption of advanced control in commercial buildings. Finally, a fifth evaluation category, flexibility and grid interaction (See Figure 14-Left and Right), reflected the broader energy transition in which buildings act not as passive consumers but as active grid participants. Metrics such as peak load reduction, demand response benefit, renewable energy utilization, and grid interaction benefit appeared in a wide range of multi-energy and microgrid studies [110,120,162,163]. These latter metrics extend traditional building evaluation by measuring how controllers contribute to grid stability, PV self-consumption, and price-responsive scheduling. Their increasing adoption highlights how ML-BEMS research has expanded beyond internal optimization toward energy-system interoperability.

Taken together, these five clusters illustrated how performance evaluation in ML-BEMS has broadened to reflect the real needs of modern buildings. Where early studies focused almost exclusively on energy savings, later deployments required controllers able to balance efficiency, comfort, computational feasibility, robustness, and grid-aware operation. The convergence of such clusters across ML-based control frameworks indicated that the field is moving toward shared performance standards, denoting an important milestone: ML-based building control is increasingly treated not as an experimental enhancement but as a practical, reliable technology ready for real-world operation.

6.2. Practical Key-Attributes

The practical key attributes (Section 6.2.1, Section 6.2.2, Section 6.2.3, Section 6.2.4, Section 6.2.5 and Section 6.2.6) address how these ML systems were implemented and operated under real-world conditions. This includes the type of deployment setup (e.g., embedded systems, cloud-based platforms), the energy subsystems involved (such as HVAC, PV, ESS, DHW), building typologies, zone scalability, climatic and geographical context, and the duration of field testing.

6.2.1. Evaluation per Implementation Type

Real-world ML-based BEMS implementations fall into a small number of recurring architectural patterns, largely shaped by available automation infrastructure, desired control frequency, and integration constraints. Across the literature, a concise typology emerges:

Integration with existing BMS/BAS: According to the evaluation, the dominant approach was supervisory integration with existing BMS/BAS (See Figure 15-Left). In such cases, ML algorithms ran on an external PC or server and communicated with the building through middleware, typically BACnet, Modbus, or OPC. Local PID or rule-based loops remained active, while the supervisory layer updated high-level setpoints every few minutes. This approach was frequently selected in practice, since it enabled non-invasive deployment with straightforward fallback to existing control logic, supporting deployment in commercial buildings [96,97,99,100,102,114,115,121,142].

Figure 15. Left: Occurrence of implementation types in real-world BEMS applications; Right: Percentage (%) of implementation types in real-world BEMS applications.
PLC-based architectures: A closely related category utilized Programmable Logic Controller (PLC)-based architectures, where industrial controllers handled I/O and timing, while optimization ran on an industrial PC. According to the evaluation, PLC-driven systems supported faster sampling (1–60 s) and deterministic control, making them common in laboratory and high-performance setups such as the NEST and EMPA building testbeds [101,118,124,166] (See Figure 15-Left).
IoT/Edge implementations: At different scales, a significant number of real-world studies adopted IoT/edge implementations, using Raspberry Pi, ESP32, or similar microcontrollers (See Figure 15-Left and Right). Such IoT platforms combined local sensing and on-board ML inference—useful for retrofitting homes, offices, or buildings without a BMS. Examples included Pi-based MPC prototypes [103,107,117,126], embedded fuzzy controllers [150], and microcontroller-based Non-Intrusive Load Monitoring (NILM) systems [132]. Such practices enabled low-cost local autonomy; however, the reviewed studies commonly reported limitations related to scalability and cybersecurity considerations.
Edge–cloud supervisory control: A growing number of systems utilized cloud or edge–cloud supervisory control, where building data stream to remote servers hosting ML/MPC models. Communication typically used Message Queuing Telemetry Transport (MQTT) or Representational State Transfer (REST) APIs, with cloud services returning optimized setpoints to the BMS. It is evident that such setups supported more computationally intensive algorithms and multi-building orchestration [111,112,122,123,125].

In larger deployments such as campus- or district-scale systems, control was typically handled by hierarchical or economic MPC architectures. In such cases, a central controller, running on an industrial PC or server, managed multiple buildings and shared energy systems (e.g., chillers, batteries, PV, thermal storage) by calculating hourly or day-ahead setpoints and sending them to local controllers for real-time operation [106]. For multi-zone buildings, distributed or multi-agent architectures were common. In such cases, each zone had its own local controller (e.g., PLC or embedded device), and minimal data such as load forecasts or coordination signals were exchanged via building protocols such as BACnet or Modbus [105,119,128,140]. Lastly, a smaller number of implementations utilized consumer-grade IoT retrofits, with Wi-Fi sensors, smart plugs, or IR controllers linked to a cloud platform [123].

Despite such differences in hardware architecture, several features were common across implementations. Nearly all systems reused existing field sensors and relied on standard communication protocols. ML controllers typically operated at a supervisory level rather than issuing raw actuator commands, ensuring safety and compatibility. Real-time feasibility depended on the platform: sub-minute loops appeared in PLC and embedded systems, while supervisory MPC often ran at 5–30 min intervals. Fallback mechanisms (usually reverting to PI or rule-based control) were reported in numerous deployments as a resilience measure [112,141]. Recent studies also highlighted federated and edge–cloud learning for privacy-preserving model updates [133,149].
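The supervisory pattern with fallback can be sketched as a thin wrapper around the ML controller. This is a minimal illustration, not the logic of any reviewed deployment: the function name, the setpoint bounds, and the fallback value are hypothetical, and a real system would write the result to the BMS through its own protocol driver.

```python
import math

def supervisory_setpoint(ml_controller, state, fallback_setpoint_c,
                         low_c=18.0, high_c=26.0):
    """Return (setpoint, source); revert to the existing logic on failure."""
    try:
        proposal = float(ml_controller(state))
    except Exception:
        # Any controller error (timeout, bad input, missing model) is contained here.
        return fallback_setpoint_c, "fallback"
    if math.isnan(proposal) or math.isinf(proposal):
        return fallback_setpoint_c, "fallback"
    # Clamp to operator-approved bounds before handing over to the local loops.
    return min(max(proposal, low_c), high_c), "ml"

def broken_controller(state):
    raise RuntimeError("model server unreachable")

normal = supervisory_setpoint(lambda s: 22.5, {}, fallback_setpoint_c=21.0)
clamped = supervisory_setpoint(lambda s: 40.0, {}, fallback_setpoint_c=21.0)
failed = supervisory_setpoint(broken_controller, {}, fallback_setpoint_c=21.0)
```

Because the wrapper only ever emits a bounded setpoint, the local PI or rule-based loops keep responsibility for actuation, which matches the supervisory role described above.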

6.2.2. Evaluation per Equipment Type

Across the reviewed studies, the equipment scope increasingly extends from single-system HVAC control towards multi-energy building management (See Figure 16-Right). Early field studies were focused almost exclusively on HVACs [96,130,146], mainly because such energy systems dominated building energy use and were already well-instrumented. Such early deployments provided controlled testbeds for validating ML models with limited sensing, simple actuation, and low integration effort.

Figure 16. Left: Occurrence of energy system types in real-world BEMS applications; Right: Percentage (%) of single-energy system vs. multi-energy system studies in real-world BEMS.

Across all machine learning approaches, HVAC systems stand out as the most commonly controlled part of real-world BEMS (see Figure 16-Left). Almost every deployment focuses on managing HVAC equipment—whether air-based (like VAV and AHUs) or hydronic (like radiant systems, TABS, heat pumps, and chillers). This trend is clear across offices [96,100,114,117], academic buildings and labs [109,111], and residential settings [110,123,127,169]. In most cases, the ML controllers operate every 5–15 min, which matches the response time of building thermal systems and the limits of BMS communication. These deployments report strong results, often achieving energy savings between 15% and 45%, depending on the baseline, climate, and building type.

More recently, real-world studies have begun to go beyond HVAC alone. A significant number of applications (see Figure 16-Right) now treat the building as a full energy hub, optimizing HVAC together with solar panels, batteries, thermal storage, or Domestic Hot Water systems. Such integrated multi-energy setups improve flexibility, allow better use of on-site renewable energy, and help reduce energy peaks. For example, some projects coordinate HVAC with PV and batteries in commercial buildings [114,120,126], while others combine HVAC with DHW and PV in residential use [110,152]. Some also pair HVAC with thermal storage to take advantage of building thermal inertia [110,126]. These multi-energy systems strike a good balance between practicality and performance, often leading to better comfort, lower costs, and more flexibility. At a more advanced level, a small but growing number of studies are expanding this idea to microgrid-scale setups that include extra components like EV chargers, hydrogen units, or large battery banks [158,162,163]. These cutting-edge testbeds show the potential of ML-based BEMS to support broader energy goals, such as sector coupling or demand response, turning buildings into active energy players rather than just consumers.

As noted above, a smaller but growing set of studies demonstrates multi-agent control of multi-energy buildings, in which distributed MPC or hybrid agents coordinate multiple energy subsystems using limited communication. Examples include virtual-price-based coordination of multiple buildings and energy hubs [119], PSO–RBC HEMS controlling HVAC–DHW–PV–ESS [159], token-based MPC across 85 zones [156], CBR/ANN multi-office coordination [155], and decentralized chiller-plant optimization [166]. In summary, across the reviewed dataset, HVAC control formed the most prevalent foundation of real-world BEMS implementations, while multi-energy integration (PV, ESS, DHW, EV chargers, TSS) has become increasingly common. Such richer setups enable energy shifting, flexibility, and grid interaction, reflecting a broader move from isolated thermal control toward integrated, multi-energy building operation.

6.2.3. Evaluation per Building Type

Real-world ML-based BEMS deployments were unevenly distributed across building types, with most experiments occurring in environments that offer reliable sensing, controllable conditions, and low deployment risk. The most common settings were experimental testbeds in academic and laboratory facilities, including university offices, lecture rooms, and living labs [109,111,114,131,141] (see Figure 17-Right). Such facilities typically include BAS/BEMS infrastructures, rich sensor availability, and flexible operational schedules, which make them ideal for testing advanced control strategies. As a result, experimental real-world building testbeds slightly dominate the ML-based field demonstrations [100,112,117,131,133,137,143] in comparison to operational real-world building testbeds, which reflect actual conditions in urban and rural areas [168] (see Figure 17-Right).

Figure 17. Left: Occurrences of building types in real-world BEMS applications; Right: Percentage (%) of operational vs. experimental building testbeds in real-world BEMS applications.

According to the evaluation, the real-world experimental and operational testbeds cover a wide range of building typologies (see Figure 17-Left). More specifically:

Office buildings: This type offers realistic occupancy patterns and stricter comfort requirements while still providing reliable automation infrastructure. MPC deployments in office buildings (e.g., [96,114,128]) frequently report energy savings in the range of 17–45%, depending on baseline configuration and operating conditions. Such sites favor supervisory control rather than experimental multi-agent schemes due to operational and safety constraints.
Residential buildings: Residential buildings are less frequently represented in the reviewed real-world studies than office buildings, despite their societal relevance. Limited sensing, heterogeneous occupant behavior, and the absence of centralized BAS equipment render deployment challenging for many research efforts. Nevertheless, real experiments, such as those in the Netherlands [110], Switzerland [124], and the USA [127], show that ML-based controllers can still achieve 19–35% heating or cost savings under single-zone or small multi-zone configurations. These studies provide an essential foundation for practical smart-home energy management.
Academic buildings: Purely academic testbeds—such as laboratories, lecture rooms, halls, and libraries—make up a large share of real-world experiments. As mentioned earlier, this is mainly because universities often have well-developed BAS/BEMS infrastructures, dense sensor networks, and flexible operating schedules. Moreover, such environments provide researchers with a practical and accessible setting for implementing and testing advanced control strategies under real building conditions.
Mixed-use buildings: Buildings of this type, which combine office spaces with commercial stores or services, appear in only a limited number of real-world studies. Their complex and varied usage patterns, often involving different occupancy schedules, comfort needs, and energy profiles, make them more challenging to model and control. As a result, mixed-use environments remain an underexplored area in ML-based BEMS research, despite their growing presence in urban development.

Apart from the different building typologies, a noticeably smaller but important group of works targeted multi-building and campus-scale testbeds. In such cases, distributed or hierarchical MPC has been validated across interconnected buildings and shared energy hubs, demonstrating high scalability and cost-efficiency. Examples include coordinated campus energy management at NEST [119], Stanford’s SESI infrastructure [106], and large multi-zone facilities at NTU [156]. Such deployments reported 10–30% cost reductions and showed that cooperative ML control can operate reliably across complex, multi-building networks. Finally, large multi-zone commercial facilities and fully occupied homes provide complementary value: while campuses enable significant absolute savings due to scale, small dwellings validate robustness in uncontrolled, real-life conditions [110,123,127]. In summary, real-world ML deployments concentrate heavily in research labs and academic offices, where infrastructure maturity simplifies experimentation. Residential implementations, on the other hand, remain relatively limited but are crucial for mainstream adoption. Expanding validated ML control strategies from institutional buildings to everyday homes and small businesses represents the next major step toward broad, equitable, and impactful smart energy management.

6.2.4. Evaluation per Zone Scalability

Real-world deployments exhibit clear differences in zone scalability across methods, shaped by modeling effort, computational load, and coordination requirements. MPC demonstrated the broadest scalability, from single-zone pilots to large multi-zone or campus-scale systems. Many studies used small multi-zone setups (4–12 zones) in offices and labs [105,112,117], while larger examples include an 85-zone academic building [156] and campus-level coordination serving hundreds of zones [106]. Mid-scale deployments (6–9 zones) in airports and offices further confirm practical scalability [122,128]. Still, a substantial share of MPC studies remained single-zone [111,126,127], reflecting the modeling effort required for multi-zone parameter identification and the computational cost of higher-dimensional optimization. RL deployments were mostly single-zone or very small multi-zone (1–2 zones) setups, focusing on individual rooms or test cells [137,138,139,141]. Only a few studies scaled beyond this, such as 5-zone and 28-zone implementations using digital twins or safety constraints [140,142]. Such limited scalability stems from RL’s sample inefficiency and the difficulty of stabilizing multi-agent coordination in real buildings, leading most demonstrations to use controlled laboratory environments. ANN-based systems also operated mostly at the single-zone level, especially when used for preference learning, IAQ forecasting, or surrogate modeling [130,135]. Larger multi-zone applications (e.g., 10-room occupancy prediction) appeared mainly when ANNs served as sensing or forecasting modules rather than full supervisory controllers [133], since cross-zone coordination is not inherently embedded in standard ANN architectures. FLC implementations exhibited almost exclusively single-zone or very small multi-zone (fewer than four zones) structures [144,145,146,150]. This tendency is reasonable, since rule bases grow exponentially with the number of interacting zones; FLC is therefore better suited to localized comfort regulation than to large-scale coordination.
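The exponential growth of fuzzy rule bases can be made concrete with a short Python sketch. It assumes a complete grid-partition rule base, in which every combination of input labels gets its own rule; the choice of two inputs per zone (e.g., temperature error and its rate of change) and three linguistic labels per input is an illustrative assumption, not a configuration reported in the reviewed studies.

```python
def full_rule_base_size(zones: int, inputs_per_zone: int = 2, labels_per_input: int = 3) -> int:
    """Number of rules in a complete grid-partition fuzzy rule base.

    Each input has `labels_per_input` linguistic labels (e.g., low/medium/high),
    and one rule is needed per combination of labels across all inputs.
    """
    if zones < 1 or inputs_per_zone < 1 or labels_per_input < 1:
        raise ValueError("all arguments must be positive integers")
    total_inputs = zones * inputs_per_zone
    return labels_per_input ** total_inputs


# Rule count for a single coupled controller covering 1, 2, 4, and 8 zones.
for z in (1, 2, 4, 8):
    print(z, full_rule_base_size(z))
# 1 9
# 2 81
# 4 6561
# 8 43046721
```

Under these assumptions, a four-zone coupled controller already needs thousands of rules, which is consistent with FLC being used mainly for localized, per-zone regulation.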

Hybrid approaches span the full range. Architectures coupled with MPC or multi-agent scheduling scale to 7–85 zones [155,156], whereas RL/heuristic hybrids in residential and office settings generally remained single- or small multi-zone (up to 11 zones; 9 zones in large offices) [154,165]. Such flexibility arises from combining local data-driven models with supervisory optimization layers, enabling multi-zone operation without overwhelming computation or communication. Overall, MPC and MPC-centric hybrids remained the most scalable solutions for real buildings. RL, ANN, and FLC applications predominantly operated at single-zone scales, with emerging medium-scale RL demonstrations suggesting gradual progress as safety, sample efficiency, and coordination mechanisms advance [140,142].

Overall, real-world ML-based implementations are still largely focused on single-zone testbeds, as shown in Figure 18-Left and Right. Such setups simplify control, data collection, and evaluation, making them attractive for early-stage deployment. However, they fall short in capturing the spatial diversity, zone interactions, and thermal coupling effects present in real buildings. As a result, there is a clear research gap in scalable, multi-zone implementations that reflect the true operational complexity of commercial, educational, and mixed-use buildings. Bridging this gap is critical for developing ML-based controllers that are transferable, robust, and ready for large-scale deployment.

Figure 18. Left: Occurrence of single- vs. multi-zone buildings in real-world BEMS applications; Right: Percentage (%) of single- vs. multi-zone buildings in real-world BEMS applications.

6.2.5. Evaluation per Location and Climate

Real-world ML-based BEMS deployments cluster strongly around a few geographic regions and climate types, reflecting both research infrastructure and local energy challenges (see Figure 19-Left and Right). Many early and advanced implementations originated from highly instrumented research testbeds, most notably Empa’s NEST Living Labs in Switzerland [118,124,141,142,143], LBNL’s FLEXLAB in the USA [102,137], and NTU/NUS facilities in Singapore [109,113,138,160]. These sites provide dense sensing, flexible scheduling, and safe experimental conditions, enabling reproducible evaluation of MPC, RL, ANN, and hybrid controllers. Comparable infrastructures in Europe and the USA (Aachen’s E.ON ERC [167], Offenburg’s Smart Grid Lab [151], Stanford’s SESI campus [106], and district-heated buildings in Finland [115]) further support multi-energy and grid-interactive testing.

Figure 19. Left: Occurrences of real-world applications per country; Right: Percentage (%) of real-world applications per continent.

A second major category involves occupied offices and operational residential buildings, where controllers operate under real occupancy, manual overrides, and utility tariffs. Examples include office pilots in Switzerland, Belgium, Germany, and the Netherlands [96,108,110,112,114] and residential demonstrations in the Netherlands, Greece, and the USA [127,152,168]. More diverse real-world environments (e.g., airports [122], convenience stores [120], and multi-zone offices in Asia [123,128]) illustrate that ML controllers can operate beyond laboratory conditions under real occupancy and tariff variability.

The geographical distribution of real-world BEMS studies is closely linked to the climate conditions of each region (see Figure 19-Left and Right). Researchers tend to focus on areas where climate-specific energy challenges, such as extreme heating or cooling demands, make BEMS particularly relevant. For example, studies from cold regions often prioritize heating efficiency and insulation strategies, while those from hotter climates focus on cooling load optimization and solar gains. This climate-driven focus influences the choice of control strategies, sensor deployments, and energy-saving objectives, highlighting the need for context-aware BEMS designs tailored to local environmental conditions. More specifically (see also Figure 20-Left and Right):

Figure 20. Left: Climate distribution of real-world BEMS applications; Right: Climate-type distribution across real implementations.

Cold, heating-dominated regions: Regions such as Switzerland, Germany, Belgium, Finland, the UK, and the northern USA represent roughly 35–40% of deployments [96,112,114,115,125,126,128]. These studies focus on heating optimization, thermal storage, and robust MPC formulations to handle slow dynamics, solar gains, and long seasonal transients (e.g., GHP, DHW coordination).
Hot–humid climates: Locations such as Singapore, Abu Dhabi, Zhejiang, Shenzhen, and parts of the USA account for 30% of cases [98,99,109,113,122,123]. Here, cooling, humidity control, and latent–sensible decoupling dominate, often using data-driven MPC, RL, and hybrid methods. Shorter test durations and high solar availability also support PV-driven economic MPC and RL approaches [108,110,121].
Temperate or mixed climates: Countries such as Italy, France, the Netherlands, Spain, and Greece, together with parts of the USA, make up the remaining 30–35% [110,111,118,127,168]. These environments support year-long, multi-season tests across both heating and cooling, making them ideal for hybrid and multi-energy controllers integrating HVAC, RES, ESS, and DHW [152,158,162].

Overall, global deployments are concentrated in Singapore, Switzerland, the USA, and Northern–Central Europe, supported by strong research ecosystems and advanced BAS infrastructures. While most studies still rely on laboratory or academic environments for safety and instrumentation, an increasing number of fully occupied offices, retail sites, airports, and residences [120,122,123,128,168] demonstrate that ML controllers can operate robustly under real-world variability. This shift reflects a growing maturity in ML-BEMS research and provides increasing evidence of robustness across diverse climates and operational contexts.

6.2.6. Evaluation per Experimental Period

Real-life BEMS studies show a clear progression in experimental duration, reflecting growing technological maturity and confidence in ML-based control. Early pilots (2015–2018) were short and highly controlled (often lasting only a few hours, days, or up to two weeks) and aimed mainly at proving feasibility for MPC, ANN, or FLC in single-zone testbeds [96,97,99,145]. These brief trials ensured safe testing but captured limited weather and occupancy variability. From 2018 onward, experiments increasingly adopted medium-term horizons (several weeks to a few months). Data-driven MPC, ANN-based forecasting, and hybrid MPC–ML controllers were typically evaluated over 1–4 months, enabling the study of forecast drift, model degradation, and controller robustness under realistic operating conditions [104,111,118,125]. These durations mark the shift from proof-of-concept demonstrations toward practical, sustained operation. Seasonal and year-long deployments have become more common in recent years, especially for Economic MPC, multi-energy coordination (HVAC–PV–ESS), and campus-scale systems [106,110,120,128]. These extended trials validate controllers across full heating and cooling seasons, dynamic tariffs, and long-term comfort requirements, providing strong evidence of operational viability. In contrast, Reinforcement Learning (RL) and RL-based hybrids are still mostly limited to short-term deployments (days to weeks) due to safety constraints and the need for cautious online adaptation [137,138,142]. ANN-based predictors and Fuzzy Logic Controllers often achieve longer stable runs (as long as several months), since they rely on pre-trained or rule-based logic that does not require continuous exploration [130,132,147].

Overall, the field is transitioning from short, isolated pilots toward longer, seasonally representative deployments that capture real-world uncertainty, occupant behavior, and energy-market variability. Short-term studies continue to support rapid innovation, but multi-month and annual experiments now serve as the primary benchmark for evaluating the robustness and readiness of ML-based BEMS in real buildings. Figure 21-Left and Right illustrate the occurrence and percentage of short-term (<15 days), medium-term (between 16 days and 6 months), and long-term (>6 months) experimental deployments in real-world BEMS applications:

Figure 21. Left: Number of real-world applications considering the experimental duration; Right: Percentage (%) of real-world applications considering the experimental duration.
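The duration classes used in Figure 21 can be expressed as a small Python sketch that bins trial lengths and reports counts and percentages. Two boundary choices are assumptions made here for illustration: a trial of exactly 15 days is treated as short-term (the stated classes leave that value unassigned), and "6 months" is approximated as 183 days. The example durations are hypothetical and do not correspond to the reviewed studies.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

SHORT_MAX_DAYS = 15    # assumption: exactly 15 days is counted as short-term
MEDIUM_MAX_DAYS = 183  # assumption: "6 months" approximated as 183 days


def duration_class(days: float) -> str:
    """Map an experimental duration in days to a short/medium/long-term label."""
    if days <= 0:
        raise ValueError("duration must be positive")
    if days <= SHORT_MAX_DAYS:
        return "short-term"
    if days <= MEDIUM_MAX_DAYS:
        return "medium-term"
    return "long-term"


def summarize(durations_days: Iterable[float]) -> Dict[str, Tuple[int, float]]:
    """Return {label: (count, percentage)} for a collection of trial durations."""
    counts = Counter(duration_class(d) for d in durations_days)
    total = sum(counts.values())
    return {label: (n, round(100.0 * n / total, 1)) for label, n in counts.items()}


# Hypothetical trial lengths in days, for illustration only.
print(summarize([7, 10, 60, 365]))
```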

7. Discussion

The Discussion section highlights the key trends and gaps identified across the reviewed literature, offering a broader perspective on the current state of real-world ML applications in BEMS. It also outlines future research directions that can help address existing limitations and support the development of more scalable, adaptable, and impactful building control solutions.

7.1. Current Trends

The analysis of all real-world deployments reveals several cross-cutting themes shaping the success and limitations of ML-based BEMS. To support practitioner-oriented technology selection, these trends are synthesized below.

■: MPC: Among the 73 real-world experiments, MPC appears most frequently in real-building deployments, especially in office and campus environments [96,105,111,112]. Such prevalence is not merely an academic artifact but is linked to clear enabling factors: convex or structured optimization problems integrate well with existing BAS/BMS, explicit comfort and safety constraints may be imposed, and computation times may be bounded and verified before deployment. These properties are crucial in occupied buildings, where violating temperature or IAQ limits is unacceptable and where facility staff must trust the controller’s behavior. At the same time, the reviewed studies also highlight significant obstacles: grey-box model identification, plant calibration, and robustness tuning remain labor-intensive, particularly under changing occupancy and usage patterns [114,127]. Data-driven or stochastic MPC variants mitigate model mismatch and uncertainty, but they are still designed to maintain tractability and real-time feasibility, reinforcing the trend that MPC in practice is driven by engineering constraints rather than theoretical generality.
■: ANNs: Across real-life deployments, ANNs have rarely acted as stand-alone decision-makers; instead, ANNs are consistently positioned as information-enhancement modules: forecasting loads or PV, constructing soft sensors, estimating occupancy/IAQ, or learning user preferences [130,131,134]. Such pattern reflects a structural role: ANNs expand observability and predictive horizon, while a separate optimization layer still enforces constraints. Their successful utilization depends on having enough high-quality data, choosing the right input features, and taking into account seasonal changes and occupant behavior. Embedded implementations on microcontrollers and edge devices further show that, from an integration perspective, ANN models can be compressed and deployed close to the field, reducing communication overhead and preserving privacy [132]. The trend that emerges underlines that ANNs perform particularly well where data are rich and stable enough to train robust models, yet operators still prefer to keep the final safety-critical decisions in more transparent or explicitly constrained controllers.
■: RL: The real-world RL controllers in our review were shaped by strict safety and comfort requirements rather than by unconstrained exploration. Successful deployments applied offline pre-training on historical or digital-twin data, used actor–critic architectures for stability, and embedded safety mechanisms such as action masking, fallback RBC/PI policies, and carefully engineered reward structures [137,138,142]. Such design choices were direct responses to key obstacles: exploration in unsafe regions is infeasible; data are limited; and operators require predictable, interpretable behavior. Consequently, RL is typically applied in single-zone or small-scale configurations and at 5–60 min decision intervals [140]. The trend here shows that real-world RL behaves more like constrained, data-driven adaptation wrapped around a safety envelope, often mimicking MPC-like objectives, rather than a free-form reinforcement learner. Its main enabler is the ability to adapt policies from real data without explicit physical models; its main barriers are engineering integration complexity, high data demands, and the difficulty of guaranteeing robustness in uncertain environments.
■: FLC: Fuzzy Logic Control maintains a robust niche where interpretability, ease of implementation, and tolerance to noisy or sparse data are decisive. In many of the examined pilots, FLC was implemented directly on low-cost hardware as a drop-in upgrade to PID or On/Off logic for HVAC, comfort control, or small RES/ESS setups [144,150]. The enabling factors are clear: expert knowledge can be encoded in linguistic rules; parameter tuning is intuitive for practitioners; and computation is extremely light, which simplifies integration with legacy BMS hardware. The trend reveals that FLC is selected precisely where engineering resources are limited and transparency for facility managers is essential. However, as the number of zones, objectives, and energy carriers increases, the rule base grows quickly and systematic design becomes difficult, making FLC less competitive than MPC- or RL-based approaches in large multi-zone or multi-energy BEMS.
■: Hybrid Controllers: Several of the most capable multi-energy and dynamic deployments combined predictors (ANNs, statistical models), optimization (MPC, MILP), metaheuristics, and sometimes RL into hybrid architectures [152,157,162,163]. Such schemes leverage complementary strengths: ANNs and ML models improve forecasts and soft sensing; MPC or MILP enforce comfort and operational constraints; metaheuristics handle discrete scheduling of DERs; RL or iterative identification provide adaptation; and digital twins offer safe environments for policy updates. The empirical trend is that hybrids are chosen when forecasting accuracy, constraint satisfaction, and operational flexibility all matter simultaneously. The clear downside is engineering overhead: design, integration, and maintenance of such architectures require substantial expertise, robust data pipelines, and careful software engineering. As a result, hybrids currently appear mostly in well-instrumented testbeds or technology parks, even though they point toward the likely future of integrated IBEMS.
■: Agent Architectures: Distributed MPC, hierarchical multi-agent control, and decentralized model-free strategies were explored to address scalability and communication constraints in multi-zone or multi-building systems [105,119,155,166]. Such architectures enable local autonomy, reduce central bottlenecks, and match natural decompositions (e.g., zones, buildings, or energy hubs). According to the evaluation, such multi-agent approaches are adopted when spatial scale and heterogeneity make a single centralized controller impractical. However, this comes at the cost of more complex coordination, higher sensitivity to communication delays or failures, and non-trivial stability and convergence analysis. Practically, numerous real deployments still favor supervisory single-agent controllers for smaller systems, while multi-agent architectures are being piloted where scalability needs are most pressing.
■: Temporal Design: A striking trend is that most real-world ML controllers, regardless of family (MPC, RL, ANN-assisted, FLC, hybrid), operated at 5–15 min control intervals [96,130,137]. This convergence was dictated by thermal inertia, actuator limitations, and BMS communication constraints rather than algorithmic preferences. Forecast horizons are similarly structured: uncertainty-dominated or purely data-driven controllers typically adopt short horizons, whereas economic MPC and flexibility-oriented schemes use daily or multi-hour horizons to capture tariffs and PV profiles. Such consistency across methods suggests that temporal design in real testbeds is governed by physics and infrastructure, and any ML algorithm must adapt to these timing constraints rather than redefine them.
■: Data Hierarchy: A universal data hierarchy emerged from the field studies: temperature and weather were present in almost all deployments; occupancy and CO₂ sensors provided the most significant performance gains when added; forecasts (load, weather, PV) transformed reactive strategies into proactive ones; and DER/price signals enabled demand response, self-consumption maximization, or peak shaving [108,117,120,134]. MPC and FLC were able to function with relatively limited sensing, but showed reduced adaptivity; ANN and RL schemes benefited strongly from richer, well-maintained datasets and robust data pipelines. Consequently, data availability and quality directly condition which ML methods are feasible: high-end ANN/RL/hybrid solutions cluster in data-rich testbeds, while simpler MPC/FLC dominate in more conventional buildings.
■: Performance Metrics: Energy or cost, comfort, and actuator smoothness form a universal objective triad across MPC, RL, and hybrid controllers [109,110,111]. Flexibility, RES utilization, and grid-interaction metrics were added where relevant (e.g., PV–ESS microgrids). RL reward functions generally mirrored this structure rather than introducing fundamentally novel criteria, confirming that operational and contractual constraints shape what “optimal” means in practice. Evaluation metrics have similarly evolved: early pilots reported mostly energy savings, while more recent deployments quantify comfort deviation, IAQ, computational feasibility, robustness, flexibility benefits, and fallback behaviour [119,163]. According to evidence, the field is maturing from research prototypes to operational technologies evaluated under multi-dimensional performance criteria.
■: Implementation Types: Almost all reviewed controllers operated as supervisory layers on top of existing RBC/PI/PID logic via BACnet, Modbus, or vendor APIs [96,112]. This evolutionary integration pathway is a key enabler across all method families: it allows gradual commissioning, preserves safety through fall-back modes, and aligns with facility managers’ expectations. At the same time, it constrains sampling rates and data access and imposes cyber-security requirements, since ML controllers must comply with existing IT/OT policies. Consequently, deployment success depends at least as much on clean interfacing, security, and maintainability as on the specific ML algorithm.
■: Building Types: Most real-world deployments in the review were concentrated in well-instrumented offices, campuses, and living labs, where experimentation is possible and sensing is rich [109,117]. Residential implementations, though fewer, show that simpler MPC, ANN, and FLC schemes can still deliver substantial energy and cost savings with lower engineering overhead [110,127]. Multi-energy and grid-interactive systems (HVAC–PV–ESS–DHW–EV) are becoming more frequent, especially in microgrids and technology parks, where hybrid and optimization-based controllers exploit flexibility [120,162]. The emerging trend is that as buildings evolve into multi-energy hubs, the value of coordinated, hybrid intelligence becomes more pronounced, but so does the complexity of deployment.

Taken together, these trends indicate that there is no universally superior ML method; instead, the appropriate choice is driven by context. MPC offers strong constraint handling and is attractive where modelling resources are available and integration with existing BMS is feasible. ANNs enhance observability and forecasting where data quality is high. RL can provide adaptation and long-term learning, provided that safety mechanisms, data pipelines, and operator trust are carefully addressed. FLC remains a compelling option where interpretability, low cost, and minimal data requirements dominate. Hybrid controllers combine strengths for complex, multi-energy and grid-interactive systems, at the expense of higher engineering and maintenance overhead. For practitioners, the main decision drivers are therefore: (i) ease of integration with existing BMS, (ii) availability and quality of data, (iii) acceptable engineering and tuning complexity, and (iv) required robustness, security, and reliability in uncertain, occupied environments.
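The context-driven selection summarized above can be sketched as a simple screening helper in Python. This is only a coarse illustration of the qualitative guidance in this section, not a validated decision procedure: the `SiteContext` fields, the mapping rules, and the "RBC/PI baseline" default are hypothetical simplifications introduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SiteContext:
    """Hypothetical yes/no descriptors of a candidate deployment site."""
    has_plant_model: bool         # modelling resources and BMS integration available
    rich_data: bool               # high-quality historical and live data
    needs_interpretability: bool  # transparency, low cost, minimal data dominate
    multi_energy: bool            # HVAC coordinated with PV, ESS, DHW, etc.
    safety_layer_available: bool  # fallback policy and operator trust in place


def shortlist(ctx: SiteContext) -> List[str]:
    """Return method families worth considering for the given context."""
    picks: List[str] = []
    if ctx.has_plant_model:
        picks.append("MPC")
    if ctx.rich_data:
        picks.append("ANN (forecasting / soft sensing)")
    if ctx.rich_data and ctx.safety_layer_available:
        picks.append("RL")
    if ctx.needs_interpretability:
        picks.append("FLC")
    if ctx.multi_energy and ctx.has_plant_model and ctx.rich_data:
        picks.append("Hybrid")
    # Assumed default when no condition holds: keep the existing baseline logic.
    return picks or ["RBC/PI baseline"]


office = SiteContext(True, True, False, True, False)
print(shortlist(office))
```

A real selection would also weigh engineering effort, security, and maintenance, which this sketch deliberately leaves out.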

7.2. Future Directions

The synthesis of real-world ML-based BEMS implementations reveals a rapidly maturing field with persistent gaps in scalability, long-term evidence, interoperability, and transferability. To guide future experimentation, the present work summarizes the main research priorities as follows:

➠: Scaling to multi-building portfolios and district-level systems: Most deployments remained confined to single buildings or small multi-zone configurations, with only a few distributed MPC studies extending to campus scale [105,106,119]. Future research should design real-life, multi-building and district-scale pilots that coordinate heterogeneous buildings through standardized communication (BACnet/MQTT/REST), jointly optimize shared assets (TSS, ESS, EVCS, CHP), and evaluate centralized vs. distributed vs. decentralized strategies under real communication and observability constraints.
➠: Expanding real-world experimentation to residential, social housing, and small business sectors: The current evidence base is biased toward academic labs and office buildings, with limited actual residential or small commercial deployments [110,127]. Future pilots should target cost-sensitive sectors using low-cost sensing (temperature, CO₂, plug-loads), lightweight controllers suitable for retrofit scenarios without full BAS, and user-centric designs that consider overrides, comfort heterogeneity, and variable digital literacy.
➠: Long-term and multi-season validation for ML-based controllers: While the MPC and hybrid methods in the current dataset operated over multi-month horizons [110,120], RL and ANN-based controllers were typically validated over days or short trials due to safety and stability concerns [137,142]. Future experiments need to provide multi-season evidence on performance drift, catastrophic forgetting, occupancy shifts, actuator degradation, and adaptation quality, using structured retraining phases, controlled policy updates, and monitored rollback procedures.
➠: Advancing safe RL and MARL for multi-zone and multi-energy buildings: Real-world RL remained limited to single-zone settings despite strong conceptual potential [137,140]. Future work should leverage digital twins for pretraining, integrate formal safety mechanisms (shielded RL, MPC-based safety filters), and evaluate MARL coordination approaches (CTDE, value decomposition, consensus) under realistic communication delays and failures. Experiments should benchmark RL/MARL not only on comfort and energy but also on commissioning effort, robustness, and scalability.
➠: Formalizing the digital-twin-to-field pipeline: Current hybrid deployments use EnergyPlus/Modelica and ML surrogates for pretraining [157,160]. Future real-life experiments should establish a standard workflow: constructing calibrated twins with uncertainty quantification, generating scenario-rich synthetic datasets, training MPC surrogates and RL policies, and transferring them to the field with structured adaptation phases. Quantifying transfer gaps and comparing synthetic-data training with historical-data-only approaches remains an open research priority.
➠: Establishing standardized experimental protocols and strong baselines: Cross-study comparability is relatively weak due to inconsistent baselines and metrics. Future trials should implement reference baselines (static schedule, RBC/PI, ASHRAE Guideline 36, reference MPC), define minimum reporting requirements (energy/cost, comfort, IAQ, PLR/DRB, computation time, fallback behavior), and adopt standard FH/TS configurations for common use-cases. Multi-site benchmarks deploying identical controllers across climates would significantly improve the scientific maturity in the field.
➠: Developing structured multi-objective formulations beyond weighted sums: Most real-world objective functions rely on scalar weighted sums combining energy/cost, comfort, and smoothness [110,120,162]. Future work should explore multi-stakeholder and fairness-aware formulations (zone-specific comfort, group fairness, environmental objectives, DR requirements), using constrained RL, lexicographic MPC, and Pareto-based methods, with real-world evaluation of operator acceptance and practical usability.
➠: Embedding human-in-the-loop control and explainability: Although some deployments incorporate feedback or voting [130,138,144], systematic human-in-the-loop integration remains relatively limited in real-world implementations. Future experiments should provide operator-facing explanations (e.g., variable attribution for setpoint changes), simple interfaces for adjusting high-level objectives (e.g., “prioritize comfort”, “maximize PV”), and long-term assessment of user trust, override behavior, and adoption dynamics.
➠: Investigating edge–cloud architectures, latency robustness, and cyber-security: The shift toward edge computing and cloud–edge hybrids raises questions around resilience and security [112,132,133]. Future pilots should study safety-critical loops remaining on-premise, test the impact of communication delays and outages, evaluate privacy-preserving learning (e.g., federated learning), and include controlled simulations of cyber incidents (spoofing, DoS) to assess system resilience and fallback behavior.
➠: Enabling cross-building transfer and automated commissioning: Most ML controllers are designed for a single building, limiting scalability. Future research should design experiments where the same MPC, ANN, RL policy, or FLC rule base is deployed across multiple buildings with minimal manual retuning. This requires meta-learning, domain adaptation, automated parameter identification, and robust “cold-start” procedures to ensure safe deployment in unfamiliar environments. Demonstrating cross-site transfer is essential for commercial viability.
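
To make the contrast between weighted-sum and lexicographic formulations concrete, the following Python sketch selects among three candidate control plans under both schemes. It is only an illustration: the `Candidate` fields, the weights, and all numerical values are hypothetical and are not taken from any reviewed study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One candidate control plan with its predicted outcomes (hypothetical)."""
    name: str
    energy_kwh: float      # predicted energy use over the horizon
    discomfort_kh: float   # predicted comfort violation in kelvin-hours
    smoothness: float      # sum of squared setpoint changes


def weighted_sum_choice(cands, w_energy=1.0, w_comfort=1.0, w_smooth=0.1):
    """Pick the plan minimizing one scalar cost (weights are assumptions)."""
    def cost(c):
        return (w_energy * c.energy_kwh
                + w_comfort * c.discomfort_kh
                + w_smooth * c.smoothness)
    return min(cands, key=cost)


def lexicographic_choice(cands, comfort_tol=0.0):
    """Minimize discomfort first, then energy, then smoothness.

    comfort_tol is the discomfort slack (in Kh) accepted relative to the
    best achievable value before lower-priority objectives are considered.
    """
    best_discomfort = min(c.discomfort_kh for c in cands)
    shortlist = [c for c in cands
                 if c.discomfort_kh <= best_discomfort + comfort_tol]
    return min(shortlist, key=lambda c: (c.energy_kwh, c.smoothness))


# Illustrative candidates only; the values are invented for this example.
CANDIDATES = [
    Candidate("setback", energy_kwh=40.0, discomfort_kh=6.0, smoothness=2.0),
    Candidate("balanced", energy_kwh=48.0, discomfort_kh=1.0, smoothness=4.0),
    Candidate("comfort_first", energy_kwh=55.0, discomfort_kh=0.0, smoothness=5.0),
]

print(weighted_sum_choice(CANDIDATES).name)
print(lexicographic_choice(CANDIDATES).name)
print(lexicographic_choice(CANDIDATES, comfort_tol=1.0).name)
```

With these invented numbers, the weighted sum favours the energy-saving setback plan even though it carries the largest comfort violation, whereas the lexicographic rule secures comfort first and only then minimizes energy. Relaxing the comfort tolerance to 1 Kh moves the choice to the intermediate plan, which shows how priorities can be stated explicitly rather than hidden in weights.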

8. Conclusions

This review brings together findings from 73 real-world studies applying machine learning to BEMS, offering the most in-depth look so far at how intelligent control performs outside of simulation. Despite the variety of methods explored, the field is showing clear signs of convergence. Real-world deployments are being shaped not only by technical innovation but also by practical realities such as building constraints, occupant comfort and safety, and compatibility with existing automation systems.

Among the methods examined, MPC remains the most widely used and deployment-ready approach, thanks to its ability to handle constraints and integrate smoothly with building management systems. ANNs, once mainly used for offline modeling, are now embedded in real applications, helping with forecasting, soft sensing, and acting as surrogate models within hybrid systems. On the other hand, RL is gaining interest but still faces challenges in safety and stability, limiting its current use to smaller-scale, controlled settings. Meanwhile, FLC optimization approaches continue to offer valuable trade-offs between interpretability and performance, particularly in complex or resource-constrained environments. Hybrid ML-based experiments offer higher performance overall, providing a robust foundation for future implementations in real-world energy management.

Across studies, common patterns emerge: short control intervals (typically 5–15 min), a consistent stack of sensing inputs (weather, thermal, indoor air quality, forecasts), and performance goals that go beyond energy savings to include comfort, robustness, and actuator efficiency. Importantly, evaluation practices are becoming more comprehensive, moving from single metrics toward multi-criteria assessments that reflect the full complexity of building operation.

Still, some important gaps remain. Real-world ML-based BEMS are mostly found in academic testbeds or research labs. Operational residences, small commercial spaces, and large building portfolios remain relatively underexplored. Promising areas like multi-agent coordination, RL, and integrated control across energy sectors have yet to be widely tested in the field. Without standardized evaluation methods, baseline comparisons, or transferable models, broader adoption remains a challenge.

In short, the field is clearly maturing. ML-based BEMS are evolving into intelligent, hybrid, and grid-aware systems that can deliver measurable benefits in energy efficiency, cost savings, and comfort. To fully realize their potential, the next step is scaling these technologies through larger, longer-term deployments, stronger methodological consistency, and closer alignment with digital twin platforms and multi-energy networks. With these efforts, ML can become a cornerstone of smarter, more sustainable buildings in the years ahead.

Author Contributions

Conceptualization, P.M.; methodology, P.M. and I.M.; software, P.M., F.M. and M.K.; validation, all authors; formal analysis, all authors; investigation, P.M., F.M., M.K. and H.H.C.; resources, all authors; writing—original draft preparation, P.M.; writing—review and editing, P.M., I.M. and H.H.C.; visualization, P.M.; supervision, P.M. and E.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partially funded by the SEED4AI project. The project is being implemented within the framework of the National Recovery and Resilience Plan “Greece 2.0”, with funding from the European Union—NextGenerationEU (Implementing body: Hellenic Foundation for Research and Innovation (HFRI))/ID: 16880. SEED4AI: https://seed4ai.ee.duth.gr/ accessed on 22 November 2025.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

ACO	Ant Colony Optimization
AHU	Air Handling Unit
ALAMO	Automated Learning of Algebraic Models
AMV	Actual Mean Vote
ANN	Artificial Neural Network
ASHP	Air-Source Heat Pump
BAS	Building Automation System
BESS	Battery Energy Storage System
BMS	Building Management System
BDQ	Branching Dueling Q-Network
CAO	Cognitive Adaptive Optimization
CBR	Case-Based Reasoning
CNN	Convolutional Neural Network
COP	Coefficient of Performance
CVRMSE	Coefficient of Variation of the Root Mean Square Error
DHP	District Heating Plant/Heat Pump (context-specific)
DHW	Domestic Hot Water
DNN	Deep Neural Network
DMPC	Distributed Model Predictive Control
DP	Dynamic Programming
DR	Demand Response
DTS	Dynamic Thermal Sensation
DQN	Deep Q-Network
DRL	Deep Reinforcement Learning
EMCS	Energy Management Control System
EMS	Energy Management System
ESS	Energy Storage System
EV	Electric Vehicle
EVCS	Electric Vehicle Charging Station
FH	Forecast Horizon
FLC	Fuzzy Logic Control
FQI	Fitted Q-Iteration
GA	Genetic Algorithm
GHP	Ground-Source Heat Pump
HiL	Hardware-in-the-Loop
HVAC	Heating, Ventilation and Air Conditioning
IAQ	Indoor Air Quality
ICNN	Input Convex Neural Network
IPOPT	Interior Point OPTimizer
KNN	K-Nearest Neighbors
LR	Linear Regression
LSTM	Long Short-Term Memory Network
MAPE	Mean Absolute Percentage Error
MILP	Mixed-Integer Linear Programming
MIQP	Mixed-Integer Quadratic Programming
ML	Machine Learning
MLP	Multi-Layer Perceptron
MOBS	Measured Occupancy-Based Setback
MPC	Model Predictive Control
MPPO	Maskable Proximal Policy Optimization
NARX	Nonlinear AutoRegressive with eXogenous inputs
NSGA-II	Non-dominated Sorting Genetic Algorithm II
PID	Proportional–Integral–Derivative Control
PMV	Predicted Mean Vote
PPO	Proximal Policy Optimization
PV	Photovoltaics
PVT	Photovoltaic-Thermal Collector
QP	Quadratic Programming
RBC	Rule-Based Control
RC	Resistance–Capacitance Thermal Network
RLS	Recursive Least Squares
RES	Renewable Energy Systems
RL	Reinforcement Learning
RNN	Recurrent Neural Network
SAC	Soft Actor–Critic
SQP	Sequential Quadratic Programming
SSA	Salp Swarm Algorithm
SVM	Support Vector Machine
TABS	Thermally Activated Building Systems
TES	Thermal Energy Storage
TSS	Thermal Storage System
UFAD	Underfloor Air Distribution
V2G	Vehicle-to-Grid
VAV	Variable Air Volume System
WOA	Whale Optimization Algorithm

Appendix A. Summary Tables

The summary tables (Tables A1–A5) complement the key-attribute tables by providing a brief summary of each application. Their columns are defined as follows:

Author: The author name along with the reference of the application;
Summary: A brief description of the research work, including its main numerical results.

The symbols “-” and “N/A” denote elements that could not be identified in the tables and figures.

Table A1. Summaries of MPC applications for real-life BEMS.

Author	Summary
Sturzenegger et al. [96]	A white-box MPC using a bilinear RC model and Sequential Linear Programming (CPLEX) coordinated TABS, AHU, radiators, and blinds in a multi-zone Swiss office. Running every 15 min with a 58 h horizon, it kept comfort within <10 Kh cooling violations and reduced non-renewable primary energy by 17% and costs by 16.9% versus rule-based control. Operation remained fully stable, with yearly savings estimated at 5590 CHF, showing that large-scale MPC deployment is feasible despite modeling effort.
Razmara et al. [97]	An exergy-based MPC experimentally tested in an office zone (Lakeshore Center, Michigan) using nonlinear programming in YALMIP. By minimizing exergy destruction, the controller outperformed rule-based and energy-based MPCs, achieving 36% energy savings and 22% lower exergy destruction on a real GSHP HVAC system, demonstrating the value of exergy-aware optimization.
Kwak et al. [98]	A real-time MPC predicting building energy 15 min ahead and using predictive enthalpy control for AHU dampers in a large South Korean office. Using EnergyPlus with real-time weather/load inputs, the model achieved MBE = −0.7% and Cv(RMSE) = 19.1%. On a hot, humid day it yielded 0.5% (9 kWh) energy savings over the predicted baseline, with predictions and control actions integrated into the EMCS for operators.
Goyal et al. [99]	A nonlinear MPC (MOBO) for a single-office VAV system in Pugh Hall (University of Florida) using an 8-state RC model and IPOPT, with 10-min steps and a 30-min horizon. Occupancy from PIR sensors informed both MPC and an occupancy-based RBC (MOBS), benchmarked against a dual-maximum RBC. Real tests showed ≈40% savings (62.5→38 MJ/day) and CO₂ less than 650 ppm for both MPC and MOBS, with little advantage of MPC under ventilation-driven operation.
De Coninck et al. [100]	A classical grey-box MPC jointly optimized energy cost and comfort for the heating plant (two HPs and a gas boiler) of the Kalkkaai building, a real Brussels office, with plant-level actuation only. Using 5-min receding control, a 24 h look-ahead, and IPOPT-solved NLPs, the controller preheated at night with the HPs, adapted TSup dynamically, and reserved the boiler for peaks. Against the building’s RBC, MPC achieved 34–40% cost savings and 20–30% primary-energy reduction while maintaining category-I comfort, showcasing tangible field gains from plant-level MPC without zone control.
Vrettos et al. [102]	Proposed a robust MPC-based, three-tier controller that schedules day-ahead regulation reserves and then tracks a 4-s AGC signal by modulating VAV fan power, all while maintaining comfort in a commercial-like single-zone FLEXLAB cell at LBNL. The approach fuses robust optimization (reserves) with robust MPC+EKF (climate) and a switched feedforward/PI tracker, explicitly handling fan nonlinearity.
Chen et al. [101]	A data-driven MPC using a Dynamic Thermal Sensation (DTS) model learned via EKF from occupant feedback, validated in a single-zone office at Penn State. Implemented as a nonlinear program solved every 5 min over a 2 h horizon, it reduced sensible cooling energy by 25% (3.6 × 10⁶ J vs. 4.83 × 10⁶ J) while keeping comfort within ±0.7 AMV, outperforming PMV-based MPC.
Aftab et al. [103]	Simulation-guided MPC using EnergyPlus co-simulation and gradient-descent calibration, combined with video-based occupancy detection and polynomial occupancy prediction on a Raspberry Pi. Tested in a UAE mosque, it achieved 23–39% energy savings, PMV within ±0.5, and CVRMSE less than 1%, demonstrating a low-cost and practical real-life MPC deployment.
Hilliard et al. [104]	A similar simulation-guided MPC approach with EnergyPlus co-simulation and calibrated thermal models, integrating occupancy recognition and prediction on low-cost hardware. Field tests confirmed 23–39% energy savings and stable comfort (PMV ±0.5), showing feasibility for inexpensive, real-building MPC implementations.
Joe et al. [105]	Distributed MPC for radiant floor cooling in a Purdue University office. Four thermal sections operated as independent agents coordinated via a Proximal-Jacobian ADMM method solving local QPs. With a grey-box RC model and 24 h/30 min horizon, the system reduced chiller electricity by 27%, cooling energy by 19%, and comfort exceedance by 66% compared to PI control, demonstrating scalable real-building DMPC.
Rawlings et al. [106]	Hierarchical economic MPC for campus-scale HVAC and thermal storage at Stanford University. Combining economic optimization with MILP-based scheduling, it coordinated chillers, heat-recovery units, and storage, achieving 10–15% lower operating costs than expert operators, supplying 93% of heating from heat-recovery chillers, and cutting water use by 15%. All optimizations were solved online within minutes.
Smarra et al. [107]	Data-driven MPC translating random-forest models into linear leaf-equations for QP/MIQP optimization. Validated in an off-grid house in L’Aquila (Italy) with hydronic heating (10-min steps, 40-min horizon), it achieved 25–49% energy savings vs. bang-bang RBC while maintaining comfort. Simulations confirmed robustness in multi-zone and DR scenarios.
Finck et al. [108]	ANN-based EMPC deployed in a Utrecht residential row house, optimizing heat-pump operation under prices and PV forecasts. Using DP (12 h horizon, 1 h steps), EMPC1 reduced daily electricity costs by 7% and improved FF from −0.89 to 0.42; EMPC2 further increased FF to 0.67, Supply Cover to 0.13, and Load Cover to 0.16. Comfort was maintained, and clear flexibility gains were shown in practice.
Yang et al. [109]	Classical MPC implemented in a lecture theatre at NTU Singapore for a DOAS-assisted HVAC system with separate sensible/latent coils. Using QP with a 60-min horizon and 2-min updates, MPC provided up to 18% electricity savings vs. PID-based BMS while keeping PMV in [−0.5, 0.5] and RH less than 65%, proving effectiveness in humid climates.
Fink et al. [110]	An economic MPC was applied to a residential heat-pump and TSS system with a photovoltaic-thermal (PVT) solar collector in a detached house near Amstelveen, in the Netherlands. Operating hourly over a 24 h horizon under real-time prices, the approach cut operating costs by 12–15%, increased flexibility, achieved high COPs, and could deliver 13.8 kW thermal charging during 60-min events, with HP electricity increasing by 4–9% due to strategic load shifting.
Carli et al. [111]	QP-based MPC with linearized PMV and soft constraints deployed via an IoT platform in a single-zone university lab (Bari, Italy). Running every 2 min with a 4 h horizon, it achieved 18.6% energy savings and raised comfort compliance from 75.1% to 95.4% using less than 5 s computation time.
Drgoňa et al. [112]	Cloud-supervised QP-MPC in a 12-zone low-energy GEOTABS office building (“Hollandsch Huys,” Belgium) using a linearized Modelica model with weather-forecast decoupling. MPC coordinated TABS and floor heating, reducing heat-pump energy by 53.5% and comfort deviation by 36.9% versus RBC-PI control.
Yang et al. [113]	A data-driven nonlinear MPC using an adaptive NARX neural network and a hybrid ESM–SQP solver was tested in two single-zone NTU Singapore testbeds (office and lecture theater). Managing HVAC only, it saved 58.5% and 36.7% energy relative to PID/BMS baselines while holding PMV within −0.5 to +0.5, demonstrating strong comfort preservation and adaptability.
Freund et al. [114]	A nonlinear grey-box MPC optimized TABS supply temperature in a Hamburg low-energy office, acting as a supervisory controller over two heating circuits (7 reference zones) and interfacing with a Honeywell Trend BMS. Between February–April 2020, it achieved 30% heating-energy savings (up to 75% in April) without comfort degradation; exceedances were linked to solar gains and TABS inertia rather than MPC performance.
Wu et al. [115]	An economic MPC for a district-heated university office in Espoo (Finland) optimized hourly radiator setpoints over a 12 h horizon using NSGA-II and a calibrated two-capacity RC model. A window safeguard (>15 °C) prevented downdraught. Field tests across two zones showed up to 4.8% cost and 5.4% energy savings versus a static 21 °C baseline, while maintaining comfort (PMV) and significantly reducing draught-risk hours.
Clausen et al. [116]	A digital-twin GA-based MPC controlled radiator heating and VAV ventilation in a university lecture room in Odense, Denmark, using a gray-box model, sMAP data, and OccuRE occupancy prediction. Compared to RBC, MPC kept CO₂ mostly < 1000 ppm (baseline peak 1046 ppm), delivered anticipatory ventilation including 0% VAV when unoccupied, and reduced temperature-band violations.
Blum et al. [117]	A classical nonlinear MPC implemented via MPCPy in a Berkeley office building controlled a multi-zone UFAD HVAC using grey-box R2C2 models and 24 h/10 min optimization. It achieved ~40% HVAC energy savings versus PI control while maintaining comfort. Implementation effort split as 33% modeling, 29% preparation, 25% deployment, and 13% software development.
Bunning et al. [118]	A data-driven MPC in the two-zone Urban Mining and Recycling (UMAR) apartment at the NEST research center, Switzerland, used ARMAX, Random Forest, and ICNN models to control HVAC panels with a 6–7 h horizon and 30 min timestep. All approaches saved 26–49% energy over hysteresis RBC while maintaining comfort; ARMAX-MPC gave the best accuracy–stability–computation trade-off (0.2 s), showing simple models can outperform complex ML ones.
Lefebure et al. [119]	A distributed data-driven MPC coordinated multiple buildings and an energy hub via dual decomposition with virtual prices. Local controllers solved semi-relaxed MIQPs followed by QPs (Gurobi). Applied to the NEST/DFAB building (7 zones with GHP, electric boiler, TSS), it used only +0.42% more energy than centralized MPC, with slightly fewer comfort violations and clear superiority over decentralized control. Comfort stayed between 22–24 °C during the 24 h field test.
Zhang et al. [120]	A supervisory MPC deployed at the Blue Lake Rancheria microgrid in California, US, coordinated HVAC, refrigeration, 60 kW PV, and a 174 kWh/109 kW battery using nonlinear MPC (IPOPT/JModelica/CasADi) with a 24 h/5 min setup. It reduced annual electricity costs by ~12% and peak load by 34% versus BAU, maintaining comfort and food-safety constraints, and achieved 100% success in demand-limit/load-shift events, 83% in load-shedding, and 56% in load-tracking.
Zhan et al. [121]	A data-centric MPC combining LSTM disturbance forecasts, RC thermal modeling, and QP optimization was deployed in a 6-zone net-zero Singapore office. By aligning setpoints with PV output, it improved PV self-consumption by 19.5% and self-sufficiency by 10.6% versus fixed-setpoint baselines, while maintaining comfort and IAQ.
Yue et al. [122]	A hybrid MPC with a gray-box thermodynamic model and Cubist-based residual predictor was retrofitted into an airport terminal in Zhejiang via a cloud–edge IoT solution. Optimizing chilled-water supply and chiller sequencing, it achieved up to 37.3% energy savings and reduced discomfort time by >90% vs. RBC, with high predictive accuracy (RMSE = 0.12 °C).
Wang et al. [123]	A classical MPC for residential AC units in Shenzhen used a cloud-based 1R1C RC model and low-cost Tuya sockets. Tested for one month in the cooling season, the IPOPT-based MPC optimized ToU-driven cooling, yielding 22.1–26.8% utility-cost savings versus RBC and improved bedroom comfort.
Yin et al. [124]	A stochastic MPC (SMM-PC) using chance constraints ensured probabilistic comfort compliance under noise/disturbances in the NEST testbed (Dübendorf, Switzerland) for space heating, DHW, and batteries. It outperformed N4SID, BiLevel, and DeePC, achieving up to 8% energy savings and 90% fewer comfort violations.
Taheri et al. [125]	A cloud-based ARX-MPC retrofit for HVAC in an educational building in Indiana optimized PI setpoints via IoT microservices. Across six classrooms (two thermal zones), it delivered 19.2% energy savings and sharply reduced comfort violations, outperforming PI and occupancy-based controls while proving cost-efficient and scalable.
Wei et al. [126]	An economic MPC (YALMIP–Gurobi MILP) deployed on a Raspberry Pi for radiator and underfloor heating in a Nottingham passive-house pod. Running every 30 min, it cut electricity costs by 20% versus on/off control with perfect comfort tracking (0% CV-RMSE). With PCM wallboards, savings rose to 35%, validating a low-cost MPC–PCM IoT solution.
Pergantis et al. [127]	A data-driven convex MPC for an air-to-air heat pump with resistive backup in a 208 m² occupied Indiana home optimized setpoints using weather and occupancy forecasts. It maintained comfort while reducing heating energy by 19%, backup use by 38%, and peak-stage events by 83% compared to PID, saving nearly 300 USD annually.
Klanatsky et al. [128]	A grey-box data-driven MPC using MILP co-controlled TABS heating/cooling and façade blinds under a 24 h horizon in a 9-zone Living Lab office. It achieved 41% zone-level energy savings (17% heating, 75% cooling) and an 85% reduction in final energy use, while maintaining comfort and eliminating gas-boiler demand.
Gao et al. [129]	A data-corrected MPC integrating an Extended Kalman Filter and Particle Swarm Optimization controlled a PCM-based LHTES unit in a Guangdong data center. It maintained <3% control error, improved energy efficiency by 21.5%, and cut operating costs by 60.3% vs. PID, with low 20 s latency enabling real-time cooling management.
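
Most controllers summarized in Table A1 share the same receding-horizon pattern: at every control step a plan is optimized over a forecast horizon, only its first move is applied, and the procedure repeats at the next step. The Python sketch below illustrates that loop under heavily simplified assumptions that do not come from any reviewed study: a hypothetical 1R1C zone with made-up parameters, an on/off heater, a perfect outdoor-temperature forecast, and brute-force enumeration in place of the QP, MILP, or NLP solvers used in the actual deployments.

```python
from itertools import product

# All parameters below are illustrative assumptions, not values from the review.
DT_H = 0.25          # control step: 15 min, expressed in hours
R_K_PER_KW = 5.0     # envelope thermal resistance of the 1R1C model
C_KWH_PER_K = 2.0    # zone thermal capacitance of the 1R1C model
P_HEAT_KW = 8.0      # heater rating
T_LOW_C = 20.0       # lower comfort bound
PRICE = 0.2          # flat energy price per kWh
PENALTY = 10.0       # cost per kelvin of comfort violation per step


def step(temp_c, heat_on, t_out_c):
    """Advance the 1R1C zone model by one control step (explicit Euler)."""
    heat_flow_kw = (t_out_c - temp_c) / R_K_PER_KW + P_HEAT_KW * heat_on
    return temp_c + DT_H / C_KWH_PER_K * heat_flow_kw


def plan_cost(temp_c, plan, forecast):
    """Energy cost plus comfort penalty of one on/off plan over the horizon."""
    cost = 0.0
    for heat_on, t_out_c in zip(plan, forecast):
        temp_c = step(temp_c, heat_on, t_out_c)
        cost += PRICE * P_HEAT_KW * DT_H * heat_on
        cost += PENALTY * max(0.0, T_LOW_C - temp_c)
    return cost


def mpc_action(temp_c, forecast):
    """Return the first move of the cheapest on/off plan over the horizon."""
    plans = product((0, 1), repeat=len(forecast))
    best = min(plans, key=lambda plan: plan_cost(temp_c, plan, forecast))
    return best[0]


def run_closed_loop(temp_c, outdoor_c, horizon_steps):
    """Receding-horizon loop: optimize, apply the first move, shift, repeat."""
    temps, actions = [temp_c], []
    for k in range(len(outdoor_c) - horizon_steps + 1):
        forecast = outdoor_c[k:k + horizon_steps]
        heat_on = mpc_action(temp_c, forecast)
        temp_c = step(temp_c, heat_on, outdoor_c[k])
        temps.append(temp_c)
        actions.append(heat_on)
    return temps, actions


outdoor = [0.0] * 28  # 7 h of constant 0 °C outdoor temperature
temps, actions = run_closed_loop(20.0, outdoor, horizon_steps=4)
print(f"heater on in {sum(actions)} of {len(actions)} steps, "
      f"min zone temperature {min(temps):.2f} °C")
```

Enumeration scales as 2^N with the horizon length N, so it is viable only for very short horizons. This is one reason the reviewed deployments with horizons of 24 h or more rely on structured solvers and, in several cases, on linearized or convex models.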

Table A2. Summaries of ANN applications for real-life BEMS.

Author	Summary
Peng et al. [130]	An occupant-centric HVAC controller learned individual temperature preferences using an MLP classifier based on hour, outdoor air, indoor air, and CO₂ levels, with simple rules and local PID enforcing safe setpoints. Deployed for ~5 months in a Singapore office (four rooms), it replaced a fixed schedule with adaptive 20.5–24 °C setpoints. The system delivered 4–25% cooling-energy savings and reduced occupant overrides to <1 day/month, showing practical, low-cost personalization on standard BMS infrastructure.
Sadeghian et al. [131]	A real-life MLP/NARX identification model for dynamic heating was deployed in an academic smart building at UPEC (France), incorporating occupancy as an exogenous input. Trained and validated on 18 h of sensor data (1 min timestep), it achieved prediction errors <0.2 °C (OSP) and ~0.4 °C (MSP). Results showed heating rates increased fourfold with occupancy (0.015→0.042 °C/min), demonstrating the model’s potential for adaptive, occupant-aware HVAC control.
Mari et al. [132]	A 1-D CNN sequence-to-point NILM model trained on REFIT was deployed for one year in two Italian homes for edge-based real-time appliance disaggregation. Running on a Cortex-M7 microcontroller, it processed 8 s windows of total power and achieved <12% relative error for major appliances. Field tests confirmed strong adaptability and generalization, validating embedded deep learning for practical residential energy management.
Khan et al. [133]	A federated-learning LSTM architecture for multi-occupancy prediction was deployed in a 10-room office at ICAR-CNR (Rende, Italy). Hierarchical aggregation across Edge and Cloud preserved privacy and reduced communication overhead. The LSTM reached 84.5% accuracy and a 0.845 F1-score for 10 min-ahead predictions, matching centralized models while improving scalability and energy efficiency for Cognitive Buildings.
Bae et al. [134]	A personalized LSTM-based ventilation controller was implemented in the Chung-Ang University Living Lab in Seoul, South Korea, predicting CO₂ 5 min ahead using real-time occupant features (MET, BMI, gender). It outperformed rule-based ventilation, achieving 100% comfort compliance in mock-up tests and ~25% energy savings in real-office deployment. The approach demonstrates an effective single-agent ANN strategy for proactive ERV control.
Simoes et al. [135]	An MLP model for predicting natural ventilation airflow was experimentally validated in the Library of the Faculty of Sciences at the University of Lisbon (FCUL Library Atrium), located on the Campo Grande campus, Lisbon, Portugal. Trained using a heat-balance soft sensor, it achieved ~30% MAPE and generalized well across differing weather and window configurations. Deployed from March–June 2024, it demonstrated feasibility while avoiding CFD or flow-sensor costs, showing promise for scalable smart-BMS NV integration.
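
Several summaries in Tables A1 and A2 quote model-fit metrics such as MAPE, RMSE, CVRMSE, and MBE. For readers comparing such figures across studies, the Python sketch below shows one common way of computing them. Definitions vary between papers: the sign convention for the bias (measured minus predicted), the normalization by the mean of the measured values, and the sample data are assumptions made here for illustration, not definitions taken from the reviewed works.

```python
from math import sqrt


def rmse(measured, predicted):
    """Root mean square error, in the units of the measured signal."""
    n = len(measured)
    return sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def cvrmse(measured, predicted):
    """RMSE normalized by the mean of the measured values, in percent."""
    mean_measured = sum(measured) / len(measured)
    return 100.0 * rmse(measured, predicted) / mean_measured


def mape(measured, predicted):
    """Mean absolute percentage error; measured values must be non-zero."""
    n = len(measured)
    return 100.0 * sum(abs((m - p) / m) for m, p in zip(measured, predicted)) / n


def nmbe(measured, predicted):
    """Normalized mean bias error in percent.

    Assumed sign convention: positive when the model under-predicts.
    """
    return 100.0 * sum(m - p for m, p in zip(measured, predicted)) / sum(measured)


# Invented sample data, e.g. hourly energy use in kWh.
measured = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 36.0]

print(f"MAPE   = {mape(measured, predicted):.2f} %")
print(f"CVRMSE = {cvrmse(measured, predicted):.2f} %")
print(f"NMBE   = {nmbe(measured, predicted):.2f} %")
```

The example highlights why a single metric can mislead: positive and negative errors largely cancel in the bias, so it comes out much smaller than MAPE or CVRMSE on the same data.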

Table A3. Summaries of RL applications for real-life BEMS.

Author	Summary
De et al. [136]	A batch RL controller (FQI) was deployed to learn occupant DHW usage and align reheating with PV peaks, using a DNN PV forecaster and an ExtraTrees Q-approximator, with a backup thermostat enforcing comfort. In six Dutch homes with smart heat pumps and PV, the policy (24 h horizon, 1 h grid, 5 min actions) was retrained hourly from field data. Relative to a thermostat baseline, PV self-consumption rose from 46.7% to 58.5% and PV→DHW capture from 6.3% to 16.9%, shifting demand toward solar without comfort loss.
Touzani et al. [137]	A DDPG agent was deployed in LBNL’s FLEXLab to jointly control HVAC setpoints, PV, and a Tesla Powerwall (15-min steps, 2–3 h price look-ahead). Compared with ASHRAE G36 rule control, it achieved up to 39.6% cost savings and improved comfort during demand-response events, demonstrating RL’s ability to coordinate HVAC + storage for cost-flexible real-building operation.
Lei et al. [138]	A Branching Dueling Q-network controlled HVAC and ceiling fans in a real open-plan NUS office. Pre-trained in calibrated Modelica and refined online using personalized comfort feedback, the BDQ improved HVAC energy savings by 13.9% and raised thermal acceptability from 88% to 99% versus a fixed 27 °C + M3 baseline, with fan energy under 3% of total.
Du et al. [139]	A DQN controller for a two-zone residential HVAC system in Knoxville (USA) was trained offline and field-tested for 11 days at ORNL’s Yarnell Station. The controller achieved ~32% simulated cost savings and 12.8% real-life cost reduction under ToU tariffs, autonomously learning preheating strategies while maintaining comfort without explicit thermal modeling.
Naug et al. [140]	A dual-loop DRL supervisory controller using PPO with multi-actor training and LSTM-based building models was deployed in a LEED Gold mixed-use building at Vanderbilt University. With offline relearning triggered by performance drift, it outperformed RBC, reducing energy by 14%, improving comfort by 21%, and lowering actuator activity by 11%.
Silvestri et al. [141]	A Soft Actor–Critic (SAC) agent was deployed in the HiLo office (NEST, Dübendorf) to control a TABS-based HVAC system. In simulation, SAC reduced energy by 15–50% and temperature violations by ~25% versus RBCs, and by 23%/5% versus PI, delivering MPC-like comfort with only 29% more energy. During two months of real cooling operation, SAC improved indoor temperature control by 68% relative to the best RBC, with no energy increase.
Heidari et al. [142]	A Maskable PPO (MPPO) controller was developed for an 8-storey academic building connected to the world’s first CO₂-based heating network in Sion, Switzerland. Action masking prevented unsafe setpoints while optimizing slab and AHU temperatures. In simulation, MPPO cut energy costs by 8% with ≤1 °C comfort deviation; during real deployment it achieved up to 36% cost reduction versus RBC.
Silvestri et al. [143]	A PPO controller pre-trained via behavioral cloning of an RBC was deployed in the HiLo office at EMPA NEST (Dübendorf) for single-zone TABS control (5-min steps). It adapted online without surrogate models and achieved up to 41% energy savings and 43% fewer comfort violations versus two RBC baselines. A digital twin verified performance (RMSE 0.53 °C, MAPE 7.4%).

Table A4. Summaries of FLC applications for real-life BEMS.

Author	Summary
Jazizadeh et al. [144]	A human-in-the-loop BEMS used a snapped preference slider and a Wang–Mendel fuzzy model to learn personalized comfort maps, with a proportional decentralized controller adjusting zone setpoints. Deployed in a USC office (Los Angeles) across 3 zones/6 rooms, comfort rose to 8.4/10 while airflow—an HVAC energy proxy—dropped 39% vs. legacy fixed-setpoint BMS and 26% vs. the same controller at a fixed setpoint. The approach integrates easily with existing VAV/AHU systems and translates personalization directly into energy savings.
Ulpiani et al. [145]	A real-life comparison of a Mamdani FLC against PID and On/Off control for a single-zone radiator system in an NZEB mock-up in Agugliano (Italy). The FLC used indoor temperature, its derivative, and outdoor temperature to modulate radiator power. It achieved up to 68% energy savings and perfect comfort stability (PPD·h = 0), outperforming both baselines and offering the most balanced performance under varying weather.
Li et al. [146]	A thermal-sensation-based fuzzy controller used wearable measurements (skin temperature, its derivative, heart rate) to estimate individual sensations and update zone setpoints every 30 min, while airflow was PID-controlled. Implemented in a university office in Dalian (China), it improved comfort scores (5.56 vs. 5.10) and reduced total HVAC energy by 13.8% (AHU −20.5%, water loop −13.4%) compared to fixed setpoints.
Chojecki et al. [147]	A fuzzy-logic EMS was embedded into a smart meter controlling residential PV + battery systems in Lodz (Poland). Built in C++ with a multi-rule-base architecture, the controller autonomously optimized energy flows under ToU tariffs. Real hardware tests showed 30% PAR reduction, 34% peak-energy reduction, and 7% total energy savings, proving the feasibility of intelligent FLC directly on smart-meter hardware.
Lahlouh et al. [148]	A MIMO fuzzy–PID controller was deployed in a poultry house (Rabat, Morocco) to regulate temperature, humidity, and gas levels. Combining fuzzy tuning with PID action yielded 43% energy savings, 97% growth efficiency, and reduced CO₂ to 2461 ppm, outperforming both simple fuzzy and On/Off baselines while maintaining highly stable indoor conditions.
Sadeghian et al. [149]	A Takagi–Sugeno fuzzy controller was validated in a five-floor academic building at Université Paris-Est Créteil. Using an MLP-NARX model with occupancy as a thermal gain, the neuro-fuzzy system modulated radiator valves on PLCs. Field tests showed large energy reductions—up to 89% in small offices and 40% in classrooms—while maintaining temperatures near 20 °C, outperforming On/Off control.
Chojecki et al. [150]	A Mamdani FLC for AHUs, implemented on an ESP32 and tested via hardware-in-the-loop for a single-zone HVAC system in Lodz (Poland), replaced traditional PIDs. It achieved 27.4% lower integral errors, 36% lower temperature deviation, and 12.7% energy savings compared with an untuned PID, while eliminating seasonal manual retuning.
Habib et al. [151]	A Sugeno-type fuzzy EMS running on an edge PC controlled a 6.3 kWp PV–4.5 kW battery system at Offenburg University’s Smart Grid Lab. Using SOC, power balance, and price inputs, it regulated ESS current and grid interaction. Compared with rule-based control, it provided smoother SOC behavior, reliable island/grid transitions, and adaptive real-time cost optimization.
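
The action masking noted for the Maskable PPO controller [142] can be illustrated with a minimal Python sketch. It is not the authors' implementation: the candidate setpoints, the allowed band, the logits, and the function names `safe_setpoint_mask` and `masked_softmax` are hypothetical, and the sketch only shows the general idea of forcing the probability of disallowed discrete actions to zero before an action is sampled.

```python
import math


def safe_setpoint_mask(setpoints_c, low_c, high_c):
    """Mark each candidate setpoint as allowed (True) or masked out (False)."""
    return [low_c <= s <= high_c for s in setpoints_c]


def masked_softmax(logits, mask):
    """Softmax over policy logits with masked actions forced to probability 0."""
    if not any(mask):
        raise ValueError("at least one action must remain allowed")
    # Disallowed actions get a logit of -inf, so exp() maps them to exactly 0.
    masked = [l if ok else -math.inf for l, ok in zip(logits, mask)]
    peak = max(masked)  # finite, because at least one action is allowed
    exps = [math.exp(l - peak) for l in masked]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical discrete setpoints (deg C) and raw policy logits.
setpoints = [18.0, 20.0, 22.0, 24.0, 26.0]
logits = [2.0, 0.5, 0.1, 0.3, 1.5]

# Assumed safe band of 20-24 deg C; values outside it are masked.
mask = safe_setpoint_mask(setpoints, low_c=20.0, high_c=24.0)
probs = masked_softmax(logits, mask)

for s, p in zip(setpoints, probs):
    print(f"{s:.1f} C -> p = {p:.3f}")
```

Under the assumed 20–24 °C band, the two out-of-band setpoints receive zero probability even though their logits are the largest, and the remaining probabilities renormalize to one.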

Table A4. Summaries of hybrid applications for real-life BEMS.

Author	Summary
Kazmi et al. [152]	A hybrid heuristic RL strategy combining Q-learning with Ant Colony Optimization controlled ASHP-based DHW systems in nZEB homes in Amersfoort (NL). SARIMA learned occupant demand, a Kalman-filtered hybrid model predicted tank thermodynamics, and ACO refined reheating schedules. In field tests, the controller saved 17–27% energy (up to 36% in ideal simulations) with zero comfort violations and showed clear DR flexibility.
Javed et al. [153]	A decentralized IoT-based BEMS using Random Neural Networks trained via hybrid PSO–SQP optimized PMV-based HVAC control. Integrated CO₂/door/PIR sensing achieved 88% occupancy-detection accuracy. Tested in a single-zone chamber at Glasgow Caledonian University, it reduced HVAC energy by 27.1% vs. rule-based thermostats while maintaining comfort, demonstrating feasibility on low-power devices.
Peng et al. [154]	A self-learning hybrid controller using k-means occupancy clustering and KNN-based cooling prediction was deployed in 11 office zones in Singapore with passive chilled beams. It delivered 7–52% cooling-energy savings (mean 21%) while keeping comfort within 0.5 °C of baseline and maintaining >95% control precision.
Gonzalez et al. [155]	A multi-agent hybrid CBR–MLP HVAC controller was implemented across seven offices in the University of Salamanca. Agents learned occupant schedules and adjusted heating/cooling every 30 min. The system achieved ≈41% energy savings over rule-based thermostats while maintaining acceptable comfort.
Png et al. [156]	An IoT-enabled hybrid MPC–RLS–ANN multi-agent framework coordinated 85 zones in an NTU Singapore building. Local MPCs computed zone cooling demand, while a central QCQP scheduler optimized air distribution and chiller load. With a 4-h horizon and 15-min steps, the system achieved 18–23% average savings (up to 31%), with computation under 90 s and a 1–1.5-year payback.
Zhang et al. [157]	A DRL–GA hybrid (A3C + NSGA-II calibration) was trained on a calibrated EnergyPlus model and deployed for 78 days in CMU’s Intelligent Workplace in Pittsburgh. The real-building A3C agent reduced heating energy by 16.7% versus PID-based rule control while keeping comfort stable.
Baniasadi et al. [158]	A hybrid PSO–MPC coordinated PV, battery, thermal storage, and heat pump operation for a Perth residence. PSO sized components and MPC optimized operation under dynamic pricing. Real-life results showed 80% annual electricity-cost reduction, 42% lower life-cycle cost, and 57.3% PV self-consumption.
Rochd et al. [159]	A PSO–RBC hybrid HEMS was deployed at the Moroccan Smart-Campus house, scheduling HVAC, DHW, EV, and appliances over 24-h/10-min horizons. Compared to unscheduled operation, it increased PV self-consumption from 60 to 90% (+30 pp), cut grid dependence by 30 pp, reduced electricity cost by 85%, and shortened payback from 17 to 11 years; under RTP, an additional 13–26% cost reduction was achieved.
Yang et al. [160]	A NARX-RNN controller learned to emulate MPC behavior without online optimization. Tested in two real single-zone NTU buildings (office, lecture theatre), it achieved 51.6% and 36.2% energy savings vs. fixed-setpoint/PID baselines, maintained PMV within limits, and reduced computation by >100×.
Laouali et al. [161]	A NILM system using RBFNNs optimized via a multi-objective GA (ApproxHull) was deployed in a two-floor smart home in Faro (Portugal). It achieved 95–100% detection accuracy and 93–99% estimation accuracy, outperforming SVM, LSTM, and CNN baselines; 66% of energy was attributable to identifiable loads, with 60% schedulable.
Massana et al. [162]	A GA–RF hybrid EMS was deployed for 14 days in the Walqa Technology Park (Spain), coordinating HVAC, EV charging, and hydrogen systems via RF forecasts and GA scheduling. Operational costs fell from €548 to €360 on average (−34%) and by up to €131 (−76%) on the best day, improving renewable use and efficiency.
Ruddick et al. [163]	Safe RL (TD3/OptLayer), MILP–MPC, and TreeC were compared in real-world operation in four replicated Brussels houses with PV, batteries, and EVs. TreeC delivered the safest operation (27 Wh grid exceedance) with costs comparable to MPC and RBC (≤0.6% deviation), while RL incurred 25.5% higher costs due to online learning requirements.
Zouloumis et al. [164]	A lightweight smart thermostat using ALAMO-based surrogate modelling and rule-based preheating was deployed on a low-cost PLC in Kozani (Greece). It achieved 97–100% comfort compliance with RMSE ≤ 0.12 °C, enabling fully local, cloud-free HVAC control for single-zone systems.
Zheng et al. [165]	A hybrid ReliefF–SSA–CNN–BiLSTM forecaster integrated with a WOA-tuned PID was deployed for HVAC control in a large multi-zone Hangzhou office. Using 3-min data from three zones, it achieved 1.25% MAPE and kept indoor conditions within 0.5 K/0.3 g/kg, outperforming classical and deep baselines by over 60% in accuracy and stability.
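
Several entries in Table A4 mix relative changes (%) with percentage-point changes (pp). The short Python sketch below re-derives two of the quoted figures, using only numbers stated in the table; the helper names `relative_change_pct` and `percentage_point_change` are illustrative and do not come from the cited studies.

```python
def relative_change_pct(before, after):
    """Relative change of `after` with respect to `before`, in percent."""
    return (after - before) / before * 100.0


def percentage_point_change(before_pct, after_pct):
    """Absolute difference between two percentages, in percentage points."""
    return after_pct - before_pct


# Massana et al. [162]: average operational cost fell from EUR 548 to EUR 360.
cost_change = relative_change_pct(548.0, 360.0)

# Rochd et al. [159]: PV self-consumption rose from 60% to 90%.
pv_pp = percentage_point_change(60.0, 90.0)
pv_rel = relative_change_pct(60.0, 90.0)

print(f"cost change: {cost_change:.1f} %")
print(f"PV self-consumption: +{pv_pp:.0f} pp, i.e. {pv_rel:.0f} % relative")
```

The distinction matters when comparing studies: a rise from 60% to 90% is 30 pp but a 50% relative increase, so the two conventions should not be ranked against each other directly.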

Table A5. Summaries of other applications for real-life BEMS.

Author	Summary
Dai et al. [166]	A peer-to-peer decentralized chiller-plant controller was deployed on a commercial-type system at Tsinghua University. Each pump/chiller self-optimized and negotiated cooling load to equalize relative efficiency, routinely shutting down surplus units and driving active machines toward their best efficiency points. The approach yielded 6.8–9.9% power savings in individual tests and up to 14.5% over conventional sequencing across a daily profile, remained close to exhaustive-search optimality, and converged in under 1 s.
Michailidis et al. [167]	A decentralized, model-free, self-learning HVAC controller (L4GPCAO) was experimentally validated in three conference rooms at the E.ON Energy Research Center in Aachen, Germany. Each zone acted as an autonomous agent with 3 h forecasts and daily parameter updates. Relative to the PID-based BMS baseline, it reduced non-renewable energy use by 34.7% (193.12 to 126.03 kJ/m²) while maintaining comfort and producing smoother actuator trajectories.
Michailidis et al. [168]	A Parametrized Centralized Cognitive Adaptive Optimization (PCAO) controller was deployed on a legacy split A/C unit in a 15 m² apartment in Xanthi, Greece using low-cost IoT sensing. The model-free controller updated setpoints every 15 min with 3 h forecasts and learned daily from real data. Compared with RBC, summer operation saw 37% lower energy use and 19% cost reduction with negligible comfort loss; winter operation achieved 9.8% energy savings, 10.6% comfort improvement, and 10.5% cost reduction. Peak daily savings reached 40% (summer) and 30% (winter), demonstrating a practical and affordable residential solution.
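
The 34.7% reduction reported for the L4GPCAO controller [167] follows directly from the two energy figures quoted in the table. As a worked check, using only those two values:

```latex
\[
\frac{193.12 - 126.03}{193.12} \;=\; \frac{67.09}{193.12} \;\approx\; 0.347 \quad (34.7\%)
\]
```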

References

Niza, I.L.; Luz, I.M.d.; Bueno, A.M.; Broday, E.E. Thermal comfort and energy efficiency: Challenges, barriers, and step towards sustainability. Smart Cities 2022, 5, 1721–1741.
Shaikh, P.H.; Nor, N.B.M.; Nallagownden, P.; Elamvazuthi, I.; Ibrahim, T. A review on optimized control systems for building energy and comfort management of smart sustainable buildings. Renew. Sustain. Energy Rev. 2014, 34, 409–429.
Harish, V.; Kumar, A. A review on modeling and simulation of building energy systems. Renew. Sustain. Energy Rev. 2016, 56, 1272–1292.
Michailidis, P.; Michailidis, I.; Gkelios, S.; Kosmatopoulos, E. Artificial neural network applications for energy management in buildings: Current trends and future directions. Energies 2024, 17, 570.
Balta, M.T.; Dincer, I.; Hepbasli, A. Development of sustainable energy options for buildings in a sustainable society. Sustain. Cities Soc. 2011, 1, 72–80.
González-Torres, M.; Pérez-Lombard, L.; Coronel, J.F.; Maestre, I.R.; Yan, D. A review on buildings energy information: Trends, end-uses, fuels and drivers. Energy Rep. 2022, 8, 626–637.
D’Agostino, D.; Minelli, F.; Minichiello, F.; Musella, M. Improving the indoor air quality of office buildings in the post-pandemic era—Impact on energy consumption and costs. Energies 2024, 17, 855.
Merabti, S.; Draoui, B.; Bounaama, F. A review of control systems for energy and comfort management in buildings. In Proceedings of the 2016 8th International Conference on Modelling, Identification and Control (ICMIC), Algiers, Algeria, 15–17 November 2016; pp. 478–486.
Michailidis, P.; Michailidis, I.; Kosmatopoulos, E. Reinforcement learning for optimizing renewable energy utilization in buildings: A review on applications and innovations. Energies 2025, 18, 1724.
Aguilar, J.; Garces-Jimenez, A.; R-Moreno, M.D.; García, R. A systematic literature review on the use of artificial intelligence in energy self-management in smart buildings. Renew. Sustain. Energy Rev. 2021, 151, 111530.
Michailidis, I.T.; Kapoutsis, A.C.; Korkas, C.D.; Michailidis, P.T.; Alexandridou, K.A.; Ravanis, C.; Kosmatopoulos, E.B. Embedding autonomy in large-scale IoT ecosystems using CAO and L4G-CAO. Discov. Internet Things 2021, 1, 8.
Ashraf, S.; Zarie, M.M.; Abdellatif, S.O. Towards autonomous energy management: Machine learning for effective auditing and optimization. Sci. Rep. 2025, 15, 39368.
Haniff, M.F.; Selamat, H.; Yusof, R.; Buyamin, S.; Ismail, F.S. Review of HVAC scheduling techniques for buildings towards energy-efficient and cost-effective operations. Renew. Sustain. Energy Rev. 2013, 27, 94–103.
Lim, B.; Van Den Briel, M.; Thiébaux, S.; Backhaus, S.; Bent, R. HVAC-aware occupancy scheduling. Proc. AAAI Conf. Artif. Intell. 2015, 29.
Bagdadee, A.H.; Zhang, L.; Saddam Hossain Remus, M. A brief review of the IoT-based energy management system in the smart industry. In Artificial Intelligence and Evolutionary Computations in Engineering Systems; Springer: Singapore, 2020; pp. 443–459.
Liu, Y.; Yang, C.; Jiang, L.; Xie, S.; Zhang, Y. Intelligent edge computing for IoT-based energy management in smart cities. IEEE Netw. 2019, 33, 111–117.
Michailidis, I.T.; Sangi, R.; Michailidis, P.; Schild, T.; Fuetterer, J.; Mueller, D.; Kosmatopoulos, E.B. Balancing energy efficiency with indoor comfort using smart control agents: A simulative case study. Energies 2020, 13, 6228.
Said, O.; Al-Makhadmeh, Z.; Tolba, A. EMS: An energy management scheme for green IoT environments. IEEE Access 2020, 8, 44983–44998.
Salpakari, J.; Lund, P. Optimal and rule-based control strategies for energy flexibility in buildings with PV. Appl. Energy 2016, 161, 425–436.
Doukas, H.; Patlitzianas, K.D.; Iatropoulos, K.; Psarras, J. Intelligent building energy management system using rule sets. Build. Environ. 2007, 42, 3562–3569.
Péan, T.Q.; Salom, J.; Costa-Castelló, R. Review of control strategies for improving the energy flexibility provided by heat pump systems in buildings. J. Process Control 2019, 74, 35–49.
Zhu, J.; Tian, Z.; Niu, J.; Lu, Y.; Cheng, B.; Zhou, H. Machine learning-enhanced lightweight rule-based control strategy for building energy demand response. Build. Simul. 2025, 18, 1857–1876.
Tamani, N.; Ahvar, S.; Santos, G.; Istasse, B.; Praca, I.; Brun, P.E.; Ghamri, Y.; Crespi, N.; Becue, A. Rule-based model for smart building supervision and management. In Proceedings of the 2018 IEEE International Conference on Services Computing (SCC), San Francisco, CA, USA, 2–7 July 2018; pp. 9–16.
Michailidis, P.; Michailidis, I.; Vamvakas, D.; Kosmatopoulos, E. Model-free HVAC control in buildings: A review. Energies 2023, 16, 7124.
Tang, H.; Li, B.; Zhang, Y.; Pan, J.; Wang, S. A coordinated predictive scheduling and real-time adaptive control for integrated building energy systems with hybrid storage and rooftop PV. Renew. Energy 2025, 239, 122047.
Chen, H.; Xiong, R.; Lin, C.; Shen, W. Model predictive control based real-time energy management for hybrid energy storage system. CSEE J. Power Energy Syst. 2020, 7, 862–874.
Scherer, H.F.; Pasamontes, M.; Guzmán, J.L.; Álvarez, J.; Camponogara, E.; Normey-Rico, J. Efficient building energy management using distributed model predictive control. J. Process Control 2014, 24, 740–749.
Godina, R.; Rodrigues, E.M.; Pouresmaeil, E.; Matias, J.C.; Catalão, J.P. Model predictive control home energy management and optimization strategy with demand response. Appl. Sci. 2018, 8, 408.
Michailidis, P.; Michailidis, I.; Minelli, F.; Coban, H.H.; Kosmatopoulos, E. Model Predictive Control for Smart Buildings: Applications and Innovations in Energy Management. Buildings 2025, 15, 3298.
Drgoňa, J.; Arroyo, J.; Figueroa, I.C.; Blum, D.; Arendt, K.; Kim, D.; Ollé, E.P.; Oravec, J.; Wetter, M.; Vrabie, D.L.; et al. All you need to know about model predictive control for buildings. Annu. Rev. Control 2020, 50, 190–232.
Georgiou, G.S.; Christodoulides, P.; Kalogirou, S.A. Implementing artificial neural networks in energy building applications—A review. In Proceedings of the 2018 IEEE International Energy Conference (ENERGYCON), Limassol, Cyprus, 3–7 June 2018; pp. 1–6.
Ferreira, P.; Ruano, A.; Silva, S.; Conceição, E. Neural networks based predictive control for thermal comfort and energy savings in public buildings. Energy Build. 2012, 55, 238–251.
Hernández, J.L.; Sanz, R.; Corredera, Á.; Palomar, R.; Lacave, I. A fuzzy-based building energy management system for energy efficiency. Buildings 2018, 8, 14.
Keshtkar, A.; Arzanpour, S. An adaptive fuzzy logic system for residential energy management in smart grid environments. Appl. Energy 2017, 186, 68–81.
D’Agostino, D.; Minelli, F.; Minichiello, F. New genetic algorithm-based workflow for multi-objective optimization of Net Zero Energy Buildings integrating robustness assessment. Energy Build. 2023, 284, 112841.
Arabali, A.; Ghofrani, M.; Etezadi-Amoli, M.; Fadali, M.S.; Baghzouz, Y. Genetic-algorithm-based optimization approach for energy management. IEEE Trans. Power Deliv. 2012, 28, 162–170.
Zupančič, J.; Filipič, B.; Gams, M. Genetic-programming-based multi-objective optimization of strategies for home energy-management systems. Energy 2020, 203, 117769.
Han, T.; Muhammad, K.; Hussain, T.; Lloret, J.; Baik, S.W. An efficient deep learning framework for intelligent energy management in IoT networks. IEEE Internet Things J. 2020, 8, 3170–3179.
Alam, M.M.; Rahman, M.H.; Ahmed, M.F.; Chowdhury, M.Z.; Jang, Y.M. Deep learning based optimal energy management for photovoltaic and battery energy storage integrated home micro-grid system. Sci. Rep. 2022, 12, 15133.
Villano, F.; Mauro, G.M.; Pedace, A. A review on machine/deep learning techniques applied to building energy simulation, optimization and management. Thermo 2024, 4, 100–139.
Hussien, A.; Khan, W.; Hussain, A.; Liatsis, P.; Al-Shamma’a, A.; Al-Jumeily, D. Predicting energy performances of buildings’ envelope wall materials via the random forest algorithm. J. Build. Eng. 2023, 69, 106263.
Ling, Z.; Tao, Q.; Zheng, J.; Xiong, P.; Liu, M.; Xiao, Z.; Gang, W. A nonintrusive load monitoring method for office buildings based on random forest. Buildings 2021, 11, 449.
Ma, H.; Yang, X.; Mao, J.; Zheng, H. The energy efficiency prediction method based on gradient boosting regression tree. In Proceedings of the 2018 2nd IEEE Conference on Energy Internet and Energy System Integration (EI2), Beijing, China, 20–22 October 2018; pp. 1–9.
Touzani, S.; Granderson, J.; Fernandes, S. Gradient boosting machine for modeling the energy consumption of commercial buildings. Energy Build. 2018, 158, 1533–1543.
Wang, X.; Dong, B. Physics-informed hierarchical data-driven predictive control for building HVAC systems to achieve energy and health nexus. Energy Build. 2023, 291, 113088.
Manic, M.; Wijayasekara, D.; Amarasinghe, K.; Rodriguez-Andina, J.J. Building energy management systems: The age of intelligent and adaptive buildings. IEEE Ind. Electron. Mag. 2016, 10, 25–39.
Rossiter, J.A. Model-Based Predictive Control: A Practical Approach; CRC Press: Boca Raton, FL, USA, 2017.
Khabbazi, A.J.; Pergantis, E.N.; Premer, L.D.R.; Papageorgiou, P.; Lee, A.H.; Braun, J.E.; Henze, G.P.; Kircher, K.J. Lessons learned from field demonstrations of model predictive control and reinforcement learning for residential and commercial HVAC: A review. arXiv 2025, arXiv:2503.05022.
Vásquez-Hernández, A.; Álvarez, M.F.R. Evaluation of buildings in real conditions of use: Current situation. J. Build. Eng. 2017, 12, 26–36.
Abuimara, T.; Hobson, B.W.; Gunay, B.; O’Brien, W. A data-driven workflow to improve energy efficient operation of commercial buildings: A review with real-world examples. Build. Serv. Eng. Res. Technol. 2022, 43, 517–534.
Khoa, B.Q.; Nguyen, H.T.; Anh, D.B.H.; Khanh, N.H.; Toai, N. An extensive analysis of machine learning applications in HVAC. Int. J. Multidiscipl. Res. Growth Eval. 2025, 6, 1006–1028.
Aghili, S.A.; Haji Mohammad Rezaei, A.; Tafazzoli, M.; Khanzadi, M.; Rahbar, M. Artificial Intelligence Approaches to Energy Management in HVAC Systems: A Systematic Review. Buildings 2025, 15, 1008.
Garimella, S.; Lockyear, K.; Pharis, D.; El Chawa, O.; Hughes, M.T.; Kini, G. Realistic pathways to decarbonization of building energy systems. Joule 2022, 6, 956–971.
Seyam, S. Types of HVAC Systems; IntechOpen: London, UK, 2018; pp. 49–66.
Lu, L.; Cai, W.; Xie, L.; Li, S.; Soh, Y.C. HVAC system optimization—In-building section. Energy Build. 2005, 37, 11–22.
Pomianowski, M.Z.; Johra, H.; Marszal-Pomianowska, A.; Zhang, C. Sustainable and energy-efficient domestic hot water systems: A review. Renew. Sustain. Energy Rev. 2020, 128, 109900.
Mabrouki, J.; Azrour, M.; Boubekraoui, A.; El Hajjaji, S. Simulation and optimization of solar domestic hot water systems. Int. J. Soc. Ecol. Sustain. Dev. (IJSESD) 2022, 13, 1–11.
Krishna, K.S.; Kumar, K.S. A review on hybrid renewable energy systems. Renew. Sustain. Energy Rev. 2015, 52, 907–916.
Bonomo, P.; Frontini, F.; Loonen, R.; Reinders, A. Comprehensive review and state of play in the use of photovoltaics in buildings. Energy Build. 2024, 323, 114737.
Amrouche, S.O.; Rekioua, D.; Rekioua, T.; Bacha, S. Overview of energy storage in renewable energy systems. Int. J. Hydrogen Energy 2016, 41, 20914–20927.
Berseneff, B.; Perrin, M.; Tran-Quoc, T.; Brault, P.; Mermilliod, N.; Hadjsaid, N.; Delaplagne, T.; Martin, N.; Crouzevialle, B. The significance of energy storage for renewable energy generation and the role of instrumentation and measurement. IEEE Instrum. Meas. Mag. 2014, 17, 34–40.
Mitali, J.; Dhinakaran, S.; Mohamad, A. Energy storage systems: A review. Energy Storage Sav. 2022, 1, 166–216.
Acharige, S.S.; Haque, M.E.; Arif, M.T.; Hosseinzadeh, N.; Hasan, K.N.; Oo, A.M.T. Review of electric vehicle charging technologies, standards, architectures, and converter configurations. IEEE Access 2023, 11, 41218–41255.
Kurucan, M.; Michailidis, P.; Michailidis, I.; Minelli, F. A Modular Hybrid SOC-Estimation Framework with a Supervisor for Battery Management Systems Supporting Renewable Energy Integration in Smart Buildings. Energies 2025, 18, 4537.
Coban, H.H.; Lewicki, W. Assessing the efficiency of hybrid energy facilities for electric vehicle charging. Sci. Pap. Sil. Univ. Technol. Organ. Manag. Ser. 2023, 184, 61–72.
Coban, H.H. Production and use of electric vehicle batteries. In Energy Systems Design for Low-Power Computing; IGI Global: Hershey, PA, USA, 2023; pp. 279–304.
Rahman, I.; Vasant, P.M.; Singh, B.S.M.; Abdullah-Al-Wadud, M.; Adnan, N. Review of recent trends in optimization techniques for plug-in hybrid, and electric vehicle charging infrastructures. Renew. Sustain. Energy Rev. 2016, 58, 1039–1047.
Kurucan, M.; Özbaltan, M.; Yetgin, Z.; Alkaya, A. Applications of artificial neural network based battery management systems: A literature review. Renew. Sustain. Energy Rev. 2024, 192, 114262.
Luo, J.; Rohn, J.; Xiang, W.; Bertermann, D.; Blum, P. A review of ground investigations for ground source heat pump (GSHP) systems. Energy Build. 2016, 117, 160–175.
Sarbu, I.; Sebarchievici, C. General review of ground-source heat pump systems for heating and cooling of buildings. Energy Build. 2014, 70, 441–454.
Sarbu, I.; Sebarchievici, C. Using ground-source heat pump systems for heating/cooling of buildings. In Advances in Geothermal Energy; IntechOpen: London, UK, 2016; pp. 1–14.
Imam, M.T.; Afshari, S.; Mishra, S. Smart lighting control systems. In Intelligent Building Control Systems: A Survey of Modern Building Control and Sensing Strategies; Springer: Berlin, Germany, 2017; pp. 221–251.
Guo, X.; Tiller, D.; Henze, G.; Waters, C. The performance of occupancy-based lighting control systems: A review. Light. Res. Technol. 2010, 42, 415–431.
Yang, T.; Clements-Croome, D.; Marson, M. Building energy management systems. Encycl. Sustain. Technol. 2017, 36, 291–309.
Michailidis, P.; Michailidis, I.; Kosmatopoulos, E. Review and Evaluation of Multi-Agent Control Applications for Energy Management in Buildings. Energies 2024, 17, 4835.
Barber, K.A.; Krarti, M. A review of optimization based tools for design and control of building energy systems. Renew. Sustain. Energy Rev. 2022, 160, 112359.
Kouvaritakis, B.; Cannon, M. Model Predictive Control; Springer International Publishing: Cham, Switzerland, 2016; Volume 38, p. 7.
Mantovani, G.; Ferrarini, L. Temperature control of a commercial building with model predictive control techniques. IEEE Trans. Ind. Electron. 2014, 62, 2651–2660.
Attia, A.H.; Rezeka, S.F.; Saleh, A.M. Fuzzy logic control of air-conditioning system in residential buildings. Alex. Eng. J. 2015, 54, 395–403.
Kalogirou, S.A. Applications of artificial neural-networks for energy systems. Appl. Energy 2000, 67, 17–35.
Marino, D.L.; Amarasinghe, K.; Manic, M. Building energy load forecasting using deep neural networks. In Proceedings of the IECON 2016-42nd Annual Conference of the IEEE Industrial Electronics Society, Florence, Italy, 23–26 October 2016; pp. 7046–7051.
Kämpf, J.H.; Robinson, D. Optimisation of building form for solar energy utilisation using constrained evolutionary algorithms. Energy Build. 2010, 42, 807–814.
Fong, K.F.; Hanby, V.I.; Chow, T.T. System optimization for HVAC energy management using the robust evolutionary algorithm. Appl. Therm. Eng. 2009, 29, 2327–2334.
Fu, Q.; Han, Z.; Chen, J.; Lu, Y.; Wu, H.; Wang, Y. Applications of reinforcement learning for building energy efficiency control: A review. J. Build. Eng. 2022, 50, 104165.
Lazaridis, C.R.; Michailidis, I.; Karatzinis, G.; Michailidis, P.; Kosmatopoulos, E. Evaluating reinforcement learning algorithms in residential energy saving and comfort management. Energies 2024, 17, 581.
Oldewurtel, F.; Jones, C.N.; Parisio, A.; Morari, M. Stochastic model predictive control for building climate control. IEEE Trans. Control Syst. Technol. 2013, 22, 1198–1205.
Ekici, B.B.; Aksoy, U.T. Prediction of building energy consumption by using artificial neural networks. Adv. Eng. Softw. 2009, 40, 356–362.
Karatasou, S.; Santamouris, M.; Geros, V. Modeling and predicting building’s energy use with artificial neural networks: Methods and results. Energy Build. 2006, 38, 949–958.
Shen, R.; Zhong, S.; Wen, X.; An, Q.; Zheng, R.; Li, Y.; Zhao, J. Multi-agent deep reinforcement learning optimization framework for building energy system with renewable energy. Appl. Energy 2022, 312, 118724.
Liu, T.; Tan, Z.; Xu, C.; Chen, H.; Li, Z. Study on deep reinforcement learning techniques for building energy consumption forecasting. Energy Build. 2020, 208, 109675.
Kolokotsa, D. Artificial intelligence in buildings: A review of the application of fuzzy logic. Adv. Build. Energy Res. 2007, 1, 29–54.
Ghadi, Y.Y.; Rasul, M.; Khan, M.M.K. Potential of saving energy using advanced fuzzy logic controllers in smart buildings in subtropical climates in Australia. Energy Procedia 2014, 61, 290–293.
Kang, C.S.; Hyun, C.H.; Park, M. Fuzzy logic-based advanced on–off control for thermal comfort in residential buildings. Appl. Energy 2015, 155, 270–283.
Ooka, R.; Komamura, K. Optimal design method for building energy systems using genetic algorithms. Build. Environ. 2009, 44, 1538–1544.
Wortmann, T.; Waibel, C.; Nannicini, G.; Evins, R.; Schroepfer, T.; Carmeliet, J. Are genetic algorithms really the best choice for building energy optimization? In Proceedings of the Symposium on Simulation for Architecture and Urban Design, Toronto, ON, Canada, 22–24 May 2017; pp. 1–8.
Sturzenegger, D.; Gyalistras, D.; Morari, M.; Smith, R.S. Model predictive climate control of a Swiss office building: Implementation, results, and cost–benefit analysis. IEEE Trans. Control Syst. Technol. 2015, 24, 1–12.
Razmara, M.; Maasoumy, M.; Shahbakhti, M.; Robinett, R., III. Optimal exergy control of building HVAC system. Appl. Energy 2015, 156, 555–565.
Kwak, Y.; Huh, J.H.; Jang, C. Development of a model predictive control framework through real-time building energy management system data. Appl. Energy 2015, 155, 1–13.
Goyal, S.; Barooah, P.; Middelkoop, T. Experimental study of occupancy-based control of HVAC zones. Appl. Energy 2015, 140, 75–84.
De Coninck, R.; Helsen, L. Practical implementation and evaluation of model predictive control for an office building in Brussels. Energy Build. 2016, 111, 290–298.
Chen, X.; Wang, Q.; Srebric, J. Occupant feedback based model predictive control for thermal comfort and energy optimization: A chamber experimental evaluation. Appl. Energy 2016, 164, 341–351.
Vrettos, E.; Kara, E.C.; MacDonald, J.; Andersson, G.; Callaway, D.S. Experimental demonstration of frequency regulation by commercial buildings—Part I: Modeling and hierarchical control design. IEEE Trans. Smart Grid 2016, 9, 3213–3223.
Aftab, M.; Chen, C.; Chau, C.K.; Rahwan, T. Automatic HVAC control with real-time occupancy recognition and simulation-guided model predictive control in low-cost embedded system. arXiv 2017, arXiv:1708.05208.
Hilliard, T.; Swan, L.; Qin, Z. Experimental implementation of whole building MPC with zone based thermal comfort adjustments. Build. Environ. 2017, 125, 326–338.
Joe, J.; Karava, P.; Hou, X.; Xiao, Y.; Hu, J. A distributed approach to model-predictive control of radiant comfort delivery systems in office spaces with localized thermal environments. Energy Build. 2018, 175, 173–188.
Rawlings, J.B.; Patel, N.R.; Risbeck, M.J.; Maravelias, C.T.; Wenzel, M.J.; Turney, R.D. Economic MPC and real-time decision making with application to large-scale HVAC energy systems. Comput. Chem. Eng. 2018, 114, 89–98.
Smarra, F.; Jain, A.; De Rubeis, T.; Ambrosini, D.; D’Innocenzo, A.; Mangharam, R. Data-driven model predictive control using random forests for building energy optimization and climate control. Appl. Energy 2018, 226, 1252–1272.
Finck, C.; Li, R.; Zeiler, W. Economic model predictive control for demand flexibility of a residential building. Energy 2019, 176, 365–379.
Yang, S.; Wan, M.P.; Ng, B.F.; Dubey, S.; Henze, G.P.; Chen, W.; Baskaran, K. Experimental study of model predictive control for an air-conditioning system with dedicated outdoor air system. Appl. Energy 2020, 257, 113920.
Finck, C.; Li, R.; Zeiler, W. Optimal control of demand flexibility under real-time pricing for heating systems in buildings: A real-life demonstration. Appl. Energy 2020, 263, 114671. [Google Scholar] [CrossRef]
Carli, R.; Cavone, G.; Ben Othman, S.; Dotoli, M. IoT based architecture for model predictive control of HVAC systems in smart buildings. Sensors 2020, 20, 781. [Google Scholar] [CrossRef] [PubMed]
Drgoňa, J.; Picard, D.; Helsen, L. Cloud-based implementation of white-box model predictive control for a GEOTABS office building: A field test demonstration. J. Process Control 2020, 88, 63–77. [Google Scholar] [CrossRef]
Yang, S.; Wan, M.P.; Chen, W.; Ng, B.F.; Dubey, S. Model predictive control with adaptive machine-learning-based model for building energy efficiency and comfort optimization. Appl. Energy 2020, 271, 115147. [Google Scholar] [CrossRef]
Freund, S.; Schmitz, G. Implementation of model predictive control in a large-sized, low-energy office building. Build. Environ. 2021, 197, 107830. [Google Scholar] [CrossRef]
Wu, Y.; Mäki, A.; Jokisalo, J.; Kosonen, R.; Kilpeläinen, S.; Salo, S.; Liu, H.; Li, B. Demand response of district heating using model predictive control to prevent the draught risk of cold window in an office building. J. Build. Eng. 2021, 33, 101855. [Google Scholar] [CrossRef]
Clausen, A.; Arendt, K.; Johansen, A.; Sangogboye, F.C.; Kjærgaard, M.B.; Veje, C.T.; Jørgensen, B.N. A digital twin framework for improving energy efficiency and occupant comfort in public and commercial buildings. Energy Inform. 2021, 4, 40. [Google Scholar] [CrossRef]
Blum, D.; Wang, Z.; Weyandt, C.; Kim, D.; Wetter, M.; Hong, T.; Piette, M.A. Field demonstration and implementation analysis of model predictive control in an office HVAC system. Appl. Energy 2022, 318, 119104.
Bünning, F.; Huber, B.; Schalbetter, A.; Aboudonia, A.; de Badyn, M.H.; Heer, P.; Smith, R.S.; Lygeros, J. Physics-informed linear regression is competitive with two Machine Learning methods in residential building MPC. Appl. Energy 2022, 310, 118491.
Lefebure, N.; Khosravi, M.; de Badyn, M.H.; Bünning, F.; Lygeros, J.; Jones, C.; Smith, R.S. Distributed model predictive control of buildings and energy hubs. Energy Build. 2022, 259, 111806.
Zhang, K.; Prakash, A.; Paul, L.; Blum, D.; Alstone, P.; Zoellick, J.; Brown, R.; Pritoni, M. Model predictive control for demand flexibility: Real-world operation of a commercial building with photovoltaic and battery systems. Adv. Appl. Energy 2022, 7, 100099.
Zhan, S.; Dong, B.; Chong, A. Improving energy flexibility and PV self-consumption for a tropical net zero energy office building. Energy Build. 2023, 278, 112606.
Yue, B.; Su, B.; Xiao, F.; Li, A.; Li, K.; Li, S.; Yan, R.; Lian, Q.; Li, A.; Li, Y.; et al. Energy-oriented control retrofit for existing HVAC system adopting data-driven MPC–Methodology, implementation and field test. Energy Build. 2023, 295, 113286.
Wang, D.; Chen, Y.; Wang, W.; Gao, C.; Wang, Z. Field test of Model Predictive Control in residential buildings for utility cost savings. Energy Build. 2023, 288, 113026.
Yin, M.; Cai, H.; Gattiglio, A.; Khayatian, F.; Smith, R.S.; Heer, P. Data-driven predictive control for demand side management: Theoretical and experimental results. Appl. Energy 2024, 353, 122101.
Taheri, S.; Amiri, A.J.; Razban, A. Real-world implementation of a cloud-based MPC for HVAC control in educational buildings. Energy Convers. Manag. 2024, 305, 118270.
Wei, Z.; Calautit, J.K. Field experiment testing of a low-cost model predictive controller (MPC) for building heating systems and analysis of phase change material (PCM) integration. Appl. Energy 2024, 360, 122750.
Pergantis, E.N.; Priyadarshan; Al Theeb, N.; Dhillon, P.; Ore, J.P.; Ziviani, D.; Groll, E.A.; Kircher, K.J. Field demonstration of predictive heating control for an all-electric house in a cold climate. Appl. Energy 2024, 360, 122820.
Klanatsky, P.; Veynandt, F.; Heschl, C.; Stelzer, R.; Zogas, P.; Siokas, G.; Balomenos, A. Real long-term performance evaluation of an improved office building operation involving a Data-driven model predictive control. Energy Build. 2025, 338, 115590.
Gao, J.; Lv, Y.; Feng, L.; Sui, J.; Jin, H. Model predictive control incorporating data correction for LHTES power controlling: Deployment and case study in data center. Appl. Energy 2025, 401, 126660.
Peng, Y.; Nagy, Z.; Schlüter, A. Temperature-preference learning with neural networks for occupant-centric building indoor climate controls. Build. Environ. 2019, 154, 296–308.
Sadeghian Broujeny, R.; Madani, K.; Chebira, A.; Amarger, V.; Hurtard, L. Data-driven living spaces’ heating dynamics modeling in smart buildings using machine learning-based identification. Sensors 2020, 20, 1071.
Mari, S.; Bucci, G.; Ciancetta, F.; Fiorucci, E.; Fioravanti, A. An embedded deep learning NILM system: A year-long field study in real houses. IEEE Trans. Instrum. Meas. 2023, 72, 2531215.
Khan, I.; Cicirelli, F.; Greco, E.; Guerrieri, A.; Mastroianni, C.; Scarcello, L.; Spezzano, G.; Vinci, A. Leveraging distributed AI for multi-occupancy prediction in Cognitive Buildings. Internet Things 2024, 26, 101181.
Bae, K.W.; Choi, E.J.; Choi, Y.J.; Yun, J.Y.; Yun, G.Y.; Moon, H.J.; Moon, J.W. Real-time ventilation control for indoor CO₂ management using occupant information. Build. Environ. 2025, 285, 113568.
Simões, J.C.; da Graça, G.C. Experimental validation of neural network-based prediction of natural ventilation bulk airflow rate. Energy Build. 2025, 342, 115871.
De Somer, O.; Soares, A.; Vanthournout, K.; Spiessens, F.; Kuijpers, T.; Vossen, K. Using reinforcement learning for demand response of domestic hot water buffers: A real-life demonstration. In Proceedings of the 2017 IEEE PES Innovative Smart Grid Technologies Conference Europe (ISGT-Europe), Torino, Italy, 26–29 September 2017; pp. 1–7.
Touzani, S.; Prakash, A.K.; Wang, Z.; Agarwal, S.; Pritoni, M.; Kiran, M.; Brown, R.; Granderson, J. Controlling distributed energy resources via deep reinforcement learning for load flexibility and energy efficiency. Appl. Energy 2021, 304, 117733.
Lei, Y.; Zhan, S.; Ono, E.; Peng, Y.; Zhang, Z.; Hasama, T.; Chong, A. A practical deep reinforcement learning framework for multivariate occupant-centric control in buildings. Appl. Energy 2022, 324, 119742.
Du, Y.; Li, F.; Kurte, K.; Munk, J.; Zandi, H. Demonstration of intelligent HVAC load management with deep reinforcement learning: Real-world experience of machine learning in demand control. IEEE Power Energy Mag. 2022, 20, 42–53.
Naug, A.; Quinones-Grueiro, M.; Biswas, G. Deep reinforcement learning control for non-stationary building energy management. Energy Build. 2022, 277, 112584.
Silvestri, A.; Coraci, D.; Brandi, S.; Capozzoli, A.; Borkowski, E.; Köhler, J.; Wu, D.; Zeilinger, M.N.; Schlueter, A. Real building implementation of a deep reinforcement learning controller to enhance energy efficiency and indoor temperature control. Appl. Energy 2024, 368, 123447.
Heidari, A.; Girardin, L.; Dorsaz, C.; Maréchal, F. A trustworthy reinforcement learning framework for autonomous control of a large-scale complex heating system: Simulation and field implementation. Appl. Energy 2025, 378, 124815.
Silvestri, A.; Coraci, D.; Brandi, S.; Capozzoli, A.; Schlueter, A. Practical deployment of reinforcement learning for building controls using an imitation learning approach. Energy Build. 2025, 335, 115511.
Jazizadeh, F.; Ghahramani, A.; Becerik-Gerber, B.; Kichkaylo, T.; Orosz, M. User-led decentralized thermal comfort driven HVAC operations for improved efficiency in office buildings. Energy Build. 2014, 70, 398–410.
Ulpiani, G.; Borgognoni, M.; Romagnoli, A.; Di Perna, C. Comparing the performance of on/off, PID and fuzzy controllers applied to the heating system of an energy-efficient building. Energy Build. 2016, 116, 114287.
Li, W.; Zhang, J.; Zhao, T. Indoor thermal environment optimal control for thermal comfort and energy saving based on online monitoring of thermal sensation. Energy Build. 2019, 197, 57–67.
Chojecki, A.; Rodak, M.; Ambroziak, A.; Borkowski, P. Energy management system for residential buildings based on fuzzy logic: Design and implementation in smart-meter. IET Smart Grid 2020, 3, 254–266.
Lahlouh, I.; Rerhrhaye, F.; Elakkary, A.; Sefiani, N. Experimental implementation of a new multi input multi output fuzzy-PID controller in a poultry house system. Heliyon 2020, 6, e04645.
Sadeghian Broujeny, R.; Madani, K.; Chebira, A.; Amarger, V.; Hurtard, L. A heating controller designing based on living space heating dynamic’s model approach in a smart building. Energies 2021, 14, 998.
Chojecki, A.; Ambroziak, A.; Borkowski, P. Fuzzy controllers instead of classical PIDs in HVAC equipment: Dusting off a well-known technology and Today’s implementation for better energy efficiency and user comfort. Energies 2023, 16, 2967.
Habib, M.; Bollin, E.; Wang, Q. Battery energy management system using edge-driven fuzzy logic. Energies 2023, 16, 3539.
Kazmi, H.; D’Oca, S.; Delmastro, C.; Lodeweyckx, S.; Corgnati, S.P. Generalizable occupant-driven optimization model for domestic hot water production in NZEB. Appl. Energy 2016, 175, 1–15.
Javed, A.; Larijani, H.; Ahmadinia, A.; Emmanuel, R.; Mannion, M.; Gibson, D. Design and implementation of a cloud enabled random neural network-based decentralized smart controller with intelligent sensor nodes for HVAC. IEEE Internet Things J. 2016, 4, 393–403.
Peng, Y.; Rysanek, A.; Nagy, Z.; Schlüter, A. Using machine learning techniques for occupancy-prediction-based cooling control in office buildings. Appl. Energy 2018, 211, 1343–1358.
González-Briones, A.; Prieto, J.; De La Prieta, F.; Herrera-Viedma, E.; Corchado, J.M. Energy optimization using a case-based reasoning strategy. Sensors 2018, 18, 865.
Png, E.; Srinivasan, S.; Bekiroglu, K.; Chaoyang, J.; Su, R.; Poolla, K. An internet of things upgrade for smart and scalable heating, ventilation and air-conditioning control in commercial buildings. Appl. Energy 2019, 239, 408–424.
Zhang, Z.; Chong, A.; Pan, Y.; Zhang, C.; Lam, K.P. Whole building energy model for HVAC optimal control: A practical framework based on deep reinforcement learning. Energy Build. 2019, 199, 472–490.
Baniasadi, A.; Habibi, D.; Al-Saedi, W.; Masoum, M.A.; Das, C.K.; Mousavi, N. Optimal sizing design and operation of electrical and thermal energy storage systems in smart buildings. J. Energy Storage 2020, 28, 101186.
Rochd, A.; Benazzouz, A.; Ait Abdelmoula, I.; Raihani, A.; Ghennioui, A.; Naimi, Z.; Ikken, B. Design and implementation of an AI-based & IoT-enabled Home Energy Management System: A case study in Benguerir—Morocco. Energy Rep. 2021, 7, 699–719.
Yang, S.; Wan, M.P.; Chen, W.; Ng, B.F.; Dubey, S. Experiment study of machine-learning-based approximate model predictive control for energy-efficient building control. Appl. Energy 2021, 288, 116648.
Laouali, I.; Gomes, I.; Ruano, M.d.G.; Bennani, S.D.; Fadili, H.E.; Ruano, A. Energy disaggregation using multi-objective genetic algorithm designed neural networks. Energies 2022, 15, 9073.
Massana, J.; Burgas, L.; Herraiz, S.; Colomer, J.; Pous, C. Multi-vector energy management system including scheduling electrolyser, electric vehicle charging station and other assets in a real scenario. J. Clean. Prod. 2022, 380, 134996.
Ruddick, J.; Ceusters, G.; Van Kriekinge, G.; Genov, E.; De Cauwer, C.; Coosemans, T.; Messagie, M. Real-world validation of safe reinforcement learning, model predictive control and decision tree-based home energy management systems. Energy AI 2024, 18, 100448.
Zouloumis, L.; Ploskas, N.; Taousanidis, N.; Panaras, G. Smart Thermostat Development and Validation on an Environmental Chamber Using Surrogate Modelling. Energies 2025, 18, 3433.
Zheng, R.; Lei, L. A hybrid model for real-time cooling load prediction and terminal control optimization in multi-zone buildings. J. Build. Eng. 2025, 104, 112120.
Dai, Y.; Jiang, Z.; Shen, Q.; Chen, P.; Wang, S.; Jiang, Y. A decentralized algorithm for optimal distribution in HVAC systems. Build. Environ. 2016, 95, 21–31.
Michailidis, I.T.; Schild, T.; Sangi, R.; Michailidis, P.; Korkas, C.; Fütterer, J.; Müller, D.; Kosmatopoulos, E.B. Energy-efficient HVAC management using cooperative, self-trained, control agents: A real-life German building case study. Appl. Energy 2018, 211, 113–125.
Michailidis, P.; Pelitaris, P.; Korkas, C.; Michailidis, I.; Baldi, S.; Kosmatopoulos, E. Enabling optimal energy management with minimal IoT requirements: A legacy A/C case study. Energies 2021, 14, 7910.
Kong, D.; Hong, Y.; Yang, Y.; Gu, T.; Fu, Y.; Ye, Y.; Xi, W.; Zhang, Z. A parametric, control-integrated and machine learning-enhanced modeling method of demand-side HVAC systems in industrial buildings: A practical validation study. Appl. Energy 2025, 379, 124971.

Figure 1. Geographical distribution of real-world experiments for BEMS.

Figure 2. Left: Occurrences of ML-based control applications in real-world BEMS; Right: Real-world ML-based applications for BEMS per year.

Figure 3. Paper structure.

Figure 4. Methodology in a PRISMA type diagram.

Figure 5. Reasons for controlling energy systems in the building environment.

Figure 6. Step-by-step control cycle of a modern building energy management system.

Figure 7. Left: Occurrence of ML types in real-world BEMS applications; Right: Percentage (%) of ML types in real-world BEMS applications.

Figure 8. Left: Occurrence of MPC types in real-world BEMS applications; Right: Occurrence of MPC solvers in real-world BEMS applications.

Figure 9. Left: Occurrence of ANN types in real-world BEMS applications; Center: Occurrence of RL types in real-world BEMS applications; Right: Occurrence of FLC types in real-world BEMS applications.

Figure 10. Left: Occurrence of hybrid schemes in real-world BEMS applications; Right: Occurrence of hybrid counterparts in real-world BEMS applications.

Figure 11. Left: Percentage (%) of single-agent vs. multi-agent real-world applications for BEMS; Right: Occurrence of multi-agent strategies (cooperative, hierarchical, fully decentralized) in real-world applications for BEMS.

Figure 12. Data cluster occurrence in real-world applications for BEMS.

Figure 13. Left: Occurrence of baseline control methodologies in real-world applications for BEMS; Right: Percentage (%) of baseline control methodologies in real-world applications for BEMS.

Figure 14. Left: Performance metrics occurrences in real-world implementations for BEMS; Right: Performance metrics percentage (%) in real-world implementations for BEMS.

Figure 15. Left: Occurrence of implementation types in real-world BEMS applications; Right: Percentage (%) of implementation types in real-world BEMS applications.

Figure 16. Left: Occurrence of energy system types in real-world BEMS applications; Right: Percentage (%) of single-energy system vs. multi-energy system studies in real-world BEMS.

Figure 17. Left: Occurrences of building types in real-world BEMS applications; Right: Percentage (%) of operational vs. experimental building testbeds in real-world BEMS applications.

Figure 18. Left: Occurrence of single- vs. multi-zone buildings in real-world BEMS applications; Right: Percentage (%) of single- vs. multi-zone buildings in real-world BEMS applications.

Figure 19. Left: Occurrences of real-world applications per country; Right: Percentage (%) of real-world applications per continent.

Figure 20. Left: Climate distribution of real-world BEMS applications; Right: Climate-type distribution across real implementations.

Figure 21. Left: Number of real-world applications considering the experimental duration; Right: Percentage (%) of real-world applications considering the experimental duration.

Table 1. Key attributes of MPC applications for real-life BEMS.

Ref.	Year	Type (Solver)	Agent	FH/TS	Baseline	Equipment	Building	Zones	Location	Period
[96]	2015	Classical (SLP)	Single	58 h/15 m	RBC	HVAC/Blinds	Office	20	Allschwil (CH)	Mixed (24 W)
[97]	2015	Classical (NLP/QP)	Single	24 h/1 h	RBC	HVAC/GHP	Office	1	Houghton (US)	Winter (1 D)
[98]	2015	Classical (N/A)	Single	15 m/15 m	RBC/PID	HVAC	Office	-	Seoul (KR)	Summer (1 D)
[99]	2015	Economic (NLP)	Single	30 m/10 m	RBC	HVAC	Academic	1	Florida (US)	Summer (3 D)
[100]	2016	Classical (NLP)	Single	24 h/5 m	RBC-PI	HVAC	Office	1	Brussels (BE)	Winter (14 D)
[101]	2016	Data-driven (NLP)	Single	2 h/5 m	PI	HVAC	Office	1	Pennsylvania (US)	Summer (36 h)
[102]	2016	Robust (NLP)	Single	24 h/15 m	Fixed	HVAC	Offices/Lab	1	Berkeley (US)	Summer (10 D)
[103]	2017	Simulative (N/A)	Single	24 h/10 m	Fixed	HVAC	Mosque	1	Abu Dhabi (AE)	Summer (7 D)
[104]	2017	Data-driven (GA)	Single	2 h/15 m	RBC	HVAC/DHW	Academic	8	Halifax (CA)	Mixed (118 D)
[105]	2018	Classical (QP)	Multi	24 h/30 m	PI	HVAC	Office	4	West Lafayette (US)	Summer (11 D)
[106]	2018	Economic (MILP)	Single	1 w/15 m	Manual	HVAC/TSS	Academic	500	Stanford (US)	Mixed (3.5 M)
[107]	2018	Data-driven (MIQP)	Single	40 m/10 m	RBC	HVAC	Resident	4	L’Aquila (IT)	Spring (15 D)
[108]	2019	Economic (DP)	Single	22 h/60 m	PI	HVAC/RES	Resident	1	Utrecht (NL)	Winter (14 D)
[109]	2020	Classical (QP)	Single	1 h/2 m	PID	HVAC	Academic	1	Singapore (SG)	Summer (7 D)
[110]	2020	Economic (NLP)	Single	24 h/60 m	RBC	HVAC/RES/DHW/TSS	Resident	1	Amstelveen (NL)	Winter (7 D)
[111]	2020	Classical (QP)	Single	4 h/2 m	Fixed	HVAC	Lab	1	Bari (IT)	Summer (4 M)
[112]	2020	Classical (QP)	Single	24 h/15 m	RBC-PI	HVAC/GSHP	Office	12	Hasselt (BE)	Mixed (79 D)
[113]	2020	Data-driven (SQP)	Single	1 h/5 m	PID	HVAC	Academic	1	Singapore (SG)	Summer (7 D)
[114]	2021	Classical (NLP)	Single	48 h/30 m	RBC-PI	HVAC/GHP/DH	Office	7	Hamburg (DE)	Winter (63 D)
[115]	2021	Economic (GA)	Single	12 h/1 h	Static	HVAC	Offices	2	Espoo (FI)	Winter (15 D)
[116]	2021	Classical (GA)	Single	N/A	RBC	HVAC	Academic	1	Odense (DK)	Winter (1 D)
[117]	2022	Classical (NLP)	Single	24 h/10 m	PI	HVAC	Office	4	California (US)	Autumn (31 D)
[118]	2022	Data-driven (QP)	Single	24 h/10 m	RBC	HVAC	Resident/Lab	2	Dübendorf (CH)	Mixed (120 D)
[119]	2022	Data-driven (MIQP)	Multi	12 h/30 m	MPC	GSP/TSS	Mixed/Lab	7	Dübendorf (CH)	Winter (1 D)
[120]	2022	Economic (NLP)	Single	24 h/5 m	Fixed/RF	HVAC/RES/ESS	Store	4	Blue Lake (US)	All Seasons (1 Y)
[121]	2023	Data-driven (QP)	Single	1 h/15 m	RBC	HVAC/RES	Office	6	Singapore (SG)	Summer (7 W)
[122]	2023	Discrete (QP)	Single	30 m/10 m	Fixed	HVACs	Airport	6	Zhejiang (CN)	Summer (15 D)
[123]	2023	Classical (NLP)	Single	2 h/10 m	RBC	HVAC	Resident	2	Shenzhen (CN)	Autumn (12 D)
[124]	2024	Stochastic (QP)	Single	3.5 h/15 m	RBC	HVAC/DHW/ESS	Resident/Lab	2	Dübendorf (CH)	Winter (5 D)
[125]	2024	Classical (QP)	Single	12 h/1 h	PI	HVAC/TSS	Hall	2	Indianapolis (US)	Winter (3 M)
[126]	2024	Economic (MILP)	Single	24 h/5 m	RBC	HVAC/TSS	Lab	1	Nottingham (GB)	Winter (8 D)
[127]	2024	Data-driven (QP)	Single	24 h/5 m	PID	HVAC	Resident	1	West Lafayette (US)	Winter (33 D)
[128]	2025	Data-driven (MILP)	Single	24 h/15 m	RBC	HVAC/Blinds	Office/Lab	9	Pinkafeld (AT)	All Seasons (1 Y)
[129]	2025	Classical (PSO)	Single	15 m/1 m	PID	HVAC	Data Center	1	Dongguan (CN)	Spring (1 D)

Table 2. Key attributes of ANN applications for real-life BEMS.

Ref.	Year	Methods	Agent	FH/TS	Baseline	Equipment	Building	Zones	Location	Period
[130]	2019	MLP	Single	-/-	Fixed	HVAC	Office/Lab	1	Singapore (SG)	Summer (5 M)
[131]	2020	NARX (MLP)	Single	18 h/1 m	ANN	HVAC	Office/Lab	1	Paris (FR)	Winter (1 W)
[132]	2023	CNN	Single	80 m/8 s	ANN	Appliances	Residents	1	L’Aquila (IT)	All Seasons (1 Y)
[133]	2024	LSTM	Multi	10 m/5 m	ANN	HVAC/Lights	Office/Lab	10	Rende (IT)	Autumn (24 D)
[134]	2025	LSTM	Single	5 m/5 m	RBC	HVAC	Office/Lab	1	Seoul (KR)	Spring (1 W)
[135]	2025	MLP	Single	-/15 m	LR	HVAC	Library/Academic	1	Lisbon (PT)	Spring (3.5 M)

Table 3. Key attributes of RL applications for real-life BEMS.

Ref.	Year	Method	Agent	FH/TS	Baseline	Equipment	Building	Zones	Location	Period
[136]	2017	FQI	Single	24 h/1 h	RBC	RES/DHW/GHP	Resident	-	Kortrijk (BE)	Winter (4 M)
[137]	2021	DDPG	Single	2–3 h/15 m	RBC	HVAC/RES/ESS	Office/Lab	1	Berkeley (US)	Summer (9 D)
[138]	2022	BDQ	Single	-/15–30 m	Fixed	HVAC	Office/Academic	1	Queenstown (SG)	Summer (10 D)
[139]	2022	DQN	Single	1 h/6 h	RBC/Fixed	HVAC	Resident/Lab	2	Knoxville (US)	Spring (11 D)
[140]	2022	PPO	Single	1 h/48 h	RBC/RL	HVAC	Mixed-use	5	Nashville (US)	All Seasons (1 Y)
[141]	2024	SAC	Single	24 h/5 m	RBC	HVAC	Office/Lab	1	Dübendorf (CH)	Summer (52 D)
[142]	2025	PPO	Single	-/5 m	RBC	HVAC	Academic/Hall	28	Sion (CH)	Winter (7 D)
[143]	2025	PPO	Single	-/5 m	RBC	HVAC	Office/Lab	1	Dübendorf (CH)	Winter (14 D)

Table 4. Key attributes of FLC applications for real-life BEMS.

Ref.	Year	Methods	Agent	FH/TS	Baseline	Equipment	Building	Zones	Location	Period
[144]	2015	Mamdani	Multi	-/30 m	Fixed	HVAC	Office/Academic	4	L.A. (US)	Summer (2 W)
[145]	2016	Mamdani	Single	-/-	RBC/PID	HVAC	Resident/Lab	1	Agugliano (IT)	Winter (3 W)
[146]	2019	N/A	Single	-/30 m	Fixed	HVAC	Office/Lab	1	Dalian (CN)	Summer (10 D)
[147]	2020	Mamdani	Single	24 h/1 m	Fixed	HVAC/RES/ESS/Other	Resident/Lab	1	Lodz (PL)	Mixed (10 D)
[148]	2020	Mamdani	Single	-	RBC/FLC	HVAC	Coop	1	Rabat (MA)	Summer (1 W)
[149]	2021	Takagi–Sugeno	Single	-/1 m	RBC	HVAC	Academic	4	Paris (FR)	Winter (2 D)
[150]	2023	Mamdani	Single	24 h/1 m	PID	HVAC	Resident/Lab	1	Lodz (PL)	Mixed (3 D)
[151]	2023	Sugeno	Single	-/1 m	RBC	RES/ESS	Resident/Lab	1	Offenburg (DE)	Spring (2 D)

Table 5. Key attributes of hybrid applications for real-life BEMS.

Ref.	Year	Methods	Agent	FH/TS	Baseline	Equipment	Building	Zones	Location	Period
[152]	2016	RL/ACO	Single	48 h/5 m	RBC	HVAC/RES/DHW	Residents	1	Amersfoort (NL)	Winter (3.5 M)
[153]	2016	PSO/ANN	Multi	N/A	RBC	HVACs	Office/Lab	1	Glasgow (GB)	Winter (10 D)
[154]	2018	KNN/RBC	Single	24 h/1 m	Fixed	HVAC	Office	11	Singapore (SG)	Summer (45 D)
[155]	2018	CBR/ANN	Multi	7 d/30 m	Fixed	HVACs	Offices	7	Salamanca (ES)	Winter (9 W)
[156]	2019	MPC/ANN	Multi	4 h/15 m	Fixed	HVAC	Academic	85	Singapore (SG)	Winter (63 D)
[157]	2019	RL/GA	Single	-/15 m	RBC	HVAC	Office/Lab	-	Pittsburgh (US)	Spring (78 D)
[158]	2020	MPC/PSO	Single	24 h/5 m	Fixed	HP/RES/BSS/TSS	Resident/Lab	1	Perth (AU)	Summer (2 D)
[159]	2021	PSO/RBC	Single	24 h/10 m	Fixed	HVAC/RES/ESS/DHW	Resident/Lab	1	Benguerir (MA)	Summer (14 D)
[160]	2021	MPC/ANN	Single	1 h/5 m	Fixed/RBC	HVAC	Office/Hall	1	Singapore (SG)	Summer (10 D)
[161]	2022	GA/ANN	Single	30 d/1 m	SVM/ANN	HVAC/RES/ESS	Resident	20	Faro (PT)	Summer (55 D)
[162]	2022	GA/RF	Single	24 h/1 h	Fixed	HVAC/RES/ESS/EVCS	Office/Academic	1	Huesca (ES)	Spring (14 D)
[163]	2024	RL/MPC	Single	-/15 m	RBC	RES/ESS/EVCS	Resident/Lab	1	Brussels (BE)	Spring (48 D)
[164]	2025	SRG/RBC	Single	-/1 m	Fixed	HVAC	Chamber/Lab	1	Kozani (GR)	Winter (3 D)
[165]	2025	SSA/ANN	Single	30 s/3 m	PID/ANN	HVAC	Office	9	Hangzhou (CN)	Summer (5 D)

Table 6. Key attributes of other type applications for real-life BEMS.

Ref.	Year	Type	Agent	FH/TS	Baseline	Equipment	Building	Zones	Location	Period
[166]	2016	Decentralized	Multi	-/10 m	RBC/Exhaustive	HVAC	Lab	-	Beijing (CN)	Summer (1 D)
[167]	2018	CAO-based	Multi	3 h/15 m	PID	HVAC	Office/Academic	3	Aachen (DE)	Autumn (5 D)
[168]	2021	CAO-based	Single	3 h/15 m	RBC	HVAC	Resident	1	Xanthi (GR)	Mixed (15 D)

Table 7. Cross-method summarization of dominant attributes in real-world ML-based BEMS (2015–2025).

Method	# Apps	Dominant Types	# Multi-Agent Apps	Dominant FH	Dominant TS	Dominant Baselines	Dominant Equipment	Dominant Building Types	Zone Margins	Dominant Continents	Period Margins
MPC	35	Classical, Data-driven, Economic	2	24∼48 h	5∼30 m	RBC, PID, Fixed	HVAC, TSS, RES	Office, Academic, Resident	1∼20	EU, Asia, US	1 D∼1 Y
ANN	6	MLP, LSTM	1	5∼80 m	5∼15 m	ANN	HVAC	Office/Lab, Resident	1	EU, Asia	3.5 M∼1 Y
RL	8	PPO	0	5∼60 m	5∼60 m	RBC	HVAC	Office/Lab	1∼5	EU, US	9 D∼1 Y
FLC	8	Mamdani	1	N/A	1∼30 m	RBC, PID	HVAC	Resident/Lab	1∼4	EU	2 D∼3 W
Hybrid	14	MPC/GA, ANN/GA	3	24 h	1∼15 m	RBC, Fixed	HVAC, RES, ESS	Office/Lab, Resident/Lab	1∼10+	EU, Asia	2 D∼9 W
Other	3	N/A	1	3 h	10∼15 m	RBC, PID	HVAC	N/A	1∼3	EU, Asia	5 D∼15 D

Table 8. Data clusters and types utilized in real-world building energy management.

Clusters	Data Type
Temperature	Indoor air temperature, operative temperature, surface temperature, tank temperature, supply water temperature.
Environmental	CO₂ concentration, relative humidity, NH₃, IAQ index, PMV, PPD, air velocity.
Occupancy	Occupancy status, schedule, PIR/motion sensors, pressure mats, comfort feedback, clothing, metabolic rate.
HVAC Signals	Air/water flow rates, valve/damper/fan position, actuator status, fan speed, compressor/chiller power, control signals.
Energy	HVAC power, lighting/plug loads, boiler/chiller energy, total electrical use, equipment/pump power.
Weather Data	Outdoor temperature, humidity, solar irradiance, wind speed, weather forecasts.
RES/ESS/Grid	PV generation, battery SOC, grid import/export, TES tank, EV SOC, RES forecasts.
Economic	Electricity price, RTP/ToU tariff, virtual price, feed-in tariff.
Comfort	Comfort deviation, satisfaction index, skin temperature, heart rate, TCI.
Historical	Historical energy/weather data, simulated states, predicted demand, previous control signals.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
