Article

Machine Learning in Smart Buildings: A Review of Methods, Challenges, and Future Trends

1
LISTIC, Université Savoie Mont Blanc, 74944 Annecy, Cedex, France
2
IUT-NFC, National Council for Scientific Research (CNRS), Institut FEMTO-ST, Université Marie et Louis Pasteur, 90000 Belfort, France
3
DeepVu, Berkeley, CA 94704, USA
4
College of Engineering and Technology, American University of the Middle East, Egaila 54200, Kuwait
*
Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(14), 7682; https://doi.org/10.3390/app15147682
Submission received: 6 June 2025 / Revised: 29 June 2025 / Accepted: 29 June 2025 / Published: 9 July 2025

Abstract

Machine learning (ML) has emerged as a transformative force in smart building management due to its ability to significantly enhance energy efficiency and promote sustainability within the built environment. This review examines the pivotal role of ML in optimizing building operations through the application of predictive analytics and sophisticated automated control systems. It explores the diverse applications of ML techniques in critical areas such as energy forecasting, non-intrusive load monitoring (NILM), and predictive maintenance. A thorough analysis then identifies key challenges that impede widespread adoption, including issues related to data quality, privacy concerns, system integration complexities, and scalability limitations. Conversely, the review highlights promising emerging opportunities in advanced analytics, the seamless integration of renewable energy sources, and the convergence with the Internet of Things (IoT). Illustrative case studies underscore the tangible benefits of ML implementation, demonstrating substantial energy savings ranging from 15% to 40%. Future trends indicate a clear trajectory towards the development of highly autonomous building management systems and the widespread adoption of occupant-centric designs.

1. Introduction

Smart buildings represent integrated cyber–physical systems, designed to coordinate and control various essential services such as Heating, Ventilation, and Air Conditioning (HVAC), lighting, life-safety systems, and other ancillary services through advanced automation and supervisory mechanisms. These intelligent infrastructures are supported by interoperable communication protocols, which facilitate seamless cross-domain data exchange. This capability ensures that events detected within one subsystem can trigger compensatory actions in other interconnected systems. To achieve this, heterogeneous sensor streams are processed by real-time analytics engines that infer crucial information, such as occupancy states and environmental conditions. This data is then utilized to continuously update control policies online through adaptive learning algorithms. The ultimate objective of this architectural design is to achieve occupant-centric operation, thereby preserving thermal and visual comfort while simultaneously safeguarding health. Concurrently, these systems strive to minimize energy consumption and reduce associated greenhouse gas emissions [1]. The urgency of improving energy efficiency in buildings is underscored by the fact that the global building sector is a significant contributor to environmental impact, accounting for approximately 40% of total energy consumption and 36% of global CO2 emissions [2].
The integration of ML into building management systems marks a significant and transformative advancement aimed at addressing these pressing energy challenges. ML algorithms possess the inherent capability to process vast quantities of data originating from diverse building systems, including HVAC, lighting, and occupancy sensors. This processing enables them to identify intricate patterns, accurately predict future states, and optimize operational parameters in real time. This data-driven approach signifies a fundamental shift from static, rule-based control mechanisms to dynamic, adaptive management strategies that can intelligently respond to ever-changing conditions and occupant behaviors [3]. Unlike traditional building automation systems, which often struggle to adapt to unexpected changes or learn from historical patterns, ML-powered systems continuously learn and improve. This continuous learning capability provides superior predictive insights, allowing for the anticipation of equipment failures, the optimization of energy consumption based on real-time occupancy data, and the more effective integration of renewable energy sources [4].
ML applications in smart buildings span the entire lifecycle, from design optimization to operational management and retrofit planning. As illustrated in Figure 1, key areas where ML is impactful include the following:
  • Energy-efficient building design: ML models are employed to analyze various design elements, such as building geometry, orientation, and material properties. By predicting energy consumption patterns even before construction commences, these models enable the optimization of designs for minimal energy use [5]. Recent studies have shown that ML-based design optimization can lead to a reduction in predicted energy consumption by 15% to 25% when compared to conventional design approaches [6].
  • Building automation: ML dynamically optimizes control strategies for critical building systems, including HVAC, lighting, and equipment scheduling. This optimization is based on real-time sensor data, allowing for highly responsive adjustments. Research has shown that ML-enhanced automation can reduce energy consumption by 10% to 30% while simultaneously improving occupant comfort levels [1,7].
  • Fault detection and diagnosis (FDD): ML algorithms are instrumental in identifying anomalies that indicate equipment faults or degradation. Supervised learning models, for instance, can classify specific failure modes with accuracy rates often exceeding 90% [8]. Early fault detection facilitated by ML can significantly reduce equipment downtime, typically by 30% to 50%, and prevent energy waste resulting from malfunctioning systems [1].
  • Occupancy estimation and detection: Accurate data regarding building occupancy is crucial for implementing demand-responsive adjustments to environmental conditions. This allows for a reduction in energy consumption in unoccupied spaces while ensuring optimal comfort levels when occupants are present [9]. Studies suggest that occupancy-based control strategies can achieve energy savings of 20% to 40% compared to traditional fixed schedules [10].
  • Retrofit modeling: For existing buildings, ML models are utilized to predict the potential energy savings associated with various upgrade options. This capability helps building owners prioritize investments in initiatives that promise greater energy efficiency and a higher return on investment [1,11].
In contrast to previous reviews that focus on specific aspects, this review provides an integrated analysis across the spectrum of ML techniques, applications, and implementation challenges within the context of smart building energy management. The objective is to synthesize current knowledge, critically evaluate the effectiveness of various ML techniques for specific energy management tasks, identify key challenges that limit widespread adoption, propose practical solutions, and highlight emerging trends and future research directions. By analyzing ML approaches together with their strengths and limitations, and by offering a critical performance analysis alongside practical implementation guidelines, this review aims to serve both researchers striving to advance the field and practitioners working to implement ML solutions in real-world building environments.
The remainder of the paper is organized as follows. Section 2 establishes the foundational knowledge necessary for understanding ML applications in smart buildings. Section 3 describes our systematic review methodology. Section 4 provides a detailed analysis of ML techniques used in building energy management. Section 5 examines critical data management processes. Section 6 explores specific ML applications in energy prediction. Section 7 presents real-world implementations and comparative analyses. Section 8 identifies the current limitations and future opportunities. Section 9 discusses emerging trends. Section 10 provides practical implementation guidance. Section 11 outlines future research directions. Finally, Section 12 synthesizes key findings and implications.

2. Background and Preliminaries

This section provides an overview of intelligent energy management systems, discusses the stages of ML implementation in such systems, and explains prediction metrics and their role in optimizing energy use and minimizing waste.

2.1. Intelligent Energy Management Systems

Intelligent Energy Management Systems (IEMSs) represent a sophisticated integration of IoT sensing technologies, high-frequency metering, and advanced machine learning analytics as depicted in Figure 2. Their primary function is to minimize energy consumption while simultaneously ensuring optimal occupant comfort and maintaining grid stability. IEMSs operate by continuously integrating data streams from a variety of sources, including environmental and occupancy sensors, smart meters, and external weather services. This integrated data is then fed into a supervisory analytics engine. This engine is responsible for forecasting energy demand, detecting anomalous patterns in energy usage, and scheduling equipment operations in real time. Adaptive control signals are subsequently dispatched to local actuators, such as HVAC dampers, variable-speed drives, dimmable luminaires, and storage inverters, thereby closing the loop between prediction and action within mere seconds.
A transparent framework for automated energy management was presented in [12], which combined ML, expert knowledge, and semantic reasoning to enhance learning and foster trust. IEMSs can incorporate various renewable energy sources to optimize sustainable energy utilization while maintaining grid stability and ensuring energy availability. For example, the authors in [13] suggested combining energy storage options with solar, wind, and hydroelectric sources so that together they function as a single high-power energy source. As the concept of smart buildings develops, IEMSs are becoming increasingly interconnected with building automation systems, which enables buildings to operate more efficiently and control energy usage effectively while maintaining occupant comfort. To reduce environmental effects and achieve sustainability goals, building management must be approached holistically. GreenBuilding, an intelligent system that monitors and automatically adjusts the energy usage of appliances in a building, was deployed in [14] and showed significant energy savings. By permitting two-way communication between energy providers and users, smart grids have further expanded the capabilities of IEMSs, enabling demand response, dynamic pricing, and enhanced grid resilience. In [15], the authors described an IoT-enabled intelligent energy management system for buildings that improves the interactivity of building energy management systems.

2.2. Stages of ML Implementation in Building Energy Management Systems

As shown in Figure 3, the implementation of ML in energy management systems typically progresses through the following three distinct stages:
  • Conceptual stage: This initial phase focuses on evaluating the technical feasibility and strategic necessity of deploying ML technology to optimize building energy efficiency. It involves a critical process of identifying significant problems that ML can effectively resolve and subsequently selecting the most suitable data formats and algorithms. During this stage, model designers must possess interdisciplinary competence, combining expertise in computer science with a deep understanding of energy management principles to formulate effective energy management tactics. The conceptual stage is crucial for determining the appropriate quantity and nature of data required, the specific ML methodology to be adopted, and ultimately, whether the overall modeling initiative will genuinely enhance building energy efficiency [1,3].
  • Experimental stage: The experimental phase is dedicated to evaluating the effectiveness of ML technology in predicting outcomes and assessing the suitability of proposed algorithms for handling building energy data. This stage encompasses model development and meticulous data preparation, followed by post-collection data processing aimed at enhancing data quality and ensuring a sufficient volume of data for robust ML model construction. A notable challenge in this phase, despite the availability of data collection and processing methodologies, is the persistent lack of effective quantitative assessments for data adequacy, which often leads to variations in data amounts across different research works [1,3].
  • Application stage: The final application stage involves the seamless integration of a verified ML model into a building’s existing energy management system. The primary goal is to enhance decision-making processes and improve overall system performance. This integration can manifest in several ways: it might involve transitioning from a traditional physical model to an innovative ML model, which effectively replaces the current prediction model but necessitates additional demands for data collection and organization. Alternatively, novel management content and decision-making logic could be generated based on new data patterns discerned by the ML models, which would require the thorough optimization and reconfiguration of the current system. Contemporary research predominantly focuses on refining predictive models within established systems [1].

2.3. Significance of Prediction Metrics in Energy Efficiency

The accurate prediction of energy demand and consumption is paramount for optimizing energy use and minimizing waste. This accuracy is largely enabled by various prediction metrics, which contribute significantly to the following key aspects of energy efficiency:
  • Optimization of energy use: Prediction metrics facilitate the development of methods to optimize energy use, ensuring that energy is utilized effectively without compromising occupant needs. Buildings can operate more economically and efficiently by anticipating peak demand periods and adjusting energy consumption accordingly.
  • Cost reduction: Precise energy forecasting enables improved financial planning and budgeting, resulting in lower operating expenses. Organizations can use these measurements to negotiate lower energy prices, participate in demand response initiatives, and make investments in energy-saving technologies that, over time, result in significant cost savings.
  • Enhanced energy planning: In markets where energy prices fluctuate, energy managers can make well-informed decisions about whether to purchase, sell, or store energy by using dependable forecast metrics. This enhances the ability to respond to market fluctuations and to capitalize on reduced energy expenses during off-peak periods.
  • Improved grid reliability: Prediction metrics help balance supply and demand within the broader energy system, mitigating overloads and reducing the likelihood of outages. They are also crucial for managing the variability and intermittency of renewable energy sources, which makes them essential for integrating those sources into the grid.
  • Sustainability and environmental impact: Prediction metrics can help reduce carbon emissions and achieve sustainability goals by facilitating more efficient energy use. They provide the information necessary to assess the success of energy-saving programs and the progress toward environmental goals.
Table 1 describes some of the commonly used prediction metrics and explains their importance.
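As an illustration of how such metrics are evaluated in practice, the following minimal sketch computes four error measures that are widely used in load forecasting: MAE, RMSE, MAPE, and CV(RMSE). The choice of these four metrics, the function names, and the sample load values are assumptions made for this example and are not taken from Table 1.

```python
import math

def mae(actual, predicted):
    """Mean absolute error, in the units of the target (e.g. kWh)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error; penalizes large errors more than MAE."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error; undefined if any actual value is 0."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def cv_rmse(actual, predicted):
    """RMSE normalized by the mean of the actual values, in percent."""
    mean_actual = sum(actual) / len(actual)
    return 100.0 * rmse(actual, predicted) / mean_actual

# Hypothetical hourly loads in kWh, invented for illustration only.
actual = [100.0, 120.0, 80.0, 100.0]
predicted = [110.0, 110.0, 90.0, 90.0]

print(f"MAE      = {mae(actual, predicted):.2f} kWh")
print(f"RMSE     = {rmse(actual, predicted):.2f} kWh")
print(f"MAPE     = {mape(actual, predicted):.2f} %")
print(f"CV(RMSE) = {cv_rmse(actual, predicted):.2f} %")
```

MAE and RMSE are scale-dependent, whereas MAPE and CV(RMSE) are relative measures, which makes them easier to compare across buildings of different sizes.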

3. Work Methodology

This systematic review was conducted in strict adherence to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) reporting standard, ensuring transparency and rigor in the methodology. Records for inclusion were identified through comprehensive searches conducted on the Scopus database, spanning the period from 2012 to April 2025. The search query employed was specifically designed to capture relevant literature: TITLE-ABS ((“Machine Learning”) AND (“Smart Building”) AND (“Energy Management” OR “Efficiency” OR “Occupancy”) AND PUBYEAR > 2011 AND PUBYEAR < 2026).
Following the initial search, the titles and abstracts of the identified records were screened against four pre-defined inclusion criteria: (i) the source had to be peer-reviewed, (ii) the study needed to focus on smart-building applications, (iii) there had to be explicit use of an ML algorithm, and (iv) the study had to include an empirical evaluation of outcomes related to energy, comfort, or operations. The full texts of the remaining eligible articles were then assessed for final inclusion. Studies were excluded if they lacked quantitative results. During the data extraction phase, comprehensive information was captured, including publication metadata, the specific ML technique employed, characteristics of the dataset used, and key performance indicators such as energy-saving rates or prediction accuracy. Figure 4 shows the PRISMA diagram summarizing the identification, screening, eligibility assessment, and final inclusion of machine learning studies on smart-building energy management published between 2012 and April 2025. The studies included in the review cover many application domains within smart buildings, such as energy prediction, system control, and occupant behavior analysis. Some of these works are shown in Table 2.

4. ML Techniques in Energy Prediction

ML is commonly divided into three main categories: supervised, unsupervised, and reinforcement learning. Supervised learning trains a model on a labeled historical dataset to predict a dependent variable, such as energy use or HVAC faults [33]; once trained, the model can produce forecasts for new combinations of the independent variables. Unsupervised learning, on the other hand, requires no labels; it can be used for anomaly detection but cannot diagnose specific failure types [43]. While supervised and unsupervised learning are effective for observation and prediction, they are less suitable for interactive management and adjustment tasks. Reinforcement learning, in which agents interact with a structured environment and receive feedback, is better suited to these tasks. Table 3 compares the main categories of ML techniques in energy prediction.

4.1. Supervised ML Solutions

Supervised ML involves creating a system model from a labeled training dataset, allowing the learner to generalize representations based on input, output, and system parameters. A training approach is employed to develop ML models, and this process is reiterated until the model achieves the desired accuracy [44,45]. Examples of prevalent supervised ML algorithms include the naive Bayes model, decision trees, Support Vector Machines (SVMs), artificial neural networks (ANNs), hidden Markov models (HMMs), instance-based learning (such as k-nearest neighbor learning), ensemble methods (bagging, boosting, random forests), logistic regression, and genetic algorithms [46,47]. Supervised learning techniques are commonly employed in smart buildings to resolve diverse challenges. Examples of supervised ML in IEMS include the following:
  • Supervised ML for occupancy detection: In smart buildings, supervised ML methods have been used to identify occupancy. These algorithms can precisely anticipate the occupancy state of various spaces within a building by training models on past occupancy data collected from sensors. For instance, ref. [48] classified the occupancy state of office rooms using sensor data and an SVM model. The model’s high precision in determining room occupancy made it possible to operate the lighting, heating, and cooling systems more effectively.
  • Supervised ML in energy consumption forecasting: Supervised ML algorithms are commonly employed in smart buildings to forecast energy usage. To predict future energy use accurately, these models learn patterns from previous energy data, weather information, and other relevant factors. For example, a Long Short-Term Memory (LSTM) neural network has been employed to predict a building's electricity consumption, demonstrating high accuracy in forecasting energy demand and thereby supporting better energy efficiency and lower costs.
  • Supervised ML in FDD: Supervised ML approaches have effectively supported fault detection and diagnosis in HVAC systems within smart buildings. By training models on labeled data that indicate the presence of faults, these algorithms can recognize and categorize various kinds of equipment defects. For instance, a random forest classifier has been used to detect faults in the air-handling equipment of a building. The model accurately identified different fault types, allowing for quick maintenance and effective system operation.
Recently, supervised learning methodologies have emerged as a dominant approach to tackling tasks related to smart building energy management systems, mainly due to their high accuracy when trained on extensive, well-annotated datasets. Methods like classification and regression models have demonstrated efficacy in forecasting energy use, detecting occupancy, and diagnosing faults.
The dependence of supervised learning on extensive labeled datasets underscores the potential benefits of incorporating unsupervised and reinforcement learning approaches. Unsupervised learning can discern trends and anomalies without labeled data, which makes it appropriate for detecting anomalies in energy consumption patterns or tenant behavior. Reinforcement learning, on the other hand, is an interactive, self-improving process that can adapt to changes in how the building is used or to variations in the outside environment, which makes solutions more reliable in complex, changing situations. Diversifying ML methodologies in smart energy systems enhances model flexibility and robustness, moving from static forecasts to adaptive, real-time decision-making frameworks.

4.1.1. Classification

Classification algorithms aim to assign examples to one of several distinct categories. In supervised learning, a labeled dataset trains the model, and classification accuracy is evaluated on a separate test set whose labels are hidden from the model. This is frequently assessed by the accuracy rate ($AR$) given by [49]:
$$AR = \frac{N_c}{N_t}$$
where $N_t$ represents the total number of test instances, and $N_c$ is the number of test instances correctly assigned to the categories to which they belong. The details of the classification results are measured using the precision ($P$) and recall ($R$) metrics. True positive ($TP$), false negative ($FN$), false positive ($FP$), and true negative ($TN$) are the four outcomes that can occur. $P$ and $R$ are defined as follows:
$$P = \frac{TP}{TP + FP}$$
$$R = \frac{TP}{TP + FN}$$
Given P and R, a straightforward measure that balances recall and precision of the classification results is the F-measure or F-score described as follows:
$$F = \frac{2PR}{P + R}$$
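The accuracy rate, precision, recall, and F-score defined above can be computed directly from the four outcome counts. The sketch below is a minimal illustration for a binary problem; the function names and the occupancy labels are hypothetical and chosen only for this example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, FN, and TN for a binary classification result."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy rate, precision, recall, and F-score as a dict."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / len(y_true)                 # AR = N_c / N_t
    precision = tp / (tp + fp) if tp + fp else 0.0     # P = TP / (TP + FP)
    recall = tp / (tp + fn) if tp + fn else 0.0        # R = TP / (TP + FN)
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return {"AR": accuracy, "P": precision, "R": recall, "F": f_score}

# Hypothetical occupancy labels (1 = occupied, 0 = empty), for illustration.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
print(classification_metrics(y_true, y_pred))
```

Returning 0.0 when a denominator is zero is one common convention; other conventions (for example, raising an error) are equally defensible.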

4.1.2. Decision Tree

The decision tree method is an important predictive ML modeling approach that constructs a model of decisions based on the actual values of features in the data. Decision trees can be used for both classification and regression problems. In a tree structure, leaves represent class labels, and branches represent conjunctions of attribute tests that lead to those labels [50]. Decision trees whose target variable takes continuous values are called regression trees. Decision trees are often among the most favored ML algorithms because of their speed and accuracy. The most common decision tree algorithms are [51] Classification and Regression Tree (CART), ID3, C4.5 and C5.0, Chi-squared Automatic Interaction Detection (CHAID), M5, and conditional decision trees. The study in [52] introduced a machine learning technique that uses decision trees to identify the most prevalent human behaviors and their temporal relationships, thereby facilitating the rapid modeling of human behavior in intelligent environments. Reference [53] presented a prototype distributed data mining system for the healthcare sector, based on the C4.5 classification algorithm, that is capable of delivering patient monitoring and health services. The decision tree is a non-parametric model that is straightforward to interpret and explain. The primary drawback of this method is its susceptibility to overfitting.
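To make the idea of a decision on feature values concrete, the following sketch finds the single best threshold split on one feature using Gini impurity, the criterion used by CART (ID3 and C4.5 rely on entropy-based criteria instead). It is a one-level illustration rather than a full tree learner, and the CO2 readings and labels are invented for the example.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best test 'value <= threshold'.

    Returns (None, parent_gini) if no split improves on the unsplit node.
    """
    pairs = sorted(zip(values, labels))
    best = (None, gini(labels))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between identical feature values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for val, lab in pairs if val <= threshold]
        right = [lab for val, lab in pairs if val > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical CO2 readings (ppm) and room states, for illustration only.
co2 = [420, 450, 480, 700, 750, 800]
state = ["empty", "empty", "empty", "occupied", "occupied", "occupied"]
print(best_split(co2, state))
```

A full tree applies this search recursively to each child node. Limiting the recursion depth or the minimum node size is the usual way to counter the overfitting mentioned above.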

4.1.3. Bayesian Method

Bayesian methods employ Bayes' theorem for classification and regression tasks. Common algorithms include Naive Bayes, Gaussian Naive Bayes, and Bayesian (belief) networks. They are utilized in building automation systems, for example for indoor localization and the detection of HVAC system faults [54]. Naive Bayes classifiers have also been utilized for human activity recognition, selecting the activity with the highest likelihood given the sensor information. These models are constructed upon domain-specific knowledge and virtual sensors [55].
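The following sketch shows how a Gaussian Naive Bayes classifier applies Bayes' theorem: it picks the class that maximizes the log prior plus the sum of per-feature Gaussian log-likelihoods, treating features as conditionally independent. The function names, the two features, and the sample readings are hypothetical and serve only as an illustration.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate the class prior and per-feature mean and variance per class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        model[label] = (n / len(X), means, variances)
    return model

def predict_gaussian_nb(model, row):
    """Return the class with the highest posterior log-probability."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical readings: [supply air temperature in C, fan power in kW].
X = [[13.0, 2.0], [12.5, 2.1], [13.2, 1.9],
     [17.0, 3.0], [16.5, 3.2], [17.4, 2.9]]
y = ["normal", "normal", "normal", "fault", "fault", "fault"]

model = fit_gaussian_nb(X, y)
print(predict_gaussian_nb(model, [13.1, 2.0]))
print(predict_gaussian_nb(model, [16.8, 3.1]))
```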

4.1.4. SVM

The Support Vector Machine (SVM) is a supervised ML algorithm applicable to both classification and regression tasks, although it is predominantly employed for classification [56]. SVMs are prevalent in diverse statistical learning applications, such as handwriting analysis, spam detection, facial and object recognition, and text categorization [57]. The objective of an SVM is to find the separating hyperplane that maximizes the margin, that is, the distance between the hyperplane and the nearest points of the two classes. Support vectors are the points situated on the margin boundaries, while the center of the margin is the optimal separating hyperplane [58]. SVM techniques have been used in various applications, including forecasting system-level electrical loads in public buildings, real-time person tracking systems, and the automatic recognition of everyday living activities in smart homes [59]. Further applications include real-time activity error detection in smart homes, for which a one-class SVM has been suggested, and the detection of occupancy behavior in buildings from temperature and heating source information, for which ML techniques based on SVM and RNN have been suggested [60].
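For reference, margin maximization can be stated as the standard hard-margin optimization problem. The notation is an assumption introduced here and does not come from the cited works: training points $x_i$ with labels $y_i \in \{-1, +1\}$, weight vector $w$, and bias $b$, with the two classes assumed to be linearly separable.

```latex
\begin{aligned}
\min_{w,\,b} \quad & \tfrac{1}{2}\,\lVert w \rVert^{2} \\
\text{subject to} \quad & y_i \left( w^{\top} x_i + b \right) \ge 1, \qquad i = 1, \dots, n
\end{aligned}
```

The separating hyperplane is $w^{\top} x + b = 0$, and the margin width is $2 / \lVert w \rVert$, so minimizing $\lVert w \rVert$ maximizes the margin. The support vectors are the training points for which the constraint holds with equality. Soft-margin and kernel variants relax the separability assumption.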

4.1.5. Artificial Neural Network Algorithms

Artificial neural networks (ANNs) are popular models used to solve classification and regression problems. Popular ANN algorithms include the Perceptron, Back-Propagation, the Hopfield Network, and the Radial Basis Function Network (RBFN). ANNs offer benefits such as requiring less formal statistical training to develop, the ability to detect possible interactions between predictor variables, and the ability to capture complex non-linear relationships [61]. However, they have drawbacks such as their 'black box' nature, high computational cost, and susceptibility to overfitting. In smart buildings, ANNs have been applied to improve energy efficiency and to support security applications, context-aware services, and activity identification [62].
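As a minimal illustration of the simplest ANN mentioned above, the following sketch trains a single perceptron with the classic error-driven update rule. The two binary sensor inputs and the rule that a room counts as occupied only when both sensors fire are assumptions invented for the example.

```python
def predict(weights, bias, row):
    """Step activation: output 1 if the weighted sum is positive, else 0."""
    activation = sum(w * x for w, x in zip(weights, row)) + bias
    return 1 if activation > 0 else 0

def train_perceptron(X, y, epochs=20, learning_rate=1.0):
    """Perceptron learning rule: adjust weights only on misclassified samples."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for row, target in zip(X, y):
            update = learning_rate * (target - predict(weights, bias, row))
            if update != 0:
                weights = [w + update * x for w, x in zip(weights, row)]
                bias += update
                errors += 1
        if errors == 0:
            break  # every training sample is classified correctly
    return weights, bias

# Hypothetical inputs: [motion sensor, door sensor]; occupied only if both fire.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 0, 0, 1]

weights, bias = train_perceptron(X, y)
print(weights, bias, [predict(weights, bias, row) for row in X])
```

A single perceptron can only learn linearly separable targets such as this one. Multi-layer networks trained with back-propagation remove that restriction at the cost of interpretability and computation.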

4.1.6. Deep Learning Algorithms

Deep learning approaches represent an enhanced iteration of artificial neural networks (ANNs) that use a complex architecture comprising multiple layers with diverse linear and non-linear transformations. One objective of deep learning is to replace manually selected features with efficient hierarchical feature extraction and unsupervised or semi-supervised feature learning. The most prevalent deep learning algorithms include the Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Deep Boltzmann Machine (DBM), Deep Belief Network (DBN), and Stacked Autoencoders [63]. Deep learning has been effectively utilized in big data analytics across various domains, including natural language processing (NLP) applications, medical diagnostics, stock market trading, network security, and image recognition. It is widely used by major enterprises for applications such as real-time speech translation, voice recognition, and smart home devices [64]. However, because deep models have a large number of parameters, they require more data for estimation. Predictive algorithms based on deep learning frameworks include restricted Boltzmann machines and deep belief networks, as well as hybrid models [63]. A comprehensive deep learning framework has been proposed for human activity recognition using convolutional and recurrent neural networks, suitable for multimodal wearable sensors such as accelerometers, gyroscopes, and magnetic field sensors. A two-phase neural-based deep model has been introduced for classifying human activities, in which the initial phase autonomously learns spatiotemporal characteristics via convolutional neural networks. Convolutional neural networks are also used for the acceleration-based identification of human activity, with data from wearable sensors such as accelerometers and gyroscopes serving as inputs to deep convolutional networks [65,66,67].

4.1.7. Hidden Markov Models

A Hidden Markov Model (HMM) is a doubly stochastic process: an underlying stochastic process that is hidden and can be observed only through another stochastic process that generates the sequence of observable symbols. In smart building applications, an improved HMM has been used for predicting user behaviors and providing services for people with impairments [48]. It has also been used for real-time assessment and feedback in physical rehabilitation [68]. A comprehensive daily activity recognition inference engine has been proposed that integrates the Viterbi and Baum–Welch algorithms for enhanced precision and learning capability [69]. Sequence-based models for the online recognition of everyday activities in smart building contexts have also been presented.
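The Viterbi algorithm mentioned above recovers the most likely sequence of hidden states from a sequence of observations. The sketch below is a minimal textbook version, not the inference engine of [69]. The two hidden states, the two observation symbols, and all probabilities are invented for illustration.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state path for the observations."""
    # best[t][s] = (probability of the best path ending in s at time t,
    #               state preceding s on that path)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        previous = best[-1]
        column = {}
        for s in states:
            column[s] = max(
                (previous[r][0] * trans_p[r][s] * emit_p[s][obs], r)
                for r in states
            )
        best.append(column)

    # Backtrack from the most probable final state.
    path = [max(states, key=lambda s: best[-1][s][0])]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path

# Hypothetical model: hidden occupancy state, observed power level.
states = ("away", "home")
start_p = {"away": 0.6, "home": 0.4}
trans_p = {"away": {"away": 0.7, "home": 0.3},
           "home": {"away": 0.2, "home": 0.8}}
emit_p = {"away": {"low": 0.9, "high": 0.1},
          "home": {"low": 0.3, "high": 0.7}}

observed = ["low", "low", "high", "high"]
print(viterbi(observed, states, start_p, trans_p, emit_p))
```

For long sequences the products of probabilities underflow, so practical implementations usually work with log-probabilities and sums instead.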

4.1.8. Time Series Analysis

A time series is a collection of observations ordered in time, and time series datasets generally possess the following characteristics: high dimensionality, a substantial number of instances, and continuous updates [70]. Time series representations fall into three primary categories: model-based, non-data-adaptive, and data-adaptive representations. A primary objective of time series representation is to reduce dimensionality [71,72]. In smart buildings, time series analysis has been used to forecast the sleeping patterns of elderly occupants. In [73], the authors proposed a wellness model that uses a seasonal autoregressive integrated moving average time series in conjunction with sleep activity scenarios. In the realm of data sensing in smart buildings, ref. [74] proposed a time series analytical approach to examine correlations among non-stationary time series.
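As one concrete example of a non-data-adaptive representation that reduces dimensionality, the sketch below implements piecewise aggregate approximation (PAA), which replaces each equal-length segment of a series with its mean. PAA is chosen here purely for illustration and is not claimed to be the method used in the cited works; the readings are invented.

```python
def paa(series, n_segments):
    """Piecewise aggregate approximation: the mean of each equal-length segment."""
    if n_segments <= 0 or len(series) % n_segments != 0:
        raise ValueError("series length must be a positive multiple of n_segments")
    width = len(series) // n_segments
    return [sum(series[i:i + width]) / width
            for i in range(0, len(series), width)]

# Hypothetical 15-minute power readings (kW) reduced to 30-minute means.
readings = [1, 3, 2, 4, 10, 12, 11, 13]
print(paa(readings, 4))
```

The segment boundaries are fixed in advance and do not depend on the data, which is what distinguishes this family from data-adaptive representations.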

4.1.9. Regression

Regression problems aim to estimate a real-valued target function. Regression models the relationship between variables, and the model is refined iteratively using a measure of the prediction error [75]. Linear regression, logistic regression, stepwise regression, and ordinary least squares regression are the most prevalent regression techniques [76]. In smart buildings, regression based on the orthogonal matching pursuit algorithm has been used to identify the physical and environmental variables that enhance energy efficiency in a solar building [77]. To determine the orientation of a gesture and evaluate the segmentation of continuous gestures in routine activities inside an intelligent environment, ref. [78] presented a gesture recognition system that integrates linear regression with the correlation coefficient.
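A minimal sketch of ordinary least squares for a single predictor is given below, using the closed-form solution for the slope and intercept. The function name and the temperature and cooling-load values are hypothetical and chosen so that the fit is easy to check by hand.

```python
def fit_simple_ols(x, y):
    """Closed-form least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx          # requires at least two distinct x values
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical outdoor temperature (C) and cooling load (kW), for illustration.
outdoor_temp = [20.0, 25.0, 30.0, 35.0]
cooling_load = [10.0, 20.0, 30.0, 40.0]

slope, intercept = fit_simple_ols(outdoor_temp, cooling_load)
print(f"load = {slope:.2f} * temp + {intercept:.2f}")
```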

4.1.10. Ensemble Methods

A classifier ensemble is a set of classification models that are trained separately and whose predictions are then combined to give the overall prediction [63]. The most widely used ensemble learning methods are bagging, boosting, gradient boosting machines, random forest, and blending. Within the application sector of smart homes, ref. [79] developed a cluster-based ensemble approach for activity recognition. This method models activities as collections of clusters constructed from various feature subsets. Reference [78] suggested an ensemble classifier approach for activity recognition in smart homes that combines the predictions of the constituent classifiers using genetic algorithm optimization. The ANN, HMM, conditional random field, and SVM [80] were used as base classifiers. A deep LSTM ensemble method was proposed by [78] for activity recognition using wearables: the authors developed improved LSTM network training procedures and suggested combining sets of diverse LSTM learners into ensembles.
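The combination step of an ensemble can be illustrated with plain majority voting. This is a deliberately simpler rule than the genetic-algorithm-based combination mentioned above, and the three classifier outputs are invented labels rather than results from the cited studies.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists (all of equal length) by majority."""
    combined = []
    for labels in zip(*predictions):
        # most_common(1) returns the label with the highest vote count
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


# Hypothetical outputs of three base classifiers for three time windows
clf_a = ["cook", "sleep", "cook"]
clf_b = ["cook", "cook", "sleep"]
clf_c = ["sleep", "sleep", "sleep"]
print(majority_vote([clf_a, clf_b, clf_c]))  # ['cook', 'sleep', 'sleep']
```

Using an odd number of base classifiers on a two-class problem avoids ties; with more classes a tie-breaking rule would have to be chosen explicitly.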

4.2. Unsupervised ML Solutions

Unsupervised learning involves developing algorithms that assess the behavior or system in question using unlabeled data [81]. Consequently, the algorithm has no knowledge of the correct output. The unsupervised learning algorithm groups the samples into distinct clusters according to the similarity of the input samples. Clustering facilitates the identification of relationships using several criteria extracted from the data. The model is built by deducing general rules from the structures present in the supplied data. Redundancy can be systematically minimized, or data can be grouped based on similarity through mathematical methods [82]. The unsupervised technique has been utilized to identify diverse actions in smart buildings [83]. Clustering, dimensionality reduction, and the learning of association rules exemplify standard unsupervised learning problems. The Apriori algorithm and k-means are two prominent examples of unsupervised learning algorithms. In unsupervised learning, the objective is to identify patterns and correlations among a collection of input measures, typically without any output measure; the focus is solely on recognizing the features [44].

4.2.1. Clustering

Clustering is a method used to categorize data into groups with maximal similarity, such as categorizing clients based on their purchase behaviors. Common clustering techniques include centroid-based and hierarchical clustering, which focus on the internal structures of the input data [84]. The quality of clustering results is evaluated based on the specific application employing a clustering technique [49,78]. For instance, Lloyd’s clustering technique is used to group activity instances, and the evidence-theoretic K-Nearest Neighbors learning approach is used to identify and predict user activities in intelligent environments [85]. Unsupervised learning approaches have been employed to address challenges in daily living activities in smart homes, such as assisting individuals with dementia, detecting atypical visits by older individuals, and identifying social interactions and daily activities within a smart environment. An unsupervised algorithm is also used to monitor habitual acts in an individual’s routine using the basic k-means algorithm. Finally, an unsupervised ML approach is proposed to identify behavioral patterns of occupants, using unannotated data from low-level sensors in a smart building [82,83,86].
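The alternating assign-and-update loop of Lloyd's algorithm (basic k-means) can be sketched in a few lines for one-dimensional data. The values and the initial centroids below are illustrative assumptions; real activity data would be multi-dimensional and would normally use a library implementation.

```python
def kmeans_1d(values, centroids, max_iter=100):
    """Lloyd's algorithm on scalars: returns (centroids, clusters)."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: attach each value to its nearest centroid
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        updated = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if updated == centroids:  # converged: assignments no longer change
            break
        centroids = updated
    return centroids, clusters


# Hypothetical activity durations (minutes) forming two obvious groups
durations = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, groups = kmeans_1d(durations, [0.0, 5.0])
print(centers)  # [2.0, 11.0]
```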

4.2.2. Association

Association rule learning is used to identify rules that describe notable portions of the input data, such as the observation that individuals who buy X also tend to buy Y. Association analysis can uncover connections between ostensibly unrelated items in a relational database or comparable repository by examining frequent if/then statements within the data. “Support” denotes how frequently an itemset occurs in the database, whereas “confidence” indicates how often the if/then statement has been found to hold. A variety of algorithms can be employed to produce association rules. The predominant association algorithm is known as Apriori [87]. A system has also been proposed that identifies common patterns of human behavior through association, process mining, clustering, and classification methods in the context of smart buildings. The learning layer, the primary component of the system, consists of two modules: the language module, which provides a unified understanding of the patterns, and the algorithm module, which identifies the patterns. To analyze the association rules derived from the statuses of each device in SB settings, the work in [88] introduced a system to produce service scenarios. These states are obtained from the devices at predefined intervals. In a smart home, the authors in [83] introduced the use of temporal association rule techniques to determine the temporal dimensions of activities, including their start and end times, as well as their temporal ordering.
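The two measures can be made concrete with a short calculation. In the sketch below, each transaction is a hypothetical snapshot of the devices that were switched on at one sampling instant; the device names and data are invented for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical device-state snapshots
snapshots = [
    {"tv_on", "lamp_on"},
    {"tv_on", "lamp_on", "heater_on"},
    {"tv_on"},
    {"lamp_on"},
]
print(support(snapshots, {"tv_on", "lamp_on"}))       # 0.5
print(confidence(snapshots, {"tv_on"}, {"lamp_on"}))  # about 0.667
```

Here the rule "if the TV is on, then the lamp is on" covers half of the snapshots (support 0.5) and holds in two of the three snapshots in which the TV is on (confidence 2/3).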

4.3. Semi-Supervised Learning

Semi-supervised learning occupies a position between supervised and unsupervised methodologies. The input data comprises a combination of labeled and unlabeled samples. These hybrid algorithms aim to combine the advantages of both categories while mitigating their shortcomings. The model identifies patterns in the data and generates predictions accordingly. Regression and classification are two typical problem types. Generative models, heuristic methods, semi-supervised Support Vector Machines, graph-based techniques, self-training, help-training, mixture models, co-training, and multi-view learning represent some prominent semi-supervised learning models [89]. In their semi-supervised fall detection methodology, ref. [90] advocated using a supervised algorithm based on decision trees during the training phase, followed by the creation of profiles to formulate a semi-supervised algorithm based on various thresholds. To address the issue of determining whether a user is indoors or outdoors, the authors in [78] outlined a semi-supervised ML approach that employs only the low-power sensors of a smartphone. To improve the effectiveness of activity learning with a restricted quantity of labeled samples, the study in [78] offered the En-Co-training semi-supervised learning algorithm for activity recognition. The suggested method improves the co-training framework through the utilization of an ensemble technique.
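Of the models listed above, self-training is the simplest to sketch. The example below repeatedly labels the unlabeled samples that lie close to a class centroid and then recomputes the centroids. The nearest-centroid classifier, the distance threshold, and the one-dimensional light-level data are assumptions made for illustration; this is not the En-Co-training algorithm or any other cited method.

```python
def self_train(labeled, unlabeled, threshold=1.0, max_rounds=10):
    """Self-training with a nearest-centroid classifier on scalar features.

    labeled: list of (value, label) pairs; unlabeled: list of values.
    Returns the enlarged labeled set and the values left unlabeled.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        # Fit: one centroid per class from the current labeled set
        groups = {}
        for value, label in labeled:
            groups.setdefault(label, []).append(value)
        centroids = {lab: sum(vs) / len(vs) for lab, vs in groups.items()}
        # Predict: accept only pseudo-labels that are confident (close enough)
        confident, remaining = [], []
        for value in pool:
            label = min(centroids, key=lambda c: abs(value - centroids[c]))
            if abs(value - centroids[label]) <= threshold:
                confident.append((value, label))
            else:
                remaining.append(value)
        if not confident:
            break
        labeled.extend(confident)
        pool = remaining
    return labeled, pool


# Hypothetical normalized light levels: low indoors, high outdoors
seed = [(0.0, "indoor"), (10.0, "outdoor")]
result, leftover = self_train(seed, [0.8, 1.3, 9.5, 5.0])
print(leftover)  # [5.0]
```

The value 1.3 is too far from the initial indoor centroid to be labeled in the first round, but it is accepted in the second round once the centroid has moved; the ambiguous value 5.0 is never labeled.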

4.4. Reinforcement Learning (RL) Solutions

RL is an ML-based approach that aims to control systems to maximize long-term performance measures [91]. It is influenced by behaviorist psychology and focuses on how software agents should act to maximize cumulative rewards. In particular, in circumstances with ample training data and no prior knowledge, RL algorithms can develop effective control rules. However, RL algorithms face challenges, such as high computational costs when determining the appropriate course of action, because in the worst case they must evaluate all possible states. Common techniques used in RL include brute force, Monte Carlo methods, temporal difference, and value function approaches. Reinforcement learning includes computational approaches to complicated control, optimization, and adaptive behavior problems, as well as the study of how rational people modify their behavior in uncertain circumstances. In recent years, significant advances have been made in the field. RL provides qualitative and quantitative models for understanding and replicating adaptive decision-making based on incentives and penalties [92]. RL has been utilized in multiple capacities within smart buildings (SBs). The authors of [93] employed Q-learning to regulate lighting in buildings, identifying the ideal intervals for activating and deactivating lights to improve energy efficiency. They suggested a dynamic programming strategy that uses Q-learning, offering clients a more efficient, adaptable, and flexible means of making optimal choices in SB situations. Furthermore, ref. [94] presented a temporal difference reinforcement learning method that uses explicit or implicit user feedback to autonomously discern user preferences for music and lighting configurations in smart buildings.
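A minimal tabular Q-learning sketch for a lighting schedule is given below. The four-slot occupancy pattern, the reward values, and the learning parameters are all hypothetical and chosen only to show the update rule; they do not reproduce the setup of [93].

```python
import random

OCCUPIED = [False, False, True, True]  # hypothetical occupancy per time slot
ACTIONS = ["off", "on"]


def reward(slot, action):
    """Comfort reward when occupied, energy penalty when lighting an empty room."""
    if OCCUPIED[slot]:
        return 1.0 if action == "on" else -1.0
    return -0.5 if action == "on" else 0.0


def train(steps=5000, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(len(OCCUPIED)) for a in ACTIONS}
    slot = 0
    for _ in range(steps):
        # Uniformly random behaviour policy; Q-learning is off-policy,
        # so it still learns the values of the greedy policy.
        action = rng.choice(ACTIONS)
        nxt = (slot + 1) % len(OCCUPIED)
        target = reward(slot, action) + gamma * max(q[(nxt, a)] for a in ACTIONS)
        q[(slot, action)] += alpha * (target - q[(slot, action)])
        slot = nxt
    return q


def greedy_policy(q):
    """Best action per time slot according to the learned Q-table."""
    return [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(len(OCCUPIED))]


print(greedy_policy(train()))  # ['off', 'off', 'on', 'on']
```

Because the toy environment is deterministic and every state-action pair is visited many times, the learned policy switches the light on exactly in the occupied slots. A real building would have a much larger state space and stochastic occupancy.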

5. Data Handling in ML Applications

In smart buildings, the efficacy of ML applications pivots on the ability to handle data meticulously, spanning from acquisition to processing. Data handling in ML applications involves a sequence of processes designed to prepare and refine data, ensuring its optimal use for generating accurate and actionable insights. This sequence includes data collection, preprocessing, feature extraction, and hyperparameter optimization, each playing a pivotal role in the lifecycle of ML projects. Table 4 provides a comprehensive overview of the data handling techniques critical in ML applications, with explanations tailored to smart building management contexts. These techniques are fundamental for ensuring that the data fed into ML models is of high quality and relevance, leading to more accurate and efficient outcomes. Together, these data-handling processes establish a robust foundation for ML applications in smart buildings, facilitating precise energy prediction, efficient resource management, and improved sustainability. This section delves into each of these aspects, elucidating their importance and detailing the methodologies that underpin effective data handling in the context of machine learning applications for smart buildings.

5.1. Data Collection in Smart Buildings

Implementing ML applications in smart buildings is predicated on data collection. It involves acquiring data from diverse sources, including energy meters, occupancy sensors, environmental monitors, and other IoT devices, as shown in Table 5. This data is essential for energy management and prediction since it sheds light on the operating dynamics of the building. Reference [95], for instance, showed how data-gathering systems in smart buildings can improve energy efficiency by improving comprehension and management of energy consumption.

5.2. Preprocessing and Data Cleaning

Preprocessing and data cleaning are essential to converting unprocessed data into a machine learning model-ready format. In this process, missing values are handled, outliers are eliminated, data is normalized, and categorical variables are encoded. These processes guarantee that the data is ready for analysis, consistent, and clean. In their 2000 discussion, ref. [96] emphasized the importance of data pretreatment in data mining and how it affects the outcome of ML initiatives.
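The four steps named above can be sketched with the Python standard library. The z-score rule for outliers, the threshold of 2, and the sample readings are illustrative assumptions; other imputation and outlier rules are equally common.

```python
from statistics import mean, median, pstdev


def impute_median(values):
    """Replace missing readings (None) with the median of the known readings."""
    known = [v for v in values if v is not None]
    fill = median(known)
    return [fill if v is None else v for v in values]


def drop_outliers(values, z_max=2.0):
    """Keep readings whose absolute z-score is at most z_max."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return list(values)
    return [v for v in values if abs(v - mu) / sigma <= z_max]


def min_max(values):
    """Rescale readings linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(labels):
    """Encode categorical labels as 0/1 vectors (categories in sorted order)."""
    categories = sorted(set(labels))
    return [[1 if lab == c else 0 for c in categories] for lab in labels]


# Hypothetical room temperatures in degrees Celsius with a gap and a sensor glitch
raw_temps = [20.0, 21.0, None, 22.0, 21.5, 95.0]
cleaned = drop_outliers(impute_median(raw_temps))
print(cleaned)           # [20.0, 21.0, 21.5, 22.0, 21.5]
print(min_max(cleaned))  # [0.0, 0.5, 0.75, 1.0, 0.75]
print(one_hot(["office", "lobby", "office"]))  # [[0, 1], [1, 0], [0, 1]]
```

On samples this small a single extreme value inflates the standard deviation, so a robust rule (for example, one based on the interquartile range) may be preferable in practice.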

5.3. Feature Extraction Techniques

In feature extraction, raw data is converted into a set of attributes that precisely characterize the data for ML models. It enhances the efficacy and performance of ML algorithms by decreasing the dimensionality of the data. Frequently utilized techniques encompass wavelet transforms, Fourier transforms, and principal component analysis (PCA). Reference [97] underscored the importance of feature extraction and selection in improving the performance of ML models.
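As an illustration of Fourier-based feature extraction, the sketch below computes the magnitude spectrum of a load signal with a naive discrete Fourier transform and reports the dominant period. The synthetic 48-sample signal with a 24-sample cycle is an assumption; a production pipeline would use an FFT library instead of this quadratic-time loop.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Magnitudes of DFT bins 0..n/2, divided by the signal length."""
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal)
        )
        mags.append(abs(coeff) / n)
    return mags


def dominant_period(signal):
    """Period (in samples) of the strongest non-constant frequency component."""
    mags = dft_magnitudes(signal)
    k = max(range(1, len(mags)), key=lambda i: mags[i])  # skip the DC bin
    return len(signal) / k


# Synthetic hourly load: constant base plus a daily (24-sample) cycle, two days long
load = [10 + 3 * math.sin(2 * math.pi * t / 24) for t in range(48)]
print(dominant_period(load))  # 24.0
```

The few largest magnitudes, or the dominant period itself, can then serve as a compact feature vector in place of the 48 raw samples.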

5.4. Importance of Hyperparameter Optimization

Hyperparameter optimization is essential to achieve optimal performance while fine-tuning ML models. It entails choosing the optimal set of algorithmic parameters, such as learning rate, random forest tree count, and neural network layer count. Furthermore, these data handling procedures work together to provide a pipeline essential to the efficient use of ML in smart building energy management. More accurate energy forecasting and optimization can result in more sustainable and effective building operations. This can be achieved by ensuring that high-quality, pertinent data is input into ML models.
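The simplest form of hyperparameter optimization is an exhaustive grid search. The sketch below tunes a single hyperparameter, the smoothing factor of simple exponential smoothing, by its one-step-ahead mean absolute error. The forecasting model, the grid, and the data are assumptions chosen to keep the example self-contained; the same loop applies to learning rates or tree counts.

```python
def one_step_mae(series, alpha):
    """Mean absolute one-step-ahead error of simple exponential smoothing."""
    level = series[0]
    errors = []
    for actual in series[1:]:
        errors.append(abs(actual - level))  # the forecast is the current level
        level = alpha * actual + (1 - alpha) * level
    return sum(errors) / len(errors)


def grid_search(series, grid):
    """Evaluate every candidate and return (best_alpha, all_scores)."""
    scores = {alpha: one_step_mae(series, alpha) for alpha in grid}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical steadily rising load; a larger alpha tracks the trend better
history = [10, 12, 14, 16, 18, 20]
best_alpha, scores = grid_search(history, [0.1, 0.5, 0.9])
print(best_alpha)  # 0.9
```

In a real study the candidates would be scored on held-out validation data rather than on the series used for fitting, to avoid selecting a value that merely overfits.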

6. ML Applications in Energy Prediction

Applying ML to energy efficiency and management in smart buildings has completely changed the game. It becomes clear when we explore ML applications in energy prediction that these technologies are essential to attaining operational excellence, sustainability, and energy efficiency rather than merely being auxiliary tools. The transition from conventional reactive models to proactive, data-driven tactics is highlighted in this portion of the article as it examines the various uses of ML in intelligent infrastructure energy consumption prediction and management. A crucial part of BEMS is energy prediction, which entails projecting future energy requirements to maximize consumption, cut expenses, and preserve grid stability. A paradigm shift has occurred in this field with the introduction of ML, allowing for the development of more dynamic, flexible, and accurate prediction models that can adjust to the always-shifting demands for energy. These developments have made it easier to anticipate short-term and long-term energy use, predict and monitor loads, perform NILM, and optimize energy use in real time. ML is used by short- and long-term forecasting models to anticipate energy needs over various periods, facilitating effective resource allocation and minimizing waste. In contrast, load prediction and monitoring aim to improve demand-side management by comprehending and forecasting a building’s energy usage trends. With the aid of ML, NILM is a revolutionary approach that provides granular insights into energy utilization by breaking down overall energy consumption data into appliance-level usage without the need for individual monitors. The ultimate use of ML in energy management is real-time optimization, which dynamically modifies energy use in response to real-time data to guarantee optimal operation of the energy systems. 
Applying ML to energy prediction and management is a major step toward intelligent, sustainable, and user-centered building environments [98]. It goes beyond simple technological improvement. We will delve into the workings of ML models, the difficulties associated with implementing them, and the genuine advantages that they offer in the context of energy management and smart buildings as we examine these applications in the parts that follow.

6.1. Short-Term and Long-Term Energy Prediction

Energy prediction has been significantly impacted by machine learning (ML), which enables the prediction of both short-term and long-term patterns of use. While long-term forecasts are essential for strategic planning and policy-making, short-term forecasts support daily operational planning. It is common practice to use methods such as deep learning models, RNNs, and time series forecasting. To illustrate how ML might improve the integration of renewable energy sources, refs. [99,100] used ML algorithms for solar irradiance prediction.

6.2. Load Prediction and Monitoring

Load prediction is essential for optimizing energy production, reducing operating costs, and maintaining the balance between supply and demand. Support Vector Machines (SVMs) and neural networks are two examples of ML models that utilize historical consumption data to predict future demand. The efficiency of SVM in power load forecasting was shown by [101], highlighting the precision of ML in predicting peak and off-peak needs. A more recent work in [102] used variational autoencoders and tree-based regression models to predict power consumption in warehouses with a coefficient of determination exceeding 97%.

6.3. Non-Intrusive Load Monitoring

Without the need for individual appliance meters, NILM approaches break down the overall energy signal of a building into consumption profiles that are particular to individual devices [103,104]. In NILM systems, ML techniques like deep neural networks and hidden Markov models are crucial. In their discussion of the use of ML approaches in NILM, ref. [105] emphasized the importance of these techniques in obtaining comprehensive insights into energy usage.

6.4. Real-Time Optimization of Energy Usage

Real-time energy optimization uses ML to assess real-time data streams for dynamic energy management, enabling quick adjustments that boost efficiency and cut expenses. Optimization algorithms and reinforcement learning (RL) are frequently used to adapt energy usage without human involvement. For instance, ref. [106] demonstrated the ability of ML algorithms to make independent decisions on energy conservation by using reinforcement learning for real-time power balancing.

7. Case Studies

The case studies in this section illustrate the practical applications and benefits of ML in real-world settings. The section consists of three parts that, taken as a whole, provide a comprehensive picture of how ML is changing the smart building industry.

7.1. Real-World Applications of ML in Smart Buildings

ML has revolutionized the management and operational efficiency of smart buildings, leading to substantial improvements in energy conservation, system reliability, and occupant comfort. Table 6 lists some real-world applications where ML has been effectively employed in smart buildings [107].

7.2. Comparative Analysis of Prediction Techniques

When selecting a predictive technique for smart building management, it is essential to evaluate the specific objectives, data characteristics, and application limitations. Linear regression and time series analysis are well-suited for trend analysis and basic forecasting applications, while Support Vector Machines, decision trees, and random forests offer enhanced adaptability in modeling complex, non-linear interactions. As for neural networks and deep learning models, they are effective for high-dimensional data analysis; nevertheless, they require substantial data and computing power. Clustering enhances the precision of data categorization in predictive modeling. Table 7 gives a comparison of prediction techniques in smart building management systems.

7.3. Comparative Evaluation of ML Implementations in Smart Buildings

A systematic comparison of real-world applications of ML in smart buildings is outlined in Table 8. This comparative assessment integrates various dimensions—building typology, geographic location, climate zone, sensor infrastructure, algorithmic methodologies, and quantifiable results—to discern context-specific elements that affect machine learning performance. The table demonstrates considerable variability in the implementation and efficacy of ML systems across different building environments. For instance, Google’s DeepMind initiative in data centers utilizes high-frequency, detailed operational data and comprehensive sensor instrumentation, resulting in up to a 40% reduction in cooling energy use. Conversely, university-scale initiatives such as Stanford’s HVAC optimization leverage campus-wide data consistency but exhibit relatively modest improvements (10%). The Edge in Amsterdam exemplifies the efficacy of comprehensive smart integration in office buildings, attaining a 70% reduction in energy use; yet, it relies on advanced, integrated infrastructure, which constrains its replicability in lower-tier commercial properties.

8. Limitation, Challenges, and Opportunities of Existing Solutions

The integration of IoT technologies with data-driven management systems defines smart buildings, which offer increased occupant comfort and energy efficiency. However, these developments also pose significant challenges for data security, privacy, and integration with building infrastructure, alongside new prospects for enhanced predictive analytics. These problems must be resolved for smart buildings to operate sustainably and effectively.

8.1. Limited Real-World Applications and Experimental Validation

A significant limitation in the current research on ML-based energy management systems is the excessive dependence on simulated case studies, which may inadequately reflect the intricacies and difficulties of practical applications. Although simulations provide significant insights and controlled algorithm testing, they frequently omit the unpredictable elements seen in operational settings, including variability in occupant behavior, equipment degradation, and changing environmental conditions. These characteristics can profoundly influence the efficacy of ML models, which may exhibit divergent behavior in real-world scenarios compared to simulated environments [15]. Therefore, practical experimentation and extended evaluation are essential in various architectural environments. This testing would enable researchers to evaluate ML models’ performance, resilience, and adaptability over time, providing a more precise assessment of their efficacy in live, dynamic contexts. Long-term validation facilitates the detection of possible difficulties, such as model drift or the necessity for ongoing retraining, which are challenging to identify in short-term simulations. Prioritizing empirical investigations and ongoing monitoring will significantly improve the dependability and scalability of ML-based intelligent energy management solutions, enabling effective implementation across diverse building types and operational environments [108].

8.2. Practical Challenges in Applying ML for Building Energy Efficiency

A study of the causes, effects, and possible solutions to the practical challenges faced in implementing ML for building energy efficiency will aid researchers and practitioners in understanding the key factors that obstruct the application of ML technology in building energy efficiency practices.
  • Limited techniques and resources for data collection: Any ML-based building energy prediction model must commence with data collection [109]. The collection of building energy management data currently faces two major challenges: constraints in data collection methodologies and the lack of a cohesive data collection platform. As sensor technology progresses, an increasing volume of building information can be precisely monitored. The U-factor (thermal transmittance) of the building envelope and the light transmittance of the glass curtain wall are critical design elements that directly influence a building’s energy efficiency. Nonetheless, quantifying them continues to pose a challenge [1]. Collecting data for building energy modeling is often arduous and time-intensive, as in many other domains [110]. The literature reveals that there is presently no cohesive data-collection infrastructure enabling researchers to effectively and rapidly acquire the necessary data for building energy prediction. In existing studies, the necessary data is frequently gathered from several databases [111]. To gather the requisite data, researchers occasionally even need to devise a collection system of their own. Consequently, data collection is significantly more challenging than model training. This multi-source data collection method is generally ineffective in practical applications due to privacy concerns.
The collection of information regarding building energy is hindered by the insufficient integration of data resources, thereby complicating the widespread application of ML models in building energy management. Data quality and collection challenges can be effectively addressed by utilizing multi-source data fusion techniques, enhancing data governance frameworks, and advancing sensing technology [1].
  • Data imbalance: Data imbalance in the energy efficiency field refers to a situation where data distribution across different classes or categories is significantly skewed. This means that one or more classes have a much smaller representation in the dataset than others. This issue is particularly relevant in energy efficiency applications where certain energy-saving behaviors, events, or anomalies are infrequent.
    Data imbalance in the energy efficiency domain can occur due to several reasons:
    Unbalanced energy consumption patterns: Energy usage patterns in buildings or systems may naturally exhibit imbalance, with certain events or behaviors occurring less frequently. For example, anomalies such as equipment malfunctions or energy wastage incidents may be relatively rare compared to normal operating conditions.
    Sampling bias: The data collection process may introduce bias, resulting in imbalanced datasets. For instance, data collection methods that prioritize specific time periods, building types, or energy systems may inadvertently underrepresent certain classes or categories.
    Data collection limitations: A limited monitoring infrastructure or incomplete data collection can lead to imbalanced datasets. For example, if certain sensors or meters are not installed in all buildings or energy systems, it can result in an imbalance in the collected data.
    The presence of data imbalance in the energy efficiency field has several implications:
    Biased model performance: ML models developed on imbalanced datasets sometimes exhibit a bias towards the majority class due to the greater number of samples available for learning. This may lead to suboptimal performance in identifying and forecasting infrequent energy-saving occurrences or anomalies.
    Decreased model generalization: Imbalanced datasets can lead to models that struggle to generalize well to real-world scenarios. The lack of sufficient data for underrepresented classes can limit the model’s ability to capture their characteristics and make accurate predictions in practical applications.
    Evaluation challenges: Traditional performance measurements like accuracy may be deceptive when addressing unbalanced datasets. Models that merely forecast the predominant class can attain elevated accuracy, although they neglect the objectives of energy efficiency.
    To address data imbalance in the energy efficiency field, several strategies can be employed.
    Data resampling: This process entails either oversampling the minority class or undersampling the majority class to establish a more balanced dataset. Methods such as random oversampling, the Synthetic Minority Over-sampling Technique (SMOTE), or Tomek links can be utilized to accomplish this rebalancing.
    Class weighting: Providing varying weights to different classes during model training might mitigate the imbalance. By assigning greater significance to the minority class, the model can enhance its focus on its patterns and augment its predictive efficacy.
    Ensemble methods: Ensemble approaches, like boosting and bagging, can amalgamate several models trained on various subsets of imbalanced data. This can enhance the representation of the minority class and elevate overall model efficacy.
    Anomaly detection: Instead of considering the imbalanced classes as standard data, specialized anomaly detection algorithms may be employed to discover and concentrate on infrequent occurrences or energy-saving abnormalities. These algorithms can be specially trained on the minority class to identify and emphasize such instances.
  • Model complexity: Model complexity describes the degree of sophistication and complexity of a model employed in energy efficiency. It is an indicator of how many variables, parameters, and interactions are included in the model to effectively represent the system in real life. Complex models can capture complicated linkages and offer granular insights, but they also come with difficulties in terms of interpretability, robustness, and processing requirements.
    Model complexity in the context of energy efficiency can result from various sources. To capture the numerous aspects of energy consumption in buildings, several input factors or variables must be considered, including weather conditions, occupancy patterns, building characteristics, and operational parameters. To handle intricate patterns and non-linear correlations within the data, sophisticated algorithms or techniques, like deep learning or ensemble methods, may also be used [112].
    Model complexity has a variety of effects on the subject of energy efficiency. On the one hand, complicated models may be able to make forecasts that are more accurate and reveal untapped information that will improve energy efficiency techniques and decision-making. Increased model complexity, nevertheless, might sometimes present difficulties. It might be impractical for real-time applications or large-scale implementations because it might require much computer power and take longer to train. Furthermore, complicated models can be difficult to understand, reducing their transparency and making it more difficult to comprehend the fundamental processes underpinning energy efficiency.
    Several potential solutions can be considered to alleviate the issues brought on by model complexity in energy efficiency. A strategy is to balance model complexity and simplicity while considering the trade-off between accuracy and interpretability. Without significantly affecting performance, the model can be made simpler by choosing pertinent features and using interpretable techniques. Incorporating domain knowledge and expert views can also direct the model-development process and increase the model’s applicability in real-world situations.
  • Interpretability: Model interpretability in energy efficiency relates to the ability to understand and explain the decision-making processes of ML models employed in this domain. It addresses the problem that complicated ML algorithms act as black boxes: even when the models produce accurate results, it is difficult to see how their predictions are made.
    Interpretability is crucial in the energy efficiency domain because it allows stakeholders, including building owners, facility managers, and energy professionals, to gain insights into the variables influencing energy consumption and the reasons behind particular recommendations or actions suggested by the ML models. It promotes trust, facilitates decision-making, and identifies areas where energy efficiency techniques can be strengthened [113].
    For applications involving energy efficiency, several methodologies and techniques have been suggested to improve model interpretability. These consist of the following:
    Feature importance analysis: It determines the factors or features that have the most influence on the ML model’s output. Techniques like LIME (Local Interpretable Model-agnostic Explanations), SHAP (SHapley Additive exPlanations), or permutation importance can be used.
    Creating streamlined rule-based models: They mimic the behavior of complicated ML models. These models use transparent explanations and rules that can be read by humans to describe the decision-making process.
    Visualizations: To help comprehend the connection between input data and output predictions, visual representations of model predictions are presented, such as heatmaps, decision trees, or partial dependence plots.
    Sensitivity analysis: Analyzing how changes in inputs affect the model results to determine how sensitive and robust the ML model is. It aids in locating crucial details and comprehending how the model behaves in various situations.
    Model-agnostic techniques: Making use of methods that may be used to interpret the outcomes of any ML model without requiring model-specific information. This covers techniques like surrogate models, LIME, or SHAP values.
    By utilizing these interpretability techniques, stakeholders can gain a deeper understanding of how ML models recommend energy efficiency, comprehend the underlying causes of energy consumption, validate the models’ performance, and make informed decisions based on the provided explanations.
  • Diverse research objects: Although studying varied research objects can demonstrate the data mining capabilities of ML models for building energy prediction, heterogeneous research objects are not necessarily advantageous for deploying ML models in practice. Existing studies have not settled which energy prediction problems are the most pressing and the most amenable to ML solutions. In other words, it is still unclear which time scales and prediction items are essential for enhancing building energy performance. The premise behind the rapid development of ML and its applicability across disciplines is that it can lead to ground-breaking discoveries; for a given field, however, finding appropriate application opportunities is more important than a model’s capacity to fit the data. Non-uniform research objects make it harder to compare and select ML algorithms and lead to non-standardized data structures, ultimately hindering the creation of a general ML-based building energy management modeling solution. Addressing this requires identifying the principal building energy efficiency problems that remain to be solved with ML, which in turn requires combining a market demand analysis with a technical feasibility analysis.
  • Diverse ML algorithms: A wide range of ML algorithms have been employed in studies of building energy efficiency due to the vigorous advancement of ML. Artificial neural networks (ANNs) [114], Support Vector Machines (SVMs), Multiple Linear Regression (MLR) [115,116], Extreme Learning Machines (ELMs) [114], deep learning (DL) [117,118], decision trees (DTs) [119], and random forests (RFs) [120,121] are among the algorithms employed historically. Recent research indicates that more than 128 ML algorithms have been utilized in various studies [111]. Nevertheless, no flawless algorithm has been identified to date [3]. Although various distinct ML solutions have been developed to tackle the same issue, it has not been proven that any ML strategy consistently surpasses other algorithms in all scenarios [1]. Inconsistent experimental conditions in studies involving real-world building scenarios result in divergence and a lack of generalizability in the findings. The discrepancies in prediction output, data formats, and data volume lead to uneven experimental designs. The lack of an optimal algorithm is further exacerbated by inconsistent evaluation criteria, including the coefficient of variation (CV), mean absolute error (MAE), and computing time. Limited research emphasizes the utilization of diverse evaluation criteria to evaluate the effectiveness of ML systems. The application of ML techniques for enhancing building energy efficiency has been constrained by unique building attributes and human preferences. Sharing data and established evaluation criteria can assist in identifying the most effective algorithms for building energy prediction issues, enabling cross-project comparisons and improving overall efficiency in this domain.
  • Unstructured data: The lack of a standardized data structure, which is essential for the advancement of ML-based building energy modeling, is a significant practical challenge in present research. Data employed for constructing energy models have been aggregated and classified by current research [122] (see Table 9). Researchers often consider the characteristics of prediction objects and data availability when selecting input variables and sample rates. Consequently, the volume and composition of the data differ among studies. The principal obstacle is that the fundamental building energy management challenges for ML model analysis have not yet been established. Unifying the corresponding data structure is problematic due to the variability of the prediction objects [1]. Divergences in data structures hinder the development of a consistent standard for data collection, thereby complicating the comparability of research findings. Consequently, ML-based building energy forecasting models are less prevalent [123]. It is essential to define the primary issues that ML can address and to summarize the pertinent data resources. The data structure must be customized to the particular prediction problem being addressed.
  • Technology-oriented research paradigm: Finding problems with considerable application potential is a crucial prerequisite for successful ML applications. These problems should be ones that ML models are likely to be able to solve and whose solution will have a large or even transformative impact on the respective field [1]. The technology-oriented research paradigm, however, encourages researchers to prioritize testing the ML models’ capacity to learn from building energy data [3]. As a result, the research lacks in-depth analysis and discussion of the application value of the prediction problems and of the applicability of the proposed ML models to building energy management. This is the primary cause of the inconsistent research objects and data formats reported in previous studies. Eventually, it results in discrepancies between the demands of practice and the conclusions of recent studies. To effectively promote the use of ML technology in building energy efficiency measures, researchers should adopt a problem-driven research paradigm that considers all steps in the application process of ML models.
    Table 9. Major data types used in ML-based building energy modeling.
    | Data Type | Examples |
    |---|---|
    | Meteorological information | Weather-related data, including temperature, humidity, wind speed, solar radiation, and precipitation. |
    | Indoor environmental information | Indoor environment of a building, such as indoor temperature, air quality parameters (CO2 levels, volatile organic compounds), and lighting conditions. |
    | Occupancy-related data | Occupancy of a building, including the number of occupants, their activities, and their behavior patterns. |
    | Time index | Timestamp or time index associated with the collected data, enabling temporal analysis and correlation with other variables. |
    | Building characteristic data | Relative compactness, surface area, wall area, roof area, overall height, orientation, glazing area, building envelope heat transfer coefficient, window-wall ratio, and shading coefficient. |
    | Socioeconomic information | Income, electricity price, GDP, and population. |
    | Historical data | Past energy consumption records, including historical energy usage patterns, utility bills, and historical weather data. |
  • Lack of model adaptability: Model adaptability denotes the capacity of ML models to be efficiently integrated with existing personnel and building energy management systems. Most recent studies were performed in controlled environments and did not consider the practical contexts of application. Building owners encounter difficulties in effectively utilizing ML-based energy forecasting models because the models cannot easily be adjusted to their settings. Whether the data structures and data-cleaning processes used in experimental environments carry over to practice remains questionable. Researching effective ways to integrate ML models into existing building energy management systems is therefore essential. The majority of building management personnel presently lack the combined expertise in building science and data science needed to deploy ML models. Current research has not yielded “plug-and-play” ML models or user-friendly solutions that can be readily adopted by existing personnel. Thus, equipping existing buildings with ML-based energy management systems continues to pose difficulties. To resolve this issue, it may be necessary to recruit additional technically proficient workers and to train existing building management staff. This knowledge gap has considerably impeded the implementation of ML-based energy forecasting models. Consequently, establishing a consistent and user-friendly deployment methodology for ML models is essential for enhancing their adaptability in building management.
  • Insufficient focus on data privacy and security: A significant yet unresolved challenge in implementing ML to enhance building energy efficiency is data privacy and security, especially considering the widespread utilization of IoT devices in smart buildings. These systems continuously collect and evaluate sensitive data, encompassing details on tenant behavior, energy consumption patterns, and operational metrics from diverse, interconnected devices. In the absence of comprehensive security standards, this data is susceptible to unauthorized access and cyberattacks, compromising individual privacy and the operational integrity of the facility. Data privacy issues are amplified because smart buildings frequently consolidate information from various sources, such as HVAC systems, lighting controls, and occupancy sensors. As this information is transmitted and processed, there is an increasing need for secure data management methods and encryption protocols to safeguard it against potential threats. Insufficient data security measures jeopardize sensitive information and undermine occupant trust in smart building systems. Consequently, the implementation of robust data security protocols is essential. This must encompass end-to-end encryption, robust authentication methods, and periodic audits to guarantee data protection at all phases, from collection and transmission to storage and analysis. Resolving these privacy and security concerns is essential for cultivating a secure and reliable environment for ML-based energy management in smart buildings, promoting broader acceptance while protecting occupants’ rights and enhancing resilience against cyber threats [108].
  • Lack of user confidence: The application of ML in building energy management directly influences the return on investment (RoI), and this critical factor makes users passive adopters of ML models for building energy management. As an emerging technology, ML requires a substantial level of trust from users before it is adopted [1]. Prospective users often take a “wait-and-see” attitude toward new technology, and this is especially pronounced in building energy management. Users adopt new technology only when they perceive it as providing tangible benefits, and clients currently lack confidence in ML-based building energy efficiency efforts. Current research predominantly evaluates the fitting capabilities of ML algorithms on building energy data without giving specific instances of how these methods improve energy efficiency. Consequently, users struggle to evaluate the cost-effectiveness of ML-based energy-saving solutions, which reduces their readiness to embrace the technology. This difficulty is especially pertinent for older buildings, whose monitoring systems need considerable upgrades to meet the digital requirements of ML-based energy efficiency measures. Demonstrating the energy savings and financial feasibility of concrete implementations may motivate prospective users to choose ML-based solutions.
  • Urban-Scale Variability and Model Transferability: A significant constraint in machine learning-based energy modeling is its susceptibility to urban-scale variables, including building density, surface area, and morphological layout. These contextual factors greatly influence energy consumption behavior, presenting difficulties for the reliability and applicability of prediction models in diverse urban settings. Recent research by [124] revealed that building energy simulations can produce significantly divergent outcomes when conducted in isolation compared to those integrated within a dense urban environment. In their case study conducted in Montreal, the inclusion of adjacent building shade augmented winter heating demands by as much as 44% and diminished summer cooling requirements by up to 40%. These findings emphasize the significant influence of urban-scale shading on model results and stress the necessity of considering context in predictive modeling [124]. In addition, ref. [125] presented a tiled, multi-city dataset that facilitates the cross-comparison of urban energy simulations. Their dataset serves as a crucial resource for assessing the transferability of machine learning models trained in one urban typology to different ones. It examines the variability caused by population density and surface area in metropolitan situations, facilitating the creation of more scalable, adaptive models [125]. Collectively, these studies highlight the need for site-specific calibration and the incorporation of adaptive learning methodologies, such as domain transfer or federated learning, to guarantee scalability and precision across varied urban environments.
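The federated learning idea mentioned in the last item can be illustrated with a minimal sketch of weighted parameter averaging (often called federated averaging). The function name, the two example sites, and their parameter values are hypothetical and chosen only for illustration; a real deployment would add local training rounds, secure aggregation, and communication handling.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Average per-building model parameters, weighted by local sample count."""
    if len(client_weights) != len(client_sizes) or not client_weights:
        raise ValueError("need one sample count per client and at least one client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    n_params = len(client_weights[0])
    averaged = [0.0] * n_params
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != n_params:
            raise ValueError("all clients must share the same parameter layout")
        for i, w in enumerate(weights):
            averaged[i] += w * size / total
    return averaged


# Hypothetical example: two sites share only model parameters, never raw data.
dense_district = [1.0, 4.0]   # parameters fitted locally on 300 samples
suburban_site = [3.0, 0.0]    # parameters fitted locally on 100 samples
global_model = federated_average([dense_district, suburban_site], [300, 100])
```

Weighting by sample count lets sites with more data contribute more to the shared model, while each site could still fine-tune the result locally to reflect its own urban context.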
Table 10 summarizes the challenges that affect energy efficiency. These issues have different root causes and different effects on how ML is applied to improve building energy efficiency. Strategies like data collection, preprocessing, model regularization, explainable ML approaches, distributed computing, system integration techniques, resource optimization, and privacy-preserving techniques are possible responses to these problems.
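As one concrete instance of an explainable ML approach, permutation importance can be sketched in a few lines: shuffle one feature at a time and measure how much the prediction error grows. The sketch below is model-agnostic and uses only the standard library; the toy model, its coefficients, and the feature names are assumptions made for illustration and do not come from any of the reviewed studies.

```python
import random
from typing import Callable, List, Sequence

Row = Sequence[float]


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(
    predict: Callable[[Row], float],
    X: List[List[float]],
    y: Sequence[float],
    n_repeats: int = 10,
    seed: int = 0,
) -> List[float]:
    """Mean increase in MAE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mean_absolute_error(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j permuted.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            score = mean_absolute_error(y, [predict(row) for row in X_perm])
            increases.append(score - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical model: load depends on outdoor temperature only,
# and ignores the second feature (day of week).
def toy_model(row: Row) -> float:
    outdoor_temp, _day_of_week = row
    return 50.0 + 2.0 * outdoor_temp


X = [[float(t), float(t % 7)] for t in range(20)]
y = [toy_model(row) for row in X]
importances = permutation_importance(toy_model, X, y)
```

In this toy setting the ignored feature receives an importance of zero and the temperature feature a positive one, which is the kind of ranking a facility manager could use to check whether a model relies on plausible drivers of consumption.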

8.3. Open Issues and Challenges for Intelligent Energy Management Systems in Buildings

An effective IEMS must be scalable, secure, cost-effective, and dependable. Figure 5 describes unresolved concerns and challenges pertaining to IEMS in buildings, which are elaborated below.
  • Scalability: In smart buildings, “scalability” denotes the ability to expand, for example by incorporating new modules and devices [34]. It is essential to ensure the power quality of the EMS any time new services, applications, or devices are integrated [126]. An EMS that cannot scale is unable to accommodate growth, which ultimately makes it unreliable and necessitates its replacement. The EMS in smart buildings must therefore be scalable to keep up with the increasing number of consumers and demands [127].
  • Privacy and security: Privacy and security are interrelated concepts. Inadequate security renders the EMS vulnerable to illicit exploitation and jeopardizes consumer privacy. Ensuring a high level of privacy and security in smart buildings is particularly difficult due to the absence of common standards for IoT security [126]. The collection and analysis of data is a critical component of an IoT-based EMS. The bulk of this information is accumulated over time through energy consumption and maintenance activities, and decisions on the efficient functioning of buildings are based on its analysis. A comprehensive data management system can evaluate a building’s energy efficiency, but it must satisfy three crucial security criteria: information integrity, mutual trust, and authentication.
  • Performance management: IoT-based smart buildings contain billions of internet-connected devices. These devices must be governed by a system that guarantees adequate fault tolerance and detection. Consequently, it is essential to implement a service that manages IoT device configuration, communication, and accessibility across different user tiers [128].
  • Cost-effectiveness: The energy consumption in smart buildings is associated with various expenses. These expenditures encompass the cost of the instruments, their operation, technological services, and maintenance. Moreover, there is an expense linked to the incorporation of diverse energy sources into an EMS. On the one hand, consumers are directly affected by the high price of an EMS. On the other hand, EMS performance may suffer if costs are cut by employing less expensive materials. For this reason, it is imperative to keep cost and performance in balance. Currently, most private customers cannot afford many IoT solutions because they are too expensive for smart buildings and other industrial applications [129].
  • Big data processing: In IoT-based systems, a growing volume of data needs efficient handling and processing, and conventional data processing techniques and equipment are no longer practical. In IoT-based smart buildings, real-time data is gathered and analyzed to assist in decision-making, so processing calls for new techniques and equipment [130]. Some of these technologies employ local data processing. Because devices are aware of the status of the primary server and their immediate neighbors, they can reduce network traffic. Additionally, because most of the processing occurs locally, adopting localized algorithms makes it possible to handle the enormous volume of data [130].
  • Interoperability and standardization issues: A significant challenge for IEMS in buildings is the absence of standardization and interoperability among diverse building automation systems and communication protocols. The lack of standards obstructs seamless integration, creating technical and operational impediments for ML models that depend on cohesive, continuous data streams from many sources, including HVAC systems, lighting, security systems, and IoT devices. Data fragmentation arises when systems use disparate or incompatible protocols, resulting in inconsistencies and possible deficiencies in data quality. Fragmented data in turn compromises the quality and performance of ML models: it introduces errors, decreases training efficacy, and restricts a model’s capacity to generalize across diverse systems or environments. In addition, the absence of established protocols complicates the integration of ML-driven EMS across various building infrastructures, frequently necessitating bespoke solutions for each system. This problem increases implementation costs and slows the widespread adoption of ML-enhanced energy management solutions. Thus, universal standards should be established, and open communication protocols should be advocated within the building automation sector. This would facilitate improved interoperability among systems, guarantee uniform data flow, and augment the robustness and scalability of ML-based solutions for intelligent energy management [131].
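To make the interoperability problem concrete, the sketch below maps two differently shaped payloads onto one common record before any ML model sees them. The payload layouts, field names, and the `Reading` schema are all hypothetical; they do not correspond to any specific building automation protocol and only illustrate the kind of per-source adapter that the absence of shared standards currently forces on integrators.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict


@dataclass(frozen=True)
class Reading:
    """Common schema assumed for this sketch: one measurement per record."""
    source: str
    point: str
    value: float
    unit: str
    timestamp: datetime


def from_hvac_gateway(msg: Dict[str, Any]) -> Reading:
    # Hypothetical HVAC gateway: Fahrenheit values, epoch seconds.
    celsius = (msg["temp_f"] - 32.0) * 5.0 / 9.0
    ts = datetime.fromtimestamp(msg["epoch"], tz=timezone.utc)
    return Reading("hvac", msg["zone"] + "/air_temp", celsius, "degC", ts)


def from_iot_sensor(msg: Dict[str, Any]) -> Reading:
    # Hypothetical IoT sensor: Celsius values as strings, ISO 8601 timestamps.
    ts = datetime.fromisoformat(msg["time"])
    return Reading("iot", msg["id"] + "/air_temp", float(msg["celsius"]), "degC", ts)


hvac = from_hvac_gateway({"zone": "floor2", "temp_f": 71.6, "epoch": 0})
iot = from_iot_sensor(
    {"id": "s17", "celsius": "22.0", "time": "1970-01-01T00:00:00+00:00"}
)
```

After normalization both records carry the same unit and comparable timestamps, so they can be merged into a single training set; with an agreed open standard, these hand-written adapters would be unnecessary.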

8.4. Opportunities for Advanced Predictive Analytics

Amid these issues, there are opportunities to use cutting-edge predictive analytics to optimize building operations and energy use. AI and ML technology can evaluate both past and current data to forecast energy use, maximize resource utilization, and improve occupant comfort. For example, ref. [132] provides a mobile-cloud framework that reduces the danger of data over-collection, allowing for more intelligent energy management in smart city settings. With advanced predictive analytics, smart buildings may become dynamic, effective, and user-responsive spaces, demonstrating the built environment’s capacity for sustainability and creativity.
As a result, smart buildings have significant potential for innovation through advanced predictive analytics, even though they also present security, privacy, and system integration challenges. To address these issues, a balanced strategy that considers the legal requirements, technological developments, and the changing needs of building managers and occupants is needed.

9. Future Trends in ML for Smart Buildings

This section investigates the prospective applications of ML in future smart buildings, as shown in Table 11. It highlights the transformative potential of technology in the built environment while also delineating the challenges and opportunities that must be addressed to develop structures that are more intelligent, sustainable, and efficient.

9.1. Predictive Maintenance

In smart buildings, predictive maintenance uses ML to predict equipment failures, thereby reducing downtime. To identify abnormalities and anticipate future failures, ML algorithms can evaluate historical and real-time data from building systems, enabling proactive maintenance measures [94]. Enhancing operating efficiency and prolonging equipment lifespan not only saves money but also minimizes energy waste.
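A very simple stand-in for the abnormality detection described above is a trailing-window z-score: a reading is flagged when it deviates from the recent history by more than a chosen number of standard deviations. This is a minimal sketch, not a learned model; the window length, threshold, and vibration values are illustrative assumptions.

```python
from statistics import mean, stdev
from typing import List, Sequence


def flag_anomalies(
    readings: Sequence[float], window: int = 5, threshold: float = 3.0
) -> List[int]:
    """Indices whose value lies more than `threshold` standard deviations
    from the mean of the preceding `window` readings."""
    flagged = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu, sigma = mean(history), stdev(history)
        # A constant history (sigma == 0) gives no scale to compare against.
        if sigma > 0 and abs(readings[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical fan vibration amplitudes with one spike at index 7.
vibration = [1.0, 1.1, 0.9, 1.0, 1.1, 1.0, 0.9, 4.0, 1.0, 1.1]
anomalies = flag_anomalies(vibration)
```

Note that the spike inflates the window statistics for the next few readings, which is one reason practical systems tend to use more robust statistics or learned models in place of this rule.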

9.2. Integration with Renewable Energy Resources

Smart buildings must integrate renewable energy resources to achieve sustainability and energy independence. ML can forecast patterns of energy production and consumption, which can maximize the use of these resources. By utilizing sophisticated analytics, ML enables the appropriate distribution and storage of renewable energy, guaranteeing steady and economical building operations. The integration of variable renewable energy sources is contingent upon using ML to improve energy profiles and problem detection in smart buildings as highlighted by [4].
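How forecasts feed into storage decisions can be shown with a deliberately simple greedy rule: store surplus generation, discharge to cover deficits, and import the remainder from the grid. The forecasts are assumed to come from some ML model that is not shown here; the function name, capacity, and hourly values are hypothetical, and losses, power limits, and tariffs are ignored.

```python
from typing import List, Sequence, Tuple


def dispatch_battery(
    pv_forecast_kwh: Sequence[float],
    load_forecast_kwh: Sequence[float],
    capacity_kwh: float,
    initial_soc_kwh: float = 0.0,
) -> Tuple[List[float], List[float]]:
    """Greedy rule: store surplus PV, discharge to cover deficits, import the rest.

    Returns (grid_import_kwh, state_of_charge_kwh) per time step.
    """
    soc = initial_soc_kwh
    grid_import: List[float] = []
    soc_trace: List[float] = []
    for pv, load in zip(pv_forecast_kwh, load_forecast_kwh):
        net = pv - load
        if net >= 0:
            # Surplus beyond the remaining capacity is not stored.
            soc = min(capacity_kwh, soc + net)
            grid_import.append(0.0)
        else:
            discharge = min(soc, -net)
            soc -= discharge
            grid_import.append(-net - discharge)
        soc_trace.append(soc)
    return grid_import, soc_trace


# Hypothetical hourly forecasts (kWh) and a 4 kWh battery that starts empty.
pv = [0.0, 4.0, 6.0, 1.0]
load = [2.0, 2.0, 3.0, 5.0]
grid, soc = dispatch_battery(pv, load, capacity_kwh=4.0)
```

In this example the battery absorbs the midday surplus and covers the final deficit, so grid import is needed only in the first hour. An optimization-based or reinforcement learning controller could replace the greedy rule while keeping the same inputs.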

9.3. Occupant-Centric Design

Occupant-centric ML methods facilitate demand-responsive management strategies, optimizing energy-intensive activities according to real-time occupancy and usage trends. This methodology aligns with the objectives of smart buildings by establishing a balanced system that emphasizes human comfort while ensuring efficient energy use. As smart building technology progresses, occupant-centric design will be pivotal in enhancing sustainable and user-friendly building management systems, creating settings that are both comfortable for occupants and responsible in their energy consumption [133].

9.4. ML and IoT in Building Automation

Building automation is being transformed by the convergence of ML and the IoT, making spaces smarter and more responsive. Large volumes of data produced by IoT sensors can be processed by ML algorithms to automate and improve building operations, including security, maintenance, and energy management. In discussing the potential of ML for next-generation wireless networks, the foundation of IoT-driven smart buildings, ref. [134] points to a trend toward more intelligent and integrated systems. In summary, ML and the incorporation of renewable energy sources will be pivotal in the future of smart buildings, propelled by the growth of IoT technology. These developments facilitate innovative building management solutions, ensuring enhanced operational efficiency and sustainability.

10. Suggestions and Recommendations

The following suggestions are made for improving IEMS utilizing ML:
  • Invest in reliable data gathering systems: Record current data on weather, occupancy, and other pertinent information as well as statistics on energy consumption. Integrate various data sources to raise the precision and efficiency of AIEMS.
  • Create cutting-edge ML algorithms: Concentrate on creating cutting-edge ML models designed especially for energy management. To optimize energy usage and enhance control strategies, investigate deep learning techniques, reinforcement learning, and other cutting-edge methodologies.
  • Adopt hybrid approaches: Integrate multiple ML techniques, including supervised and unsupervised learning, to use the benefits of each approach. Integrate data-driven methodologies with physics-based models for enhanced forecasting accuracy and superior control strategies.
  • Ensure scalability and adaptability: Develop AIEMS technologies that can manage massive energy systems and adjust to changing circumstances. Create structures and techniques for processing and analyzing enormous amounts of data quickly.
  • Focus on enhancing model interpretability: Make AIEMS models easier to understand and interpret. To help users and stakeholders understand and trust the system’s suggestions and actions, develop techniques that offer transparent insights into the decision-making process.
  • Integrate with IoT and edge computing: Investigate the connection between AIEMS and IoT hardware and edge computing innovations. Utilize real-time monitoring and control at the sensor or device level to enhance responsiveness and reduce energy usage.
  • Prioritize robustness and cybersecurity: Give these concerns higher priority and address the problems they pose for AIEMS. Apply methods for anomaly detection, privacy preservation, and secure communication to safeguard sensitive energy data and guarantee system dependability.
  • Stress user-centric design: Take user preferences, comfort levels, and behavior patterns into account when designing and running AIEMS. Make individualized energy recommendations and involve users in energy-saving activities.
  • Encourage collaboration and knowledge sharing: Promote cooperation between academics, industry professionals, and decision-makers to share knowledge, collaborate on research projects, and promote the area of AIEMS. To address difficult energy management concerns, encourage interdisciplinary research and collaborations.
  • Carry out real-time evaluations: Test AIEMS solutions in real-time situations to gauge their efficiency, scalability, and efficacy. To perform field testing and receive feedback for continual development, work with building owners, operators, and energy management firms.
Adhering to these recommendations can expedite the development and implementation of IEMS utilizing ML, leading to more economical and sustainable energy practices, alongside enhanced energy efficiency.
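The hybrid recommendation above, combining physics-based models with data-driven methods, can be sketched as a physics baseline plus a learned correction on its residual. The steady-state heat-loss formula, the UA value, the setpoint, and all observations below are illustrative assumptions rather than values from the reviewed literature, and the correction here is plain least squares in place of a more capable ML model.

```python
from typing import Sequence, Tuple


def physics_heating_kw(t_out: float, t_set: float = 21.0, ua_kw_per_k: float = 0.5) -> float:
    """Steady-state envelope loss; UA value and setpoint are illustrative assumptions."""
    return max(0.0, ua_kw_per_k * (t_set - t_out))


def fit_residual(x: Sequence[float], residual: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for residual = a * x + b."""
    n = len(x)
    mx, mr = sum(x) / n, sum(residual) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxr = sum((xi - mx) * (ri - mr) for xi, ri in zip(x, residual))
    a = sxr / sxx
    return a, mr - a * mx


# Hypothetical observations: outdoor temperature (degC), occupant count,
# and metered heating power (kW).
t_out = [1.0, 5.0, 11.0, 15.0]
occupants = [0.0, 10.0, 20.0, 30.0]
measured = [10.0, 7.5, 4.0, 1.5]

# The data-driven part learns only what the physics model fails to explain.
residuals = [m - physics_heating_kw(t) for m, t in zip(measured, t_out)]
a, b = fit_residual(occupants, residuals)


def hybrid_heating_kw(t: float, n_occupants: float) -> float:
    return physics_heating_kw(t) + a * n_occupants + b
```

Here the fitted slope is negative, reflecting that occupants contribute internal heat gains the envelope model does not capture. Keeping the physics term explicit also makes the combined model easier to interpret than a purely data-driven one.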

11. Future Research Directions

Future ML-based IEMS research can concentrate on several important areas. By maximizing energy consumption, enhancing energy efficiency, and facilitating intelligent and sustainable energy management, these areas seek to improve the capabilities and efficacy of AIEMS further. Potential directions for future research include the following:
  • Sophisticated and advanced ML algorithms and models: Create ML algorithms and models that are tuned specifically for AIEMS. This includes investigating deep learning techniques, reinforcement learning, generative adversarial networks (GANs), and other cutting-edge methodologies to increase forecast accuracy, optimize control tactics, and improve energy management decision-making.
  • Integration of various data sources: Investigate ways to integrate various data sources, such as real-time energy, weather, occupancy, and building sensor data, to improve the accuracy and efficiency of AIEMS. This entails investigating data fusion strategies, data preprocessing techniques, and feature engineering techniques to extract meaningful patterns and insights from various data streams.
  • Hybrid approaches and ensemble methods: Investigate the blending of various ML approaches, such as mixing supervised and unsupervised learning techniques or physics-based models with data-driven methodologies. Utilizing the advantages of various algorithms and models could result in more reliable and precise AIEMS solutions.
  • Scalability: Develop scalable AIEMS technologies that can handle large energy systems and adjust to changing conditions. Achieving this requires algorithms and architectures that can quickly process and analyze vast amounts of data in real time while adapting to changing patterns of energy use, user behavior, and building characteristics.
  • Sustainability Considerations: As ML models, deep learning, and reinforcement learning frameworks require more computing power across various building types and climates, scalability issues arise. Moreover, frequent retraining may be necessary to accommodate evolving building dynamics and external variables, which can lead to increased energy consumption. Future research is needed to explore efficient algorithms, data management strategies, and hardware optimizations to reduce the computational costs of these systems, ensuring genuine sustainability. Furthermore, it is crucial to create scalable frameworks that preserve energy savings while minimizing environmental costs to attain sustainable, ML-driven energy management on a larger scale.
  • Explainability and interpretability: Address the difficulty of interpreting AIEMS models. Develop strategies to provide clear and understandable insights into the decision-making processes of AIEMS models, enabling users and stakeholders to understand and trust the system’s recommendations and course of action. This can help overcome some of the black-box characteristics of ML models and facilitate their implementation in practical energy management applications.
  • IoT and edge computing integration: Explore the integration of AIEMS with edge computing and IoT devices. This reduces latency and enhances responsiveness by facilitating the real-time monitoring, control, and optimization of energy systems at the device or sensor level. Investigate methods for employing edge computing in edge analytics, distributed ML, and decentralized decision-making within AIEMS.
  • Robustness and cybersecurity: Address the difficulties with the robustness and cybersecurity of AIEMS. Develop methods to ensure the security, robustness, and resilience of AIEMS against cyber threats, data breaches, and hostile attacks. This entails examining privacy-preserving measures, secure communication protocols, and anomaly detection methods to safeguard sensitive energy data and ensure the integrity and availability of AIEMS.
  • Human-centric AIEMS: Examine how human-centric methods might be incorporated into the design and use of AIEMS. When making decisions about energy management, it is essential to consider user preferences, comfort levels, and behavioral patterns. In addition to adapting to the demands of each user individually, AIEMS should offer individualized energy advice and promote user involvement and empowerment in energy conservation.
By concentrating on these research directions, the field of Automated and Smart Energy Management Systems can advance the use of ML techniques in energy efficiency, support sustainable energy practices, and promote the adoption of smart and autonomous energy management solutions.
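As a small illustration of the secure-communication direction listed above, a sensor message can carry a keyed hash (HMAC) so that the receiver detects tampering. This is a minimal sketch using Python's standard `hmac` and `hashlib` modules; the message layout and the demo key are hypothetical, and the sketch covers integrity and authentication only, not encryption or key management.

```python
import hashlib
import hmac
import json


def sign_reading(reading: dict, key: bytes) -> dict:
    """Attach an HMAC-SHA256 tag computed over a canonical JSON encoding."""
    payload = json.dumps(reading, sort_keys=True).encode("utf-8")
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"payload": reading, "tag": tag}


def verify_reading(message: dict, key: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    payload = json.dumps(message["payload"], sort_keys=True).encode("utf-8")
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, message["tag"])


# Hypothetical shared key and meter reading; never hard-code keys in practice.
key = b"demo-key-not-for-production"
msg = sign_reading({"meter": "m1", "kwh": 3.2}, key)
tampered = {"payload": {"meter": "m1", "kwh": 0.1}, "tag": msg["tag"]}
```

Verification succeeds for the original message and fails for the altered one, which is the property an energy management system needs before it trusts incoming measurements for control or model training.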

12. Conclusions

This work explored the pivotal role of ML in enhancing the effectiveness and intelligence of smart buildings. It began by establishing a basic understanding of smart buildings and the context of automated energy management systems. It then analyzed the complex elements and phases involved in using ML in building energy management systems. A comprehensive study was conducted to evaluate the significance of prediction metrics in accurately assessing energy efficiency. The work analyzed several ML techniques, categorizing them into supervised, unsupervised, semi-supervised, and reinforcement learning methodologies. Each approach was further explored with its respective sub-categories and specific algorithms, such as decision trees, SVMs, and ANNs. This classification not only highlighted the diverse range of ML applications in energy forecasting but also underscored the nuanced methods that address various aspects of energy management in smart buildings. The discussion on data management emphasized the vital stages of data collection, preprocessing, and feature extraction, which are imperative for the effective execution of ML. The study then examined the diverse applications of ML in energy forecasting, demonstrating its versatility in predicting energy demands over both short and long durations, as well as enhancing real-time energy utilization. An analysis of real-world case studies highlighted the actual applications and benefits of ML in smart buildings, offering a concrete understanding of its advantages and the enhancements it provides to energy efficiency. The accompanying examination of obstacles and possibilities provided insight into the tangible barriers, such as data privacy and infrastructure integration, while also emphasizing the potential for sophisticated predictive analytics in this field.
Future research should prioritize emerging areas such as predictive maintenance, renewable energy integration, and the convergence of ML with IoT technology. Specifically, it is essential to build explainable ML frameworks that mitigate real-time cybersecurity threats while maintaining predictive accuracy. Reinforcement learning must also be optimized to balance occupant comfort against variable energy requirements under multi-objective constraints. Moreover, edge-based federated learning frameworks show potential for decentralized, privacy-preserving solutions that can accommodate varied urban topologies more efficiently than conventional centralized approaches. These research directions will likely drive the advancement of next-generation smart energy systems that are efficient, adaptive, transparent, secure, and aligned with human-centric design principles.

Author Contributions

Conceptualization, F.E.H., H.N.N., O.S. and K.C.; methodology, H.N.N. and K.C.; investigation, F.E.H. and H.N.N.; writing—original draft preparation, F.E.H. and K.C.; writing—review and editing, H.N.N., O.S. and K.C.; visualization, F.E.H.; supervision, H.N.N. and O.S.; project administration, H.N.N. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

Author Ola Salman is employed by the company DeepVu. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Wang, Z.; Liu, J.; Zhang, Y.; Yuan, H.; Zhang, R.; Srinivasan, R.S. Practical issues in implementing machine-learning models for building energy efficiency: Moving beyond obstacles. Renew. Sustain. Energy Rev. 2021, 143, 110929. [Google Scholar] [CrossRef]
  2. Al Dakheel, J.; Del Pero, C.; Aste, N.; Leonforte, F. Smart buildings features and key performance indicators: A review. Sustain. Cities Soc. 2020, 61, 102328. [Google Scholar] [CrossRef]
  3. Amasyali, K.; El-Gohary, N.M. A review of data-driven building energy consumption prediction studies. Renew. Sustain. Energy Rev. 2018, 81, 1192–1205. [Google Scholar] [CrossRef]
  4. Djenouri, Y.; Belhadi, A.; Lin, J.W.; Srivastava, G. Machine Learning for Smart Building Applications: Review and Taxonomy. ACM Comput. Surv. 2019, 52, 24. [Google Scholar] [CrossRef]
  5. Naji, S.; Keivani, A.; Shamshirband, S.; Alengaram, U.J.; Jumaat, M.Z.; Mansor, Z.; Lee, M. Estimating building energy consumption using extreme learning machine method. Energy 2016, 97, 506–516. [Google Scholar] [CrossRef]
  6. Chou, J.S.; Bui, D.K. Modeling heating and cooling loads by artificial intelligence for energy-efficient building design. Energy Build. 2014, 82, 437–446. [Google Scholar] [CrossRef]
  7. Marinakis, V.; Doukas, H.; Karakosta, C.; Psarras, J. An integrated system for buildings’ energy-efficient automation: Application in the tertiary sector. Appl. Energy 2013, 101, 6–14. [Google Scholar] [CrossRef]
  8. Yan, R.; Ma, Z.; Zhao, Y.; Kokogiannakis, G. A decision tree based data-driven diagnostic strategy for air handling units. Energy Build. 2016, 133, 37–45. [Google Scholar] [CrossRef]
  9. Chen, Z.; Jiang, C.; Xie, L. Building occupancy estimation and detection: A review. Energy Build. 2018, 169, 260–270. [Google Scholar] [CrossRef]
  10. Huchuk, B.; Sanner, S.; O’Brien, W. Comparison of machine learning models for occupancy prediction in residential buildings using connected thermostat data. Build. Environ. 2019, 160, 106177. [Google Scholar] [CrossRef]
  11. Sharif, S.; Hammad, A. Developing surrogate ANN for selecting near-optimal building energy renovation methods considering energy consumption, LCC and LCA. J. Build. Eng. 2019, 25, 100790. [Google Scholar] [CrossRef]
  12. Santos, G.; Teixeira, B.; Pinto, T.; Vale, Z. Automated energy management and learning. In Proceedings of the 2023 IEEE Conference on Artificial Intelligence (CAI), Santa Clara, CA, USA, 5–6 June 2023. [Google Scholar]
  13. Gaitan, N.; Ungurean, I.; Corotinschi, G.; Roman, C. An Intelligent Energy Management System Solution for Multiple Renewable Energy Sources. Sustainability 2023, 15, 2531. [Google Scholar] [CrossRef]
  14. Anastasi, G.; Corucci, F.; Marcelloni, F. An intelligent system for electrical energy management in buildings. In Proceedings of the 2011 11th International Conference on Intelligent Systems Design and Applications, Cordoba, Spain, 22–24 November 2011; pp. 888–893. [Google Scholar]
  15. Marinakis, V.; Doukas, H. An Advanced IoT-based System for Intelligent Energy Management in Buildings. Sensors 2018, 18, 610. [Google Scholar] [CrossRef]
  16. Park, D.; Kim, H.; Choi, I.; Kim, J. A literature review and classification of recommender systems research. Expert Syst. Appl. 2012, 39, 10059–10072. [Google Scholar] [CrossRef]
  17. Asare-Bediako, B.; Kling, W.; Ribeiro, P. Multi-agent system architecture for smart home energy optimization. In Proceedings of the 4th IEEE PES International Conference and Exhibition on Innovative Smart Grid Technologies (ISGT Europe), Lyngby, Denmark, 6–9 October 2013; pp. 1–5. [Google Scholar] [CrossRef]
  18. De Paola, A.; Ortolani, M.; Lo Re, G.; Anastasi, G.; Das, S.K. Intelligent Management Systems for Energy Efficiency in Buildings: A Survey. ACM Comput. Surv. 2014, 47, 13. [Google Scholar] [CrossRef]
  19. Beaudin, M.; Zareipour, H. Home energy management systems: A review of modelling and complexity. Renew. Sustain. Energy Rev. 2015, 45, 318–335. [Google Scholar] [CrossRef]
  20. Steg, L.; Shwom, R.; Dietz, T. What drives energy consumers? Engaging people in a sustainable energy transition. IEEE Power Energy Mag. 2018, 16, 20–28. [Google Scholar] [CrossRef]
  21. Shareef, H.; Ahmed, M.S.; Mohamed, A.; Al Hassan, E. Review on Home Energy Management System Considering Demand Responses, Smart Technologies, and Intelligent Controllers. IEEE Access 2018, 6, 24498–24509. [Google Scholar] [CrossRef]
  22. González-Briones, A.; Prieto, J.; De La Prieta, F.; Herrera-Viedma, E.; Corchado, J.M. Energy Optimization Using a Case-Based Reasoning Strategy. Sensors 2018, 18, 865. [Google Scholar] [CrossRef]
  23. Boodi, A.; Beddiar, K.; Benamour, M.; Amirat, Y.; Benbouzid, M. Intelligent Systems for Building Energy and Occupant Comfort Optimization: A State of the Art Review and Recommendations. Energies 2018, 11, 2604. [Google Scholar] [CrossRef]
  24. Becchio, C.; Bertoncini, M.; Boggio, A.; Bottero, M.; Corgnati, S.; Dell’Anna, F. The Impact of Users’ Lifestyle in Zero-Energy and Emission Buildings: An Application of Cost-Benefit Analysis. In New Metropolitan Perspectives; Springer: Cham, Switzerland, 2019; pp. 123–131. [Google Scholar] [CrossRef]
  25. Sardianos, C.; Varlamis, I.; Dimitrakopoulos, G.; Anagnostopoulos, D.; Alsalemi, A.; Bensaali, F.; Himeur, Y.; Amira, A. REHAB-C: Recommendations for Energy HABits Change. Future Gener. Comput. Syst. 2020, 112, 394–407. [Google Scholar] [CrossRef]
  26. Cattaneo, C. Internal and external barriers to energy efficiency: Which role for policy interventions? Energy Effic. 2019, 12, 1293–1311. [Google Scholar] [CrossRef]
  27. McIlvennie, C.; Sanguinetti, A.; Pritoni, M. Of impacts, agents, and functions: An interdisciplinary meta-review of smart home energy management systems research. Energy Res. Soc. Sci. 2020, 68, 101555. [Google Scholar] [CrossRef]
  28. Leitao, J.; Gil, P.; Ribeiro, B.; Cardoso, A. A Survey on Home Energy Management. IEEE Access 2020, 8, 5699–5722. [Google Scholar] [CrossRef]
  29. Alawadi, S.; Mera, D.; Fernández-Delgado, M.; Alkhabbas, F.; Olsson, C.; Davidsson, P. A comparison of machine learning algorithms for forecasting indoor temperature in smart buildings. Energy Syst. 2022, 13, 689–705. [Google Scholar] [CrossRef]
  30. Himeur, Y.; Ghanem, K.; Alsalemi, A.; Bensaali, F.; Amira, A. Artificial intelligence based anomaly detection of energy consumption in buildings: A review, current trends and new perspectives. Appl. Energy 2021, 287, 116601. [Google Scholar] [CrossRef]
  31. Dunne, R.; Morris, T.; Harper, S. A Survey of Ambient Intelligence. ACM Comput. Surv. 2021, 54, 73. [Google Scholar] [CrossRef]
  32. Himeur, Y.; Alsalemi, A.; Al-Kababji, A.; Bensaali, F.; Amira, A.; Sardianos, C.; Dimitrakopoulos, G.; Varlamis, I. A survey of recommender systems for energy efficiency in buildings: Principles, challenges and prospects. Inf. Fusion 2021, 72, 1–21. [Google Scholar] [CrossRef]
  33. Mirnaghi, M.S.; Haghighat, F. Fault detection and diagnosis of large-scale HVAC systems in buildings using data-driven methods: A comprehensive review. Energy Build. 2020, 229, 110492. [Google Scholar] [CrossRef]
  34. Mir, U.; Abbasi, U.; Mir, T.; Kanwal, S.; Alamri, S. Energy Management in Smart Buildings and Homes: Current Approaches, a Hypothetical Solution, and Open Issues and Challenges. IEEE Access 2021, 9, 94132–94148. [Google Scholar] [CrossRef]
  35. Ardabili, S.; Abdolalizadeh, L.; Mako, C.; Torok, B.; Mosavi, A. Systematic review of deep learning and machine learning for building energy consumption and forecasting. Front. Energy Res. 2022, 10, 786027. [Google Scholar] [CrossRef]
  36. Cheng, X.; Li, C.; Liu, X. A review of federated learning in energy systems. arXiv 2022, arXiv:2208.10941. [Google Scholar]
  37. Khan, N.; Shahid, Z.; Alam, M.M.; Bakar Sajak, A.A.; Mazliham, M.; Khan, T.A.; Ali Rizvi, S.S. Energy management systems using smart grids: An exhaustive parametric comprehensive analysis of existing trends, significance, opportunities, and challenges. Int. Trans. Electr. Energy Syst. 2022, 2022, 3358795. [Google Scholar] [CrossRef]
  38. Sari, M.; Berawi, M.A.; Zagloel, T.Y.; Madyaningarum, N.; Miraj, P.; Pranoto, A.R.; Susantono, B.; Woodhead, R. Machine learning-based energy use prediction for the smart building energy management system. J. Inf. Technol. Constr. (ITcon) 2023, 28, 622–645. [Google Scholar] [CrossRef]
  39. Baher, M.; Aziza I., H. Machine Learning Based Energy Management System for Smart Buildings. In Proceedings of the 2025 15th International Conference on Power, Energy, and Electrical Engineering (CPEEE), Fukuoka, Japan, 15–17 February 2025; pp. 344–348. [Google Scholar] [CrossRef]
  40. Bajwa, A.; Jahan, F.; Ahmed, I. AI-enabled smart building management systems: A systematic review of 472 studies. Am. J. Sch. Res. Innov. 2024, 3, 1–27. [Google Scholar] [CrossRef]
  41. Grataloup, A.; Jonas, S.; Meyer, A. A review of federated learning in renewable energy applications: Potential, challenges, and future directions. Energy AI 2024, 17, 100375. [Google Scholar] [CrossRef]
  42. Haghighat, M.; MohammadiSavadkoohi, E.; Shafiabady, N. Applications of Explainable Artificial Intelligence (XAI) and interpretable AI in smart buildings and energy savings: A systematic review. J. Build. Eng. 2025, 107, 112542. [Google Scholar] [CrossRef]
  43. Perera, A.; Kamalaruban, P. Applications of reinforcement learning in energy systems. Renew. Sustain. Energy Rev. 2021, 137, 110618. [Google Scholar] [CrossRef]
  44. Hastie, T.; Tibshirani, R.; Friedman, J.H. The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Springer: New York, NY, USA, 2009; Volume 2. [Google Scholar]
  45. Duda, R.O.; Hart, P.E.; Stork, D.G. Pattern Classification; John Wiley & Sons: Hoboken, NJ, USA, 2006. [Google Scholar]
  46. Aicha, A.N.; Englebienne, G.; Kröse, B. Modeling visit behaviour in smart homes using unsupervised learning. In Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing: Adjunct Publication, Seattle WA, USA, 13–17 September 2014; pp. 1193–1200. [Google Scholar]
  47. Lapalu, J.; Bouchard, K.; Bouzouane, A.; Bouchard, B.; Giroux, S. Unsupervised mining of activities for smart home prediction. Procedia Comput. Sci. 2013, 19, 503–510. [Google Scholar] [CrossRef]
  48. Wu, E.; Zhang, P.; Lu, T.; Gu, H.; Gu, N. Behavior prediction using an improved Hidden Markov Model to support people with disabilities in smart homes. In Proceedings of the 2016 IEEE 20th International Conference on Computer Supported Cooperative Work in Design (CSCWD), Nanchang, China, 4–6 May 2016; pp. 560–565. [Google Scholar]
  49. Tsai, C.W.; Lai, C.F.; Chiang, M.C.; Yang, L.T. Data mining for internet of things: A survey. IEEE Commun. Surv. Tutor. 2013, 16, 77–97. [Google Scholar] [CrossRef]
  50. Song, Y.Y.; Ying, L. Decision tree methods: Applications for classification and prediction. Shanghai Arch. Psychiatry 2015, 27, 130. [Google Scholar] [PubMed]
  51. Rokach, L.; Maimon, O. Decision trees. In Data Mining and Knowledge Discovery Handbook; Springer: New York, NY, USA, 2005; pp. 165–192. [Google Scholar]
  52. Delgado, M.; Ros, M.; Vila, M.A. Correct behavior identification system in a tagged world. Expert Syst. Appl. 2009, 36, 9899–9906. [Google Scholar] [CrossRef]
  53. Viswanathan, M. Distributed data mining in a ubiquitous healthcare framework. In Proceedings of the Advances in Artificial Intelligence: 20th Conference of the Canadian Society for Computational Studies of Intelligence, Canadian AI 2007, Montreal, QC, Canada, 28–30 May 2007; pp. 261–271. [Google Scholar]
  54. Parnandi, A.; Le, K.; Vaghela, P.; Kolli, A.; Dantu, K.; Poduri, S.; Sukhatme, G.S. Coarse in-building localization with smartphones. In Proceedings of the International Conference on Mobile Computing, Applications, and Services, San Diego, CA, USA, 26–29 October 2009; pp. 343–354. [Google Scholar]
  55. Verbert, K.; Babuška, R.; De Schutter, B. Combining knowledge and historical data for system-level fault diagnosis of HVAC systems. Eng. Appl. Artif. Intell. 2017, 59, 260–273. [Google Scholar] [CrossRef]
  56. Desarkar, A.; Das, A. Big-data analytics, machine learning algorithms and scalable/parallel/distributed algorithms. In Internet of Things and Big Data Technologies for Next Generation Healthcare; Springer: Cham, Switzerland, 2017; pp. 159–197. [Google Scholar]
  57. Burbidge, R.; Buxton, B. An Introduction to Support Vector Machines for Data Mining; Keynote Papers, Young OR12; Georgia Institute of Technology: Atlanta, GA, USA, 2001; pp. 3–15. [Google Scholar]
  58. Meyer, D.; Wien, F. Support Vector Machines; Interface Libsvm Package E1071 2015; Springer: Berlin/Heidelberg, Germany, 2021; Volume 28, p. 597. [Google Scholar]
  59. Das, B.; Cook, D.J.; Krishnan, N.C.; Schmitter-Edgecombe, M. One-class classification-based real-time activity error detection in smart homes. IEEE J. Sel. Top. Signal Process. 2016, 10, 914–923. [Google Scholar] [CrossRef] [PubMed]
  60. Zhao, H.; Hua, Q.; Chen, H.B.; Ye, Y.; Wang, H.; Tan, S.X.D.; Tlelo-Cuautle, E. Thermal-sensor-based occupancy detection for smart buildings using machine-learning methods. ACM Trans. Des. Autom. Electron. Syst. (TODAES) 2018, 23, 54. [Google Scholar] [CrossRef]
  61. Tu, J.V. Advantages and disadvantages of using artificial neural networks versus logistic regression for predicting medical outcomes. J. Clin. Epidemiol. 1996, 49, 1225–1231. [Google Scholar] [CrossRef]
  62. Mari, S.; Bucci, G.; Ciancetta, F.; Fiorucci, E.; Fioravanti, A. A Review of Non-Intrusive Load Monitoring Applications in Industrial and Residential Contexts. Energies 2022, 15, 9011. [Google Scholar] [CrossRef]
  63. Xing, F.; Xie, Y.; Su, H.; Liu, F.; Yang, L. Deep learning in microscopy image analysis: A survey. IEEE Trans. Neural Netw. Learn. Syst. 2017, 29, 4550–4568. [Google Scholar] [CrossRef]
  64. Chen, X.W.; Lin, X. Big data deep learning: Challenges and perspectives. IEEE Access 2014, 2, 514–525. [Google Scholar] [CrossRef]
  65. Alsheikh, M.A.; Selim, A.; Niyato, D.; Doyle, L.; Lin, S.; Tan, H.P. Deep activity recognition models with triaxial accelerometers. arXiv 2015, arXiv:1511.04664. [Google Scholar]
  66. Chen, Y.; Xue, Y. A deep learning approach to human activity recognition based on single accelerometer. In Proceedings of the 2015 IEEE International Conference on Systems, Man, and Cybernetics, Hong Kong, China, 9–12 October 2015; pp. 1488–1492. [Google Scholar]
  67. Ronao, C.A.; Cho, S.B. Human activity recognition with smartphone sensors using deep learning neural networks. Expert Syst. Appl. 2016, 59, 235–244. [Google Scholar] [CrossRef]
  68. Cheng, B.C.; Tsai, Y.A.; Liao, G.T.; Byeon, E.S. HMM machine learning and inference for activities of daily living recognition. J. Supercomput. 2010, 54, 29–42. [Google Scholar] [CrossRef]
  69. Chahuara, P.; Fleury, A.; Portet, F.; Vacher, M. On-line human activity recognition from audio and home automation sensors: Comparison of sequential and non-sequential models in realistic Smart Homes. J. Ambient Intell. Smart Environ. 2016, 8, 399–422. [Google Scholar] [CrossRef]
  70. Fu, T. A review on time series data mining. Eng. Appl. Artif. Intell. 2011, 24, 164–181. [Google Scholar] [CrossRef]
  71. Chen, F.; Deng, P.; Wan, J.; Zhang, D.; Vasilakos, A.V.; Rong, X. Data mining for the internet of things: Literature review and challenges. Int. J. Distrib. Sens. Netw. 2015, 11, 431047. [Google Scholar] [CrossRef]
  72. Esling, P.; Agon, C. Time-series data mining. ACM Comput. Surv. (CSUR) 2012, 45, 12. [Google Scholar] [CrossRef]
  73. Suryadevara, N.; Mukhopadhyay, S.; Rayudu, R. Applying SARIMA time series to forecast sleeping activity for wellness model of elderly monitoring in smart home. In Proceedings of the 2012 Sixth International Conference on Sensing Technology (ICST), Kolkata, India, 18–21 December 2012; pp. 157–162. [Google Scholar]
  74. Bamdad, K.; Cholette, M.E.; Bell, J. Building energy optimization using surrogate model and active sampling. J. Build. Perform. Simul. 2020, 13, 760–766. [Google Scholar] [CrossRef]
  75. Böhmer, W.; Obermayer, K. Regression with linear factored functions. In Proceedings of the Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2015, Porto, Portugal, 7–11 September 2015; Part I. pp. 119–134. [Google Scholar]
  76. Brownlee, J. A tour of machine learning algorithms. Mach. Learn. Mastery 2013, 25. Available online: https://machinelearningmastery.com/a-tour-of-machine-learning-algorithms/ (accessed on 27 June 2025).
  77. Chen, X.; Li, X.; Tan, S.X.D. From robust chip to smart building: CAD algorithms and methodologies for uncertainty analysis of building performance. In Proceedings of the 2015 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), Austin, TX, USA, 2–6 November 2015; pp. 457–464. [Google Scholar]
  78. Martin, F. Recsys’09 industrial keynote: Top 10 lessons learned developing deploying and operating real-world recommender systems. In Proceedings of the RecSys ’09: Third ACM Conference on Recommender Systems, New York, NY, USA, 23–25 October 2009; pp. 1–2. [Google Scholar] [CrossRef]
  79. Jurek, A.; Nugent, C.; Bi, Y.; Wu, S. Clustering-based ensemble learning for activity recognition in smart homes. Sensors 2014, 14, 12285–12304. [Google Scholar] [CrossRef]
  80. Alam, M.R.; Reaz, M.B.I.; Ali, M.A.M. A review of smart homes—Past, present, and future. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 2012, 42, 1190–1203. [Google Scholar] [CrossRef]
  81. Rashidi, P.; Cook, D.J. Keeping the resident in the loop: Adapting the smart home to the user. IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. 2009, 39, 949–959. [Google Scholar] [CrossRef]
  82. Rashidi, P.; Cook, D.J.; Holder, L.B.; Schmitter-Edgecombe, M. Discovering activities to recognize and track in a smart environment. IEEE Trans. Knowl. Data Eng. 2010, 23, 527–539. [Google Scholar] [CrossRef]
  83. Nazerfard, E.; Rashidi, P.; Cook, D.J. Using association rule mining to discover temporal relations of daily activities. In Proceedings of the Toward Useful Services for Elderly and People with Disabilities: 9th International Conference on Smart Homes and Health Telematics, ICOST 2011, Montreal, QC, Canada, 20–22 June 2011; pp. 49–56. [Google Scholar]
  84. Rokach, L.; Maimon, O. Clustering methods. In Data Mining and Knowledge Discovery Handbook; Springer: New York, NY, USA, 2005. [Google Scholar]
  85. Fahad, L.G.; Tahir, S.F.; Rajarajan, M. Activity recognition in smart homes using clustering based classification. In Proceedings of the 2014 22nd International Conference on Pattern Recognition, Stockholm, Sweden, 24–28 August 2014; pp. 1348–1353. [Google Scholar]
  86. Fiorini, L.; Cavallo, F.; Dario, P.; Eavis, A.; Caleb-Solly, P. Unsupervised machine learning for developing personalised behaviour models using activity data. Sensors 2017, 17, 1034. [Google Scholar] [CrossRef] [PubMed]
  87. Tan, P.N.; Steinbach, M.; Kumar, V. Introduction to Data Mining; Addison-Wesley Longman Publishing Co., Inc.: Boston, MA, USA, 2005. [Google Scholar]
  88. Kang, K.J.; Ka, B.; Kim, S.J. A service scenario generation scheme based on association rule mining for elderly surveillance system in a smart home environment. Eng. Appl. Artif. Intell. 2012, 25, 1355–1364. [Google Scholar] [CrossRef]
  89. Witten, I.H.; Frank, E.; Hall, M.A. Chapter 2—Input: Concepts, Instances, and Attributes. In Data Mining: Practical Machine Learning Tools and Techniques, 3rd ed.; Witten, I.H., Frank, E., Hall, M.A., Eds.; The Morgan Kaufmann Series in Data Management Systems; Morgan Kaufmann: Boston, MA, USA, 2011; pp. 39–60. [Google Scholar] [CrossRef]
  90. Fahmi, P.A.; Viet, V.; Deok-Jai, C. Semi-supervised fall detection algorithm using fall indicators in smartphone. In Proceedings of the 6th International Conference on Ubiquitous Information Management and Communication, Kuala Lumpur, Malaysia, 20–22 February 2012; pp. 1–9. [Google Scholar]
  91. Perera, C.; Liu, C.H.; Jayawardena, S.; Chen, M. A Survey on Internet of Things From Industrial Market Perspective. IEEE Access 2014, 2, 1660–1679. [Google Scholar] [CrossRef]
  92. Dayan, P.; Niv, Y. Reinforcement learning: The good, the bad and the ugly. Curr. Opin. Neurobiol. 2008, 18, 185–196. [Google Scholar] [CrossRef] [PubMed]
  93. Cheng, Z.; Zhao, Q.; Wang, F.; Jiang, Y.; Xia, L.; Ding, J. Satisfaction based Q-learning for integrated lighting and blind control. Energy Build. 2016, 127, 43–55. [Google Scholar] [CrossRef]
  94. Qolomany, B.; Al-Fuqaha, A.; Gupta, A.; Benhaddou, D.; Alwajidi, S.; Qadir, J.; Fong, A.C. Leveraging machine learning and big data for smart buildings: A comprehensive survey. IEEE Access 2019, 7, 90316–90356. [Google Scholar] [CrossRef]
  95. Deb, C.; Zhang, F.; Yang, J.; Lee, S.E.; Shah, K.W. Review on the impact of data collection systems in intelligent buildings for enhancing energy efficiency. Energy Build. 2014, 85, 607–619. [Google Scholar]
  96. Wirth, R.; Hipp, J. CRISP-DM: Towards a standard process model for data mining. In Proceedings of the 4th International Conference on the Practical Applications of Knowledge Discovery and Data Mining, Manchester, UK, 11–13 April 2000; pp. 29–39. [Google Scholar]
  97. Guyon, I.; Elisseeff, A. An introduction to variable and feature selection. J. Mach. Learn. Res. 2003, 3, 1157–1182. [Google Scholar]
  98. Machlev, R.; Heistrene, L.; Perl, M.; Levy, K.; Belikov, J.; Mannor, S.; Levron, Y. Explainable Artificial Intelligence (XAI) techniques for energy and power systems: Review, challenges and opportunities. Energy AI 2022, 9, 100169. [Google Scholar] [CrossRef]
  99. Voyant, C.; Notton, G.; Kalogirou, S.; Nivet, M.L.; Paoli, C.; Motte, F.; Fouilloy, A. Machine learning methods for solar radiation forecasting: A review. Renew. Energy 2017, 105, 569–582. [Google Scholar] [CrossRef]
  100. Allal, Z.; Noura, H.N.; Chahine, K. Machine Learning Algorithms for Solar Irradiance Prediction: A Recent Comparative Study. E-Prime Adv. Electr. Eng. Electron. Energy 2024, 7, 100453. [Google Scholar] [CrossRef]
  101. Tso, G.K.; Yau, K.K. Predicting electricity energy consumption: A comparison of regression analysis, decision tree and neural networks. Energy 2007, 32, 1761–1768. [Google Scholar] [CrossRef]
  102. Allal, Z.; Noura, H.N.; Salman, O.; Chahine, K. Power consumption prediction in warehouses using variational autoencoders and tree-based regression models. Energy Built Environ. 2024. [Google Scholar] [CrossRef]
  103. Chahine, K.; Drissi, K.E.K.; Pasquier, C.; Kerroum, K.; Faure, C.; Jouannet, T.; Michou, M. Electric Load Disaggregation in Smart Metering Using a Novel Feature Extraction Method and Supervised Classification. Energy Procedia 2011, 6, 627–632. [Google Scholar] [CrossRef]
  104. Chahine, K. Towards automatic setup of non intrusive appliance load monitoring— feature extraction and clustering. Int. J. Electr. Comput. Eng. (IJECE) 2019, 9, 1002. [Google Scholar] [CrossRef]
  105. Zoha, A.; Gluhak, A.; Imran, M.A.; Rajasegarar, S. Non-intrusive load monitoring approaches for disaggregated energy sensing: A survey. Sensors 2012, 12, 16838–16866. [Google Scholar] [CrossRef]
  106. Ruelens, F.; Claessens, B.J.; Vandael, S.; Schutter, B.D.; Babuška, R.; Belmans, R. Residential demand response of thermostatically controlled loads using batch reinforcement learning. IEEE Trans. Smart Grid 2016, 8, 2149–2159. [Google Scholar] [CrossRef]
  107. Alanne, K.; Sierla, S. An overview of machine learning applications for smart buildings. Sustain. Cities Soc. 2022, 76, 103445. [Google Scholar] [CrossRef]
  108. Rathor, S.K.; Saxena, D. Energy management system for smart grid: An overview and key issues. Int. J. Energy Res. 2020, 44, 4067–4109. [Google Scholar] [CrossRef]
  109. Wei, Y.; Zhang, X.; Shi, Y.; Xia, L.; Pan, S.; Wu, J.; Han, M.; Zhao, X. A review of data-driven approaches for prediction and classification of building energy consumption. Renew. Sustain. Energy Rev. 2018, 82, 1027–1047. [Google Scholar] [CrossRef]
  110. Chalal, M.L.; Benachir, M.; White, M.; Shrahily, R. Energy planning and forecasting approaches for supporting physical improvement strategies in the building sector: A review. Renew. Sustain. Energy Rev. 2016, 64, 761–776. [Google Scholar] [CrossRef]
  111. Li, K.; Xue, W.; Tan, G.; Denzer, A. A state of the art review on the prediction of building energy consumption using data-driven technique and evolutionary algorithms. Build. Serv. Eng. Res. Technol. 2020, 41, 108–127. [Google Scholar] [CrossRef]
  112. Chen, G.; Lu, S.; Zhou, S.; Tian, Z.; Kim, M.; Liu, J.; Liu, X. A Systematic Review of Building Energy Consumption Prediction: From Perspectives of Load Classification, Data-Driven Frameworks, and Future Directions. Appl. Sci. 2025, 15, 3086. [Google Scholar] [CrossRef]
  113. Murdoch, W.J.; Singh, C.; Kumbier, K.; Abbasi-Asl, R.; Yu, B. Definitions, methods, and applications in interpretable machine learning. Proc. Natl. Acad. Sci. USA 2019, 116, 22071–22080. [Google Scholar] [CrossRef] [PubMed]
  114. Sajjadi, S.; Shamshirband, S.; Alizamir, M.; Yee, P.L.; Mansor, Z.; Manaf, A.A.; Altameem, T.A.; Mostafaeipour, A. Extreme learning machine for prediction of heat load in district heating systems. Energy Build. 2016, 122, 222–227. [Google Scholar] [CrossRef]
  115. Ahmad, T.; Chen, H. Short and medium-term forecasting of cooling and heating load demand in building environment with data-mining based approaches. Energy Build. 2018, 166, 460–476. [Google Scholar] [CrossRef]
  116. Ciulla, G.; D’Amico, A. Building energy performance forecasting: A multiple linear regression approach. Appl. Energy 2019, 253, 113500. [Google Scholar] [CrossRef]
  117. Fan, C.; Xiao, F.; Zhao, Y. A short-term building cooling load prediction method using deep learning algorithms. Appl. Energy 2017, 195, 222–233. [Google Scholar] [CrossRef]
  118. Cai, M.; Pipattanasomporn, M.; Rahman, S. Day-ahead building-level load forecasts using deep learning vs. traditional time-series techniques. Appl. Energy 2019, 236, 1078–1088. [Google Scholar] [CrossRef]
  119. Yu, Z.; Haghighat, F.; Fung, B.C.; Yoshino, H. A decision tree method for building energy demand modeling. Energy Build. 2010, 42, 1637–1646. [Google Scholar] [CrossRef]
  120. Wang, Z.; Srinivasan, R.S.; Shi, J. Artificial Intelligent Models for Improved Prediction of Residential Space Heating. J. Energy Eng. 2016, 142, 04016006. [Google Scholar] [CrossRef]
  121. Smarra, F.; Jain, A.; De Rubeis, T.; Ambrosini, D.; D’Innocenzo, A.; Mangharam, R. Data-driven model predictive control using random forests for building energy optimization and climate control. Appl. Energy 2018, 226, 1252–1272. [Google Scholar] [CrossRef]
  122. Sun, Y.; Haghighat, F.; Fung, B.C. A review of the-state-of-the-art in data-driven approaches for building energy prediction. Energy Build. 2020, 221, 110022. [Google Scholar] [CrossRef]
  123. Hu, R.; Granderson, J.; Auslander, D.; Agogino, A. Design of machine learning models with domain experts for automated sensor selection for energy fault detection. Appl. Energy 2019, 235, 117–128. [Google Scholar] [CrossRef]
  124. Shahidi, M.; Eicker, U.; Nik-Bakht, M. The effect of Urban-scale shading on building energy modelling results of an educational building in Montréal. In Proceedings of the Building Simulation 2023: 18th Conference of IBPSA, Shanghai, China, 4–6 September 2023; Volume 18, pp. 2847–2854. [Google Scholar] [CrossRef]
  125. Ma, R.; Fang, D.; Chen, J.; Li, X. A tiled multi-city urban objects dataset for city-scale building energy simulation. Sci. Data 2023, 10, 352. [Google Scholar] [CrossRef]
  126. Estebsari, A.; Rajabi, R. Single Residential Load Forecasting Using Deep Learning and Image Encoding Techniques. Electronics 2020, 9, 68. [Google Scholar] [CrossRef]
  127. Zhao, X.; Askari, H.; Chen, J. Nanogenerators for smart cities in the era of 5G and Internet of Things. Joule 2021, 5, 1391–1431. [Google Scholar] [CrossRef]
  128. Premnath, S.N.; Haas, Z.J. Security and Privacy in the Internet-of-Things Under Time-and-Budget-Limited Adversary Model. IEEE Wirel. Commun. Lett. 2015, 4, 277–280. [Google Scholar] [CrossRef]
  129. Isaac, M.; van Vuuren, D.P. Modeling global residential sector energy demand for heating and air conditioning in the context of climate change. Energy Policy 2009, 37, 507–521. [Google Scholar] [CrossRef]
  130. Baccarelli, E.; Naranjo, P.G.V.; Scarpiniti, M.; Shojafar, M.; Abawajy, J.H. Fog of Everything: Energy-Efficient Networked Computing Architectures, Research Challenges, and a Case Study. IEEE Access 2017, 5, 9882–9910. [Google Scholar] [CrossRef]
  131. Kim, H.J.; Jeong, C.M.; Sohn, J.M.; Joo, J.Y.; Donde, V.; Ko, Y.; Yoon, Y.T. A comprehensive review of practical issues for interoperability using the common information model in smart grids. Energies 2020, 13, 1435. [Google Scholar] [CrossRef]
  132. Li, Y.; Dai, W.; Ming, Z.; Qiu, M. Privacy Protection for Preventing Data Over-Collection in Smart City. IEEE Trans. Comput. 2016, 65, 1339–1350. [Google Scholar] [CrossRef]
  133. Azar, E.; O’Brien, W.; Carlucci, S.; Hong, T.; Sonta, A.; Kim, J.; Andargie, M.S.; Abuimara, T.; El Asmar, M.; Jain, R.K.; et al. Simulation-aided occupant-centric building design: A critical review of tools, methods, and applications. Energy Build. 2020, 224, 110292. [Google Scholar] [CrossRef]
  134. Jiang, W.; Meng, Q.; Zhang, H.; Xu, X.; Zhao, J. Energy-efficient building design optimization based on an improved multi-objective genetic algorithm. Sustainability 2018, 10, 3566. [Google Scholar] [CrossRef]
Figure 1. Uses of ML in building energy efficiency.
Figure 2. IEMS architecture and model examples.
Figure 3. The stages of ML implementation in energy management systems.
Figure 4. The PRISMA diagram of machine learning studies on smart-building energy management.
Figure 5. IEMS challenges and limitations.
Table 1. Performance metrics.
| Metric Name | Description | Importance | Formula | Typical Values |
| --- | --- | --- | --- | --- |
| RMSE (Root Mean Square Error) | Measures the average magnitude of the errors between predicted and actual values. | Indicates the model’s accuracy in energy prediction; lower values signify better performance. | $\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$ | 0–10% of range of target variable |
| MAE (Mean Absolute Error) | Average of the absolute errors between predicted and actual values. | Shows the model’s precision in energy prediction without considering error direction. | $\frac{1}{n}\sum_{i=1}^{n}\lvert y_i - \hat{y}_i\rvert$ | 0–10% of range of target variable |
| Accuracy | Proportion of correctly predicted instances to total instances. | Important for classification tasks in energy management, like fault detection. | $\frac{\text{Number of correct predictions}}{\text{Total number of predictions}}$ | 90–100% |
| F1 Score | Harmonic mean of precision and recall. | Balances the trade-off between precision and recall in classification models. | $\frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}}$ | 0.7–1.0 |
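The four metrics in Table 1 can be computed directly from paired actual and predicted values. The following is a minimal sketch in plain Python; the function names and the sample readings are illustrative assumptions and are not taken from any of the reviewed studies.

```python
import math

def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared residual."""
    n = len(actual)
    return math.sqrt(sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / n)

def mae(actual, predicted):
    """Mean absolute error: mean of the absolute residuals."""
    n = len(actual)
    return sum(abs(y - y_hat) for y, y_hat in zip(actual, predicted)) / n

def accuracy(actual_labels, predicted_labels):
    """Share of predictions that match the true label."""
    correct = sum(1 for a, p in zip(actual_labels, predicted_labels) if a == p)
    return correct / len(actual_labels)

def f1_score(actual_labels, predicted_labels, positive=1):
    """Harmonic mean of precision and recall for the chosen positive class."""
    pairs = list(zip(actual_labels, predicted_labels))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical hourly load values (kWh) for a regression task
actual_kwh = [10.0, 12.0, 14.0, 16.0]
predicted_kwh = [11.0, 11.0, 15.0, 15.0]

# Hypothetical fault labels (1 = fault, 0 = normal) for a classification task
actual_fault = [1, 0, 1, 1, 0, 0]
predicted_fault = [1, 0, 0, 1, 1, 0]

print(rmse(actual_kwh, predicted_kwh), mae(actual_kwh, predicted_kwh))
print(accuracy(actual_fault, predicted_fault), f1_score(actual_fault, predicted_fault))
```

RMSE penalizes large individual errors more heavily than MAE, so reporting both gives a fuller picture of forecast quality.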
Table 2. Previous work related to intelligent energy management systems.
| Reference | Year | Description |
| --- | --- | --- |
| [16] | 2012 | Examination of current issues in recommendation systems |
| [17] | 2013 | Proposal of multi-agent architectures for energy management |
| [18] | 2014 | Presentation of architectural, technological, and algorithmic aspects of IEMS |
| [19] | 2015 | Analysis of modeling techniques for home energy management systems |
| [20] | 2017 | Overview of the main components of IoT systems |
| [21] | 2018 | Study of smart technologies, scheduling controllers, and demand response methods |
| [22] | 2018 | Applications of multi-agent systems as modeling tools for energy optimization |
| [23] | 2018 | Study of three model types of building energy management systems |
| [24] | 2018 | The significance of changing user behavior |
| [25] | 2019 | Micro-moments experienced by users as determinants of energy usage patterns |
| [26] | 2019 | Discussion of obstacles to the adoption of energy-efficient technologies |
| [27] | 2020 | Discussion of user-centric and technologically oriented perspectives on energy management systems |
| [28] | 2020 | Modern analysis of residential building energy management systems |
| [29] | 2020 | Compilation of the most recent research directions and developments in intelligent energy management |
| [30] | 2021 | Comprehensive review of AI-based anomaly detection in building energy use |
| [31] | 2021 | Overview of the ambient intelligence field |
| [32] | 2021 | Analysis of building energy efficiency recommendation systems |
| [33] | 2021 | Modern approaches to energy management |
| [34] | 2021 | Coverage of energy monitoring, optimization, and renewable integration |
| [35] | 2022 | Review of ML and DL methods for energy prediction in buildings |
| [36] | 2022 | Federated learning survey covering energy system applications, including smart buildings |
| [37] | 2022 | Comprehensive study of energy management systems (EMS) in the context of smart grids |
| [38] | 2023 | Energy use prediction model using ML and real-world smart building monitoring |
| [39] | 2023 | Study of ML-based EMS specifically designed for smart buildings |
| [40] | 2024 | Large-scale review of AI-enabled smart building management systems (472 studies) |
| [41] | 2024 | Federated learning in renewable energy, including integration in smart buildings |
| [42] | 2025 | XAI applications in smart buildings and energy decision systems |
Table 3. ML techniques in energy management systems.
| Category | Type | Algorithms | Pros | Cons | Applicability in IEMS |
| --- | --- | --- | --- | --- | --- |
| Supervised Learning | Classification | Neural networks | Requires little statistical training; can detect complex non-linear relationships | Computational burden; prone to overfitting; picking the correct topology is difficult; training can take a long time and a lot of data | Classification, control and automation of home appliances, next step/action prediction |
| Supervised Learning | Classification | SVM | Can avoid overfitting through regularization; can incorporate expert knowledge through appropriate kernels | N/A | Classification and regression problems in SBs such as activity recognition, human tracking, energy efficiency services |
| Supervised Learning | Classification | Bayesian networks | Very simple representation does not allow for rich hypotheses | Should be trained on a large training set | Energy management systems and human activity recognition |
| Supervised Learning | Classification | Decision trees | Non-parametric algorithm that is easy to interpret and explain | Can easily overfit | Patient monitoring, healthcare services, awareness and notification services |
| Supervised Learning | Classification | Hidden Markov | Flexible generalization of sequence profiles; can handle variations in record structure | Requires training using annotated data; many unstructured parameters | Recognition and classification of daily living activities |
| Supervised Learning | Classification | Deep Learning | Enables learning of features rather than hand tuning; reduces the need for feature engineering | Requires a very large amount of labeled data; computationally very expensive; extremely hard to tune | Modeling occupant behavior; human voice recognition and monitoring systems; context-aware SB services |
| Supervised Learning | Regression | Orthogonal matching pursuit | Fast | Can go seriously wrong if there are severe outliers or influential cases | Regression problems such as energy efficiency services in SBs |
| Supervised Learning | Regression | Cluster-based | Straightforward to understand and explain; can be regularized to avoid overfitting | Not flexible enough to capture complex patterns | Gesture recognition |
| Supervised Learning | Ensemble methods | N/A | Increased model accuracy through averaging as the number of models increases | Difficulties in interpreting decisions; large computational requirements | Human activity recognition and energy efficiency services |
| Supervised Learning | Time series | N/A | N/A | Model identification is difficult; traditional measures may be inappropriate for TS designs; generalizability cannot be inferred from a single study | Occupant comfort services and energy efficiency services in SBs |
| Unsupervised Learning | Clustering | KNN | Simplicity; sufficient for basic problems; robust to noisy training data | High computation cost; lazy learner | Human activity recognition |
| Unsupervised Learning | Clustering | K-pattern clustering | Simple; easy to implement and interpret; fast and computationally efficient | Only locally optimal and sensitive to initial points; difficult to predict the K value | Predicting user activities in smart environments |
| Unsupervised Learning | Clustering | Others | N/A | N/A | N/A |
| Semi-Supervised Learning | N/A | N/A | Overcomes the main problem of supervised learning: not having enough labeled data | False labeling problems; incapable of utilizing out-of-domain samples | Context-aware services such as health monitoring and elderly care services |
| Reinforcement Learning | N/A | N/A | Uses deeper knowledge about the domain | Must have (or learn) a model of the environment; must know where actions lead in order to evaluate actions | Lighting control services; learning occupants’ preferences for music and lighting services |
Table 4. Data handling techniques in ML applications.
| Technique | Description | Application in Smart Buildings |
| --- | --- | --- |
| Data Collection | Gathering raw data from various sources such as sensors, IoT devices, and user inputs | Collecting data on energy usage, temperature, occupancy, etc. |
| Data Preprocessing | Cleaning and preparing data for analysis, including handling missing values, noise reduction, and normalization | Ensuring the quality and consistency of data used for energy prediction and management |
| Feature Extraction | Identifying and selecting relevant features from the dataset that contribute most to the prediction outcome | Features like time of day, weather conditions, and occupancy levels are crucial for energy management algorithms |
| Feature Engineering | Creating new features from existing data to improve the predictive power of the ML model | Deriving patterns of energy consumption or creating aggregated metrics from sensor data |
| Data Augmentation | Enhancing the training dataset by creating new data points from existing data to improve model performance | Better learning patterns of energy usage under varied conditions |
| Dimensionality Reduction | Reducing the number of variables under consideration to focus on the most important information | Using techniques like PCA to reduce data complexity and improve model efficiency |
| Data Integration | Combining data from different sources to provide a unified view | Integrating data from energy systems, HVAC, and lighting controls for holistic energy management |
| Data Cleaning | Removing inaccuracies and inconsistencies to improve data quality | Ensuring accurate and reliable data for managing and optimizing systems |
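Three of the steps in Table 4 (handling missing values, normalization, and feature engineering) can be illustrated with a short sketch. It is written in plain Python under simple assumptions: forward fill is only one of several possible imputation strategies, the cyclic hour encoding is a common but not mandatory choice, and the helper names and sample readings are hypothetical.

```python
import math

def forward_fill(values):
    """Replace missing readings (None) with the last observed value.

    Leading gaps, which have no earlier reading, take the first observed value.
    """
    known = [v for v in values if v is not None]
    if not known:
        raise ValueError("no observed values to fill from")
    filled = []
    last = known[0]
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

def min_max_normalize(values):
    """Rescale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant series carries no scale information
    return [(v - lo) / (hi - lo) for v in values]

def hour_features(hour):
    """Encode the hour of day on a circle so that 23:00 and 00:00 are close."""
    angle = 2 * math.pi * hour / 24
    return math.sin(angle), math.cos(angle)

# Hypothetical temperature readings (deg C) with sensor dropouts
readings = [None, 20.0, None, 24.0, 22.0]
filled = forward_fill(readings)
scaled = min_max_normalize(filled)
print(filled)   # [20.0, 20.0, 20.0, 24.0, 22.0]
print(scaled)   # [0.0, 0.0, 0.0, 1.0, 0.5]
```

Forward fill suits slowly varying signals such as room temperature; for longer gaps or fast-changing loads, interpolation or model-based imputation may be more appropriate.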
Table 5. Data sources for energy prediction in smart buildings.
| Data Source | Data Type | Frequency | Usage in ML |
| --- | --- | --- | --- |
| Sensors | Energy Consumption, Temperature, Humidity | Real-time | Training, Real-time Prediction |
| Smart Meters | Energy Consumption, Power Quality | Hourly, Daily | Training, Monitoring, Anomaly Detection |
| IoT Devices | Operational Parameters, Environmental Conditions | Real-time | Real-time Prediction, Monitoring |
| Weather Stations | Temperature, Solar Radiation, Wind Speed | Hourly, Daily | Training, Prediction Modeling |
| Building Management System (BMS) | HVAC Performance, Lighting, Occupancy | Real-time, Hourly | Training, Optimization, Control |
| Historical Energy Data | Past Energy Consumption, Utility Bills | Monthly, Yearly | Training, Trend Analysis, Forecasting |
Table 6. Real-world applications of ML in smart buildings.
| Application | Purpose | Description |
| --- | --- | --- |
| Energy Efficiency | Reduce energy use and cost | ML algorithms optimize building systems to consume less energy while maintaining comfort and functionality. |
| Predictive Maintenance | Prevent equipment failures | ML models analyze historical and real-time data to predict and prevent equipment failures, reducing downtime and maintenance costs. |
| Load Forecasting | Manage energy demand | ML techniques are used to predict the building’s energy load, facilitating more efficient energy management and planning. |
| NILM | Monitor energy usage | NILM uses ML to disaggregate total energy usage data into appliance-specific consumption without individual sensors. |
| HVAC Optimization | Optimize climate control | ML optimizes HVAC operations to improve comfort and energy efficiency based on real-time data and usage patterns. |
| Lighting Control | Automate and optimize lighting | ML-driven systems adjust lighting based on occupancy and ambient light levels to improve energy savings and comfort. |
| Occupancy Detection | Understand space utilization | ML algorithms analyze sensor data to detect and predict occupancy patterns for better building management and energy use. |
| Energy Management | Integrate renewable sources | ML aids in managing and optimizing the use of renewable energy within smart buildings, promoting sustainability. |
| Security Enhancement | Enhance safety and security | ML improves building security through advanced surveillance, anomaly detection, and access control systems. |
| Real-time Control | Improve operational efficiency | ML enables real-time control and adjustment of building systems, enhancing responsiveness and efficiency. |
Table 7. Comparative analysis of prediction techniques in smart building management.
| Technique | Strengths | Limitations | Best Use Cases |
| --- | --- | --- | --- |
| Linear Regression | Simple and interpretable | Struggles with non-linearity and complex interactions | Trend analysis and straightforward prediction tasks |
| Time Series Analysis | Effective for temporal data patterns | Less effective for abrupt changes | Energy usage and occupancy trend forecasting |
| SVM | Handles non-linear data well | Computationally intensive for large datasets | Anomaly detection in energy consumption |
| Decision Trees | Easy to interpret | Can overfit on complex models | Decision-making tasks like energy distribution |
| Random Forests | Reduces overfitting, improves accuracy | Complex and computationally intensive | Complex decision-making and predictive maintenance |
| Neural Networks | Can model complex patterns | Requires extensive data and computational resources | High-dimensional data analysis, such as pattern detection |
| Deep Learning | Advanced capabilities for complex data | Difficult to interpret and requires a lot of data | Advanced pattern recognition, like in security systems |
| Clustering Algorithms | Good for data segmentation | Not predictive, but exploratory | Segmenting buildings/zones for tailored energy strategies |
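As an illustration of the simplest technique in Table 7, the sketch below fits a one-variable linear regression by ordinary least squares and uses it for a straightforward prediction. The data are synthetic and exactly linear by construction; the variable names and the relation between outdoor temperature and cooling load are assumptions made only for the example.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variance; the slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x_new, slope, intercept):
    """Evaluate the fitted line at a new input."""
    return slope * x_new + intercept

# Synthetic data: outdoor temperature (deg C) and cooling load (kWh),
# generated from load = 2 * temperature + 5 for illustration only
outdoor_temp = [20.0, 22.0, 24.0, 26.0, 28.0]
cooling_load = [45.0, 49.0, 53.0, 57.0, 61.0]

slope, intercept = fit_line(outdoor_temp, cooling_load)
print(slope, intercept)                 # 2.0 5.0
print(predict(30.0, slope, intercept))  # 65.0
```

The fitted slope is directly interpretable as additional load per degree, which is the interpretability advantage listed in the table; real building loads are rarely this linear, which is the corresponding limitation.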
Table 8. Comparative evaluation of ML implementations in smart buildings.
| Entity | Location | Climate Zone | Building Typology | ML Approach/Tech | Reported Outcome |
| --- | --- | --- | --- | --- | --- |
| Google DeepMind | Global (Data Centers) | Temperate | Commercial/Industrial | Deep RL for cooling optimization | Up to 40% cooling energy reduction |
| Stanford Univ. | California, USA | Mediterranean | Campus/Mixed-use | HVAC demand forecasting using regression-based ML | 10% overall campus energy savings |
| The Edge (Deloitte) | Amsterdam, NL | Oceanic | Office/Smart building | Real-time sensor data + decision trees for lighting/HVAC | 70% energy consumption reduction |
| Siemens Navigator | Global Clients | Variable | Commercial/Industrial | Predictive maintenance using supervised ML | Reduced faults and energy waste (exact figures vary) |
| Copenhagen Smart City | Copenhagen, DK | Oceanic | Mixed-use city-scale infrastructure | IoT + ML-based urban demand forecasting | Enabled carbon neutrality roadmap (by 2025) |
| Carnegie Mellon University | Pittsburgh, USA | Humid Continental | Academic/Office | Reinforcement Learning for real-time HVAC control | 20–25% improvement in operational efficiency |
| NTU EcoCampus | Singapore | Tropical | Academic/Research campus | SVM + ANN hybrid for appliance-level energy prediction | 28% building energy savings |
Table 10. Challenges and opportunities in ML for smart buildings.
| Challenge/Opportunity | Description | Impact on Energy Prediction and Management | Proposed Solutions/Directions for Future Research |
| --- | --- | --- | --- |
| Data Quality and Availability | Inconsistent or incomplete data can limit the effectiveness of ML models. | Affects the accuracy of energy consumption forecasts and optimization strategies. | Develop robust data collection and preprocessing methods; explore techniques to handle missing or noisy data. |
| Integration with Existing Systems | Challenges in integrating ML solutions with legacy building management systems. | Can hinder the deployment of advanced energy management solutions. | Create adaptable ML models and middleware solutions that can interface with various systems and protocols. |
| Scalability | The need for ML models that can scale with the growing data volumes and complexity of smart buildings. | Affects the ability to maintain efficiency as buildings and systems evolve. | Focus on scalable algorithms and cloud-based solutions that can grow with system demands. |
| Energy Efficiency vs. User Comfort | Balancing energy savings with occupant comfort and convenience. | Optimal energy management strategies may conflict with user comfort preferences. | Investigate multi-objective optimization techniques; incorporate user feedback and adaptive learning algorithms. |
| Cybersecurity and Privacy | Protecting the integrity and confidentiality of data used in ML applications. | Essential for maintaining trust in energy management systems and protecting sensitive information. | Develop secure ML frameworks and privacy-preserving algorithms; emphasize cybersecurity in system design. |
| Regulatory and Compliance Issues | Adhering to evolving standards and regulations regarding energy use and data privacy. | Compliance with regulations can limit the scope of ML applications in energy management. | Engage in policy development; design flexible ML systems that can adapt to regulatory changes. |
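The trade-off between energy efficiency and user comfort listed in Table 10 is often approached with multi-objective optimization. One of the simplest variants, weighted-sum scalarization, is sketched below for choosing a cooling setpoint. The energy and discomfort models, the constants, and all names are toy assumptions introduced for illustration; they do not come from the reviewed studies.

```python
OUTDOOR_TEMP = 32.0    # assumed outdoor temperature (deg C)
PREFERRED_TEMP = 23.0  # assumed occupant-preferred temperature (deg C)

def energy_cost(setpoint):
    """Toy cooling-energy proxy: grows with the gap between outdoor air and setpoint."""
    return max(0.0, OUTDOOR_TEMP - setpoint)

def discomfort(setpoint):
    """Toy discomfort proxy: squared deviation from the preferred temperature."""
    return (setpoint - PREFERRED_TEMP) ** 2

def best_setpoint(candidates, energy_weight):
    """Pick the candidate minimizing the weighted sum of both objectives.

    energy_weight = 0 considers comfort only; energy_weight = 1 considers energy only.
    """
    def total_cost(s):
        return energy_weight * energy_cost(s) + (1.0 - energy_weight) * discomfort(s)
    return min(candidates, key=total_cost)

# Candidate setpoints from 21.0 to 27.0 deg C in 0.5 deg C steps
candidates = [21.0 + 0.5 * i for i in range(13)]

print(best_setpoint(candidates, 0.0))  # 23.0: comfort only
print(best_setpoint(candidates, 0.5))  # 23.5: balanced
print(best_setpoint(candidates, 0.8))  # 25.0: energy prioritized
```

Sweeping the weight traces out a set of compromise solutions; in practice the weight could be adapted from occupant feedback, as the table suggests.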
Table 11. Future trends in ML for energy management in smart buildings.
| Trend | Description | Potential Impact on Smart Buildings | Current State of Research/Implementation |
| --- | --- | --- | --- |
| Advanced Predictive Analytics | Utilizing complex algorithms to analyze data and predict future energy usage patterns | Allows for more accurate energy demand forecasting and optimized energy distribution | Experimental phase, with some deployments in high-tech buildings |
| ML in HVAC Optimization | Implementing ML algorithms to control HVAC systems more efficiently | Reduces energy consumption and costs while maintaining comfort levels | Tested and used in commercial buildings with significant energy savings reported |
| Energy Management Through Reinforcement Learning | Using reinforcement learning to make real-time decisions about energy usage based on current conditions | Enhances the building’s ability to adapt to changes and optimize energy use dynamically | Research stage, with some pilot programs in smart buildings |
| Integration with Renewable Energy Sources | Combining ML with the management of renewable energy sources like solar panels and wind turbines | Improves the efficiency of renewable energy usage and reduces dependence on traditional power grids | Early integration stages, with increasing interest in sustainable building technologies |
| Autonomous Building Management Systems | Fully automated systems that manage all aspects of a building’s operation using ML | Minimizes human intervention, maximizes efficiency, and reduces operational costs | Conceptual and developmental stage, with some features being tested in smart buildings |
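To make the reinforcement learning trend in Table 11 concrete, the sketch below trains a tabular Q-learning agent on a toy cooling problem. Everything about the environment is an assumption made for illustration: five discrete temperature bands, a deterministic drift of one band per step, and a reward that penalizes distance from a comfort band plus a small cost for running the cooler. It is not a model of any system described in the reviewed literature.

```python
import random

N_STATES = 5          # temperature bands: 0 = coldest ... 4 = hottest (assumed)
TARGET = 2            # comfort band (assumed)
ACTIONS = (0, 1)      # 0 = cooling off, 1 = cooling on
ENERGY_PENALTY = 0.1  # assumed cost of one cooling step

def step(state, action):
    """Toy dynamics: cooling lowers the band by one, otherwise it drifts up by one."""
    if action == 1:
        next_state = max(0, state - 1)
    else:
        next_state = min(N_STATES - 1, state + 1)
    reward = -abs(next_state - TARGET) - ENERGY_PENALTY * action
    return next_state, reward

def train(steps=20000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    state = TARGET
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)                      # explore
        else:
            action = max(ACTIONS, key=lambda a: q[state][a])  # exploit
        next_state, reward = step(state, action)
        # Temporal-difference update toward reward plus discounted best next value
        td_target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (td_target - q[state][action])
        state = next_state
    return q

def greedy_policy(q):
    """Best known action for every temperature band."""
    return [max(ACTIONS, key=lambda a: row[a]) for row in q]

q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

In this toy setting the learned policy switches cooling off in the cold bands and on in the hot bands. Real HVAC control involves continuous states, delays, and stochastic disturbances, which is why the table places such systems at the research and pilot stage.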
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

El Husseini, F.; Noura, H.N.; Salman, O.; Chahine, K. Machine Learning in Smart Buildings: A Review of Methods, Challenges, and Future Trends. Appl. Sci. 2025, 15, 7682. https://doi.org/10.3390/app15147682
