This section discusses the need for energy consumption forecasting models in the context of building energy management systems. In addition, the novelty and contributions of this research are presented.
1.1. Motivation
One of the main current challenges is the efficient consumption of energy, for economic and environmental reasons, among others. Massive energy consumption entails higher economic costs and a greater impact on the environment. However, thanks to the evolution of technology, it is possible to develop smart energy management systems (SEMSs) that save energy efficiently without degrading user comfort. In this context, the application of soft computing techniques is necessary.
Moreover, the building sector consumes more energy than the industry and transportation sectors, mainly due to Heating, Ventilation, and Air Conditioning (HVAC) systems, appliances/devices, and lighting [1,2]. In particular, it is worth analyzing appliance/device consumption in the context of a SEMS for several reasons, e.g., to minimize energy use when prices are excessive, to guarantee user comfort, or to maintain a sustainable rate of consumption with respect to the environment.
A smart building is a dynamic system in which technology is used to improve its functioning, considering hundreds of elements, such as its HVAC system, appliances, and devices. In this context, a SEMS must seek energy efficiency by implementing energy management tasks such as monitoring the energy supply, predicting energy consumption, and detecting anomalies in energy use, among others.
Thus, the possible reasons to analyze energy consumption in smart buildings include the following: to determine the electrical load; to detect anomalies in consumption; to estimate energy consumption; to define load profiles using consumption behavior; and to classify consumers. Hence, to reach optimal management of energy consumption in a smart building, it is necessary to study the consumption, which is precisely the scope of this paper. In this way, possible energy problems can be detected and solved. At the same time, in a smart grid, energy is intermittent, distributed, mobile, and storable. For example, renewable energy resources (RES) are characterized by their variability and intermittency, which make the prediction of the generated energy complex [1,2]. These attributes make the implementation of SEMSs more challenging, because more flexibility and stability are needed to secure their normal operation in a building, for which efficient energy consumption forecasting models are required. Today's SEMSs do not consider these aspects of this highly complex and rapidly changing scenario.
Artificial Intelligence (AI), in turn, can build useful knowledge, for example through the prediction of energy consumption and of occupancy behavior. AI techniques are already used in SEMSs for tasks such as modeling, learning, and reasoning. The motivation of this work is to analyze the behavior of the energy consumption data of the devices and electrical appliances in a building, in order to build models that predict that behavior.
1.2. Background
In the literature, there are several works related to this one. For example, Rodriguez-Mier et al. [3] proposed a knowledge model to define predictive models of energy consumption for smart buildings, and a multi-step prediction model based on a hybrid genetic-fuzzy system, which includes a feature selection method. The authors use a database that stores two types of signals: synchronous signals that are recorded at a constant interval of 10 s (e.g., temperature, sensors, etc.) and asynchronous signals that are recorded when a value changes (e.g., the indoor temperatures, error signals, etc.). In addition, they collect the humidity, solar radiation power, and pressure. Garcia et al. [4] present a comparative study of different strategies for forecasting the energy consumption of smart buildings. In particular, they determine that strategies based on Machine Learning (ML) approaches are more suitable. Alduailij et al. [5] analyze several statistical and ML techniques to predict energy consumption for five different building types. In particular, they predict the peak demand, which serves to achieve energy efficiency. Hernández et al. [6] present an energy consumption forecasting strategy that allows hourly day-ahead predictions using several ML techniques. They then define an ensemble model using the mean of the predicted values of the top five models. In addition, Hernández et al. [7] present a review of energy consumption forecasting for improving energy efficiency in smart buildings. They analyze forecasting in nonresidential and residential buildings in terms of forecasting methods, forecasting objectives, input variables, and prediction horizon.
Moreno et al. [8] define predictive models of energy consumption and energy saving for buildings based on the Radial Basis Function (RBF) technique. Nabavi et al. [9] propose a Deep Learning (DL) method that uses a discrete wavelet transformation and the long short-term memory (LSTM) method to forecast building energy demand and energy supply. The method considers several factors, such as energy consumption patterns in buildings, electricity price, availability of renewable energy sources, and uncertainty in climatic factors. Somu et al. [10] present an energy consumption forecasting model that employs LSTM. The hyperparameters of the LSTM (learning rate, number of layers, momentum, and weight decay) were optimized using the sine–cosine optimization algorithm.
Le et al. [11] develop a framework for multiple energy consumption forecasting of a smart building based on Transfer Learning. Hadri et al. [12] implement different approaches for forecasting the energy consumption of appliances by integrating the occupancy and the context-driven control information of buildings. In addition, González-Vidal et al. [13] defined a methodology to transform multivariate time-dependent series so that they can be used by ML algorithms for energy forecasting. Then, González-Vidal et al. [14] proposed ML and grey-box approaches to predict energy consumption based on the physics of the building’s heat transfer. Sulo et al. [15] analyzed ways to improve the efficiency of the energy used by buildings, using an LSTM model to predict the energy consumption of the buildings on the campuses of the City University of New York.
In other contexts, Aliberti et al. [16] proposed a predictive model to estimate the indoor air temperature in individual rooms with a prediction window of up to three hours, and for the whole building with a prediction window of four hours. In addition, Lawadi et al. [17] compared several ML algorithms to estimate the indoor temperature in a building, which were evaluated using different metrics, such as accuracy and robustness to weather changes. Siddiqui et al. [18] introduced a DL approach to recommend consumption patterns for the appliances based on Term Frequency–Inverse Document Frequency (TF-IDF) to quantify the energy tags. The aim of the work of Bhatt et al. [19] was to forecast the cost of energy consumption in smart buildings. They proposed a balanced DL algorithm that considers three constraints to solve the price management problem and high-level energy consumption in HVAC systems [20]. Bourhnane et al. [21] used Artificial Neural Networks (ANN) along with Genetic Algorithms (GA) to define an approach for energy consumption prediction and scheduling.
The motivation of the work of Hadri et al. [22] was to determine the forecasting quality and the computational time of the XGBOOST, LSTM, and SARIMA algorithms in the context of forecasting energy consumption. Khan et al. [23] proposed a short-term electric consumption forecasting model based on spatial and temporal ensemble forecasting. The ensemble forecasting model consists of a K-means algorithm to determine energy consumption profiles, and two deep learning models, LSTM and Gated Recurrent Unit (GRU). The model forecasts the energy consumption at three spatial scales (apartment, floor, and building level) for hourly, daily, and weekly forecasting horizons. Keytingan et al. [24] proposed predictive models for energy consumption based on a Support Vector Machine, Artificial Neural Networks, and K-Nearest Neighbour using real-life data of a commercial building from Malaysia. The goal of the work of Son et al. [25] was to study adaptive energy consumption forecasting models in order to follow the dynamics of buildings. They consider active and passive change detection methods, which are integrated into decision tree and deep learning models. The results showed that constant retraining does not, in some cases, lead to good performance. Moon et al. [26] proposed an online learning approach to enable fast learning of building energy consumption patterns for unseen data. In addition, Pinto et al. [27] presented three ensemble learning models (XGBOOST, random forests, and an adaptation of Adaboost) for forecasting energy consumption an hour ahead, using real data from an office building. Finally, Somu et al. [28] described a deep learning framework that combines Convolutional Neural Networks (CNN) and LSTM (CNN-LSTM) to provide building energy consumption forecasts. CNN-LSTM uses K-means to determine the energy consumption pattern/trend, CNN to extract features about energy consumption, and LSTM to handle long-term dependencies.
In summary, the vast majority of recent works have been dedicated to carrying out comparative studies of different building energy consumption forecasting strategies based on statistical and ML techniques for different types of buildings (residential, office, among others), using specific datasets [4,5,6,7,8,16,17,18,19,22,24,27]. In some of these comparisons, specific aspects have been analyzed, such as occupancy [12], how to follow the dynamics of energy consumption [25], or the use of online learning approaches to follow the consumption pattern in real time [26]. Other works analyzed the temporal dependencies in the time series in order to forecast the energy demand and/or the energy supply of the building [9,10,14,23], and in some cases used the LSTM model [15,28] or feature selection methods [3,28] to predict energy consumption in buildings. Still other works have studied the Transfer Learning concept in the context of forecasting the energy consumption of smart buildings [11], or have combined it with other techniques to consider the prediction and scheduling of energy consumption [20].
In conclusion, there are many works on the prediction of energy consumption in smart buildings, but none of them proposes a scheme for carrying out an exhaustive feature engineering process that analyzes the variables, their dependencies, and their transformations in order to improve the predictions of the forecasting models. Specifically, the main gap in the previous works is that they do not propose strategies to analyze the implicit temporal relationships in the time series that describe the pattern of energy consumption, and these relationships affect/degrade the ML and statistical algorithms used to build energy consumption forecasting models.
1.3. Novelty and Our Contribution
This work studies the process of data analysis and the generation of prediction models of energy consumption in smart buildings. The focus of this paper is to estimate energy consumption in smart buildings based on the consumption of the appliances and devices in them. Using the data collected in smart buildings, this energy consumption is studied as a function of time, in order to obtain a model capable of estimating total consumption from the consumption of the devices and appliances in the building. The reason for working with time series is that energy consumption can be labeled by time of year, day of the week, or even hour of the day. For example, at Christmas, consumption may be higher because of Christmas lights, and in holiday months, it may be lower if the occupants are traveling. The research question is whether the prediction of energy consumption in a building depends on an exhaustive analysis of the time series that describe its behavior, which would imply carrying out a specific feature engineering process for that context.
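The idea of labeling consumption by time of year, day of the week, or hour of the day can be sketched as follows. This is only a minimal illustration under our own assumptions: the function name `calendar_labels` and the returned field names are hypothetical and are not taken from the paper.

```python
from datetime import datetime


def calendar_labels(timestamp):
    """Derive simple calendar labels from one reading's timestamp.

    The label set (month, day of week, hour, weekend flag) is an
    illustrative assumption, not the feature set used in the paper.
    """
    day_of_week = timestamp.weekday()  # Monday = 0, ..., Sunday = 6
    return {
        "month": timestamp.month,
        "day_of_week": day_of_week,
        "hour": timestamp.hour,
        "is_weekend": day_of_week >= 5,
    }


# Hypothetical reading taken on Christmas Day in the evening.
labels = calendar_labels(datetime(2023, 12, 25, 18, 30))
print(labels)
```

Labels of this kind could then be attached to each consumption reading so that seasonal or weekly patterns become explicit inputs to a model.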
This work analyzes these variables and their relationships using techniques such as the Pearson and Spearman correlations and Multiple Linear Regression models. With the results obtained, the fusion and extraction of characteristics are carried out with the Principal Component Analysis (PCA) technique. In addition, the relationship of each variable with itself over time is analyzed using techniques such as autocorrelation (simple and partial) and ARIMA models. Finally, several forecasting models are generated with LSTM. We start from the hypothesis that LSTM is an excellent technique for treating time series [29,30], so we have chosen it to analyze the results of our feature engineering process. With these results, the generation of prediction models is organized into three groups. The first group consists of prediction models in which only the first Principal Component (PC1) is taken into account. The second group includes PC2. The last group uses the original variables. These groups of models are evaluated using the following metrics: Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE), and R². Thus, the main contribution of this article is the definition of a feature engineering methodological approach to analyze the energy consumption variables of buildings. Other specific contributions derived from it are:
The definition of the phases to study the time series that define energy consumption in buildings;
The definition of the analysis process of the dependency relations between the variables, especially the temporal ones;
The definition of the analysis process of the dimensions in the dataset to determine the fusion and extraction of characteristics;
The utilization of this approach for the definition of forecast models based on time series.
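As a rough illustration of the simple autocorrelation analysis and of the evaluation metrics mentioned above (RMSE, MAPE, and R²), the following sketch uses their standard textbook definitions in plain Python. The function names and the sample values are our own assumptions. The paper's actual implementation is not shown in this section and may differ, for example in how MAPE handles zero values.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation of a series with itself shifted by `lag` steps."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    covariance = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    )
    return covariance / variance


def rmse(y_true, y_pred):
    """Root Mean Square Error."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in percent (assumes no zero true values)."""
    ratios = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical consumption values and forecasts, for illustration only.
actual = [100.0, 200.0, 300.0, 400.0]
forecast = [110.0, 190.0, 310.0, 390.0]
print(rmse(actual, forecast), mape(actual, forecast), r2(actual, forecast))
print(autocorrelation([1.0, 2.0, 3.0, 4.0, 5.0], lag=1))
```

Lower RMSE and MAPE and an R² closer to 1 indicate a better fit, which gives a common basis for comparing the three groups of models.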
The remainder of this work is organized as follows. A preliminary theoretical framework is described in Section 2. The analysis of the energy forecasting problem is reported in Section 3, which begins by defining our feature engineering approach for energy consumption time series, then describes its detailed application in a case study, and finally builds a forecast model using LSTM from the results obtained with it. Section 4 presents a comparison of LSTM with other machine learning techniques on different building energy consumption time series, using our feature engineering approach to define the forecast model. Finally, future directions in this domain are discussed in Section 5.