Analysis of the Integration of Drift Detection Methods in Learning Algorithms for Electrical Consumption Forecasting in Smart Buildings

Abstract: Buildings are currently among the largest consumers of electrical energy, with considerable increases in CO2 emissions in recent years. Although there have been notable advances in energy efficiency, buildings still have great untapped savings potential. Within demand-side management, tools such as energy forecast models have helped improve electricity consumption. However, because most forecasting models are not designed to be updated as the behavior of a building changes, they do not help exploit this savings potential. The objective of this article is therefore to analyze the integration of methods that help forecasting models adapt to changes in the behavior of buildings, so that they can be used as tools to enhance savings. For this study, active and passive change detection methods were integrated into decision tree and deep learning models. The results show that constant retraining of the decision tree models, driven by change detection methods, helped them adapt better to changes in whole-building electrical consumption. For the deep learning models this was not the case, as constant retraining with small volumes of data only worsened their performance. These results suggest using decision tree models in buildings where electricity consumption is constantly changing.


Introduction
Buildings presently account for up to 40% of worldwide energy consumption and 30% of carbon dioxide emissions, figures that are constantly increasing due to urbanization [1]. Additionally, considering the long life expectancy of buildings, it is estimated that 85-95% of the buildings that exist today will still be in use in 2050 [2]. Hence, changes in the energy use of buildings are likely to strongly affect society, with major economic and environmental consequences such as climate change and global warming [3,4]. Buildings are becoming substantially more complex and sophisticated. They integrate conventional energy services systems, on-site energy generation systems, and charging systems [5]. For this reason, energy management is becoming fundamental for buildings around the world, and energy forecasting is an essential first step in establishing an energy management system [6]. Forecasting building energy use supports smart building performance through low-energy and control procedures [7].
In recent times, data-driven models such as machine learning and deep learning approaches have become very popular because of their application in various fields, including electric energy consumption in buildings [8], and are being used to improve forecast accuracy [9]. In practice, electrical consumption forecasts often have to be produced online and in real time. An online setting brings extra challenges, since the data distribution can be expected to change over time [10]. However, traditional electric energy forecasting models are normally trained once and not retrained with new data, thus missing the information that new data can provide [11]. When this happens, it can lead to incorrect forecasts [12].
Recognizing change points and incorporating these uncertain change points into electric energy forecasting models is one of the most difficult tasks [13]. Unexpected changes in the data distribution over time are known as concept drift [14]. Concept drift has been recognized as the root cause of decreased effectiveness in data-driven decision support systems [15]. Based on how the data change, concept drift can be divided into four kinds: sudden, gradual, recurring, and incremental [16]. Sudden drift happens when the data change quickly and without variation. Gradual drift occurs when the data begin changing in class distribution. Recurring drift happens when the data change for a period and later return to their previous state. Incremental drift occurs when the data change continuously over a long period [17].
To address these different situations in forecasting models, two main strategies have been used: active and passive methods. In active methods, a model is equipped with a change detection strategy and retrained when a trigger is raised. In passive methods, by contrast, algorithms are retrained at regular intervals regardless of whether a change has occurred [18]. Considerable effort has gone into investigating concept drift in regression tasks (see Table 1), with a focus on load forecasting in houses [19,20], energy consumption in smart grids [21], electricity supply and demand [22], total reactive power [23], energy production for a wind farm [24], power generation in a photovoltaic plant [25], and electricity price [26,27]. However, few works address real cases where concept drift techniques are used to maintain or improve the results of machine learning techniques in smart buildings. Therefore, this paper's objective is to provide a novel analysis of the integration of drift detection methods in decision tree and deep learning algorithms for whole-building electricity consumption forecasting in smart buildings.
Given the above, the main contributions of this paper can be summarized as follows:

• Integration of drift detection methods into a multi-step forecasting strategy that forecasts the next 24 h from any hour of the day.

• An analysis of the integration of drift detection methods in decision tree and deep learning algorithms for forecasting the electricity consumption of the entire building.

• A comparative analysis of active and passive drift detection methods for building electricity consumption forecasting in smart buildings.

Table 1. Summary of the literature review: contributions and limitations.

Ref. | Contributions | Limitations
[19] | Proposed an approach for load forecasting where the model is continuously refreshed as new information arrives. | The tuning module could use a more modern approach to tracking accuracy patterns.
[20] | Proposed online ensemble methods for load forecasting under concept drift. | The research did not evaluate concept drift or the performance during the drifting period.
[21] | Proposed a model that helps to identify anomalies using paired learners. | Delay of a few hours between the anomaly and its detection.
[22] | Analyzed different drift detection methods for data streams in smart city applications. | Absence of accessible or reusable benchmark datasets in the literature to fully compare the outcomes.
[23] | Proposed an unsupervised drift detection approach capable of analyzing streaming data in a smart grid. | The approach was not compared with a deep learning algorithm that incorporates drift detection methods.
[24] | Suggested a drift detection approach based on the analysis of the change caused by new information using extreme learning machines. | Need for an automatic setting of the parameters for the proposed drift detection approach.
[25] | Implemented a segmentation of time series based on stationarity using drift detection methods. | The approach requires prior knowledge about the cyclical behavior of the time series.
[26] | Proposed a passive drift detection approach using Robust Soft and Generalized Learning Vector Quantization. | The proposed method was compared with drift detection algorithms without optimized hyperparameters.
[27] | Proposed an improvement for the Robust Soft Learning Vector Quantization algorithm to be used in drift detection. | The proposed approach performs better in synthetic concept drift streams but not in real-world streams.
[28] | Proposed an approach based on the random trees algorithm to deal with changes using drift detection methods. | The proposed approach discards the previous anomaly instead of updating the detection model.

Methodology and Approach
Although drift detection methods are well known, their integration into a multi-step forecasting strategy to predict hourly whole-building electricity consumption is a novel topic.
Therefore, this section describes the data preprocessing, forecasting algorithms, drift detection methods, and performance metrics used in this article. Section 2.1 describes how the datasets from the two buildings used to train the learning algorithms were constructed. Section 2.2 presents the approach and the learning algorithms used to forecast the electrical consumption in buildings. Section 2.3 describes the drift detection methods and their incorporation into the learning algorithms. Section 2.4 explains the metrics used for evaluating the performance of the learning algorithms. A summary of the methodology is shown in Figure 1.

Datasets Construction
For this research, data from two buildings located on the campus of the University of Valladolid were used. These data were obtained through smart meters installed at the electrical power transformer station of each building, which record the active energy consumed (kWh) by the entire building in intervals of 15 min from 2016 to 2019. When analyzing the data, some missing records were found. Because these missing records did not exceed 0.5% of the total data and were not consecutive, linear interpolation was applied to complete them. After completing the missing data, since the goal was to forecast electricity consumption per hour, the data were aggregated to obtain the hourly consumption of each building.
Previous studies [29-33] have shown that weather variables, calendar variables, and past values can help improve the training of learning algorithms, so these were included in the datasets. To obtain the past values, the autocorrelation and partial autocorrelation of the energy consumption variable were analyzed, which showed a significant autocorrelation up to lag 25. For the calendar variables, the timestamps of the historical data were used to obtain the hour, day, month, and year. Additionally, a variable was added to indicate whether a day is a working day; this variable was built from the annual calendar of the university. The weather variables used were those related to the comfort of the occupants inside the building: relative humidity, precipitation, minimum temperature, average temperature, maximum temperature, heating degree days, cooling degree days, and all-sky surface longwave downward irradiance. The weather data were obtained from the NASA Langley Research Center (LaRC) POWER Project funded through the NASA Earth Science/Applied Science Program (https://power.larc.nasa.gov/, accessed on 16 March 2022).
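The preprocessing steps described above can be illustrated with a minimal pandas sketch. It is not the authors' code: the function name `build_hourly_dataset`, the column names, and the weekday-based `working_day` rule are assumptions made for illustration (the paper derives working days from the university calendar), and the weather variables are omitted.

```python
import numpy as np
import pandas as pd


def build_hourly_dataset(energy_15min: pd.Series, n_lags: int = 25) -> pd.DataFrame:
    """Turn 15-min kWh readings into an hourly table with lag and calendar features."""
    # Fill isolated missing readings by linear interpolation between neighbours.
    filled = energy_15min.interpolate(method="linear")
    # Energy is additive, so the four 15-min readings of each hour are summed.
    hourly = filled.resample(pd.Timedelta(hours=1)).sum()

    data = pd.DataFrame({"kwh": hourly})
    # Past values: consumption 1..n_lags hours before the current hour.
    for lag in range(1, n_lags + 1):
        data[f"lag_{lag}"] = data["kwh"].shift(lag)
    # Calendar variables derived from the timestamp.
    data["hour"] = data.index.hour
    data["day"] = data.index.day
    data["month"] = data.index.month
    data["year"] = data.index.year
    # Assumption: weekdays stand in for the university working-day calendar.
    data["working_day"] = (data.index.dayofweek < 5).astype(int)
    # The first n_lags rows have an incomplete history and are dropped.
    return data.dropna()


# Toy example: two days of 1 kWh per 15 min, with one missing reading.
index = pd.date_range("2016-01-01", periods=2 * 24 * 4, freq="15min")
readings = pd.Series(1.0, index=index)
readings.iloc[10] = np.nan
dataset = build_hourly_dataset(readings)
print(dataset.shape)
```

Summing rather than averaging the 15-min readings reflects that the meters record energy (kWh) per interval; if they recorded power instead, a mean would be the appropriate aggregation.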

Approach and Forecasting Algorithms
For the electricity consumption forecast, a multi-step forecasting strategy was used, which in this case predicts electricity consumption for the next 24 h starting from any given hour. The advantage of this strategy is that it allows forecasting from any hour of the day; the disadvantage is that the dataset must be prepared with past values so that the learning algorithms can use this information to forecast the multiple hours more accurately.
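One way to prepare data for such a strategy is sketched below, assuming a direct multi-output formulation in which each sample maps the previous 25 hourly values to the following 24. The text does not state which multi-step formulation was used, so the function `make_multistep_samples` and its layout are illustrative only; weather and calendar features are left out for brevity.

```python
import numpy as np


def make_multistep_samples(series, n_lags=25, horizon=24):
    """Build (X, Y) pairs: n_lags past hours as input, next `horizon` hours as target."""
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - n_lags - horizon + 1
    if n_samples <= 0:
        raise ValueError("series is too short for the requested lags and horizon")
    # One sample per starting hour, so a forecast can begin at any hour of the day.
    X = np.stack([series[i : i + n_lags] for i in range(n_samples)])
    Y = np.stack(
        [series[i + n_lags : i + n_lags + horizon] for i in range(n_samples)]
    )
    return X, Y


hourly = np.arange(100, dtype=float)  # toy hourly consumption
X, Y = make_multistep_samples(hourly)
print(X.shape, Y.shape)
```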
Based on studies where decision tree [34-37] and deep learning algorithms [38-41] obtained good results in forecasting electrical consumption in buildings, two decision tree algorithms and two deep learning algorithms were selected. From the decision tree algorithms, Random Forest (RF) and eXtreme Gradient Boosting (XGBoost) were selected, while from the deep learning algorithms, Convolutional Neural Network (CNN) and Temporal Convolutional Network (TCN) were chosen. The architectures of the learning algorithms used are shown in Figure 2.
The algorithms were programmed in Python using the Scikit-learn, XGBoost, Keras, and TensorFlow libraries. To obtain the best combination of hyperparameters and architecture for the algorithms, backtesting with sliding windows was used. This procedure consisted of keeping the same training size and sliding a data window to create five different training and test sets (see Figure 3). In this case, the data from 2016 to 2017 were used for the training set, while the data from 2018 were used for the validation sample. Once the best architecture and parameters were defined through backtesting, the model was fitted with data from 2016 to 2018, leaving 2019 as the testing set. The best combinations of parameters obtained in the backtesting process are shown in Table 2. Parameters that do not appear in the table were left at their default values.
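The sliding-window backtesting idea can be sketched as an index generator that keeps the training size constant and moves the window forward. The function name `sliding_window_splits`, the even spacing of the folds, and the fold sizes in the example are assumptions, since the exact window layout of Figure 3 is not given in the text.

```python
from typing import Iterator, Tuple


def sliding_window_splits(
    n_samples: int, train_size: int, test_size: int, n_splits: int = 5
) -> Iterator[Tuple[range, range]]:
    """Yield (train, validation) index ranges with a constant training size."""
    span = train_size + test_size
    if span > n_samples:
        raise ValueError("train_size + test_size exceeds the number of samples")
    # Shift between consecutive folds so that they spread over the data.
    step = (n_samples - span) // (n_splits - 1) if n_splits > 1 else 0
    for k in range(n_splits):
        start = k * step
        yield (
            range(start, start + train_size),
            range(start + train_size, start + span),
        )


splits = list(sliding_window_splits(n_samples=100, train_size=50, test_size=10))
for train_idx, val_idx in splits:
    print(
        f"train {train_idx.start}-{train_idx.stop - 1}, "
        f"validate {val_idx.start}-{val_idx.stop - 1}"
    )
```

Each hyperparameter candidate would be fitted on every training range and scored on the matching validation range, with the scores averaged across the five folds.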

Drift Detection Methods
Since the selected algorithms cannot detect changes in the data distribution, two well-known active drift detection methods (DDM), Adaptive Windowing (ADWIN) and Kolmogorov-Smirnov Windowing (KSWIN) [28], were incorporated into them. These methods were selected because the training uses the latest batch of data with the latest training instances, and the size of the window is generally determined by the user.
ADWIN efficiently maintains a variable-length window of recent values such that no change in the data distribution has occurred within it. This window is further divided into two sub-windows (W0, W1) used to decide whether a change has occurred. ADWIN compares the means of W0 and W1 to confirm that they correspond to the same distribution. Concept drift is detected when this equality no longer holds. After detecting a drift, W0 is replaced by W1 and a new W1 is initialized. ADWIN uses a confidence value δ ∈ (0, 1) to decide whether the two sub-windows correspond to the same distribution [42].
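The window-splitting test behind ADWIN can be illustrated with a deliberately naive sketch. It checks every split of the window with the Hoeffding-style bound from the original ADWIN formulation and drops the older sub-window when the means differ by more than the bound. This is not the scikit-multiflow implementation: the class name `NaiveADWIN` is hypothetical, the bucket compression that makes ADWIN efficient is omitted, and the input is assumed to be scaled to [0, 1].

```python
import math


class NaiveADWIN:
    """Simplified ADWIN: tests every split of the window, no bucket compression."""

    def __init__(self, delta: float = 0.002):
        self.delta = delta
        self.window = []

    def update(self, value: float) -> bool:
        """Add a value in [0, 1]; return True if the window was cut (drift)."""
        self.window.append(float(value))
        drift = False
        shrunk = True
        while shrunk and len(self.window) > 1:
            shrunk = False
            n = len(self.window)
            total = sum(self.window)
            head_sum = 0.0
            for split in range(1, n):
                head_sum += self.window[split - 1]
                n0, n1 = split, n - split
                mean0 = head_sum / n0            # mean of older sub-window W0
                mean1 = (total - head_sum) / n1  # mean of newer sub-window W1
                m = 1.0 / (1.0 / n0 + 1.0 / n1)
                eps_cut = math.sqrt(math.log(4.0 * n / self.delta) / (2.0 * m))
                if abs(mean0 - mean1) > eps_cut:
                    # Simplification: drop W0 entirely and keep W1.
                    self.window = self.window[split:]
                    drift = shrunk = True
                    break
        return drift


# Toy stream: level 0 for 100 steps, then a sudden shift to 1.
stream = [0.0] * 100 + [1.0] * 100
detector = NaiveADWIN(delta=0.002)
detections = [i for i, v in enumerate(stream) if detector.update(v)]
print(detections[0])
```

A smaller `delta` makes the bound wider and the detector more conservative, which is the only tuning knob this method exposes.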
KSWIN is a drift detection method based on the Kolmogorov-Smirnov (KS) statistical test. The KS test is a statistical test that makes no assumption about the underlying data distribution. KSWIN maintains a sliding window Ψ of fixed size n (window_size). The last r (stat_size) samples of Ψ are assumed to represent the latest concept and are denoted R. From the first n − r samples of Ψ, r samples are drawn uniformly, representing an approximation of the previous concept, W. The KS test is performed on the windows R and W, which have the same size. The KS test compares the distance between the empirical cumulative distributions, dist(R, W) [27].
A sudden change is detected by KSWIN if:

$$\mathrm{dist}(R, W) > \sqrt{-\frac{\ln \alpha}{r}} \tag{1}$$

where α is the probability for the test statistic of the KS test, and r is the size of the statistic window.
Window-based methods were used because training relies on the last batch of data together with the last training set. A fixed-size window is the simplest rule, and the window size is usually chosen by the user. When the time scale of the change is known, a fixed-size window is a useful choice [11].
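The KSWIN rule can be sketched with NumPy alone, as below. The sketch follows the description above (a window of size n, the last r samples as R, r samples drawn uniformly from the rest as W) and flags drift when the KS distance exceeds sqrt(-ln(α)/r), the threshold in the KSWIN formulation. The class name `SimpleKSWIN`, the seed handling, and the choice to keep only the last r samples after a detection are assumptions for illustration, not the scikit-multiflow implementation.

```python
import numpy as np


def ks_distance(a, b) -> float:
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / len(a)
    cdf_b = np.searchsorted(b, grid, side="right") / len(b)
    return float(np.max(np.abs(cdf_a - cdf_b)))


class SimpleKSWIN:
    def __init__(self, alpha=0.005, window_size=100, stat_size=30, seed=0):
        self.n = window_size
        self.r = stat_size
        self.window = []
        self.rng = np.random.default_rng(seed)
        # Detection threshold: sqrt(-ln(alpha) / r).
        self.threshold = float(np.sqrt(-np.log(alpha) / stat_size))

    def update(self, value: float) -> bool:
        """Add one observation; return True if a sudden change is flagged."""
        self.window.append(float(value))
        if len(self.window) > self.n:
            self.window.pop(0)
        if len(self.window) < self.n:
            return False  # not enough history yet
        recent = np.array(self.window[-self.r:])                 # R
        older = self.rng.choice(self.window[:-self.r],           # W
                                size=self.r, replace=False)
        if ks_distance(recent, older) > self.threshold:
            # Assumption: restart from the most recent samples after a drift.
            self.window = self.window[-self.r:]
            return True
        return False


# Toy stream: constant level for 150 steps, then a sudden shift.
stream = [0.0] * 150 + [1.0] * 50
kswin = SimpleKSWIN()
detections = [i for i, v in enumerate(stream) if kswin.update(v)]
print(detections)
```

With these defaults the threshold is about 0.42, so in this toy stream the shift is flagged once 13 of the 30 most recent samples come from the new level. A window size that is too small relative to the duration of a change can leave both R and W inside the same regime, in which case nothing is flagged.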

Performance Metrics
To analyze the integration of the DDM, a passive method was used in addition to the active methods; it consisted of retraining the algorithms every 24 h regardless of whether there was a change in the data distribution. These methods were compared for each algorithm using four performance metrics: mean absolute percentage error (MAPE), mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R²).
MAPE measures the accuracy of the estimated values relative to the real values, as a percentage [43], and is determined according to Equation (2):

$$\mathrm{MAPE} = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \tag{2}$$

MAE is used to assess how close the estimates are to the real results. It is determined by averaging the absolute differences between the estimated values and the real values [44], as shown in Equation (3):

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \tag{3}$$

RMSE evaluates the differences between the real values and the estimated values [45], and is determined according to Equation (4):

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2} \tag{4}$$

R² is a statistical measure of how much of the variance in the real values is explained by the values estimated by the model (the degree of linear relationship between real and estimated values) [46], and is determined according to Equation (5):

$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2} \tag{5}$$

where $y_i$ is the real value, $\hat{y}_i$ is the estimated value, $\bar{y}$ is the average of the real values, and $n$ is the total number of estimations. These metrics were chosen to give an overview of the performance of the models. MAPE was chosen because it is easy to understand, since it presents percentage values, but due to its limitations it was accompanied by the MAE, which shows how much inaccuracy is expected from the forecast on average and helps determine which models are better. However, because the MAE does not distinguish large errors from small ones, it was combined with the RMSE, which penalizes large errors more heavily. R² was selected to assess how well the models fit the data.
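As a worked illustration, the four metrics can be computed with NumPy using their standard definitions. The function name `forecast_metrics` and the sample values are made up for the example, and the real values are assumed to be non-zero so that MAPE is defined.

```python
import numpy as np


def forecast_metrics(y_true, y_pred) -> dict:
    """Return MAPE (%), MAE, RMSE and R^2 for real vs. estimated values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    return {
        # MAPE is undefined when a real value is zero.
        "MAPE": 100.0 * float(np.mean(np.abs(err / y_true))),
        "MAE": float(np.mean(np.abs(err))),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "R2": 1.0 - float(np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)),
    }


real = [100.0, 200.0, 300.0, 400.0]       # hypothetical hourly kWh
estimated = [110.0, 190.0, 300.0, 420.0]  # hypothetical forecasts
metrics = forecast_metrics(real, estimated)
print(metrics)
```

In this example the single 20 kWh miss raises the RMSE (about 12.2) above the MAE (10), which is the sensitivity to large errors that motivates reporting both.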

Experimentation Setup
Two buildings in a continental Mediterranean climate were selected for testing. These buildings have a lighting and air conditioning control system, as well as an energy monitoring system, to balance the comfort of the occupants against the consumption of electrical energy. The first building is the Faculty of Science of the University of Valladolid, located at coordinates 41.663411°, −4.705539°, which is dedicated to administrative offices, while the second building is the Faculty of Economics, located at coordinates 41.658586°, −4.710667°, which is dedicated to teaching activities. These buildings were selected because of their different electricity consumption behavior during the selected years. Building 1 has had changes in consumption only in specific periods, while the energy consumption of Building 2 has decreased gradually each year because energy efficiency improvements were made and solar panels were integrated into the building (see Figure 4). Building 1 is supplied by the electrical grid, while Building 2 is supplied by the electrical grid and photovoltaic panels. The records of electrical consumption used to test the proposed method span 2016 to 2019. For the training stage, the years 2016 to 2018 were used, while for the test stage the year 2019 was used. To evaluate the learning algorithms with the DDM, two Python scripts were developed, one for the decision tree algorithms and the other for the deep learning algorithms. Two functions were created in the scripts, the first for updating the algorithms with the passive method and the second for updating them with the active methods. In the passive method, the algorithms were retrained every 24 h over a period of one year, while in the active methods, the algorithms were retrained every time a change in the data distribution was detected during the same period. The ADWIN and KSWIN methods were applied to the models using the scikit-multiflow library.
For this study, the active methods take the first three years of the dataset as a reference and compare them with the new data. If a change is detected, the model is retrained. How the model is retrained depends on its type. Decision tree models are rebuilt from scratch, whereas deep learning models are updated through transfer learning to reduce training time. The transfer learning was carried out by freezing all layers except the last two, which were updated every time the detection method indicated that the model needed to be retrained.
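The difference between the passive and active update loops can be sketched as follows. This is a toy illustration of the control flow only: `MeanModel` and `JumpDetector` are hypothetical stand-ins for the forecasting models and for ADWIN/KSWIN, and retraining on the most recent 24 values is an assumption, since the text does not state which data are used when a model is rebuilt.

```python
import numpy as np


class MeanModel:
    """Hypothetical stand-in forecaster: predicts the mean of its training data."""

    def fit(self, values):
        self.mean_ = float(np.mean(values))
        return self

    def predict(self):
        return self.mean_


class JumpDetector:
    """Toy stand-in for ADWIN/KSWIN: flags a jump away from a reference level."""

    def __init__(self, reference, tolerance):
        self.reference = reference
        self.tolerance = tolerance

    def __call__(self, value):
        if abs(value - self.reference) > self.tolerance:
            self.reference = value  # adopt the new level as the reference
            return True
        return False


def run_stream(series, n_train, retrain_window=24, detector=None, period=24):
    """Forecast a stream step by step, retraining passively or actively.

    detector=None     -> passive: retrain every `period` steps.
    detector=callable -> active: retrain only when it returns True.
    """
    model = MeanModel().fit(series[:n_train])
    errors, retrain_steps = [], []
    for t in range(n_train, len(series)):
        errors.append(abs(series[t] - model.predict()))
        if detector is None:
            trigger = (t - n_train + 1) % period == 0
        else:
            trigger = detector(series[t])
        if trigger:
            # Rebuild the model from scratch on the most recent window.
            start = max(0, t + 1 - retrain_window)
            model = MeanModel().fit(series[start : t + 1])
            retrain_steps.append(t)
    return float(np.mean(errors)), retrain_steps


# Toy stream: level 10 for 248 h, then a sudden shift to 20.
series = np.array([10.0] * 248 + [20.0] * 72)
passive_mae, passive_steps = run_stream(series, n_train=200)
active_mae, active_steps = run_stream(
    series, n_train=200, detector=JumpDetector(reference=10.0, tolerance=5.0)
)
print(len(passive_steps), len(active_steps))
```

In this toy stream the passive loop retrains five times and the active loop once. Which data the retrained model sees after a trigger strongly affects the resulting error, so the error values of such a sketch should not be read as evidence for either strategy.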

Decision Trees Models Evaluation
After integrating the active and passive DDM with the decision tree models, the results obtained for Building 1 (see Table 3) show that the models with DDM performed better than the model without DDM for both algorithms. Likewise, the passive method presents better results than the active methods. Table 4 shows the results for Building 2, where, as in Building 1, the models with DDM perform better than the model without DDM for both algorithms. However, in terms of the RMSE and R² metrics, the passive method does not clearly outperform the KSWIN method in the case of XGBoost. The findings show that the decision tree algorithms clearly benefited from the integration of the DDM. Regarding the number of detections, which corresponds to the number of sudden changes detected by the DDM and in our case equals the number of retrainings, it could be concluded that for active methods a higher number of detections leads to better results. However, when the passive method is compared with the KSWIN method, the results are very similar, while the KSWIN method requires fewer than 50% of the retrainings performed by the passive method.
Even though the passive method has shown better performance, it cannot be affirmed with certainty that it is the better choice, since it assumes that the data distribution changes daily. This is not necessarily true: the behavior of the occupants or energy-saving measures may cause changes in electricity consumption over periods longer than 24 h, in which case the model is retrained when it is not necessary.

Deep Learning Models Evaluation
After integrating the active and passive DDM with the deep learning models, the results obtained for Building 1 (see Table 5) show that for the TCN, the model without DDM performs better than the models with DDM. In the case of the CNN, the model without DDM performs better than the active methods, but not better than the passive method in terms of the RMSE and R² metrics. Table 6 shows the results for Building 2, where, as in Building 1, the TCN performs better without DDM. For the CNN, however, in terms of the RMSE and R² metrics, the KSWIN method performed better than the model without DDM. For the deep learning models, the findings show that the ADWIN method, which performs the fewest retrainings, presents the worst performance of the active methods, while the passive method presents the best performance among the DDM. In general, however, the model without DDM performs better, except in the RMSE and R² metrics for the CNN with DDM. This suggests that the change in the data distribution is not abrupt enough to require retraining the deep learning models.
This behavior calls into question the need for retraining the deep learning models in this case. However, comparing the decision tree models with the deep learning models shows that in Building 2, where the deep learning models without DDM outperform the decision tree models without DDM, the decision tree models with DDM perform better than the deep learning models without DDM.
Figure 5 shows the average hourly error of each forecasting algorithm for the electrical consumption of the entire building, starting from the first forecast hour. The average error per hour in each building shows that the decision tree models improve their performance at every hour when the DDMs are integrated, whereas this is not the case for the deep learning models. The results show that the proposed method can be applied to maintain or even improve the performance of learning algorithms in situations where the behavior of electrical consumption in buildings changes constantly. A limitation lies in the drift detection methods that were integrated. In the case of ADWIN, only the confidence value parameter could be modified, while in the case of KSWIN an inappropriate choice of window sizes would cause the method to miss sudden changes in the data distribution.

Conclusions
In this paper, the integration of drift detection methods is analyzed in models for electricity consumption forecasting in buildings so that these models can adapt to the changing behavior that has been occurring in buildings due to energy-saving measures. Two active methods and one passive method were proposed to be integrated with the decision tree and deep learning models to know when the models should be retrained according to changes in the data distribution. The passive method consisted of retraining the models every 24 h, assuming that the models should be constantly updated, while the active methods were ADWIN and KSWIN, which are based on a variable-length window approach.
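The difference between the passive and active strategies can be sketched as two retraining policies plugged into the same forecasting loop. This is an illustrative sketch only: `MeanModel` is a hypothetical stand-in for the decision tree and deep learning models, the active policy is a simple sustained-error rule used here in place of ADWIN or KSWIN, and the series, training size, threshold, and patience values are all assumptions.

```python
from statistics import fmean


class MeanModel:
    """Hypothetical forecaster: predicts the mean of its training window."""

    def fit(self, history):
        self.level = fmean(history)
        return self

    def predict(self):
        return self.level


def no_retraining():
    """Baseline: the model is trained once and never updated."""
    return lambda rel_error: False


def passive_policy(period=24):
    """Retrain after every `period` observations (every 24 h for hourly data)."""
    seen = 0

    def should_retrain(rel_error):
        nonlocal seen
        seen += 1
        if seen == period:
            seen = 0
            return True
        return False

    return should_retrain


def active_policy(threshold=0.2, patience=12):
    """Retrain only after `patience` consecutive relative errors above `threshold`.

    Assumed stand-in for a drift detector such as ADWIN or KSWIN.
    """
    streak = 0

    def should_retrain(rel_error):
        nonlocal streak
        streak = streak + 1 if rel_error > threshold else 0
        if streak >= patience:
            streak = 0
            return True
        return False

    return should_retrain


def backtest(series, train_size, should_retrain):
    """One-step-ahead loop; returns (MAPE in %, number of retrainings)."""
    model = MeanModel().fit(series[:train_size])
    rel_errors, retrainings = [], 0
    for t in range(train_size, len(series)):
        forecast, actual = model.predict(), series[t]
        rel_error = abs(actual - forecast) / abs(actual)  # assumes actual != 0
        rel_errors.append(rel_error)
        if should_retrain(rel_error):
            # Refit on the most recent `train_size` observations.
            model.fit(series[t - train_size + 1 : t + 1])
            retrainings += 1
    return 100.0 * fmean(rel_errors), retrainings


# Synthetic hourly load (arbitrary units) with a sudden level shift.
series = [100.0] * 48 + [150.0] * 96
for name, policy in [("none", no_retraining()),
                     ("passive", passive_policy(24)),
                     ("active", active_policy(0.2, 12))]:
    mape, retrainings = backtest(series, 12, policy)
    print(f"{name:8s} MAPE={mape:6.2f}%  retrainings={retrainings}")
```

On this toy series both policies recover from the level shift, but the passive policy retrains on a fixed schedule while the active one retrains only when its trigger fires. The sketch says nothing about which policy is preferable for a real model; that depends on the results reported for each algorithm.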
The main conclusion of this study is that, for decision tree models, incorporating DDM not only keeps them up to date with changes in the data distribution but also improves their accuracy. In the best case, RF obtained a MAPE of 9.23% for Building 1 and 19.47% for Building 2 without DDM, against 8.46% for Building 1 and 16.14% for Building 2 with the passive DDM. For deep learning models, however, incorporating DDM was not as favorable. In the worst case, the CNN obtained a MAPE of 9.40% for Building 1 and 16.97% for Building 2 without DDM, against 10.93% for Building 1 and 18.89% for Building 2 with the passive DDM. We can deduce from this that constantly updating deep learning models with small volumes of data only worsens their performance. In cases such as Building 2, with sudden changes in load curves due to improvements, the model becomes inefficient because deep learning models cannot adapt to constant short-term changes with small amounts of data.
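To put the MAPE values quoted above on a common scale, the following sketch computes MAPE in its standard form and the relative change between the "without DDM" and "passive DDM" figures. The MAPE values are copied from the text; the function names and the use of relative change as a summary are assumptions made for illustration.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent (standard definition)."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def relative_change(before, after):
    """Relative change in percent; negative values mean a lower (better) MAPE."""
    return 100.0 * (after - before) / before


# (MAPE without DDM, MAPE with passive DDM), as reported in the text.
reported = {
    ("RF", "Building 1"): (9.23, 8.46),
    ("RF", "Building 2"): (19.47, 16.14),
    ("CNN", "Building 1"): (9.40, 10.93),
    ("CNN", "Building 2"): (16.97, 18.89),
}

for (model, building), (without_ddm, passive_ddm) in reported.items():
    change = relative_change(without_ddm, passive_ddm)
    print(f"{model:3s} {building}: {without_ddm:5.2f}% -> {passive_ddm:5.2f}% "
          f"({change:+.1f}% relative)")
```

A negative relative change corresponds to an improvement with the passive DDM (the RF rows), and a positive one to a degradation (the CNN rows).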
Considering the results obtained with the deep learning models, future research should focus on how to adapt deep learning models to constant changes in electrical consumption forecasting in buildings, in order to avoid model obsolescence.

Figure 1. Methodology used for the analysis of the integration of drift detection methods.

Figure 3. Backtesting with sliding windows procedure.

Figure 5. (a) Performance of forecasting algorithms without DDM by hours in Building 1. (b) Performance of forecasting algorithms without DDM by hours in Building 2. (c) Performance of forecasting algorithms with DDM by hours in Building 1. (d) Performance of forecasting algorithms with DDM by hours in Building 2.

Table 2. Best combinations of parameters obtained through backtesting procedure.

Table 3. Decision tree model results for Building 1.

Table 4. Decision tree model results for Building 2.

Table 5. Deep learning model results for Building 1.
Wo/DDM = without drift detection method, ND = number of detections, n/a = not applicable.

Table 6. Deep learning model results for Building 2.
Wo/DDM = without drift detection method, ND = number of detections, n/a = not applicable.