Use of Sensors and Analyzers Data for Load Forecasting: A Two-Stage Approach

The increase in sensors in buildings and home automation brings potential information to improve buildings' energy management. One promising field is load forecasting, where including other sensors' data in addition to load consumption may improve the forecasting results. However, an adequate selection of the sensor parameters to use as inputs to the load forecast must be made. In this paper, a methodology is proposed that includes a two-stage approach to improve the use of sensor data for a specific building. As an innovation, in the first stage, the relevant sensor data are selected for each specific building, while in the second stage, the load forecast is updated according to the actual forecast error. When a certain error is reached, the forecasting algorithm (Artificial Neural Network or K-Nearest Neighbors) is retrained with the most recent data, instead of training the algorithm at every step. Data collection is provided by a prototype of agent-based sensors developed by the authors in order to support the proposed methodology. In the case study, data covering a period of six months with five-minute time intervals and eight types of sensors are used. These data have been adapted from an office building to illustrate the advantages of the proposed methodology.


Introduction
The electricity sector is facing several challenges due to concerns about environmental issues [1,2]. The efficient use of electricity can be improved with the support of smart grids [3]. In fact, in the context of smart grids, consumers can receive incentives for reducing electricity consumption in certain periods [4]. This is in the context of Demand Response (DR) programs, in which the consumers receive incentives or price signals in real-time. This enables them to modify consumption to reduce electricity costs [5].
For a building or facility at the commercial, domestic, or industrial level, adequate planning of the targeted tasks and the respective energy consumption forecasts is needed [6]. This makes participation in DR programs more feasible. In the end, the available resources will be used optimally, and the energy bill will be reduced by adapting consumption to the opportunities available in smart grids [7].
Focusing on building energy measurement, a real-time automatic energy forecast can be performed with data monitored in a building to optimize energy management [8,9]. Different artificial intelligence techniques can be used [10][11][12][13]. Artificial Neural Networks (ANN) represent a model of neurons with weighted connections organized in a multilayer framework.
After this introduction, the proposed method is described in Section 2, with details about each stage. In Section 3, the sensor prototype description is provided, showing all technical specifications of the system. A case study is described in Section 4 to validate and test the performance of the model, and its results are presented in Section 5. Finally, Section 6 presents the main conclusions of the work.

Methodology Description
This section describes the different phases of the proposed methodology. These include the definition of the running parameters, data import from the database, cleaning, training, import of the test parameters, forecast operations, and presentation of the results and their errors (see Figure 1). The steps of this two-stage approach are described in detail in Sections 2.1 to 2.4.

Tuning Process
The tuning process is a step that works exclusively for the parametrization of data involved in the data warehouse domain. The content involved in this field of study comprises two relevant aspects. The first is featured by a mechanism that evaluates the content of data in order to come up with the forecasting technique that is expected to provide better predictions in the specific case. The second one involves the creation of a replica of data with transformation changes that differ from the original version. These manipulations of data are set in action, evaluating the information more relevant that should be added to the data structure and discarding the remaining one.
There is a data structure balance between the simplicity of data to avoid wrong interpretability. It is also needed to maintain the completion of data to provide better predictions of future events. The content of the data determined through studies (in this step, it is significant to find the parameters) may result in forecasts with higher accuracy on future steps of forecasting. Moreover, it should be considered that the algorithm's accuracy is highly dependent on the data structure and data reliability. In fact, this step is only performed once per execution of the algorithm.
The real-time data involves all the monitored, persistent and available data that the system keeps track of. The information corresponds to consumption and sensor data measured and monitored in the building. From all these data, a sample is selected to be studied and analyzed in the historic dataset. The correlation is applied in order to determine the most relevant sensors to be used. This is a relevant analysis that studies the strength of relation between the variables. This will determine the more relevant sensors to be included in the forecasting dataset. This study makes it possible to reduce the data, as the less relevant sensors are discarded from the data structure. However, in fact, it may happen that some sensors providing nonlinear behavior of the consumption data could help the forecasting algorithm to have better results. In this way, in the present paper, nonlinear relations between the consumption and sensors data are disregarded.
Although this reduction in data is applied to the historic dataset, the same rules apply to the whole content of the real-time data. The new version of the data with the reduced content is kept in the training service to be trained, but also in the forecasting service to perform prediction studies. Furthermore, a separate process, the forecasting methodology application analyzer, studies the historic dataset in order to determine which forecasting technique is expected to provide better results. The chosen forecasting technique is then sent to the training service. In this way, both the set of inputs and the forecasting technique that best fit the dataset under study are obtained.
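The correlation-based sensor selection described above can be sketched as follows. This is an illustrative implementation, not the authors' code: the `select_sensors` function, the 0.5 threshold, and the toy data are assumptions chosen for the example.

```python
import numpy as np

def select_sensors(consumption, sensors, threshold=0.5):
    """Rank candidate sensor series by the absolute Pearson correlation
    with the consumption series and keep those above a threshold.
    Note this captures only linear relations, as in the paper."""
    selected = {}
    for name, series in sensors.items():
        r = np.corrcoef(consumption, series)[0, 1]
        if abs(r) >= threshold:
            selected[name] = r
    return selected

# Toy example: one sensor tracks consumption, one is pure noise.
rng = np.random.default_rng(0)
load = np.sin(np.linspace(0, 8, 200)) + 1.5
sensors = {
    "light_intensity": load * 0.8 + rng.normal(0, 0.05, 200),
    "random_sensor": rng.normal(0, 1, 200),
}
print(select_sensors(load, sensors))
```

In this sketch, the noisy but correlated light-intensity series survives the filter while the unrelated series is discarded, mirroring how less relevant sensors are dropped from the data structure.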

Training Service
The training service can take action in the system in three ways: immediately after the tuning process ends; after the system receives a training request; or after the error calculation requests a retrain because the results are not good enough. In the second case, it is important to guarantee that the tuning process has already defined the parameters.
The cleaning operation reorganizes the data into a structure more suitable for the train/test split, considering several factors. First, all the data are reorganized into a single spreadsheet with the date split into several fields (year, month, day of the month, day of the week, hours, and minutes). Moreover, missing information is added following a criterion that makes sequential copies from previous records. Additionally, information associated with weekend days is considered unreliable, and records matching this assumption are excluded from the spreadsheet. Afterward, an automatic algorithm is applied to deal with outliers, which occur due to erroneous readings made by the devices that measure the consumption and sensor data. The algorithm's strategy consists of detecting occurrences with variations outside of the normal range in the dataset, with the support of the mean and standard deviation. For each value in the dataset, the mean and standard deviation are calculated for a frame containing the actual record and a limited number of records occurring before and after it. The mean and standard deviation are calculated in Equations (1) and (2), respectively. These calculations are simple and can be made using a spreadsheet.
$$M = \frac{1}{|F|} \sum_{i \in F} c_i \quad (1)$$

$$S = \sqrt{\frac{1}{|F|} \sum_{i \in F} (c_i - M)^2} \quad (2)$$

where:
• c_i-consumption record i in F;
• M-mean consumption in F;
• S-standard deviation of consumption in F;
• F-frame used for calculation.
Every time the actual value is lower than or equal to the mean minus the error factor times the standard deviation, or greater than or equal to the mean plus the error factor times the standard deviation, the actual value is replaced by the mean of the previous and following records, as in Equation (3):

$$c_t \leftarrow \frac{c_{t-1} + c_{t+1}}{2} \quad \text{if} \quad c_t \le M - kS \ \text{or} \ c_t \ge M + kS \quad (3)$$

where k is the error factor, defined during the tuning process by experimenting with different values. A common value is 2.
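The outlier-replacement rule above can be sketched in a few lines. The frame half-width and the toy data are illustrative assumptions; the paper only fixes the error factor (2) and the replacement rule of Equation (3).

```python
import numpy as np

def clean_outliers(values, half_window=6, error_factor=2.0):
    """Replace each value that falls outside mean +/- error_factor * std
    of its local frame with the mean of its neighbouring records."""
    v = np.asarray(values, dtype=float).copy()
    n = len(v)
    for t in range(1, n - 1):
        lo, hi = max(0, t - half_window), min(n, t + half_window + 1)
        frame = v[lo:hi]                 # actual record plus neighbours
        m, s = frame.mean(), frame.std()
        if s > 0 and (v[t] <= m - error_factor * s or
                      v[t] >= m + error_factor * s):
            v[t] = (v[t - 1] + v[t + 1]) / 2.0  # Equation (3)
    return v

raw = [100, 102, 98, 101, 990, 99, 103, 100, 97, 101]  # one erroneous spike
print(clean_outliers(raw))
```

The spike of 990 W is detected as an outlier and replaced by the mean of its neighbours (100 W), while the ordinary fluctuations are left untouched.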
Finally, as the service name indicates, the system extracts a historic dataset from the cleaned data and keeps it in a suitable structure (the same structure defined in the tuning process), to be used by the forecasting technique.

Forecasting Service
The forecasting service can take action in the system in three ways: immediately after the training service ends; right after the system receives a test request; or after performing a new iteration once the error calculation process, synced with the scheduled activity, ends. In the second case, it is important to guarantee that the training process has already performed its tasks. This process can run several times within the algorithm.
The forecasting service starts by reading the test parameters associated with the iteration, syncing the information with the time schedule. The fields present in the test dataset are the input data and the respective target that the program is supposed to forecast.
Afterward, the forecasts are scheduled to be performed in a specific period for the previously defined target, with the technique determined in the tuning process. The forecast values are targets representing total consumption.

Test Service
The test service step takes action immediately after the forecasting service ends its tasks. As the service name indicates, the forecast error is calculated for the moment in question, determining how far the forecast value is from the actual value.
Following this, a condition is tested in order to check whether the error associated with the moment in question is acceptable, in other words, whether it is low enough according to a trigger criterion. If the error is not low enough, the trigger is activated, which automatically sends a new training request in order to rerun the training service. The new training uses updated content composed of the information recorded in the test service up to the trigger point. The new historic keeps exactly the same size as the old one: while the historic is updated with new content, the oldest content is discarded. The new test set covers the information from the trigger point until the end of the week. If the error is low enough, there are two possible alternatives: the error calculation is repeated for each new iteration, or the loop is broken, which ends the entire process.
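The sliding-window retraining loop described above can be sketched as follows. Here `train` and `predict` are placeholders for the chosen technique (ANN or KNN), and the naive mean model in the usage example is purely illustrative.

```python
def run_with_retraining(history, test_stream, train, predict,
                        window=20, trigger=0.25):
    """Forecast sequentially; whenever the moment error exceeds the
    trigger, slide the fixed-size training window forward and retrain."""
    model = train(history[-window:])
    forecasts = []
    for actual in test_stream:
        f = predict(model, history)
        forecasts.append(f)
        history.append(actual)                 # new content enters the historic
        error = abs(f - actual) / abs(actual)  # moment error
        if error > trigger:                    # trigger activated
            model = train(history[-window:])   # retrain, oldest content discarded
    return forecasts

# Illustrative usage: the "model" is just the mean of the window.
train = lambda win: sum(win) / len(win)
predict = lambda model, hist: model
forecasts = run_with_retraining([100.0] * 20, [100.0, 200.0, 200.0],
                                train, predict)
print(forecasts)
```

In the toy run, the first jump to 200 W produces a 50% moment error, so the model is retrained on the updated 20-record window before the next forecast.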

Sensor Prototypes
The proposed methodology uses data provided by sensors developed by the authors. These sensors are installed inside the building to provide valuable data for the methodology, enabling the consumption forecast algorithm to consider the building's context. The sensors are part of a multi-agent system where each sensor is connected to an ATmega328-P microcontroller and an nRF24L01 module, which enables the autonomy of each sensor. The sensors/agents are self-triggered using the internal clock, enabling the sending of real-time data through the radio frequency module. The sensors are also capable of learning rules, pursuing goals, and reacting to the environment; however, these functionalities are not used in the proposed methodology. The sensors are used to monitor real-time data and to store these data to be used later in the training service.
The sensors' multi-agent system uses a compound organization combining federative, team, and congregation organizations. The system can have two types of agents: common agents and delegate agents. Common agents with the same sensor type (e.g., movement) in the same room form a congregation. All sensors in the same room form a team. All sensors in the building form a federation, where a delegate agent represents the entire system in an Internet of Things (IoT)-based architecture. The delegate agent does not have any sensor; it is used as a gateway between radio frequency messages and Message Queuing Telemetry Transport (MQTT) messages, and it is also responsible for routing the MQTT messages to the common agents. Figure 2 shows the two possible hardware boards: one for common agents and one for delegate agents. Each common agent integrates a single sensor of any type. To allow MQTT messages, delegate agents use an Arduino Mega 2560 Rev3 and an ESP8266-01. Because it is a federative organization, external systems only interact with and see the delegate agent. Therefore, all the data are queried through the delegate agent, which in turn queries the common agents in order to build the data reply. The building's SCADA system queries the delegate agent every five minutes in order to acquire real-time data. All these data are then stored in a database.
The proposed methodology also needs energy data regarding the building's consumption. These data are monitored using energy analyzers installed on the electrical boards. The building's SCADA system is responsible for querying all energy analyzers, using Modbus/RTU, and stores their data every five minutes. Besides the mentioned data, the SCADA system is also responsible for monitoring and storing photovoltaic (PV) generation data, which are monitored directly from the PV inverter using the Modbus/RTU protocol.

Case Study
The historic data of the building selected for this case study are divided into three different zones and are provided in five-minute time intervals. The selected dataset covers 22 May 2017 to 15 November 2019. Each zone has three rooms, which include PV generation, load consumption (total energy consumption and power), and sensor data. This case study focuses on the sensors of zone 1. Outside of the rooms, the data in the corridors are monitored as well, including the light power and the total consumption. The weekly consumption profile can be seen in Figure 3 for all weeks in the dataset. Each series in Figure 3 (colors have no special meaning, as these are arbitrary week profiles) has 1440 points on the time axis, which corresponds to one week of five days with five-minute intervals. Figure 4 illustrates the plan of the building and the controllable devices in each office room. Due to space limitations in this paper, the sensors' data are not presented; these data are also available with five-minute time intervals, similar to the consumption data shown in Figure 3. The plan of the building includes temperature and light sensors, as well as controllable loads (air conditioning devices and lights).

Results
This section presents the results obtained by applying the proposed methodology to the case study presented in Section 4. In Sections 5.1 to 5.4, the results are presented for the different phases of the proposed methodology.

Tuning
The tuning observations follow the methodology steps illustrated in Figure 1 and detailed in Section 2.1. The first step aims to understand which forecasting technique is more accurate for the consumption forecasts targeted at the specific data. A scenario is proposed using nearly two and a half years of historic data (from 22 May 2017 to 8 November 2019) targeted at all five-minute time intervals. The target data belong to the period from 11 November 2019 until 17 November 2019. The goal is to use the data in order to predict the consumption in the targeted period, focusing on the mentioned area. The data include consumption and sensor information measuring temperature, humidity, and light intensity in this area. The input data for the ANN and KNN results in the two right columns of Table 1 have been defined using the historic data of the sensors selected from the results shown in Table 2. The set of algorithms tested in this scenario includes ANN and KNN. Three metrics are calculated for both algorithms in the presented scenario, as can be seen in Table 1. The presented errors show that the ANN approach has the lower error, meaning it can be considered the more accurate technique. Therefore, ANN was chosen as the definitive forecasting technique.
The sensor data used can also have a great impact on the forecasts. To understand which sensor data are more relevant to include, the correlation matrix is built for each data column, as can be seen in Table 2. It should be noted that light consumption in this paper refers to the electricity consumption of the bulbs installed in the building, while light intensity refers to the level of illumination of the environment, which depends on the building's bulbs as well as the available natural light. The correlation values show that the light intensity and CO2 sensors are more reliable due to their high correlation strength. Therefore, they have been selected to be used and surveyed in this paper.
Regarding the structure of the ANN, it consists of a feed-forward network, a multilayer model composed of neurons and weights linked together. This structure features one input layer with 10 neurons, followed by two hidden layers of 64 neurons each, ending in an output layer with a single output. The 10 neurons in the input layer hold 10 consumptions from sequential periods. The output layer has only one value, which is the consumption that takes place after the last input. Furthermore, the ANN was trained for 500 epochs, meaning the model is trained over the data 500 times. The learning function used in the training process was the gradient descent algorithm. The learning rate was set to 0.001, a very small rate, considering the need to reduce the loss.
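The paper does not name the neural network framework used, so the following is a sketch of the stated architecture (10 lagged inputs, two hidden layers of 64 neurons, 500 epochs, gradient descent, learning rate 0.001) using scikit-learn's MLPRegressor as a stand-in. The synthetic consumption series and the [0, 1] scaling step are assumptions added to keep SGD training stable.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def make_lagged(series, n_lags=10):
    """Build inputs/targets: 10 sequential consumptions as inputs,
    the consumption that follows them as the target."""
    X = np.array([series[i:i + n_lags] for i in range(len(series) - n_lags)])
    y = series[n_lags:]
    return X, y

# Synthetic daily-like profile scaled to [0, 1] (illustrative only).
t = np.arange(600)
series = (np.sin(2 * np.pi * t / 96) + 1) / 2

X, y = make_lagged(series)
ann = MLPRegressor(hidden_layer_sizes=(64, 64),  # two hidden layers of 64 neurons
                   solver="sgd",                 # gradient descent learning
                   learning_rate_init=0.001,     # learning rate from the paper
                   max_iter=500,                 # 500 epochs
                   random_state=0)
ann.fit(X[:-50], y[:-50])
print(ann.score(X[-50:], y[-50:]))  # R^2 on the held-out tail
```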
Regarding the KNN, the assumed set of parameters was the following: the k-nearest neighbors algorithm uses the five nearest neighbors; the weight function used in prediction is the uniform function, which states that all points in each neighborhood are weighted equally; the program is configured to automatically decide the algorithm used to compute the nearest neighbors, choosing the most adequate one based on the training and test data provided; the leaf size passed to the algorithm is 30; the distance metric used for the tree is Minkowski; the power parameter used for the Minkowski metric is 2, which is equivalent to the Euclidean distance; and the number of parallel jobs to run for the neighbor search is one.
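The parameter listing above maps one-to-one onto scikit-learn's KNeighborsRegressor; the paper only lists the values, so the library choice is an assumption. The tiny fit below is purely illustrative.

```python
from sklearn.neighbors import KNeighborsRegressor

knn = KNeighborsRegressor(
    n_neighbors=5,       # five nearest neighbors
    weights="uniform",   # all points in a neighborhood weighted equally
    algorithm="auto",    # pick the best search structure from the data
    leaf_size=30,        # leaf size passed to the tree
    metric="minkowski",  # distance metric used for the tree
    p=2,                 # Minkowski power 2 == Euclidean distance
    n_jobs=1,            # one parallel job for the neighbor search
)

# Illustrative fit: with y == x, the prediction for 3.0 is the mean of
# its five nearest targets {1, 2, 3, 4, 5}, i.e. 3.0.
X = [[float(i)] for i in range(7)]
y = [float(i) for i in range(7)]
knn.fit(X, y)
print(knn.predict([[3.0]]))
```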

Data Cleaning
Data cleaning is performed according to the methodology illustrated in Figure 1 and detailed in Section 2. Different types of data cleaning operations were performed in order to evaluate the more accurate ones. Missing information occurrences throughout the entire dataset are an issue, due to the lack of historic observations. The solution provided for this data cleaning operation consists of adding the missing information, with a strategy that replicates the previous iteration's value into the missing record.
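The replication strategy above is a simple forward fill, which can be sketched as follows (the function name and toy data are illustrative assumptions):

```python
def fill_missing(records):
    """Forward fill: each missing reading (None) is replaced with a copy
    of the previous record, as described above."""
    filled, last = [], None
    for r in records:
        if r is None:
            r = last          # replicate the previous iteration's value
        filled.append(r)
        last = r
    return filled

print(fill_missing([100, None, None, 120, None]))
```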
Other issues that downgrade the forecasting performance are associated with the existence of spikes and outliers. While it is relevant to discard outliers due to their erroneous data, the remaining spikes may be relevant in some scenarios. Therefore, only spikes representing outliers should be discarded from the historic data. In order to understand the impact on the data, two different versions of the consumption dataset are compared: one with the raw data, and the other with the results of the data transformations, which include missing information handling and the removal of outliers. This is shown in Figure 5. The dataset with treatment adjustments is longer than the raw data version, as its cleaning operations include the addition of missing data, which increases the dataset size. Most patterns present in the original version and in the cleaned dataset are identical in different periods. Furthermore, the outliers are removed in the final dataset. The removal of outliers has a huge impact on the data, as it corrects many erroneous readings, making the consumption progression more accurate. This is verified in almost all periods, with lower or higher impact. While most corrections have a small impact, some exceptions contribute to a high-level improvement. These can be observed:

• In the period [19,453-25, ...].

In the present case study, the value of the error factor used in Equation (3) was 2.

Training and Forecast Dataset
The training and forecast datasets are integrated into the methodology illustrated in Figure 1 and detailed in Section 2. The results obtained in this scenario focus on studying the information that will be considered for the training and test sets. The test set uses all five-minute records from 11 November 2019 to 15 November 2019. The historic information is composed of the 20 working days (excluding weekends and holidays) that precede 11 November 2019. Figure 6 presents the consumption during the training and forecast sets.
The daily consumption variability keeps a similar pattern in almost every scenario, despite the consumption differences on each day. As can be seen in Figure 6, the consumption profile switches multiple times between high increases and decreases. The nonlinear effect is represented by a high increase in consumption followed by a high decrease, across the five-minute consumption measures framed in a day. The consumption measured during the day varies from nearly 450 to 1900 W. The consumption gains activity in the morning, with a minimum of 800 W and a maximum of 1900 W; this variation between 800 and 1900 W depends on the schedule of the machines. The activity decreases in the evening, with consumptions between 450 and 650 W. The consumption activity range is very different from day to day, but also from week to week. There are cases where this activity reaches consumptions from 1650 to 1900 W, including the first, second, and third days of the first week, the last day of the third week, and the second and fourth days of the last week of training. Furthermore, there are many cases where the consumption does not exceed 1400 W, and a few where the consumption does not reach 1250 W during the entire day.

Forecast
This section presents the results of the forecasts performed for all the five-minute records described in Section 5.3. As mentioned in Section 5.3, the real consumption for each moment of the test is monitored, so the error associated with each moment can be calculated. This procedure was illustrated in Figure 1, and its metric calculation is explained in this section, providing insights into how accurate the forecasts are for each particular time interval. However, there are moments where the error is not low enough, meaning the forecast is not sufficiently accurate. To deal with this issue, an error trigger of 25% is defined for the test set (the X parameter in Figure 1): every time the moment error exceeds this limit, the forecasting algorithm is trained again with the most recent data and the system redoes the forecasts, using a historic dataset with updated content and a fixed size of 20 working days. The training set discards its oldest records to keep this fixed size while adding updated records up to the occurrence of the error trigger. The new version of the test set is composed of the remaining working days, counting from the period where the error trigger is activated. Figure 7 presents the moment errors for all trainings and retrainings.
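The retraining rule described above can be sketched as the following loop. This is a hedged illustration only: `train_fn` and `predict_fn` stand in for the ANN/KNN models of the paper, and the record format is an assumption.

```python
def run_with_retraining(history, test, train_fn, predict_fn,
                        threshold=0.25, n_train=20 * 288):
    """Forecast the test records one by one, retraining whenever the
    per-moment error exceeds the threshold (25% in the case study).

    history    -- list of (features, load) records, oldest first
    test       -- list of (features, actual_load) records
    train_fn   -- callable(records) -> model (e.g. ANN or KNN)
    predict_fn -- callable(model, features) -> forecast
    """
    model = train_fn(history[-n_train:])
    forecasts, retrain_periods = [], []
    for period, (features, actual) in enumerate(test):
        forecast = predict_fn(model, features)
        forecasts.append(forecast)
        error = abs(actual - forecast) / actual  # moment error
        # Slide the training window: keep the most recent records only.
        history.append((features, actual))
        if error > threshold:
            retrain_periods.append(period)
            model = train_fn(history[-n_train:])
    return forecasts, retrain_periods
```

The key design choice is that the model is not retrained at every step, only when the moment error crosses the threshold, which keeps the computational cost low while still adapting to recent behavior.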

The error trigger of 25% is activated in periods 445, 713 and 963. The error shows a nonlinear behavior, increasing and decreasing gradually. While the first training keeps the forecast error under control, a second training with updated content is required for a new test starting in period 445. Afterward, the forecast error is once more kept under control until period 713, when the trigger is again activated, asking for a new training with updated content and a new test starting from the period where the anomaly was detected. The same applies to period 963: from this point, the error is initially kept under control, maintaining its variations, with the maximum and minimum forecast errors keeping almost the same value, but it goes out of control again until the end of the test.
The real consumption, forecast and reforecast profiles from 11 November to 15 November 2019 are illustrated in Figure 8. The first training follows a pattern similar to its real counterpart during the whole process, with small consumption differences in the periods [0-130] and [220-400] and a local minimum in the period [130-173] where the forecast accuracy is low. After this occurrence, two patterns stand out, a gradually increasing and a gradually decreasing consumption, both with small consumption differences.
The second training also tends to follow a pattern similar to its real counterpart during the whole process; there is, however, a small consumption difference in the period [460-510], where a local minimum and a local maximum are forecast with low accuracy. The third and fourth trainings likewise tend to follow patterns similar to their real counterparts. The final forecast errors for each training are calculated and shown in Table 3. In Figure 8, each training is labeled with a different color. The data illustrated in the same figure cover a week of five working days in five-minute time intervals (1440 periods in total). More specifically, the results of the first and second trainings are shown in Figure 8a, the second and third trainings in Figure 8b, and the third and fourth trainings in Figure 8c. As is clear in Figure 8, except for the first day, the actual profiles of the other four days have been forecast by more than one training. This leads to the conclusion that training with updated content produces more accurate forecasts.
Again, Figure 8 has 1440 points on the time axis, corresponding to one week of five days, with each day represented by 24 h of 12 five-minute periods each.
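Under this indexing convention (288 five-minute periods per day over five working days), a period index on the 1440-point axis maps to a day and time of day as in the small helper below; the function name is ours, for illustration only.

```python
def period_to_day_time(period, periods_per_day=288):
    """Map a period index on the 1440-point axis (five working days,
    288 five-minute intervals per day) to a (day, HH:MM) pair."""
    day, slot = divmod(period, periods_per_day)
    hour, minute = divmod(slot * 5, 60)
    return day + 1, f"{hour:02d}:{minute:02d}"
```

For example, the retrain trigger at period 445 falls on the second day of the test week early in the afternoon.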
The progress from the first to the second and then to the third training with updated content clearly shows an improvement based on the error analysis. However, the forecast of the fourth training is worse than that of the third, although better than that of the second, due to lower reliability at the end of the week.

Conclusions
This paper proposes an automatic energy consumption forecasting methodology for a building equipped with various types of sensors and energy monitoring equipment. The forecast method targets a set of five-minute time intervals and is supported by two algorithms (Artificial Neural Networks and K-Nearest Neighbors) implemented in the Python language.
After applying the developed cleaning operation, only the most relevant data are used as input to the forecast algorithms; graph visualizations show that this step leads to more accurate forecasts. Furthermore, the system performs forecasts in two independent processes, storing all the results. The outcomes include the real and forecast test consumptions and the respective errors associated with each moment and with the whole period. The forecasts of the first process were validated with the Artificial Neural Networks and K-Nearest Neighbors techniques in different scenarios with and without sensor data. The results of the paper demonstrated that the Artificial Neural Networks algorithm with sensor data is more accurate, as supported by the presented forecast period errors and the variable correlation study. It is important to highlight that the main results are only obtained at the end of the second process; accordingly, the results obtained in the first process limit the number of test scenarios for the second process, i.e., a reduced number of tests suffices to obtain the main outcomes. The results of this final process were shown through the moment forecast graphs and the period errors. From these results, it can be concluded that each retraining has advantageous implications for the forecast. All the simulations provided errors between 4% and 7%, which are generally lower after each retraining with updated content, as proposed in the developed methodology.
As future work, multilayer models and additional algorithms should be implemented in order to perform more forecast tests that might achieve more accurate forecasts.

Funding:
This work has received funding from Portugal 2020 under the SPEAR project (NORTE-01-0247-FEDER-040224), in the scope of ITEA 3 SPEAR Project 16001, from FEDER Funds through the COMPETE program, and from National Funds through FCT under the project UIDB/00760/2020 and CEECIND/02887/2017.