A Methodology for Energy Load Profile Forecasting Based on Intelligent Clustering and Smoothing Techniques

Abstract: The electrical sector needs to study how energy demand changes to plan the maintenance and purchase of energy assets properly. Prediction studies for energy demand require a high level of reliability, since a deviation in the forecast demand could affect operation costs. This paper proposed a short-term energy demand forecasting methodology based on hierarchical clustering, using Dynamic Time Warping as a similarity measure, integrated with Artificial Neural Networks. Clustering was used to build the typical curve for each type of day, while Artificial Neural Networks handled the weather sensitivity to correct a preliminary forecasting curve obtained in the clustering stage. A statistical analysis was carried out to identify the significant factors in the prediction model of energy demand. The performance of the proposed model was measured through the Mean Absolute Percentage Error (MAPE). The experimental results show that the three-stage methodology was able to improve the MAPE, reaching values as good as 2%.
Contributions: Conceptualization, J.J.M. and C.G.Q.M.; methodology, and software, J.J.M.; analysis, M.P.; investigation, J.J.M., L.N., C.G.Q.M. resources, J.J.M.; data curation, J.J.M.; writing-original preparation, J.J.M., L.N., M.P.; writing-review


Introduction
Energy demand forecasting is a topic that is drawing interest among electricity market agents (e.g., regulation and distribution companies), since it provides valuable information for conducting reliability studies in electric power systems (EPS). For example, demand forecasting studies help to properly establish the levels of chargeability in circuits, electrical bars and transformers. The changing nature of the energy demand due to external factors such as weather conditions can increase the uncertainty in planning and, in turn, the operation costs in the electricity market.
However, energy demand has a dynamic behaviour that depends on several factors, such as world population growth, weather variability and the evolution of EPS architectures (for example, due to renewable energy sources and distributed generation). In particular, modified EPS schemes also require the development of new models that consider a more stochastic operation for improved reliability. Such prediction models should provide, at least, information to understand the user energy demand and the factors that affect it [1]. Thus, these models can later be used by regulatory agencies to calculate penalties against the distributors, for example, when energy supply problems arise.
The energy demand can be viewed as a time series whose nonstationary characteristics depend on external variables (e.g., the weather conditions, type of day and type of users), requiring the integration of multiple statistical techniques along with intelligent computational techniques [2][3][4][5]. Due to this dynamic behaviour, many authors have considered different statistical models based on conventional and nonconventional strategies. Among the most used conventional statistical models, there are the

limitations, such as low adaptability, limited forecast ability, and difficulties in adjusting the model parameters to the nonstationary characteristics of the considered time series. On the other hand, there are techniques such as SVM (Support Vector Machines) [9,10], FL (Fuzzy Logic) [11][12][13], MC (Markov Chain) [14] and ANN (Artificial Neural Networks) [2,3,15]. Some authors have proposed the combination of several of these techniques, for example, to consider external variables such as the weather conditions [16][17][18], while others have proposed the use of clustering techniques to create subprofiles that are forecast separately and then added together to obtain more accurate results [19,20].
This paper proposed a methodology to model the energy demand based on the construction of base load profiles and the characterization of the deviations caused by external factors such as the weather conditions. This work used DTW (Dynamic Time Warping), hierarchical clustering and ANNs to build the prediction models and to assess the impact of external factors (i.e., the weather and type of day) on the hourly energy demand of five electricity markets. More than one electricity market was considered, since, according to the literature, the temperature and the working time can affect the energy demand in a region or community [21]. The paper is organized to present a description of the proposed methodology, followed by an analysis of the obtained results. The results demonstrate the adaptability of the proposed methodology to predict the hourly energy demand in different contexts.

Proposed Methodology
The proposed methodology comprises three (3) stages to obtain a model that provides a stable and dynamic prediction with low variability (see Figure 1). The methodology based on stages allows the combination of conventional statistical and computational-intelligence-based techniques to provide better robustness and enhance model functionality.
The first stage consists of the implementation of an intelligent clustering technique to obtain the shape of a typical load curve for each type of day in a week (weekdays, weekends, holidays). The second stage produces the base curve (i.e., the reference curve) for each day to be forecasted. This base curve was built by applying statistical techniques to the historical energy demand data and the typical curves obtained in the first stage. Finally, the third stage corrects the base curves by considering external variables (e.g., weather conditions).

Determination and Selection of the Input Variables
The first step requires the characterization of the external variables and the energy demand in order to establish which variables have a high level of significance. An ANN has the advantage of being able to work with nonstationary time series, a capability that is difficult to find in models such as ARIMA. For such models, stationarity of the time series (at least in a weak sense) is a desired property, since it guarantees that the statistical properties do not change between periods; that is, the mean and variance are constant regardless of the position of the random variable within the stochastic process. For example, Figure 2 shows the daily energy demand for Market 1, in which stationarity and trending characteristics can be noted.

To analyse the energy demand, historical data from five (5) energy commercialization markets in Colombia, located in the cities of Barranquilla, Cartagena, Monteria and Santa Marta, were used for the period from 1 January 2016 to 4 June 2018. Energy demand data were normalized during the building process of the forecasting models for each commercialization market. Market 1 corresponds to Barranquilla, Market 2 to Cartagena, Markets 3 and 5 to Monteria, and Market 4 to Santa Marta. Weather data were acquired through the website www.accuweather.com. Figures 3-7 show the energy demand for Markets 1 to 5, respectively. The Dickey-Fuller test applied to the time series shows that all of them had nonstationary characteristics, since the null hypothesis (i.e., the existence of a unit root) was not rejected. Due to the similar characteristics of the five (5) markets, the authors decided to use Market 1 to show the proposed methodology in detail.
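As an illustration of the unit-root check mentioned above, the following minimal sketch computes the statistic of the basic (non-augmented) Dickey-Fuller regression with a constant term. The exact test variant, lag structure and software used by the authors are not stated, so the function name `dickey_fuller_stat`, the synthetic series and the use of the asymptotic 5% critical value of -2.86 are assumptions made only for this example.

```python
import math
import random


def dickey_fuller_stat(series):
    """t-statistic of rho in the regression: diff(y)_t = alpha + rho * y_(t-1) + e_t."""
    x = series[:-1]                                   # lagged levels y_(t-1)
    d = [b - a for a, b in zip(series, series[1:])]   # first differences
    n = len(d)
    mean_x = sum(x) / n
    mean_d = sum(d) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxd = sum((xi - mean_x) * (di - mean_d) for xi, di in zip(x, d))
    rho = sxd / sxx                                   # ordinary least squares slope
    alpha = mean_d - rho * mean_x                     # ordinary least squares intercept
    residuals = [di - alpha - rho * xi for xi, di in zip(x, d)]
    sigma2 = sum(r * r for r in residuals) / (n - 2)  # residual variance
    return rho / math.sqrt(sigma2 / sxx)


# Asymptotic 5% critical value for the constant-only Dickey-Fuller regression.
CRITICAL_5_PERCENT = -2.86

# Synthetic data (not the paper's data): white noise and its cumulative sum.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(500)]     # stationary by construction
walk = []
level = 0.0
for shock in noise:
    level += shock
    walk.append(level)                                # random walk with a unit root

stat_noise = dickey_fuller_stat(noise)
stat_walk = dickey_fuller_stat(walk)
```

The unit-root null hypothesis is rejected only when the statistic falls below the critical value; a series whose statistic stays above it, as reported for the five markets, is treated as nonstationary.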
Table 1 shows the statistical characteristics of the energy demand profile. Weekdays presented a similar mean and median in the energy demand, in contrast to Saturdays and Sundays/holidays. Table 1 reveals that the data were concentrated to the left of the mean, as indicated by the reported positive asymmetry. Kurtosis values were similar for all types of days and lower than three, indicating a flatter shape with lighter tails than a normal distribution.
Figure 8 shows an example in which the energy demand presents a seasonal component. This component exists due to the difference in the energy demand for Saturdays and Sundays/holidays, when consumption was reduced because several activities stopped.
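The asymmetry and kurtosis indicators of the kind reported in Table 1 can be obtained from the central moments of the data. The sketch below uses the population (biased) moment estimators; the paper does not say which estimator was used, and the sample values are hypothetical.

```python
def shape_statistics(values):
    """Return (mean, skewness, kurtosis) using population central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # variance
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n   # fourth central moment
    skewness = m3 / m2 ** 1.5                       # > 0: long tail to the right
    kurtosis = m4 / m2 ** 2                         # equals 3 for a normal distribution
    return mean, skewness, kurtosis


# Hypothetical normalized demand values for one type of day.
sample = [0.61, 0.62, 0.62, 0.63, 0.64, 0.66, 0.70, 0.78]
mean, skewness, kurtosis = shape_statistics(sample)
```

A positive skewness means that most observations lie below the mean, which matches the concentration to the left of the mean described for Table 1.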
On the other hand, Figure 9 shows the energy demand distribution for all types of days for each considered period. This figure summarizes the characteristics reported in Table 1. The energy demand data for each type of day were mostly concentrated toward the left of the mean, while the variance was approximately equal in every case. A multicomparison was used to verify the existence of statistical differences in the mean of the energy demand for each type of day (Figure 10).
So far, the results obtained in the first stage show the differences in the energy demand among weekdays, Saturdays, and Sundays/holidays. Additionally, Figure 4 shows the necessity of building a model for each day, because the shape of the energy demand distribution for weekdays was different. Therefore, eight neural networks were trained using the energy demand for each day. It is important to note that the energy demand for a particular day varied within a considered period, which suggests a changing energy demand at certain times of the year.
Thus, the methodology proposed the addition of a classification stage that considered historical trends to determine which of them had the highest similarity. Figure 11 shows the power spectral density (PSD) for energy demand. The spectral peaks at the frequencies of the days carrying a high degree of information, the presence of trends and stationarity in the energy demand, and the presence of a high correlation can be observed. Using the PSD and the correlation function, it is possible to observe a predominant seven-day component in the time series.
Energies 2020, 13, 4040
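The seven-day component noted above can be checked with a sample autocorrelation function. The sketch below is only illustrative: the weekly pattern is a hypothetical daily demand profile, not data from the paper, and `autocorrelation` is a helper name chosen for this example.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series)
    cov = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return cov / var


# Hypothetical normalized daily demand: five similar weekdays, lower Saturday and Sunday.
weekly_pattern = [1.00, 1.02, 1.01, 1.03, 1.02, 0.90, 0.80]
daily_demand = weekly_pattern * 12                  # twelve identical weeks

acf = {lag: autocorrelation(daily_demand, lag) for lag in range(1, 15)}
dominant_lag = max(acf, key=acf.get)                # lag with the highest correlation
```

With a weekly pattern like this one, the largest autocorrelation appears at a lag of seven days, which is the kind of evidence used to decide how many past days are informative for the base curve.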
The autocorrelation function in Figure 12 shows a high correlation between the energy demand and its previous values, as well as the seasonal nature of that demand. The importance of historical data for the generalization of a stationary process behaviour is clear.
Table 2 shows the correlation matrix for the energy demand and the meteorological variables for City 1, which corresponds to Market 1. This table highlights the variables with a high level of correlation and those that carry information about the thermal sensation of the zone of interest.
Thus, humidity and wind speed were used as indicators for the thermal sensation of the considered area.
Using the results of Table 2, it can be noticed that the energy demand (D) and the average temperature (T) in the city had a high level of correlation. Therefore, the temperature was used as an external variable for the model. Similarly, as noted above, the minimum humidity (H) and the maximum wind speed (W) were used to quantify the thermal sensation in the city.
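A correlation matrix such as the one in Table 2 can be assembled from pairwise Pearson coefficients. The following sketch assumes the Pearson coefficient was the measure used (the paper does not name it) and relies on hypothetical daily values for D, T, H and W.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical daily observations (not taken from the paper).
variables = {
    "D": [610.0, 625.0, 640.0, 655.0, 630.0, 600.0],   # energy demand
    "T": [28.1, 28.9, 29.6, 30.4, 29.0, 27.8],          # average temperature
    "H": [62.0, 60.0, 57.0, 55.0, 59.0, 64.0],          # minimum humidity
    "W": [14.0, 13.0, 15.0, 12.0, 16.0, 15.0],          # maximum wind speed
}

# Symmetric matrix stored as a dictionary keyed by pairs of variable names.
correlation = {
    (a, b): pearson(variables[a], variables[b])
    for a in variables for b in variables
}
```

Entries close to 1 or -1 point to candidate input variables, while weaker entries may still be kept when they describe a physical effect of interest, as done here for humidity and wind speed.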
As a conclusion of this procedure, a generalized methodology for the determination and selection of the input variables was proposed as:

1. Tukey-Kramer test combined with the time-series distribution to identify the number of forecasting models.
2. PSD analysis to assess the seasonal component of the time series and include this information in the construction of the base curve.
3. ACF analysis to identify how many lags are necessary to build the base curve.
4. Correlation matrix between weather data and energy demand to identify how the weather conditions affect the power consumption of customers.

Identification for Typical Curves in the Energy Demand
The objective of the first stage was to generate an algorithm that considered a set of curves associated with the same day of the week, clustered the curves that represented the typical behaviour of that day and rejected the ones that displayed different or atypical behaviour. To identify the typical curves, their normalized shapes were analysed.
The implementation of this stage required the following actions:
• Normalizing the energy demand curves by dividing all the data in each curve by its maximum value. In this way, the curves ranged between 0 and 1, and it was possible to focus on the curve shape.
• Obtaining the clusters by employing the hierarchical clustering technique. This method divided the data into sets according to how similar they were.
• Measuring the similarity using DTW. This technique is widely considered one of the most suitable measures for comparing time series. Because of the temporal nature of the energy demand curves, they belong to the class of problems for which this technique gives excellent results.
Figure 13 shows the results obtained for this stage. The typical curve is drawn with a thick solid line.
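The three actions listed above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the average-linkage criterion, the choice of two clusters, the use of the largest cluster as the typical group and its point-wise mean as the typical curve are all assumptions, and the sample curves are hypothetical.

```python
def normalize(curve):
    """Scale a curve by its maximum so that only its shape remains."""
    peak = max(curve)
    return [v / peak for v in curve]


def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-difference cost."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]


def hierarchical_clusters(curves, n_clusters):
    """Agglomerative clustering with average linkage over DTW distances."""
    size = len(curves)
    dist = {(i, j): dtw_distance(curves[i], curves[j])
            for i in range(size) for j in range(i + 1, size)}
    clusters = [[i] for i in range(size)]

    def linkage(c1, c2):
        pairs = [(min(i, j), max(i, j)) for i in c1 for j in c2]
        return sum(dist[p] for p in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        candidates = [(x, y) for x in range(len(clusters))
                      for y in range(x + 1, len(clusters))]
        a, b = min(candidates, key=lambda xy: linkage(clusters[xy[0]], clusters[xy[1]]))
        clusters[a] = clusters[a] + clusters[b]     # merge the two closest clusters
        del clusters[b]
    return clusters


def typical_curve(curves, n_clusters=2):
    """Mean shape of the largest cluster; smaller clusters are treated as atypical."""
    normalized = [normalize(c) for c in curves]
    largest = max(hierarchical_clusters(normalized, n_clusters), key=len)
    length = len(normalized[0])
    shape = [sum(normalized[i][t] for i in largest) / len(largest) for t in range(length)]
    return shape, largest


# Hypothetical demand curves for the same day of the week (six periods each).
mondays = [
    [50, 60, 80, 100, 90, 70],
    [52, 61, 82, 101, 91, 69],
    [49, 59, 79, 98, 88, 71],
    [51, 60, 81, 99, 92, 70],
    [100, 95, 60, 50, 55, 90],   # atypical day
]
typical, members = typical_curve(mondays)
```

In this example the atypical day stays in its own cluster, so the typical shape is built only from the four similar curves.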

Generation for the Base Curves
The second stage obtains the base curve for forecasting the energy demand for each of the considered days. The goal was to generate the base curves as accurately as possible to minimize the subsequent corrections. Thus, the base curve was a preliminary prediction, which was obtained using statistical techniques and was corrected at the final stage of the proposed methodology.

The process to obtain the base curve is described as follows:
• Once the day to be forecasted was identified, the typical curve obtained in Stage 1 was used to start the process. It is important to remember that this curve is normalized, but it represents the typical shape of the energy demand curve.
• The typical curve of that day was placed at the current average level of energy demand. For weekdays, this level was taken from the average energy demand of the previous days; for Saturdays, Sundays, and holidays, it was taken from the average energy demand of the same day in the previous week.
• The resulting curve was adjusted using the last four curves for the same day, yielding the base curve.
Figure 14 shows a sample of the construction of one of the base curves for a weekday of Market 1.
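The construction steps above can be sketched as follows. The paper does not give the exact levelling or adjustment formulas, so this example assumes that the typical shape is rescaled so that its mean matches the current average demand level, and that the adjustment is a weighted blend with the point-wise mean of the last four curves; the `blend` parameter and all sample values are hypothetical.

```python
def base_curve(typical_shape, recent_daily_means, last_four_curves, blend=0.5):
    """Place the typical shape at the current demand level and adjust it with history."""
    # Current average level of demand (previous days, or same day last week).
    level = sum(recent_daily_means) / len(recent_daily_means)

    # Rescale the normalized shape so that its mean equals the current level (assumption).
    shape_mean = sum(typical_shape) / len(typical_shape)
    levelled = [v * level / shape_mean for v in typical_shape]

    # Point-wise mean of the last four curves for the same day.
    history = [sum(curve[t] for curve in last_four_curves) / len(last_four_curves)
               for t in range(len(typical_shape))]

    # Weighted blend between the levelled shape and recent history (assumption).
    return [(1 - blend) * l + blend * h for l, h in zip(levelled, history)]


# Hypothetical inputs with four periods per day.
shape = [0.5, 0.75, 1.0, 0.75]
previous_day_means = [300.0, 310.0, 290.0]
last_four = [
    [210.0, 290.0, 410.0, 310.0],
    [208.0, 292.0, 412.0, 308.0],
    [212.0, 288.0, 408.0, 312.0],
    [210.0, 290.0, 410.0, 310.0],
]
preliminary_forecast = base_curve(shape, previous_day_means, last_four)
```

Setting `blend` to 0 returns the levelled typical curve alone, while a value of 1 reproduces the average of the last four curves; intermediate values trade shape stability against recent behaviour.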

Intelligent Correction for the Base Curve
Stage 2 corresponds to a statistical model that yields preliminary forecasting curves (i.e., base curves), which capture most of the expected forecast. However, they have to be slightly corrected to improve performance. The difference between a base curve and a real one was attributed to the effects of external variables (i.e., temperature, humidity, and wind speed). Thus, Stage 3 employed computational intelligence techniques to adjust the forecasting base curve as a function of such external variables.
The required steps for this stage were as follows:


• An ANN was the computational intelligence technique used for the adjustment. In total, eight independent models were obtained, one for each type of day. Each neural network had to adjust only one type of day (i.e., Sunday to Saturday, with holidays treated as an independent day type).
• The input to the ANN was a function of the external variables, while the output was the adjustment (increase or reduction) to be applied to each period to obtain a better energy demand forecast.
• The neural networks were trained with the historical records of the last two years.
Each of the phases considered within the intelligent correction of the base curve is detailed below.


Smoothing of the Corrected Curves
In the output of Stage 3, it was possible to observe a small ripple in the forecasting energy demand curves due to the fact that a certain level of randomness exists in ANN-based models. Therefore, a smoothing technique was implemented to further enhance the performance of the proposed model.
An inverse clustering technique was implemented to smooth the curves. This process considered the four curves most similar to the curve obtained after the intelligent correction. These curves served as a reference to adjust the trace of the corrected curve and reduce the ripple, yielding the required smoothing (see Figure 15).
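One possible reading of this smoothing step is sketched below. The paper does not detail the inverse clustering procedure, so the Euclidean distance used to rank similarity, the point-wise mean used as reference and the blending `weight` are assumptions, and the curves are synthetic.

```python
def roughness(curve):
    """Sum of absolute second differences, used here as a simple ripple measure."""
    return sum(abs(curve[t + 1] - 2 * curve[t] + curve[t - 1])
               for t in range(1, len(curve) - 1))


def smooth_with_neighbours(corrected, history, k=4, weight=0.5):
    """Pull the corrected curve toward the mean of its k most similar historical curves."""
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    nearest = sorted(history, key=lambda c: distance(corrected, c))[:k]
    reference = [sum(c[t] for c in nearest) / len(nearest) for t in range(len(corrected))]
    return [(1 - weight) * v + weight * r for v, r in zip(corrected, reference)]


# Synthetic example: smooth historical curves, one unrelated curve, and a rippled forecast.
history = [[100 + 10 * t + offset for t in range(6)] for offset in (-2, -1, 0, 1, 2)]
history.append([300] * 6)                            # dissimilar curve, ignored by the ranking
corrected = [103, 107, 123, 127, 143, 147]           # upward trend with an alternating ripple

smoothed = smooth_with_neighbours(corrected, history)
```

Because the reference curves are smooth, blending toward them lowers the ripple measure of the corrected curve while keeping its overall level and trend.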

Extreme Value Suppressors
Since no model is perfect, extreme values of the energy demand could appear in some periods of the generated curves. To correct these atypical values when they appeared, an extreme value suppressor was applied. The suppressor provided better robustness, as well as an increase in reliability.
The extreme value suppressor was based on median absolute deviation (MAD) elimination using a sliding window (see Figure 16).
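A sliding-window suppressor of this kind can be sketched in the style of a Hampel filter. The window size, the threshold of three scaled deviations and the replacement of the extreme value by the window median are assumptions, since the paper does not report these settings; the example curve is hypothetical.

```python
from statistics import median


def suppress_extremes(curve, half_window=2, threshold=3.0):
    """Replace points that deviate too far from the local median by that median."""
    cleaned = list(curve)
    for t in range(len(curve)):
        lo = max(0, t - half_window)
        hi = min(len(curve), t + half_window + 1)
        window = curve[lo:hi]
        centre = median(window)
        mad = median(abs(v - centre) for v in window)   # median absolute deviation
        # 1.4826 scales the MAD to the standard deviation of normally distributed data.
        if abs(curve[t] - centre) > threshold * 1.4826 * mad:
            cleaned[t] = centre
    return cleaned


# Hypothetical forecast with one extreme value in the fifth period.
forecast = [100, 102, 101, 103, 180, 104, 106, 105, 107]
cleaned = suppress_extremes(forecast)
```

Only the spike is replaced in this example; the remaining periods stay untouched because they lie within the tolerance around their local median. Note that a perfectly flat window gives a MAD of zero, so any deviation inside it would be replaced; a minimum tolerance could be added if that behaviour is not desired.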


Estimation of the ANN Parameters
The neural networks for each market were trained iteratively, varying the number of neurons in the hidden layer. The MAPE was used as a performance metric to evaluate the level of generalization of the neural networks. The training data were the energy demand and weather data of the markets for the period 2016-2017, while the 2018 data were used to verify the performance of the selected models. In general, a MultiLayer Perceptron (MLP) with sigmoid and linear activation functions in the hidden and output layers, respectively, was implemented. For all markets, eight neural networks were used, one for each type of day, with holidays treated as a special case.
In this work, the methodology for the neural networks used by the authors of [18][19][20] was considered as a reference to select the best possible configuration.
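The structure of such a network can be illustrated with the forward pass of an MLP that has a sigmoid hidden layer and a linear output layer. This is only a sketch: the weights below are hypothetical placeholders for values that would be learned during training, the three inputs stand for normalized T, H and W, and applying the output as an additive correction to the base curve is an assumption.

```python
import math


def mlp_correction(inputs, hidden_weights, hidden_bias, output_weights, output_bias):
    """Forward pass: sigmoid hidden layer followed by a linear output layer."""
    hidden = [
        1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(row, inputs)) + b)))
        for row, b in zip(hidden_weights, hidden_bias)
    ]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(output_weights, output_bias)]


def apply_correction(base, corrections):
    """Add one correction per period to the base curve (additive form assumed)."""
    return [b + c for b, c in zip(base, corrections)]


# Hypothetical network: 3 inputs (T, H, W), 2 hidden neurons, 4 output periods.
hidden_weights = [[0.8, -0.4, 0.1], [-0.3, 0.6, 0.2]]
hidden_bias = [0.0, 0.1]
output_weights = [[4.0, -2.0], [6.0, -1.0], [8.0, -3.0], [5.0, -2.0]]
output_bias = [0.0, 0.0, 0.0, 0.0]

weather = [0.7, 0.4, 0.3]                       # normalized T, H, W for the forecast day
base = [205.0, 295.0, 405.0, 305.0]             # base curve from Stage 2
corrected = apply_correction(
    base, mlp_correction(weather, hidden_weights, hidden_bias, output_weights, output_bias)
)
```

In the procedure described above, the number of hidden neurons would be varied and the configuration with the lowest MAPE on held-out data retained; one such network would exist for each of the eight day types.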

Results and Conclusions
The proposed methodology established a set of steps to model the behaviour of energy demand. First, energy demand data for Market 1 were used to validate the proposed methodology. This section presents the results obtained for Markets 1 to 5. The MAPE has frequently been used in the literature to measure the level of adjustment and generalization of prediction models. It is expressed as
$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right|,$$
where $y_t$ is the real energy demand in period $t$, $\hat{y}_t$ is the forecasted demand and $n$ is the number of forecasted periods.
Table 3 shows the error percentage obtained during each of the adjustment stages proposed in this methodology.
The results in Table 3 show the contribution of each adjustment stage to the error reduction. Table 4 shows the final performance of the models obtained for each market.
The results of Figures 17-21 and Table 4 demonstrate the fulfilment of the proposed objective, since it was possible to achieve a MAPE below 5% for a 10-day forecast for each of the markets. The results show a better performance in comparison with the same demand curves described by the authors of [20].
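For reference, the MAPE used to report these results can be computed with a few lines of code. The function below follows the standard definition of the metric; the demand values are hypothetical and do not reproduce any figure from Tables 3 or 4.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent. Actual values must be nonzero."""
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical hourly demand and forecast for four periods.
real_demand = [200.0, 300.0, 400.0, 310.0]
forecast_demand = [204.0, 294.0, 408.0, 303.8]
error_percent = mape(real_demand, forecast_demand)
```

Because each error is divided by the real demand, the metric is scale-free, which makes it convenient for comparing markets with different demand levels.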
The construction of consumption profiles proved to be a good step toward predicting demand. However, a preliminary base curve does not incorporate the effect of external variables such as weather conditions, so an intelligent correction with neural networks is necessary. This correction technique can adjust each period independently, but it may induce ripple in the output curve. Therefore, the intelligent correction agent was complemented with smoothing and outlier suppression mechanisms to improve performance.
The design of an adaptive methodology capable of anticipating changes in the behavioural dynamics of the time series is proposed for future work. This will allow the automatic adjustment of the parameters of the models without affecting performance significantly. The addition of an intelligent system will help to identify when the model should be retrained or replaced by another one available in the knowledge base that the proposed system can access to model a time series.