# Energy Demand Forecasting Using Deep Learning: Applications for the French Grid

## Abstract

## 1. Introduction

- Statistical models: Purely empirical models in which inputs and outputs are correlated using statistical inference methods, such as:
  - Cointegration analysis and ARIMA;
  - Log-linear regression models;
  - Combined bootstrap aggregation (bagging) of ARIMA and exponential smoothing.

- Grey models: These combine a partial theoretical structure with empirical data to complete that structure. Compared to purely statistical models, grey models require only a limited amount of data to infer the behavior of the electrical system, since they can deal with partially known information by generating, mining, and extracting useful information from what is available. On the other hand, constructing the partial theoretical structure that grey models require demands considerable modeling effort, so the cost of an accurate grey model for a particular application is usually high.
- Artificial intelligence models: Traditional machine learning models are data-driven techniques used to model complex relationships between inputs and outputs. Although machine learning is mostly statistical in its foundations, the current availability of open platforms for easily designing and training models has made the technology widely accessible. This, together with the high performance achieved by well-designed and well-trained models, makes machine learning an affordable and robust tool for power demand forecasting.
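As a minimal illustration of the purely statistical family above, the sketch below implements simple exponential smoothing in plain Python. The smoothing factor `alpha` and the toy demand series are hypothetical placeholders, not values from this study:

```python
def exp_smooth_forecast(series, alpha=0.5):
    """Simple exponential smoothing: the next-step forecast is a
    weighted average of the latest observation and the previous forecast."""
    forecast = series[0]  # initialize with the first observation
    for x in series[1:]:
        forecast = alpha * x + (1 - alpha) * forecast
    return forecast

# Toy hourly demand values (MW); prints the one-step-ahead forecast
demand = [100.0, 104.0, 102.0, 108.0]
print(round(exp_smooth_forecast(demand, alpha=0.5), 2))  # 105.0
```

In the bagging variants cited above, many such forecasters are fit on resampled versions of the series and their predictions averaged.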

## 2. Materials and Methods

#### 2.1. Data Analysis

#### 2.1.1. Power Demand Data

#### 2.1.2. Weather Forecast Data

#### 2.2. Data Preparation

- Training Dataset (80% of the data): The sample of data used to fit the model.
- Validation Dataset (10% of the data): The sample of data used to provide an unbiased evaluation of a model fit on the training dataset while tuning model hyperparameters.
- Test Dataset (10% of the data): The sample of data used to provide an unbiased evaluation of a final model fit on the training dataset.
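The 80/10/10 partition above can be sketched as follows. This is a minimal illustration only: the paper does not state here whether records were shuffled or kept in time order, so the chronological variant shown is an assumption:

```python
def chronological_split(samples, train=0.8, val=0.1):
    """Split a time-ordered dataset into train/validation/test partitions.
    Order is preserved so no future information leaks into training."""
    n = len(samples)
    n_train = int(n * train)
    n_val = int(n * val)
    return (samples[:n_train],
            samples[n_train:n_train + n_val],
            samples[n_train + n_val:])

hours = list(range(1000))  # e.g., 1000 hypothetical hourly records
tr, va, te = chronological_split(hours)
print(len(tr), len(va), len(te))  # 800 100 100
```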

#### 2.3. Deep Learning Architecture

- Week of the year: a number from 1 to 52.
- Hour: a number from 0 to 23.
- Day of the week: a number from 0 to 6.
- Holiday: true (1) or false (0).
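The four calendar inputs listed above can be derived from a timestamp with the standard library. The holiday calendar lookup is a hypothetical placeholder, since the source of the holiday data is not specified here:

```python
from datetime import datetime

def calendar_features(ts, holidays=frozenset()):
    """Encode the four calendar inputs described above from a timestamp.
    `holidays` is a caller-supplied set of (month, day) tuples."""
    return {
        "week_of_year": ts.isocalendar()[1],  # 1..52 (53 in some ISO years)
        "hour": ts.hour,                      # 0..23
        "day_of_week": ts.weekday(),          # 0 (Monday) .. 6 (Sunday)
        "holiday": 1 if (ts.month, ts.day) in holidays else 0,
    }

# Bastille Day 2019, 15:00 (a Sunday)
f = calendar_features(datetime(2019, 7, 14, 15), holidays={(7, 14)})
print(f)
```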

#### 2.3.1. Convolutional Neural Network

- They use fewer parameters (weights) than fully connected networks.
- When processing images, they are designed to be invariant to object position and scene distortion, a property that carries over when they are fed other kinds of inputs as well.
- They can automatically learn and generalize features from the input domain.

- A two-dimensional convolutional layer. This layer creates a convolution kernel that is convolved with the layer input to produce a tensor of outputs. It was set with the following parameters:
  - Filters: 8, the dimensionality of the output space.
  - Kernel size: (2,2), the height and width of the 2D convolution window.
  - Strides: (1,1), the stride of the convolution along the height and width.
  - Activation function: rectified linear unit (ReLU).
  - Padding: the input is padded (if needed) so that it is fully covered by the filter at the specified stride.

- Average pooling 2D: average pooling operation for spatial data, set with the following parameters:
  - Pool size: (2,2), the factors by which to downscale.
  - Strides: (1,1).

- Flattening: to flatten the input.
- A fully connected network providing the featured output temperature:
  - Layer 1: 64 neurons, activation function: ReLU.
  - Layer 2: 24 neurons, activation function: ReLU.
  - Layer 3: 1 neuron, activation function: ReLU.

#### 2.3.2. Artificial Neural Network

- Layer 1: 256 neurons, activation function: ReLU.
- Layer 2: 128 neurons, activation function: ReLU.
- Layer 3: 64 neurons, activation function: ReLU.
- Layer 4: 32 neurons, activation function: ReLU.
- Layer 5: 16 neurons, activation function: ReLU.
- Layer 6: 1 neuron, activation function: ReLU.
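A forward pass through this fully connected stack can be sketched in NumPy. Weights are random placeholders for illustration, and the 10-feature input width is a hypothetical stand-in for the concatenated inputs:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def mlp_forward(x, sizes):
    """Forward pass through dense ReLU layers of the given widths.
    Weights and biases are random placeholders, not trained values."""
    rng = np.random.default_rng(0)
    for n_out in sizes:
        w = rng.normal(scale=0.1, size=(x.shape[-1], n_out))
        b = np.zeros(n_out)
        x = relu(x @ w + b)
    return x

x = np.ones((1, 10))                       # hypothetical 10-feature input
y = mlp_forward(x, [256, 128, 64, 32, 16, 1])
print(y.shape)                             # (1, 1): a single demand value
```

Note that the final ReLU constrains the output to be non-negative, which is consistent with predicting a power demand.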

#### 2.3.3. Design of the Architecture

#### 2.4. Training

- Reduce the training error to as low as possible.
- Keep the gap between the training and validation errors as low as possible.

- Batch size: 100. The number of training examples processed in one forward/backward pass; the larger the batch size, the more memory is needed.
- Epochs: 30,000. One epoch is one forward pass and one backward pass over all the training examples.
- Learning rate: 0.001. Determines the step size at each iteration while moving toward a minimum of the loss function.
- β1 parameter: 0.9. The exponential decay rate for the first moment estimates (momentum).
- β2 parameter: 0.99. The exponential decay rate for the second moment estimates (RMSprop).
- Loss function: mean absolute percentage error (MAPE).
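A single Adam update with these hyperparameters, together with the MAPE loss, can be sketched as follows. This is a minimal scalar illustration of the update rule, not the framework implementation used for training:

```python
import math

def mape(y_true, y_pred):
    """Mean absolute percentage error, the training loss listed above."""
    return 100.0 * sum(abs((t - p) / t)
                       for t, p in zip(y_true, y_pred)) / len(y_true)

def adam_step(theta, grad, m, v, t, lr=0.001, b1=0.9, b2=0.99, eps=1e-8):
    """One Adam update on a scalar parameter with the settings above."""
    m = b1 * m + (1 - b1) * grad        # first-moment (momentum) estimate
    v = b2 * v + (1 - b2) * grad ** 2   # second-moment (RMSprop) estimate
    m_hat = m / (1 - b1 ** t)           # bias correction
    v_hat = v / (1 - b2 ** t)
    theta -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

print(round(mape([100.0, 200.0], [110.0, 190.0]), 2))  # (10% + 5%) / 2 = 7.5
theta, m, v = adam_step(theta=1.0, grad=2.0, m=0.0, v=0.0, t=1)
```

At the first step the bias-corrected update is approximately lr × sign(grad), which is what makes Adam's initial steps insensitive to the raw gradient magnitude.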

## 3. Results

## 4. Discussion

## Author Contributions

## Funding

## Conflicts of Interest

## References

- Hu, H.; Wang, L.; Peng, L.; Zeng, Y.-R. Effective energy consumption forecasting using enhanced bagged echo state network. *Energy* **2020**, 197, 1167–1178.
- Oliveira, E.M.; Luiz, F.; Oliveira, C. Forecasting mid-long term electric energy consumption through bagging ARIMA and exponential smoothing methods. *Energy* **2018**, 144, 776–778.
- Wang, J.; Ma, Y.; Zhang, L.; Gao, R.X.; Wu, D. Deep learning for smart manufacturing: Methods and applications. *J. Manuf. Syst.* **2018**, 48, 144–156.
- RazaKhan, A.; Mahmood, A.; Safdar, A.A.; Khan, Z. Load forecasting, dynamic pricing and DSM in smart grid: A review. *Renew. Sustain. Energy Rev.* **2016**, 54, 1311–1322.
- Hippert, H.; Pedreira, C.; Souza, R. Neural networks for short-term load forecasting: A review and evaluation. *IEEE Trans. Power Syst.* **2001**, 16, 44–55.
- Gonzalez-Romera, J.-M.; Carmona-Fernandez, M. Monthly electric demand forecasting based on trend extraction. *IEEE Trans. Power Syst.* **2006**, 21, 1946–1953.
- Becalli, M.; Cellura, M.; Brano, L.; Marvuglia, V. Forecasting daily urban electric load profiles using artificial neural networks. *Energy Convers. Manag.* **2004**, 45, 2879–2900.
- Srinivasan, D.; Lee, M.A. Survey of hybrid fuzzy neural approaches to electric load forecasting. In Proceedings of the IEEE International Conference on Systems, Man and Cybernetics: Intelligent Systems for the 21st Century, Vancouver, BC, Canada, 22–25 October 1995.
- Liu, K.; Subbarayan, S.; Shoults, R.; Manry, M. Comparison of very short-term load forecasting techniques. *IEEE Trans. Power Syst.* **1996**, 11, 877–882.
- Bo, H.; Nie, Y.; Wang, J. Electric load forecasting using a novel hybrid model on the basis of a data preprocessing technique and a multi-objective optimization algorithm. *IEEE Access* **2020**, 8, 13858–13874.
- Wen, S.; Wang, Y.; Tang, Y.; Xu, Y.; Li, P.; Zhao, T. Real-time identification of power fluctuations based on LSTM recurrent neural network: A case study on the Singapore power system. *IEEE Trans. Ind. Inform.* **2019**, 15, 5266–5275.
- Gu, J.; Wang, Z.; Kuen, J.; Ma, L.; Shahroudy, A.; Shuai, B.; Wang, X.; Wang, L.; Wang, G.; Cai, J.; et al. Recent advances in convolutional neural networks. *Pattern Recognit.* **2017**.
- Fukushima, K. Neocognitron: A hierarchical neural network capable of visual pattern recognition. *Neural Netw.* **1988**, 1, 119–130.
- Alom, M.Z.; Taha, T.M.; Yakopcic, C.; Westberg, S.; Sidike, P. A state-of-the-art survey on deep learning theory and architectures. *Electronics* **2019**, 8, 292.
- Wang, L.; Lu, H.; Ruan, X.; Yang, M.H. Deep networks for saliency detection via local estimation and global search. In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015.
- Jasiński, T. Modeling electricity consumption using nighttime light images and artificial neural networks. *Energy* **2019**, 179, 831–842.
- Kuo, P.-H.; Huang, C.-J. A high precision artificial neural networks model for short-term energy load forecasting. *Energies* **2018**, 11, 213.
- RTE. November 2014. Available online: http://clients.rte-france.com/lang/fr/visiteurs/vie/courbes_methodologie.jsp (accessed on 20 December 2019).
- Arenal Gómez, C. Modelo de Temperatura Para la Mejora de la Predicción de la Demanda Eléctrica: Aplicación al Sistema Peninsular Español; Universidad Politécnica de Madrid: Madrid, Spain, 2016.
- ARPEGE. Meteo France, Le Modele. 2019. Available online: https://donneespubliques.meteofrance.fr/client/document/doc_arpege_pour-portail_20190827-_249.pdf (accessed on 3 February 2020).
- Goodfellow, I.; Bengio, Y.; Courville, A. Optimization for training deep models. In *Deep Learning*; 2017; pp. 274–330. Available online: http://faculty.neu.edu.cn/yury/AAI/Textbook/DeepLearningBook.pdf (accessed on 3 February 2020).
- Brownlee, J. Machine Learning Mastery. Available online: https://machinelearningmastery.com/weightregularization- (accessed on 3 February 2020).

**Figure 1.** Comparison between traditional machine learning models (**a**), requiring manual feature extraction, and modern deep learning structures (**b**), which automate the feature extraction and training process in an end-to-end learning structure.

**Figure 2.** Monthly French energy demand for the period 2018–2019. Q_{x} indicates the x-th data percentile. The colored lines within the Q_{25} and Q_{75} quartile boxes represent the median (orange line) and the mean (dashed green line). Points below Q_{25} and above Q_{75} are shown as well.

**Figure 3.**Correlation between energy consumption and temperature as provided by the Réseau de Transport d’Electricité (RTE).

**Figure 6.**Division of the original dataset (365 days) into testing and training data. The testing data were used as a complementary means to further analyze the generalization performance of the resulting model. The remaining training data were divided as usual: 80% train, 10% validation, and 10% testing.

**Figure 7.**Deep learning structure composed of a convolutional neural network followed by an artificial neural network adapted to the energy demand forecasting problem.

**Figure 8.**Absolute percentage error distribution provided by the deep learning structure proposed in this paper and the RTE subscription-based service.

**Figure 9.** Monthly absolute percentage error metrics over an entire year, as provided by the proposed deep neural network.

**Figure 10.** Performance of the forecast provided by the proposed deep neural network on the eight full days extracted from the original data. (**Left column**) Real energy consumption, neural network energy prediction, and RTE model energy prediction on a different full day in the testing set. (**Right column**) Absolute percentage error in energy prediction throughout the day by the neural network and the RTE model.

**Table 1.**Summary of the results of the different structures. ANN: artificial neural network; CNN: convolutional neural network.

| Model | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Layer 1 (CNN) | - | 64 | 64 | 64 | 64 |
| Layer 2 (CNN) | 24 | 24 | 24 | 24 | 24 |
| Layer 1 (ANN) | - | - | - | - | 256 |
| Layer 2 (ANN) | - | - | - | 128 | 128 |
| Layer 3 (ANN) | - | - | 64 | 64 | 64 |
| Layer 4 (ANN) | 32 | 32 | 32 | 32 | 32 |
| Layer 5 (ANN) | 16 | 16 | 16 | 16 | 16 |
| Layer 6 (ANN) | 1 | 1 | 1 | 1 | 1 |
| Train Error (%) | 1.9548 | 1.2275 | 0.6532 | 0.4797 | 0.4929 |
| Validation Error (%) | 2.7721 | 2.6791 | 1.2307 | 0.9378 | 0.8603 |
| Test Error 1 (%) | 2.8185 | 3.0435 | 1.2415 | 0.9125 | 0.8843 |
| Test Error 2 (%) | 4.2818 | 4.1677 | 2.0604 | 1.6873 | 1.5378 |
| Cross-Validation Error (%) | 5.8827 | 5.3691 | 2.6341 | 2.0806 | 1.6621 |

**Table 2.**Performance comparison metrics. MAE: mean absolute error; MAPE: mean absolute percentage error; MBE: mean bias error; MBPE: mean bias percentage error.

| Model | MAE (MW) | MAPE (%) | MBE (MW) | MBPE (%) |
|---|---|---|---|---|
| Deep learning network | 808.317 | 1.4934 | 21.7444 | 0.0231 |
| RTE forecast service | 812.707 | 1.4941 | 280.8350 | 0.4665 |
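The four metrics in the comparison can be computed as follows. This is a straightforward sketch in which `y_true` and `y_pred` are hypothetical demand series in MW, and the sign convention (actual minus forecast) is an assumption:

```python
def forecast_metrics(y_true, y_pred):
    """MAE, MAPE, MBE and MBPE as used in the comparison tables.
    MBE/MBPE keep the sign of the error (here: actual minus forecast),
    so they reveal systematic over- or under-prediction."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    return {
        "MAE_MW": sum(abs(e) for e in errors) / n,
        "MAPE_pct": 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
        "MBE_MW": sum(errors) / n,
        "MBPE_pct": 100.0 * sum(e / t for e, t in zip(errors, y_true)) / n,
    }

# Two toy hourly demand values (MW): errors of +1000 and -1000 MW
m = forecast_metrics([50000.0, 60000.0], [49000.0, 61000.0])
print(m["MAE_MW"], m["MBE_MW"])  # 1000.0 0.0
```

The example shows why both families of metrics are reported: the two errors cancel in the MBE (no bias) while the MAE still registers their magnitude.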

| Month | MAE (MW) | MAPE (%) | MBE (MW) | MBPE (%) |
|---|---|---|---|---|
| January | 965.1701 | 1.3542 | 54.1231 | 0.0499 |
| February | 818.8975 | 1.2424 | 203.2982 | 0.2812 |
| March | 823.4836 | 1.4667 | 25.0578 | −0.0033 |
| April | 1041.4191 | 1.9774 | 554.1952 | 1.0199 |
| May | 684.9614 | 1.4718 | 94.5629 | 0.1791 |
| June | 544.8806 | 1.2693 | 29.9470 | 0.0536 |
| July | 588.3867 | 1.2318 | −264.5221 | −0.5527 |
| August | 572.0692 | 1.3592 | −98.9578 | −0.2256 |
| September | 618.1227 | 1.3575 | −168.3288 | −0.3636 |
| October | 676.8499 | 1.3964 | 67.5875 | 0.1724 |
| November | 1062.7987 | 1.7511 | 211.3716 | 0.3383 |
| December | 1303.7523 | 2.0131 | −385.4661 | −0.5891 |

| Month | MAE (MW) | MAPE (%) | MBE (MW) | MBPE (%) |
|---|---|---|---|---|
| January | 1078.5000 | 1.5249 | 275.5658 | 0.3773 |
| February | 1011.5000 | 1.5078 | 504.5656 | 0.7169 |
| March | 1082.8261 | 1.8753 | 843.4203 | 1.4599 |
| April | 751.6769 | 1.4599 | 243.6000 | 0.4678 |
| May | 722.7412 | 1.5619 | 138.6000 | 0.2854 |
| June | 604.5833 | 1.3278 | 71.5307 | 0.1574 |
| July | 623.2633 | 1.3025 | −156.375 | −0.3392 |
| August | 607.4872 | 1.4180 | 92.2435 | 0.2202 |
| September | 543.9652 | 1.2333 | 8.8125 | −0.0092 |
| October | 760.9367 | 1.4758 | 222.9873 | 0.3394 |
| November | 917.9838 | 1.5582 | 579.1290 | 1.0029 |
| December | 1039.5945 | 1.6306 | 572.2432 | 0.9326 |

| Model | MAE (MW) | MAPE (%) | MBE (MW) | MBPE (%) |
|---|---|---|---|---|
| Linear regression | 6217.5683 | 12.3102 | −232.3066 | −2.5240 |
| Regression tree | 5436.8139 | 10.4437 | −254.3756 | −2.0879 |
| Support vector regression (linear) | 6217.4404 | 12.2809 | −0.597 | −2.0014 |
| Support vector regression (polynomial) | 4813.6934 | 9.2218 | 246.248 | −1.047 |
| ARIMA | 1179.964 | 2.9480 | 104.6737 | 0.2097 |
| ANN | 1537.5137 | 2.8351 | 132.1096 | 0.1195 |
| CNN + ANN | 808.3166 | 1.4934 | 21.7444 | 0.0231 |
| RTE model | 812.6966 | 1.4941 | 280.835 | 0.4665 |

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

del Real, A.J.; Dorado, F.; Durán, J. Energy Demand Forecasting Using Deep Learning: Applications for the French Grid. *Energies* **2020**, *13*, 2242.
https://doi.org/10.3390/en13092242
