Article

Hybrid Metaheuristic Algorithms for Optimization of Countrywide Primary Energy: Analysing Estimation and Year-Ahead Prediction

by Basharat Jamil 1,* and Lucía Serrano-Luján 1,2
1 Department of Computer Science and Statistics, Universidad Rey Juan Carlos, Calle Tulipán S/N, Móstoles, 28933 Madrid, Spain
2 Department of Electronics, Plaza del Hospital S/N, Universidad Politécnica de Cartagena, 30202 Murcia, Spain
* Author to whom correspondence should be addressed.
Energies 2024, 17(7), 1697; https://doi.org/10.3390/en17071697
Submission received: 8 February 2024 / Revised: 21 March 2024 / Accepted: 28 March 2024 / Published: 2 April 2024
(This article belongs to the Special Issue Optimization of Energy Systems Using Intelligent Methods)

Abstract

In the present work, India’s primary energy use is analysed in terms of four socio-economic variables: Gross Domestic Product, population, and the amounts of exports and imports. Historical data were obtained from the World Bank database as annual values for 44 years (1971–2014). Energy use is analysed as an optimisation problem, to which a unique ensemble of two metaheuristic algorithms, Grammatical Evolution (GE) and Differential Evolution (DE), is applied. The energy optimisation problem has been investigated in two ways: estimation and year-ahead prediction. Models are compared using RMSE (the objective function) and further ranked using the Global Performance Index (GPI). For the estimation problem, RMSE values are found to be as low as 0.0078 and 0.0103 on the training and test datasets, respectively. The average estimated energy use is in good agreement with the data (RMSE = 6.3749 kgoe/capita), and the best model (E10) has an RMSE of 5.8183 kgoe/capita, with a GPI of 1.7249. For the prediction problem, RMSE is found to be 0.0096 and 0.0122 on the training and test datasets, respectively. The average predicted energy use has an RMSE of 7.8857 kgoe/capita, while Model P20 has the best value of RMSE (7.9201 kgoe/capita) and a GPI of 1.8836.

1. Introduction

Energy forms the basis of our lives, is central to a country’s economy, and is the primary requirement for every organism thriving on the planet. Economic and population growth correlate directly with energy demand [1]. Therefore, energy planning and development, research, and modelling have become a dedicated activity of organizations seeking to provide future generations with energy security. The depletion of oil and coal-based conventional fuels has also forced the human race to shift its focus to renewable energy sources and to adopt energy-responsible behaviour.
India is a developing country with a rapid pace of industrial expansion and infrastructural development, coupled with an increasing population (1.4 billion people), which, consequently, leads to high energy demand. A few decades ago, on average, India’s industrial sector consumed 52% of the energy, while the transportation and domestic sectors accounted for 23% and 11%, respectively [2]. Furthermore, the evolution of the modern lifestyle, characterised by the use of electrical and electronic devices, has been pursued for greater comfort and enjoyment [3]. Primary energy use in India, in the decade after 2009, increased from 21 EJ to 34 EJ, a rise of 62% in only a decade, with an average growth rate of 4.7% [4]. At this rate, the energy supply needs to grow to meet the future energy demand [5].
Therefore, India needs to address its energy challenge through technological development and policy changes to include the options available within the energy sectors and accelerate toward achieving net-zero emissions by 2050. For this, estimating energy use to assess the growth of energy demand is an essential activity in drafting future policies.
Energy demand forecasting involves using long-term historical data to predict future energy demand using statistical methods [6]. These forecast models often depend on climate, socio-economic parameters, and demographic data, resulting in high uncertainties in the predicted results. These variations occur when projections are made at different scales. Researchers have extensively studied energy demand estimation and prediction problems across various time frequencies and scales (daily, monthly, and annual), combined with sectoral and regional levels, as well as the country as a whole (usually referred to as “primary energy demand”).
The modelling approaches to the problems of energy demand estimation and prediction can be classified into three main categories, viz, (i) statistical methods and time-series models, (ii) econometric models, and (iii) Artificial Intelligence (AI) methods [7,8]. AI methods have been further extended with neuro-fuzzy methods and, more recently, metaheuristics [9,10,11].
AI techniques have been developed and implemented to solve energy demand forecasting problems, and they generally provide greater accuracy than statistical/deterministic means. Consequently, the recent literature has focused more on developing these methods. In this section, we look at some of the approaches proposed in recent years.
Jiang et al. [1] presented a new method to forecast short-term electrical energy demand, combining adaptive Fourier decomposition with a new signal pre-processing technology that extracts the useful component from the electricity demand data series by discarding the noise. They suggested that the method, which pre-processes the data to remove seasonality, can be effectively used to forecast energy demand. Sajadi et al. [12] investigated the effect of energy prices on long-term energy forecasting. In addition, they studied electricity generation from natural gas, which had been reported missing in the literature. They applied first-order Takagi–Sugeno type fuzzy inference systems (TS-FIS) to construct the regression models. The developed model was demonstrated through a case study of Iran, and it was deduced that high electricity prices result in considerably less energy use. In a similar study, Dalfard et al. [13] investigated the relationship between the hike in energy prices and the electricity demand and use of natural gas. An Adaptive Network-based FIS (ANFIS) combined with Monte Carlo simulation was developed to model natural gas consumption in power generation (NGPG). The approach was verified using data on electrical energy and natural gas combined with socio-economic parameters for Iran between 2010 and 2016. It was reported that the approach could be adopted for prediction problems where energy prices vary suddenly.
Daş [14] forecasted Turkey’s energy demand using Particle Swarm Optimization with mutation based on an improved neural network (PSOM-NN). Energy demand was modelled in terms of Turkey’s GDP, population, imports, and exports between 1979 and 2005. It was concluded that the PSOM-NN produces forecasts with greater accuracy than previous approaches for modelling energy demand. Toksar [17] applied Ant Colony Optimization (ACO) to Turkey’s energy demand estimation using the four commonly used socio-economic variables (GDP, population, imports, and exports). Two models (linear and quadratic) were proposed, and both were optimized with ACO. It was concluded that the quadratic ACO model outperforms the linear model, with a relative error as low as −0.15%. Unler [18], in a similar study for Turkey, presented a swarm intelligence approach to estimate the energy demand using the data for macroeconomic variables (GDP, population, imports, and exports) from 1979 to 2005. Linear and quadratic models were developed and correspondingly compared to the PSO approach. Three scenarios were then presented to project the future energy demand for the country between 2006 and 2025. It was deduced that PSO underestimated the energy demand as compared to ACO.
Salcedo-Sanz et al. [15] described a one-year-ahead energy demand estimation approach for Spain using two different computational algorithms: a modified Harmony Search (HS) optimization algorithm with an exponential prediction model, and an Extreme Learning Machine (ELM). Data for 14 macroeconomic variables over 30 years were used to model the energy demand. Compared with previously published algorithms, the prediction accuracy was reported to improve by 15%. The results were extended to model CO2 emissions for the country using the same evolutionary algorithms, and the accuracy was reported to improve by 10%. Sánchez-Oro et al. [16] presented a hybrid variable neighbourhood search–Extreme Learning Machine algorithm to predict the energy demand for Spain. The feature selection mechanism, combined with the exponential prediction model and historical data for several macroeconomic variables, resulted in an excellent performance of the proposed approach. It was further shown that even during the crisis year of 2008, the energy prediction was accurate within 2%.
Yu et al. [19] proposed a hybrid of PSO with a Genetic Algorithm (PSO-GA) to estimate China’s primary energy demand. GDP, population, economic structure, urbanization rate, and energy-use structure were used (20 historical years, 1990–2009) to model the energy demand, and the coefficients were optimized using PSO-GA. Projections were made for 2020, and energy demand was reported to be 6.91, 5.03, and 6.11 billion tce (“standard” tons coal equivalent) under three scenarios. Wang et al. [20] forecasted the energy demand behaviour of China and India using single-linear, hybrid-linear, and non-linear time series forecast techniques based on Grey Theory. The estimates were developed for the years 1990–2016. The proposed techniques were confirmed to have high accuracy, with mean absolute percent errors of 1.30–3.08%, 0.80–2.57%, and 2.06–2.19% for the single-linear, hybrid-linear, and non-linear techniques, respectively.
Table 1 summarizes the performance of AI techniques in forecasting energy consumption and electricity demand in various countries and regions. Sajadi et al. (2013) [12] utilized Logarithmic Regression, ANN, ANFIS, and a Takagi–Sugeno-type fuzzy inference system (TS-FIS) to predict yearly energy consumption in Iran, achieving MAPE values ranging from 1.46% to 3.62%. Dalfard et al. (2013) [13] employed ANFIS models to forecast electricity consumption in Iran, achieving MAPE values as low as 0.89%. Wang et al. (2018) [20] used the Metabolic Grey Model (MGM) and Nonlinear Metabolic Grey Model (NMGM) techniques to predict energy demand in China and India, with MAPE values ranging from 0.804% to 3.078%. Özdemir et al. (2022) [21] applied Artificial Bee Colony (M-ABC) algorithms to forecast yearly energy demand in Turkey, achieving high R-squared values and low MAPE values. Incremona and Nicolao (2022) [22] utilized Gaussian Process estimators to predict electricity load demands in Italy, reporting MAPE values of 1.77%. Torres et al. (2022) [23] employed Long Short-Term Memory (LSTM) models to forecast 10 min electricity demand in Spain, with MAPE values of 1.4472%. Additionally, several other studies, focusing on countries such as the USA, Iraq, Greece, and Australia, employed various AI techniques, including ANN, feedforward neural network (FNN), gated recurrent unit neural network (GRU-NN), and vector machine (VM), to predict electricity demand with varying degrees of accuracy.
This paper investigates the relationship between India’s primary energy use and socio-economic parameters, including GDP, population, and values of exports and imports. The input parameters were selected based on the literature review as they are the most influential parameters affecting energy use. The analysis was carried out using long-term annual historical data obtained from public websites. Metaheuristic algorithms were applied to first analyse the estimation of energy use. Once verified, a similar approach was applied to predict the year-ahead (short-term) energy use. The novelty of the work is the use of an ensemble of two metaheuristic techniques (Grammatical Evolution and Differential Evolution, GE-DE) to analyse the estimation problem, as well as the prediction problem for India. Further, the paper also presents energy-use values until recent years, considering the behaviour of socio-economic parameters.
The combination of DE and GE enhances the accuracy of the results. GE focuses on evolving symbolic expressions represented by grammatical structures, while DE excels at optimizing numerical parameters within these structures. In the context of this study, GE evolves symbolic expressions to represent energy-use patterns based on socio-economic variables, while DE optimizes numerical parameters within these expressions. So, the addition of DE enhances the parameter optimization. The hybrid approach avoids the entanglement of search spaces, enabling more effective exploration and exploitation of the solution space.
DE introduces a differential mechanism for exploring the search space, which differs from the traditional genetic operators used in GE. This mechanism perturbs the population and guides the search towards promising regions of the solution space, which here consists of candidate models of India’s energy use. By integrating DE into the evaluation phase of GE, the hybrid approach expands the dynamics of the search process, allowing for a more thorough exploration of diverse solution candidates and increasing the probability of discovering high-quality solutions. GE excels in generating diverse symbolic expressions that capture the underlying patterns in the data, while DE optimizes the numerical parameters within these expressions to fine-tune their predictive performance. By leveraging the strengths of both techniques, the hybrid approach achieves improved accuracy, robustness, and generalization across different datasets and problem domains, ultimately resulting in more effective predictive models.
Specifically, we address the following objectives:
  • Analyse long-term historical data of India’s energy use for the period (1971–2014).
  • Quantify the performance of the ensemble of algorithms (GE-DE) on the energy-use estimation and prediction problems.
  • Analyse the associated uncertainties produced from the models in terms of statistical errors.
  • Select the best model using the Global Performance Index (GPI) and compare the average estimations and projections with the one from the best model.
  • Project the energy-use behaviour for India until the year 2022.
The article is structured into three main sections. Section 2 describes the source, collection, and pre-processing of data; the definition of the problem; and the objective function together with the metrics used to analyse the models. Section 3 presents the results obtained for each of the problems defined and the verification of the results. Finally, Section 4 concludes the work and provides recommendations for future work.

2. Methodology

2.1. Selection of Data

The data for the study of energy use in India have been obtained from the World Bank database [28]. The target (or the output of the model) is the energy use (in kg of oil equivalent per capita). The input data comprise four features which were found to be most influential on the energy use in the modelling approaches presented in the literature: (i) Gross Domestic Product (GDP, in current US$), (ii) population (total), (iii) exports of goods and services (current US$), and (iv) imports of goods and services (current US$). The data, as annual values for all four inputs and the output (energy use), were selected from 1971 to 2014 (44 years) based on availability.
For the sake of simplicity and ease, we denote energy use (in kg of oil equivalent per capita) as E, Gross Domestic Product (GDP in current US$) as X1, population (total) as X2, exports of goods and services (current US$) as X3, and imports of goods and services (current US$) as X4.
Table 2 shows the correlation matrix based on the data obtained. It can be seen that ‘Energy Use’ (E) has a strong correlation with the selected socio-economic parameters, with the highest correlation with GDP of 0.9655, followed by population with a correlation of 0.9515, exports of goods and services with a correlation of 0.9354, and finally, imports of goods and services with a correlation of 0.9268. The results confirm that the selected socio-economic parameters influence the energy-use behaviour of the country and were appropriately selected for the modelling procedure.

2.2. Selection of Training and Test Datasets

The target value and the input parameters have different ranges, which can affect optimization. Thus, before the data were fed to the algorithm, each parameter was normalized by dividing all of its values by the corresponding maximum value:
$$E_{i,n} = \frac{E_i}{E_{max}} \quad \text{and} \quad X_{i,n} = \frac{X_{i,j}}{X_{i,max}} \tag{1}$$
Consequently, the normalized target values ($E_{i,n}$) and each of the normalized inputs ($X_{1n}$, $X_{2n}$, $X_{3n}$ and $X_{4n}$) have a maximum value of one.
An outline of the methodology followed under the current work is presented in Figure 1.
Following this, the complete data were bifurcated into training and test datasets. The training dataset was used to train the algorithm, while the test dataset was used to test the algorithm’s accuracy on the independent dataset. The process of bifurcation was purely random. In any case (estimation or prediction), half of the data were reserved for training and the rest for the test. More details on the problem formulations and the functioning of algorithms will be presented in the coming sections.
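The normalization and the random half-and-half split described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names `max_normalize` and `random_half_split`, the fixed seed, and the sample numbers are hypothetical and are not taken from the study.

```python
import random

def max_normalize(values):
    """Divide every value by the series maximum so the largest becomes 1."""
    peak = max(values)
    return [v / peak for v in values]

def random_half_split(n_samples, seed=0):
    """Randomly assign half of the sample indices to training, the rest to test."""
    indices = list(range(n_samples))
    rng = random.Random(seed)          # fixed seed only for reproducibility
    rng.shuffle(indices)
    half = n_samples // 2
    return sorted(indices[:half]), sorted(indices[half:])

# Placeholder numbers for illustration only (not World Bank data).
energy = [270.0, 300.0, 350.0, 420.0, 500.0, 640.0]
energy_n = max_normalize(energy)
train_idx, test_idx = random_half_split(len(energy), seed=42)
```

With the 44 annual values used in the study, such a split would place 22 years in each subset.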

2.3. Problem Definition

Under the present work, two approaches are presented for quantifying energy-use behaviour in India, viz, estimation and prediction. These are explained below.
In this study, the estimation problem refers to estimating the target value from the input variables of the same year. The energy-use estimation problem is treated as an optimization problem, and its general mathematical form is:
$$E_{i,t} = f\left(X_{ji}\right)_t \tag{2}$$
where $E_{i,t}$ is the energy use, with $i = 1, 2, \ldots, n$, $n$ being the number of data points in the complete dataset; $j$ refers to the input variables ($j = 1, 2, 3, 4$); and the suffix $t$ indicates that both sides of the equation refer to the same year. In other words, the estimation problem requires estimating energy use from the input parameters of the same year.
The prediction problem is also treated as an optimization problem. Here, however, the energy use for the year ahead ($E_{t+1}$) is predicted from the inputs of the current year. Mathematically,
$$E_{i,t+1} = f\left(X_{ji}\right)_t \tag{3}$$
Thus, for each of the above-defined problems, the combination of metaheuristic algorithms (GE-DE) was applied to optimize the target value. The algorithms are explained in Section 2.4.
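The only difference between the two problem definitions is how inputs and targets are paired in time. The following Python sketch illustrates this pairing; the helper `make_pairs`, its `horizon` parameter, and the toy values are assumptions introduced for illustration.

```python
def make_pairs(inputs, target, horizon=0):
    """Pair the inputs of year t with the target of year t + horizon.

    horizon=0 gives the estimation problem (same year),
    horizon=1 gives the year-ahead prediction problem.
    """
    if horizon < 0:
        raise ValueError("horizon must be non-negative")
    n_pairs = len(target) - horizon
    return [(inputs[t], target[t + horizon]) for t in range(n_pairs)]

# Toy normalized rows (X1, X2, X3, X4) and energy values, for illustration only.
inputs = [(0.10, 0.50, 0.05, 0.06),
          (0.12, 0.52, 0.06, 0.07),
          (0.15, 0.54, 0.08, 0.09),
          (0.20, 0.56, 0.10, 0.12)]
target = [0.40, 0.42, 0.45, 0.50]

estimation_pairs = make_pairs(inputs, target, horizon=0)
prediction_pairs = make_pairs(inputs, target, horizon=1)
```

Note that the year-ahead pairing loses one sample, because the last year of inputs has no target one year later within the dataset.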

2.4. Algorithmic Methods

Under the current approach, an ensemble of Grammatical Evolution (GE) and Differential Evolution (DE) has been applied to both optimization problems of energy use, as presented in the previous section.
Grammatical Evolution (GE) is an Evolutionary Algorithm (EA) belonging to the class of Genetic Programming (GP). It was introduced by O’Neill and Ryan [29] and uses a grammar definition in Backus–Naur Form (BNF) to map a variable-length binary string onto a program or expression. GE is a form of automatic programming that incorporates grammar in a distinctive way. It uses variable-length binary string genomes with a degenerate genetic code, where every codon is a group of 8 bits representing an integer value. Based on the BNF definition, these integer values are then used to select the appropriate production rules in the mapping process.
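The codon-to-rule mapping can be illustrated with a short Python sketch. The toy grammar below is an assumption made for illustration (it is not the grammar of Figure 3), the symbol `w` stands for a coefficient to be tuned later, and the wrapping limit `max_wraps` is a hypothetical parameter. Each 8-bit codon selects a production by taking its integer value modulo the number of available rules.

```python
# Toy BNF-style grammar (illustrative assumption, not the study's grammar).
GRAMMAR = {
    "<recExpr>": [["<expr>"], ["<expr>", "<op>", "<recExpr>"]],
    "<expr>": [["w*", "<var>"], ["<var>", "**w"],
               ["exp(w*", "<var>", ")"], ["log(w+", "<var>", ")"]],
    "<op>": [["+"], ["-"], ["*"]],
    "<var>": [["X1"], ["X2"], ["X3"], ["X4"]],
}

def bits_to_codons(bits):
    """Split a binary string into 8-bit codons and decode each as an integer."""
    usable = len(bits) - len(bits) % 8
    return [int(bits[i:i + 8], 2) for i in range(0, usable, 8)]

def map_genome(codons, start="<recExpr>", max_wraps=2):
    """Expand the leftmost non-terminal using codon % number_of_rules.

    Codons are reused (wrapped) up to max_wraps times; None signals that
    the mapping did not terminate within that budget.
    """
    symbols, output, used = [start], [], 0
    limit = len(codons) * (max_wraps + 1)
    while symbols:
        symbol = symbols.pop(0)
        if symbol not in GRAMMAR:          # terminal: emit as-is
            output.append(symbol)
            continue
        if used >= limit:
            return None
        rules = GRAMMAR[symbol]
        choice = rules[codons[used % len(codons)] % len(rules)]
        used += 1
        symbols = list(choice) + symbols
    return "".join(output)

codons = bits_to_codons("00000000" "00000001" "00000010")
expression = map_genome(codons)
```

Because several codon values select the same rule, different genomes can map to the same expression, which is the degeneracy mentioned above.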
Differential Evolution (DE) is a population-based optimization algorithm that belongs to the class of Evolutionary Algorithms. Although known for its simplicity, DE is considered one of the most powerful tools for global optimization. Its reported advantages include the ability to attain the global optimum with excellent precision, fast convergence, self-adaptation, and the need for only zero-order information about the objective function.
To summarize, the advantage of using GE is its ability to guide the search of an algorithm using grammar. Meanwhile, DE represents a metaheuristic algorithm better suited for problems where some parameter values need to be found [10].
Figure 2 provides the flowchart of the schema for the execution of GE-DE, while Figure 3 provides the recursive grammar driven by GE together with DE used in the present study.
The GE algorithm develops an expression (a form of a model) from the input variables (X1, X2, X3, and X4), represented as <var>, using the functions provided under <expr>, which can result in linear, power, exponential, and logarithmic terms. The <recExpr> combines an <expr> with another expression developed recursively, using the operators denoted by <op>, which can be addition, subtraction, or multiplication. DE optimizes the coefficients of the variables, denoted by wi, for each of the expressions in the recursive expression.
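To make the division of labour concrete, the sketch below fixes one expression structure of the kind GE might produce and lets a basic DE (the common rand/1/bin variant) tune its coefficients by minimizing RMSE. Everything here is an assumption for illustration: the model form, the synthetic data, the bounds, and the control parameters `f`, `cr`, `pop_size`, and `generations` are not the settings of the study.

```python
import math
import random

def rmse(pred, actual):
    """Root mean squared error, used here as the objective function."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))

def model(w, x1, x2):
    """Hypothetical fixed structure: w0 * x1**w1 + w2 * x2**w3."""
    return w[0] * x1 ** w[1] + w[2] * x2 ** w[3]

def differential_evolution(objective, bounds, pop_size=20, f=0.7, cr=0.9,
                           generations=200, seed=0):
    """Minimal DE/rand/1/bin with greedy selection and bound clipping."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [objective(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = rng.sample([k for k in range(pop_size) if k != i], 3)
            j_rand = rng.randrange(dim)    # guarantees at least one mutated gene
            trial = []
            for j, (lo, hi) in enumerate(bounds):
                if rng.random() < cr or j == j_rand:
                    value = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    value = min(max(value, lo), hi)
                else:
                    value = pop[i][j]
                trial.append(value)
            trial_cost = objective(trial)
            if trial_cost <= cost[i]:      # keep the trial only if not worse
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]

# Synthetic normalized inputs in (0, 1]; not the article's data.
X1 = [0.1 * k for k in range(1, 11)]
X2 = [0.5 + 0.05 * k for k in range(1, 11)]
TRUE_W = [0.3, 1.5, 0.7, 0.6]
TARGET = [model(TRUE_W, a, b) for a, b in zip(X1, X2)]
BOUNDS = [(0.0, 2.0)] * 4

def objective(w):
    return rmse([model(w, a, b) for a, b in zip(X1, X2)], TARGET)

best_w, best_cost = differential_evolution(objective, BOUNDS, seed=0)
```

In a hybrid of this kind, such a coefficient search would be run for every candidate expression, and the resulting RMSE would serve as that expression's fitness in the GE loop.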
The properties of the algorithmic methods are shown in Table 3 for GE and DE, respectively.
As observed from Table 3, since the “number of runs” was set to 20, we obtained 20 models from the algorithms. In the following section, we look at each run individually and analyse the models’ performance in terms of their structure and statistical errors.

2.5. Objective Function and Error Analysis

The accuracy of estimation and predictions was analysed in terms of the root mean squared error (RMSE), as shown in Equation (4), which was used as the objective function.
$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(E_{est,i} - E_{act,i}\right)^{2}} \tag{4}$$
$E_{act,i}$ and $E_{est,i}$ in Equation (4) are the actual and estimated (or predicted) values of energy use.
Other statistics, given in Equations (5)–(8), viz, average error (AE), coefficient of determination (R2), absolute error (ABS), and relative error (RE), are also used to evaluate the accuracy, as follows:
$$AE = \frac{1}{n}\sum_{i=1}^{n}\left(E_{est,i} - E_{act,i}\right) \tag{5}$$
$$R^{2} = \frac{\sum_{i=1}^{n}\left(E_{est,i} - E_{est,avg}\right)\left(E_{act,i} - E_{act,avg}\right)}{\sqrt{\sum_{i=1}^{n}\left(E_{est,i} - E_{est,avg}\right)^{2}\,\sum_{i=1}^{n}\left(E_{act,i} - E_{act,avg}\right)^{2}}} \tag{6}$$
$$ABS = \frac{1}{n}\sum_{i=1}^{n}\left|E_{est,i} - E_{act,i}\right| \tag{7}$$
$$RE = \frac{1}{n}\sum_{i=1}^{n}\frac{\left|E_{est,i} - E_{act,i}\right|}{E_{act,i}} \tag{8}$$
These statistics were calculated to ease the comparison between the results obtained in this article and were selected based on previous studies.
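The five statistics (RMSE, AE, R2, ABS, and RE) can be computed together, as in the following Python sketch. The function name `error_metrics` and the sample values are hypothetical. Two points are assumptions rather than facts from the article: the absolute and relative errors are taken as means of absolute deviations, and the R2 statistic is evaluated in its correlation form (covariance term divided by the square root of the product of the two sums of squares).

```python
import math

def error_metrics(est, act):
    """Return RMSE, AE, R2 (correlation form, assumed), ABS and RE."""
    n = len(act)
    diff = [e - a for e, a in zip(est, act)]
    est_avg = sum(est) / n
    act_avg = sum(act) / n
    cross = sum((e - est_avg) * (a - act_avg) for e, a in zip(est, act))
    ss_est = sum((e - est_avg) ** 2 for e in est)
    ss_act = sum((a - act_avg) ** 2 for a in act)
    return {
        "RMSE": math.sqrt(sum(d * d for d in diff) / n),
        "AE": sum(diff) / n,                               # signed mean error
        "R2": cross / math.sqrt(ss_est * ss_act),          # assumed form
        "ABS": sum(abs(d) for d in diff) / n,
        "RE": sum(abs(d) / a for d, a in zip(diff, act)) / n,
    }

# Illustrative values only.
estimated = [1.0, 2.0, 3.0]
actual = [1.0, 2.0, 4.0]
metrics = error_metrics(estimated, actual)
```

The signed average error can be close to zero even when individual errors are large, which is one reason several indicators are reported side by side.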

2.6. GPI and Ranks of the Models

The literature shows that different evaluation criteria can lead to different outcomes while selecting the most appropriate model. Although the objective function for the algorithm was described as RMSE in the previous section, other metrics also help define the models’ fitness. The Global Performance Index (GPI) is a statistical tool proposed by Despotovic et al. [30]. GPI is a combined metric that uses the equal weights of all the statistical errors that are used.
GPI is calculated from the normalized values (between 0 and 1) of each statistical indicator for all the models ($\tilde{Y}_{ij}$). For each column of normalized values, a median value is obtained as $\tilde{Y}_{j}$. The difference $\tilde{Y}_{j} - \tilde{Y}_{ij}$ is then calculated. In the final step, a weight factor, $\alpha_{j}$, is multiplied by the obtained difference, and all the values obtained for each model are summed to calculate GPI. Mathematically, GPI is described as (for the $i^{th}$ model):
$$GPI_{i} = \sum_{j=1}^{n} \alpha_{j}\left(\tilde{Y}_{j} - \tilde{Y}_{ij}\right) \tag{9}$$
The weight factor is
$$\alpha_{j} = \begin{cases} 1 & \text{for } RMSE,\ AE,\ ABS,\ RE \\ -1 & \text{for } R^{2} \end{cases}$$
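The GPI procedure (scale each indicator, take the column median, weight the difference, and sum) can be sketched as follows. The min-max scaling used to bring each indicator into the range 0 to 1 is an assumption, as are the function names and the three hypothetical models; a higher score corresponds to a better rank.

```python
from statistics import median

# Weight of +1 for error-type indicators and -1 for R2.
WEIGHTS = {"RMSE": 1, "AE": 1, "ABS": 1, "RE": 1, "R2": -1}

def min_max(column):
    """Scale a column to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0] * len(column)
    return [(v - lo) / (hi - lo) for v in column]

def gpi_scores(table):
    """table maps model name -> {indicator: value}; returns model -> GPI."""
    names = list(table)
    scores = {name: 0.0 for name in names}
    for indicator, alpha in WEIGHTS.items():
        scaled = min_max([table[name][indicator] for name in names])
        col_median = median(scaled)
        for name, value in zip(names, scaled):
            scores[name] += alpha * (col_median - value)
    return scores

def rank_models(scores):
    """Best model first (largest GPI)."""
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical indicator values for three models, for illustration only.
table = {
    "A": {"RMSE": 1.0, "AE": 1.0, "ABS": 1.0, "RE": 1.0, "R2": 0.99},
    "B": {"RMSE": 2.0, "AE": 2.0, "ABS": 2.0, "RE": 2.0, "R2": 0.95},
    "C": {"RMSE": 3.0, "AE": 3.0, "ABS": 3.0, "RE": 3.0, "R2": 0.91},
}
scores = gpi_scores(table)
ranking = rank_models(scores)
```

With five equally weighted indicators, a model that is best on every indicator scores well above zero, while a model sitting at the median of every column scores zero.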

3. Results and Discussion

In this section, we look at the models generated from the ensemble of algorithms and the model forms together with outputs (estimated and predicted values) in terms of the statistical errors. Also, we will look at the evolution in the estimated and predicted values of energy-use behaviour as averages and variability from the generated models.

3.1. Estimation Results

The following models (E1–E20) were obtained for the estimation of energy use from the GE-DE algorithms:
E 1 .   E i E m a x t = 0.7293 X 2 i X 2 m a x 0.7374 0.2675 X 1 i X 1 m a x 1.9471 t
E 2 .   E i E m a x t = 0.2777 X 1 i X 1 m a x 1.8850 + l o g 1.0481 + X 2 i X 2 m a x t
E 3 .   E i E m a x t = l o g 1.1844 + X 3 i X 3 m a x × 1.1844 X 3 i X 3 m a x 0.9579 0.9948 X 2 i X 2 m a x 0.9579 + X 3 i X 3 m a x 1.1844 l n 0.9579 + X 3 i X 3 m a x × e 0.9948 X 2 i X 2 m a x + e 0.9948 X 2 i X 2 m a x t
E 4 .   E i E m a x t = 0.5929 X 1 i X 1 m a x 0.1710 0.5203 X 1 i X 1 m a x + 0.8683 X 2 i X 2 m a x 0.5203 t
E 5 .   E i E m a x t = ln 0.2257 + X 2 i X 2 m a x × exp 0.01457 + X 3 i X 3 m a x + 0.5331 + X 1 i X 1 m a x 0.0899 t
E 6 .   E i E m a x t = 0.2777 X 1 i X 1 m a x 1.8549 + l n 1.0482 X 2 i X 2 m a x t
E 7 .   E i E m a x t = 0.3621 X 1 i X 1 m a x 0.9576 + 0.6099 X 2 i X 2 m a x 0.5557 t
E 8 .   E i E m a x t = 0.2806 X 3 i X 3 m a x 0.4425 × e 1.2387 + X 3 i X 3 m a x 0.0457 X 1 i X 1 m a x 0.0457 + e 1.2387 * X 2 i X 2 m a x               e 0.2806 + X 2 i X 2 m a x t
E 9 .   E i E m a x t = l n 0.6785 X 2 i X 2 m a x × 0.6785 X 1 i X 1 m a x × 0.1093 X 4 i X 4 m a x 1.3213 ×             0.1093 X 3 i X 3 m a x 1.3213 0.1093 + X 2 i X 2 m a x 1.2032 t
E 10 .   E i E m a x t = 0.8728 X 1 i X 1 m a x 0.8239 × 0.6716 X 2 i X 2 m a x 0.8728 + 0.6716 X 1 i X 1 m a x 0.2737 ×                 l n 0.8239 + X 2 i X 2 m a x t
E 11 .   E i E m a x t = 0.8409 X 2 i X 2 m a x + e 0.4634 X 1 i X 1 m a x × 0.7251 X 1 i X 1 m a x 0.1138 t
E 12 .   E i E m a x t = 0.7569 + X 3 i X 3 m a x 0.0612 × e 0.3669 X 1 i X 1 m a x + l n 0.3669 + X 2 i X 2 m a x t
E 13 .   E i E m a x t = 0.1933 + X 2 i X 2 m a x × 1.3763 X 1 i X 1 m a x × 0.2294 X 2 i X 2 m a x + l n 1.3763 + X 2 i X 2 m a x t
E 14 .   E i E m a x t = l n 0.01908 X 1 i X 1 m a x × 0.2832 X 1 i X 1 m a x 1.0606 × 1.9291 + X 2 i X 2 m a x t
E 15 .   E i E m a x t = 1.1203 X 1 i X 1 m a x 0.0660 0.2447 X 3 i X 3 m a x × e 0.4696 X 1 i X 1 m a x                 0.4696 X 2 i X 2 m a x 0.4696 t
E 16 .   E i E m a x t = 0.3573 X 1 i X 1 m a x 2.1556 0.3573 X 3 i X 3 m a x + 0.3573 + X 3 i X 3 m a x 0.8374 +                 0.3551 + X 2 i X 2 m a x 0.3573 t
E 17 .   E i E m a x t = 0.2651 X 1 i X 1 m a x 2.0093 + 0.7321 X 2 i X 2 m a x 0.7432 t
E 18 .   E i E m a x t = 0.5062 + X 1 i X 1 m a x 0.5681 × 0.9627 + X 1 i X 1 m a x + l n 0.5681 + X 2 i X 2 m a x t
E 19 .   E i E m a x t = 1.6171 X 2 i X 2 m a x 0.2474 + 0.6058 X 2 i X 2 m a x × 0.6058 X 1 i X 1 m a x t
E 20 .       E i E m a x t = 0.3809 + X 2 i X 2 m a x 0.3809 × e 0.3809 X 1 i X 1 m a x 0.0700 X 3 i X 3 m a x 0.3809 + 0.2604 X 2 i X 2 m a x                     0.07004 + X 2 i X 2 m a x 0.38094 × 0.04804 X 3 i X 3 m a x 0.04804 t
The number of terms (in parentheses) in an equation ranges between two and seven. Further, the number of times each input parameter appears in the 20 runs or models generated from the algorithm (Equations (10)–(29)) is depicted in Figure 4. X2 (population, total) appears most often, 30 times, followed by X1 (GDP) with 25, X3 (exports) with 14, and, finally, X4 (imports) with only one. Thus, judged by frequency of appearance, the most crucial parameter affecting energy use is population.
Table 4 provides the comparison of values of statistical errors for training as well as the test dataset.
From the table, it can be observed that the values of the computed statistical errors are small, which indicates a high estimation accuracy. On the training dataset, the RMSE ranges from 0.0078 for Model E10 to 0.0154 for Model E7. Other models, e.g., E18 with an RMSE of 0.0080 and E11 with an RMSE of 0.0081, also give very accurate estimations. Analysing the form of Model E10 (Equation (19)), it is observed that the equation has four terms in total, with two terms each containing X1 and X2. Since X2 and X1 appear the highest number of times overall in the equations (as previously shown in Figure 4), it is reasonable that a model depending on X1 and X2 gives better estimation accuracy.
Further, the other statistical errors confirm the accuracy of Model E10, whose errors are the lowest among the obtained models (average error 0.0060, R2 0.9979, absolute error 0.1328, and relative error 0.0099). On the testing dataset, the best-performing equation in terms of statistical errors is again Model E10, with an RMSE of 0.0103, average error of 0.0069, R2 of 0.9958, absolute error of 0.1515, and relative error of 0.0108. As on the training dataset, E18 and E11 have RMSE values (0.0109 and 0.0110, respectively) comparable to that of E10.
Figure 5a,b show the algorithm’s performance for estimation on the training and test datasets, respectively, representing the normalized energy data points. The deviation of estimated energy from the data points is visibly slight, with the estimated data points following the actual data very closely.
The energy use (in kgoe/capita) was computed back from the normalized value, and the evolution of energy use is represented (1971–2014) in Figure 6 as a box-and-whiskers chart to provide the variability from all 20 models (as Section 3.1 describes). Thus, for any year in the chart, the middle line represents the median energy-use value, the box’s ends represent the first and third quartiles, and the end of the whiskers represents the minimum and maximum values.
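The back-conversion from normalized values and the per-year summary shown in a box-and-whiskers chart can be sketched as below. The maximum value `E_MAX`, the sample estimates, and the function names are placeholders, and the quartile convention (`method="inclusive"`) is an assumption, since plotting tools differ in how they interpolate quartiles.

```python
from statistics import quantiles

def denormalize(values_n, e_max):
    """Invert the max-normalization: multiply by the series maximum."""
    return [v * e_max for v in values_n]

def five_number_summary(values):
    """Minimum, first quartile, median, third quartile and maximum."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": med,
            "q3": q3, "max": max(values)}

# Placeholder values: normalized estimates of several models for one year.
E_MAX = 500.0
normalized_estimates = [0.2, 0.4, 0.6, 0.8, 1.0]
estimates_kgoe = denormalize(normalized_estimates, E_MAX)
summary = five_number_summary(estimates_kgoe)
```

Applied to the 20 model outputs for a given year, the summary gives the box (first and third quartiles), the middle line (median), and the whisker ends (minimum and maximum).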
The error computation for the estimated energy use was performed for the 20 runs (or models) obtained from the experiment and is presented in Table 5.
The average RMSE is 6.3749 kgoe/capita, representing a very accurate energy-use estimation. Among the models, the lowest RMSE (5.8183 kgoe/capita) was obtained for Model E10, and Model E6 has the smallest average error (−0.0920 kgoe/capita). Model E10 again has the highest R2 (0.9969), and the minimum values found for absolute error and relative error were 4.1126 kgoe/capita and 0.0103, respectively.
Following the calculation of statistical errors for estimated energy use, the procedure for calculating the GPI and, correspondingly, creating a ranking of the models was applied (as previously discussed in Section 2.6). The values of GPI and the ranking of the models are represented in Table 6.
The GPI values range from −1.7070 for Model 7 to 1.7249 for Model 10. Therefore, the best model for estimating energy use, again, is E10.
Figure 7 compares actual energy use, the estimated energy use from Model 10, and the average estimated value from all 20 models.

3.2. Prediction Results

We now look at the prediction of energy use, where the energy use for the next year is predicted using the input variables from the current year. The following models (P1–P20) were obtained to predict energy use.
(E_i/E_max)_{t+1} = [0.6899 (X_{1i}/X_{1max}) 0.4354 1.3234 (X_{2i}/X_{2max}) 0.4563 0.4563 + (X_{3i}/X_{3max}) × 0.4354 (X_{1i}/X_{1max}) 0.4563]_t
(E_i/E_max)_{t+1} = [0.3077 + (X_{2i}/X_{2max}) 0.4626 + e 1.0912 (X_{1i}/X_{1max}) × ln|1.1111 (X_{2i}/X_{2max})|]_t
(E_i/E_max)_{t+1} = [0.4826 (X_{3i}/X_{3max}) 0.3828 + 1.1287 (X_{1i}/X_{1max}) × 0.3828 (X_{4i}/X_{4max}) 0.3828 0.4826 + (X_{3i}/X_{3max}) + ln 0.5375 (X_{2i}/X_{2max})]_t
(E_i/E_max)_{t+1} = [0.3527 (X_{3i}/X_{3max}) + 0.0543 (X_{1i}/X_{1max}) 0.0543 1.5851 (X_{2i}/X_{2max}) 0.3527]_t
(E_i/E_max)_{t+1} = [ln 1.0370 (X_{2i}/X_{2max}) 0.2681 (X_{3i}/X_{3max})]_t
(E_i/E_max)_{t+1} = [0.3709 (X_{3i}/X_{3max}) 1.1855 e 0.3709 (X_{1i}/X_{1max}) × 0.2611 + (X_{2i}/X_{2max}) 0.5105]_t
(E_i/E_max)_{t+1} = [0.1155 (X_{4i}/X_{4max}) + 0.1155 (X_{2i}/X_{2max}) 1.2967 × 0.5011 + (X_{3i}/X_{3max}) × 0.2459 + (X_{4i}/X_{4max})]_t
(E_i/E_max)_{t+1} = [ln 1.0468 (X_{2i}/X_{2max}) + 0.2724 (X_{3i}/X_{3max}) 1.1910]_t
(E_i/E_max)_{t+1} = [0.6389 (X_{3i}/X_{3max}) 0.5956 × 0.6389 (X_{3i}/X_{3max}) × ln 0.6389 + (X_{1i}/X_{1max}) + 0.7776 (X_{2i}/X_{2max}) 0.7508]_t
(E_i/E_max)_{t+1} = [0.2663 (X_{3i}/X_{3max}) 1.223 + 0.7231 (X_{2i}/X_{2max}) 0.7231]_t
(E_i/E_max)_{t+1} = [e 0.2516 (X_{3i}/X_{3max}) 1.2991 + (X_{2i}/X_{2max}) 0.4492]_t
(E_i/E_max)_{t+1} = [0.2573 (X_{3i}/X_{3max}) 1.2890 0.7331 (X_{2i}/X_{2max}) 0.7514]_t
(E_i/E_max)_{t+1} = [0.7301 (X_{2i}/X_{2max}) 0.7438 + 0.2595 (X_{3i}/X_{3max}) 1.2786]_t
(E_i/E_max)_{t+1} = [0.3796 (X_{3i}/X_{3max}) 1.0995 × 0.9078 + (X_{1i}/X_{1max}) 0.0861 × ln 0.9078 (X_{2i}/X_{2max})]_t
(E_i/E_max)_{t+1} = [0.0937 + (X_{1i}/X_{1max}) 0.7299 × ln 0.1566 + (X_{4i}/X_{4max}) × 0.0937 + (X_{2i}/X_{2max}) × 0.7299 + (X_{1i}/X_{1max}) × 0.1599 (X_{3i}/X_{3max})]_t
(E_i/E_max)_{t+1} = [0.1645 (X_{1i}/X_{1max}) 1.0933 e 0.1645 (X_{1i}/X_{1max}) × 0.0412 (X_{3i}/X_{3max}) 0.1645 (X_{4i}/X_{4max}) 0.3397 + 0.3397 (X_{3i}/X_{3max}) + ln 1.0933 + (X_{2i}/X_{2max})]_t
(E_i/E_max)_{t+1} = [0.2724 (X_{3i}/X_{3max}) 1.1928 + ln(1.0470 + (X_{2i}/X_{2max}))]_t
(E_i/E_max)_{t+1} = [0.7310 (X_{2i}/X_{2max}) 0.7438 0.2603 (X_{3i}/X_{3max}) 1.2887]_t
(E_i/E_max)_{t+1} = [0.2667 (X_{1i}/X_{1max}) 1.5734 + ln 1.0393 + (X_{2i}/X_{2max})]_t
(E_i/E_max)_{t+1} = [0.8826 (X_{3i}/X_{3max}) 0.9528 + 0.8826 (X_{2i}/X_{2max}) 0.2309 ln 1.0503 (X_{3i}/X_{3max}) + 1.0503 + (X_{2i}/X_{2max})]_t
We analysed the structure of the models in terms of the number of terms in each equation and the total number of times each input variable appears across all the models. The number of terms varies from two to six in the prediction models, with most of the models consisting of only two terms. Further, as depicted in Figure 8, X2 and X3 each appear 22 times in the model equations, followed by X1 (13 times) and X4 (five times). Therefore, compared to the estimation results (Section 3.1), population (X2) and exports (X3) have a stronger influence in the prediction models. This difference can be attributed to the way the data fed to the algorithm are organised, as discussed in the problem statements: in the prediction problem, the input data from the current year are used to predict the energy use for the next year (the basis of the “year-ahead” energy prediction), whereas in the estimation problem the input and the target value both belong to the same year.
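The difference in data organisation between the two problems can be sketched in Python as follows. The function name and the placeholder arrays are hypothetical; only the pairing rule (inputs of year t with the target of year t, or with the target of year t + 1) comes from the text.

```python
import numpy as np

def make_samples(inputs, energy, years, year_ahead):
    """Pair input rows with targets for estimation or year-ahead prediction."""
    X = np.asarray(inputs, dtype=float)
    E = np.asarray(energy, dtype=float)
    yrs = np.asarray(years)
    if year_ahead:
        # Inputs of year t are paired with the energy use of year t + 1,
        # so the last input row and the first target have no partner.
        return X[:-1], E[1:], yrs[1:]
    # Estimation: inputs and target belong to the same year.
    return X, E, yrs

years = np.arange(1971, 2015)            # 44 annual records, 1971-2014
inputs = np.zeros((len(years), 4))       # placeholder for X1..X4 (not real data)
energy = np.zeros(len(years))            # placeholder for energy use (not real data)

X_est, E_est, y_est = make_samples(inputs, energy, years, year_ahead=False)
X_pred, E_pred, y_pred = make_samples(inputs, energy, years, year_ahead=True)
```

With 44 annual records, the year-ahead arrangement yields 43 input-target pairs whose target years run from 1972 to 2014.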
Statistical errors were calculated for the prediction models and are tabulated in Table 7.
As observed from the table, the RMSE values are small, ranging from 0.0096 to 0.0159 on the training dataset and from 0.0122 to 0.0188 on the test dataset. The lowest RMSE is achieved by Model P4 on the training dataset and by Model P9 on the test dataset. The lowest average errors, 0.0082 and 0.0077, are achieved by Models P4 and P2 on the training and test datasets, respectively. The coefficient of determination has its highest values of 0.9970 and 0.9936 for Models P4 and P9 on the training and test datasets, respectively. Further, the absolute error has its lowest values of 0.1800 for Model P20 on the training dataset and 0.1613 for Model P2 on the test dataset. Finally, the lowest relative errors are 0.0129 and 0.0122 for Models P20 and P2 on the training and test datasets, respectively. The errors are therefore of similar magnitude on the training and test datasets, which indicates a good prediction.
The prediction of energy use (normalized) is depicted in Figure 9a,b for the training and test datasets, respectively. The figures confirm that the predicted energy use agrees with the actual data.
The total amount of annual predicted energy use was computed back from the normalized predicted values, and further, the statistical errors were evaluated. These are presented in Table 8.
The average prediction has an RMSE of 7.8857 kgoe/capita, while Model P4 has the lowest RMSE among the individual models (7.8402 kgoe/capita). Model P2 has the average error of smallest magnitude (0.1174 kgoe/capita), and the average error of the average prediction is −1.5097 kgoe/capita. The highest R2 of 0.9944 was obtained for Model P4, and the average prediction has a very close value of 0.9943. The absolute error is lowest for Model P20 (5.7149 kgoe/capita), against 6.0013 kgoe/capita for the average prediction. Lastly, the relative error is also lowest for Model P20 (0.0142), against 0.0156 for the average prediction. Therefore, different error measures favour different models.
Figure 10 shows the year-ahead prediction of energy use from 1972 to 2014, along with the variability across the outputs of the 20 models. Further, since the algorithm is trained to predict the energy use for a year ahead, the energy use in 2015 was also quantified (as highlighted): a median of 628.44 kgoe/capita, a minimum and maximum of 621.33 kgoe/capita and 640.36 kgoe/capita, respectively, and first and third quartiles of 627.25 kgoe/capita and 629.63 kgoe/capita, respectively. The predicted median for 2015 is therefore lower than the predicted median for 2014 (630.21 kgoe/capita). For the same year, 2014, the actual energy use was 636.57 kgoe/capita.
As with the estimation results, we computed the GPI and ranked the prediction models. These values are given in Table 9.
The values obtained for GPI vary between −1.1839 and 1.8836, with the highest value for Model P20. Therefore, model P20 is ranked first in order of prediction accuracy. The actual energy use, predicted energy use from Model P20, and the average energy use predicted from the 20 models are displayed in Figure 11 for 1972 until 2014.
Building on the year-ahead prediction problem, we then looked at the predictions for the years 2015 to 2022. For this, the input data (macroeconomic indicators) were obtained from Macrotrends [31], owing to the unavailability of recent data from the World Bank database. In this way, the year-ahead prediction algorithm was applied to an independent dataset, and energy use was predicted for 2015 to 2022, as shown in Figure 12.
The predicted median energy use decreased between 2015 and 2016, from 654.91 kgoe/capita to 634.09 kgoe/capita. In the following years, 2017 to 2019, it was 648.21 kgoe/capita, 678.76 kgoe/capita, and 699.08 kgoe/capita, respectively. For the year 2020, energy use was almost stagnant, with a median value of 700.31 kgoe/capita. In 2021, however, the predicted median fell to 692.29 kgoe/capita. This decrease can be attributed to the slowdown of the economy due to the pandemic in the previous year and the resulting decline in the values of the macroeconomic indicators, which in turn influenced the predicted energy use for 2021. Finally, 2022 showed a resurgence of energy use, with a median value of 770.29 kgoe/capita. To verify the predictions from the algorithm, they are compared with actual data obtained from public websites in the next section.

3.3. Verification of the Predictions Using Public Data

The predictions made by the algorithm were validated against the public data available for India [32], reported as energy use per person, for 2015 to 2022. The predicted energy use follows the actual energy use closely, as depicted in Figure 13.
For the year 2022, the average predicted energy use was found to be 780.02 kgoe/capita, which corresponds to a relative error of 2.63%. This supports the reliability of future predictions made with the year-ahead prediction algorithm.
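The relative error quoted for 2022 is presumably the absolute deviation of the predicted value from the public value, expressed as a percentage of the public value. As a sketch under that assumption, with the symbols $E^{\mathrm{pred}}_{2022}$ (predicted energy use) and $E^{\mathrm{pub}}_{2022}$ (public value) introduced here only for illustration:

```latex
\varepsilon_{2022} \;=\; \frac{\left| E^{\mathrm{pred}}_{2022} - E^{\mathrm{pub}}_{2022} \right|}{E^{\mathrm{pub}}_{2022}} \times 100\%
```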

4. Conclusions

The primary energy use has been modelled in terms of four socio-economic indicators, including GDP, population, and the values of exports and imports for India. The ensemble of Grammatical Evolution and Differential Evolution (GE-DE) was applied to the energy-use problems (estimation and prediction) on the historical data obtained from public websites from 1971 to 2014. Models were developed and compared with the help of statistical analysis comprising root mean square error, average error, coefficient of determination, absolute error, and relative error. Further, to establish the best of the models and create a ranking system, the Global Performance Index was applied. The models were deployed to estimate (for the estimation problem) and predict (for the year-ahead prediction problem) energy use. For any particular year, the energy use was defined by the median value, the first and third quartiles, and the minimum and maximum values. Based on the analysis, the following main conclusions were drawn:
  • The estimation of energy use based on the GE-DE ensemble was found to have good accuracy, with an RMSE (based on the average estimation) of 6.3749 kgoe/capita (1.25% relative error). Based on the statistical analysis and the ranking established by GPI, Model E10 was the best model for the estimation, with an RMSE of 5.8183 kgoe/capita (1.03% relative error) and a GPI of 1.7249. Population and GDP appeared most often in the estimation models and were, therefore, regarded as the most influential parameters.
  • The year-ahead energy prediction was found to agree well with the data, with an RMSE (based on the average prediction) of 7.8857 kgoe/capita (1.56% relative error); Model P20, with an RMSE of 7.9201 kgoe/capita (1.42% relative error) and a GPI of 1.8836, was ranked as the most accurate. Population and the value of exports appeared most often in the prediction equations and were, therefore, regarded as the most influential parameters.
  • Predictions were further made for 2015–2022, and the results showed a slowdown in energy use for 2020 and 2021, followed by a resurgence in 2022, with a median value of 770.29 kgoe/capita and an average value of 780.02 kgoe/capita.
Thus, it is established that the ensemble of GE-DE provides accurate estimation and prediction results and, therefore, can be applied to energy-use modelling as an optimization problem. Further work will be carried out to model the energy-use behaviour and project this energy use in the medium- to the long-term future under different growth scenarios.
It must, however, be considered that the models presented in the study have been developed and implemented as a case study for India. It is, therefore, recommended to verify the performance of the ensemble algorithms together with the country-specific data whenever predictions are made for any specific case. It should also be noted that the results of this study are based on annual data values; therefore, the derivatives of this study need to be extended to shorter time frames (monthly, weekly, or daily) with the appropriate data as available. Further, for future studies, it is recommended to consider a large number of datasets, in terms of the number of input parameters as well as the diversity of the regions, to generalize the results and to extend the study for wider cases.

Author Contributions

Conceptualization, B.J. and L.S.-L.; methodology, B.J.; software, B.J.; validation, B.J. and L.S.-L.; formal analysis, B.J.; investigation, B.J.; resources, L.S.-L.; data curation, B.J.; writing—original draft preparation, B.J.; writing—review and editing, L.S.-L.; visualization, B.J.; supervision, L.S.-L.; funding acquisition, B.J. and L.S.-L. All authors have read and agreed to the published version of the manuscript.

Funding

The work reported in this manuscript has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 754382. L.S.-L. acknowledges the Spanish Ministry of Universities for the fellowship “Ayudas para la Recualificación del sistema español” and support from the State Research Agency (AEI), Government of Spain (grant number TED-2021-132368A-C22), and Ministry of Science and Innovation (MCIN/AEI/10.13039/501100011033/FEDER, UE) under the grant PID2021-126605NB-I00 and European Union NextGenerationEU/PRTR, project with reference TED2021-132368A-C22.

Data Availability Statement

Data will be made available on request.

Acknowledgments

The authors would like to thank Abraham Duarte and Jose M. Colmenar (Department of Computer Science and Statistics, Universidad Rey Juan Carlos, Madrid, Spain) for their support of the research work carried out in this article.

Conflicts of Interest

The authors declare no conflicts of interest.

Nomenclature

Artificial Intelligence
ABC: Artificial Bee Colony Method
ABPA: Adaptive Back Propagation Algorithm
ACO: Ant Colony Optimization
AFD: Adaptive Fourier Decomposition
AMRIO: Adaptive Multiregional Input–Output
ANFIS: Adaptive Network-Based Fuzzy Inference System
ANN: Artificial Neural Network
ARIMAH: Auto-Regressive Integrated Moving Average and Holt-Winters
ELM: Extreme Learning Machine
FNN: Feedforward Neural Network
GA: Genetic Algorithm
GRU-NN: Gated Recurrent Unit Neural Network
HS: Harmony Search
LSSVM: Least Squares Support Vector Machine
LSTM: Long Short-Term Memory
MFO: Moth-Flame Optimization
MGM: Metabolic Grey Model
MGM-ARIMA: Metabolic Grey Auto-Regressive Integrated Moving Average Model
MOSCOA: Multi-Objective Sine Cosine Optimization Algorithm
NMGM: Non-Linear Metabolic Grey Model
PSO: Particle Swarm Optimisation
RNN: Recurrent Neural Network
SARIMA: Seasonal Auto-Regressive Integrated Moving Average
SVM: Support Vector Machine
TS-FIS: Takagi-Sugeno-Type Fuzzy Inference System
VNS: Variable Neighbourhood Search
Statistical Indicators
MAE: Mean Absolute Error
MAPE: Mean Absolute Percentage Error
MSPE: Mean Square Percent Error
RMSE: Root Mean Square Error
Variables
E_price: Electricity price
EC: Electricity consumption
EC_A: Agricultural Consumption of Electricity
EC_C: Commercial Consumption of Electricity
EC_D: Domestic Consumption of Electricity
EC_G: Governmental Consumption of Electricity
EC_I: Industrial Consumption of Electricity
ED: Electricity Demand
EL: Electricity Loads
GDP: Gross Domestic Product
Po: Population

References

  1. Jiang, P.; Li, R.; Lu, H.; Zhang, X. Modeling of Electricity Demand Forecast for Power System. Neural Comput. Appl. 2020, 32, 6857–6875.
  2. Debnath, A.; Singh, S.V.; Singh, Y.P. Comparative Assessment of Energy Requirements for Different Types of Residential Buildings in India. Energy Build. 1995, 23, 141–146.
  3. Das, A.; Paul, S.K. Changes in Energy Requirements of the Residential Sector in India between 1993–94 and 2006–07. Energy Policy 2013, 53, 27–40.
  4. Dhawan, V.; Prasad, N. India: Transforming to a Net-Zero Emissions Energy System; The Energy and Resources Institute (TERI): Mithapur, India, 2020.
  5. Parikh, K.S.; Karandikar, V.; Rana, A.; Dani, P. Projecting India’s Energy Requirements for Policy Formulation. Energy 2009, 34, 928–941.
  6. Chaturvedi, S.; Rajasekar, E.; Natarajan, S.; McCullen, N. A Comparative Assessment of SARIMA, LSTM RNN and Fb Prophet Models to Forecast Total and Peak Monthly Energy Demand for India. Energy Policy 2022, 168, 113097.
  7. Islam, M.A.; Che, H.S.; Hasanuzzaman, M.; Rahim, N.A. Chapter 5—Energy Demand Forecasting. In Energy for Sustainable Development; Hasanuzzaman, M.D., Rahim, N.A., Eds.; Academic Press: Cambridge, UK, 2020; pp. 105–123. ISBN 978-0-12-814645-3.
  8. Löschel, A.; Managi, S. Recent Advances in Energy Demand Analysis—Insights for Industry and Households. Resour. Energy Econ. 2019, 56, 1–5.
  9. Huang, C.; Zhang, Z.; Li, N.; Liu, Y.; Chen, X.; Liu, F. Estimating Economic Impacts from Future Energy Demand Changes Due to Climate Change and Economic Development in China. J. Clean. Prod. 2021, 311, 127576.
  10. Wang, H.; Chen, Z.; Wang, W.; Wu, Z.; Wu, K.; Li, W. Improving Energy Demand Estimation Using an Adaptive Firefly Algorithm. In Computational Intelligence and Intelligent Systems; Li, K., Li, W., Chen, Z., Liu, Y., Eds.; Springer: Singapore, 2018; pp. 171–181.
  11. Jamil, B.; Serrano-Luján, L.; Colmenar, J.M. On the Prediction of One-Year Ahead Energy Demand in Turkey Using Metaheuristic Algorithms. Adv. Sci. Technol. Eng. Syst. J. 2022, 7, 79–91.
  12. Sajadi, S.M.; Asadzadeh, S.M.; Majazi Dalfard, V.; Nazari Asli, M.; Nazari-Shirkouhi, S. A New Adaptive Fuzzy Inference System for Electricity Consumption Forecasting with Hike in Prices. Neural Comput. Appl. 2013, 23, 2405–2416.
  13. Majazi Dalfard, V.; Nazari Asli, M.; Nazari-Shirkouhi, S.; Sajadi, S.M.; Asadzadeh, S.M. Incorporating the Effects of Hike in Energy Prices into Energy Consumption Forecasting: A Fuzzy Expert System. Neural Comput. Appl. 2013, 23, 153–169.
  14. Daş, G.S. Forecasting the Energy Demand of Turkey with a NN Based on an Improved Particle Swarm Optimization. Neural Comput. Appl. 2017, 28, 539–549.
  15. Salcedo-Sanz, S.; Muñoz-Bulnes, J.; Portilla-Figueras, J.A.; Del Ser, J. One-Year-Ahead Energy Demand Estimation from Macroeconomic Variables Using Computational Intelligence Algorithms. Energy Convers. Manag. 2015, 99, 62–71.
  16. Sánchez-Oro, J.; Duarte, A.; Salcedo-Sanz, S. Robust Total Energy Demand Estimation with a Hybrid Variable Neighborhood Search—Extreme Learning Machine Algorithm. Energy Convers. Manag. 2016, 123, 445–452.
  17. Duran Toksarı, M. Ant Colony Optimization Approach to Estimate Energy Demand of Turkey. Energy Policy 2007, 35, 3984–3990.
  18. Ünler, A. Improvement of Energy Demand Forecasts Using Swarm Intelligence: The Case of Turkey with Projections to 2025. Energy Policy 2008, 36, 1937–1944.
  19. Yu, S.; Wei, Y.-M.; Wang, K. A PSO–GA Optimal Model to Estimate Primary Energy Demand of China. Energy Policy 2012, 42, 329–340.
  20. Wang, Q.; Li, S.; Li, R. Forecasting Energy Demand in China and India: Using Single-Linear, Hybrid-Linear, and Non-Linear Time Series Forecast Techniques. Energy 2018, 161, 821–831.
  21. Özdemir, D.; Dörterler, S.; Aydın, D. A New Modified Artificial Bee Colony Algorithm for Energy Demand Forecasting Problem. Neural Comput. Appl. 2022, 7, 17455–17471.
  22. Incremona, A.; De Nicolao, G. Short-Term Forecasting of the Italian Load Demand during the Easter Week. Neural Comput. Appl. 2022, 34, 6257–6271.
  23. Torres, J.F.; Martínez-Álvarez, F.; Troncoso, A. A Deep LSTM Network for the Spanish Electricity Consumption Forecasting. Neural Comput. Appl. 2022, 34, 10533–10545.
  24. Li, R.; Chen, X.; Balezentis, T.; Streimikiene, D.; Niu, Z. Multi-Step Least Squares Support Vector Machine Modeling Approach for Forecasting Short-Term Electricity Demand with Application. Neural Comput. Appl. 2021, 33, 301–320.
  25. Stergiou, K.; Karakasidis, T.E. Application of Deep Learning and Chaos Theory for Load Forecasting in Greece. Neural Comput. Appl. 2021, 33, 16713–16731.
  26. Michell, K.; Kristjanpoller, W.; Minutolo, M.C. Electrical Consumption Forecasting: A Framework for High Frequency Data. Neural Comput. Appl. 2022, 34, 5577–5586.
  27. Mohammed, N.A.; Al-Bazi, A. An Adaptive Backpropagation Algorithm for Long-Term Electricity Load Forecasting. Neural Comput. Appl. 2022, 34, 477–491.
  28. IBRD IDA. The World Bank Data. Available online: https://databank.worldbank.org/home.aspx (accessed on 11 August 2022).
  29. O’Neill, M.; Ryan, C. Grammatical Evolution. IEEE Trans. Evol. Comput. 2001, 5, 349–358.
  30. Despotovic, M.; Nedic, V.; Despotovic, D.; Cvetanovic, S. Review and Statistical Analysis of Different Global Solar Radiation Sunshine Models. Renew. Sustain. Energy Rev. 2015, 52, 1869–1880.
  31. MacroTrends Global Metrics. Available online: https://www.macrotrends.net/ (accessed on 10 October 2022).
  32. Ritchie, H.; Roser, M.; Rosado, P. Energy. Available online: https://ourworldindata.org/energy (accessed on 15 October 2022).
Figure 1. Outline of the methodology followed under the present work.
Figure 2. Schema of ensemble algorithmic methods (GE-DE).
Figure 3. Recursive grammar-driven GE-DE ensemble.
Figure 4. Number of instances of appearance of different input parameters in the estimation models.
Figure 5. Comparison of estimated energy use vs. the target values (normalized) on (a) training dataset and (b) testing dataset.
Figure 6. Evolution of estimated energy use for 1971–2014, showing the variability of the 20 models obtained: median energy-use value, first and third quartiles, and minimum and maximum values.
Figure 7. Comparison of actual energy use with the estimated (Model E10) and the energy use estimated as an average from the twenty models.
Figure 8. Number of instances of appearance of input variables in the prediction models.
Figure 9. Comparison of year-ahead predicted energy use vs. the target values (normalized) on the (a) training dataset and (b) testing dataset.
Figure 10. Evolution of predicted year-ahead energy use for 1972–2014 and further for the predicted year 2015.
Figure 11. Comparison of actual energy use with the predicted (Model P20) and the average of twenty prediction models.
Figure 12. Predicted energy use for the period of 2015–2022.
Figure 13. Comparison of predicted values of energy use from the algorithms with the public data available from 2015 through 2022.
Table 1. Literature review of application of Artificial Intelligence techniques to predict the energy demand.
| Reference | Year | Region | Output | Input Parameters | Modelling Techniques | Accuracy Assessment |
|---|---|---|---|---|---|---|
| Toksari [17] | 2007 | Turkey | Energy demand | Yearly GDP, population, import, and export | ACO | Largest deviation: Linear model 3.87%; Quadratic model 2.83% |
| Yu and Wang [19] | 2012 | China | Energy consumption | Economic growth, total population, economic structure, urbanization rate, energy structure and energy price | PSO-GA | R2 0.9991, average MAPE 0.93, standard deviation 0.0001 |
| Sajadi et al. [12] | 2013 | Iran | Yearly EC | Yearly GDP, population, E_price | Log. Regression, ANN, ANFIS, TS-FIS | MAPE: Logistic Regression 1.69; ANN 3.62; ANFIS 2.67; TS-FIS 1.46 |
| Dalfard et al. [13] | 2013 | Iran | EC estimation and forecast | Yearly GDP, population, E_price | ANFIS | MAPE: Electricity-FIS 1.39; Natural Gas-FIS 3.79; NGPG-ANFIS 0.89 |
| Salcedo-Sanz et al. [15] | 2015 | Spain | Energy demand | Yearly data of 14 macroeconomic variables | HS optimization algorithm | Relative MAE 2.36% |
| Sánchez-Oro et al. [16] | 2016 | Spain | Energy demand | Yearly data of 14 macroeconomic variables | VNS–ELM | Best MAE 1.66%; Average MAE 3.9% |
| Daş [14] | 2017 | Turkey | Energy demand | Yearly GDP, population, import, and export | PSOM-NN | Relative error 2.42; Absolute Relative Error 8.42 |
| Wang et al. [20] | 2018 | China and India | ED | Yearly ED | MGM, MGM-ARIMA, NMGM | (China) MAPE: MGM 3.078%, ARIMA 2.571%, NMGM 2.189%; (India) MAPE: MGM 1.298%, ARIMA 0.804%, NMGM 2.061% |
| Jiang et al. [1] | 2020 | Australia | Half-hourly EC | Raw electricity demand | AFD-S-OLSSVM | MAE 12.338; RMSE 18.989; MAPE 0.926 |
| Li et al. [24] | 2021 | Australia | 0.5–3 h of ED | 30 min ED | VM, MOSCOA | (Best results, 1-step ahead) MAE 22.77, MAPE 1.63%, MBE −0.57, THI 2.09 × 10^−5 |
| Stergiou and Karakasidis [25] | 2021 | Greece | 10 and 20 days ahead EL | Hourly EL | FFNN, GRU-NN, LSTM, LSTM-NN | (1-step) RMSE 115.98, MAPE 1.37%; (10-step) RMSE 256.18, MAPE 3.45%; (20-step) RMSE 506.49, MAPE 7.77% |
| Huang et al. [9] | 2021 | China | Annual energy consumption | GDP, population, and monthly electricity consumption data | AMRIO | NA |
| Chaturvedi et al. [6] | 2022 | India | ED | Monthly energy demand | SARIMA, LSTM RNN, and Facebook Prophet | Monthly total: RMSE 4.23 GWh, MAPE 3.3%; Peak demand: RMSE 6.51 GW, MAPE 3.01% |
| Özdemir et al. [21] | 2022 | Turkey | Yearly energy demand | Yearly GDP, population, imports, export | M-ABC | (Linear) R2 0.9598, RMSE 1.2821, MAE 1.0551, MAPE 0.0129; (Quadratic) R2 0.9843, RMSE 0.7999, MAE 0.6417, MAPE 0.0082 |
| Incremona and Nicolao [22] | 2022 | Italy | Country EL quarter-hour | Quarter-hourly electric load demands | Gaussian Process estimator | MAPE 1.77, RMSE 0.64, MAE 0.51 [GW] |
| Torres et al. [23] | 2022 | Spain | 10 min ED | 10 min EC | LSTM | MAE 398.7652, MAPE 1.4472%, RMSE 545.8998 (MW) |
| Michell et al. [26] | 2022 | USA | Hourly EC | Hourly EC | LSTMN, ARIMAH | MCS MSE 1 (1.000); MSE 66,215.85; MAPE 0.587% |
| Mohammed & Al-Bazi [27] | 2022 | Iraq | Monthly ED | Monthly EL, EC_D, EC_C, EC_I, EC_G, EC_A | ANN (ABPA) | MSE: 1.195.650; MAPE 0.045 |
Table 2. Correlation matrix for the historical data of energy and the selected socio-economic parameters.
| | Energy Use (E) | GDP (X1) | Population (X2) | Exports (X3) | Imports (X4) |
|---|---|---|---|---|---|
| Energy use (E) | 1.0000 | 0.9655 | 0.9515 | 0.9354 | 0.9268 |
| GDP (X1) | 0.9655 | 1.0000 | 0.8596 | 0.9925 | 0.9893 |
| Population (X2) | 0.9515 | 0.8596 | 1.0000 | 0.8061 | 0.7961 |
| Exports (X3) | 0.9354 | 0.9925 | 0.8061 | 1.0000 | 0.9978 |
| Imports (X4) | 0.9268 | 0.9893 | 0.7961 | 0.9978 | 1.0000 |
Table 3. Properties of the experiments.
| Property | Value |
|---|---|
| Properties of GE | |
| Generations | 50 |
| Crossover Probability | 0.65 |
| Population | 20 |
| Mutation Probability | 0.1 |
| Max Wraps | 3 |
| Number of Codons | 100 |
| Tournament | 2 |
| Number of Runs | 20 |
| Properties of DE | |
| Recombination Factor | 0.88 |
| Mutation Factor | 0.47 |
| Population Size | 20 |
Table 4. Statistical errors obtained based on estimation accuracy of the training and test dataset.
| Model | Training RMSE | Training Av. Error | Training R2 | Training Ab. Error | Training Rel. Error | Test RMSE | Test Av. Error | Test R2 | Test Ab. Error | Test Rel. Error |
|---|---|---|---|---|---|---|---|---|---|---|
| E1 | 0.0108 | 0.0086 | 0.9959 | 0.1899 | 0.0161 | 0.0135 | 0.0107 | 0.9925 | 0.2361 | 0.0189 |
| E2 | 0.0111 | 0.0090 | 0.9956 | 0.1988 | 0.0167 | 0.0138 | 0.0110 | 0.9922 | 0.2429 | 0.0194 |
| E3 | 0.0085 | 0.0062 | 0.9974 | 0.1371 | 0.0103 | 0.0160 | 0.0104 | 0.9896 | 0.2294 | 0.0151 |
| E4 | 0.0108 | 0.0090 | 0.9975 | 0.1981 | 0.0153 | 0.0143 | 0.0112 | 0.9950 | 0.2461 | 0.0193 |
| E5 | 0.0112 | 0.0097 | 0.9958 | 0.2129 | 0.0160 | 0.0231 | 0.0135 | 0.9792 | 0.2971 | 0.0193 |
| E6 | 0.0111 | 0.0090 | 0.9956 | 0.1988 | 0.0167 | 0.0138 | 0.0110 | 0.9922 | 0.2429 | 0.0194 |
| E7 | 0.0154 | 0.0143 | 0.9917 | 0.3141 | 0.0240 | 0.0160 | 0.0139 | 0.9892 | 0.3058 | 0.0237 |
| E8 | 0.0115 | 0.0092 | 0.9960 | 0.2019 | 0.0155 | 0.0190 | 0.0151 | 0.9875 | 0.3330 | 0.0256 |
| E9 | 0.0121 | 0.0094 | 0.9948 | 0.2079 | 0.0151 | 0.0132 | 0.0102 | 0.9934 | 0.2233 | 0.0164 |
| E10 | 0.0078 | 0.0060 | 0.9979 | 0.1328 | 0.0099 | 0.0103 | 0.0069 | 0.9958 | 0.1515 | 0.0108 |
| E11 | 0.0081 | 0.0071 | 0.9977 | 0.1552 | 0.0117 | 0.0110 | 0.0074 | 0.9951 | 0.1632 | 0.0116 |
| E12 | 0.0097 | 0.0076 | 0.9967 | 0.1666 | 0.0135 | 0.0170 | 0.0136 | 0.9879 | 0.3000 | 0.0251 |
| E13 | 0.0115 | 0.0100 | 0.9953 | 0.2208 | 0.0171 | 0.0128 | 0.0109 | 0.9931 | 0.2395 | 0.0185 |
| E14 | 0.0117 | 0.0098 | 0.9957 | 0.2159 | 0.0170 | 0.0137 | 0.0082 | 0.9930 | 0.1798 | 0.0141 |
| E15 | 0.0127 | 0.0098 | 0.9943 | 0.2158 | 0.0165 | 0.0176 | 0.0110 | 0.9889 | 0.2412 | 0.0171 |
| E16 | 0.0118 | 0.0098 | 0.9951 | 0.2167 | 0.0179 | 0.0143 | 0.0122 | 0.9916 | 0.2693 | 0.0208 |
| E17 | 0.0108 | 0.0086 | 0.9959 | 0.1894 | 0.0161 | 0.0136 | 0.0106 | 0.9924 | 0.2341 | 0.0187 |
| E18 | 0.0080 | 0.0061 | 0.9977 | 0.1349 | 0.0105 | 0.0109 | 0.0074 | 0.9954 | 0.1623 | 0.0118 |
| E19 | 0.0141 | 0.0128 | 0.9929 | 0.2824 | 0.0226 | 0.0154 | 0.0133 | 0.9900 | 0.2922 | 0.0233 |
| E20 | 0.0102 | 0.0079 | 0.9965 | 0.1739 | 0.0134 | 0.0139 | 0.0116 | 0.9918 | 0.2553 | 0.0204 |
Table 5. Statistical errors obtained for the 20 estimation models and the average estimation for estimated energy use.
| Model | RMSE (kgoe/capita) | Average Error (kgoe/capita) | R2 | Absolute Error (kgoe/capita) | Relative Error |
|---|---|---|---|---|---|
| E1 | 7.7848 | 0.2791 | 0.9943 | 6.1628 | 0.0175 |
| E2 | 7.9891 | −0.1134 | 0.9940 | 6.3903 | 0.0180 |
| E3 | 8.1561 | −1.0204 | 0.9938 | 5.3027 | 0.0127 |
| E4 | 8.0769 | −5.1273 | 0.9963 | 6.4256 | 0.0173 |
| E5 | 11.5509 | −1.7066 | 0.9876 | 7.3791 | 0.0176 |
| E6 | 7.9894 | −0.0920 | 0.9940 | 6.3912 | 0.0180 |
| E7 | 9.9752 | −0.5650 | 0.9906 | 8.9674 | 0.0239 |
| E8 | 9.9787 | −3.9315 | 0.9920 | 7.7387 | 0.0205 |
| E9 | 8.0673 | −1.0981 | 0.9941 | 6.2383 | 0.0157 |
| E10 | 5.8183 | −0.7111 | 0.9969 | 4.1126 | 0.0103 |
| E11 | 6.1505 | −0.5451 | 0.9965 | 4.6057 | 0.0116 |
| E12 | 8.8273 | −0.1168 | 0.9927 | 6.7507 | 0.0193 |
| E13 | 7.7427 | 0.3319 | 0.9943 | 6.6594 | 0.0178 |
| E14 | 8.1162 | −1.7528 | 0.9942 | 5.7251 | 0.0156 |
| E15 | 9.7573 | −1.7161 | 0.9914 | 6.6119 | 0.0168 |
| E16 | 8.3510 | 0.1795 | 0.9934 | 7.0310 | 0.0193 |
| E17 | 7.8158 | 0.1059 | 0.9943 | 6.1266 | 0.0174 |
| E18 | 6.0823 | −0.9723 | 0.9966 | 4.2995 | 0.0111 |
| E19 | 9.4171 | 0.2882 | 0.9916 | 8.3122 | 0.0229 |
| E20 | 7.7695 | 0.2433 | 0.9944 | 6.2092 | 0.0169 |
| Average | 6.3749 | −0.9020 | 0.9962 | 4.6263 | 0.0125 |
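Table 6. GPI and ranks of the estimation models.
| S. No. | Model | GPI | Rank |
|---|---|---|---|
| 1 | E1 | −0.0307 | 10 |
| 2 | E2 | −0.1143 | 11 |
| 3 | E3 | 0.6171 | 5 |
| 4 | E4 | 1.0844 | 4 |
| 5 | E5 | −1.3052 | 18 |
| 6 | E6 | −0.1186 | 12 |
| 7 | E7 | −1.7070 | 20 |
| 8 | E8 | −0.4355 | 15 |
| 9 | E9 | 0.2598 | 7 |
| 10 | E10 | 1.7249 | 1 |
| 11 | E11 | 1.3906 | 3 |
| 12 | E12 | −0.5750 | 17 |
| 13 | E13 | −0.1576 | 13 |
| 14 | E14 | 0.5077 | 6 |
| 15 | E15 | −0.3665 | 14 |
| 16 | E16 | −0.5197 | 16 |
| 17 | E17 | 0.0038 | 9 |
| 18 | E18 | 1.5958 | 2 |
| 19 | E19 | −1.4532 | 19 |
| 20 | E20 | 0.0220 | 8 |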
Table 7. Statistical errors obtained based on prediction accuracy of the training and test datasets.

| Model | RMSE (train) | Avg. Error (train) | R² (train) | Ab. Error (train) | Rel. Error (train) | RMSE (test) | Avg. Error (test) | R² (test) | Ab. Error (test) | Rel. Error (test) |
|---|---|---|---|---|---|---|---|---|---|---|
| P1 | 0.0098 | 0.0090 | 0.9969 | 0.1984 | 0.0150 | 0.0159 | 0.0108 | 0.9890 | 0.2260 | 0.0172 |
| P2 | 0.0159 | 0.0112 | 0.9920 | 0.2466 | 0.0169 | 0.0123 | 0.0077 | 0.9932 | 0.1613 | 0.0122 |
| P3 | 0.0099 | 0.0082 | 0.9967 | 0.1801 | 0.0137 | 0.0188 | 0.0118 | 0.9849 | 0.2469 | 0.0194 |
| P4 | 0.0096 | 0.0082 | 0.9970 | 0.1801 | 0.0132 | 0.0147 | 0.0101 | 0.9909 | 0.2115 | 0.0156 |
| P5 | 0.0120 | 0.0093 | 0.9952 | 0.2055 | 0.0154 | 0.0151 | 0.0121 | 0.9906 | 0.2536 | 0.0208 |
| P6 | 0.0104 | 0.0091 | 0.9964 | 0.2012 | 0.0156 | 0.0165 | 0.0119 | 0.9882 | 0.2509 | 0.0201 |
| P7 | 0.0115 | 0.0097 | 0.9956 | 0.2139 | 0.0155 | 0.0145 | 0.0114 | 0.9917 | 0.2395 | 0.0187 |
| P8 | 0.0109 | 0.0093 | 0.9961 | 0.2051 | 0.0157 | 0.0153 | 0.0117 | 0.9900 | 0.2448 | 0.0196 |
| P9 | 0.0127 | 0.0101 | 0.9946 | 0.2212 | 0.0158 | 0.0122 | 0.0096 | 0.9936 | 0.2023 | 0.0168 |
| P10 | 0.0108 | 0.0093 | 0.9962 | 0.2051 | 0.0156 | 0.0150 | 0.0113 | 0.9901 | 0.2374 | 0.0188 |
| P11 | 0.0113 | 0.0096 | 0.9958 | 0.2114 | 0.0162 | 0.0156 | 0.0121 | 0.9896 | 0.2540 | 0.0206 |
| P12 | 0.0106 | 0.0089 | 0.9962 | 0.1953 | 0.0148 | 0.0153 | 0.0112 | 0.9901 | 0.2362 | 0.0191 |
| P13 | 0.0106 | 0.0090 | 0.9962 | 0.1990 | 0.0151 | 0.0154 | 0.0113 | 0.9901 | 0.2381 | 0.0191 |
| P14 | 0.0097 | 0.0088 | 0.9969 | 0.1935 | 0.0145 | 0.0154 | 0.0099 | 0.9900 | 0.2084 | 0.0161 |
| P15 | 0.0115 | 0.0089 | 0.9956 | 0.1949 | 0.0141 | 0.0146 | 0.0100 | 0.9914 | 0.2102 | 0.0164 |
| P16 | 0.0120 | 0.0102 | 0.9952 | 0.2235 | 0.0163 | 0.0133 | 0.0104 | 0.9926 | 0.2184 | 0.0169 |
| P17 | 0.0109 | 0.0093 | 0.9961 | 0.2051 | 0.0157 | 0.0153 | 0.0117 | 0.9899 | 0.2447 | 0.0196 |
| P18 | 0.0106 | 0.0090 | 0.9962 | 0.1973 | 0.0150 | 0.0153 | 0.0113 | 0.9900 | 0.2368 | 0.0190 |
| P19 | 0.0143 | 0.0109 | 0.9932 | 0.2404 | 0.0174 | 0.0134 | 0.0103 | 0.9924 | 0.2165 | 0.0184 |
| P20 | 0.0100 | 0.0082 | 0.9966 | 0.1800 | 0.0129 | 0.0145 | 0.0098 | 0.9913 | 0.2060 | 0.0156 |
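The lowest training and test RMSE values in Table 7 can be picked out programmatically, which is one way to cross-check the figures quoted in the abstract (0.0096 on training and 0.0122 on test data). The sketch below does only that; the RMSE values are copied from the table, listed in model order P1 to P20, and the variable names are illustrative.

```python
# RMSE values from Table 7, in model order P1..P20.
train_values = [0.0098, 0.0159, 0.0099, 0.0096, 0.0120, 0.0104, 0.0115,
                0.0109, 0.0127, 0.0108, 0.0113, 0.0106, 0.0106, 0.0097,
                0.0115, 0.0120, 0.0109, 0.0106, 0.0143, 0.0100]
test_values = [0.0159, 0.0123, 0.0188, 0.0147, 0.0151, 0.0165, 0.0145,
               0.0153, 0.0122, 0.0150, 0.0156, 0.0153, 0.0154, 0.0154,
               0.0146, 0.0133, 0.0153, 0.0153, 0.0134, 0.0145]

models = [f"P{i}" for i in range(1, 21)]
train_rmse = dict(zip(models, train_values))
test_rmse = dict(zip(models, test_values))

# Model with the smallest RMSE on each dataset.
best_train = min(train_rmse, key=train_rmse.get)
best_test = min(test_rmse, key=test_rmse.get)

print("lowest training RMSE:", best_train, train_rmse[best_train])
print("lowest test RMSE:", best_test, test_rmse[best_test])
```

The two minima come from different models, so a low training error alone does not identify the model that generalises best to the test data.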
Table 8. Statistical errors obtained for the 20 prediction models and the average prediction.

| Model | RMSE | Average Error | R² | Absolute Error | Relative Error |
|---|---|---|---|---|---|
| P1 | 8.3746 | −1.2916 | 0.9935 | 6.2829 | 0.0160 |
| P2 | 9.0543 | 0.1174 | 0.9922 | 6.0381 | 0.0146 |
| P3 | 9.4985 | −1.9342 | 0.9917 | 6.3226 | 0.0165 |
| P4 | 7.8402 | −1.4635 | 0.9944 | 5.7976 | 0.0143 |
| P5 | 8.6785 | −1.3759 | 0.9932 | 6.7962 | 0.0180 |
| P6 | 8.7244 | −1.5289 | 0.9930 | 6.6930 | 0.0178 |
| P7 | 8.3090 | −2.2318 | 0.9939 | 6.7123 | 0.0170 |
| P8 | 8.4253 | −1.6442 | 0.9935 | 6.6605 | 0.0176 |
| P9 | 7.9373 | −1.1287 | 0.9941 | 6.2691 | 0.0163 |
| P10 | 8.2998 | −1.3320 | 0.9936 | 6.5515 | 0.0172 |
| P11 | 8.6435 | −1.6301 | 0.9931 | 6.8898 | 0.0184 |
| P12 | 8.3512 | −1.6938 | 0.9936 | 6.3867 | 0.0169 |
| P13 | 8.3858 | −1.8642 | 0.9936 | 6.4712 | 0.0171 |
| P14 | 8.1465 | −1.6484 | 0.9939 | 5.9488 | 0.0153 |
| P15 | 8.3469 | −1.8089 | 0.9937 | 5.9979 | 0.0152 |
| P16 | 8.0485 | −1.4057 | 0.9940 | 6.5411 | 0.0166 |
| P17 | 8.4260 | −1.6339 | 0.9935 | 6.6590 | 0.0176 |
| P18 | 8.3498 | −1.5544 | 0.9936 | 6.4264 | 0.0170 |
| P19 | 8.8482 | −1.4056 | 0.9927 | 6.7643 | 0.0179 |
| P20 | 7.9201 | −1.7356 | 0.9943 | 5.7149 | 0.0142 |
| Average | 7.8857 | −1.5097 | 0.9943 | 6.0013 | 0.0156 |
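Table 9. GPI and rank of the prediction models.

| S. No. | Model | GPI | Rank |
|---|---|---|---|
| 1. | P1 | 0.1916 | 7 |
| 2. | P2 | −0.7583 | 15 |
| 3. | P3 | −1.0235 | 18 |
| 4. | P4 | 1.7364 | 2 |
| 5. | P5 | −0.9730 | 17 |
| 6. | P6 | −0.8855 | 16 |
| 7. | P7 | 0.1754 | 8 |
| 8. | P8 | −0.3910 | 13 |
| 9. | P9 | 0.5594 | 5 |
| 10. | P10 | −0.2261 | 12 |
| 11. | P11 | −1.0328 | 19 |
| 12. | P12 | 0.1325 | 9 |
| 13. | P13 | 0.0620 | 10 |
| 14. | P14 | 1.1044 | 3 |
| 15. | P15 | 0.9507 | 4 |
| 16. | P16 | 0.2854 | 6 |
| 17. | P17 | −0.3954 | 14 |
| 18. | P18 | 0.0020 | 11 |
| 19. | P19 | −1.1839 | 20 |
| 20. | P20 | 1.8836 | 1 |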
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Jamil, B.; Serrano-Luján, L. Hybrid Metaheuristic Algorithms for Optimization of Countrywide Primary Energy: Analysing Estimation and Year-Ahead Prediction. Energies 2024, 17, 1697. https://doi.org/10.3390/en17071697
