Machine Learning-Based Approach to Predict Energy Consumption of Renewable and Nonrenewable Power Sources

Abstract: In today's world, renewable energy sources are increasingly integrated with nonrenewable energy sources in electric grids, and their intermittent and variable nature poses new challenges. Energy prediction using soft-computing techniques plays a vital role in addressing these challenges. As electricity consumption is closely linked to other energy sources such as natural gas and oil, forecasting electricity consumption is essential for making national energy policies. In this paper, we utilize various data mining techniques, including preprocessing the historical load data and analyzing the characteristics of the load time series. We analyzed the power consumption trends of renewable and nonrenewable energy sources and combined them. A novel machine learning-based hybrid approach, combining multilayer perceptron (MLP), support vector regression (SVR), and CatBoost, is proposed in this paper for power forecasting. We performed exploratory data analysis, preprocessing, and a train-test split before training the model, and we made a thorough comparison with the results obtained using other prediction methods. We used various metrics to test the advantages of the proposed model: mean absolute error, mean absolute percent error, mean squared error, and root mean squared logarithmic error. We also selected state-of-the-art models for comparison with the proposed hybrid model.


Introduction
In recent years, countries worldwide have actively researched the power sector. Deep learning technology has brought new opportunities and challenges to power load forecasting [1]. The power system's main task is to provide a safe and reliable power supply for consumers, so energy forecasting is of considerable significance to the power sector. Accurate power load forecasting is important for saving energy, reducing power generation costs, and improving social and economic benefits. With the progress of power reform and the deepening of power marketization, energy load forecasting has become more critical in the power system, and more accurate power demand forecasts are essential for its stable and efficient operation. Nonrenewable energy sources such as coal, oil, natural gas, nuclear fuel, and minerals cannot be regenerated in a short period, and their consumption rate far exceeds their regeneration rate. For instance, fossil energy is not only limited and will eventually run out, but its price is also increasing day by day [2]. Moreover, once these energy resources are exhausted, they can no longer be mined. Therefore, the world is shifting towards renewable energy sources such as wind energy, hydropower, and solar energy. They do not emit greenhouse gases during operation, so they do not intensify the greenhouse effect. Renewable energy has become the mainstream preferred energy solution [3], and solar and wind energy are gradually keeping pace with traditional energy sources. At the same time, because renewable energy is affordable, low-carbon, environmentally friendly, stable, and reliable, consumers in emerging markets and enterprises have continuously increased their demand for it [4]. These driving factors and demand trends are particularly evident in developed and developing regions around the world.
Both the utilization efficiency of renewable energy and the power system's capacity to absorb it need to be improved. With the transformation of the global energy industry, renewable energy is gradually replacing traditional fossil energy. This transformation means that energy systems and power plants must improve the accuracy of renewable energy generation forecasts, optimize the output of each power generation asset, and coordinate the overall production of multiple sites and equipment. At the same time, this must be done economically and without compromising the stability of the power grid.
Renewable energy sources are increasingly integrated with nonrenewable energy sources in electric grids to meet energy demand [5]. The historical power load data are an ordered collection of values sampled and recorded at a regular time interval, so they form a time series. As a branch of artificial intelligence, soft computing aims to build more reliable and accurate systems and has proven to be an excellent tool for solving various problems in energy applications. We used machine learning to predict the integrated energy consumption of renewable and nonrenewable power sources. A novel machine learning-based hybrid approach, combining multilayer perceptron (MLP), support vector regression (SVR), and CatBoost, is proposed in this paper for power forecasting. As a case study, we obtained the actual energy consumption time series data of Jeju Island. Jeju Island is the largest island in Korea. It has a population of over 690,000 people, an area of 1826 square kilometers, and a humid climate. Jeju Energy Corporation (JEC) has set the goal of ending its reliance on electricity imported from the Korean Peninsula and using only renewable energy to meet all power demands [6]. The plan includes replacing fossil fuel generators with renewable energy technologies and generating the renewable energy within the island. JEC aims to achieve this in three main steps. The first is turning Gapa Island, a small island of Jeju, into a test lab, making it the first carbon-free island. The second is increasing the share of renewable energy in the energy supply by 50% by 2020, and the third is turning Jeju Island into a carbon-free city by 2030 [7].
The contributions of this paper are threefold:
• an analysis of power consumption data from renewable and nonrenewable energy sources;
• a novel hybrid approach for energy consumption forecasting;
• a comparison of the proposed model with existing prediction algorithms.
The remainder of this paper is organized as follows: Section 2 discusses publications, articles, and related materials. Section 3 describes the process flow for machine learning-based energy consumption prediction and presents the energy sources used for data collection. Section 4 describes the proposed hybrid model and its ML algorithms. Section 5 describes the exploratory data analysis, preprocessing, and train-test split. Section 6 presents the performance results of the proposed model and analyzes the evaluation metrics. Section 7 concludes the paper.

Related Works
The integration of solar energy into the energy system is increasing; hence, predicting solar power generation is becoming increasingly important for controlling energy quality and improving system reliability. Asrari et al. [8] proposed a hybrid prediction algorithm to estimate hourly sunlight. However, considering the competitiveness and burden of accounting, many cost-effective and cost-improvement methods and model structures have been developed. In the first stage, gradient-descent optimization is used to meet the artificial neural network's initial criteria. In the second stage, a meta-optimization model called an ensemble empirical mode decomposition (EEMD) is developed to find the optimal artificial neural networks.
Ensemble empirical mode decomposition and support vector machine degradation were used by Hu et al. [9] to improve the quality of wind speed prediction utilizing a hybrid method. Improved prediction results were obtained using the proposed method. Ahmad et al. [10] presented an overview of the evolution of power forecasting results using artificial intelligence techniques such as support vector machine and the artificial neural network (ANN). They concluded that the hybrid method is more suitable for predicting energy consumption. The technical document by Rui et al. [11] uses the advanced cuckoo search-extreme learning machine (ELM) model as the base model. It combines the standard genetic algorithm and the auto-regressive and moving average model to create a hybrid model. It can be seen that the improved cuckoo search-extreme learning machine model, following the order of experimental analysis, overcomes the performance limitations; that is, the initial weight gain is not easy to produce; it is variable and stable. Based on the results, the distance between the test input vector and the distribution matrix was calculated. A possible predictive sample space was generated, and the upper and lower prediction limits of the predicted values were provided. The proposed method could avoid major single point prediction failures by improving the accuracy of the performance prediction range.
Predicting solar radiation in photovoltaic power generation is essential for integrating natural renewable energy into electrical utility grids. The study by Paiva et al. [12] evaluated two machine learning algorithms that can predict sunlight all day: multi-gene programming and multilayer perceptron. The results indicated that site definitions, prediction ranges, and error calculations affect the accuracy of the model. In [13], Ju et al. considered a degradation cost model and proposed a discreet two-tier management system with a hybrid energy storage system. They represented the problem in a way that solves low operating costs by taking into account the energy fluctuations of renewable energy. They also developed a more economical cost model for batteries and supercapacitors to turn long-term capital costs into short-term operational problems. The proposed energy management system is used for microgrids that include common points of connection with the public grid, hybrid energy storage systems, renewable energy systems, and total load. They performed a simulation and found that various energy storage devices, including super batteries and capacitors, can be used for multiple decision-making goals in different tiers. However, they did not consider combining random planning and the proposed energy management system with renewable energy prediction and modeling uncertainty in renewable energy production in their current work. Based on the weather observation and forecast dataset by the Beijing Meteorological Administration, wavelet noise reduction and CatBoost-based models were used by Li et al. [14] for short-term weather forecasting.
The work by Catalao et al. [15] proposed a new hybrid technique for predicting short-term wind power in Portugal. Their method was based on the fusion of a wavelet transform, particle swarm optimization, and an adaptive network-based fuzzy inference system. The technique proved effective for wind energy prediction: its average results are better than those of the other approaches, with an acceptable computation time. These results confirm the validity of their short-term wind energy prediction method. Yu et al. [16] introduced a new input selection scheme that combines the group method of data handling and SVR to predict short-term hourly load. They configured the group method of data handling networks multiple times under the same experimental conditions. Each run was different due to the random distribution of the training dataset, so relying on only one network configuration could produce biased input selection results. To evaluate the proposed method experimentally, they used hourly Korean datasets to compare forecasts for one hour, one day, and one week ahead. The experimental results show that the proposed method outperforms the other methods.
Zhaojing et al. [17] provided a hybrid ensemble deep learning model for probabilistic low voltage loads, whose predictions were consistent and generally reliable. Nonlinear relation diagrams were mapped with deep training models and deep belief networks. Five-way bagging and boosting algorithms were used to improve network stability. Various transmission methods were used in the installation process to ensure the site's safety during the loading and unloading process. The accuracy of the augmented Dickey-Fuller test confirmed this. The experimental results show the enormous potential of real applications in the distribution network and significantly impact decision-making and assumptions.

Machine Learning-Based Energy Load Forecasting
Machine learning is widely used in the energy sector for energy load forecasting [18]. The machine learning method selects the load of a past period as the training sample, constructs a suitable network structure, and uses a particular training algorithm to train the network until it meets the accuracy requirements. Figure 1 shows the flow of our proposed forecasting strategy. We took the two kinds of power sources, nonrenewable and renewable, and combined them by date and time. Nonrenewable energy consists of fossil fuel (FF)-based power sources. Renewable energy is obtained from small-sized solar energy generation without any contract, called behind the meter (BTM), from photovoltaic (PV) generated solar energy, and from wind power (WP). We performed exploratory data analysis (EDA) to examine the trends in the data. After preprocessing, we split the data into training and test sets. Then, we trained the hybrid model, which consists of three base models: CatBoost, support vector regression (SVR), and multilayer perceptron (MLP). We used the test dataset to validate the model and different error metrics to analyze its performance. Finally, we obtained the forecasting results and compared them with those of existing ML models.

Power Sources
We used the energy consumption data of Jeju, South Korea, as the test data. Figure 2 represents the sources of data collection. Jeju Energy Corporation (JEC) is an energy distributor for Jeju province. JEC plans to actively promote and advance the photovoltaic power generation business along with wind power. It gets its electricity from two major sources: Korea Electric Power Corporation (KEPCO) and Korea Power Exchange (KPX). These organizations are responsible for the operation of the electricity market and the power system and for real-time dispatch, and they support the government's planning and policy-making efforts. KEPCO provides the energy from renewable sources. It has three primary sources: small-sized solar energy generation without any contract, called behind the meter (BTM); PV generated solar energy; and WP. KPX provides the energy generated from nonrenewable sources such as FF-based power sources.

Proposed Hybrid Ensemble Model
The main goal of machine learning is to train a stable model that performs well in all aspects, but the actual situation is often not ideal. Ensemble learning is the process of combining multiple models to obtain a better and more comprehensive strong supervised model [19]. The underlying idea of ensemble learning is that even if a particular weak classifier gets a wrong prediction, other weak classifiers can improve the mistakes. A novel machine learning-based hybrid approach, combining MLP, SVR, and CatBoost, is proposed in this paper to forecast energy consumption. Figure 3 shows the structure of the proposed hybrid model. It trains the CatBoost, MLP, and SVR models and ensembles the forecasting results of these models.
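The text does not state how the three base-model forecasts are combined, so the following is only a minimal sketch under one common assumption: an unweighted average of the per-timestep predictions. The function name `ensemble_average` and the sample values are hypothetical and do not come from the paper.

```python
from statistics import fmean


def ensemble_average(model_forecasts):
    """Average per-timestep forecasts from several base models.

    model_forecasts: list of equal-length lists, one list per base model.
    ASSUMPTION: simple unweighted averaging; the paper does not specify the rule.
    """
    lengths = {len(f) for f in model_forecasts}
    if len(lengths) != 1:
        raise ValueError("all base models must forecast the same number of steps")
    # zip(*...) groups the forecasts of all models for the same timestep.
    return [fmean(step) for step in zip(*model_forecasts)]


# Hypothetical hourly forecasts from the three base models.
catboost_pred = [600.0, 610.0, 590.0]
mlp_pred = [606.0, 604.0, 593.0]
svr_pred = [594.0, 616.0, 587.0]

combined = ensemble_average([catboost_pred, mlp_pred, svr_pred])
```

Other combination rules, such as weighted averaging or a trained meta-model, would fit the same interface.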

CatBoost
CatBoost is an open-source machine learning library. It belongs to the boosting algorithm family, similar to the XGBoost and LightGBM algorithms [20]. It is an improved implementation within the gradient boosting decision tree framework, based on symmetric decision trees, with few parameters, support for categorical variables, and high accuracy. It is designed mainly to process categorical features efficiently and sensibly. The name CatBoost combines "categorical" and "boosting"; in addition, the algorithm addresses the gradient bias and prediction shift problems, which improves its accuracy and generalization ability [21]. High model quality can be obtained with the default parameters, which reduces the time spent on parameter tuning. It also handles categorical variables without requiring non-numerical features to be preprocessed.

Support Vector Regression
Support vector regression (SVR) is a machine learning method based on statistical learning theory. It is a learning method specialized in limited sample prediction [22]. The support vector regression method realizes the inductive principle of structural risk minimization. It can well solve small sample problems, nonlinearity, high dimensionality, and local minima, so it has been widely used in load forecasting. Support vector regression can be used for ultra-short and short-term load forecasting and medium-and long-term load forecasting. Its advantages are global optimization, simple structure, and strong generalization ability. However, the quality of SVR prediction results largely depends on its learning parameters, and the prediction accuracy is greatly affected by the parameters.

Multilayer Perceptron
There are many neural network variants, such as backpropagation neural networks [23], probabilistic neural networks [24], convolutional neural networks [25], time recurrent neural networks [26], long short-term memory networks [27], etc. However, the most straightforward and original neural network is the multilayer perceptron. The most typical MLP includes three layers: an input layer, a hidden layer, and an output layer. The layers of the MLP neural network are fully connected: every neuron in one layer is connected to all neurons in the next layer [28]. An MLP network takes a feature vector as input, passes the vector to the hidden layer, calculates the result through the weights and the activation function, and passes it on to the next layer until it finally reaches the output layer. Training the MLP means learning the weights of the connections between the neurons of each layer. $y_0$ is given as the input for the first layer to obtain the $k$-th output using Equation (1).
In Equation (1), $\sigma$ is the sigmoid function, $w$ denotes a weight, $k$ indexes the output neurons, and $y_k$ is the $k$-th output of the MLP. The index $i$ runs over the hidden-layer neurons from 1 to $n$, and $w_{i,k}$ is the weight of the connection between the $i$-th hidden-layer neuron and the $k$-th output-layer neuron.
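As an illustration of the forward pass described above, the following sketch computes the outputs of a one-hidden-layer MLP with sigmoid activations. It is not the authors' implementation: bias terms are omitted, the function names are hypothetical, and the sigmoid is applied at the output only because the text names it (a regression network would more typically use a linear output).

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mlp_forward(x, w_hidden, w_out):
    """Forward pass of a one-hidden-layer MLP (no bias terms, an assumption).

    x: input feature vector of length d
    w_hidden: n rows of length d; row i holds the weights into hidden neuron i
    w_out: n rows of length m; w_out[i][k] links hidden neuron i to output k
    """
    # Hidden activations: sigmoid of the weighted sum of the inputs.
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x))) for row in w_hidden]
    n_out = len(w_out[0])
    # k-th output: sigmoid of the weighted sum over the n hidden neurons.
    return [
        sigmoid(sum(hidden[i] * w_out[i][k] for i in range(len(hidden))))
        for k in range(n_out)
    ]


# Hypothetical weights: 2 inputs, 2 hidden neurons, 1 output.
outputs = mlp_forward([1.0, 2.0], [[0.1, -0.2], [0.3, 0.4]], [[0.5], [-0.5]])
```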

Forecasting
Load forecasting is used to explain how much electricity power companies need to supply to balance demand [29]. We used the combined load data of renewable and nonrenewable energy sources to forecast energy consumption. We performed exploratory data analysis, pre-processing, and the train-test split before training of the model.

Exploratory Data Analysis
Actual energy consumption data of Jeju Island from 2012 to the middle of 2020 were collected for the experiments. Table 1 summarizes the dataset by energy source. It shows the count, mean, standard deviation, minimum, and maximum load for each data source. Figure 4 shows the load values from the different energy sources in the dataset. Figure 4a shows that the amount of power generated from wind is almost constant, but its contribution is marginal. Figure 4b shows the solar energy obtained from PPAs, and Figure 4c shows the solar energy obtained from BTM. From these graphs, we conclude that solar power generation increased drastically from 2019. Under the "New Energy and Renewable Energy Development, Use, and Distribution Promotion Act of Korea," the installation of renewable energy facilities is mandatory: new and renewable energy must supply more than a certain percentage of the estimated energy consumption calculated during the design [30]. This act led to a significant increase in the use of renewable energy from 2019. Figure 4d shows the power load from the fuel-based energy sources. To better understand how power is used over time, we provide Figure 5. These graphs show the distribution of the hourly energy load by hour, day of the week, year, and week of the year. The hour graph shows that energy consumption is high between 11 AM and 9 PM. The day-of-the-week graph shows that energy consumption is low during the weekend. The yearly chart shows that energy consumption has increased markedly since 2016. The week-of-the-year graph shows that energy demand is high during the first eight weeks and from the 30th to the 35th week of the year. Figure 6 shows the distribution of the load variable over the quarters.
These graphs show the univariate distribution of load observations in the four quarters of a year. One quarter consists of three months: the first quarter (Q1) consists of January, February, and March; the second quarter (Q2) consists of April, May, and June; the third quarter (Q3) consists of July, August, and September; and the fourth quarter (Q4) consists of October, November, and December. We observed that in Q1, the load values mostly lie between 550 and 700. In Q2, they lie between 480 and 590, whereas in Q3, they are between 490 and 670, and in Q4, between 500 and 610. The data analysis also showed that fossil fuel-based energy is very high compared to renewable energy. Figure 7 compares the combined renewable and nonrenewable energy sources. The Y-axis on the left side shows the load obtained from fossil fuel energy sources, and the Y-axis on the right side shows the load obtained from renewable energy sources. Figure 8 represents the count of each load value. Most of the load values lie between 500 and 600 MW.

Pre-Processing
The pre-processing phase is important for preparing the data for machine learning algorithms. Jeju has two options for the solar energy supply contract: KEPCO and KPX. In total, we have four types of energy sources in the dataset. The first is fossil fuel (FF), which consists of the total hourly energy consumption in Jeju acquired from KPX. The second is behind the meter (BTM), the small-sized solar energy generation without any contract, received from KEPCO. The third is the photovoltaic solar energy acquired from KEPCO under power purchase agreements (PPAs), and the fourth is wind power (WP), which JEC also receives from KEPCO. We added all these types into one using Equation (2). As a result, the total number of rows decreases from 291,744 to 72,936. Figure 9a shows the actual power generation of each source, and Figure 9b shows the combined power generation.
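The merging step can be sketched as a sum over the four sources at each timestamp, which is consistent with the row count shrinking by a factor of four (291,744 / 4 = 72,936). This is an illustrative sketch using only the standard library; the long-format record layout, the function name `combine_sources`, and the sample values are assumptions, not the authors' data or code.

```python
from collections import defaultdict


def combine_sources(records):
    """Sum the load of all sources that share a timestamp.

    records: iterable of (timestamp, source, load) tuples in long format.
    Returns a dict mapping timestamp -> total load.
    """
    totals = defaultdict(float)
    for timestamp, _source, load in records:
        totals[timestamp] += load
    return dict(totals)


# Hypothetical values: four sources for each of two hourly timestamps.
records = [
    ("2019-05-25 00:00", "FF", 500.0),
    ("2019-05-25 00:00", "BTM", 4.0),
    ("2019-05-25 00:00", "PV", 6.0),
    ("2019-05-25 00:00", "WP", 10.0),
    ("2019-05-25 01:00", "FF", 480.0),
    ("2019-05-25 01:00", "BTM", 0.0),
    ("2019-05-25 01:00", "PV", 0.0),
    ("2019-05-25 01:00", "WP", 12.0),
]
combined = combine_sources(records)
```

A data-frame library would typically express the same step as a group-by on the timestamp followed by a sum.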
Since these data contain only load values, time series features have been created from the date-time index. These features include hour, day of the week, quarter, month, year, day of the year, day of the month, and the week of the year.
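The calendar features listed above can be derived directly from each timestamp. The sketch below uses Python's standard `datetime` module; the feature names are hypothetical, and the choices of Monday = 0 for the day of the week and ISO 8601 numbering for the week of the year are assumptions, since the text does not specify either convention.

```python
from datetime import datetime


def time_features(ts):
    """Derive calendar features from one timestamp."""
    return {
        "hour": ts.hour,
        "dayofweek": ts.weekday(),          # Monday = 0 ... Sunday = 6
        "quarter": (ts.month - 1) // 3 + 1,
        "month": ts.month,
        "year": ts.year,
        "dayofyear": ts.timetuple().tm_yday,
        "dayofmonth": ts.day,
        "weekofyear": ts.isocalendar()[1],  # ISO 8601 week number
    }


features = time_features(datetime(2019, 5, 25, 14))
```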

Training
The processed data are used to train the model. Training involves many parameters. While tuning the hyperparameters, the root mean squared error (RMSE) is used as the evaluation metric. The RMSE is the square root of the mean of the squared differences between actual and predicted values, obtained by Equation (3), where $y_a$ is the actual value, $y_p$ is the predicted value, and $N$ is the number of samples. This statistic is also called the standard deviation of the residuals of the regression. A low RMSE indicates a well-trained model. The sampling frequency is set to once per tree, and a symmetric tree is used as the grow policy. The loss function for training is set to the RMSE, and the score function is set to cosine.
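A minimal sketch of the RMSE computation and of the training settings named above follows. The dictionary keys mirror CatBoost's keyword arguments as far as we know them (`loss_function`, `eval_metric`, `sampling_frequency`, `grow_policy`, `score_function`), but the mapping from the text to these exact names and values is an assumption, and the library itself is not imported here.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


# Settings named in the text, written as CatBoost-style keyword arguments.
# ASSUMPTION: these names and values are our reading of the text.
catboost_params = {
    "loss_function": "RMSE",           # training loss
    "eval_metric": "RMSE",             # metric used while tuning
    "sampling_frequency": "PerTree",   # resample once per tree
    "grow_policy": "SymmetricTree",    # symmetric (oblivious) trees
    "score_function": "Cosine",        # split scoring function
}

# Hypothetical actual and predicted loads.
score = rmse([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 8.0])
```

Such a dictionary could then be unpacked into the regressor's constructor if the library is available.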

Train-And-Test Split
The combined dataset is divided into two parts, test and training sets. Training data consist of January 2012 to August 2018, while test sets include data from September 2018 to 27 April 2020. The input data are a time series energy load consumption pattern. Time series data contain daily, hourly, weekly, monthly, and yearly energy consumption patterns. Forecasting is predicting the future based on historical and current data, mostly through trend analysis. Hence, the actual time series data represented in successive order describe the problem adequately.
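Because the data form a time series, the split is chronological rather than random. The sketch below illustrates this with the boundary stated above (training data up to the end of August 2018, test data from September 2018); the row layout, the function name `chronological_split`, and the sample values are hypothetical.

```python
from datetime import datetime, timedelta

SPLIT = datetime(2018, 9, 1)  # first timestamp of the test period


def chronological_split(rows, split=SPLIT):
    """Split (timestamp, load) rows into train (< split) and test (>= split)."""
    train = [r for r in rows if r[0] < split]
    test = [r for r in rows if r[0] >= split]
    return train, test


# Hypothetical hourly rows straddling the boundary.
start = datetime(2018, 8, 31, 22)
rows = [(start + timedelta(hours=h), 550.0 + h) for h in range(4)]
train, test = chronological_split(rows)
```

Keeping every training timestamp earlier than every test timestamp avoids leaking future information into the model.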

Experimental Results and Discussion
This section presents the results of our experiments and compares them with those of existing algorithms.

Feature Importance Analysis
There are different ways to analyze feature importance. A correlation graph is an intuitive tool for studying correlation. It is an N × N table, where N is the number of variables (the features and the target), in which each cell holds the correlation coefficient between the variable on its row and the variable on its column. Generally, before a detailed quantitative analysis, a correlation diagram can be used to roughly judge the direction, form, and closeness of the relationship between features. Figure 10 represents the correlation graph of our training data.
Another method to analyze feature importance is SHAP (SHapley Additive exPlanations), which is based on Shapley values. The purpose of SHAP is to calculate the contribution of each feature to a prediction and thereby explain the prediction. The summary graph in Figure 11 combines the importance of the features with their effects. Each point in the diagram is the Shapley value of one feature for one data instance. The feature determines the position on the Y-axis, and the Shapley value determines the position on the X-axis. The features are arranged from top to bottom in descending order of importance, and the graph contains all the training data points. The horizontal location shows whether the effect of a value pushes the prediction above or below the expected value, and the color indicates whether the value of the feature is high (red) or low (blue). From the graph, we can observe that high values of the "year" feature, shown in red, have a significant positive effect, as they appear on the positive side of the X-axis. In addition, the "day of the week" feature has a negative relationship with the target variable, while the "day of the month" feature has less impact on the training data.
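A table of pairwise correlations of the kind plotted in a correlation graph can be computed as in the sketch below. It assumes the Pearson coefficient, which the text does not name explicitly, and it uses hypothetical column names and values. A constant column would cause a division by zero and is not handled here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_matrix(columns):
    """Return a nested dict m[a][b] with the correlation of columns a and b."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical columns: two features and the target.
columns = {
    "hour": [0.0, 1.0, 2.0, 3.0],
    "reverse": [3.0, 2.0, 1.0, 0.0],
    "load": [500.0, 510.0, 520.0, 530.0],
}
matrix = correlation_matrix(columns)
```

Values near 1 or -1 indicate a strong linear relationship, and values near 0 indicate a weak one.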

Forecasting Results
The comparison of actual and forecasted load values is depicted in Figure 12. As mentioned, the proposed model is trained on Jeju Island's actual energy consumption time series data, so the prediction is based on the training data. Figure 13 shows the forecast for June. The error rate is calculated for the forecast as a whole rather than for each month or day. The error rates in Table 2 show that the proposed model is not perfectly accurate; the error percentage reflects the deviation of the predictions from the actual values. To make the forecasted result easier to inspect, Figure 14 shows the graph of a single day (25 May 2019).

Model Goodness Inspection
Different evaluation metrics were used to inspect the goodness of the model, such as mean absolute error, mean absolute percent error, mean squared error, and root mean squared logarithmic error. We also chose the state-of-the-art models for the comparison with the proposed hybrid model. Table 2 shows the comparison of different evaluation metrics.

Mean Absolute Error
The mean absolute error (MAE) is the average of the absolute differences between the actual and predicted values over the dataset, so it is expressed in the same units as the load. The MAE of the proposed model is 15.727, and it is calculated using Equation (4).

Mean Squared Error
The mean squared error (MSE) is the average of the squared differences between the actual and predicted values over the dataset. It is calculated using Equation (5).

Root Mean Squared Logarithmic Error
The root mean squared logarithmic error (RMSLE) is obtained by Equation (6). It is the root mean squared difference between the logarithms of the actual and predicted values, so it measures relative rather than absolute deviation. By applying the proposed model, we obtained an RMSLE score of 0.0378, which is the lowest among the compared models.

Mean Absolute Percent Error
The mean absolute percent error (MAPE) is a measure of the accuracy of a prediction. It measures the size of the error relative to the actual value, is calculated using Equation (7), and is expressed as a percentage. We obtained an MAPE of 4.2949% by applying the proposed model. Table 3 compares our model's MAPE with the scores of state-of-the-art methods. It also shows the minimum and maximum errors as percentages. We compared our model with Lasso, Ridge, GradientBoost, MLP, SVR, and XGBoost, and the comparison shows that the proposed hybrid model performed well relative to these existing models. Figure 15 shows a graphical representation of the comparison.
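The four metrics of this section can be sketched in a few lines using their standard textbook definitions, which we assume correspond to Equations (4) to (7). The RMSLE below uses the common log(1 + y) form; that choice, the function names, and the sample values are assumptions made for illustration.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmsle(actual, predicted):
    """Root mean squared logarithmic error; requires non-negative values."""
    # log1p(x) computes log(1 + x).
    sq = ((math.log1p(p) - math.log1p(a)) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(sum(sq) / len(actual))


def mape(actual, predicted):
    """Mean absolute percent error; undefined if any actual value is zero."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical actual and predicted loads.
actual = [100.0, 200.0]
predicted = [110.0, 180.0]
results = {
    "MAE": mae(actual, predicted),
    "MSE": mse(actual, predicted),
    "RMSLE": rmsle(actual, predicted),
    "MAPE": mape(actual, predicted),
}
```

MAE and MSE depend on the scale of the load, whereas RMSLE and MAPE are relative measures, which is why they are easier to compare across datasets.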

Conclusions
This article gives a brief introduction to the basic knowledge and classification of power systems' load forecasting. It classifies load forecasting methods into traditional and modern smart forecasting methods and discusses their advantages and disadvantages. In today's world, renewable energy is integrated with nonrenewable energy to power electric grids, which introduces new challenges because of its intermittency and volatility. Energy forecasting using soft computing technologies plays an essential role in solving these challenges. When predicting electricity consumption, it is necessary to choose an appropriate prediction method according to the expected forecasting results and the characteristics of the prediction model. Moreover, individual forecasting models have limitations, so combinations of prediction methods are receiving increasing attention. Consequently, we propose a new hybrid method based on machine learning. The proposed model combines multilayer perceptron, support vector regression, and CatBoost. We used the combined load data of renewable and nonrenewable energy sources to forecast energy consumption. We performed exploratory data analysis, pre-processing, and a train-test split before training the model. We used various metrics to test the advantages of the proposed model: mean absolute error, mean absolute percent error, mean squared error, and root mean squared logarithmic error. We also selected state-of-the-art models for comparison with the proposed hybrid model.

Funding: This research received no external funding.

Conflicts of Interest:
The authors declare no conflict of interest regarding the design of this study, the analyses, and the writing of this manuscript.

Abbreviations
The following abbreviations are used in this manuscript: