The emergence of smart grid (SG) technology has made the modern power system stable and increased its reliability. The communication between utility and consumers has proven to be beneficial for both parties. The former uses the information related to the demand and generates the effective incentive-based planes to control the pattern of demand up to some extent. Similarly, the latter receives the information related to price and incentives and alters the demand patterns to get the benefits in terms of low monetary cost. In spite of these positive aspects of the modern grid, there is still room for improvement. The main issues of the modern grid include improvement in energy efficiency, integration of renewable energy sources (RESs), and reduced emission of harmful gases which are a serious threat to the environment. These issues can be tackled by using the reliable and accurate information of future load demand, energy generation of RES, and suitable prices for upcoming intervals.
On the other hand, the latter are data-driven models which use historical data to train the model and, after training, predict the future values of the desired variables. These models include artificial neural networks (ANNs) and their variants, support vector machine (SVM) based on autoregressive moving average, etc. In this paper, a variant of ANN is implemented which is called long short term memory (LSTM) based recurrent neural network (RNN). This is a data-driven model which uses big data to train and map the input values to their respective output. It is a time series based model and usually works with a single value. In this paper, we have implemented this model with multiple input values to forecast the electricity price and demand. Moreover, the hyperparameters of this model are tuned using the Jaya optimization algorithm instead of the commonly used Adam optimizer.
Data preprocessing is also a very important phase of the electricity forecasting model. The data available for use is generally in raw form. In the preprocessing phase, it is cleaned and shaped according to the requirements of the forecasting model. The unprocessed and raw data leads the forecasting model to inaccurate predictions. A number of different data preprocessing techniques are available in the literature. In our paper, we have used only a selective number of input variables so we would not use any feature selection technique. Instead, we have used the z-score based method to eliminate the outliers from the data. Missing values are also removed and values of variables are normalized using a min-max scaler.
1.1. Related Work
The main objective of the study [4
] was to predict electricity prices using big data. Electricity price forecasting plays a significant role in cost reduction in SG. Decision tree (DT) and ANNs are popular for forecasting. The DT face overfitting problem which means it is good for training the model but it does not perform well for prediction. On the other hand, the ANN suffers poor convergence as its convergence is not easily controllable. Moreover, these learning-based algorithms do not use big data for predictions, instead they use only price data. So, by using big data for prediction, there is still room for improvement in accuracy. To address the aforementioned issues, the authors used big data and an SVM based model for electricity price prediction. SVM has two major issues that affect its performance. The first one is computational time and the second is tuning of super parameters. The first issue is addressed by reducing the dimensions of data and applying feature selection and extraction techniques. In the first step, random forest, and relief forest algorithms are used to select the important features then KPCA is applied for feature extractions. For parameter tuning, the traditional method of grid search and cross-validation are used.
Two architectures were proposed for SG big data analytics along with a software application in [7
]. This data was gathered from energy analyzers mounted at a suitable location in the SG system. The proposed software application was used for data analysis, monitoring, and collection. The main focus of this study was on peak clipping and load shifting. The performance of both frameworks was evaluated on the basis of latency, data rate, and network overhead. The results of both architectures depict that they are suitable and working efficiently on a small scale. However, the same architectures are extendable for the large scale. The representational state transfer (REST) interface is suitable for a relatively lower amount of analyzers. The integration of more analyzers in the network degrades the performance of the REST interface.
Traditionally, a data-driven or model-driven approach is used for dynamic event monitoring. However, the authors in [8
] have proposed a framework for dynamic event monitoring which combines the features of both approaches for better and efficient performance. In the proposed framework, the functional components were formulated using phasor measurement unit. In the next step, the representative features were learned from each component. In the third and final step, neural network based classifiers were trained to detect and classify the data and events. Simulations were carried out using the IEEE 39-bus system for evaluation of the proposed framework. The results depict the efficient event detection and classification.
Forecasting of energy consumption was investigated for different event venues in the study [9
]. Two forecasting methods, support vector regression (SVR) and neural networks, were used and their performance was compared on the basis of forecasting accuracy and mean absolute percentage error (MAPE) and cross validation error rate. Three scenarios were designed; one with fifteen-minute data, the second with one hour data, and the third with daily energy consumption data. Both methods were applied to these three scenarios and their performance was compared. The results show that both forecasting methods performed well in all three scenarios. However, the performance of neural networks, when applied to daily data, is better than SVR. Moreover, they concluded from performance evaluation that for the other two methods: fifteen minutes and hourly data, there is no dominance of one technique over another. Instead, both methods perform equally well. If we talk about the accuracy of peak consumption and total consumption then peak consumption was forecasted accurately by both models.
Authors in [10
] presented a comprehensive study of energy management based on big data analytics. Data generation sources and their properties were discussed. Moreover, a data processing model is also proposed, where steps of big data analysis are elaborated and each step is briefly discussed. This study covers the energy management aspects, microgrid and assets management, and demand-side management. Additionally, the challenges of data-driven approaches in SG, related to data collection, infrastructure, storage, integration, processing manipulation, security, and privacy are discussed and their issues are highlighted.
The study [11
] presents a detailed review of big data, its analysis methods, main issues, and challenges in the energy efficiency of buildings. A comparative analysis of big data research publications in energy with other disciplines is also presented in this paper. This comparison depicts the time-line of research publications on big data. The main issues of big data, taking out the required data in a limited time interval, and management of high dimensional data and limited processing capabilities of existing systems are highlighted. However, this paper does not suggest any suitable measure to address the issues of big data.
A data-driven approach was proposed to mine the energy consumption behavior of consumers over time in [12
]. The frequent pattern mining technique was used to capture the association of appliances of a user. The energy consumption data, generated from smart meters, was used for this purpose and mining methods were applied as this data is easily available. The proposed mechanism is an incremental progressive method. The association of appliances changes frequently, so this model records these frequent changes and mines out the dependencies of appliances. They concluded that in order to alter the energy consumption pattern and reduce energy consumption, the “appliances of interest” play a very important role. Moreover, they also demonstrated in results that consumer’s energy consumption behavior is directly affected by the association of appliances.
Authors in [13
] stated that energy consumption forecasting is beneficial for both building owners and utility. In this regard, the forecasting based on artificial intelligence and other conventional methods is reviewed. The conventional methods include stochastic time series and regression-based approaches. They concluded that the non-conventional methods are not efficient for non-linear data and they are also not flexible. On the other hand, the artificial intelligence-based methods discussed in this paper include ANN and SVM based methods and their variations. They stated that these methods have better performance than conventional methods and they perform quite well for data with non-linear patterns. Additionally, hybrid methods: hybrid SVM, hybrid neural networks and hybrid swarm-based methods are also discussed. These methods are proposed by researchers and proved to have better performance than originally available methods. They concluded that conventional methods are easy to develop but performance-wise artificial intelligence-based methods are better.
The study in [14
] aims at the use of big data in the electrical power system for control, process, and protection purposes. As the volume of data is huge and increasing rapidly, traditional database tools are unable to process this data. Special tools are used to handle this type of data. So, in this study, three important aspects of big data usage, feature extraction, and integration and application, are discussed. Moreover, this paper outlines the application of big data in power systems for asset management, fault detection, operation planning, and distributed generation. Data management steps are also discussed and future research directions of big data implication in SG are outlined.
Authors in [15
] discussed the application of IoT in SG. They have also considered the integration of RES in the system. They highlighted that IoT enabled SG generates a huge volume of data on a daily basis, which can be stored and used in the future. As this data is huge in volume, collected from heterogeneous sources, and increasing with the passage of time, it is referred to as big data. This big data can play a significant role in the operation, planning, and efficiency of future SG. Technologies used to store and process big data are also discussed in this paper. Moreover, an IoT framework for SG is also presented, which consists of three layers. The perception layer is used to collect data, the network layer transfers this data from source to destination using communication protocols, and the third layer is responsible for the definition of applications of this collected data, e.g., demand response, fault detection, demand profiling, the operation management of RES, etc.
The work of Jain et al. [16
] explores the impact of temporal granularity (daily, hourly, and 10-min intervals) on the accuracy of electricity consumption forecasting. They achieved the best results with hourly intervals and monitoring by floor level. Jain et al. studied a residential building; this research is concerned with large commercial customers, specifically event-organizing venues. To handle large variations in energy consumption caused by events, they have included contextual information about future events such as event type and schedule. Moreover, in addition to consumption prediction, this work also includes peak demand prediction.
Authors in [17
], proposed a short term load forecasting model for industrial applications. The proposed model is based on ANN and modified enhanced deferential evolutionary technique to improve the forecast accuracy. For feature extraction and network training, mutual information based techniques and a multivariate autoregressive model were used, respectively. The fast accurate short load forecast model is enabled via training to forecast the future load. Simulation results express that the proposed model provides 99.5% accurate predictions. However, the accuracy is improved by feeding the output of the forecasting module to the optimization module which takes more time to execute.
A real-time anomaly detection framework was proposed in [18
]. This framework detects the abnormalities with the help of smart meter generated data. At each consumer’s side, the error count was measured and delivered to the system with which that user is linked. The error counts of each consumer were infused together to detect the anomalies in the system and acquire information about if an anomaly is affecting multiple customers or only a single customer. In this model, multiple levels of an anomaly are defined which are mutually exclusive and represented by M. A health vector was defined which represents the detected anomalies. The smart meter measurements are the error counts and each smart meter can have one anomaly at a time. The number of the error count is defined over a Poisson distribution. At the system level, an additional variable was used which represents the ratio of anomaly detected and N customers. Missing points in data are also determined and defined over a Bernoulli distribution. Missing points in data means whether the error count has reached the literal or reached it with latency.
A big data framework was proposed in [19
] for SG related data. There are seven important stages of this framework: data generation, data acquisition, data storing, data querying, data analytics, data monitoring, and data processing. In this paper, authors have proposed a compact big data framework for SG, where they have discussed the open-source prevalent platform to tackle big data challenges and issues. In addition to the big data core components, the utilization of open-source state-of-the-art prevalent, Hadoop platform for addressing SGs big data challenges, are also discussed. The open-source tools are adopted, which provide an easy and cost-effective development environment for practicing engineering to develop similar tools for their demanding SG applications. Moreover, practical implementation of the framework, including necessary configuration and coding, and its application on real SG data are also presented. As data is gathered using a big data framework, this data can be used for forecasting by applying data mining techniques on available data. The forecasting can be beneficial for both utility and consumer. From the utility’s point of view, demand and energy consumption patterns of users can be predicted and energy generation and demand response programs can be implemented accordingly. From the user’s point of view, electricity cost can be predicted or electricity load information can be predicted and they can adjust their power usage accordingly.
Demand-side management is an important application in the future SG. In [20
], the authors proposed a two-tier cloud-based demand-side management system that schedules the power consumed by customers in different regions and in microgrids so that both customer and utility company costs are optimized. As the optimization problem is addressed globally at the cloud level, so, the objective function is also global and is beneficial for both users and utility at the same time. The cloud-based nature of the proposed model reduces the cost of computation. The two-tier framework solves the network congestion and high computational time problems as well because each region has its own edge cloud for computation and users interact with it, only filtered data and necessary information related to optimization from edge clouds is forwarded to the core cloud.
A data-driven approach was used in [21
] for anomaly detection in SG. In this work, the proposed architecture performed analysis and the final results were compared with the results obtained through random matrix theory (RMT). RMT is a purely mathematical model that is introduced in this paper to detect the anomalies along with the data-driven architecture. These methods are beneficial for both user and utility. The RMT uses a split window like a phenomenon to process data, so a group of consumers can share information and get the information in which they are interested in a small scale. Whereas, the utilities are big enough to control the groups and process their data.
The authors stated in [22
], that the deployment of monitoring devices in SG is generating big data. The use of this big data plays a very important role in increasing the efficiency of the SG system. In this work, the focus of researchers is on the issues of the data collection side. Two limitations in the existing literature are highlighted. One is that data is extracted and used for a single task only, whereas, the same data can be used for multiple tasks, e.g., voltage stability and prediction of the load. The second limitation is that data is acquired from source nodes and delivered to the destination nodes for information extraction. Instead of this, the intermediate nodes should be able to extract information and forward only important information. To overcome these limitations, two algorithms were proposed. The first one determines the suitable number and location for power nodes (intermediate nodes) placement and the second algorithm calculates the in-path bandwidth saving ratio. The proposed big data framework is effective for the improvement of computational efficiency and saving the in-path bandwidth ratio. The emulator is designed in C++.
A forecasting model for electricity prices and load was proposed in [23
]. This is a mixed forecasting model which uses an iterative neural network to forecast the future value of both electricity load and prices. From the results, they observed that the forecasting error, between forecasted values by model and real values of both electricity load and price, was very low. The forecasting problem is considered as an optimization problem, where, the objective is to reduce the forecasting error while forecasting the electricity load and prices simultaneously. As both variables are interdependent, the model uses values of both variables iteratively and does not include the values of known variables. Additionally, they claimed that this model can be applied to any real world scenario to forecast the future values of interdependent variables and the number of these variables can also vary and are not specific to two. The forecasting results of the proposed model were compared with existing techniques and it was evident that the forecasting accuracy of this model is better.
A load and price forecasting model was proposed in [24
]. They stated that the price of electricity changes dynamically and its change depends on the difference between the available power and load demand at a particular time interval. To manage demand of electricity according to the available power, forecasting of both variables is of great importance. Two separate kernel methods are proposed to forecast the values of these variables. These forecasted values are then used to alter the load consumption pattern. The user interaction is minimal in this model and the user controls the values of only three variables, i.e., how much load he wants to curtail, how much electricity he wants to consume, and which load to shift. As a result, the user gets the monetary incentives in the form of low electricity consumption cost.
Authors in [25
] proposed a two-staged short term load forecasting model that is price-dependent. They stated that the electricity prices were changed in real time according to the load consumption patterns. The consumers can alter their patterns according to the real time prices of electricity and minimize their electricity consumption cost. The proposed two-staged model first forecasts the load consumption without considering electricity prices. In the next stage, the price sensitivity is added to the forecasted values of the first stage. Fuzzy logic is used at this stage and to optimize the number of rules and parameters of the membership function automatically, a genetic algorithm is used. In this way, the forecasting values are improved and the forecasting error is minimized.
Another fuzzy logic based load prediction method is proposed in [26
]. In this model, the electricity consumption was forecasted using a neural network using historical load data. After load prediction, the values were adjusted by incorporating the effect of electricity prices on load consumption. Fuzzy logic was used for this price adjustment process. From the simulation results, it can clearly be observed that the addition of the electricity price effect on load forecasting significantly improved the forecasting accuracy of the proposed model.
The real time pricing schemes greatly affect the power consumption patterns and cost of electricity of users. To manage load efficiently and reduce the cost, the knowledge of future values of electricity price is inevitable. To assist the users in this regard, a price forecasting model was proposed in [27
]. A set of relevance vector machines is used to forecast the electricity price and later on these forecasted values are aggregated and their coefficients are determined. For an optimized ensemble method for price forecasting, a micro genetic algorithm was used. Moreover, the performance of the proposed model is compared with naive Bayes and auto regressive moving average in terms of mean absolute error (MAE). The results depict that the proposed ensemble method has the highest forecasting accuracy.