Artificial Intelligence Techniques in Smart Grid: A Survey

The smart grid is enabling the collection of massive amounts of high-dimensional and multi-type data about the electric power grid operations, by integrating advanced metering infrastructure, control technologies, and communication technologies. However, the traditional modeling, optimization, and control technologies have many limitations in processing the data; thus, the applications of artificial intelligence (AI) techniques in the smart grid are becoming more apparent. This survey presents a structured review of the existing research into some common AI techniques applied to load forecasting, power grid stability assessment, faults detection, and security problems in the smart grid and power systems. It also provides further research challenges for applying AI technologies to realize truly smart grid systems. Finally, this survey presents opportunities of applying AI to smart grid problems. The paper concludes that the applications of AI techniques can enhance and improve the reliability and resilience of smart grid systems.


Introduction
The concept of the smart grid is transitioning the traditional electric power grid from an electromechanically controlled system to an electronically controlled network. According to the US Department of Energy's Smart Grid System Report [1], smart grid systems consist of information management, control technologies, digitally based sensing, communication technologies, and field devices that function to coordinate multiple electric processes. These smart grid technologies have changed conventional grid planning and operation in at least three main areas, primarily in the ability to (1) monitor or measure processes, communicate data back to operation centers, and often respond automatically to adjust a process; (2) share data among devices and systems; and (3) process, analyze, and help operators access and apply the data coming from digital technologies throughout the grid. Related problem areas in smart grids include load forecasting (LF), power grid stability assessment, fault detection (FD), and smart grid security. These key elements allow massive amounts of high-dimensional and multitype data to be collected about electric power grid operations. However, traditional modeling, optimization, and control technologies have many limitations in processing these datasets; thus, the applications of artificial intelligence (AI) techniques in the smart grid are becoming more apparent.
AI techniques use massive amounts of data to create intelligent machines that can handle tasks that require human intelligence. Machine learning (ML) is a branch of AI. The learning-based techniques reviewed in this survey fall into the following categories:
• Supervised learning: An ML paradigm in which the mapping from inputs to outputs is learned from labeled examples in order to predict the outputs of new inputs.
• Unsupervised learning: An ML class in which unlabeled data are used to capture the similarities and differences in the data.
• Reinforcement learning (RL): Differs from supervised and unsupervised learning in that an intelligent agent learns a strategy that maximizes a notion of cumulative reward.
• Ensemble methods: Combine the results from several AI algorithms to overcome the limitations of a single algorithm and achieve better overall performance.

Expert Systems
The expert system (ES; see Figure 1) is the first-generation intelligent system, designed to replace the human expert in a certain domain by solving a certain problem based on Boolean logic. The solution to many smart grid problems in certain fields, such as fault diagnosis, intelligent control, and energy router self-determination, still depends on the ES technique [10]. The domain knowledge acquired from the domain expert is represented in the knowledge base of the ES. Expert knowledge and databases form the knowledge base, which is the core component of the ES. In the knowledge base, rules are defined in the form of if-then statements connected by logical operations [3]. The knowledge can be acquired directly from domain experts or from the results of research studies. The ES draws conclusions about the problem by testing the if-then rules against user-input information, which interfaces with the knowledge base through the intermediate rule engine.
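The if-then inference described above can be sketched as a small forward-chaining rule engine. This is only a minimal illustration, not the design of any system cited here; the fact names (`overcurrent`, `breaker_open`, `voltage_drop`) and the two rules are hypothetical.

```python
# Minimal forward-chaining rule engine (illustrative sketch).
# All facts and rules below are hypothetical examples for fault diagnosis.

RULES = [
    # (set of facts required by the "if" part, fact concluded by the "then" part)
    ({"overcurrent", "breaker_open"}, "line_fault"),
    ({"line_fault", "voltage_drop"}, "isolate_feeder"),
]

def infer(facts):
    """Apply the if-then rules repeatedly until no new fact is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in RULES:
            # A rule fires when all of its conditions are already known facts.
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

result = infer({"overcurrent", "breaker_open", "voltage_drop"})
```

Here the knowledge base is the `RULES` list and the rule engine is the loop in `infer`; a conclusion of one rule can serve as a condition of another, which is why the loop repeats until nothing changes.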
Fuzzy logic (FL) was proposed to handle the concept of partial truth. Unlike the Boolean logic used by ES, FL is an approach to computing based on truth values that vary between 0 and 1. FL emerged from the theory of fuzzy sets, which assigns each element a degree of membership, typically a value between 0 and 1. For example, FL can use 0 to represent totally false, 1 to represent totally true, and the numbers between 0 and 1 to represent partial truth or partial falsity, by assigning degrees of truth to propositions. FL is often understood in a very wide sense that covers all kinds of formalisms dealing with degrees. A fuzzy inference system (FIS) first converts crisp input variables into fuzzy variables (fuzzification). After the fuzzy operators in the "if" part of each rule are applied to the input variables, the consequent is inferred from the "then" part of the rule. The last step of an FIS is defuzzification, which converts the output to crisp values. The Mamdani and Sugeno methods are two popular FIS approaches. Both apply several rules and determine the degree of fulfillment of each.
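The fuzzification, rule evaluation, and defuzzification steps can be sketched with a zero-order Sugeno-style system, in which each rule has a crisp consequent and the output is the average of the consequents weighted by the degrees of fulfillment. The input variable (feeder loading in percent), the membership breakpoints, and the output values are assumptions chosen only for illustration.

```python
def triangular(x, a, b, c):
    """Triangular membership function: 0 at a and c, 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def fuzzy_load_adjustment(load_pct):
    """Map a crisp loading (0-100 %) to a crisp adjustment (hypothetical units)."""
    # Fuzzification: degree of membership in each fuzzy set.
    # The outer breakpoints (-1 and 101) lie just outside the input range so
    # that 0 % is fully "low" and 100 % is fully "high".
    low = triangular(load_pct, -1.0, 0.0, 60.0)
    medium = triangular(load_pct, 40.0, 60.0, 80.0)
    high = triangular(load_pct, 60.0, 100.0, 101.0)

    # Rule base: (degree of fulfillment of the "if" part, crisp "then" value).
    rules = [(low, 0.0), (medium, 5.0), (high, 20.0)]

    # Defuzzification: weighted average of the rule consequents.
    total = sum(weight for weight, _ in rules)
    if total == 0.0:
        return 0.0
    return sum(weight * value for weight, value in rules) / total

adjustment = fuzzy_load_adjustment(70.0)
```

At 70 % loading the input is half "medium" and one-quarter "high", so the two rules contribute in a 2:1 ratio. A Mamdani system would instead produce fuzzy output sets and defuzzify them, for example by taking a centroid.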

Supervised Learning
Supervised learning is the ML task of building a general hypothesis that maps inputs to outputs, trained on externally supplied labeled input-output pairs [11]. After training, the mapping function can be used to predict the outputs of future data. A wide range of supervised learning algorithms have been developed in the last two decades and are widely used to improve smart grid systems. Figure 2 lists the supervised learning algorithms commonly used in the smart grid.
Artificial neural networks (ANNs), which tend to emulate the biological nervous system [12], have enormously influenced a variety of areas in the last decade. ANN techniques, like many other ML techniques, do not need to be explicitly programmed but use algorithms to make predictions based on data. ANNs very efficiently solve image processing and pattern recognition problems that are difficult to solve by traditional methods. Extreme learning machines (ELMs), which use a feedforward neural network with one hidden layer, are an ANN algorithm that has been applied to smart grid problems such as power system stability assessment [13][14][15] and fault detection [16][17][18]. Rumelhart et al. [19] proposed the back-propagation neural network (BPNN), a learning procedure that repeatedly adjusts the network weights until the error between the output and the ground truth reaches a certain level. BPNN has been widely used in different neural network algorithms. A multilayer perceptron is a feedforward neural network algorithm [20]. Another well-developed feedforward neural network is the probabilistic neural network (PNN), in which the parent probability distribution function of each class is used to estimate the class of the input data [21].
Driven by increasing amounts of data and the need to solve more complex problems, there has been a significant emergence of new AI algorithms with the support of powerful computer hardware, allowing AI to enter the so-called AI 2.0 stage [22]. Deep learning (DL), which is a subset of ML, was originally used for image processing, starting from multilayer deep neural networks (DNNs). DL techniques have developed rapidly in recent years, and numerous successful structures have been proposed to solve smart grid problems, including deep belief networks [23], convolutional neural networks (CNNs) [24], recurrent neural networks (RNNs) [25], generative adversarial networks [26], and autoencoders [27].
Aside from the aforementioned algorithms, numerous AI methods are also employed for classification and regression problems. The support vector machine (SVM), proposed by Vapnik [28], is one of the most robust classification models. The k-nearest neighbors (KNN) algorithm, which is very fast to train, is also used for classification and regression in smart grid systems [29][30][31]. The decision tree learning model and logistic regression, which are very easy to interpret and implement, have also been widely adopted in smart grid systems [32,33]. Regression methods, such as linear regression (LR) [34], Gaussian process regression (GPR) [35], support vector regression (SVR) [36], and multivariate adaptive regression splines (MARS) [37,38], provide solutions for smart grid forecasting, fault detection, demand response, and so on.
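KNN trains quickly because it simply stores the labeled samples and defers all computation to prediction time. The sketch below classifies a query point by majority vote among its k nearest stored samples; the two features (per-unit voltage and current) and the sample values are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples.

    `train` is a list of (feature_vector, label) pairs.
    """
    # Sort stored samples by Euclidean distance to the query and keep k of them.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical labeled samples: (voltage in p.u., current in p.u.) -> condition.
train = [
    ((1.00, 0.50), "normal"),
    ((0.98, 0.55), "normal"),
    ((1.01, 0.48), "normal"),
    ((0.60, 1.80), "fault"),
    ((0.55, 2.00), "fault"),
    ((0.65, 1.70), "fault"),
]

label = knn_predict(train, (0.62, 1.75))
```

The cost of this laziness is that every prediction scans the whole training set, which matters for large smart meter datasets.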

Unsupervised Learning
Supervised learning algorithms show great performance after decades of development, but they are only beneficial when users have some ground truth or know what patterns to look for, which is not always guaranteed in the real world. This makes unsupervised learning useful because it can be used to infer potential information or find hidden patterns from data without labels. Figure 3 lists the common unsupervised learning algorithms.
Unsupervised neural networks, such as the restricted Boltzmann machine, autoencoder, and variational autoencoder, are applied to anomaly detection [39,40], stability assessment [41], load forecasting [42][43][44], and so on. Clustering is the unsupervised task of partitioning data points into groups such that the data in the same group are similar to each other. K-means, fuzzy c-means, hierarchical clustering, and DBSCAN (density-based spatial clustering of applications with noise) are commonly used for fault detection [45] and load forecasting [46][47][48]. Dimensionality reduction (DR) techniques, which transform data from a high-dimensional space to a low-dimensional space, are often required when processing smart grid data to remove redundant features. DR methods commonly used in the smart grid [43][49][50][51] include principal component analysis (PCA), linear discriminant analysis, generalized discriminant analysis, and non-negative matrix factorization.
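As an illustration of clustering, the following sketch runs k-means on one-dimensional data by alternating between assigning each value to its nearest centroid and recomputing each centroid as the mean of its cluster. The load values and the initial centroids are made up for the example, and real applications would typically cluster multi-dimensional load profiles.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Plain k-means on scalar values, starting from the given centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical average daily loads (kW) of six customers.
loads = [1.0, 1.2, 0.9, 5.0, 5.3, 4.8]
centroids, clusters = kmeans_1d(loads, centroids=[0.0, 6.0])
```

No labels are used at any point; the two groups emerge only from the distances between the values.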

Reinforcement Learning
RL is an increasingly popular approach to solving smart grid problems. An RL problem consists of an agent, an environment, rewards, and actions. RL aims to maximize the cumulative reward through a continuous process of receiving rewards and punishments for every action. Even with limited knowledge of the environment and limited feedback on the quality of its decisions, RL can respond to unforeseen scenarios. Figure 4 lists the commonly used RL algorithms. Q-learning and SARSA (state-action-reward-state-action) are used in attack detection [52] and energy management [42,53]. Deep reinforcement learning (DRL) combines the perception of DL with the decision making of RL. AlphaGo [54] demonstrates the success of DRL, which applies rich perception of high-dimensional input to policy control. Deep Q network and deep deterministic policy gradient are popular DRL algorithms in smart grid systems [55][56][57][58][59].
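The core of tabular Q-learning is a single update that moves the value of a state-action pair toward the received reward plus the discounted value of the best action in the next state. The sketch below shows that update only; the states, actions, reward, learning rate `alpha`, and discount factor `gamma` are hypothetical values for a toy battery-scheduling setting.

```python
def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One Q-learning step on a nested dict q[state][action] -> value."""
    # Value of the best action available in the next state (off-policy target).
    best_next = max(q[next_state].values())
    # Move the current estimate a fraction alpha toward the target.
    td_error = reward + gamma * best_next - q[state][action]
    q[state][action] += alpha * td_error
    return q[state][action]

# Hypothetical Q-table for a battery that reacts to electricity prices.
q = {
    "low_price": {"charge": 0.0, "idle": 0.0},
    "high_price": {"discharge": 1.0, "idle": 0.0},
}

# Charging costs 0.5 now but leads to a state where discharging is valuable.
new_value = q_update(q, "low_price", "charge", reward=-0.5, next_state="high_price")
```

SARSA differs only in the target: it uses the value of the action actually taken in the next state rather than the maximum over actions.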

Ensemble Methods
Ensemble methods combine results from multiple learning algorithms, or from models trained on different initial data, to obtain better overall performance. Bootstrap aggregating, or bagging, gives each model in the ensemble an equal-weight vote and trains each model on a random subset of the data. Random forest is a successful bagging model that combines randomized decision trees to achieve high classification accuracy. It is used for load forecasting [60], anomaly detection [61,62], and stability assessment [63]. Boosting is another ensemble method, in which each new model attempts to correct the misclassifications of the previous model; it shows promising results in smart grid problems [64][65][66]. Stacking, an ensemble learning technique that combines the predictions of several classification or regression algorithms, is well developed for load forecasting [67], anomaly detection [68], and cyberattack detection [69].
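Bagging can be sketched in a few lines: draw bootstrap samples with replacement, train one base model per sample, and let the models vote with equal weight. The base learner here is a deliberately simple threshold classifier that assumes class 1 has the larger feature values; that learner, the data, and the seed are illustrative assumptions rather than any method from the cited studies.

```python
import random
from collections import Counter

def train_stump(sample):
    """Threshold classifier: midpoint between the two class means."""
    pos = [x for x, y in sample if y == 1]
    neg = [x for x, y in sample if y == 0]
    if not pos or not neg:
        # Degenerate bootstrap sample: always predict the only class seen.
        only = 1 if pos else 0
        return lambda x: only
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x > threshold else 0

def bagging_fit(data, n_models=25, seed=0):
    """Train one stump per bootstrap sample (sampling with replacement)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        bootstrap = [rng.choice(data) for _ in data]
        models.append(train_stump(bootstrap))
    return models

def bagging_predict(models, x):
    """Equal-weight majority vote over all models in the ensemble."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

# Hypothetical 1-D data: (normalized anomaly score, label).
data = [(0.10, 0), (0.20, 0), (0.30, 0), (0.25, 0), (0.15, 0),
        (0.80, 1), (0.90, 1), (0.85, 1), (0.95, 1), (0.70, 1)]

models = bagging_fit(data)
prediction = bagging_predict(models, 0.99)
```

A random forest follows the same pattern with decision trees as base learners and additional randomization of the features considered at each split.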

Artificial Intelligence Techniques in Smart Grids
This section presents a review of AI techniques in smart grids.

Research Methodology
In line with the objective of our research, the authors adopted an inductive approach and conducted a systematic literature review, following Tranfield, Denyer, and Smart [70]. Specifically, the review scope was defined, the related literature was searched, the representative methods were selected, and the collected materials were analyzed.
Several queries were run against the Google Scholar database to gain an overall understanding of the coverage offered by the literature under the disciplines. We focused on peer-reviewed sources from top academic journals and conferences. For each criterion, searches were performed by using combinations of keywords containing the term of each criterion, "AI," and "smart grid" (e.g., "Short-Term Load Forecasting AI smart grid" for "Short-Term Load Forecasting"). The authors also opted to exclude studies in progress and tutorial literature from the search results. The search generated 148 peer-reviewed studies published between 2015 and 2021. Figure 5 presents the yearly count of the 148 studies. All 148 studies are reviewed in this paper; however, only 75 of them are listed in Tables 1-4. The remainder of this section discusses the applications of AI techniques to (1) load forecasting, which is further divided into short-term load forecasting, mid-term load forecasting, and long-term load forecasting; (2) power grid stability assessments, which contain transient stability assessments, frequency stability assessments, small-signal stability assessments, and voltage stability assessments; (3) fault detection; and (4) smart grid security.

Load Forecasting
With the high integration of renewable energy, such as solar, wind, and tidal power, the scheduling and operation of the smart grid are becoming increasingly uncertain and challenging. LF, as one of the key components for keeping the power system stable and smart, is critical for planning and operation in modern power systems. Accurate forecasting, which is beneficial for reducing production costs and saving electric power [71], is very challenging if the load is nonstationary. According to the forecasting horizon, LF can be classified into three levels [72]: (1) short-term LF (STLF), which predicts the load from minutes to hours ahead; (2) mid-term LF (MTLF), which predicts the load from hours to weeks ahead; and (3) long-term LF (LTLF), which predicts the load for years ahead. Moreover, LF can also be affected by various other features, such as weather, time, season, event, type of customer, and academic schedule. Generally, MTLF and LTLF are modeled as functions of historical power consumption data, along with other factors, such as weather, customers, and demographic data [73]. STLF has mostly been studied in applications such as real-time control, energy transfer scheduling, and demand response [74]. MTLF and LTLF can be used to plan for future power plants and to show the dynamics of the power system [73]. Based on the data provided by smart meters, many techniques have been proposed and applied for power system LF.

Short-Term Load Forecasting
Qiu et al. [75] propose a hybrid incremental learning approach that comprises discrete wavelet transform, empirical mode decomposition, and a random vector functional link network. By using the ensemble method, the efficiency and accuracy of STLF can be improved. Li et al. [76] present an ensemble model that integrates three base methods for STLF, and the experiments show the model's effectiveness. However, the choice of base methods in the ensemble approach needs further validation. Many DL-based methods are used to solve LF problems. In recent years, DNNs have been used to obtain the potential knowledge for a forecasting model. However, the ANN method is often trapped in local minima [77] and suffers from over-fitting. Shi et al. [78] proposed a pooling-based deep RNN for STLF that addresses the over-fitting issue by increasing data diversity and volume. To address the time-consuming procedure of building an optimal DNN, which involves determining the number of hidden layers in the DNN model, Moon et al. [67] used an ensemble method that combines multiple DNN models with different numbers of hidden layers and achieves better overall performance by eliminating the poorly performing models. However, the computing overhead is a limitation because several DNN models are included. In He, Deng, and Li [79], a deep belief network (DBN) embedded with parametric copula models is proposed to forecast the hourly load of a power grid in an urban area of Texas, and a comparison with neural networks, SVR, and ELM reflects the effectiveness of the method. Hafeez et al. [43] propose a hybrid algorithm that uses a factored conditional restricted Boltzmann machine (FCRBM) as the training module and genetic wind-driven optimization (GWDO) as the optimization algorithm. The model is validated by outperforming state-of-the-art algorithms.
Aly [80] built a hybrid clustering method based on wavelet neural network (WNN) and ANN schemes and showed that the proposed model outperforms other clustering methods.

Mid-Term Load Forecasting
Even though the majority of LF problems fall into STLF, MTLF and LTLF are also crucial for stable and smooth power system operation. MTLF is used to coordinate load dispatch, schedule maintenance, and balance demand and generation [81]. Unlike STLF, which fits data to a model, MTLF and LTLF pose different problems that are often ignored due to their complications [82] and randomness [83]. MTLF and LTLF are affected not only by explicit factors, such as historical load and weather data, but also by local economic and demographic data, such as population and appliances in use [81]. Unlike STLF, which treats all weather variables with equal importance, the weather indicators for MTLF and LTLF follow a decreasing order of importance: temperature, humidity, wind, and precipitation [84]. Jiang et al. [85] proposed a dynamic Bayes network (DBN)-based MTLF model to forecast the peak power load for the following year. Askari and Keynia [86] deployed a DNN model with an optimized training algorithm that comprises two search algorithms for MTLF in power systems and presented the effectiveness of the model. Liu et al. [87] also provided a neural network-based model with particle swarm optimization (PSO) and showed the feasibility and validity of the model. Rai and De [88] improved a support vector regression model for MTLF, achieving an average minimum mean absolute percentage error (MAPE) of 3.60. Gul et al. [89] provide a solution based on CNN and long short-term memory (LSTM) methods. Dudek et al. [90] propose a competitive hybrid DL model for MTLF that combines exponential smoothing, advanced LSTM, and the ensemble method.
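For readers unfamiliar with the MAPE metric mentioned above, it is commonly computed as the mean of the absolute forecast errors expressed as a percentage of the actual values. The sketch below uses that common definition with made-up load values; it does not reproduce the evaluation of any cited study.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero, because each error is divided by it.
    """
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)

# Hypothetical monthly peak loads (MW) and their forecasts.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
score = mape(actual, forecast)
```

Because each error is scaled by the actual load, MAPE allows comparison across systems of different sizes, but it becomes unstable when actual values are close to zero.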

Long-Term Load Forecasting
LTLF is used for predicting power consumption, system planning, and scheduling the expansion of generation units in power systems. Generally, it spans from a few years to a couple of decades. Because constructing new power generation requires a huge investment, accurate and effective forecasting is required. Many ML and AI techniques have been developed for this problem. Nalcaci et al. [37] show that the MARS method gives more accurate and stable results than ANN and LR models when predicting the relationship between load demand and several environmental variables. Ali et al. [91] applied a novel hybrid fuzzy-neuro model for LTLF. LSTM is also widely used in this domain. In 2017, Zheng et al. [72] exploited an LSTM-based RNN to capture the long-term dependencies in the electric load time series for LTLF, and the method showed promising performance. Agrawal et al. [92] also propose an LTLF model with hourly granularity that uses an LSTM network and achieves high accuracy. To solve the vanishing and exploding gradient problems of LSTM, Dong et al. [93] present a hybrid method based on LSTM and the gated recurrent unit (GRU) with good performance for LTLF. In Kumar et al. [94], Apache Spark was used to deploy a hybrid model that comprises LSTM and GRU for hyperparameter tuning purposes. Bouktif et al. [95] also propose an LSTM-RNN model for this task. Sangrody et al. [96] compared six commonly used ML techniques: ANN, SVM, RNN, KNN, GPR, and generalized regression neural network (GRNN). In that comparison, ANN showed better performance than the other five methods for LTLF. Table 1 summarizes the AI techniques for LF.

Power Grid Stability Assessment
The power grid stability assessment, which comprises transient stability, frequency stability, small-signal stability, and voltage stability [97,98], is fundamental for ensuring the reliability and security of the power system. Power system stability is the ability to stay at an equilibrium operation state or quickly reach a new equilibrium state of operation after a perturbation [99]. Traditional models [92][100][101][102] for stability assessments are complex and require significant computing resources because they rely heavily on accurate real-time dynamic power system models [98]. With the development of phasor measurement units (PMUs) and the wide area measurement system (WAMS), many data-driven AI methods have been applied to power grid stability analysis.

Transient Stability Assessment
Transient stability assessment (TSA) determines whether a system will remain synchronized after a large perturbation. The two most commonly used traditional methods for TSA are time domain simulations and direct methods. However, increasingly complex power systems make it very challenging to reach reliable decisions based on traditional TSA methods.
Fortunately, the development of AI technologies provides promising new methods for this issue by using the large volume of data collected by PMUs and WAMS. In Baltas et al. [99], three ML algorithms for online TSA (decision trees, SVMs, and ANNs) were compared by using two datasets. The results show similar performance for the three methods, and performance varies according to dataset quality. Mahdi et al. [103] also used a trained ANN model for online TSA prediction with promising performance. Hu et al. [104] developed two improved SVM methods that address the limitations of the traditional SVM and reduce false and missed alarms. Mosavi et al. [105] present a deep neuroclassifier for TSA and showed the high generalization capacity of the model. Tang et al. [106] propose a TSA method that combines trajectory fitting (TF) and ELM, and the hybrid method showed effectiveness and reliability. Yu et al. [107] propose an RNN-LSTM model that better learns the temporal dependencies of the input data. Tan et al. [108] built a supervised classifier that consists of a CNN and stacked autoencoders (SAEs) for TSA problems with high accuracy. Liu et al. [109] used an intelligent system that comprised an ensemble of neural networks based on ELMs and reported 100% accuracy. In 2020, Wang et al. [110] applied a deep belief network (DBN) to TSA with a great improvement in accuracy. Shi et al. [111] trained a CNN model to provide a solution for online TSA for power system control.

Frequency Stability Assessment
Power grid frequency stability, the subject of frequency stability assessments (FSAs), can be defined as the ability of a system to maintain a steady range of frequency following a severe system upset or perturbation that results in an imbalance between generation and load [98]. A large frequency deviation causes generation units to trip, which can eventually affect system stability. A few studies have focused on this area by using AI technologies. In 2019, Wang et al. [14] proposed a hybrid model that integrated a frequency response model with an ELM model for FSA.

Small-Signal Stability Assessment
Small-signal stability is defined as the ability of the system to maintain synchronism when it is under small disturbances [112]. The term "small-signal stability assessment" is interchangeable with the term "oscillatory stability assessment" (OSA). A CNN-based method [111] was also developed for OSA, and the results show that the model is robust to PMU noise and that algorithm performance does not degrade as the system grows in scale. Xiao et al. [113] used a multivariate random forest regression (MRFR) algorithm for OSA on an 18-bus test system, and the results showed high accuracy and robustness. Kamari et al. [114] deployed a PSO scheme to accelerate the determination of OSA.

Voltage Stability Assessment
Voltage collapse can significantly influence the stability of power systems. Thus, a voltage stability assessment (VSA) model that can evaluate the voltage stability of the system in a timely fashion would serve as a preventive measure. Numerous AI-based models have been proposed for VSA, such as ANN [115], SVM [116], decision trees [117], and FL [118]. Ashraf et al. [115] used an ANN model to estimate the loading margin of power systems and verified its effectiveness on the Institute of Electrical and Electronics Engineers (IEEE) 14-bus and 118-bus test systems. Amroune et al. [119] built a hybrid model that uses dragonfly optimization and SVR for online VSA. Mohammadi et al. [116] propose a method for VSA that uses an SVM. The results showed that the misclassification rates of the SVMs are as low as 2% for real power grids. Yang et al. [120] built a moment-based spectrum estimation method to gain insight into changes of voltage magnitudes for real-time static VSA. In Meng et al. [117], a decision tree model was used for online VSA. Liu et al. [121] built a feature selection model that uses partial mutual information (PMI) on an iterated random forest (IRF) model. An in-depth review is also found in Amroune [122]. Table 2 summarizes the AI techniques for the power system stability assessment.

Fault Detection
Fazai et al. [123] used an ELM-based method for fault location detection after extracting features with the wavelet transform (WT) and compared it with SVR and ANN models. Miraftabzadeh et al. [124] presented a GPR-based generalized likelihood ratio test to enhance FD performance in photovoltaic (PV) systems. In Ashrafuzzaman et al. [125], two ensembles, one with a supervised classifier and one with an unsupervised classifier, are used to detect stealthy false data injection. Niu et al. [126] built an ensemble framework that combined five ML algorithms for the analysis of power grid frequency disturbances. The model can detect faults at three levels of severity. Sirojan et al. [127] focused on high-impedance FD (HIFD) in power systems and proposed an ANN-based method that solves the problem with high accuracy (98.67%). ELM is also used for HIFD and is normally based on the wavelet packet transform [128]. Sirojan et al. [129] propose a method for line trip fault prediction in power systems that uses LSTM networks and SVM. In Haq et al. [130], an ML method based on the discrete wavelet transform and a double-channel extreme learning machine is proposed to locate and classify faults in transmission lines. To improve the accuracy of line trip fault prediction, Wang et al. [131] proposed a stacked sparse autoencoder-based network with SVM and PCA and demonstrated its application to real-world data.

Table 2. Summary of approaches for the power system stability assessment.

Author (Ref.)            Year  Objective  Techniques
Mahdi et al. [103]       2017  TSA        ANN
Tang et al. [106]        2017  TSA        ELM, TF
Tan et al. [108]         2017  TSA        CNN, SAEs
Liu et al. [109]         2017  TSA        Ensemble, NN, ELM
Ashraf et al. [115]      2017  VSA        ANN
Amroune et al. [118]     2017  VSA        SVR, FL
Baltas et al. [99]       2018  TSA        Decision tree, SVM, ANN
Mosavi et al. [105]      2018  TSA        ANN
Yu et al. [107]          2018  TSA        RNN, LSTM
Amroune et al. [119]     2018  VSA        SVR
Mohammadi et al. [116]   2018  VSA        SVM
Hu et al. [104]          2019  TSA        SVM
Wang et al. [14]         2019  FSA        ELM
Kamari et al. [114]      2019  OSA        PSO
Amroune et al. [122]     2019  VSA        Survey
Wang et al. [110]        2020  TSA        DBN
Shi et al. [111]         2020  TSA        CNN
Shi et al. [111]         2020  OSA        CNN
Xiao et al. [113]        2020  OSA        MRFR
Yang et al. [120]        2020  VSA        Spectrum estimation method
Meng et al. [117]        2020  VSA        Decision tree
Liu et al. [121]         2021  VSA        Random forest

With the development of microgrids, which present an effective power solution for the increased integration of renewable sources, FD for microgrids remains a challenge. Shafiullah et al. [132] used a hybrid approach that combines the S-transform and feedforward neural networks for distribution grid FD. Wang et al. [133] also evaluate ANN-based methods, and the results demonstrate the effectiveness of the model in detecting the time and location of faults. To handle labeled and unlabeled data, Shafiullah and Abido [134] propose a semisupervised ML model, which consists of a KNN model and a decision tree model, for FD in the transmission and distribution of microgrid systems. Jayamaha, Lidula, and Rajapakse [135] built an SVM-based algorithm to solve the problem of islanding and grid FD, and an experiment on a PV plant showed better performance than traditional methods. In 2017, Abdelgayed, Morsi, and Sidhu [136] used a PNN classifier for FD and fault diagnosis on the DC side of a PV system. In 2020, Hussain et al. [137] proposed a fault detection algorithm for PV systems based on ANN with 97% overall accuracy.
Condition monitoring in wind turbines is also important for improving maintenance by detecting faults at an early stage. Baghaee et al. [138] evaluate the effectiveness of deep ANNs in wind turbine FD. Gunturi and Sarkar [139] demonstrate the effectiveness of applying the ensemble method to energy theft detection. Table 3 summarizes the AI techniques for power system FD.

Table 3. Summary of approaches for power system FD.

Smart Grid Security
With the integration of advanced computing and communication technologies, the smart grid integrates distributed and green energy with the power grid by adding a cyber layer to the power grid and providing two-way energy flow and data communication. However, this has exposed the smart grid to numerous security issues due to the complexity of smart grid systems and the inherent weakness of communication technology. The most probable outcomes of smart grid cyberattacks are operational failures, synchronization loss, power supply interruption, high financial damages, social welfare damages, data theft, cascading failures, and complete blackouts [140]. Commonly used attacks include false data injection attacks (FDIAs) and distributed denial of service. The objective of an FDIA is to mislead the system operators by altering the original data. Accurate and fast detection of security issues or attacks is a prerequisite for stable grid system operation. In recent years, many approaches have been proposed by academia and industry to improve the overall security of smart grid systems. Several research papers have provided overviews of the prevailing security problems in smart grid systems from different perspectives [4][141][142][143][144][145]. This section summarizes the state-of-the-art AI technologies that are used to improve smart grid security.
ANNs and SVMs were used previously to detect FDIA. Zhou et al. [146] built a stacked denoising autoencoder (SDAE) neural network model to identify and classify four attacks in the smart grid with an accuracy as high as 96%. Cui et al. [147] built a smart grid intrusion detection model based on a whale optimization-trained ANN algorithm with one hidden layer. Kosek [148] also used an ANN-based model to discover malicious voltage control actions in the low-voltage distribution grid. Wu et al. [149] used an awareness mechanism that integrated fuzzy clustering, game theory, and RL algorithms to perform security situational analysis for the smart grid. Ni et al. [150] used an RL method for attack detection. Zhang et al. [151] demonstrated the superiority of a semisupervised framework based on domain-adversarial training, which transfers the knowledge of known attack incidences to detect returning threats at different hours and load patterns. The SVM method was also used for detection. Ahmed et al. [152] used an SVM-based algorithm to detect a new type of assault in the smart grid called covert cyber deception assault. In 2019, Ahmed et al. [153] also used an isolation forest method to detect the assault with better performance. Ozay et al. [154] compared several ML-based methods for smart grid security. Li et al. [155] demonstrated a novel hybrid CNN-random forest model for the automatic detection of electricity theft, which significantly affects power supply quality and operating profits. Table 4 summarizes the AI techniques for smart grid security.

Table 4. Summary of approaches for smart grid security.

Challenges of Artificial Intelligence in Smart Grids
Traditional power systems are very complex, and their analysis and control primarily depend on physical modeling and numerical calculations. As smart grids develop with a high penetration of environmentally friendly renewable energy and microgrids, the transition from the traditional power grid to smart grid systems has exposed more uncertainties and problems arising from the complex environment. Meanwhile, the current power system relies on old infrastructure, which adds further uncertainty to modern smart grid systems. Because the communication network is built on top of power systems, very large volumes of highly variable data must be handled; this remains a challenge for smart grids. Additionally, researchers are still working on the robustness, adaptiveness, and online processing of AI algorithms [156]. Although numerous data-driven methods have been proposed to deal with the problems of smart grids, many severe challenges remain, including the following.

• Integration of renewable energy: Highly integrated renewable energy is a key characteristic of smart grids. However, it presents several significant challenges because renewable generation is variable and unpredictable, and its power output can change abruptly and frequently [157].
• Preserving data security and privacy: Because smart grid systems employ a massive number of different devices and two-way communication, they are directly exposed to malicious users and are therefore more prone to cyberattacks than traditional power systems. The previous section showed that many novel security techniques have been developed to quickly identify cyber risks, false data injection, system data theft, electricity theft, and so on. However, network protocols, operating systems, and physical equipment in the current smart grid still expose the system to a wide variety of attacks. Current AI solutions for smart grid cybersecurity also involve trade-offs between security and performance.
• Big data fast storage and analysis: Another significant challenge is how to keep improving the performance and robustness of storing and retrieving big smart grid data for AI applications.
• Explainability of AI algorithms: AI algorithms generally suffer from the black box problem; that is, they are not interpretable or explainable. This is a barrier that AI algorithms currently face. Ibrahim, Dong, and Yang [158] provide a comprehensive discussion of this topic.
• Limitations of AI algorithms: The development of AI technologies greatly influences the deployment of AI in smart grid systems. However, the limitations of each method should be considered before it is applied to the smart grid.

Future of Artificial Intelligence in Smart Grids
The objective of smart grids is to achieve a fully self-learning system that will be responsive, adaptive, self-healing, fully automated, and cost effective [4]. Future directions and opportunities for achieving advanced smart grid systems are discussed as follows.
• Integration with cloud computing: To achieve a fully self-learning smart grid system, the integration of AI with cloud computing, which can enhance security and robustness and minimize outages, will play a more important role in smart grid systems.
• Fog computing: Fog computing preprocesses the raw data locally rather than forwarding the raw data to a cloud. By providing on-demand resources for computing, fog computing has numerous advantages (e.g., energy efficiency, scalability, flexibility). Some studies [159-162] have conducted tentative research on integrating fog computing into the smart grid. Fog computing will play a bigger role as the amount of data in the future smart grid increases.
• Transfer learning: The lack of labeled data is still one of the main challenges for smart grid analysis. Transfer learning reduces the amount of training data required, which motivates researchers to use it to address the problem of insufficient data. In recent years, deep transfer learning tasks [163] have received more attention, and they could have widespread applications in smart grid systems.
• Consumer behavior prediction: With the help of fog computing and the evolution of the 5G network, demand-side management is becoming a vital task for managing the participation of users in power systems. Learning patterns of consumer behavior and power consumption can greatly contribute to demand response tasks on the consumer side.
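The fog computing idea described above, preprocessing raw data locally rather than forwarding it to a cloud, can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function names (`summarize_window`, `fog_preprocess`), the meter identifiers, the summary fields, and the readings are all hypothetical, and no real fog or cloud API is used.

```python
from statistics import mean

def summarize_window(readings_kw):
    """Reduce one window of raw meter readings (kW) to a compact summary."""
    return {
        "count": len(readings_kw),
        "mean_kw": mean(readings_kw),
        "min_kw": min(readings_kw),
        "max_kw": max(readings_kw),
    }

def fog_preprocess(raw_by_meter):
    """Build the payload a fog node might forward: one summary per meter.

    Meters with no readings in the window are skipped, so nothing is
    forwarded for them.
    """
    return {
        meter_id: summarize_window(readings)
        for meter_id, readings in raw_by_meter.items()
        if readings
    }

# Hypothetical raw readings collected locally during one time window.
raw = {
    "meter-A": [1.2, 1.4, 1.3, 1.5],
    "meter-B": [0.4, 0.6],
    "meter-C": [],
}

payload = fog_preprocess(raw)
print(payload)
```

Forwarding a fixed-size summary per meter instead of every raw sample is one simple way to reduce upstream traffic; which statistics to keep would depend on the downstream AI task.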

Limitations
This review has limitations. First, the objectives of the study and the nature of the filtering process applied during the review naturally introduce a certain selection bias. For example, data collection processes, analyses, and interpretations are influenced by the subjective assessment of the authors. Moreover, limiting the literature search exclusively to Google Scholar might have omitted some relevant research. Second, using high-level search phrases for such a complex, diverse, and multidimensional subject area might have omitted some other related research. Finally, the authors are aware that their focus on certain application areas in smart grids might have omitted research that cuts across multiple application areas.

Conclusions
As the traditional electric grid system transitions to a smart grid system, conventional power system methods show limitations in processing and analyzing the massive amounts of data that are now the norm in a smart grid. Thus, AI techniques are being developed and applied to many applications in smart grid systems with promising results. This paper presents a survey of recent applications of AI techniques in four critical areas (that is, load forecasting, power grid stability assessment, fault detection, and security problems) not addressed in previous studies. It also discusses current challenges, opportunities, and the future scope of applying AI techniques to realize a truly smart grid.
Based on this survey, our conclusions can be summarized as follows: (i) AI techniques have been applied to several application areas that are critical to the reliability and resilience of a smart grid; (ii) even so, some challenges still limit additional applications of AI techniques, chief among them data privacy and security, as well as handling the "black box" nature of some AI techniques to achieve a human-centered approach to AI solutions design; and (iii) this survey should stimulate discussions in the application areas surveyed in this paper, which could further strengthen the exchange of ideas. In summary, the applications of AI techniques are being leveraged to enhance and improve the reliability and resilience of smart grid systems.
Our future research in this area will focus on surveying the implications of the "black box" nature of AI techniques on smart grid operations. Specifically, we will survey how smart grid operators have handled this problem. Such a survey could help researchers design more human-centered approaches to AI solutions.

Conflicts of Interest:
The authors declare no conflict of interest.