Meticulously Intelligent Identification System for Smart Grid Network Stability to Optimize Risk Management

Abstract: The heterogeneous and interoperable nature of the cyber-physical system (CPS) has enabled the smart grid (SG) to operate near the stability limits with an inconsiderable accuracy margin. This has imposed the need for more intelligent, predictive, fast, and accurate algorithms that are able to operate the grid autonomously to avoid cascading failures and/or blackouts. In this paper, a new comprehensive identification system is proposed that employs various machine learning architectures for classifying stability records in smart grid networks. Specifically, seven machine learning architectures are investigated, including optimizable support vector machine (SVM), decision tree classifier (DTC), logistic regression classifier (LRC), naïve Bayes classifier (NBC), linear discriminant classifier (LDC), k-nearest neighbor (kNN), and ensemble boosted classifier (EBC). The developed models are evaluated and contrasted in terms of various performance evaluation metrics, such as accuracy, precision, recall, harmonic mean, prediction overhead, and others. Moreover, the system performance was evaluated on a recent and significant dataset for smart grid network stability (SGN_Stab2018), scoring a high identification accuracy (99.90%) with low identification overhead (4.17 µSec) for the optimizable SVM architecture. We also provide an in-depth description of our implementation in conjunction with an extensive experimental evaluation as well as a comparison with state-of-the-art models. The comparison outcomes indicate that the optimized model is compact and efficient and can successfully and accurately predict the voltage stability margin (VSM) under different operating conditions while employing the fewest possible input features. Eventually, the results revealed the competency and superiority of the proposed optimized model over the other available models.
The technique also speeds up the training process by reducing the number of simulations on a detailed power system model around operating points where correct predictions are made.


Introduction
During the last few decades, a pronounced growth in the global net consumption of electricity has amounted to approximately 23,398 billion kWh in 2018 [1]. Power consumption is continuously increasing, which puts more pressure on the usage of the earth's natural assets to generate electricity to accommodate this huge demand (i.e., the United States' power system alone accounts for up to 40% of all nationwide carbon dioxide emissions [2]). To avoid this expensive and complicated scenario, there has been extensive research in the following areas: (1) optimized energy utilization and efficiency; (2) improvements in system reliability, security, and resiliency; and (3) economical distribution and electricity management. The use of information and communication technology (ICT) and Artificial Intelligence (AI) has advanced the electric power system of tomorrow, which integrates such state-of-the-art technologies. The increasing deployment of intelligent embedded systems in the SG, driven by the dynamic power behavior of end-users, has led to integrating Information Technology (IT) with the physical side of the grid. In order to get a much more factual picture of the voltage stability phenomena, it is crucial to take the dynamic behavior of the system into account [22]. On the other hand, applying traditional dynamic methods may require more computational analysis and a time-consuming process for online use. Machine learning techniques are an attractive alternative to overcome the aforementioned problems because of their ability to learn complex non-linear relationships and their modular structures, which allow parallel processing.
Consequently, it is indispensable to identify the stability state of smart electric grid networks using autonomous intelligent techniques to minimize implementation risks. This paper proposes a novel machine learning-based framework to uncover the stability of SGs and provide early detection of system faults before the physical implementation process, which can minimize the instability impacts and optimize risk management. In this paper, seven machine learning techniques are modeled to classify smart electric grid network stability as either stable or unstable. To achieve the maximum classification performance, we have contrasted the seven machine learning models in terms of nine performance indicators, in addition to contrasting our best-proposed model with other existing models. Eventually, the comparison outcomes revealed the competency and superiority of the optimized model over the other available models. Specifically, the contributions of this paper can be listed as follows:

•
Providing comprehensive identification models that employ various machine learning architectures to classify and accurately predict the VSM records in SGs.

•
Evaluating the optimized performance of the identification system on a recent and significant dataset for smart grid networks stability (SGN_Stab2018), which has achieved a high identification accuracy (99.90%) with low identification overhead (4.17 µSec) for the optimizable-SVM architecture.

•
Providing an in-depth explanation of our implementation in conjunction with an extensive experimental evaluation as well as comparison with state-of-the-art models.
The remaining parts of this article are organized as follows: the examination of related research and models is presented in Section 2. In Section 3, we provide an inclusive description of the model development workflow, including details about the dataset for networked SGs, the diverse evaluation metrics, and the different examined predictive models, emphasizing the optimum model architecture and specifications. Section 4 presents an extensive experimental evaluation as well as a comparison with state-of-the-art models. The main inferences that can be drawn from the results are presented in Section 5. Lastly, in Section 6, we provide a conclusion of the research work.

Literature Review
Many dissertations/papers have been published discussing the use of AI applications in SGs. Most of the fields cover cyber security, micro-grids, load/power consumption forecasting, defect/fault detection, demand response, stability analysis, and other areas related to the technical fields of SGs. This section discusses the recent state-of-the-art works related to the stability analysis in SGs, more specifically from two sides: (i) the approaches that have treated the stability of the SGs, and (ii) the application of ML techniques to predict the behavior of the SG.
Industries must deal with high-volume data management by analyzing and evaluating data and identifying patterns within a specified period. However, SGs are highly non-linear, operating under constantly changing operating conditions and load variation in response to a disturbance; generally, these are considered the driving forces for voltage instability. Traditionally, stability indicators have been used to estimate the operating conditions; these have to be computed within a short time limit and require minimum computational analysis. In addition, the characteristics have to be predictable and quickly calculable [23]. However, several drawbacks of these indices include the fact that they are extremely non-linear and discrete for variable operation conditions [23,24]. One development in this respect is to improve parameter measurement by using phasor measurement units (PMUs) to increase the observability of the system thanks to their high sampling rate. In [24], a fuzzy inference system (FIS) has been discussed in order to estimate the loading margin (LM) in a real-time operating condition. Some voltage stability variables and indices are used as inputs to the FIS. To obtain better LM estimation, tuned adaptive neuro-FISs and subtractive clustering are used. Dynamic stability enhancement is presented in [25,26] by measuring the grid frequency over adequate periods, linking the price directly to it, and utilizing both centralized and decentralized networks. The method was configured to eliminate or limit any non-Gaussian noise. Yet, the frequency spectrum needs to be averaged over a long time.
On the other hand, AI is considered as a more viable solution for real-time evaluation of voltage stability due to (i) Fast response by reducing the calculation time; (ii) Ability to provide knowledge about the system operation; (iii) Fewer data storage and capacity requirements as only the important measurements are used; and (iv) Ability to provide stability evaluation over a vast range of scenarios simultaneously. AI based-techniques (such as artificial neural networks (ANNs) [27], extreme learning machine (ELM) [28,29], fuzzy logic (FL) [30], and deep ensemble anomaly [31]) have drawn researchers' attention as a solution for the evaluation of voltage stability near real time due to their ability to solve non-linear problems with desired speed and accuracy [27][28][29][30][31][32].
ANN algorithms have been widely used in both short- and long-term power system voltage stability assessment due to their ability to conduct computational analysis for complex non-linear mapping. Several ANN approaches (such as backpropagation [27], Radial Basis Function (RBF) [33,34], Kohonen [35], etc.) have been reported in scientific works. ANNs need to generate many operating conditions during the training and testing processes. Therefore, power system simulations are used to establish a relationship between voltage stability indicators (such as bus and line voltage stability indices and loading margin (LM)) [28] and the measurable parameters used as input variables (such as bus voltage magnitudes and angles, real and reactive power flows and injections, branch currents, etc.) [36]. The assessment of voltage stability using ANN with reduced input sets is discussed in [37,38]. They applied a methodology to eliminate redundant measurements, which minimizes the variables required to support voltage stability analysis. However, this methodology assumed that a large number of measurements were available. To reduce the high computational burden, [39] suggested installing PMUs on a few nodes to overcome economic issues and minimize data storage. Utilizing the PMU measurements, both the computational and communication burdens for large power systems have been reduced. On the other hand, ANNs still exhibit some shortcomings related to the excessive training time and large sets of data required [17][18][19].
Several researchers applied ELM [28,29] to evaluate online long-term voltage stability, and different input vectors were considered (such as active and reactive power flows and injections as well as voltage magnitudes and angles). An assembled ELM is proposed in [39] to improve the performance of power system voltage stability assessment using VSM estimation. However, the method needs extra time for training the ELM set. The authors in [40] have presented a methodology for long-term online voltage stability monitoring in power systems that exploits the feasibility of phasor-type information; the measurements were collected using PMUs, and the power system was divided into sub-areas to improve supervision.
SVM and/or its regression version, support vector regression (SVR), have been used for online voltage stability assessment, based on minimizing the structural risk and improving the statistical learning. The authors in [41] have proposed a bidirectional evaluation algorithm of VSM to be used for the large penetration of photovoltaic (PV) arrays in the SG system. A deep ensemble model was developed to collect data from the advanced metering infrastructure (AMI) to be able to classify the source of variability using simultaneous point and probabilistic predictions. An evolved technique through the VSM index, called Kernel Extreme Learning Machine (KELM), has been proposed by [28] for long-term voltage stability. The methodology, which is an amalgamation of both kernel-type AI and ELM, has decreased the training time and improved the performance.
Generally, stability issues rarely happen in power systems, and thus, the associated features are difficult to extract. However, the authors in [42] proposed a data mining technique for short-term online stability assessment by improving the ML imbalance training and detection. They implemented a discriminative subsequence classification algorithm and a forecasting-based non-linear synthetic minority oversampling to alleviate the distortion. In [43], an active learning technique is proposed to overcome the problems associated with the existing ML applications, such as prediction time, training time, and accuracy. The authors in [44] proposed Multidirectional Long Short-Term Memory (ML-STM) to predict the voltage stability of the SG network, and a comparison with several DL algorithms has been evaluated. Yet, the proposed algorithm is complex and requires a high computational process.

Proposed Predictive Model
Typically, predictive modeling is a data-driven methodology for predicting future trends/states based on the modeling of historical data. As such, the development of a data-driven identification/predictive model for SG stability is proposed in this research. The workflow diagram of the proposed predictive model development is illustrated in Figure 2. The process starts by collecting the representative dataset that forms the basis for the autonomous detection/identification system; the data records then pass through several preprocessing actions to produce them in a form adequate for machine learning models; next, the records are processed through various machine learning techniques, which are evaluated using several evaluation metrics in order to pick the optimal ML technique (SVM in our case) to model and validate the proposed problem statement; lastly, the chosen model provides the final data predictions (identification) of the SG stability status as either stable or unstable.




Figure 2. The overall framework of the proposed smart electric grid network stability identification system.

Data Collection Process
In this research, the benchmark smart grid network stability (SGN_Stab2018) simulated dataset is compiled from the UCI machine learning repository [45] to validate the proposed SVM approach. This dataset was originally collected by the Karlsruhe Institute of Technology in November 2018, using the local stability analysis of a four-node star system implementing a decentralized SG controller concept. Each item stands for the predictive attributes on a scale of [0, 10]. The dataset comprises 10,000 samples (divided into 3620 stable samples and 6380 unstable samples) with 13 input features and one output feature for binary class labeling (to identify the system status of the SG as either stable or unstable). In order to predict the stability condition of a given system, this research exploits VSM as the output feature to be predicted. A VSM value closer to 1 indicates that the system is approaching its voltage collapse point. The input features contain specific parameters that need to be specified and generated from the following functions: • Reaction time (τ) value for the energy producer and three consumers (τ 2 , τ 3 , and τ 4 ), where: τ(x) ∈ [0.5 sec, 10 sec].

•
The regularization parameter (γ) related to price elasticity (x), where: if δ ≤ 0, then the system is linearly stable; if δ > 0, then the system is linearly unstable.
The system's output is the maximum real part of the characteristic differential equation root (stab). The acquired data are cleaned of empty and misleading samples. Since the data have different scales, data normalization is essential to unify the range of the data. The best solution for a large dataset is to eliminate entire rows containing missing values [46]. The final data are divided as follows: 70% of the data is devoted to training and validation, and the remaining 30% is used for testing purposes.
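The linear stability rule above (stable when the maximum real part of the characteristic equation root is non-positive) can be sketched in a few lines; the stab values below are illustrative, not drawn from the dataset:

```python
# Minimal sketch of deriving the binary class label from the simulated
# stability outcome "stab" described above. Sample values are made up
# for illustration only.

def label_stability(stab: float) -> str:
    """Linearly stable when the maximal root real part is <= 0."""
    return "stable" if stab <= 0 else "unstable"

samples = [-0.013, 0.025, -0.002, 0.049]   # hypothetical stab outputs
labels = [label_stability(s) for s in samples]
```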

Data Engineering Actions
The measurements of the SG systems are very redundant, and the number of variables is considerably high. Thus, restricting the input space to a small subset of the available input variables has explicit economic benefits in terms of computational requirements, cost, and data storage of future data collection. Furthermore, lowering the number of input variables yields more understanding of the model; i.e., the optimum dataset variables are considered to be the set that has a smaller number of input variables, with no uninformative variables and a minimum degree of redundancy, used to characterize the output behavior of the system. Typically, data engineering or data preprocessing is a key module in every machine/deep learning-based system. In this module, data records pass through a number of preprocessing actions to prepare the dataset samples for the learning process [46]. In this research, our target dataset has undergone the following processes:

1.
Transformation process: converting the data from comma-separated values into a double matrix of vectored dataset instances each with 14 columns (14 × 10,000).

2.
Class labeling process: representing the categorical class feature into a binary label (0 : stable and 1 : unstable).

3.
Dataset randomizing or shuffling process: re-allocating the instances into the dataset to help the training process to converge fast and preclude any bias throughout the training process.

4.
Splitting up: dividing the dataset into two datasets with random indices where 70% of the data records are used for the training process and the remaining 30% of the data records are used for the testing process.
To ensure a confident validation (testing) process, we have conducted five-fold cross-validation [47], which incorporated five distinct experiments for each machine learning model, with different subsets designated for the training and validation processes in each experiment after data shuffling.
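The shuffling, 70/30 split, and five-fold cross-validation indexing described above can be sketched as follows; the fold construction is an assumption for illustration (the paper does not specify its exact partitioning tool), and only index bookkeeping is shown:

```python
# Hedged sketch of shuffle -> 70/30 split -> 5-fold CV index generation,
# using the standard library only; no dataset is required.
import random

def split_indices(n, train_frac=0.7, seed=42):
    idx = list(range(n))
    random.Random(seed).shuffle(idx)      # shuffling step
    cut = int(n * train_frac)
    return idx[:cut], idx[cut:]           # (training+validation, testing)

def five_folds(train_idx, k=5):
    # Partition the training indices into k disjoint validation folds.
    return [train_idx[i::k] for i in range(k)]

train, test = split_indices(10_000)       # 10,000 samples as in the dataset
folds = five_folds(train)
```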

Applied Machine Learning Techniques
In this stage, we applied the preprocessed data to different machine learning techniques in order to investigate their performance metrics and contrast them accordingly. In this research, our preprocessed dataset has been applied using the following machine learning techniques: • Support Vector Machine (SVM): Supervised learning model, usually used as a linear binary classifier [48] to address classification and regression problems. SVM can be used to solve linear and non-linear problems for various real-life applications. The basic idea of SVM classification is to create a line or a hyperplane that can separate the data into two classes, which will be discussed thoroughly in the next section.
As a result, our SVM-based model has been configured with a linear SVM as the kernel function, in which the kernel scale is set within 0.001–1000 and the box constraint level is 9946 (should be constrained between 0.001 and 1000, and the larger the better), with a Bayesian optimizer over a standardized dataset. Finally, the total misclassification cost is 59 samples.
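At prediction time, a linear SVM reduces to a sign test on a weighted sum of the input features; the weight vector and bias below are hypothetical, not the values learned on SGN_Stab2018:

```python
# Minimal sketch of a linear SVM decision rule, assuming an already
# trained weight vector w and bias b (hand-picked here for illustration).

def svm_predict(x, w, b):
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else 0   # 1: unstable, 0: stable (assumed coding)

w, b = [0.8, -0.4, 0.3], -0.1       # hypothetical learned parameters
pred = svm_predict([1.0, 0.5, 0.2], w, b)
```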
•
Logistic Regression Classifier (LRC): Supervised learning mechanism for binary classification in which the weighted input features are accumulated, and the accumulated sums are then passed through the sigmoid function (σ(∑(F i × W i ) + b)) to generate the binary class outputs. In this paper, we have employed the logit function metric as a cost function in the logistic regression classifier to evaluate classification probability. A typical logistic probability will never go below 0 or above 1, and the logistic cost function J(θ) can be calculated over the total number of samples (m), with predicted probability (p) and true label (y) for each sample (x), as:

J(θ) = −(1/m) ∑ [y log(p(x)) + (1 − y) log(1 − p(x))]

As a result, our logistic regression-based model has been configured with the logit function as a loss/cost criterion and the sigmoid function as the output classification, and the total number of misclassifications is 101 samples with all features used in the model.
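The logit cost used by the logistic regression classifier can be sketched directly from its definition; the raw scores fed to the sigmoid are illustrative:

```python
# Sketch of the logistic (logit) cost: J = -(1/m) * sum over samples of
# y*log(p) + (1 - y)*log(1 - p), with p produced by the sigmoid.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logistic_cost(probs, labels):
    m = len(labels)
    return -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                for p, y in zip(probs, labels)) / m

probs = [sigmoid(2.0), sigmoid(-1.5)]   # hypothetical model outputs in (0, 1)
cost = logistic_cost(probs, [1, 0])
```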

•
Decision Tree Classifier (DTC): Supervised learning mechanism that can be used for predictive models (classification/regression) having a tree-like structure [50]. The decision tree deals with the features of the dataset and builds the learning model by splitting the dataset based on its features, where the best feature of the dataset is placed at the root node. This procedure is continually performed until all the features of the dataset are split, reaching the leaf node at each branch. Thereafter, a decision tree uses estimates and probabilities to calculate likely outcomes. In this paper, since we are dealing with a binary classification problem (target variable "Stable" or "Unstable"), we have employed the Gini index metric as a cost function in the decision tree to evaluate splits in the dataset. The Gini index for each node of the tree, with class probability (P i) summed over all classes (C), is given as:

Gini = 1 − ∑ (P i)²

As a result, our decision tree-based model has been configured with the Gini index as the split criterion, and the maximum number of splits is 100 with no surrogate decision splits. Finally, the total misclassification cost is 82 samples.
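The Gini impurity split criterion described above can be sketched from per-node class counts:

```python
# Sketch of the Gini impurity for a tree node: Gini = 1 - sum(p_i^2)
# over the class probabilities at that node.

def gini(counts):
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

# A pure node has impurity 0; a balanced binary node has impurity 0.5.
pure = gini([10, 0])
balanced = gini([5, 5])
```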

•
Naïve Bayes Classifier (NBC): Supervised learning mechanism used for constructing classifiers based on the Bayes theorem [51]. Naïve Bayes is a conditional probability approach used to predict the likelihood that an event will occur, given evidence defined in the data features. Naïve Bayes is used for probabilistic classification and regression purposes. In this paper, we employed the Gaussian naïve Bayes for numeric predictors algorithm to provide two-class classification. Given a data instance X with an unknown class label, and H being the hypothesis that X belongs to a specific class C, the conditional probability of hypothesis H given observation X is denoted:

P(H|X) = (P(X|H) × P(H)) / P(X)

As a result, our naïve Bayes-based model has been configured with the multivariate multinomial (mvmn) distribution as a binary categorical predictor, and the total number of misclassifications is 211 with PCA (principal component analysis) configured over the features.
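The Bayes-rule posterior underlying the classifier can be sketched numerically; the likelihood values are hypothetical, while the prior 0.362 mirrors the stable-class share of the dataset (3620 of 10,000 samples):

```python
# Sketch of Bayes' rule: P(H|X) = P(X|H) * P(H) / P(X),
# with P(X) expanded by total probability over the two classes.

def posterior(likelihood, prior, evidence):
    return likelihood * prior / evidence

p_x_given_stable, p_x_given_unstable = 0.8, 0.3   # hypothetical likelihoods
prior_stable, prior_unstable = 0.362, 0.638       # dataset class shares

evidence = (p_x_given_stable * prior_stable
            + p_x_given_unstable * prior_unstable)
p_stable = posterior(p_x_given_stable, prior_stable, evidence)
```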

•
Linear Discriminant Classifier (LDC): A dimensionality reduction method that is generally employed to address supervised classification tasks [52] by separating the data labels into two or more classes. The basic idea of LDC is to project the features in a higher-dimensional space onto a lower-dimensional space. For example, given two classes that need to be separated efficiently, LDC starts the classification using only a single feature and then keeps increasing the number of features until proper classification is achieved. In this paper, since we are dealing with a binary classification problem (target variable "Stable" or "Unstable"), we have employed Fisher's Linear Discriminant (FLD) with two-dimensional input vector projection to classify between the two classes of the target feature. Given that N1 and N2 denote the number of samples in classes C1 (Stable) and C2 (Unstable), respectively, we find the projection J(W) using FLD, which learns a weight vector W, where m 1 and m 2 are the mean vectors for the two classes and S 1 and S 2 are the variance vectors for the two classes, using the following criterion:

J(W) = (W^T (m 1 − m 2))² / (W^T (S 1 + S 2) W)

As a result, our linear discriminant-based model has been configured with a full covariance structure as a binary categorical predictor, and the total number of misclassifications is 312 with PCA (principal component analysis) configured over the features.
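Fisher's criterion can be sketched in its one-dimensional projected form, a scalar simplification of the vector criterion above:

```python
# Sketch of Fisher's criterion after projection onto one dimension:
# J = (m1 - m2)^2 / (s1^2 + s2^2) -- between-class separation over
# within-class scatter. Maximizing J gives the best separating direction.

def fisher_criterion(m1, m2, s1_sq, s2_sq):
    return (m1 - m2) ** 2 / (s1_sq + s2_sq)

j_near = fisher_criterion(0.0, 2.0, 1.0, 1.0)   # close class means
j_far = fisher_criterion(0.0, 4.0, 1.0, 1.0)    # well-separated means
```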
• k-Nearest Neighbor (kNN): [53] A supervised machine learning classifier that memorizes the labeled observations within the training dataset to predict classifications for new unlabeled observations. kNN is among the simplest and most easy-to-implement classifiers, and it makes its predictions based on similarity. Similarity comparisons can be based on any quantitative attribute such as weather, distance, age, income, and weight (the simplest and most common comparative attribute is distance). In this paper, we have employed k-nearest neighbor with k set to one neighbor and the Euclidean function as the distance metric to evaluate the degree of neighborhood for every sample in the dataset. The Euclidean distance between two samples x and y, each with n features, is given as:

d(x, y) = √( ∑ (x i − y i)² )

As a result, our kNN-based model has been configured with the Euclidean index as the distance criterion with the distance weight set to true, and the total number of misclassifications is 1008 samples with a standardized dataset.
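The 1-nearest-neighbor rule with Euclidean distance, as configured above, can be sketched as follows (the two training points and labels are illustrative):

```python
# Sketch of 1-NN classification with Euclidean distance (k = 1).
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn1_predict(query, train_points, train_labels):
    # Assign the label of the single closest training point.
    dists = [euclidean(query, p) for p in train_points]
    return train_labels[dists.index(min(dists))]

pts = [[0.0, 0.0], [5.0, 5.0]]      # hypothetical training samples
labs = ["stable", "unstable"]
```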

•
Ensemble Boosted Classifier (EBC): Supervised learning method that employs multiple learners (weak learners/models) to resolve the identified classification problem and then aggregate their outcomes to produce the final output [54]. Aggregation can be done using a boosting mechanism that creates a strong classifier by combining the final result from a number of sequential homogeneous weak learners using a deterministic aggregation approach. In this paper, we have employed the RUSBoost mechanism as an ensemble method and the decision tree as a classification learner to evaluate splits in the dataset.
As a result, our RUSBoost ensemble model has been configured with the Gini index as the split criterion; the maximum number of splits is 20, using 30 learners with a learning rate of 0.1 and no surrogate decision splits. Finally, the total misclassification cost is 2896 samples.

Evaluation Metrics
To pick the best predictive model that can be used to identify the stability state of the electrical smart grid defined by the target dataset, we need to evaluate the proficiency of the machine learning models employed in this research. One should first investigate the binary confusion matrix [55] to obtain the values for true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). In addition, we have used the following key performance indicators [56]:

•
Model Accuracy (ACC) measures the ability of the system to provide correct sample classification with respect to the whole number of samples and is given as:

ACC = (TP + TN) / (TP + TN + FP + FN)

•
Positive Predictive Value (PPV) measures the ability of the system to provide correct sample classification with respect to the positive number of samples and is given as:

PPV = TP / (TP + FP)

•
True Positive Rate (TPR) measures the ability of the system to provide correct sample classification with respect to the number of samples that should be retrieved and is given as:

TPR = TP / (TP + FN)

•
Harmonic Mean Score (HMS) is a weighted score for the relation between Positive Predictive Value (PPV) and True Positive Rate (TPR) and is given as:

HMS = (2 × PPV × TPR) / (PPV + TPR)

•
False Alarm Rate (FAR) measures the proportion that the system provides incorrect sample classification with respect to the whole number of samples and is given as:

FAR = (FP + FN) / (TP + TN + FP + FN) = 1 − ACC

•
False Discovery Rate (FDR) measures the proportion that the system provides incorrect sample classification with respect to the positive number of samples and is given as:

FDR = FP / (TP + FP) = 1 − PPV

•
False Negative Rate (FNR) measures the proportion that the system provides incorrect sample classification with respect to the number of samples that should be retrieved and is given as:

FNR = FN / (FN + TP) = 1 − TPR

•
Area Under Curve (AUC) measures the ability of the system to rank a randomly selected positive sample higher than a randomly selected negative sample and is given as the area under the receiver operating characteristic (ROC) curve:

AUC = ∫ TPR d(FPR)

•
Identification Speed (IDS) measures the number of samples that the system can process within the time unit and is given as:

IDS = Number of tested samples / Total identification time

•
Identification Delay (IDD) measures the time required by the system to provide a single sample prediction (in µSec) and is given as:

IDD = 1 / IDS
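The count-based indicators above can be computed directly from the four confusion matrix cells, as in the following sketch; the counts are illustrative, not the paper's reported results:

```python
# Sketch computing the listed indicators from a binary confusion matrix.
# The sample counts below are hypothetical.

def metrics(tp, tn, fp, fn):
    total = tp + tn + fp + fn
    acc = (tp + tn) / total
    ppv = tp / (tp + fp)
    tpr = tp / (tp + fn)
    hms = 2 * ppv * tpr / (ppv + tpr)   # harmonic mean of PPV and TPR
    far = (fp + fn) / total             # 1 - ACC
    fdr = fp / (tp + fp)                # 1 - PPV
    fnr = fn / (tp + fn)                # 1 - TPR
    return acc, ppv, tpr, hms, far, fdr, fnr

acc, ppv, tpr, hms, far, fdr, fnr = metrics(6350, 3591, 29, 30)
```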


According to Figure 3, in the beginning, the bootstrapping technique [59] is used to create a number of SVMs (1, 2, . . . , N), which are assigned dynamically by the automatic classification learner tool. The training subsets are created by repeatedly resampling, with replacement, from the original training dataset. Every SVM is trained distinctly on its subset, and once the training process is completed, the trained SVMs are aggregated using a suitable combination approach, such as the following binary sign ensemble aggregation function D(x) used in our binary classification system:

D(x) = sign( Σ_{i=1}^{N} β_i y_i K(x, x_i) + b )

where:
β_i y_i: the weighted linear outputs from every SVM (i = 1, 2, . . . , N).
b: the classifier bias; by default, it is set to b = 0.
K(x, x_i): a kernel function applied by each SVM to map the non-linear models into a higher-dimensional space (via "kernels") before finding the optimal hyperplane that separates the classes.

Typically, the Radial Basis Function (RBF) kernel [60] is commonly used; it is also used in this paper and is defined as follows:

K(x, x_i) = exp( −||x − x_i||² / (2σ²) )

where σ is the kernel width (scale) parameter.
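The bootstrapped ensemble of RBF-kernel SVMs described above can be sketched as follows. This is an illustrative Python/scikit-learn approximation, not the authors' classification learner tool setup; the synthetic dataset, the number of SVMs (N = 5), and the kernel parameters are all assumptions:

```python
# Sketch: bootstrap-aggregated RBF SVMs combined by the sign of the summed
# decision values, mirroring the ensemble aggregation function D(x).
import numpy as np
from sklearn.svm import SVC
from sklearn.datasets import make_classification

# Assumed stand-in dataset (the paper uses SGN_Stab2018).
X, y = make_classification(n_samples=600, n_features=12, random_state=0)
rng = np.random.default_rng(0)

svms = []
for _ in range(5):                            # N bootstrapped SVMs
    idx = rng.integers(0, len(X), len(X))     # resample with replacement
    svm = SVC(kernel="rbf", gamma="scale")    # RBF kernel K(x, x_i)
    svm.fit(X[idx], y[idx])
    svms.append(svm)

def ensemble_predict(samples):
    # D(x) = sign of the summed signed margins from all member SVMs
    score = sum(svm.decision_function(samples) for svm in svms)
    return (score > 0).astype(int)

pred = ensemble_predict(X)
print((pred == y).mean())                     # training-set accuracy of the bag
```

Resampling with replacement makes each member see a slightly different training distribution, which is what gives the aggregated sign vote its variance-reduction benefit.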

System Evaluation and Results
In this section, we provide extensive simulation results obtained from the evaluation of the seven above-mentioned machine learning models using the prescribed evaluation metrics. Table 1 contrasts the performance of the examined supervised machine learning models (SVM, DTC, LRC, NBC, LDC, kNN, and EBC) in terms of the ACC, FAR, PPV, FDR, TPR, FNR, HMS, and IDS metrics. To gain more insight into these results, Figure 4A visualizes the comparison of the predictive models in terms of four quality indication metrics (ACC, PPV, TPR, and HMS), and Figure 4B compares them in terms of three error analysis metrics (FAR, FDR, and FNR). Figure 4 thus contrasts the seven examined machine learning techniques across the quality indication and error analysis metrics for the proposed stability identification model of the smart electric grid network. According to the figure, the best performance metrics are registered for the SVM model, with the highest quality indicators of 99.93%, 99.89%, 99.92%, and 98.62% for ACC, PPV, TPR, and HMS, respectively, and the lowest error indicators of 0.07%, 0.11%, and 0.08% for FAR, FDR, and FNR, respectively. Conversely, the lowest performance metrics are registered for the EBC model, with the lowest quality indicators of 71.04%, 68.80%, 60.00%, and 63.75% for ACC, PPV, TPR, and HMS, respectively, and the highest error indicators of 28.96%, 31.20%, and 40.00% for FAR, FDR, and FNR, respectively. In addition, other noticeable and comparable classifiers/alternatives are the DTC and LRC, which recorded very high accuracy measures of 99.18% and 98.99%, respectively. However, the DTC performed slower, with a higher prediction delay than the LRC (i.e., 11.90 µSec vs. 7.14 µSec for DTC and LRC, respectively).
Moreover, another important feature that distinguishes our optimized SVM model is its prediction time/speed: it is the fastest classifier, scoring the minimum prediction delay of only 4.17 µSec per sample prediction. In contrast, the slowest classifier was the kNN model, scoring the maximum prediction delay of 83.33 µSec per sample prediction. As demonstrated by the preceding table, figures, and discussion, the optimizable SVM classifier has been selected as the best classifier to provide the final data predictions (identification) of the smart electric grid network stability status as either stable or unstable. Therefore, the following figures, discussion, and analysis focus on the SVM model. Hence, Figure 5 illustrates the performance evaluation of the developed optimizable SVM classifier using a minimum classification error plot over 30 iterations of the optimization process. The plot tracks the trajectories of the estimated minimum classification error and the observed minimum classification error. As can be seen, both error trajectories follow a decreasing tendency with increasing iterations before saturating after roughly 17 iterations of the optimization process, where the best-point hyperparameters achieve a minimum classification error ≤ 1 × 10⁻³. Such very low minimum classification error values allowed the system to record the highest prediction accuracy on the target dataset.
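The identification delay (IDD) and identification speed (IDS) figures quoted above can be measured for any fitted classifier along the following lines; the classifier and dataset below are illustrative stand-ins, not the paper's setup:

```python
# Sketch: measuring per-sample identification delay (IDD, in µSec) and
# identification speed (IDS, in samples/sec) by timing a batch prediction.
import time
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_classification

# Assumed stand-in dataset and model (not the SGN_Stab2018 experiment).
X, y = make_classification(n_samples=2000, n_features=12, random_state=1)
clf = DecisionTreeClassifier(random_state=1).fit(X, y)

start = time.perf_counter()
clf.predict(X)                                  # one timed batch prediction
elapsed_sec = time.perf_counter() - start

idd_usec = elapsed_sec * 1e6 / len(X)           # µSec per sample (IDD)
ids = len(X) / elapsed_sec                      # samples per second (IDS)
print(idd_usec, ids)
```

Timing a large batch and dividing by the sample count smooths out timer resolution; single-sample timings at the microsecond scale are dominated by call overhead.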
Furthermore, Figure 6 presents the confusion matrix analysis for the optimizable SVM classifier, covering the per-class classification counts (e.g., TN = 6376), the TPR vs. FNR analysis for each individual class (both classes scored TPR = 99.9% and FNR = 0.1%), the PPV vs. FDR analysis for each individual class (both classes scored PPV = 99.9% and FDR = 0.1%), and the general two-class confusion matrix parameters employed to measure the quality and error indicators mentioned earlier. According to the figure, the confusion matrix outcomes clearly indicate the high quality and optimality of the voltage stability prediction process of our optimizable SVM model, which is the main reason it records the best performance evaluation metrics. Furthermore, Figure 7 visualizes the area under curve (AUC) plots for each class (stable and unstable). AUC curves investigate the association between the true-positive and false-positive rates (TPR vs. FPR) at different prediction thresholds [61]. The AUC quantifies the system's ability to rank a randomly selected positive sample higher than a randomly selected negative sample, expressed as an area on axes spanning (0,0) to (1,1). Accordingly, the AUC plots in Figure 7 show that both classes exhibit perfect ranking, recording AUC values of 100%.
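The per-class AUC analysis can be reproduced, under assumed data and model settings, roughly as follows; note that for a binary problem the two per-class AUC values necessarily coincide:

```python
# Sketch: per-class ROC AUC for a two-class stability model.
# Dataset, split, and SVM settings are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=800, n_features=12, random_state=2)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=2)
svm = SVC(kernel="rbf").fit(Xtr, ytr)

scores = svm.decision_function(Xte)            # signed margins
auc_pos = roc_auc_score(yte, scores)           # AUC taking class 1 as positive
auc_neg = roc_auc_score(1 - yte, -scores)      # AUC taking class 0 as positive
print(auc_pos, auc_neg)
```

Flipping the labels and negating the scores leaves the ranking unchanged, which is why both per-class curves in a binary problem report the same area.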


Discussion and Evaluation
The stability of the SG is decidedly influenced by objective conditions such as transmission line aging, power generation limitations, dynamic behavior and intermittent loads, volatile behavior of RES, etc. An efficient identification system is proposed to predict the smart grid stability margin based on pre-established computational features and is implemented, analyzed, and assessed in this paper. The following inferences can be emphasized from the results:

1.
The voltage instability phenomenon is considered the main threat to stability, security, and reliability in modern power systems. Due to load changes and the sudden occurrence of contingencies, off-line voltage stability monitoring can no longer ensure secure operation of the power system. Hence, fast and efficient methods to assess power system voltage stability are of great importance to experts and industries in order to avoid the risk of large blackouts.

2.
Focusing on point prediction and interval forecasting is extremely important to weaken the uncertainty and support grid stability in the SG paradigm. Therefore, this paper developed an efficient computing framework to solve the smart grid stability prediction problem. An electrical grid stability simulated dataset was considered for the validation of the proposed approach.

3.
To the best of our knowledge, this is the first research work to address stability status prediction for the electric smart grid network using conventional machine learning techniques; nevertheless, we can still compare it with other state-of-the-art techniques that employ the same dataset using deep learning [62]. Therefore, Table 2 contrasts all applicable evaluation metrics with the results obtained for our optimizable SVM model. Based on the information provided in the table, the proposed predictive model is comparable and even superior in several evaluation metrics, despite being less complex and having a lower prediction overhead than the other deep learning models in the table. In addition, we provide an overall metric in the last column (overall score) that averages the metric values associated with each model to yield a single score representing the overall quality of the predictive model. Although all models in the table record a high overall score, the proposed predictive model records the highest overall score (i.e., 99.93%), achieving a 1.02–3.67% increase in the overall metric over the other models.

4.
For applications with higher dimensional datasets, future work could further improve the proposed framework's performance by combining the proposed technique with big data frameworks to improve the prediction model's operational efficiency over longer prediction horizons. Furthermore, this method can be applied to other smart grid applications, such as power and load forecasting, to increase energy efficiency.
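As a hypothetical illustration of the overall-score computation described in point 3 above (the metric values below are placeholders, not the paper's exact Table 2 entries):

```python
# Hypothetical example: the "overall score" is the plain average of the
# evaluation metrics reported for one model (percentage values, assumed).
metrics = {"ACC": 99.93, "PPV": 99.89, "TPR": 99.92, "HMS": 99.90, "AUC": 100.0}
overall = sum(metrics.values()) / len(metrics)
print(overall)
```

Averaging equally weighted percentage metrics keeps the summary interpretable, though it implicitly treats every metric as equally important.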

Conclusions and Remarks
A novel, efficient identification system to predict the stability of smart grid networks based on pre-established computational features is implemented, analyzed, and assessed in this paper. The proposed system provides early warnings/alarms of smart grid system faults that can minimize or avoid the instability impacts at the physical implementation phases of the smart grid system. The developed model employs seven machine learning techniques to classify smart electric grid network stability as either stable or unstable, namely, Optimizable Support Vector Machine (SVM), Decision Trees Classifier (DTC), Logistic Regression Classifier (LRC), Naïve Bayes Classifier (NBC), Linear Discriminant Classifier (LDC), k-Nearest Neighbor (kNN), and Ensemble Boosted Classifier (EBC). The developed ML models have been assessed on a contemporary and inclusive smart grid network stability dataset (SGN_Stab2018) in terms of several performance indicators, including the binary confusion matrix (BCM), identification accuracy (ACC), positive predictive value (PPV), true positive rate (TPR), harmonic mean score (HMS), false alarm rate (FAR), false discovery rate (FDR), false negative rate (FNR), area under curve (AUC), identification speed (IDS), and identification delay (IDD). Accordingly, the seven machine learning models have been contrasted in terms of the identified performance indicators to exploit the maximum system performance. Ultimately, the comparison outcomes have revealed the competency and superiority of the optimized model over the other available models. Our best-obtained performance outcomes have surpassed those of the existing smart grid network stability predictive models.