Smart Grid Stability Prediction Model Using Neural Networks to Handle Missing Inputs

A smart grid is a modern electricity system that enables a bidirectional flow of communication and works on the notion of demand response. Predicting the stability of the smart grid is necessary to make it more reliable and to improve the efficiency and consistency of the electrical supply. Missing input data often occur due to sensor or system failures. It is worth noting that no previous work has been conducted to predict the missing input variables. Thus, this paper aims to develop an enhanced forecasting model that predicts smart grid stability using neural networks to handle the missing data. Four case studies with missing input data are conducted. The missing data are predicted for each case, and a model is then prepared to predict the stability. The Levenberg–Marquardt algorithm is used to train all the models, and the tansig and purelin transfer functions are used in the hidden and output layers, respectively. The model's performance is evaluated on a four-node star network and is measured in terms of the MSE and $R^2$ values. The four stability prediction models demonstrate good training and prediction performance.


Introduction
The conventional power grid contains standard power generation units based on fossil fuels. With soaring energy prices, the need for renewable energy sources and climate change, the old power grid is becoming outdated and faces various limitations, such as cybersecurity, privacy and power losses due to one-way communication [1]. This pushes for deploying renewable energy sources to improve sustainability and reliability. A smart grid is a solution to this. The smart grid system is a digital future electricity system that enables a two-way flow of communication, i.e., from the center to the device and from the device to the center [2].
This bidirectional communication utilizes advanced computing infrastructure, digital sensing and software capabilities to optimize all the grid components and improve reliability and sustainability. In a traditional grid, there is a unidirectional flow of energy from the energy provider to the consumer, and consumers are charged based on their consumption. However, in a smart grid system, the users in the grid can consume, produce, store and trade energy with other users [3]. The smart grid introduces demand response: the price information is determined as the demand is evaluated against supply and is conveyed to the customer. This paper uses the Decentral Smart Grid Control (DSGC) model to define and relate the price to the grid frequency [2,4]. The mathematical model based on the DSGC differential equations seeks to find grid instability for a four-node star architecture [5]. The four-node star architecture consists of a central generation node (the power source) and three consumer nodes. The response time of the smart grid users is considered when adjusting the consumption/production in response to price changes.
The model involves real-time pricing, and thus the grid stability has to be maintained under fluctuations in the reaction times and electricity prices of all users. Evaluating grid stability is critical because the process is dynamic and time-critical. Smart grid stability prediction helps increase efficiency through grid optimization, improves the reliability and consistency of the electrical supply and supports the analysis of disturbances and fluctuations in energy consumption or production.
Before the utilization of modern techniques to predict smart grid stability, traditional approaches consisted of simulations combining fixed values for one subset and fixed distribution of values for the remaining subset variables [6,7]. The generation of electricity by photovoltaic power is related to the global horizontal irradiance. For the unknown cloud statistics, the irradiance is uncertain for predicting the stability in power generation, causing optical instability in the solar irradiance [4]. Measurement-based methods are another complex and challenging traditional method used to predict power grid stability [8].
Various statistical approaches have been investigated, including autoregressive moving average, Kalman filter and Markov chain model [9], which contribute to insufficient reliability of the grid [10]. Other types of early statistical methods [11] for load forecasting in smart grids have various drawbacks and affect the accuracy of the prediction model. These are built by ineffective, simple regression functions, and thereby do not yield good performance in vast uncertainties [12]. Further, traditional approaches, such as time series analysis, ARMA, ARIMA and Markov models for stability forecasting, exist only in specific operating ranges [13,14].
Additionally, some research involved using conventional parametric methods that include linear regression, auto-regressive moving average and the general exponential methods. Although such models return satisfactory prediction accuracy, they persist with major disadvantages, such as improper response and complex computational problems to meteorological variables and nonlinear electrical load [15]. A probabilistic model was introduced in [16] for power stability.
However, some uncertainties have been observed between regular grid operation and cascading failure operation in the simulation result. In addition, the techniques used for stability assessment require extensive computation time and a massive volume of data analysis, which makes it difficult to obtain a reliable prediction and to make decisions for an operating power system [17].
A few hybrid systems used for dynamic stability prediction have been based on unreliable self-organized maps and responded slowly [6,18]. Another method introduced a situational awareness for stability prediction, a perception of elements for a given time and space in the environment [5]. It was proven that optimized deep-learning models are one of the excellent prediction tools for smart grid stability. Using neural networks for stability prediction has various advantages.
They have multiple training algorithms, do not require significant dataset pre-processing and can produce high accuracy values during training and testing [4]. Further, they can recognize different sets within a whole dataset and give adequate results even when the dataset is incomplete or inaccurate [19]. Finally, the ability to implicitly detect complex nonlinear relationships between independent and dependent variables makes it viable for stability prediction [20].
The comprehensive review work in [21] concluded that most of the works on prediction models using machine learning reported little or no information on the presence and handling of missing data. The missing data are omitted in most models, which is ineffective and affects their performance. Missing data result from many causes, such as sensor failure, equipment malfunctions, lost files, etc. This increases the cost and challenges the prediction ability of the proposed models. Thus, there is a need for significant research in handling missing data. Predicting the missing data with neural networks or machine-learning models is more efficient than simply omitting the data or resorting to mean values.
Motivated by the literature, this paper proposes a novel method to predict the smart grid stability of a four-node star network using a neural network, both with complete input data and with missing input variables. The significant contributions of this paper are highlighted as follows:
• The classic FFNN is designed to predict the stability of the smart grid system of a four-node star network with complete input data.
• Sub-neural networks are proposed to predict the missing input variables, which arise from sensor, network connection or other system failures. The system's stability is then forecast using these predicted missing input data.
• The performance of the proposed approach is evaluated in four different case studies in which at least one input variable is missing.
The subsequent sections of the paper are organized as follows: Section 2 presents the comprehensive literature review on smart grid stability prediction using neural networks. Section 3 describes the mathematical modeling and data description of the four-node star network used for the smart grid stability prediction. Section 4 shows the development and performance of the FFNN with complete input data, and Section 5 describes the development and performance evaluation of the FFNN to handle the missing inputs. Finally, Section 6 highlights the conclusions of the proposed work.

Literature Review
In this section, an extensive literature survey is conducted on smart grid stability using neural networks. This literature review shows that various neural-network-based techniques have been used for analysis and that the works analyze complete input data. The developed approaches are robust and accurate due to their complex structure, which helps classify problems and recognize correlations in raw data and hidden patterns.
A summary of works focused on smart grid stability prediction using various neural networks is highlighted in Table 1. The table contains 55 papers published in the last decade, categorized by publication year, smart grid architecture, neural network type, neural network architecture, activation functions, training algorithms, performance measures and comparison techniques considered for each study. From the research works in Table 1, the year-wise and the publisher-wise contributions to the smart grid's stability prediction during the last decade are shown in Figure 1. Figure 2 depicts the smart grid architectures identified in the literature survey conducted.
The most popular architectures are IEEE bus systems [6,16,18,22,23] and node network types [4,8,24]. Therefore, in this paper, a four-node star network was selected to perform the proposed research.
In the analysis, several types of neural networks and hybrid networks were identified, as depicted in Figure 3. Among the most popular neural networks identified is the FFNN, which includes hybridized versions, such as FF-BPNN [25] and FF-DNN [26]. In addition, CNN is another widely used network, with its enhanced and hybrid versions, namely ECNN [27] and CNN-RNN [28]. The hybrid versions of LSTM, including LSTM-RNN [29,30] and LSTM-CNN [31], can also be seen in this literature.
In addition, the performance of DNN for stability prediction was improved by hybridizing with RNN, RL [32], CNN and IRBDNN [33]. On the other hand, optimization algorithms, such as SSA, have also been used with RBFNN to obtain the network's optimal weights [7]. The hybrid versions of GRU models, such as BiGRU [8] and GRU-RNN [9], have also been used for node networks' stability prediction. The sub-classification of all these neural-network-based models is also illustrated in Figure 3.
Figure 4 summarizes the various training algorithms and activation functions used in the research work reported in Table 1. The figure shows that the LM algorithm is the most commonly used training algorithm, followed by the Adam optimization algorithm [29,32]. It further shows that sigmoid, ReLU, tansig and tanh are the most frequently used hidden layer activation functions [30,34]. In contrast, the purelin activation function, followed by sigmoid, is most commonly used in the output layer of the neural network [34–36].
The significant findings from the literature review on smart grid stability prediction using neural networks are highlighted as follows:
• No work was conducted to predict stability when there is a missing parameter. Most studies showed that missing data had been either omitted, unreported or replaced with mean/median values.
• The most popular architectures used for the case studies are IEEE bus systems and node network types (see Figure 2).
• Among the several types of conventional and hybrid neural networks proposed in the literature, the FFNN and its hybrid versions, such as FF-BPNN and FF-DNN, are widely presented (see Figure 3 and Table 1).
• The Levenberg-Marquardt algorithm is the most frequently used training algorithm for various networks to predict smart grid stability (see Figure 4).
• The tansig and purelin activation functions have frequently been used in various networks' hidden and output layers to predict smart grid stability (see Figure 4).
Motivated by the above research gaps, this paper develops a forecasting model that handles the missing input data. For the proposed neural-network-based forecasting model, the LM training algorithm was selected as it is one of the fastest backpropagation algorithms and is widely recommended in the literature. The literature indicates that effective training requires a combination of nonlinear and linear activation functions. Thus, the tansig and purelin activation functions are utilized in the hidden and output layers, respectively.
Further, in our previous work reported in [37], an effort was made to compare the performance of FFNN, cascade and recurrent neural-network-based models for smart grid stability prediction. The work concludes that, for the considered application, the FFNN demonstrated superior performance in terms of the MSE and $R^2$ values compared to cascade and recurrent neural networks. In addition, over the years, researchers have proposed different methodologies and theories for selecting the number of hidden layers and the number of hidden neurons in each hidden layer. The work in [38] concluded that a network with only one hidden layer but sufficient neurons can achieve good performance.
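For reference, the weight update that the LM algorithm applies at each iteration can be written as below. This is the standard textbook form rather than an equation taken from this paper, and the symbols $\mathbf{w}$, $\mathbf{J}$, $\mathbf{e}$ and $\mu$ are introduced here only for illustration.

```latex
\Delta \mathbf{w} = -\left(\mathbf{J}^{\top}\mathbf{J} + \mu\,\mathbf{I}\right)^{-1}\mathbf{J}^{\top}\mathbf{e}
```

Here $\mathbf{e}$ is the vector of prediction errors, $\mathbf{J}$ is the Jacobian of $\mathbf{e}$ with respect to the network weights $\mathbf{w}$ and $\mu > 0$ is a damping factor. A small $\mu$ makes the step approach a Gauss–Newton step, whereas a large $\mu$ makes it approach a short gradient-descent step, which is the usual explanation for the fast convergence of the method on small networks.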
Moreover, this performance can be further improved by adding additional hidden layers. However, the variation in this performance with a multilayer network is minimal. The work reported in [39] concluded the same, stating that a multilayer network has achieved better performance but increased the complexity of the network. Therefore, for the considered application, the FFNN with a single hidden layer was used for all the cases, which improved the performance in predicting smart grid stability.

Mathematical Modeling and Data Description of Four-Node Star Network
In Section 3.1, the mathematical modeling of the four-node star architecture network is developed based on the equations of motion and binding the electricity price to the grid frequency. Then, the description of the generated dataset from the final dynamic equation of DSGC and the correlation analysis between the network parameters are provided.

Mathematical Modeling and Stability Analysis of Four-Node Star Network
In this section, the mathematical modeling of the four-node star network and the stability analysis are conducted. In a star network topology, the central node (center of the "star") communicates directly with the consumer nodes. The consumer nodes are connected to the central (generation) node, enabling bidirectional communication between each node, which helps them to operate at lower power levels. One of the main advantages of the star topology is that the consumer nodes are independent. In case of failure or errors in one of the consumer nodes, the other consumer nodes are not affected, and the network operates normally.
The network is formed with one power producer in the center (i.e., generation node) and three consumers (i.e., consumer node). Star topologies depend heavily on delay and averaging time. Intermediate delays in a four-node star topology benefit stability, making it a simple, effective and efficient system [4]. From the literature survey conducted, we observed that the star and bus topologies were popular. A conclusion was drawn that star networks used in previous works having similar objectives showed good performance and can achieve the mathematical modeling for the DSGC system. Thus, the four-node star topology was chosen for this work, and the mathematical model of the DSGC system was obtained for the four-node star architecture network given in Figure 5.

Mathematical Modeling
The mathematical model of the DSGC system is obtained for the four-node star architecture network given in Figure 5. The figure shows that the network is formed with one power producer in the center (i.e., Generation Node) and three consumers (i.e., Consumer Node). The mathematical modeling, developed under assumptions such as no uncertainties and no external disturbances, comprises two parts. The first part describes the generator and load dynamics based on equations of motion. The second part is based on binding the electricity price to the grid frequency [4,5,37,68].

Figure 5. Four-node star architecture network with one generation node and three consumer nodes.

The first step in the modeling is applying the energy conservation law. As per the energy conservation law, the power balance equation is given as follows:

$$P_s = P_d + P_a + P_t, \tag{1}$$

where $P_s$ is the power generated from the source.
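In (1), $P_d$ is the dissipated energy from the turbine, which is proportional to the square of the angular velocity and is given as

$$P_d = K_j \left(\frac{d\delta_j(t)}{dt}\right)^{2}, \tag{2}$$

where $j$ is the node index (either generator or load), $K_j$ is the friction coefficient of the $j$th node and $\delta_j(t)$ is the rotor angle of the $j$th node, defined as

$$\delta_j(t) = \omega t + \theta_j(t), \tag{3}$$

where $\omega$ is the grid frequency and $\theta_j$ is the relative rotor angle. Similarly, in (1), $P_a$ is the accumulated kinetic energy and $P_t$ is the transmitted power, given as

$$P_a = \frac{1}{2} M_j \frac{d}{dt}\left(\frac{d\delta_j(t)}{dt}\right)^{2}, \tag{4}$$

$$P_t = -\sum_{m} P^{\max}_{jm} \sin\bigl(\delta_m(t) - \delta_j(t)\bigr), \tag{5}$$

where $M_j$ is the moment of inertia of the $j$th node and $P^{\max}_{jm}$ is the maximum capacity of the line between the $j$th and $m$th nodes.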
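By substituting (2), (4) and (5) in (1), $P_s^j$ is obtained as follows:

$$P_s^j = \frac{1}{2} M_j \frac{d}{dt}\left(\frac{d\delta_j}{dt}\right)^{2} + K_j \left(\frac{d\delta_j}{dt}\right)^{2} - \sum_{m} P^{\max}_{jm} \sin(\delta_m - \delta_j). \tag{6}$$

Now, substituting $\delta_j(t)$ from (3) in (6) and neglecting the terms that are small when $|d\theta_j/dt| \ll \omega$, $\frac{d^2}{dt^2}\theta_j(t)$ is obtained as follows:

$$\frac{d^2\theta_j}{dt^2} = P_j - \alpha_j \frac{d\theta_j}{dt} + \sum_{m} K_{jm} \sin(\theta_m - \theta_j), \tag{7}$$

where $P_j$ is the generated or consumed power, $\alpha_j$ is the damping constant and $K_{jm}$ is the coupling strength between the $j$th and $m$th nodes. These coefficients are computed as follows:

$$P_j = \frac{P_s^j - K_j \omega^{2}}{M_j \omega}, \qquad \alpha_j = \frac{2 K_j}{M_j}, \qquad K_{jm} = \frac{P^{\max}_{jm}}{M_j \omega}. \tag{8}$$

The final step in the modeling is binding the electricity price to the grid frequency $\omega$, allowing consumers to adjust their consumption or production. Thus, the electricity price $p_j$ for the $j$th node is computed as

$$p_j = p_\omega - \frac{c_1}{T_j} \int_{t-T_j}^{t} \frac{d\theta_j}{dt}(t' - \tau_j)\, dt', \tag{9}$$

where $p_\omega$ is the electricity price when $d\theta_j/dt = 0$, $c_1$ is the proportionality coefficient, and $T_j$ and $\tau_j$ are the averaging and reaction times, respectively. The power consumed or produced, $\hat{P}_j(p_j)$, at price $p_j$ is defined as

$$\hat{P}_j(p_j) \approx P_j + c_j\,(p_j - p_\omega), \tag{10}$$

where $c_j$ is the coefficient proportional to the price elasticity.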
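For the four-node star network shown in Figure 5, it is assumed that the algebraic sum of the power consumed or generated is equal to zero. Thus, the assumption is given as

$$\sum_{j=1}^{4} P_j = 0. \tag{11}$$

Therefore, the final dynamic equation of the DSGC system for the four-node star architecture network is obtained by substituting (7), (9) and (10) in (11) as follows:

$$\frac{d^2\theta_j}{dt^2} = P_j - \alpha_j \frac{d\theta_j}{dt} + \sum_{m} K_{jm} \sin(\theta_m - \theta_j) - \frac{\gamma_j}{T_j}\bigl[\theta_j(t - \tau_j) - \theta_j(t - \tau_j - T_j)\bigr], \tag{12}$$

where $\gamma_j = c_1 \times c_j$.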
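The origin of the delayed term in the final dynamic equation can be made explicit with a short worked step. The sketch below assumes that the price signal averages the delayed frequency deviation over the averaging time $T_j$ and that consumers respond linearly to the price with coefficient $c_j$; under those assumptions, the integral of the delayed derivative reduces to a difference of two delayed angles.

```latex
\begin{align}
p_j - p_\omega
  &= -\frac{c_1}{T_j}\int_{t-T_j}^{t}\dot{\theta}_j(t'-\tau_j)\,dt'
   = -\frac{c_1}{T_j}\bigl[\theta_j(t-\tau_j)-\theta_j(t-\tau_j-T_j)\bigr],\\
\hat{P}_j(p_j) - P_j
  &= c_j\,(p_j - p_\omega)
   = -\frac{\gamma_j}{T_j}\bigl[\theta_j(t-\tau_j)-\theta_j(t-\tau_j-T_j)\bigr],
   \qquad \gamma_j = c_1 c_j .
\end{align}
```

Replacing the constant power $P_j$ in the swing equation with the price-dependent power $\hat{P}_j(p_j)$ then adds exactly this delayed term, which is why only the product $\gamma_j = c_1 \times c_j$, and not $c_1$ or $c_j$ separately, appears as a predictive feature.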

Stability Analysis
In the first stage of analyzing the network's dynamical stability around the grid's steady-state operation, the fixed points of the network are computed by solving $\frac{d^2}{dt^2}\theta_j = 0$ and $\frac{d}{dt}\theta_j = 0$, which gives

$$0 = P_j + \sum_{m} K_{jm} \sin(\theta^{*}_m - \theta^{*}_j). \tag{13}$$

The above equation shows that the fixed point exists only if the grid has an adequate coupling strength coefficient $K_{jm}$ to transmit the power from the generation nodes to the consumer nodes. Furthermore, at the obtained fixed point, the value of $\omega_j$ is $\frac{d}{dt}\theta_j$, which is equal to zero. Thus, $\omega^{*}_j = 0$. The fixed point therefore depends only on the value of $\theta_j$, which must be analyzed to determine the stability.
Next, the Jacobian matrix of the system is obtained to compute the eigenvalues that determine the network's stability. The Jacobian matrix $J$ is calculated as given in (14). The eigenvalues $\lambda$ of this Jacobian matrix determine the network's stability. The characteristic equation has infinitely many solutions. However, only a finite number of solutions can have a non-negative real part (Re(λ) ≥ 0), which indicates instability of the network. In contrast, a negative real part (Re(λ) < 0) indicates stability. Therefore, the network's stability condition is summarized as follows:

$$\text{network} = \begin{cases} \text{stable}, & \mathrm{Re}(\lambda) < 0 \text{ for all } \lambda, \\ \text{unstable}, & \mathrm{Re}(\lambda) \geq 0 \text{ for any } \lambda. \end{cases} \tag{15}$$
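The stability condition can be expressed as a small helper function. The following Python sketch is illustrative only: the function name and the eigenvalues passed to it are hypothetical, and the code assumes the eigenvalues have already been computed elsewhere.

```python
from typing import Iterable


def classify_stability(eigenvalues: Iterable[complex]) -> str:
    """Return 'stable' if every eigenvalue has a negative real part."""
    # The eigenvalue with the largest real part decides the outcome.
    max_real = max(complex(lam).real for lam in eigenvalues)
    return "stable" if max_real < 0 else "unstable"


# Hypothetical eigenvalues, chosen only to exercise both branches.
print(classify_stability([-0.5 + 2.0j, -0.5 - 2.0j, -1.2]))
print(classify_stability([-0.5 + 2.0j, 0.03 + 0.0j]))
```

Only the largest real part matters, which is why a single scalar, Re(λ), can serve as the dependent variable of the prediction models.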

Data Description of Four-Node Star Network
From the differential Equation (12), it is to be noted that the parameters $\tau_j$, $P_j$ and $\gamma_j$ are the predictive features of the network. The values of these parameters used for the simulation are shown in Table 2 [68]. The value of $j$ ranges from 1 to 4, in which index 1 is the generator node and the remaining indices (2, 3 and 4) are consumer nodes. Further, the values of the simulation constants $\alpha_j$, $T_j$ and $K_{jm}$ used in the simulation are given in Table 2. The range of $P_j$ ($j \in \{2, 3, 4\}$) at the consumer nodes is also shown in Table 2. The value of $P_1$ at the generating node in the model is computed as

$$P_1 = -(P_2 + P_3 + P_4). \tag{16}$$

The generated dataset contains 60,000 samples for all the 12 predictive variables and one dependent variable, Re(λ). The predictive features are shown in Figure 6a-c, and the dependent variable Re(λ), whose values are the real part of the roots of the dynamic equation of the DSGC system in (12), is shown in Figure 6d. The figures are zoomed in to show the region of the first 300 samples.
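The structure of one dataset record can be sketched as follows. This is a minimal illustration, not the authors' generation code: Table 2 is not reproduced in this text, so the numeric ranges below are placeholder assumptions, and the function and key names are hypothetical.

```python
import random

NODES = (1, 2, 3, 4)  # node 1 generates, nodes 2-4 consume

# Placeholder ranges: illustrative assumptions standing in for Table 2.
TAU_RANGE = (0.5, 10.0)
GAMMA_RANGE = (0.05, 1.0)
P_CONSUMER_RANGE = (-2.0, -0.5)


def draw_sample(rng: random.Random) -> dict:
    """Draw the 12 predictive features of one record."""
    sample = {}
    for j in NODES:
        sample[f"tau{j}"] = rng.uniform(*TAU_RANGE)
        sample[f"gamma{j}"] = rng.uniform(*GAMMA_RANGE)
    for j in NODES[1:]:
        sample[f"P{j}"] = rng.uniform(*P_CONSUMER_RANGE)
    # The generator power balances the three consumer powers.
    sample["P1"] = -(sample["P2"] + sample["P3"] + sample["P4"])
    return sample


record = draw_sample(random.Random(0))
print(sorted(record))
```

Because $P_1$ is fully determined by the consumer powers, only 11 of the 12 features are drawn independently in this sketch.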

Correlation Analysis
The Pearson's correlation matrix between the predictive features of the network ($\tau_j$, $P_j$, $\gamma_j$) and the dependent variable Re(λ) is shown in Figure 7. As reported in [69], the interpretation of the Pearson's correlation coefficients is given in Table 3. From Figure 7 and Table 3, it can be observed that there is a moderate negative correlation coefficient of −0.579 between $P_1$ and its sum components ($P_2$, $P_3$ and $P_4$). In addition, there is a weak positive correlation between the dependent variable Re(λ) and both $\tau_j$ and $\gamma_j$, of around 0.28 and 0.29, respectively. In contrast, there is a negligible correlation between the dependent variable Re(λ) and $P_j$. Furthermore, it is worth highlighting that there is a negligible correlation among the remaining predictive features ($\tau_j$, $P_j$, $\gamma_j$) of the network.
As Pearson's correlation matrix describes the strength and direction of the association between the variables, it can be concluded that the relationship between any two predictive features, or between the predictive features and the dependent variable, is not very strong. Therefore, in this analysis, all the parameters are considered for developing and evaluating the performance of the proposed model (refer to Section 4). In addition, only the power parameters are considered for developing and assessing the performance of the proposed model that handles the missing inputs, since the correlation between $P_1$ and its sum components is moderate and stronger than that of the other parameters (refer to Section 5).
A detailed research flowchart of the complete smart grid stability design model is portrayed in Figure 8. First, the differential Equation (12) is solved using the constants in Table 2 to obtain the 12 predictive variables and one dependent variable (refer to Figure 6). The correlations are then interpreted (refer to Table 3), followed by the FFNN using complete input data (refer to Figure 9) and the FFNN to handle missing inputs (refer to Figure 12).
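The moderate negative correlation between $P_1$ and each consumer power can be checked numerically. The sketch below assumes, for illustration only, that the three consumer powers are drawn independently from the same uniform range (a placeholder, not the Table 2 values) and that $P_1$ is the negative of their sum. Under that assumption, the theoretical correlation is $-1/\sqrt{3} \approx -0.577$, which is close to the reported −0.579.

```python
import math
import random


def pearson(x, y):
    """Pearson's correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(42)
n_samples = 60_000
# Placeholder range for the consumer powers (assumption).
p2 = [rng.uniform(-2.0, -0.5) for _ in range(n_samples)]
p3 = [rng.uniform(-2.0, -0.5) for _ in range(n_samples)]
p4 = [rng.uniform(-2.0, -0.5) for _ in range(n_samples)]
p1 = [-(a + b + c) for a, b, c in zip(p2, p3, p4)]

r_p1_p2 = pearson(p1, p2)
print("empirical:", round(r_p1_p2, 3), "theoretical:", round(-1 / math.sqrt(3), 3))
```

The value is only moderate because each consumer power explains one third of the variance of $P_1$, even though the three together determine it exactly.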

Development and Performance Evaluation of Feedforward Neural Network
This section develops and evaluates the performance of an FFNN to predict the stability of the smart grid. The methodology for preparing a prediction model using complete input data is shown in Figure 9. The figure shows that data collection, analysis and pre-processing occur first. The input data is identified for the next step, which includes the prediction model to predict stability using the input data. The dataset used in this study consists of 60,000 samples.
The neural network used to predict stability from the input data is a three-layered FFNN, as shown in Figure 10. The first layer in the architecture is the input layer, which consists of 12 nodes corresponding to the 12 input parameters $\tau_j$, $P_j$, $\gamma_j$ $\forall j \in \{1, 2, 3, 4\}$. The number of nodes in the middle layer, i.e., the hidden layer, $N_h$, is 10, which is calculated from the number of nodes in the input layer, $N_i$, using (17) [70].

Figure 9. Flow chart of implementation of prediction model with complete input data.

Figure 10. The architecture of FFNN for predicting smart grid's stability (input layer $\in \mathbb{R}^{12}$, hidden layer $\in \mathbb{R}^{10}$, output layer $\in \mathbb{R}^{1}$).
The third layer is the output layer, consisting of one node for the output parameter (stability). The dataset is divided into 80% for training and 20% for testing. The neural network is trained using the Levenberg-Marquardt algorithm. The tansig and purelin activation functions are utilized in the hidden and output layers, respectively. The training algorithm and activation functions are chosen as per the results of the comprehensive literature review, as shown in Figure 4. The training and testing outputs for the neural network are shown in Figure 11a,b. The neural network performance is measured in terms of $R^2$ and MSE [71–74]. The model achieved an $R^2$ value of 0.9739 during training and 0.9738 during testing. Additionally, the model achieved an MSE value of 0.0077 during both training and testing. The $R^2$ values close to 1 and the MSE values close to 0 indicate that the prediction model is accurate.
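The 12-10-1 architecture described above can be sketched as a plain forward pass. This is a minimal illustration with randomly initialized, untrained weights; the helper names are hypothetical, and the code only shows how the tansig hidden layer and the purelin output layer are composed.

```python
import math
import random


def tansig(x: float) -> float:
    """Hyperbolic tangent sigmoid, 2 / (1 + exp(-2x)) - 1, i.e., tanh(x)."""
    return math.tanh(x)


def purelin(x: float) -> float:
    """Linear transfer function."""
    return x


def init_layer(n_out: int, n_in: int, rng: random.Random):
    """Return (weights, biases) with small random weights (assumption)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(x, hidden, output):
    """Propagate one input vector through the hidden and output layers."""
    w_h, b_h = hidden
    w_o, b_o = output
    h = [tansig(sum(w * v for w, v in zip(row, x)) + b) for row, b in zip(w_h, b_h)]
    return [purelin(sum(w * v for w, v in zip(row, h)) + b) for row, b in zip(w_o, b_o)]


rng = random.Random(0)
hidden = init_layer(10, 12, rng)  # 12 inputs -> 10 hidden neurons
output = init_layer(1, 10, rng)   # 10 hidden neurons -> 1 output
y = forward([0.1] * 12, hidden, output)
print(len(y))
```

The bounded tansig layer supplies the nonlinearity, while the unbounded purelin output lets the network produce any real value of Re(λ).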
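The data split and the two performance measures can be sketched as follows. The sketch assumes that $R^2$ denotes the usual coefficient of determination and that the samples are shuffled before splitting; neither detail is stated in the text, and the function names are hypothetical.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def split_80_20(n_samples: int, seed: int = 0):
    """Return shuffled index lists for an 80%/20% train/test split."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = (n_samples * 4) // 5
    return indices[:cut], indices[cut:]


train_idx, test_idx = split_80_20(60_000)
print(len(train_idx), len(test_idx))
```

With 60,000 samples, this split yields 48,000 training samples and 12,000 testing samples.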

Development and Performance Evaluation of Feedforward Neural Network to Handle Missing Input
This section develops and evaluates a novel prediction model to handle missing inputs. The flowchart for the methodology adopted for predicting the missing input data is represented in Figure 12. Herein, four cases of missing inputs are taken, described in the flow chart as Case 1, Case 2, Case 3 and Case 4. This flowchart is represented in three stages.
The first stage includes data collection, analysis, pre-processing and defining the missing inputs. In the second stage, a prediction model is prepared using a sub-neural network to predict the missing inputs. In the third stage, after the missing input parameters are predicted, the primary prediction model is prepared to predict stability. The primary neural network is an FFNN trained using the Levenberg-Marquardt algorithm for each case.
The tansig and purelin transfer functions are used in the hidden and output layers, respectively. The dataset consists of 60,000 samples, out of which 80% are used for training and 20% for testing. The input layer consists of 12 nodes corresponding to the 12 input parameters. The output layer consists of one node corresponding to the one output parameter. The number of nodes in the middle layer, i.e., the hidden layer, $N_h$, is 10, which can be calculated using (17).
The standard specifications for each sub-neural model in the four cases are as follows: tansig and purelin transfer functions are used in the hidden and output layers, respectively. The dataset, consisting of 60,000 samples, is divided into 80% for training and 20% for testing. The training algorithm used for the sub-neural network is the Levenberg-Marquardt algorithm. Different missing input variables are considered in each of the four cases, as explained below.

Figure 12. Flow chart of implementation of prediction model that handles missing input data for the four cases.
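The three-stage procedure can be sketched as a small pipeline in which a sub-model fills the missing power inputs before the primary model is applied. The sketch is illustrative only: the function and feature names are hypothetical, and a simple power-balance rule stands in for a trained sub-neural network.

```python
from typing import Callable, Dict, List, Sequence

# Assumed ordering of the 12 inputs of the primary network.
FEATURES = [f"{name}{j}" for name in ("tau", "P", "gamma") for j in (1, 2, 3, 4)]


def fill_missing(
    sample: Dict[str, float],
    missing: Sequence[str],
    sub_model: Callable[[Dict[str, float]], Dict[str, float]],
) -> List[float]:
    """Predict the missing inputs and return the full 12-feature vector."""
    available = {k: v for k, v in sample.items() if k not in missing}
    predicted = sub_model(available)
    completed = {**available, **{k: predicted[k] for k in missing}}
    return [completed[name] for name in FEATURES]


def balance_stand_in(available: Dict[str, float]) -> Dict[str, float]:
    """Stand-in for a trained sub-network that predicts P1 from P2, P3 and P4."""
    return {"P1": -(available["P2"] + available["P3"] + available["P4"])}


# Hypothetical record in which P1 is missing.
record = {f"tau{j}": 1.0 for j in (1, 2, 3, 4)}
record.update({f"gamma{j}": 0.5 for j in (1, 2, 3, 4)})
record.update({"P2": -1.0, "P3": -1.5, "P4": -2.0})

x = fill_missing(record, ["P1"], balance_stand_in)
print(x[FEATURES.index("P1")])
```

In the cases that follow, the stand-in would be replaced by the corresponding sub-neural network, and the completed vector would be passed to the primary FFNN.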

Case 1
One missing input variable is considered in the first case, which is predicted using a sub-neural network. The sub-neural-network model for this case is an FFNN that consists of three layers (refer to Figure 13). The input layer consists of three nodes corresponding to the three input parameters: accumulated power ($P_2$), dissipated power ($P_3$) and transmitted power ($P_4$). The output layer consists of one node corresponding to the one output parameter, i.e., source power ($P_1$). The number of nodes in the hidden layer is 10, computed using (17).

Figure 13. The architecture of FFNN developed for case 1 (input layer $\in \mathbb{R}^{3}$, hidden layer $\in \mathbb{R}^{10}$, output layer $\in \mathbb{R}^{1}$).

The training and testing outputs for the case 1 sub-neural network are shown in Figure 14a,b for 60,000 samples and zoomed in for 300 samples in the bottom subplot. The neural network performance is measured in terms of $R^2$ and MSE. The model achieved an $R^2$ value of 0.9992 during both training and testing. Additionally, the model achieved an MSE value of 0.0008 during both training and testing.
The primary neural network was trained to predict stability using the predicted output variables. The testing output variables of the sub-neural network are substituted into the primary neural network. Finally, the primary neural network is tested after predicting the missing input variables. The training and testing outputs for the case 1 primary neural network are shown in Figure 15a,b for 60,000 samples and zoomed in for 300 samples in the bottom subplot. The neural network performance is measured in terms of $R^2$ and MSE. The model achieved an $R^2$ value of 0.9721 during training and 0.8413 during testing. Additionally, the model achieved MSE values of 0.0080 during training and 0.0085 during testing.

Case 2
Case 2 involves two missing input variables, which are predicted using a sub-neural FFNN as shown in Figure 16. The input layer consists of two nodes corresponding to the input parameters: the source and transmitted powers (i.e., $P_1$ and $P_4$). The output layer has two nodes corresponding to the accumulated and dissipated powers (i.e., $P_2$ and $P_3$). The number of nodes in the hidden layer is 10 (refer to (17)). In the next step, the testing output variables of the sub-neural network are substituted into the primary neural network model, which is then trained.
Upon prediction of the missing input variables, the primary neural network is tested. The MSE and $R^2$ performance measures are used to evaluate the prediction model that handles the missing data. The sub-neural-network model achieved MSE values of 0.1661 during training and 0.1667 during testing. The $R^2$ values are 0.7082 during training and 0.7072 during testing. The sub-neural network's performances during training and testing for the first 300 samples are shown in Figure 17a,b, respectively.
The next step of case 2 involves training the primary neural network using the testing output variables obtained from the sub-neural network; its performance is also measured in terms of the MSE and $R^2$ values. The neural network attains an MSE of 0.0077 during both training and testing. Furthermore, the $R^2$ value obtained is 0.9738 during both the training and testing phases. The final developed model's performance for all the 60,000 samples and the first 300 samples in both phases is represented in Figure 18a,b, respectively. The response plots of the final primary model show good prediction and tracking ability in both phases. The MSE and $R^2$ values, which are close to 0 and 1, respectively, indicate that the final proposed model for this case gives a superior performance.

Case 3
Further, Case 3 considers two missing input variables, which are predicted using a feedforward sub-neural network as shown in Figure 19. The input layer has two nodes corresponding to the two input parameters: the source power and accumulated power (i.e., $P_1$ and $P_2$). The output layer has two nodes corresponding to the two output parameters, the dissipated and transmitted powers (i.e., $P_3$ and $P_4$). The number of nodes in the hidden layer is 10. In the next step, the testing output variables of the sub-neural network are substituted into the primary neural network model, which is then trained. The MSE and $R^2$ performance measures are used to evaluate the prediction model that handles the missing data. The sub-neural-network model achieved MSE values of 0.1659 during the training phase and 0.1673 during the testing phase.

Figure 19. The architecture of FFNN developed for case 3 (input layer $\in \mathbb{R}^{2}$, hidden layer $\in \mathbb{R}^{10}$, output layer $\in \mathbb{R}^{2}$).

The $R^2$ values are found to be 0.7085 and 0.7061 during training and testing, respectively. The sub-neural network's performance during training and testing for the first 300 samples is shown in Figure 20a,b, respectively.
Next, the primary neural network was trained using the testing output variables obtained from the sub-neural network, and its performance was measured using the MSE and $R^2$ values. The neural network attained an MSE of 0.0083 during training and 0.0082 during testing. Furthermore, the $R^2$ values obtained were 0.9720 during training and 0.9721 during testing. The final developed model's performance for the 60,000 samples and the first 300 zoomed-in samples in both phases is represented in Figure 21a,b, respectively. The response plots of the final primary model show good prediction and tracking ability in both phases. The MSE and $R^2$ values, which are close to 0 and 1, respectively, indicate that the final proposed model for this case gives a satisfactory performance.

Case 4
Finally, case 4 considers one missing input variable, which is predicted using a sub-neural network as shown in Figure 22. Here, the input layer has three nodes representing the input parameters: the source power, accumulated power and dissipated power (i.e., P1, P2 and P3). The output layer is composed of one node corresponding to the transmitted power (i.e., P4). The number of nodes in the hidden layer is 10. The predicted output variable is then substituted into the primary neural network, which is trained to predict the stability.
Once the missing input variable is predicted, the primary neural network is trained as in the previous cases. The model's performance in handling the missing data is measured using R² and MSE. The sub-neural-network model achieved an MSE of 0.0001 and an R² value of 0.9999 during both the training and testing phases. The performance of the sub-neural-network model for all 60,000 samples and for the first 300 zoomed-in samples is depicted in Figure 23a,b for training and testing, respectively.

Figure 22. The architecture of the FFNN developed for case 4 (input layer: 3 nodes; hidden layer: 10 nodes; output layer: 1 node).

Further, the primary neural network was trained as in the other cases by utilizing the testing output variables from the sub-neural network. The model obtains an MSE of 0.0084 during both the training and testing phases. The R² values were 0.9717 during training and 0.9715 during testing. The performance of the final developed neural network for all 60,000 samples and for the first 300 samples during both training and testing is shown in Figure 24a,b, respectively. The model's response shows good training and prediction ability in both phases. The MSE values obtained are close to 0, and the R² values are close to 1, highlighting that the proposed model is accurate for stability prediction.

Table 4 depicts the performance evaluation of the developed FFNN model using complete input data and of the models that handle the missing inputs (Case 1, Case 2, Case 3 and Case 4). The R² and MSE results of the sub-neural-network model of case 4, with one missing input variable (the transmitted power, P4), show the best training and prediction ability among the sub-neural networks in both phases. The R² value is at least 0.70 for all the sub-neural-network models and at least 0.97 for the primary neural networks. For the primary networks, the MSE values obtained are close to 0 and the R² values are close to 1, which indicates the excellent performance of the stability prediction models.
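The two-stage procedure shared by the case studies (predict the missing inputs, substitute them, then train the stability predictor on the filled-in inputs) can be sketched as below. To keep the example short and dependency-free, an ordinary least-squares fit stands in for both neural networks; this is a swapped-in technique, not the FFNN trained with Levenberg–Marquardt used in the paper. The data are synthetic placeholders, and all function names are hypothetical.

```python
import numpy as np

def fit_linear(X, Y):
    """Least-squares fit with intercept; a stand-in for a trained network."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return coef

def predict_linear(coef, X):
    """Apply a model returned by fit_linear."""
    return np.column_stack([X, np.ones(len(X))]) @ coef

def two_stage_fit(X, y, observed_idx, missing_idx):
    """Stage 1 predicts the missing columns; stage 2 trains on the filled-in inputs."""
    sub = fit_linear(X[:, observed_idx], X[:, missing_idx])
    X_filled = X.copy()
    X_filled[:, missing_idx] = predict_linear(sub, X[:, observed_idx])
    primary = fit_linear(X_filled, y)
    return sub, primary, X_filled

# Synthetic placeholder data: 4 input columns and 1 target (not the paper's dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X @ np.array([0.5, -1.0, 0.25, 2.0]) + 0.1 * rng.normal(size=200)

# Case 3 layout: columns 0 and 1 observed, columns 2 and 3 treated as missing.
sub, primary, X_filled = two_stage_fit(X, y, [0, 1], [2, 3])
y_hat = predict_linear(primary, X_filled)
```

Passing `[0, 1, 2]` and `[3]` as the index lists reproduces the case 4 layout with a single missing input. The design point the sketch is meant to show is only the data flow: the primary model never sees the true values of the missing columns, only the sub-model's predictions.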

Conclusions
The primary goal of this paper was to tackle the issue of stability prediction when input variables are missing. A missing variable could be due to the failure of a sensor, network connection or other system. This paper addressed this issue by proposing a novel FFNN model that handles missing inputs. The model's performance was evaluated on a four-node star network. In this study, four cases of missing input variables were considered.
For each case, a sub-neural network was first prepared to predict the missing variables, and then these predicted values were fed into the primary neural network to predict the stability. Among all four cases, case 4 showed the best sub-neural-network performance, with an MSE value of 0.0001 and an R² value of 0.9999 during training and testing. In addition, the primary network of case 4 showed an MSE value of 0.0084 and an R² value of 0.9717 during training and 0.9715 during testing. For all four cases, the primary models achieved an MSE close to 0 and an R² value close to 1, thus indicating the excellent performance of the prediction models.
However, this work was limited to predicting the power parameters using a sub-neural network, because the algebraic sum of the power consumed or generated was assumed to be zero, and uncertainties and disturbances were not considered. Moreover, the reaction time and price elasticity parameters are highly nonlinear in the considered dataset. As a result, the proposed two-stage model, in which a sub-neural network predicts the missing variables and a primary network predicts the stability, falls short when the missing variables are the reaction time or the price elasticity. Therefore, extending the proposed model to predict these highly nonlinear input parameters will be addressed in our future work.
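As a hedged illustration of the zero-sum assumption mentioned above: if P1 to P4 are taken to be the signed nodal powers that the constraint refers to (an assumption made here; the sign convention is not restated in this section), then any single missing power is fixed by the other three. This would be consistent with the near-perfect sub-network fit reported for case 4, although the derivation is not stated explicitly in the paper.

```latex
\sum_{i=1}^{4} P_i = 0
\quad\Longrightarrow\quad
P_4 = -\left(P_1 + P_2 + P_3\right)
```

When two powers are missing, as in case 3, the same constraint only fixes their sum, P3 + P4 = -(P1 + P2), so the individual values are not uniquely determined by the observed inputs.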

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: