A Data-Driven and Data-Based Framework for Online Voltage Stability Assessment Using Partial Mutual Information and Iterated Random Forest

Due to the rapid development of phasor measurement units (PMUs) and the wide area of interconnection of modern power systems, the security of power systems is confronted with severe challenges. A novel framework based on data for static voltage stability margin (VSM) assessment of power systems is presented. The proposed framework can select the key operation variables as input features for the assessment based on partial mutual information (PMI). Before the feature selection procedure is completed by PMI, a feature preprocessing approach is applied to remove redundant and irrelevant features to improve computational efficiency. Using the selected key variables, a voltage stability assessment (VSA) model based on iterated random forest (IRF) can rapidly provide the relative VSM results. The proposed framework is examined on the IEEE 30-bus system and a practical 1648-bus system, and a desirable assessment performance is demonstrated. In addition, the robustness and computational speed of the proposed framework are also verified. Some impact factors for power system operation are studied in a robustness examination, such as topology change, variation of peak/minimum load, and variation of generator/load power distribution.


Introduction
Static voltage stability is a crucial issue for the secure operation of power systems, since major power outage incidents around the world have been associated with it [1][2][3]. This issue not only causes huge economic losses but also has an unpredictable impact on the lives of people and on industrial production. For these reasons, an accurate and rapid assessment tool to assess whether a current operation point is prone to voltage collapse is essential for power system operators.
Static voltage stability analysis aims to find the distance from the current operation point to the voltage collapse point when the generation and load are increased slowly [4,5]. The conventional technique of researching the static voltage stability margin (VSM) is the model-based method that solves the power flow iteratively from a basic operation point to the voltage stability limit. For this technique, there are different kinds of methods for VSM assessment, such as continuation power flow (CPF) [6,7], singular value decomposition [8], and sensitivity analysis [9]. However, these model-based methods may not be suitable for online applications in practice because of the difficulties of accommodating complex operation conditions and the massive time consumption.
With the development of wide area measurement systems (WAMS) and the widespread adoption of phasor measurement units (PMUs) in power systems, determining how to make full use of rapidly accumulated PMU data has become an important topic [10][11][12]. Compared with the asynchronism and slowness of supervisory control and data acquisition (SCADA), the application of synchronized data from PMUs can facilitate decision-making and system operation. To efficiently utilize PMU data, the application of data-driven and data-based methods in online voltage stability assessment (VSA) has attracted widespread attention in recent years. In the literature, support vector machines (SVMs) [13], decision trees (DTs) [14], artificial neural networks (ANNs) [15], and extreme learning machines (ELMs) [16] have been employed for online static VSA. By extracting and formulating a mapping relation between the operation data and VSM based on the training process, the data-driven tools can provide assessment results when real-time PMU data are received. However, the above data-driven methods still have some shortcomings in online applications for large-scale power systems, including the complexity of decision-making rules, the difficulty of processing large-scale samples, and the inconvenience of dealing with missing data.
In this paper, to make VSA more efficient for the practical operation of power systems, a data-driven and data-based framework is proposed that can achieve online VSA with low computational complexity, rapid processing speed, and considerable prediction performance in modern power systems. In the proposed framework, feature preprocessing, feature selection, and regression prediction are applied. First, a feature preprocessing approach is used to remove redundant and irrelevant features from the collected PMU data to improve computational efficiency. Second, partial mutual information (PMI) [17] is used in the feature selection procedure to screen out the key variables, which can conveniently explore connotative correlations. Finally, the key variables are sent to the VSA model based on iterated random forest (IRF) [18] to complete the VSM prediction. The desirable performance of the framework is demonstrated by tests on the IEEE 30-bus system and a practical 1648-bus system.
Compared with previous studies, the contributions of this paper can be summarized as follows:
1) In this paper, a feature preprocessing approach and a feature selection procedure are designed. The approach can remove redundant and irrelevant features from the collected PMU data and aims to improve computational efficiency. The procedure can significantly reduce the dimension of the sample set in preparation for the subsequent prediction. Specifically, PMI is applied in the feature selection procedure to select key variables by detecting connotative correlations, which can overcome the problems of underestimation and overestimation in conventional feature selection methods.
2) In view of large-scale operation data, the partially missing PMU data, and the real-time requirement of VSA in power systems, a VSA model based on IRF is presented. IRF has the following advantages: accommodating large-scale data sets, dealing with partially missing data, and reducing the computational burden. In addition, a model update mechanism is designed for VSA models, which can adapt to unforeseen changes of practical power system conditions.
3) Some impact factors in the practical operation of systems are taken into consideration in this paper, including topology change, variation of peak/minimum load, and variation of generator/load power distribution. A desirable assessment performance and the robustness of the VSA model are verified.
The rest of this paper is organized as follows: Section 2 contains the problem statement and introduction of supporting methods. Section 3 introduces the proposed framework for online VSA in detail. Section 4 presents the performance test of the proposed framework in the IEEE 30-bus system. Section 5 applies the framework to a practical 1648-bus system. Section 6 concludes the paper.

VSM
In this paper, the CPF method is used to draw the P-V curve of the system. The P-V curve is commonly used to describe the correlation between the voltage of the load bus and the active power delivered to the load [19], as shown in Figure 1. Point a represents the initial operation point, which indicates that the system is operating in a state of light load. With an increase in the load level, the operation point gradually approaches point b.
In practice, the operation point of a power system is located on curve ab, and the load voltage decreases with a continuous increase in the active power of the load. When the load continues to increase, the operation point will finally reach point b, where the system is in a critical state of static stability. If the load is increased at this point, the voltage will collapse and cause system instability [20]. The load active power margin and the relative VSM are defined as Equation (1) and Equation (2), respectively.
The idea of CPF is utilized to determine the collapse point and complete the VSM calculation. n different operation points are generated from the system, and the maximum deliverable power for each operation point is determined. The load active power margin directly reflects the capacity of the current system to maintain voltage stability, and it is helpful for operators to acquire the degree of security of the system intuitively and correctly.
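As an illustration of the two margin definitions, the short Python sketch below computes them from a current operating point and a CPF-derived maximum deliverable power. It assumes that the load active power margin is the difference between the maximum and the current load active power, and that the relative VSM normalizes this margin by the maximum load. The function name, the argument names, and the choice of normalization are illustrative assumptions rather than a restatement of Equations (1) and (2).

```python
from typing import Tuple


def voltage_stability_margin(p_current: float, p_max: float) -> Tuple[float, float]:
    """Return (load active power margin, relative VSM).

    p_current: load active power at the current operation point (point a).
    p_max: load active power at the collapse point (point b), e.g. from CPF.
    Assumption: the relative VSM is the margin divided by p_max.
    """
    if p_max <= 0:
        raise ValueError("p_max must be positive")
    if p_current > p_max:
        raise ValueError("the current load cannot exceed the maximum deliverable load")
    margin = p_max - p_current
    return margin, margin / p_max


margin_mw, relative_vsm = voltage_stability_margin(p_current=100.0, p_max=150.0)
```

With these hypothetical numbers the margin is 50 MW, which is one third of the maximum deliverable load.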

PMI
Detecting associations between variables is an important challenge in large data sets. PMI is an information-theoretic measure of the dependence between variables, which can accurately quantify the connotative associations between measured variables. PMI has been used to infer direct dependencies in the field of biology, where it has exhibited superior performance compared with some conventional methods. In practical tests, conventional methods usually suffer from underestimation and overestimation [21,22]. PMI can not only overcome these problems but also retain quantitative characteristics [17]. In this paper, PMI is introduced into the field of power systems for feature selection to address the curse of dimensionality for system operation variables. By utilizing PMI to explore the associations between the operation variables and the relative VSM, each association can be assigned a score. Therefore, a variable weakly related to the VSM will be given a low score and removed by the PMI-based feature selection, and the dimension of the data set can be reduced. The considered features are the system steady-state operation variables, such as branch active/reactive power flow, bus voltage amplitude and phase angle, load active/reactive power, and generator active/reactive power output.
For random variables x, y, and z, x and y are one-dimensional variables and z is an n-2 dimensional vector (n > 2 is a positive integer), where n is the dimension of vector (x, y, z) [17].
$p(x, y \mid z)$ is the joint probability distribution of x and y with the condition z, which is defined as Equation (3), where $p^*(x \mid z)$ and $p^*(y \mid z)$ are defined as Equation (4) and Equation (5), respectively [23]. Here, p(x) and p(y) are the marginal distributions of x and y, respectively. $p(x \mid z, y)$ is the conditional probability distribution of x with the conditions z and y, and $p(y \mid z, x)$ is the conditional probability distribution of y with the conditions z and x. $p^*(x \mid z)$ is the average value of $p(x \mid z, y)$ over y, and $p^*(x \mid z) = p(x \mid z)$ if x and y are independent with the condition z. $p(x \mid z)$ is the conditional probability distribution of x with the condition z, and $p(y \mid z)$ is the conditional probability distribution of y with the condition z.
Then the PMI value between variables x and y given the condition z is defined as Equation (6), where $p(x, y, z)$ is the joint probability distribution of x, y, and z, and p(z) is the marginal distribution of z. The value of PMI falls between 0 and 1. Some characteristics of PMI are as follows.
1) A larger value of PMI means that a stronger association exists between x and y.
2) PMI = 0 means that x and y are statistically independent.
3) A value of PMI that is close to 1 means that there is a close association between x and y.
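For readers who want the quantities above in explicit form, the following LaTeX block writes them out for discrete variables. It follows the way this measure is commonly formulated in the literature cited as [17] and is offered as an assumed reading of the verbal definitions above; the equation numbers of this paper are deliberately not attached, and any normalization that maps the score into the interval from 0 to 1 is not shown.

```latex
% Averaged conditional distributions (assumed form, discrete case)
p^{*}(x \mid z) = \sum_{y} p(x \mid z, y)\, p(y), \qquad
p^{*}(y \mid z) = \sum_{x} p(y \mid z, x)\, p(x)

% PMI as a divergence between p(x,y,z) and p^{*}(x|z) p^{*}(y|z) p(z)
\mathrm{PMI}(x; y \mid z) =
  \sum_{x, y, z} p(x, y, z)\,
  \log \frac{p(x, y \mid z)}{p^{*}(x \mid z)\, p^{*}(y \mid z)}
```

Under this form, the score is zero when the ratio inside the logarithm equals one everywhere, which is consistent with characteristic 2) above.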

IRF
IRF is an ensemble learning technique for classification and prediction. Based on the classic random forest (RF) algorithm, IRF trains a feature-weighted ensemble of decision trees to handle prediction problems with the same order of computational cost as RF [18]. Compared with some conventional prediction tools, IRF has a relatively high predictive accuracy with a low computational cost. In this paper, IRF is introduced into the field of power system stability assessment and is applied to VSM prediction.
As shown in Figure 2, the IRF algorithm consists of the following procedures.

Iteratively Reweighted RF
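Given a number of iterations K, IRF iteratively grows K feature-weighted RFs based on the data set. The iterative process is represented by $RF(w^{(k)})$, $k = 1, \ldots, K$, where $w^{(k)}$ is the feature weight vector used in iteration k. The first iteration stores the importance (mean decrease in impurity) of the p features as $v^{(1)} = (v_1^{(1)}, \ldots, v_p^{(1)})$. In each subsequent iteration, the feature importance from the previous iteration is used as the new weight, i.e., $w^{(k)} = v^{(k-1)}$ [24].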
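The reweighting recursion can be sketched in a few lines of Python. The sketch only shows the control flow: the forest itself is abstracted as a callable that takes feature weights and returns feature importances. It assumes uniform weights in the first iteration and renormalizes the importances so that they can serve as sampling weights; `iterate_feature_weights` and `toy_importances` are hypothetical names, and the toy callable is a stand-in for training a real feature-weighted RF.

```python
from typing import Callable, List, Sequence


def iterate_feature_weights(
    fit_and_score: Callable[[Sequence[float]], List[float]],
    n_features: int,
    n_iterations: int,
) -> List[List[float]]:
    """Return the weight vectors used in iterations 1..K."""
    weights = [1.0 / n_features] * n_features  # assumed uniform start
    history = []
    for _ in range(n_iterations):
        history.append(list(weights))
        importances = fit_and_score(weights)  # importances of this iteration's forest
        total = sum(importances)
        if total > 0:
            # The previous importances become the next weights (renormalized).
            weights = [v / total for v in importances]
    return history


def toy_importances(weights: Sequence[float]) -> List[float]:
    """Stand-in for a forest: importance grows with weight and true signal."""
    signal = [3.0, 1.0, 0.0]  # hypothetical relevance of three features
    return [w * s for w, s in zip(weights, signal)]


history = iterate_feature_weights(toy_importances, n_features=3, n_iterations=3)
```

In this toy run the weight of the irrelevant third feature drops to zero after the first iteration, while the weight of the first feature rises from 1/3 to 0.9 by the third iteration.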

Generalized RIT
The generalized random intersection tree (RIT) is applied to the last feature-weighted RF grown in iteration K [25]. The decision rules generated while fitting this forest provide the mapping from continuous features to the binary features required by the RIT.

Validity Assessment
The validity of the final output should be assessed. The generalized RIT is applied to grade the output results, and a result with a score q greater than 0.5 is retained (the score falls between 0 and 1). In this paper, IRF is used as a regressor to build the VSA model for the efficient VSM prediction.

Proposed Framework for Online VSM Assessment
In this paper, the proposed framework for online VSM assessment is shown in Figure 3. The framework includes three stages: offline training, model update, and online assessment.
The data of system operation variables are obtained by the collection of PMUs, and a knowledge base including a large number of variables and the relative VSM can be established. Before training the VSA model, the feature preprocessing approach and the feature selection procedure are applied to remove redundant features and reduce the dimension of the knowledge base. In contrast to conventional data collection, PMUs can quickly and precisely measure the voltage phasor at a bus and the current phasor of lines connected to the bus [26]. Therefore, based on the real-time PMU data from WAMS, online VSM assessment can be executed. In practical applications, the offline trained model needs to be updated to effectively deal with the unforeseen operation conditions of the system. Therefore, a model update mechanism is introduced to increase the generalization ability of the VSA model and achieve seamless online assessment.

Knowledge Base Construction
In the offline training stage, it is important to generate a reliable and abundant knowledge base that is consistent with the practical operation of power systems. The construction of the knowledge base can provide empirical data to build accurate mapping relations between operation data and the VSM.
Based on the historical statistical data of system operations and offline simulations, a knowledge base containing a massive number of operation variables and the relative VSM can be obtained. In general, the historical operation data of power systems can be collected by PMUs and SCADA. Nevertheless, since some potential system operation behaviors may not be recorded, it is insufficient to consider only historical statistical data. To establish a larger operation space and a more abundant knowledge base, linear interpolation between two close historical operation points can be used to capture additional points that are consistent with the practical operation of power systems [27]. Additionally, some reasonable fluctuations can be added to practical operation points, and the considered fluctuations include topology change, variation of peak/minimum load, and variation of generator/load power distribution. In offline simulations, the system operation variables are obtained by solving the power flow on operation points generated by the above methods. Then, the VSM related to each operation point can be acquired based on the CPF concept discussed in Section 2.1. Since the obtained abundant knowledge base covers sufficient and reliable operation scenarios, the generality and adaptability of the VSA model can be ensured.
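The linear interpolation step can be illustrated with a minimal Python sketch. It treats an operation point as a plain list of numbers (for example, bus load values) and returns evenly spaced intermediate points between two historical points. The function name and the list representation are assumptions made for illustration; in practice each generated point would still have to be checked by a power flow solution.

```python
from typing import List, Sequence


def interpolate_operating_points(
    point_a: Sequence[float], point_b: Sequence[float], n_new: int
) -> List[List[float]]:
    """Return n_new evenly spaced points strictly between point_a and point_b."""
    if len(point_a) != len(point_b):
        raise ValueError("both operation points need the same variables")
    if n_new < 1:
        raise ValueError("n_new must be at least 1")
    points = []
    for i in range(1, n_new + 1):
        t = i / (n_new + 1)  # interpolation fraction in (0, 1)
        points.append([a + t * (b - a) for a, b in zip(point_a, point_b)])
    return points


# Two hypothetical historical points: [load at bus 1, load at bus 2] in MW.
new_points = interpolate_operating_points([100.0, 50.0], [120.0, 60.0], n_new=3)
```

Here the three generated points are [105, 52.5], [110, 55], and [115, 57.5].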

Feature Preprocessing
Since the scale of the modern power system has become larger and the fluctuation of system operation has become more frequent, the dimensionality of the data set may grow to an unacceptable level. In this paper, a feature preprocessing approach is used to remove the redundant and irrelevant features to overcome this problem. The basic process is executed in the following three steps:
Step 1: Some subsets are generated from the knowledge base through a division strategy. The division strategy is shown in Figure 4. The original input feature set INP is divided into K subsets. The divided subset $INP_K$ is defined as Equation (7), where $inp_l$ ($l = 1, 2, \ldots, n$) denotes the features of system operation, including branch active and reactive power flow, voltage magnitude, phase angle, etc.
Step 2: Each divided subset F is processed in a loop, as shown in Figure 5. An assessment function is used to assess the subset and provide decisional information for the loop stopping criterion. The loop is terminated when the stopping criterion is satisfied, and the final feature subset is obtained. The assessment function J(S, f) is defined as Equation (8), where f is a feature in subset F obtained from Step 1, S is the newly generated subset consisting of features selected from F, I(·,·) represents the mutual information, s is a feature already moved from F to S, and the coefficient in Equation (8) is a user-defined parameter for adjusting the number of finally selected features. In accordance with experiments, a value of this parameter between 0.5 and 1 is recommended.
Step 3: After subset processing, the feature subsets are merged into a large data set, which is used in the subsequent feature selection procedure.
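The subset-processing loop of Step 2 can be sketched in Python as a greedy selection. The sketch assumes that the assessment function has the common relevance-minus-redundancy form J(S, f) = I(VSM, f) − β Σ I(f, s), summed over the features s already in S, which is an assumption about Equation (8) and not a quotation of it. It also assumes discretized feature columns and a simple plug-in mutual information estimate; the names `mutual_information`, `select_subset`, `beta`, and `eps` are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def mutual_information(a: Sequence, b: Sequence) -> float:
    """Plug-in mutual information (in nats) between two discrete sequences."""
    n = len(a)
    count_a, count_b, count_ab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (c / n) * math.log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in count_ab.items()
    )


def select_subset(features: Dict[str, Sequence], target: Sequence,
                  beta: float, eps: float = 1e-3) -> List[str]:
    """Greedy loop: keep adding the best feature while J(S, f) > eps."""
    remaining = dict(features)
    # The first feature is the one with the largest I(VSM, f).
    first = max(remaining, key=lambda k: mutual_information(target, remaining[k]))
    selected = {first: remaining.pop(first)}

    def score(name: str) -> float:
        column = remaining[name]
        redundancy = sum(mutual_information(column, s) for s in selected.values())
        return mutual_information(target, column) - beta * redundancy

    while remaining:
        best = max(remaining, key=score)
        if score(best) <= eps:
            break  # stopping criterion: no candidate satisfies J(S, f) > eps
        selected[best] = remaining.pop(best)
    return list(selected)


# Toy data: the target depends on a and b; a_copy is redundant, noise is irrelevant.
a = [0, 0, 0, 0, 1, 1, 1, 1]
b = [0, 0, 1, 1, 0, 0, 1, 1]
noise = [0, 1, 0, 1, 0, 1, 0, 1]
target = [2 * ai + bi for ai, bi in zip(a, b)]
chosen = select_subset({"a": a, "a_copy": list(a), "b": b, "noise": noise},
                       target, beta=1.0)
```

On this toy data the loop keeps `a` and `b` and rejects both the duplicate and the noise column. With a smaller `beta`, an exact duplicate can still pass the threshold, which shows how the parameter controls the number of finally selected features.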

Figure 4. Division strategy: each feature f is initially set as a subset, the stopping subset number is set as K, and the mutual information I between subsets is calculated.

Feature Selection Procedure
In this stage, PMI is applied to perform the feature selection procedure. The purpose of the feature selection from the data set is to select the key features significantly related to the VSM, which can further reduce the dimension of the features.
First, a series of features after feature preprocessing in the knowledge base are used as the input. Second, PMI is used to explore the connotative correlations between operation variables and the relative VSM, which are assigned scores and ranked. Finally, the operation variables with highly ranked correlations are selected as the output for building the sample set, which is an optimal sample set consisting of the key variables. The ultimate sample set is sent to the IRF regressor to execute offline training.
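To make the scoring step concrete, the sketch below estimates a PMI score from discrete samples in pure Python. It assumes the formulation in which PMI is the divergence between p(x, y, z) and p*(x|z) p*(y|z) p(z), with p*(x|z) the average of p(x|z, y) over p(y). It further assumes that all variables have been discretized, that z is a single conditioning variable, and that terms with unobserved conditioning pairs are skipped. The function name is hypothetical, and no normalization to the interval from 0 to 1 is applied.

```python
import math
from collections import Counter
from typing import Sequence


def part_mutual_information(x: Sequence, y: Sequence, z: Sequence) -> float:
    """Plug-in PMI estimate (in nats) of x and y given z for discrete samples."""
    n = len(x)
    if n == 0 or not (n == len(y) == len(z)):
        raise ValueError("x, y and z must be non-empty and of equal length")
    c_xyz = Counter(zip(x, y, z))
    c_xz, c_yz = Counter(zip(x, z)), Counter(zip(y, z))
    c_x, c_y, c_z = Counter(x), Counter(y), Counter(z)

    def p_star_x(xv, zv) -> float:
        # p*(x|z): average of p(x|z,y) weighted by the marginal p(y)
        return sum(c_xyz[(xv, yv, zv)] / c_yz[(yv, zv)] * cy / n
                   for yv, cy in c_y.items() if c_yz[(yv, zv)])

    def p_star_y(yv, zv) -> float:
        # p*(y|z): average of p(y|z,x) weighted by the marginal p(x)
        return sum(c_xyz[(xv, yv, zv)] / c_xz[(xv, zv)] * cx / n
                   for xv, cx in c_x.items() if c_xz[(xv, zv)])

    total = 0.0
    for (xv, yv, zv), c in c_xyz.items():
        p_xy_given_z = c / c_z[zv]
        total += c / n * math.log(
            p_xy_given_z / (p_star_x(xv, zv) * p_star_y(yv, zv)))
    return total


# Mutually independent toy variables give a score of zero ...
independent = part_mutual_information(
    [0, 0, 0, 0, 1, 1, 1, 1], [0, 0, 1, 1, 0, 0, 1, 1], [0, 1, 0, 1, 0, 1, 0, 1])
# ... while y = x (with z independent of both) gives a positive score.
dependent = part_mutual_information([0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1])
```

Ranking candidate features then amounts to computing such a score between each feature and the VSM and sorting the features in descending order of score.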

Model Update
The model update stage cannot be neglected on account of the variable operation environments of power systems. Therefore, some impact factors of power system operation are considered for driving the model update mechanism, which is designed with the following details.
As shown in Figure 3, the impact factors including topology change, variation of peak/minimum load, and variation of generator/load power distribution are considered in this work. For the first factor, the system network topology often changes due to possible operation requirements, such as contingencies, economic dispatch, and scheduled maintenance. For the second factor, the load demand level tends to change over time and it may be different in winter and summer. For the last factor, the variation of generator/load power distribution may be caused by the fluctuation of renewable energy and distributed generation with high penetration. Generally, a list of credible operation conditions is available from utility companies in practice. Hence, a series of models trained for credible operation conditions is prepared and included in the offline knowledge base to achieve a rapid response to operation condition changes.
Figure 5. Loop flow for subset processing: set S = Ø and initialize ε as a small number larger than zero; find the f that maximizes I(VSM, f) and put it into S as the first feature; for the changed S and F, if the feature f satisfies J(S, f) > ε, put the feature f into S; otherwise, terminate the loop.

When operation conditions change due to the above factors in the application of the proposed framework, the corresponding handling method is as follows:
1) If the changed operation condition has been recorded in the knowledge base, the corresponding VSA model will be immediately selected to replace the original one.
2) If the match cannot be found, the assessment accuracies of readily available trained models will be checked based on the changed operation condition. (1) If some models can provide acceptable accuracies, the model with the highest accuracy of such candidates will be used to accomplish VSA for the system with the changed operation condition. (2) If the existing models cannot provide acceptable accuracies for the changed operation condition, the construction process of a new VSA model will be activated. Then, the changed operation condition will be recorded, and the corresponding new model will be absorbed in the knowledge base.
By continuously executing the model update, fewer unseen operation conditions will be encountered and seamless online VSA can be gradually achieved.
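The handling logic for a changed operation condition can be summarized as a small decision routine. The Python sketch below is one possible arrangement under stated assumptions: the knowledge base is a dictionary from a condition identifier to a trained model, accuracy is judged by R² against a 0.90 threshold, and the evaluation and training steps are passed in as callables. All names (`select_vsa_model`, `evaluate_r2`, `train_new_model`) are hypothetical.

```python
from typing import Any, Callable, Dict, Hashable, Tuple


def select_vsa_model(
    condition: Hashable,
    knowledge_base: Dict[Hashable, Any],
    evaluate_r2: Callable[[Any, Hashable], float],
    train_new_model: Callable[[Hashable], Any],
    min_r2: float = 0.90,
) -> Tuple[Any, str]:
    """Return (model, action) for a changed operation condition."""
    # Case 1: the condition is already recorded, so use its model directly.
    if condition in knowledge_base:
        return knowledge_base[condition], "matched"

    # Case 2(1): check the existing models and reuse the most accurate one.
    best_key, best_r2 = None, float("-inf")
    for key, model in knowledge_base.items():
        r2 = evaluate_r2(model, condition)
        if r2 > best_r2:
            best_key, best_r2 = key, r2
    if best_key is not None and best_r2 >= min_r2:
        return knowledge_base[best_key], "reused"

    # Case 2(2): build a new model and absorb it into the knowledge base.
    model = train_new_model(condition)
    knowledge_base[condition] = model
    return model, "retrained"
```

Because newly trained models are stored under their condition, a later occurrence of the same condition falls into the first case, which is how repeated updates reduce the number of unseen conditions.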

Online Assessment
As shown in Figure 3, once real-time PMU measurements are received from the WAMS server, the data of selected features will be sent to the VSA model, and the model can provide the synchronous assessment results instantly.

Test System and Data Generation
The VSA model proposed in this paper was tested on the IEEE 30-bus system, which consists of 30 buses, 6 generators, and 37 transmission lines. The original topology of the system is shown in Figure 6. The tests were conducted on an Intel Core i7 3.40 GHz Central Processing Unit (CPU) with 8 GB of Random Access Memory (RAM). In this study, Power System Simulator for Engineering (PSS/E) was applied to generate abundant operation points, and Python programs were used to automatically control the PSS/E simulator.
By considering some impact factors of power system operation, such as topology change, variation of peak/minimum load, and variation of generator/load power distribution, more operation points were generated to enrich the knowledge base. In total, 4759 records were generated for the VSA model. The obtained sample set was split into two independent data sets: 70% of the records were randomly selected for training the model, and the remaining 30% of the records were used for model testing. The training and testing were repeated more than five times until the mean and standard deviation of the accuracy became stable.
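The repeated random 70/30 split can be sketched with the Python standard library. The sketch assumes that records are held in a list, that the training count is rounded down, and that model fitting and scoring are supplied as a callable; `split_records`, `repeat_evaluation`, and `fit_and_score` are hypothetical names, and the stand-in scorer in the usage lines only exists to make the example runnable.

```python
import random
import statistics
from typing import Callable, List, Sequence, Tuple


def split_records(records: Sequence, train_fraction: float = 0.7,
                  seed: int = 0) -> Tuple[List, List]:
    """Randomly split records into a training part and a testing part."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)  # rounded down
    return shuffled[:n_train], shuffled[n_train:]


def repeat_evaluation(records: Sequence,
                      fit_and_score: Callable[[List, List], float],
                      n_repeats: int = 5) -> Tuple[float, float]:
    """Return mean and standard deviation of the score over repeated splits."""
    scores = []
    for seed in range(n_repeats):
        train, test = split_records(records, seed=seed)
        scores.append(fit_and_score(train, test))
    return statistics.mean(scores), statistics.stdev(scores)


train_set, test_set = split_records(range(4759))
# Stand-in scorer: reports the training share instead of a real accuracy.
mean_score, std_score = repeat_evaluation(
    range(4759), lambda tr, te: len(tr) / (len(tr) + len(te)))
```

For 4759 records this split yields 3331 training records and 1428 testing records.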

Feature Selection
By exploring the connotative correlations between variables, the variables with high scores were selected as the key features, which were recorded as the inputs of IRF. Finally, 22 variables and the relative VSM were used to establish the sample sets.
From the computational consumption point of view, the feature selection procedure was able to avoid unnecessary computational burden for the IRF application. In the VSA model, it was desirable to maintain an acceptable prediction performance by using the representative features.

VSA Test
The performance of the VSA model was tested using the coefficient of determination ($R^2$) and the root mean squared error (RMSE), as follows:

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (Y_i - Y_i^*)^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (Y_i - Y_i^*)^2}$$

where n is the number of records, $Y_i$ is the actual VSM of record i, $Y_i^*$ is the prediction value obtained by the VSA model, and $\bar{Y}$ is the mean of $Y_i$. In regression, $R^2$ is a statistical measure of how well the regression line approximates the real data points. The value of $R^2$ usually falls between 0 and 1. In general, the closer the value of $R^2$ is to 1, the better the prediction is. RMSE is another statistical measure of prediction performance, which quantifies the difference between the predicted VSM and its corresponding actual value. A smaller RMSE value means a better performance, and the value of RMSE depends on the base magnitude of the specific object to be assessed. Table 1 shows the accuracy of the VSA model, in which the $R^2$ value is close to 0.99 and the RMSE value is less than 0.02. $R^2 \geq 0.90$ is acceptable based on experimental results [28] and was used as a basic requirement of prediction accuracy in this work. Therefore, this result indicates that the VSA model had a desirable ability for VSM prediction.
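Both accuracy measures are easy to compute directly. The following Python sketch uses the standard definitions of R² (one minus the ratio of residual to total sum of squares) and RMSE; the function name and the toy values are illustrative only.

```python
import math
from typing import Sequence, Tuple


def r2_and_rmse(actual: Sequence[float],
                predicted: Sequence[float]) -> Tuple[float, float]:
    """Return (R^2, RMSE) for actual and predicted VSM values."""
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / n
    ss_res = sum((y - yp) ** 2 for y, yp in zip(actual, predicted))
    ss_tot = sum((y - mean_actual) ** 2 for y in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / n)


perfect = r2_and_rmse([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mean_only = r2_and_rmse([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

A perfect prediction gives R² = 1 and RMSE = 0, while always predicting the mean gives R² = 0. Predictions worse than the mean give a negative R², which is why the text says the value usually, rather than always, falls between 0 and 1.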

Application to a Larger System
To further verify the performance of the proposed framework in this paper, the VSA model was applied to a practical 1648-bus system provided by PSS/E, which contains 1648 buses, 313 generators, 182 shunts, and 2294 transmission lines. The method of building the knowledge base was the same as that for the IEEE 30-bus system. A total of 34,367 variables were extracted from the operation data of the 1648-bus system, and 16,579 records were generated for the VSA model. The feature preprocessing and feature selection procedures were executed before the IRF training procedure in the tests. The statistical accuracy of the assessment is shown in Table 1.

Comparison with Different Regression Tools
The VSA model was compared with some conventional VSM prediction tools to verify its advantages in regression prediction. A comparison of the accuracy results of different tools is shown in Table 2, including logistic regression (LR), SVM, regression tree (RT), ANN, and ELM. In the tests, the same number of input features was used for each tool. Table 2 shows that the VSA model based on IRF performed better in VSM prediction than some conventional tools, and the advantages of the VSA model based on IRF are summarized as follows.
(1) For LR and SVM, it is an arduous task to train a massive number of samples. In particular, the accuracy may not be acceptable when LR is applied to a large-scale feature space [29]. Due to the attribute of IRF for accommodating large-scale data and the feature selection procedure, the VSA model is able to train massive samples efficiently.
(2) For RT, missing data situations cannot be effectively handled. In addition, overly complex rules may be established when the depth of a tree is large [30]. Compared with RT, IRF has a parallel data processing structure with multiple trees, which can provide sufficient alternative choices of feature sets to overcome missing data.
(3) For ANN and ELM, the high calculation cost is an obvious problem for the implementation of online VSA. The iterative tuning and slow learning speed may lead to a large consumption of computational resources when ANNs and ELMs are trained for prediction with large-scale data. Because of the rapid calculation of the IRF regressor and screening of the key variables in advance, the computational burden can be significantly reduced.

Robustness Assessment
In addition to accuracy-based measures, the robustness of the VSA model was also assessed. In this paper, some impact factors of system operation were studied, such as topology change, variation of peak/minimum load, and variation of generator/load power distribution.
(1) Topology Change: Different network topologies were tested in this study, and some of them are shown in Table 3. The corresponding test results are shown in Figure 7, where $R^2$-30 and RMSE-30 represent the accuracies of the tests for the IEEE 30-bus system. Similarly, $R^2$-1648 and RMSE-1648 represent the accuracies of the tests for the 1648-bus system.
(2) Variation of Peak/Minimum Load: The impact of different peak/minimum load ranges on the assessment accuracy was examined, and the test results are shown in Figure 8. Although fluctuation occurs in the prediction accuracy, the accuracy can still be maintained in an acceptable range ($R^2 \geq 0.90$) for the two systems.
(3) Variation of Generator/Load Power Distribution: Different generator/load power distributions were taken into account for testing, and the corresponding results are shown in Figure 9.
It should be noted that the variation ranges (the variation of peak/minimum load and the variation of generator/load power distribution) are based on the original loads and power distribution, respectively.
According to the above test results, it can be seen that the VSA model can provide desirable prediction accuracy for variable operation conditions, and good robustness of the model is demonstrated.

Table 3. Different network topologies for the two systems.
Figure 9. Tests for variation of generator/load power distribution.

Impact of PMU Measurement Errors
In practice, although PMUs are precision level measurement units, there is a possibility that the signal processing may introduce some errors in the phasor calculations. Generally, PMUs that are Level 1 compliant with the standard should provide a total vector error (TVE) of less than 1% [31]. The TVE is the vector difference between the exact applied signal and the measured one. The impact of PMU measurement errors on the prediction performance was studied, and two scenarios were tested as follows.
(1) Noise was added only to the test set.
(2) Noise was added to both the training set and test set.
The test results for the considered scenarios are summarized in Table 4. The results show that the VSA model can provide an acceptable prediction accuracy when PMU measurement errors are considered. In addition, the model trained with measurement errors included in the training set performs better than the model trained without them.
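One way to emulate such measurement errors is to perturb each phasor so that its TVE stays within the 1% bound. The Python sketch below does this by adding a random complex error drawn uniformly from a disc whose radius is 1% of the phasor magnitude. This noise model is an assumption chosen for illustration, since the distribution used in the tests is not specified here, and the function names are hypothetical.

```python
import cmath
import math
import random


def total_vector_error(measured: complex, true: complex) -> float:
    """TVE: magnitude of the vector difference relative to the true phasor."""
    return abs(measured - true) / abs(true)


def add_pmu_noise(phasor: complex, max_tve: float = 0.01,
                  rng: random.Random = None) -> complex:
    """Return the phasor with a random error whose TVE is at most max_tve."""
    rng = rng or random.Random()
    # The square root makes the error uniform over the disc area.
    radius = max_tve * abs(phasor) * math.sqrt(rng.random())
    angle = rng.uniform(0.0, 2.0 * math.pi)
    return phasor + cmath.rect(radius, angle)


rng = random.Random(0)
true_voltage = cmath.rect(1.02, 0.3)  # hypothetical bus voltage in per unit
noisy_samples = [add_pmu_noise(true_voltage, rng=rng) for _ in range(1000)]
worst_tve = max(total_vector_error(v, true_voltage) for v in noisy_samples)
```

Applying the same perturbation only to the test set, or to both the training and test sets, reproduces the structure of the two scenarios listed above.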

Impact of Training Set Size
To explore the impact of training set size on prediction accuracy, a series of training sets with different sizes was tested, and 5%, 10%, 30%, 50%, 80%, and 100% of the original training sets were used in each test. The overall accuracies of the tests for the two systems are given in Figure 10 and Figure 11. It can be seen that sufficient training samples are required to ensure a high accuracy for VSA.
According to the replicated test results, approximately 10% of the original training samples are sufficient to provide an acceptable assessment accuracy ($R^2 \geq 0.90$) for the IEEE 30-bus system, and approximately 20% of the original training samples are needed for the 1648-bus system. Therefore, operators can conveniently choose an appropriate set size according to the required assessment accuracy.

Data Processing Speed
Data processing speed was also a concern for the online application of the VSA model. In practice, the sampling frequency of PMUs for system operation data is at least 30 times per second. Accordingly, to achieve a VSA for each snapshot, the processing time of PMU data should be less than 0.033 s [32]. Therefore, the capability of making full use of fast updated PMU data is essential for realizing online assessment.
The computation time results of the VSA model are summarized in Table 5. It is observed that a new operation point can be assessed in less than 0.002 s for both the IEEE 30-bus system and the 1648-bus system. Therefore, the data processing speed of the VSA model is rapid enough to satisfy the requirements of online application.
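The timing argument can be checked with a line of arithmetic, written out below in LaTeX. It uses only the two figures quoted above: a reporting rate of 30 frames per second and an assessment time of at most 0.002 s per operation point. The symbols are introduced here for illustration.

```latex
T_{\text{frame}} = \frac{1}{30}\ \text{s} \approx 0.033\ \text{s}, \qquad
\frac{T_{\text{assess}}}{T_{\text{frame}}} \le \frac{0.002\ \text{s}}{0.033\ \text{s}} \approx 0.06
```

The assessment therefore uses roughly 6% of the interval between consecutive PMU snapshots, which leaves most of the frame for data transfer and preprocessing.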

Conclusions
A data-driven and data-based framework for online VSA is proposed in this paper. The proposed framework is based on feature preprocessing, feature selection, and regression prediction. Using the feature preprocessing approach, redundant and irrelevant features can be removed from collected PMU data to improve computational efficiency. To further screen out the key variables highly related to the VSM, the feature selection procedure is completed based on PMI due to its advantages in exploring associations. IRF is applied to achieve the VSM prediction, which can effectively overcome the deficiencies of some conventional regression tools in VSA. The proposed framework supports model updating and adapts to unforeseen changes in system conditions. The desirable performance of the framework was demonstrated by the tests for the IEEE 30-bus system and a practical 1648-bus system. The robustness of the framework for some impact factors of system operation was also verified, including topology change, variation of peak/minimum load, and variation of generator/load power distribution.

Conflicts of Interest:
The authors declare no conflicts of interest.