Optimization for Data-Driven Preventive Control Using Model Interpretation and Augmented Dataset

Transient stability preventive control (TSPC) ensures that power systems have a sufficient stability margin by adjusting power flow before faults occur. The generation of TSPC measures requires accuracy and efficiency. In this paper, a novel model interpretation-based multi-fault coordinated data-driven preventive control optimization strategy is proposed. First, an augmented dataset covering the fault information is constructed, enabling the transient stability assessment (TSA) model to discriminate the system stability under different fault scenarios. Then, the adaptive synthetic sampling (ADASYN) method is implemented to deal with the imbalanced instances of power systems. Next, an instance-based machine model interpretation tool, Shapley additive explanations (SHAP), is embedded to explain the TSA model’s predictions and to find out the most effective control objects, thus narrowing the number of control objects. Finally, differential evolution is deployed to optimize the generation of TSPC measures, taking into account the security and economy of TSPC. The proposed method’s efficiency and robustness are verified on the New England 39-bus system and the IEEE 54-machine 118-bus system.


Background
The power system is a strategic system for national economic development, and its safety and stability are essential to a reliable electrical power supply. As society develops, the network or the operation of power systems is at times unable to keep pace with power supply demand, which pushes the operating point close to the stability limit and poses a significant challenge to the dispatch and control of power systems. The increasing uncertainty of intermittent renewable generation and load demand, together with transmission line unavailability, threatens the security of power systems [1][2][3]. Transient instability is often the main reason for large-scale accidents in power systems. Practical transient stability assessment (TSA) and transient stability preventive control (TSPC) of power systems are the keys to preventing accidents [4,5]. TSPC is usually performed as power generation rescheduling. The safe operation of power systems is of primary concern. However, in the context of the electricity market and sustainable development, the economic operation of power systems cannot be ignored either. Therefore, a trade-off between safety, economy, and efficiency needs to be considered in preventive control.

Research Review
In most studies, the generation of TSPC strategies is defined as a global mathematical optimization problem, usually solved by dynamic security-constrained optimal power flow [6]. The dynamic security constraints include small signal stability constraints [7], voltage stability constraints [8], and transient stability constraints [9][10][11][12].
In this paper, we focus on the transient stability constrained optimal power flow (TSCOPF). A variety of TSCOPF methods have been proposed in the past few years. TSCOPF-based TSPC is generally divided into two steps: (1) Define transient stability constraints according to the state variables of power systems and the transient stability index (TSI), ensuring that the generators in the power systems can maintain synchronization after expected failures. Some mainstream TSIs are classified as follows: (a) transient energy [4], determined by the kinetic and potential energy of a post-fault power system; (b) stability margin [5] or critical clearing time [9] based on extended equal area criterion; and (c) the maximum rotor angle difference of the system in the transient process [10][11][12].
Machine learning was introduced to construct transient stability constraints to meet the rapid generation of TSPC strategies [5][31][32][33][34]. With the characteristics of offline training and online real-time prediction, machine learning models can quickly assess the stability of power systems and judge whether it is within the transient stability constraints, realizing the rapid generation of TSPC strategies. However, there are still two problems to be solved.
The first problem is that there are too many expected contingency scenarios, and thus, it is unrealistic to construct TSA models for each given contingency scenario. Therefore, it is necessary to build a synthetic TSA model suitable for all expected faults.
The second problem is that many generators exist in practical power systems, where it is not economical to schedule the generators globally. Therefore, the scope of dispatch control needs to be narrowed to meet the requirements of precision and economy.

Contributions
The contribution of this paper is an enhanced data-driven preventive control strategy that searches for the global optimum of TSCOPF more effectively and precisely by using model interpretation and an augmented dataset.

• In response to the first problem, an augmented dataset for training the TSA model is constructed by adding fault information to the traditional TSA dataset, thus improving the accuracy and the generalization ability of the TSA model. The novel TSA model trained by the proposed dataset can adapt to the multi-contingency condition without much additional computational complexity. In addition, adaptive synthetic sampling (ADASYN) is adopted to generate a balanced dataset by synthesizing a group of minority class instances.
• As for the second problem, the most effective control objects are screened to narrow the adjustment objects in TSCOPF using an instance-based interpretation method, Shapley additive explanations (SHAP). Unlike some model-based global analysis approaches, SHAP can locate the most influential control objects for the individual current operation point, thus accomplishing more effective preventive control. Furthermore, differential evolution (DE) is implemented to solve TSCOPF to minimize the operation and control cost, considering the transient stability constraints and economic factors.
The rest of this paper is structured as follows. Section 2 gives a detailed introduction of the proposed TSPC methodology, including a general framework and some specific techniques. Section 3 presents the numerical results of the proposed strategy in the two benchmark systems, the New England 39-bus system and the IEEE 54-machine 118-bus system, verifying the feasibility of the proposed method. Finally, the main conclusions and prospects of this paper are summarized and highlighted in Section 4.

General Framework
The general framework of the proposed method is divided into two stages, an offline stage and an online stage, as illustrated in Figure 1. TSA machine learning models are trained by a time-domain simulation (TDS) dataset covering fault information in the offline stage. ADASYN is deployed to deal with the imbalance in the TDS dataset. In addition, the dataset is preprocessed by outlier detection and normalization. Then, the TSA models, including a predictor and a classifier, are generated. The predictor outputs the predicted stability probability, and the classifier implements instance classification according to the predicted stability probability and a predetermined classification threshold.
In the online stage, the data of power systems are obtained in real time. The trained classifier is used to monitor the N − 1 stability of the current operating status of power systems. Once instability is identified, the SHAP mechanism is triggered. The generators or loads that have a more significant influence on the stability are regarded as candidate control objects. Furthermore, TSCOPF is solved to calculate the control amount, where the predictor provides the transient stability constraint. Next, N − 1 TSA is performed on the preliminarily generated TSPC measures by TDS or machine learning. If the stability check passes, TSPC is activated. Otherwise, the control objects and their amounts are readjusted.

Dataset Covering Fault Information
Since TSPC is implemented before the fault, the data of the TSA models applied to TSPC must also be the pre-fault steady-state variables of power systems. In most of the literature, a set of TSA models are constructed based on specific expected faults. With the increase in faults, the cost of model training, storage, and maintenance increases, which is not suitable for large-scale power systems. In addition, since there is a certain similarity or correlation in fault characteristics and in the post-fault state of power systems for different faults, the TSA models between different faults can be aggregated. Based on the original TSA model dataset, the information of the expected line faults is included, as shown in Table 1, aiming to aggregate the stability laws of different faults. The datasets for other types of failures can be constructed similarly.
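As a rough illustration of how such an augmented dataset might be assembled, the sketch below pairs each pre-fault operating point with each expected fault and concatenates the two feature vectors. The fault encoding used here (line index, fault location, fault duration) is a hypothetical choice for illustration only; the actual fields are those listed in Table 1.

```python
from itertools import product


def augment_with_faults(operating_points, faults):
    """Pair every pre-fault operating point with every expected fault.

    operating_points: list of pre-fault steady-state feature vectors.
    faults: list of fault descriptors (hypothetical encoding:
            [line index, location in p.u. of line length, duration in s]).
    Returns one input row per (operating point, fault) pair.
    """
    return [list(op) + list(fault) for op, fault in product(operating_points, faults)]


# Two toy operating points (e.g. two voltages and one line flow) and two faults.
operating_points = [[1.02, 0.98, 250.0], [1.01, 0.97, 310.0]]
faults = [[1, 0.5, 0.1], [2, 0.2, 0.3]]
rows = augment_with_faults(operating_points, faults)
```

With this layout a single TSA model sees the fault as part of its input, so one model can serve all expected faults instead of one model per fault.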

Adaptive Synthetic Sampling in TSA
TSA based on machine learning is essentially a classification problem, which requires sufficient and relatively balanced data to train a TSA model. If the data are imbalanced, the trained model is unreliable. For example, in 100 instances, category A accounts for 95% while category B accounts for only 5%. Suppose that the trained model discriminates all instances as A. Then, the accuracy is 95%, even though the model never identifies a single instance of category B, so accuracy alone does not reflect the model's performance. In power systems, the amount of the actual instability data is small, and at the same time, the instances of TDS may also face the problem of an imbalance. Therefore, in this paper, ADASYN is introduced in the framework of TSA.
The core idea of ADASYN is to generate different numbers of new instances of minority classes according to their learning levels. The level of learning for a minority instance is expressed by the number of majority-class instances among its nearest neighbors. ADASYN adaptively shifts the classification boundary toward difficult-to-learn instances and reduces the bias introduced by the initial imbalanced data distribution. The ADASYN algorithm is described in detail in Reference [35].
The performance indices, including accuracy, precision, recall, F1_score, and G_mean, are formulated as (1)-(5), where TP represents the number of instances truly predicted as positive, TN represents the number of instances truly predicted as negative, FP represents the number of instances falsely predicted as positive, and FN represents the number of instances falsely predicted as negative.
Precision is intuitively the ability of the classifier to not label a sample that is negative as positive. Recall is intuitively the ability of the classifier to find all of the positive samples. The F1_score is the harmonic mean of the precision and recall, where an F1_score reaches its best value at 1.0 and its worst at 0. The relative contributions of precision and recall to the F1_score are equal. G_mean characterizes the average accuracy of minority and majority classes without sacrificing any category.
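The five indices can be computed directly from the confusion-matrix counts. The sketch below uses the standard textbook definitions, with G_mean taken as the geometric mean of recall and specificity (an assumption about the exact form intended here). It is applied to the 95%/5% example above, treating category B as the positive class and a model that labels everything as A.

```python
import math


def tsa_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, F1_score and G_mean from confusion counts."""

    def ratio(num, den):
        # Convention used in this sketch: an undefined ratio (0/0) counts as 0.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)        # true positive rate
    specificity = ratio(tn, tn + fp)   # true negative rate
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "g_mean": math.sqrt(recall * specificity),
    }


# 95 instances of A (negative), 5 of B (positive), everything predicted as A.
m = tsa_metrics(tp=0, tn=95, fp=0, fn=5)
```

In this example the accuracy is 0.95 while recall, F1_score, and G_mean are all 0, which is why G_mean is reported alongside accuracy for imbalanced TSA data.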

Determination of Control Objects Based on Shapley Values
In general, deep learning models are a kind of black box. Although the prediction results of deep learning models can be easily obtained, the internal prediction logic for specific instances is hidden from their users. Due to the requirements of safety and reliability of power systems, the unexplainable TSA models make it difficult for operators and dispatchers to trust and apply them thoroughly. Model interpretation provides a way to open the black-box models, making models more informative and easier to understand.
Strumbelj, E. et al. [36] proposed an instance-based model interpretation method, SHAP, introducing Shapley values to interpret machine learning models in the image and text recognition domains. Shapley values represent a kind of feature importance, computed based on a game-theoretic solution with the idea of permutation. The specific definition of Shapley values is as follows: The importance of the ith feature for a specific sample x is defined as ϕ i (x).
In this definition, Q ⊆ {A 1 , A 2 , ..., A n } is a certain subset of features in a given sample x ∈ A, and ∆(Q)(x) represents its influence. Function f : A → R is a regression model. For a classification model, f could be the probabilistic prediction of classification. p is the probability of a given sample y ∈ A. τ(x, y, W) = (z 1 , z 2 , ..., z n ), where z i = x i iff i ∈ W and z i = y i otherwise. π(n) is the set of all permutations of n elements, and Pr i (O) is the set of all features that precede the ith feature in permutation O ∈ π(n).
The approximating algorithm for the Shapley value is intuitively shown in Algorithm 1, where the permutation O and the assistant sample y are selected at random. K is the number of background data used to calculate the Shapley value.
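The sampling idea can be sketched as follows. This is an illustrative reading of the permutation-based approximation, not a transcription of Algorithm 1: the function name, the argument `n_draws`, and the toy linear model are assumptions introduced here. In each draw, a random permutation O and a random background sample y are chosen, and the model is evaluated on τ(x, y, W) with and without feature i in W.

```python
import random


def shapley_sampling(f, x, i, background, n_draws=200, seed=0):
    """Monte Carlo estimate of the Shapley value of feature i for sample x."""
    rng = random.Random(seed)
    n = len(x)
    total = 0.0
    for _ in range(n_draws):
        order = list(range(n))
        rng.shuffle(order)                      # random permutation O
        y = rng.choice(background)              # random assistant sample y
        before = set(order[:order.index(i)])    # features preceding i in O
        # tau(x, y, W): take x_j for j in W and y_j otherwise
        with_i = [x[j] if (j in before or j == i) else y[j] for j in range(n)]
        without_i = [x[j] if j in before else y[j] for j in range(n)]
        total += f(with_i) - f(without_i)
    return total / n_draws


def toy_model(z):
    """Hypothetical linear model used only to check the estimator."""
    return 2.0 * z[0] - 3.0 * z[1] + 0.5 * z[2]


x = [1.0, 2.0, 3.0]
reference = [[0.0, 0.0, 0.0]]   # single reference value as background
phi = [shapley_sampling(toy_model, x, i, reference) for i in range(3)]
```

For a linear model and a single reference sample, each draw returns the same contribution w_i (x_i − y_i), so the estimate is exact and the values sum to the difference between the prediction for x and the prediction for the reference.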
The background dataset is used for integrating the features. For small-scale problems, this background dataset can be the whole training set, but for larger-scale problems, consider using a single reference value or using the k-means function to summarize the dataset, which will reduce the time cost of interpretation. If background data are used to calculate the Shapley values without being summarized, the operation matrix would be too large, resulting in insufficient allocated memory. The classical k-means algorithm is as follows.

• Step 1: Select K samples from the dataset x as the initial clustering centers c randomly.
• Step 2: For each sample x_i, calculate its distances to the K clustering centers and classify the sample to the class C corresponding to the nearest clustering center.
• Step 3: Recompute the clustering center c_k for each class C_k: c_k ← (1/|C_k|) ∑_{x ∈ C_k} x.
• Step 4: Repeat steps 2 and 3 until the termination conditions are satisfied, such as the convergence of clustering centers or the maximum number of iterations.
Algorithm 1 Calculating Shapley value ϕ i (x), the importance of the ith feature for sample x and model f. Take K samples.
In order to improve the convergence speed of k-means in the iterative process and to reduce the time elapsed, k-means++ algorithm [37] is introduced to initialize the cluster centers. The k-means++ algorithm is as follows.

• Step 1: Select one sample of the dataset x as the initial clustering center c_1 randomly.
• Step 2: Calculate the minimum distance of each sample x_i to the current clustering centers, represented by D(x_i).
• Step 3: For each sample, calculate the probability of being selected as the next clustering center, P(x_i) = D(x_i)^2 / ∑_j D(x_j)^2.
• Step 4: Determine the next clustering center with the roulette wheel method.
• Step 5: Repeat steps 2, 3, and 4 until K clustering centers are identified.
• Step 6: The remaining procedures are the same as steps 2, 3, and 4 in the classical k-means.
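The two procedures above can be combined into one short routine. The following is a minimal sketch in plain Python, assuming Euclidean distance, at least K distinct samples, and termination when the centers stop changing; the function names are illustrative. It also returns the sum of squared distances of the samples to their closest center.

```python
import math
import random


def kmeans_pp_init(X, K, rng):
    """k-means++ seeding: spread the initial centers out."""
    centers = [list(rng.choice(X))]                    # one random sample
    while len(centers) < K:
        # squared distance of each sample to its nearest current center
        d2 = [min(math.dist(x, c) ** 2 for c in centers) for x in X]
        # roulette wheel: selection probability proportional to D(x_i)^2
        centers.append(list(rng.choices(X, weights=d2, k=1)[0]))
    return centers


def kmeans(X, K, max_iter=100, seed=0):
    """Classical k-means iterations started from k-means++ centers."""
    rng = random.Random(seed)
    centers = kmeans_pp_init(X, K, rng)
    for _ in range(max_iter):
        clusters = [[] for _ in range(K)]
        for x in X:                                    # assign to nearest center
            nearest = min(range(K), key=lambda j: math.dist(x, centers[j]))
            clusters[nearest].append(x)
        new_centers = [
            [sum(col) / len(c) for col in zip(*c)] if c else centers[k]
            for k, c in enumerate(clusters)            # mean of each cluster
        ]
        if new_centers == centers:                     # converged
            break
        centers = new_centers
    ssd = sum(min(math.dist(x, c) ** 2 for c in centers) for x in X)
    return centers, ssd


points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centers, ssd = kmeans(points, K=2)
```

The K returned centers can then serve as the summarized background dataset for the Shapley value calculation in place of the full training set.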
Moreover, the Shapley values of features have the useful property of being implicitly normalized, satisfying (3), where f base is the average prediction of all background samples. Thus, the interpretation results can explain how the machine learning model's output is pushed from the base value to the final prediction by each feature's influence.
In power systems, especially in TSA and TSPC, unstable instances are often of great concern. In the TSPC stage, dispatchers scan and assess N − 1 contingency in the entire power system, looking for volatile failures to improve the stability margin of the power system by rescheduling the power generation. In this paper, SHAP is used to obtain each feature's Shapley values and to explain the unstable instances. Then, the features are ranked according to Shapley values, representing the feature importance for the instances. The top-k important features are taken. Since the objects of TSPC must be controllable, the importance of controllable features needs to be identified and ranked. Define the importance of the ith controllable feature (I ctrl (i)) as the weighted sum of the absolute values of the Spearman correlation coefficients (cc) of the top-k important features and the controllable features, where the weights are the absolute values of Shapley values (ϕ) of the top-k important features.
Define m as the number of the control objects of TSPC. The alternative identification strategies of m are as follows.
According to a ranking of the importance of controllable features, the first-m controllable features with a strong correlation with system instability are selected as the candidate control objects. As a result, the scope of control objects is narrowed, laying the foundation for an economical TSPC.
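The ranking of controllable features described above can be sketched as follows. The code implements the verbal definition of I ctrl (i) as the sum over the top-k features j of |ϕ j | · |cc(i, j)|, with a small pure-Python Spearman coefficient. All feature names and numbers in the example are made up for illustration.

```python
def _ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda t: values[t])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for t in order[pos:end + 1]:
            ranks[t] = (pos + end) / 2 + 1
        pos = end + 1
    return ranks


def spearman(a, b):
    """Spearman correlation: Pearson correlation of the ranks."""
    ra, rb = _ranks(a), _ranks(b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((p - ma) * (q - mb) for p, q in zip(ra, rb))
    va = sum((p - ma) ** 2 for p in ra)
    vb = sum((q - mb) ** 2 for q in rb)
    return cov / (va * vb) ** 0.5 if va and vb else 0.0


def controllable_importance(columns, shap_values, controllable, top_k):
    """Rank controllable features by sum_j |phi_j| * |cc(i, j)| over the top-k j.

    columns: feature name -> values over the dataset
    shap_values: feature name -> Shapley value for the unstable instance
    """
    top = sorted(shap_values, key=lambda n: abs(shap_values[n]), reverse=True)[:top_k]
    scores = {
        c: sum(abs(shap_values[j]) * abs(spearman(columns[c], columns[j])) for j in top)
        for c in controllable
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up data: four features observed over five operating points.
columns = {
    "P_AC1": [1, 2, 3, 4, 5],
    "P_L4": [5, 3, 4, 1, 2],
    "P_G10": [2, 4, 6, 8, 10],
    "P_G3": [3, 1, 2, 5, 4],
}
shap_values = {"P_AC1": 0.4, "P_L4": -0.1, "P_G10": 0.2, "P_G3": 0.01}
ranking = controllable_importance(columns, shap_values, ["P_G10", "P_G3"], top_k=2)
```

The first m entries of the returned ranking would then be taken as the candidate control objects.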

TSCOPF Based on Differential Evolution
A TSCOPF problem can be formulated as the minimum of the sum of control cost and operation cost with three types of constraint conditions, (13)- (20). The constraints of TSCOPF include (1) power flow constraints, (2) steady-state constraints (upper and lower limits constraints of generator active, reactive power output, and bus voltage magnitude, and thermal limits of lines, respectively), and (3) transient stability constraints.
In the above formula, (13) is the cost function; P Gi is the active power of generator i; ∆P Gi is the adjustment of the active power of generator i; C Gi is the unit adjustment cost of the active power of generator i; n G is the number of generators; a i , b i , and c i are the coefficients of the operation cost function of generator i; (14) and (15) are the active and reactive power balance constraints; P G and Q G are the vectors of generator active and reactive power output, respectively; P L and Q L are the vectors of active and reactive load, respectively; V and θ are the vectors of bus voltage magnitudes and angles, respectively; (16)- (19) are the steady-state constraints; P Gmin , P Gmax , Q Gmin , Q Gmax , V min , and V max are the vectors of the lower and upper limits of the generator's active and reactive power outputs, and the bus voltage magnitudes, respectively; S(V, θ) is the vector of apparent power flowing across the transmission lines, and S max is the vector of thermal limits of those lines; (20) is the transient stability constraints; f (·) is the function of the security boundary given by dynamic variables; and f thr is a suitable threshold value.
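For orientation, one formulation consistent with the symbol definitions above is sketched below. It is a reconstruction under assumptions rather than the paper's exact equations: the quadratic form of the operation cost and the order of its coefficients, the absolute value on the adjustment, and the symbols P(V, θ) and Q(V, θ) for the network power injections are introduced here for illustration.

```latex
\begin{align}
\min_{P_G}\quad & \sum_{i=1}^{n_G} \left( C_{Gi}\,\lvert \Delta P_{Gi} \rvert
                  + a_i P_{Gi}^2 + b_i P_{Gi} + c_i \right) \\
\text{s.t.}\quad & P_G - P_L - P(V,\theta) = 0 \\
& Q_G - Q_L - Q(V,\theta) = 0 \\
& P_{G\min} \le P_G \le P_{G\max} \\
& Q_{G\min} \le Q_G \le Q_{G\max} \\
& V_{\min} \le V \le V_{\max} \\
& \lvert S(V,\theta) \rvert \le S_{\max} \\
& f(\cdot) \ge f_{thr}
\end{align}
```

Read in order, the eight relations correspond to the cost function (13), the power balance constraints (14) and (15), the steady-state limits (16)-(19), and the transient stability constraint (20).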
The control cost of the active power of generators can further be divided into an upward adjustment cost and a downward adjustment cost. In order to simplify the case analysis, the control cost is not considered in the case.
Transient stability constraints are constructed by TSA machine learning models. The stability probability of TSA is taken as the security boundary, f (·). For a binary classification problem, the threshold, f thr is usually set to 0.5. However, for power system stability control, the threshold can be increased appropriately.
In addition, TSI is based on the maximum rotor angle difference (∆δ max ) of the power system. When TSI is less than 0, the system is unstable; when TSI is greater than 0, the system is stable.
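A commonly used index of this kind is TSI = (360° − |∆δ max |) / (360° + |∆δ max |). The sketch below assumes this form, which matches the sign convention stated above and the range of TSI values reported later, but the exact expression used in the paper may differ.

```python
def transient_stability_index(delta_max_deg):
    """TSI from the maximum rotor angle difference in degrees (assumed form)."""
    d = abs(delta_max_deg)
    return (360.0 - d) / (360.0 + d)


def is_stable(delta_max_deg):
    """Label an instance as stable when its TSI is positive."""
    return transient_stability_index(delta_max_deg) > 0.0
```

Under this form, a maximum angle difference of 120° gives a TSI of 0.5, and any difference above 360° gives a negative TSI and hence an unstable label.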
The differential evolution (DE) algorithm is a population-based heuristic search algorithm that is suitable for continuous spaces and can achieve multi-objective optimization. Therefore, in this paper, TSCOPF is solved by using DE (DE-TSCOPF). The DE algorithm originates from the idea of the genetic algorithm (GA) and includes mutation, crossover, and selection. Different from GA, in the genetic process of DE, the mutation vector is generated from the differential vector of parental individuals and is crossed with the parental individual vector to generate a new individual vector. Then, the dominant individual is selected between the newborn individual and its parental individual. The DE algorithm is described in detail in Reference [38].
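The mutation, crossover, and selection steps can be sketched as below. This is a generic DE/rand/1/bin variant with constraint handling by a penalty term, written as an assumption-laden illustration rather than the implementation of Reference [38]. The toy dispatch problem, its cost coefficients, and the linear stand-in for the TSA predictor are all hypothetical.

```python
import random


def differential_evolution(cost, bounds, pop_size=20, F=0.5, CR=0.9,
                           generations=200, seed=0):
    """Minimize `cost` over box `bounds` with DE/rand/1/bin (pop_size >= 4)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fit = [cost(ind) for ind in pop]
    for _ in range(generations):
        for t in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != t], 3)
            j_rand = rng.randrange(dim)        # at least one gene is mutated
            trial = []
            for d in range(dim):
                if rng.random() < CR or d == j_rand:
                    v = pop[a][d] + F * (pop[b][d] - pop[c][d])  # mutation
                    lo, hi = bounds[d]
                    v = min(max(v, lo), hi)                      # keep in limits
                else:
                    v = pop[t][d]                                # crossover
                trial.append(v)
            f_trial = cost(trial)
            if f_trial <= fit[t]:                                # selection
                pop[t], fit[t] = trial, f_trial
    best = min(range(pop_size), key=lambda j: fit[j])
    return pop[best], fit[best]


def penalized_cost(p, demand=500.0, f_thr=0.7):
    """Toy two-generator dispatch: p[0] is controlled, the slack unit balances."""
    p0 = p[0]
    p1 = demand - p0
    operation = 0.01 * p0 ** 2 + 1.0 * p0 + 0.01 * p1 ** 2 + 3.0 * p1
    stability = 1.0 - p0 / 400.0               # stand-in for the TSA predictor
    return operation + 1e6 * max(0.0, f_thr - stability)


best_p, best_cost = differential_evolution(penalized_cost, [(0.0, 400.0)])
```

In this toy problem the unconstrained cost optimum lies at 300 MW for the controlled generator, while the stand-in stability constraint caps it at 120 MW, so the search is expected to settle close to that limit. Any other solver that accepts the same penalized cost could replace the DE routine.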
Our optimization strategy of TSPC is universal. Besides DE, most other approaches of TSCOPF can be suitably incorporated into our strategy, whether it is a heuristic algorithm or a mathematical approach.

Results and Discussion
In order to demonstrate the applicability of the proposed method, two benchmark systems are tested, including the New England 10-machine 39-bus system and the IEEE 54-machine 118-bus system. The simulations and tests are implemented on a PC with an Intel Core i5-2400 CPU and 4.00 GB of RAM, and the real-time measurement is replaced by the simulated data. The power flow calculation results are obtained using the MATPOWER 6.0 package, and TDS is performed in PSAT.

Dataset Generation
The single line diagram of the New England 10-machine 39-bus system is illustrated in Figure 2. The expected faults are determined based on the N − 1 contingency criterion. The operation data are generated with the Monte Carlo method for each fault, and TDS is performed. All faults are set as the most severe three-phase metallic short-circuit fault with a transient resistance of 0.001 Ω. The duration is randomly selected within 0.1-0.3 s, and the fault location is randomly selected within 0-100%. The data are labeled by TSI after TDS. Finally, an augmented dataset covering fault information is obtained.
In the simulations, all data about the fault of line AC 22 (BUS  in the augmented dataset are labeled as unstable, indicating that neither the TDS method nor the machine learning method can generate TSPC measures against the line fault. In this case, the fault of line AC 22 is handled in the emergency control stage. Therefore, it is assumed in this paper that the fault on line AC 22 is ignored during the preventive control stage.
After eliminating the outliers, a total of 2947 stable instances and 221 unstable instances are generated, indicating that the dataset is imbalanced. After implementing ADASYN, a balanced dataset is finally obtained, containing 2947 stable instances and 2518 unstable instances.

Transient Stability Assessment
Multi-layer perceptron (MLP) is selected as the prototype of TSA models. The structure of MLP is [500-100-50-10]. In total, four TSA models are tested, including the original TSA model, the TSA model with a dataset covering fault information, the TSA model with a balanced dataset augmented by ADASYN, and the proposed TSA model constructed using both dataset-augmenting techniques listed above. The performance of the TSA models is shown in Table 2. The accuracy of the TSA model trained in the proposed method is 99.2% and the G_mean index is 99.2%, both higher than those of the original model and the models constructed using the single dataset-augmenting technique.
The accuracy of the original TSA model is 95.6%. The accuracy of the model trained with the dataset covering the fault information is 96.1%. The accuracy of the model trained with the ADASYN-balanced dataset is 98.8%. Both are higher than that of the original model. In particular, the G_mean index of the TSA model with the ADASYN-balanced dataset is higher than that of the original model. Other performance indices are also improved to a certain extent. The scheme proposed in this paper combines the advantages of the above two dataset-augmenting methods. The accuracy of the trained TSA model is 99.2%, 3.6 percentage points higher than the original model. The G_mean index of the trained TSA model is 99.2%, 24.2 percentage points higher than the original model. The test results show that the proposed model is superior to the original model and the models constructed using the single dataset-augmenting technique.
Conclusions can be drawn from the performance of the TSA models as follows: (a) The dataset including the fault information can improve the model performance to a certain extent; (b) ADASYN can effectively tackle the problem of data imbalance in power systems.

Control Object Determination
A specific operational scenario in the New England 10-machine 39-bus system is assumed to be an actual scenario discriminated as unstable by the N − 1 TSA. SHAP is applied to the unstable scenario. The features with the top-10 Shapley values are illustrated in Figure 3. The interpretation results of SHAP show that the features, including P AC1 , P AC12 , Q AC1 , P L4 , and P G10 , contribute a lot to the instability of the scenario. These features are partly controllable and partly uncontrollable. Assuming that the controllable objects available for dispatch include all generators except for the frequency modulation generator, P G2 , the importance of each controllable feature is calculated and ranked, as shown in Table 3. The TSPC with m control objects, including the frequency modulation generator, is defined as TSPC m . For example, TSPC 0 represents the initial operational scenario without any control objects; TSPC 7 represents the operational scenario with seven candidate control objects including P G1 , P G2 , P G6 , P G7 , P G8 , P G9 , and P G10 . Note that the frequency modulation generator is automatically set as one of the controllable objects to ensure the power balance of the system, except in TSPC 0 .

Control Amount Determination
Within the control scope obtained in the previous section, TSCOPF is solved by DE to obtain the control amount of TSPC in this section. The operation cost functions of generators (formulated as [10]) and initial operational schedule (TSPC 0 ) of the New England 10-machine 39-bus system are shown in Table 4.
The results of DE-TSCOPF in different control strategies (TSPC 6 , TSPC 7 , and TSPC 10 ) are also shown in Table 4. The operation costs after taking TSPC 6 , TSPC 7 , and TSPC 10 are 65,266.29 $/h, 65,351.60 $/h, and 66,422.23 $/h, respectively. All TSPCs cost less than the initial operation condition without taking TSPC. As the number of controlled objects increases, the operation cost also increases slightly after TSPC.
Table 4. Generator data and initial schedule for the unstable scenario in the New England 10-machine 39-bus system.
Gen | Cost Function ($/h) | TSPC 0 (MW) | TSPC 6 (MW) | TSPC 7 (MW) | TSPC 10 (MW)
The asterisks indicate that the active power of the corresponding generators is adjusted in the TSPC.
The radar maps and the histogram of TSIs under expected faults with different TSPCs in the New England 10-machine 39-bus system are illustrated in Figures 4 and 5. The radar maps represent the stable domains of different TSPCs.  Before taking TSPC (TSPC 0 ), there are four line faults that do not meet the N − 1 stability criterion: Line AC1 fault, Line AC2 fault, Line AC14 fault, and Line AC34 fault. After taking TSPC 7 and TSPC 10 , all expected faults meet the N − 1 stability criterion, and the stability margin of the power system is improved. However, for TSPC 5 and TSPC 6 , the Line AC34 fault cannot meet the N − 1 stability criterion. In general, as the number of controlled objects increases, the TSI under each fault also increases. The TSIs of TSPC 7 and TSPC 10 are similar, indicating that, after taking TSPC measures, the stability of the two operational scenarios is similar. Furthermore, TSPC 7 has fewer control objects, which is conducive to the actual generation schedule. Therefore, TSPC 7 is selected as the ultimate TSPC on the power system.
The generator rotor angle trajectories under the Line AC1 fault, Line AC2 fault, Line AC14 fault, and Line AC34 fault, which do not meet the N − 1 stability criterion before taking TSPC 7 , are illustrated in Figure 6. For the Line AC1 fault, Line AC2 fault, and Line AC14 fault, before taking TSPC 7 measures, the rotor angle of generator 10 deviates from other generators and the system becomes unstable. After taking TSPC 7 measures, the system can maintain stability after each fault. For the Line AC34 fault, before taking TSPC 7 , the rotor angle of generator 9 deviates from other generators and the system becomes unstable. After taking TSPC 7 measures, the power system remains stable after the Line AC34 fault.

Dataset Generation
The dataset of a larger-scale system, the IEEE 54-machine 118-bus system, is generated in the same way as that of the New England 10-machine 39-bus system in the previous section; 159 AC line faults are included in the pre-defined fault set. An imbalanced dataset of operation conditions is generated, including 7801 stable instances and 196 unstable instances. After implementing the ADASYN technique, a relatively balanced dataset is finally obtained, containing 7801 stable instances and 1408 unstable instances.

Transient Stability Assessment
The TSA models of the IEEE 54-machine 118-bus system are constructed with MLP similarly to those of the New England 10-machine 39-bus system. The performance of the TSA models is shown in Table 5. The proposed TSA model trained by the ADASYN-balanced dataset with fault information has the best performance in accuracy (98.8%) and F1_score (99.3%) and has a good performance in G_mean (96.2%). However, the first two TSA models in Table 5 trained by the dataset without balancing by ADASYN have low G_mean of 26.2% and 37.1% due to the imbalanced data. The proposed TSA model trained by the ADASYN-balanced dataset improves the average accuracy of minority and majority classes for TSA models.

Control Object Determination
A certain operational scenario in the IEEE 54-machine 118-bus system is assumed as an actual current scenario discriminated as unstable by the N − 1 TSA. SHAP is performed on the unstable scenario.  In the IEEE 54-machine 118-bus system, 19 of 54 generators, including one frequency modulation generator, output active power to the system, while others only output reactive power. The active power output of these 19 generators is controllable and can be scheduled. Therefore, the importance of active power features of the controllable generators is calculated and ranked, as shown in Table 6.
In general, the number of control objects, m, can be determined by operators when applied to the practical power systems. In the simulation analysis, m is set as 0, 9, 14, and 19 to display the feasibility of the proposed method.

Control Amount Determination
Within the control scope obtained in the previous section, TSCOPF is solved by DE to obtain the control amount of TSPC in this section. Take TSPC m as different operational schedules of the IEEE 54-machine 118-bus system, including the initial operation condition (TSPC 0 ).
The results of DE-TSCOPF in different control strategies are shown in Figure 8. The operation costs of TSPC 0 , TSPC 9 , TSPC 14   The stable domains of different TSPCs in the IEEE 54-machine 118-bus system are shown in Figure 9. The TSI of the no. 159 fault (the fault on Line 76−118 ) is −0.83 in TSPC 0 . That means the system would be unstable once it suffers from the no. 159 fault. Other faults are safe. After taking the TSPCs, the TSI of the no. 159 fault is increased significantly, for example, from −0.83 to 0.49 after taking TSPC 14 .
The averages of TSIs under TSPC 0 , TSPC 9 , TSPC 14 , and TSPC 19 are 0.644, 0.628, 0.704, and 0.705, respectively. When only nine generators are controllable, the average TSI may be lower than that without TSPC. That is because, in TSPC 9 , the safety margins of some faults are sacrificed due to the limited control scope in order to ensure that the system is in a secure state under any fault circumstance. The sacrifice is acceptable because it maintains the system security. The security of the system is reinforced as the number of controlled generators increases. However, when the number exceeds a certain amount, the additional reinforcement is limited. Control strategies with fewer generators to be adjusted are recommended. The proposed method provides a possible way to reduce the number of controlled generators and improves the convenience of operation and the precision of preventive control. Power system dispatchers should consider the trade-off between the convenience of implementation and the security of systems, which can be further determined according to their specific objectives.
The generator rotor angle trajectories under the no. 2 fault (the Line 1-3 fault) and the no. 159 fault (the Line 76-118 fault) before and after taking TSPC 14 are illustrated in Figure 10. For the Line 1-3 fault, before taking TSPC 14 measures, the rotor angle of G1 deviates considerably from other generators in the first oscillation period. After taking TSPC 14 measures, the oscillation amplitude of the rotor angle of G1 is reduced and the system can maintain a relatively high stability margin after each fault. For the Line 76-118 fault, before taking TSPC 14 , the rotor angle of G76 deviates from other generators and the system becomes unstable. After taking TSPC 14 measures, the power system remains stable after the Line 76-118 fault.

Time Performance
The time cost of the proposed method mainly covers the interpretation time and the TSCOPF time. The time consumption of SHAP interpretation is related to the number of input features and the number of background data for the calculation of Shapley values. For the training data of the IEEE 54-machine 118-bus system, the number of input features is 919 and the number of initial training data is 7367. The k-means++ algorithm is introduced to cluster the initial large amounts of data and to reduce the background data to K samples. The sum of squared distances (SSD) of samples to their closest cluster center is used as the clustering performance index. The smaller the SSD, the better the clustering performance. The elapsed time of interpretation and the clustering performance of k-means++ with different K are tested and illustrated in Figure 11. As K increases, the SSD decreases, and both the time elapsed for pure clustering and the total time for clustering and SHAP increase. If the number of background data exceeds 140, the run time increases sharply due to insufficient allocated memory. It is suggested that the clustering approach be used to aggregate the background data into fewer than 100 samples. Furthermore, the time consumptions of DE-TSCOPF of TSPC 9 , TSPC 14 , and TSPC 19 are 56.453 s, 59.483 s, and 58.843 s, respectively. They are all within one minute, which satisfies the time requirement of online preventive control for a power system of this size. Note that the proposed method of accurately narrowing the control objects is general and is suitable for most TSCOPF approaches. Other TSCOPF approaches can be selected or improved to further reduce the time consumption of TSPC.

Conclusions
A model interpretation-based multi-contingency coordinated preventive control optimization strategy, adaptive to the scenarios of multiple generators and multiple contingencies, is proposed in this paper. In the proposed strategy, an augmented balanced dataset is constructed by covering fault information and using the ADASYN technique. The proposed TSA machine learning model has a better performance than the original model both in accuracy and in the G_mean index. In addition, the proposed model is suitable for stability judgment under all anticipated faults in power systems. The model interpretation mechanism is used to narrow the scope of preventive control objects and realizes the control and economic optimization of generation scheduling under the premise of ensuring system stability. Through k-means++ clustering, the background data can be summarized into a small but representative dataset, which improves the feasibility and practicability of SHAP interpretation.
Besides the differential evolution used in this paper, other approaches to solving optimization problems can also be applied to the proposed preventive control strategy. Because the model interpretation mechanism is generally applicable, the proposed optimization method can be easily extended to other applications such as voltage stability or available transfer capability. Furthermore, the high penetration of renewable sources, including wind power generation and photovoltaic systems, has increased the operational uncertainty of power systems. Therefore, TSA and TSPC strategies under high penetration of renewable sources will be the subject of our future work.