Construction of a Frequency Compliant Unit Commitment Framework Using an Ensemble Learning Technique

: Frequency control is essential to ensure reliability and quality of power systems. North American Electric Reliability Corporation’s (NERC) Control Performance Standard 1 (CPS1) is widely adopted by many operating authorities to examine the quality of the frequency control. The operating authority would have a strong interest in knowing how the frequency-sensitive features affect the CPS1 score and ﬁnding out more effective unit-dispatch schedules for reaching the CPS1 goal. As frequency-sensitive features usually possess multi-variable and high-correlated characteristics, this paper employed an ensemble learning technique (the Gradient Boosting Decision Tree algorithm, GBDT) to construct Frequency Response Model (FRM) of the Taipower system in Taiwan to evaluate by CPS1 score. The proposed CPS1 model was then integrated with Unit Commitment (UC) program to determine the unit-dispatch that achieves the targeted CPS1 score. The feasibility and effectiveness of the proposed CPS1-UC platform were validated and compared with the other benchmark model-based UC methods by two operating cases. The proposed model shows promising results: the system frequency could be maintained well, especially in the periods of the early morning or the high renewable penetration.


Introduction
The problem of determining the optimal unit commitment (UC) in power generation systems is solved by minimizing the total system cost while maintaining the load generation balance. At early stages, traditional UC are mainly based on cost considerations with unit and transmission security constraints. However, with the growing global emphasis on reduced emissions and renewable penetration, the UC model has recently been modified to take better account of the need for emission reduction or increased renewable resources during the power generation process [1][2][3][4][5].
Nevertheless, ensuring satisfactory frequency quality is still one of the most important practical concerns. The metric for evaluating frequency quality is usually referred to as the North American Electric Reliability Corporation's (NERC) two Control Performance Standards (CPS1 and CPS2) [6]. The CPS score (i.e., frequency response quality index) could be affected by the generator type, load characteristics, regulation reserve, and the nature of the system itself (i.e., interconnection or island). Thus, each Independent System Operator (ISO) in North America has formulated its own requirements for the regulation reserve based on its own particular system [7].
In recent years, several statistical-based methods have been proposed to evaluate the CPS score by constructing a reference based on the load frequency control (LFC) model or UC model. For example, Chung developed an artificial neural network (ANN) algorithm to improve the CPS score by estimating the area control error (ACE) of the LFC model [8]. Some studies built up frequency constraints on specific generating units within the UC

Construction of the CPS1 Compliant Model
The conventional wisdom to model a complicated system generally involves in identifying key features that have influential impacts on the system behavior, followed by the model construction by interlinking the features with model parameters. To make the features effectively exert the model to closely trace the real system behavior, hyperparameter tuning is essential. This work is done either by going through a regression method or an intelligent training. Once the model validation is confirmed, the finely turned model could be gainfully used for the other applications, such as unit commitment program in this study.
Generally speaking, feature identification as well as the model parameter tuning are considered as the most complicated parts in system modeling because they usually involve in multivariate and highly correlated functions which cannot be easily grasped by the traditional linear system approach. Until the recent development of machine learning techniques, the accuracy of the complex system model could be elevated to a higher level. Figure 1 demonstrates the example of system modeling procedure using the machine learning algorithms. The construction of the CPS1-compliance model using the same procedure will be detailed in the following subsections.
Energies 2021, 14, x FOR PEER REVIEW 3 of 20 model construction by interlinking the features with model parameters. To make the features effectively exert the model to closely trace the real system behavior, hyperparameter tuning is essential. This work is done either by going through a regression method or an intelligent training. Once the model validation is confirmed, the finely turned model could be gainfully used for the other applications, such as unit commitment program in this study.
Generally speaking, feature identification as well as the model parameter tuning are considered as the most complicated parts in system modeling because they usually involve in multivariate and highly correlated functions which cannot be easily grasped by the traditional linear system approach. Until the recent development of machine learning techniques, the accuracy of the complex system model could be elevated to a higher level. Figure 1 demonstrates the example of system modeling procedure using the machine learning algorithms. The construction of the CPS1-compliance model using the same procedure will be detailed in the following subsections.

Exploratory Data Analysis for CPS1 in Taipower System (Feature Definition)
The Taipower system is an independent power system, with no interconnected ties to the neighboring areas. Since the load-generation balance is solely reflected by the system frequency, Taipower company abides by the CPS1 compliance rule (i.e., the system frequency quality index), which is the same as the metric adopted by the Electric Reliability Council of Texas (ERCOT) system in Texas (USA). Since the Taipower system is an isolated power system, therefore, the CPS1 score was calculated by Equation (1), which is shown as follows [6]: where AVGT represents the averaging window within T periods; Fn is the actual frequency; Ft is the target frequency; ε is the target noise and is equal to 30 mHz in TPC's case.
To better identify which the system states significantly influence the CPS1 record, an exploratory data analysis (EDA) method was employed to analyze the hourly operational data of the Taipower system. Figure 2 shows the average hourly CPS1 scores over the 12month period. Note that for each hour, the white spot indicates the average CPS1 score, the black bars indicate the range of CPS1 scores, and the red dashed line represents the threshold of the CPS1 score (i.e., 100%). In general, the results show that the CPS1 scores of the first-shift period (i.e., 00:00 to 08:00) were lower than those of the other periods, with more than half of the scores failing to meet the target requirement of 100%.

Exploratory Data Analysis for CPS1 in Taipower System (Feature Definition)
The Taipower system is an independent power system, with no interconnected ties to the neighboring areas. Since the load-generation balance is solely reflected by the system frequency, Taipower company abides by the CPS1 compliance rule (i.e., the system frequency quality index), which is the same as the metric adopted by the Electric Reliability Council of Texas (ERCOT) system in Texas (USA). Since the Taipower system is an isolated power system, therefore, the CPS1 score was calculated by Equation (1), which is shown as follows [6]: where AVG T represents the averaging window within T periods; F n is the actual frequency; F t is the target frequency; ε is the target noise and is equal to 30 mHz in TPC's case.
To better identify which the system states significantly influence the CPS1 record, an exploratory data analysis (EDA) method was employed to analyze the hourly operational data of the Taipower system. Figure 2 shows the average hourly CPS1 scores over the 12-month period. Note that for each hour, the white spot indicates the average CPS1 score, the black bars indicate the range of CPS1 scores, and the red dashed line represents the threshold of the CPS1 score (i.e., 100%). In general, the results show that the CPS1 scores of the first-shift period (i.e., 00:00 to 08:00) were lower than those of the other periods, with more than half of the scores failing to meet the target requirement of 100%.  Given the results presented in Figure 2, the present study aims to improve the frequency quality of the Taipower system to increase the CPS1 score in the first-shift period. Following lengthy discussions with the Taipower dispatchers, 11 frequency-sensitive features were identified as shown in Table 1. Features x1 to x3 are time-dependent, while features x6 to x11 relate to the generating unit. Among these latter features, x6 to x8 relate to the AGC signal control. Feature x9 relates to the inertia of the parallel units, where a larger inertia value results in a more stable system frequency. Features x10 and x11 refer to the pumped-storage hydro units (PSHUs) pumping load operation. According to Taipower's operation rule, PSHUs are operated mainly in the pumping mode during the first-shift period. Finally, feature y1 refers to the CPS1 score and is obtained by averaging the hourly values.

Outlier Detection
As the outliers are unlabeled in the present study, an unsupervised ensemble learning algorithm is required to perform outlier detection. Reference [20] evaluates the average accuracy, stability, computing time, and memory usage degree of 14 common unsupervised algorithms when applied to large-scale public and industrial datasets. The results showed that the isolation forest (iForest) algorithm performed better in identifying outliers than probabilistic methods such as probabilistic principal component analysis (PPCA), neural network methods such as grow when required (GWR), domain-based Given the results presented in Figure 2, the present study aims to improve the frequency quality of the Taipower system to increase the CPS1 score in the first-shift period. Following lengthy discussions with the Taipower dispatchers, 11 frequency-sensitive features were identified as shown in Table 1. Features x1 to x3 are time-dependent, while features x6 to x11 relate to the generating unit. Among these latter features, x6 to x8 relate to the AGC signal control. Feature x9 relates to the inertia of the parallel units, where a larger inertia value results in a more stable system frequency. Features x10 and x11 refer to the pumped-storage hydro units (PSHUs) pumping load operation. According to Taipower's operation rule, PSHUs are operated mainly in the pumping mode during the first-shift period. Finally, feature y1 refers to the CPS1 score and is obtained by averaging the hourly values.

Outlier Detection
As the outliers are unlabeled in the present study, an unsupervised ensemble learning algorithm is required to perform outlier detection. Reference [20] evaluates the average accuracy, stability, computing time, and memory usage degree of 14 common unsupervised algorithms when applied to large-scale public and industrial datasets. The results showed that the isolation forest (iForest) algorithm performed better in identifying outliers than probabilistic methods such as probabilistic principal component analysis (PPCA), neural network methods such as grow when required (GWR), domain-based methods such as a one-class SVM, and distance-based methods such as the Mahalanobis distance. iForest is an ensemble method composed of multiple binary iTrees. The iForest algorithm has the following four main advantages for large-scale optimization-type problems: (1) it does not need to calculate the distance and density of the samples, and hence, the computational load and computing time are reduced; (2) the computing time scales proportionally with the size of the dataset); (3) the accuracy improves as the data volume increases; and (4) only a few hyperparameters must be set in advance [21].
Having constructed the iForest, the path length of sample p is normalized by the average path length of the whole sample, and the outlier score of sample p is calculated according to Equation (2). The samples are then sorted in accordance with their outlier scores [21]: where s(p, n) is the outlier score; h(p) is the path length of sample x in the iTree; E(h(p)) is the mean height of multiple iTrees h(p); and c(n) is the average path length of the iTree constructed by n samples. In our case, the 11 frequency-sensitive features, which are identified in Table 1, were separately applied to the iForest algorithm for outlier detection. In the implementation of the iForest algorithm, the hyperparameters were specified as follows: (1) a subsample size of 256, (2) a tree depth of 8, and a forest size of 100 (trees). In other words, for each feature, 256 independent samples were randomly selected to constitute an iTree with a maximum depth of 8; at most, 100 iTrees were constructed. Figure 3 shows the outlier detection results for six of the frequency-sensitive features. The results confirm that the detected outliers are all located at the periphery. The inliers were then used for further feature selection and modeling purposes, which will be described in next section. methods such as a one-class SVM, and distance-based methods such as the Mahalanobis distance. iForest is an ensemble method composed of multiple binary iTrees. The iForest algorithm has the following four main advantages for large-scale optimization-type problems: (1) it does not need to calculate the distance and density of the samples, and hence, the computational load and computing time are reduced; (2) the computing time scales proportionally with the size of the dataset); (3) the accuracy improves as the data volume increases; and (4) only a few hyperparameters must be set in advance [21].
Having constructed the iForest, the path length of sample p is normalized by the average path length of the whole sample, and the outlier score of sample p is calculated according to Equation (2). The samples are then sorted in accordance with their outlier scores [21]: where s(p, n) is the outlier score; h(p) is the path length of sample x in the iTree; E(h(p)) is the mean height of multiple iTrees h(p); and c(n) is the average path length of the iTree constructed by n samples. In our case, the 11 frequency-sensitive features, which are identified in Table 1, were separately applied to the iForest algorithm for outlier detection. In the implementation of the iForest algorithm, the hyperparameters were specified as follows: (1) a subsample size of 256, (2) a tree depth of 8, and a forest size of 100 (trees). In other words, for each feature, 256 independent samples were randomly selected to constitute an iTree with a maximum depth of 8; at most, 100 iTrees were constructed. Figure 3 shows the outlier detection results for six of the frequency-sensitive features. The results confirm that the detected outliers are all located at the periphery. The inliers were then used for further feature selection and modeling purposes, which will be described in next section.

Feature Selection
Generally speaking, features presenting with a higher proportion have a greater effect on the model outcome. To extract the influential features, we introduce the ensemble learning technique.
The underlying concept of ensemble learning is to construct a single strong learner through the integration of multiple weak learners [22]. Among the many available ensemble learning techniques, gradient boosting decision tree (GBDT), which belongs to the boosting tree-based algorithms, has been suggested in this case because it can use multiple weak decision trees to construct a single strong decision tree (low variance low bias). In the implementation of GBDT, the convergence rate for the loss function is accelerated by a gradient function, and the overall deviation of the algorithm is gradually reduced as the training process proceeds [23]. After careful consideration, we decided that the analysis of the Taipower data collected in the present study did not require parallel computing, but the high prediction accuracy was a greater concern. Therefore, the GBDT algorithm was deliberately chosen for feature selection. Figure 4a presents a schematic diagram of the GBDT procedure. In general, the GBDT algorithm consists of a decision tree (DT) structure constructed within a gradient boosting (GB) architecture. The integrated architecture ensures that the predicted value goes along the negative gradient direction to approach the actual value and minimizes the residual between the two values (i.e., the predicted value and the actual value). In the present study, the loss function of the GBDT algorithm is set as the least square error between the predicted value and the actual value, as shown in Equation (3): where y i is the actual value. In this research, the historical CPS1 score is regarded as y i , the predicted value is f (x i ), and x i is the input of the ith sample. Each sample has 11 features during the feature selection procedure.

Feature Selection
Generally speaking, features presenting with a higher proportion have a greater effect on the model outcome. To extract the influential features, we introduce the ensemble learning technique.
The underlying concept of ensemble learning is to construct a single strong learner through the integration of multiple weak learners [22]. Among the many available ensemble learning techniques, gradient boosting decision tree (GBDT), which belongs to the boosting tree-based algorithms, has been suggested in this case because it can use multiple weak decision trees to construct a single strong decision tree (low variance low bias). In the implementation of GBDT, the convergence rate for the loss function is accelerated by a gradient function, and the overall deviation of the algorithm is gradually reduced as the training process proceeds [23]. After careful consideration, we decided that the analysis of the Taipower data collected in the present study did not require parallel computing, but the high prediction accuracy was a greater concern. Therefore, the GBDT algorithm was deliberately chosen for feature selection. Figure 4a presents a schematic diagram of the GBDT procedure. In general, the GBDT algorithm consists of a decision tree (DT) structure constructed within a gradient boosting (GB) architecture. The integrated architecture ensures that the predicted value goes along the negative gradient direction to approach the actual value and minimizes the residual between the two values (i.e., the predicted value and the actual value). In the present study, the loss function of the GBDT algorithm is set as the least square error between the predicted value and the actual value, as shown in Equation (3): where is the actual value. In this research, the historical CPS1 score is regarded as , the predicted value is ( ), and is the input of the ith sample. Each sample has 11 features during the feature selection procedure. At first, the initialization of the estimated value 0 ( ) is set as the mean of the samples. The estimated residual of each iteration is determined by taking the partial derivation of the loss function as shown in Equation (4). Then, the residuals of every sample are calculated by the difference between the actual value of the mth tree and the predicted value of the (m − 1)th tree. The estimated residual of the ith sample in the mth tree is calculated by At first, the initialization of the estimated value f 0 (x) is set as the mean of the samples. The estimated residual γ im of each iteration is determined by taking the partial derivation of the loss function as shown in Equation (4). Then, the residuals of every sample are calculated by the difference between the actual value of the mth tree and the predicted value of the (m − 1)th tree. The estimated residual of the ith sample in the mth tree is calculated by Then, a regression decision tree (RDT) is constructed to fit the estimated residual γ im , and the terminal regions R jm of the decision tree are given. Note that each leaf R jm of a decision tree can have more than one sample γ im . However, it is unclear what the real output value of each leaf should be if there is more than one sample in the same leaf. Therefore, the new residual γ jm (i.e., the value of each leaf) is calculated by the samples of the terminal leaf R jm , as shown in Equation (5): where R jm refers to the jth terminal region in the mth tree.
Finally, the estimated model f m (x) as shown in Equation (6) is updated according to the new residual γ jm and the learning rate α: The entire GBDT procedure is completed according to a set number of iterations (i.e., the number of decision trees M) [24]. For this case, the process deals with 120,000 data samples. Among the collected inliers, 70% were taken as the training set, with the remainder used for testing. To avoid over-fitting, the training set was used to train the model using five-fold cross-validation. After finishing all settings of the grid search, the best result was obtained in the 200th iteration, with the main hyperparameters of the model, namely the learning rate and the depth of the regression tree, having values of 0.08 and 9, respectively.
From the results of the feature selection, the importance of all frequency-sensitive features with CPS1 score are ranked and shown in Figure 5. Figure 5 shows the GBDT results for the relative importance of the 11 features in the CPS1 model. The percentage of each feature indicates the associated number of nodes over the whole decision tree that is shown in Figure 4b. The highest six features, namely AGC down, AGC up, inertia, load variation, load, and ramp rate, account for 80% of the total number of nodes. In other words, these features have greater influence on the CPS1 score. It is interesting to note that the dispatchers originally felt that the amount of pumped-storage load could be an important factor that affected the CPS1 quality. However, the result of Figure 5 shows that the influence of the pumped-storage load was not as high as they expected. Consequently, the frequency-sensitive features that have higher percentage weightings in the rank are selected to construct the FRM model.
Then, a regression decision tree (RDT) is constructed to fit the estimated residual , and the terminal regions of the decision tree are given. Note that each leaf of a decision tree can have more than one sample . However, it is unclear what the real output value of each leaf should be if there is more than one sample in the same leaf. Therefore, the new residual (i.e., the value of each leaf) is calculated by the samples of the terminal leaf , as shown in Equation (5): where refers to the jth terminal region in the mth tree. Finally, the estimated model ( ) as shown in Equation (6) is updated according to the new residual and the learning rate : The entire GBDT procedure is completed according to a set number of iterations (i.e., the number of decision trees M) [24]. For this case, the process deals with 120,000 data samples. Among the collected inliers, 70% were taken as the training set, with the remainder used for testing. To avoid over-fitting, the training set was used to train the model using five-fold cross-validation. After finishing all settings of the grid search, the best result was obtained in the 200th iteration, with the main hyperparameters of the model, namely the learning rate and the depth of the regression tree, having values of 0.08 and 9, respectively.
From the results of the feature selection, the importance of all frequency-sensitive features with CPS1 score are ranked and shown in Figure 5. Figure 5 shows the GBDT results for the relative importance of the 11 features in the CPS1 model. The percentage of each feature indicates the associated number of nodes over the whole decision tree that is shown in Figure 4b. The highest six features, namely AGC down, AGC up, inertia, load variation, load, and ramp rate, account for 80% of the total number of nodes. In other words, these features have greater influence on the CPS1 score. It is interesting to note that the dispatchers originally felt that the amount of pumped-storage load could be an important factor that affected the CPS1 quality. However, the result of Figure 5 shows that the influence of the pumped-storage load was not as high as they expected. Consequently, the frequency-sensitive features that have higher percentage weightings in the rank are selected to construct the FRM model.

CPS1 Model Construction
To reduce the data dimensionality and data processing complexity, only the first six features in Figure 5 were selected to construct the CPS1 model, i.e., the FRM model. The method of constructing the CPS1 model was the same as the method of feature selection described in Section 2.3. The hyperparameters of the CPS1 model were once again adjusted using the grid search method and the five-fold cross-validation. Figure 6 shows the variation of the root mean square error (RMSE) of the CPS1 model with the number of iterations as a function of the depth and learning rate. The optimal model performance was obtained using a learning rate of 0.08 and a tree depth of 9. It is seen that for a constant learning rate u, a simple model with a lower depth (e.g., a tree depth, dmax, equal to 8) leads to under-fitting compared to the optimal solution (dmax = 9, u = 0.08). By contrast, a complex model with greater depth (e.g., dmax = 10) converges rapidly during the first 60 iterations, but reaches the minimum RMSE value only much later. The average RMSE of the training set and testing set were equal to 15.3% and 17.2%, respectively.

CPS1 Model Construction
To reduce the data dimensionality and data processing complexity, only the first six features in Figure 5 were selected to construct the CPS1 model, i.e., the FRM model. The method of constructing the CPS1 model was the same as the method of feature selection described in Section 2.3. The hyperparameters of the CPS1 model were once again adjusted using the grid search method and the five-fold cross-validation. Figure 6 shows the variation of the root mean square error (RMSE) of the CPS1 model with the number of iterations as a function of the depth and learning rate. The optimal model performance was obtained using a learning rate of 0.08 and a tree depth of 9. It is seen that for a constant learning rate u, a simple model with a lower depth (e.g., a tree depth, dmax, equal to 8) leads to under-fitting compared to the optimal solution (dmax = 9, u = 0.08). By contrast, a complex model with greater depth (e.g., dmax = 10) converges rapidly during the first 60 iterations, but reaches the minimum RMSE value only much later. The average RMSE of the training set and testing set were equal to 15.3% and 17.2%, respectively.

UC Optimization Program for the Taipower System
The generation mix of the Taipower system in 2015 was roughly 76% fossil-fueled, 11% nuclear, 5% pumped-storage hydro, 4% cascading hydro, 3% PV generation, and 1% other renewable resources [25]. To feature different generating units for complying with the unit dispatch constraints. A systematic UC model was constructed using the general algebraic modeling system (GAMS). This UC model comprised 118 generating units, including nuclear, coal, combined-cycle, diesel, heavy oil, hydro cascade, and pumped-storage units. The original aim of the model was to minimize the total system cost (comprising the fuel cost, the minimum output cost, the start-up cost, and the ancillary service cost) by performing peak load shifting of the load curve. Having constructed the model with around 140,000 equations, the UC problem was solved using the mixed integer linear programming (MILP) method of the Gurobi solver.

Objective Function
The objective function was designed to minimize the total cost of thermal units in the whole Taipower system. The total cost includes the incremental fuel cost, minimum level cost, hot/warm/cold start-up cost and the penalty of the power shortage. In order to construct the linearized UC model, the incremental fuel cost of each unit is fitted by six piecewise functions. An example of the incremental fuel cost with the piecewise fitting function is depicted in Figure 7 and the total cost function is described in Equation (7):

UC Optimization Program for the Taipower System
The generation mix of the Taipower system in 2015 was roughly 76% fossil-fueled, 11% nuclear, 5% pumped-storage hydro, 4% cascading hydro, 3% PV generation, and 1% other renewable resources [25]. To feature different generating units for complying with the unit dispatch constraints. A systematic UC model was constructed using the general algebraic modeling system (GAMS). This UC model comprised 118 generating units, including nuclear, coal, combined-cycle, diesel, heavy oil, hydro cascade, and pumped-storage units. The original aim of the model was to minimize the total system cost (comprising the fuel cost, the minimum output cost, the start-up cost, and the ancillary service cost) by performing peak load shifting of the load curve. Having constructed the model with around 140,000 equations, the UC problem was solved using the mixed integer linear programming (MILP) method of the Gurobi solver.

Objective Function
The objective function was designed to minimize the total cost of thermal units in the whole Taipower system. The total cost includes the incremental fuel cost, minimum level cost, hot/warm/cold start-up cost and the penalty of the power shortage. In order to construct the linearized UC model, the incremental fuel cost of each unit is fitted by six piecewise functions. An example of the incremental fuel cost with the piecewise fitting function is depicted in Figure 7 and the total cost function is described in Equation (7): where PminCost g is the cost of the scheduled power generation of each unit g at minimum level [unit: New Taiwan (7) where PminCostg is the cost of the scheduled power generation of each unit at minimum level [unit: New Taiwan

UC Constraints
The constraint of balancing requirements of supply-side and demand-side is shown as Equation (8), the constraints of unit characteristics are shown as Equations (9)- (19) and the reservoir-related constraints are shown as Equations (20) and (21).
(i) The balancing requirement of supply-side and demand-side considering the PSHU: where is the maximum output limit of each unit [unit: MW].
(iii) The ramp up/down limit of units:

UC Constraints
The constraint of balancing requirements of supply-side and demand-side is shown as Equation (8), the constraints of unit characteristics are shown as Equations (9)- (19) and the reservoir-related constraints are shown as Equations (20) and (21).
(i) The balancing requirement of supply-side and demand-side considering the PSHU: where (ii) The upper and lower limits of units including thermal, hydro and PSHU: where P max g is the maximum output limit of each unit g [unit: MW].
(iii) The ramp up/down limit of units: where RR g is the ramp rate supplied by unit g in period h [unit: MW/min].
(iv) The states of start-up and shut down of units: Su g,h + Sd g,h ≤ 1 (13) where Su g,h is the start-up status of each traditional generator g in the period h; Sd g,h is the shut down status of each traditional generator g in the period h.
(v) The minimum up/down time constraints of units where MU g is the must up time period of each traditional generator g [unit: hour]; MD g is the must down time period of each traditional generator g [unit: hour].
(vi) Maximum start-up times per day: where MDS g is the maximum start-up time of each traditional generator g per day.
(vii) Hot/warm/cold start-up: where T hw g is time interval between hot start-up and warm start-up of each traditional generator g; T wc g is time interval between warm start-up and cold start-up of each traditional generator g.
(viii) The efficiency of the conversion of PSHU: (20) where η PSHU g is the conversion efficiency of each PSHU generator g.
(ix) The reservoir model for PSHU: where Q level res,h is the water level of each reservoir [unit: m 3 ]; ρ g is the electric-water ratio [unit: MWh/m 3 ].

Addional Frequency Control Constraints
Among the selected features, AGC up, AGC down, inertia, and ramp rate are more capable of being controlled by the generating side management and are thus named as the frequency-sensitive features that could be implemented inside the UC program. In this study, the constraints of the frequency control between the supply and demand of the power system included not only the load requirement and power generation of all the units in the whole system, but also the PSHU pumping load [26][27][28][29]. The frequency control requirements specified by the CPS1 model would be incorporated into the UC model to ensure the model output was CPS1-compliant. The following paragraphs describe the inertia, AGC up, AGC down, and ramp rate constraints imposed on the UC solution by the CPS1 model.
The total inertia of the system was composed of the power generation side and the load demand side. In the actual condition, the inertia of the load demand side could not be controlled. However, the inertia of the power generation side can be obtained from the installed capacity and the inertia of the scheduling results, as shown in Equation (22). Note that I Req h is the required inertia of the system in each period and is obtained directly from the CPS1 model. As the installed capacity of the generators increases, the inertia also increases. The UC optimization program selects the most suitable units by meeting the system inertia requirement and minimizing cost in every period. Thus: where MVA g is the capacity of each traditional generator g and has units of MW, g ∈ {1, 2, . . . , G}; MW Renew h is the power generation of renewables that do not contribute to system inertia in each hour; I g is the inertia of each traditional generator g.
In addition to the system inertia, the system frequency is also affected by the AGC up, AGC down, and ramp rate features. The AGC up capacity that each unit can supply in each period is described by Equations (23) to (25) below. Depending on the characteristics of each unit, if a unit intends to supply the full AGC up capacity, it must first be already committed and within AGC high, as shown in Equation (23). According to Taipower's operation rules, the available AGC up capacity must be less than or equal to the ramp rate over a three-minute period, as shown in Equation (24). Finally, the total AGC up capacity provided by all the units must be greater than the total requirement given by the CPS1 model in each period, as shown in Equation (25).
where AGCU p Acp g,h is the scheduled quantity of AGC up of each unit g in period h [unit: MW]; AGC High g is the maximum limit of the AGC range of each unit g [unit: MW]; U AGC g,h is the status of the AGC mode of each unit g in period h.
The AGC Down capacity provided by a unit stabilizes the fluctuation of the system frequency and is described by Equations (26) to (28). As shown in Equations (26)-(28), the AGC Down capacity falls between the set point and the AGC low limit of the unit, and must be less than the ramp rate over three minutes, as shown in Equation (26): where AGC Low g is the minimum limit of the AGC range for each unit g [unit: MW]. The ramp rate of the parallel units in the system also exerts an impact on the CPS1 score (see Table 1). In the UC model, the total ramp rate of the committed units is constrained by the following ramp rate requirement determined by the CPS1 model:

Network Security Constraint
The network security constraint is to ensure system's load generation balance in UC, while the line flow limit is also satisfied in the N-1 case. The network topology was built by PSS/E software including the transmission line of the voltage levels on 161 kV and 345 kV (around 1200 lines). For further details, please refer to [30,31]. The network security constraint is defined as follows: where P min g is the minimum output limit of each unit g

Integrating CPS1 Model into the UC Optimization Program
As shown in Figure 8, we integrated the UC program described above with the FRM (i.e., CPS1 model) in an iterative framework to ensure that the UC scheduling output satisfied the CPS1 target. Before executing the day-ahead UC model, we obtained the dayahead load forecast and self-schedule information. Finally, the UC results were exported to the CPS1 model (i.e., outer loop checking) to compute the corresponding CPS1 score and to check whether the CPS1 target was achieved. If the CPS1 score could not achieve the target score, the requirements of the frequency control variables would be increased through the inner loop to satisfy the target score. Within the inner loop that is shown in the green block, additional AGC up and AGC down were called for. The updated AGC reserve would be checked again by CPS1 model (i.e., inner loop checking). If the CPS1 score still could not achieve the target score, then the requirements of inertia and ramp rate were increased (This may commit the other offline units). Note that the data resolution inside the CPS1 model is finer than that of the UC model (minute vs. hour) to closely reflect the minute-scaled LFC response.
The inertia and ramp rate quantities appear discrete in the UC model because they are related to the online generator numbers. The inertia and ramp rate quantities are accordingly adjusted after revising the AGC up and AGC down quantities. The adjustment of AGC up and AGC down quantities is more direct. That is, if the computed CPS1 score fails to meet the target score, the AGC up/down requirements are increased by 50 MW each time (see the left-hand side flowchart in Figure 8). Note that the smaller the increment of AGC up/down, the more the iterations of the UC procedure. On the other hand, the larger the increment of AGC up/down, the more the total operating cost. Consequently, a suitable increment of AGC up/down is determined, i.e., 50 MW. The renewed AGC up/down range is exported to the CPS1-compliant UC model (CPS1UC) to generate a UC solution to meet the overall constraints. If the present online units cannot fulfill the renewed AGC up/down range, the UC program calls another offline unit and updates the system ramp rate and system inertia by adding the associated unit parameters. If the target score cannot be reached in three iterations, the program directly calls another generating unit online. Finally, the frequency-sensitive features are returned to the UC model for final rescheduling.
Energies 2021, 14, x FOR PEER REVIEW 13 of 20 reserve would be checked again by CPS1 model (i.e., inner loop checking). If the CPS1 score still could not achieve the target score, then the requirements of inertia and ramp rate were increased (This may commit the other offline units). Note that the data resolution inside the CPS1 model is finer than that of the UC model (minute vs. hour) to closely reflect the minute-scaled LFC response. The inertia and ramp rate quantities appear discrete in the UC model because they are related to the online generator numbers. The inertia and ramp rate quantities are accordingly adjusted after revising the AGC up and AGC down quantities. The adjustment of AGC up and AGC down quantities is more direct. That is, if the computed CPS1 score fails to meet the target score, the AGC up/down requirements are increased by 50 MW each time (see the left-hand side flowchart in Figure 8). Note that the smaller the increment of AGC up/down, the more the iterations of the UC procedure. On the other hand, the larger the increment of AGC up/down, the more the total operating cost. Consequently, a suitable increment of AGC up/down is determined, i.e., 50 MW. The renewed AGC up/down range is exported to the CPS1-compliant UC model (CPS1UC) to generate a UC solution to meet the overall constraints. If the present online units cannot fulfill the renewed AGC up/down range, the UC program calls another offline unit and updates the system ramp rate and system inertia by adding the associated unit parameters. If the target score cannot be reached in three iterations, the program directly calls another generating unit online. Finally, the frequency-sensitive features are returned to the UC model for final rescheduling.

Case Studies
According to NERC's rule, CPS1 score should be 100% or higher [6]. In Taiwan, Taiwan Power Company sets the goal of CPS1 score to be 120% so that the dispatchers can have more margin to deal with daily system variance. The effectiveness of the CPS1UC

Case Studies
According to NERC's rule, CPS1 score should be 100% or higher [6]. In Taiwan, Taiwan Power Company sets the goal of CPS1 score to be 120% so that the dispatchers can have more margin to deal with daily system variance. The effectiveness of the CPS1UC was evaluated using two case studies. Case I was taking the real operating data including load demand and the unit dispatch outcome by manual dispatch (MD). We compared the unit dispatch results obtained from the CPS1UC outcome with the other two model-based UC: UC combined with the reduced order load frequency control model (ROLFC-UC) [13], and UC considering the CPS1 constraint using the long short-term memory (LSTM) algorithm (LSTM-UC) [32]. The Case II was conducted on a hypothetical case in which a photovoltaic (PV) power with maximum output of 3.9 GW (accounting for 11% of the total system generation capacity) was added to the Taipower system. In this case study, the aforementioned methodologies were used to compare the difference of the dispatch outcome. The proposed method was simulated in a GAMS platform and Python environment running using an Intel Core E3-1245V5 CPU operating at 3.50 GHz with 32 GB RAM.

Historical Operation Case (in the Early Morning (First Shift Period))
In this case, we focus on the CPS1 enhancement for the first shift period. Table 2 shows the model validation and total operating cost of the first shift period. Despite the same load demand in the testing case, the total operating costs of the individual methods are different due to the different dispatch outcomes. The accuracy of the proposed CPS1 model (i.e., RMSE 17) was better than that of the LSTM model (i.e., RMSE 25) under the model validation process. In this case, the memory usage of the LSTM is 8 times that of the GBDT. The training time of the hyperparameter tuning is 10 times that of the GBDT (GBDT took 500 s). It is evident that the tree-based ensemble learning algorithm dealing with the domain knowledge formulation is better than the LSTM.  Figure 9 shows the details of the four methods in each hour. During the load dropping period, the MD scenario shows that the operator continued to turn the generating unit offline, resulting in low system inertia. The other problem is that the provisions of AGC reserve (AGC up and AGC down) as well as ramp rate were quite limited due to the insufficient on-line units. The sequential consequence limited system's frequency regulation capability, resulting in low CPS1 score. Compared to the MD scenario, the remaining three methods required higher Inertia in the UC constraint, the total operating cost still maintained low among the scenarios. Note that ROLFC-UC did not take the inertia into consideration, it would need more AGC reserve (AGC up and AGC down) as well as ramp rate to keep CPS1 at 120%. On the contrary, the LSTM-UC and CPS1UC took inertia as additional UC constraint. That is the reason why these two methods came up with less regulation reserve but higher inertia requirement. Although LSTM-UC and CPS1UC led to similar results, the accuracy of the model evaluation was different.

Hypothetical Case (PV Case)
In the Case II (PV Case), the load demand was the same as that in Case I but with a maximum PV power generation of 3.9 GW (accounting for 11% of the power generation of the whole system) that was assumed to be integrated into the system. Figure 10 shows the corresponding CPS1UC scheduling results. The green curve shows the load demand. The blue curve shows the net load which is obtained by deducting the PV power generation curve (black dotted line) from the load demand curve. The net load curve shows that the original peak load was shifted from noon to evening due to the addition of PV power generation. In this case, we focus on the daytime period (from 7 a.m. to 3 p.m.) when the PV power generation was active. Table 3 shows the cost comparison by the different three methods (i.e., ROLFC-UC vs. CPS1UC vs. LSTM-UC). Figure 11 shows that the three model-based UC all achieved the target CPS1 score. However, the values of inertia, ramp rate and AGC reserve (AGC up and AGC down) are different.

Hypothetical Case (PV Case)
In the Case II (PV Case), the load demand was the same as that in Case I but with a maximum PV power generation of 3.9 GW (accounting for 11% of the power generation of the whole system) that was assumed to be integrated into the system. Figure 10 shows the corresponding CPS1UC scheduling results. The green curve shows the load demand. The blue curve shows the net load which is obtained by deducting the PV power generation curve (black dotted line) from the load demand curve. The net load curve shows that the original peak load was shifted from noon to evening due to the addition of PV power generation. In this case, we focus on the daytime period (from 7 a.m. to 3 p.m.) when the PV power generation was active. Table 3 shows the cost comparison by the different three methods (i.e., ROLFC-UC vs. CPS1UC vs. LSTM-UC). Figure 11 shows that the three model-based UC all achieved the target CPS1 score. However, the values of inertia, ramp rate and AGC reserve (AGC up and AGC down) are different. The common results of these three methods show that the Inertia values during the high noon hours were lower than those during the other operating hours. The ROLFC-UC called for more AGC reserve (AGC up and AGC down) as well as ramp rate to keep CPS1 at 120% due to the lower inertia value. On the contrary, the CPS1UC and the LSTM-UC took Inertia as an additional UC constraint to keep the inertia at higher level. Therefore, they came up with less regulation reserve and ramp rate values. This case shows a trade-off between Inertia and regulation reserve, which implies that inertia is important for the frequency stability. For a power system with high penetration renewables, the unit commitment should consider not only the ramp rate and AGC reserve but inertia to ensure a good frequency quality.
In the Case II (PV Case), the load demand was the same as that in Case I but with a maximum PV power generation of 3.9 GW (accounting for 11% of the power generation of the whole system) that was assumed to be integrated into the system. Figure 10 shows the corresponding CPS1UC scheduling results. The green curve shows the load demand. The blue curve shows the net load which is obtained by deducting the PV power generation curve (black dotted line) from the load demand curve. The net load curve shows that the original peak load was shifted from noon to evening due to the addition of PV power generation. In this case, we focus on the daytime period (from 7 a.m. to 3 p.m.) when the PV power generation was active. Table 3 shows the cost comparison by the different three methods (i.e., ROLFC-UC vs. CPS1UC vs. LSTM-UC). Figure 11 shows that the three model-based UC all achieved the target CPS1 score. However, the values of inertia, ramp rate and AGC reserve (AGC up and AGC down) are different. The common results of these three methods show that the Inertia values during the high noon hours were lower than those during the other operating hours. The ROLFC-UC called for more AGC reserve (AGC up and AGC down) as well as ramp rate to keep CPS1 at 120% due to the lower inertia value. On the contrary, the CPS1UC and the LSTM-UC took Inertia as an additional UC constraint to keep the inertia at higher level. Therefore, they came up with less regulation reserve and ramp rate values. This case shows a tradeoff between Inertia and regulation reserve, which implies that inertia is important for the frequency stability. For a power system with high penetration renewables, the unit commitment should consider not only the ramp rate and AGC reserve but inertia to ensure a good frequency quality.

Conclusions
This paper has proposed a model-based UC (CPS1UC) platform to ensure a CPS1compliant unit scheduling outcome. The proposed FRM was constructed by an ensemble learning technique, the GBDT algorithm, to deal with feature identification and model training. Unlike the traditional model formulation, the ensemble learning technique help model define its own structure, tune the model parameters, and validate the model accu-

Conclusions
This paper has proposed a model-based UC (CPS1UC) platform to ensure a CPS1compliant unit scheduling outcome. The proposed FRM was constructed by an ensemble learning technique, the GBDT algorithm, to deal with feature identification and model training. Unlike the traditional model formulation, the ensemble learning technique help model define its own structure, tune the model parameters, and validate the model accuracy. Once the model validation is complete, the quantities of the frequency control features determined by the UC result were routinely exported to the FRM to check the CPS1 compliance until the UC result achieves the target CPS1 score. The validity and feasibility of the proposed model-based UC platform were evaluated by two case studies. In Case I, the proposed CPS1UC effectively exerted the dominant frequency control features to reach CPS1 target with less operating cost and computing resource. In Case II, system frequency becomes relatively unstable due to the increasing PV penetration into the system. To comply with the CPS1 target, it is better to maintain a certain level of system inertia. The CPS1UC solved this problem by committing several synchronous condensers and gas turbines on line. The gas turbine is more flexible to be committed in a short time and is more responsive to the frequency regulation.
As described above, the proposed FRM was constructed based on the mass historical data, the accuracy of the model could improve to reflect the most updated system as the proposed CPS1UC platform continuously operates in real time.
To sum up, the important findings of this study are listed as follows: (1) This study commences by formulating a set of procedures and employing an ensemble machine learning technique to analyze the real energy management system (EMS) data of an isolated power system (i.e., 118 generating units) to construct a FRM. The resulting FRM is then integrated with a UC program to determine the unit scheduling result, which minimizes the system operating cost while the frequency compliance is also maintained. (2) During the period of the controlled frequency with lower compliance, i.e., the early morning or high penetration RES, the system frequency quality could be improved by the proposed UC with the integration of the FRM. Testing result shows that retaining the proper quantity of frequency-sensitive feature, especially Inertia, is the most effective way to keep the high frequency quality. (3) The dominant features of the LFC (i.e., parameters of units, load damping, system inertia) are hard to capture. The proposed method in this paper assists dispatchers to clearly identify the frequency-sensitive features, and grasp their features effectively. (4) Power system frequency operation is a nonlinear and complex problem. Compare to the traditional LFC model, the proposed FRM has the advantage of self-learning by feeding in the updated inputs so that the model output is more close to the real system behavior. This feature validates the effectiveness of an artificial intelligent algorithm in a smart grid system.

Data Availability Statement:
No new data were created or analyzed in this study. Data sharing is not applicable to this article. power generation of renewables n total sample in iForest P min g minimum output limit of each unit g P max g maximum output limit of each unit g Penalty E penalty price for shortage of energy Penalty AGCU p penalty price for shortage of AGC Up reserve Penalty AGCDown penalty price for shortage of AGC Down reserve  path length of pth sample in iTree L(·) loss function of GBDT s(p, n) outlier score