2.3. The CART-Based Adaptive Damping Control Scheme
CART is a recursive partitioning method that builds classification and regression trees. Classification trees are built to obtain the splitting rules that define the different operating-point subspaces, whereas regression trees are built to identify which subspace the current operating point of the power system lies in. In this way, the coordinated PSSs pre-designed off-line can be switched adaptively using on-line measurements. As the wind power output fluctuates, the operating point of the power system deviates from the initial subspace and moves, at random, to other operating-point subspaces or back to the initial one. The CART identifies the subspace that attracts the operating point so that the PSSs can be switched on-line.
The CART is structured from top to bottom and consists of a root node, test nodes and terminal nodes. Each root or test node corresponds to an optimal splitting rule and a subset of the learning dataset; a terminal node is a pure node that cannot be split further. The learning dataset itself corresponds to the root node of the CART. The classification process starts from the root node at the top, and at each level the subsets are divided according to the optimal splitting rules, which take the form of "if-then-else" rules. In this paper, each terminal node represents an operating-point subspace. The measurements from all operating-point subspaces constitute the learning dataset, which is the input data of the CART. Reference [18] gives a full introduction to the general theory and methods of the CART.
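The traversal of such a tree can be sketched as a nested "if-then-else" lookup; the feature names, thresholds and subspace labels in the sketch below are hypothetical illustrations, not values from the case study in this paper.

```python
# Minimal sketch of a trained CART evaluated as nested if-then-else rules.
# The features d1, d2 (distances to hyper-planes, as introduced later in
# this section), the thresholds and the subspace labels are hypothetical.

def identify_subspace(d1, d2):
    """Walk the tree from the root node down to a terminal node."""
    if d1 <= 0.0:                  # root node: test on feature d1
        if d2 <= 1.5:              # test node: refine with feature d2
            return "subspace_1"    # terminal node
        return "subspace_2"        # terminal node
    return "subspace_3"            # terminal node

# Each terminal node maps to a pre-designed coordinated PSS parameter set.
print(identify_subspace(-0.3, 0.8))  # -> subspace_1
```

In service, reaching a terminal node would select the corresponding pre-designed PSS parameter set.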
To ensure the accuracy of classification, measurable and controllable quantities that characterize the subspaces are used as the training data. Since the generator speeds carry information on the power flow routing, the topology and the power oscillation modes of a power system, the CART would ideally use them as the learning dataset. In most cases, however, the generator speeds are either not measurable or measurable only without time tags. Therefore, the generator bus frequencies are employed as the learning dataset in this paper instead of the generator speeds. The reason is that the generator bus frequency, being the derivative of the generator's external bus angle, is an approximation of the generator speed, and the bus frequencies can be obtained directly from PMUs.
In a large-scale power system, a single measurement cannot adequately characterize an operating-point subspace, so multiple measurements need to be employed to track the variation of the power system operating point. Multiple measurements from multiple subspaces make the classification process complex. Therefore, in order to distinguish the features of measurements from different subspaces, the Euclidean distance to separating hyper-planes is used by the classification algorithm to condense the large amount of measurement data.
Take Figure 3 as an example. The small circles and stars represent the measurements from subspaces α and β, respectively, and the horizontal and vertical axes denote measurements 1 and 2. The dataset from two measurements is thus expressed in a two-dimensional space; in the same way, an n-dimensional space is needed when there are n measurements. As shown in Figure 3, a line can separate the two groups of data, and obviously a plane is needed in a three-dimensional space. In a large-scale power system with multiple measurements, a hyper-plane is therefore used in this paper to distinguish the subspaces of operating points. To explain the algorithm developed in this paper, an example with two measurements is derived as follows.
Assume the i-th sample from group α has the coordinate values $(x_i, y_i)$, where $x_i$ and $y_i$ represent the x-axis and y-axis values of the i-th sample. The x-axis and y-axis values of the j-th sample from group β are denoted $(x_j, y_j)$. Then the mean of the samples from group α can be obtained by:

$$\mu_{\alpha} = \frac{1}{n}\sum_{i=1}^{n}\begin{bmatrix} x_{i} \\ y_{i} \end{bmatrix} \qquad (1)$$

where n represents the number of samples from group α. The mean of the samples from group β can be obtained by:

$$\mu_{\beta} = \frac{1}{m}\sum_{j=1}^{m}\begin{bmatrix} x_{j} \\ y_{j} \end{bmatrix} \qquad (2)$$

where m represents the number of samples from group β. Denote by $\Sigma_{\alpha}$ and $\Sigma_{\beta}$ the covariance matrices of the measurements of subspaces α and β, respectively. Many lines could be drawn between these two groups of data.
M denotes the classification line, and W denotes the normal vector of M. The classification rule for two classes with different covariances is defined as the ratio of the between-class variance to the within-class variance, which is named the FLD (Fisher linear discriminant) index [19]. Mathematically, it is:

$$S = \frac{\left[\mathbf{W}^{T}\left(\mu_{\alpha} - \mu_{\beta}\right)\right]^{2}}{\mathbf{W}^{T}\left(\Sigma_{\alpha} + \Sigma_{\beta}\right)\mathbf{W}} \qquad (3)$$

The two groups of data are best discriminated when the FLD index S is greatest. The normal vector W that maximizes S is found to be given by:

$$\mathbf{W} = \left(\Sigma_{\alpha} + \Sigma_{\beta}\right)^{-1}\left(\mu_{\alpha} - \mu_{\beta}\right) \qquad (4)$$
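As an illustration of (1)-(4), the following sketch computes the group means, the covariance matrices and the FLD normal vector W for two small synthetic sample groups; the data values are hypothetical and chosen only so that the two groups are clearly separated.

```python
# Sketch: computing the group means, covariance matrices and the FLD
# normal vector W = (Sigma_a + Sigma_b)^{-1} (mu_a - mu_b) for two
# synthetic 2-D sample groups. All data values are hypothetical.

def mean(points):
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]

def covariance(points, mu):
    n = len(points)
    sxx = sum((p[0] - mu[0]) ** 2 for p in points) / n
    syy = sum((p[1] - mu[1]) ** 2 for p in points) / n
    sxy = sum((p[0] - mu[0]) * (p[1] - mu[1]) for p in points) / n
    return [[sxx, sxy], [sxy, syy]]

def solve2(m, v):
    """Solve the 2x2 system m @ w = v, i.e. w = m^{-1} v."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(m[1][1] * v[0] - m[0][1] * v[1]) / det,
            (m[0][0] * v[1] - m[1][0] * v[0]) / det]

group_a = [(0.0, 0.1), (0.2, -0.1), (-0.1, 0.0), (0.1, 0.2)]
group_b = [(3.0, 2.1), (3.2, 1.9), (2.9, 2.0), (3.1, 2.2)]

mu_a, mu_b = mean(group_a), mean(group_b)
S_a = covariance(group_a, mu_a)
S_b = covariance(group_b, mu_b)
S_w = [[S_a[i][j] + S_b[i][j] for j in range(2)] for i in range(2)]
diff = [mu_a[0] - mu_b[0], mu_a[1] - mu_b[1]]
W = solve2(S_w, diff)  # normal vector of the classification line, Eq. (4)
print(W)
```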
The middle point $M_{0} = \left(\mu_{\alpha} + \mu_{\beta}\right)/2$ should lie on this line. With a point on the line and the normal vector W identified, the best classification line is determined. The following step is to find the distance from a point $(x, y)$ in the two-dimensional space to the classification line M: $ax + by + c = 0$, as shown below:

$$d(x, y) = \frac{ax + by + c}{\sqrt{a^{2} + b^{2}}} \qquad (5)$$

In this way, a single one-dimensional variable, the signed distance $d(x, y)$, is used as the training data instead of the two-dimensional data. If $d(x, y)$ has the same sign as $d(\mu_{\alpha})$, i.e., the point lies on the same side of M as the mean of group α, the operating point is identified to be inside subspace α; otherwise the point is inside subspace β.
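Classification by the signed distance (5) can be sketched as follows; the line coefficients and the test points are hypothetical illustrations, not values from the case study.

```python
import math

# Sketch: classifying an operating point by its signed distance to the
# classification line M: a*x + b*y + c = 0, Eq. (5). A point belongs to
# subspace alpha when it lies on the same side of M as the mean of
# group alpha. The coefficients and points below are hypothetical.

def signed_distance(a, b, c, x, y):
    return (a * x + b * y + c) / math.hypot(a, b)

def classify(a, b, c, mu_alpha, point):
    d_ref = signed_distance(a, b, c, *mu_alpha)
    d = signed_distance(a, b, c, *point)
    return "alpha" if d * d_ref > 0 else "beta"

# Line x + y - 4 = 0, with the group-alpha mean near the origin:
print(classify(1.0, 1.0, -4.0, (0.0, 0.0), (0.5, 0.5)))  # -> alpha
print(classify(1.0, 1.0, -4.0, (0.0, 0.0), (5.0, 5.0)))  # -> beta
```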
The example above in the two-dimensional space can be extended to a multiple-dimension space: a hyper-plane can be used to classify the subspaces in the high-dimensional space, as illustrated as follows.
In an n-dimensional space, the normal vector W of the hyper-plane can also be calculated by (4). The middle point between the two means also lies on the hyper-plane M:

$$\mathbf{W}^{T}\left(\mathbf{X} - \frac{\mu_{\alpha} + \mu_{\beta}}{2}\right) = 0 \qquad (6)$$

where $\mathbf{X} = \left[x_{1}, x_{2}, \ldots, x_{n}\right]^{T}$ represents the position of a point in the n-dimensional space. The distance from a point $\mathbf{X}_{0}$ to the hyper-plane can be obtained by generalizing (5):

$$d\left(\mathbf{X}_{0}\right) = \frac{\mathbf{W}^{T}\left(\mathbf{X}_{0} - \frac{\mu_{\alpha} + \mu_{\beta}}{2}\right)}{\left\|\mathbf{W}\right\|} \qquad (7)$$
In this way, the distance vectors, composed of the distances from the points of the subspaces to the hyper-planes, are obtained. They are then used as the input variables for the CART to perform the classification process and derive the splitting rules. After the classification, the regression process is performed to identify which subspace the current operating point is in: a new distance vector, from the current operating point to the hyper-planes, is computed and fed to the CART, and the terminal node that characterizes a subspace is reached. In this way, the CART can track the variation of the system operating point and thus guide the adaptive switching of the appropriate PSSs into service.
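The on-line identification step described above can be sketched end to end as follows; the hyper-plane parameters, the hard-coded splitting rules and the subspace labels are hypothetical placeholders for a trained CART, not values from the case study.

```python
import math

# Sketch of the on-line identification step: compute the vector of signed
# distances from the current operating point to several hyper-planes
# (generalized distance of Eq. (7)) and walk a stand-in for the trained
# CART to a terminal node. All numeric values below are hypothetical.

def distance_to_hyperplane(w, midpoint, x):
    """Signed distance d(X) = W^T (X - midpoint) / ||W||."""
    num = sum(wi * (xi - mi) for wi, xi, mi in zip(w, x, midpoint))
    return num / math.sqrt(sum(wi * wi for wi in w))

# Two hypothetical hyper-planes in a 3-D measurement space, each given
# as (normal vector W, midpoint between the two group means):
hyperplanes = [
    ([1.0, -0.5, 0.2], [50.0, 50.1, 49.9]),
    ([0.3, 0.8, -0.4], [50.2, 49.8, 50.0]),
]

def identify_subspace(x):
    d = [distance_to_hyperplane(w, m, x) for w, m in hyperplanes]
    # Hand-coded stand-in for the CART splitting rules:
    if d[0] <= 0.0:
        return "subspace_1" if d[1] <= 0.0 else "subspace_2"
    return "subspace_3"

# Bus-frequency measurements of the current operating point (hypothetical):
print(identify_subspace([49.8, 50.3, 50.0]))
```

The identified terminal node would then select the corresponding pre-designed coordinated PSS parameter set for service.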