Wind Turbine Power Curve Upgrades: Part II

Abstract: Wind turbine power upgrades have recently become a debated topic in wind energy research. Their assessment is challenging and calls for dedicated techniques, mainly because of the stochastic nature of the wind and the multivariate dependency of wind turbine power. In this work, two test cases were studied. The former is a yaw management optimization on a 2 MW wind turbine; the latter is a comprehensive control upgrade (pitch, yaw, and cut-out) for 850 kW wind turbines. The upgrade impact was estimated by analyzing the difference between the post-upgrade power and a data-driven simulation of the power that would have been produced had the upgrade not taken place. Therefore, a reliable model for the pre-upgrade power of the wind turbines of interest was needed and, in this work, a principal component regression was employed. The yaw control optimization was shown to provide a production improvement of 1.3% and the control re-powering provided 2.5%. Another qualifying point was that, for the 850 kW wind turbine re-powering, the data quality was sufficient for an upgrade estimate based on power curve analysis, and good agreement with the model result was obtained. In summary, evidence of the profitability of wind turbine power upgrades was collected and data-driven methods were developed for power upgrade assessment and, more generally, for wind turbine performance control and monitoring.


Introduction
Wind capacity worldwide is growing impressively and, furthermore, many multi-megawatt wind turbines have been operating for years. Production optimization for each single wind turbine therefore has two main directions. On the one side, the unavailability time can be diminished through condition-based maintenance strategies: for example, it is estimated that the unavailability time of a modern wind turbine is currently of the order of 3% [1] and can diminish further. On the other side, technology updates of wind turbines during their operational lifetime have been flourishing in recent years and have been producing non-negligible improvements in the efficiency of wind kinetic energy conversion. The assessment of these wind turbine power upgrades, and the methodologies for studying them, constitute the topic of the present work.
There are basically two types of wind turbine power upgrades currently employed in operating wind turbines: aerodynamic and control upgrades, or possibly a combination of the two. Examples of aerodynamic retrofitting of the blades are the installation of vortex generators, passive flow control devices, Gurney flaps and so on [15][16][17][18][19][20][21][22][23]. Control upgrades typically deal with pitch [24,25], rotor revolutions per minute [26], and yaw management. A production increase can also be achieved by modifying the wind speed cut-out management, as discussed, for example, in [27][28][29].
It often happens that wind farm manufacturers and wind farm owners cooperate on the technology improvement of operating wind turbines, with forms of profit sharing on wind turbine power upgrades. This fact has considerably stimulated the high-level analysis of operational data in the industry and the collaboration with academia. Actually, there are several critical points about the assessment of wind turbine power upgrades:

• The wind source is stochastic and it does not make sense to compare the cumulative production before and after a power upgrade.
• It is difficult to account for the multiple dependency of wind turbine power on climate and operating conditions.
• It can be difficult to reliably know the wind conditions on site, in general because nacelle anemometers are mounted behind the rotors of the wind turbines and in particular because cup anemometers might not provide adequate measurement precision.
On these grounds, the power curve study might be a reliable tool for assessing power upgrades only when considerably long datasets are available, in order to avoid the effect of seasonal biases due to the variation of climate conditions on site. If, as commonly happens, wind farm practitioners aim at obtaining an estimate after just a few months of upgrade operation, more complex and powerful methods are needed. A certain amount of literature has been flourishing about this problem and some interesting methods have recently been proposed. The common ground is the following idea:

• After the upgrade, the power production is known, provided that operation data are available.
• The production improvement is the difference between the measured production post-upgrade and a simulation of how much the wind turbine would have produced, in the same conditions, if the upgrade had not taken place.
• The simulation must be achieved with a model based on pre-upgrade data.
Chronologically, the first relevant study is [24]: in that work, a modification of the Gaussian kernel regression method is proposed to account for the multivariate dependency of the power of the wind turbine. Two upgrade test cases are studied: one is aerodynamic (vortex generator installation) and is studied through the analysis of operation data; the other regards the control of the pitch and is studied artificially, because the pitch behavior is simulated and data are synthesized accordingly. In [30], another critical point of this kind of problem is discussed in depth: the statistical significance and the dataset dimensionality. The proposed solution is the use of time-resolved operation data, rather than Supervisory Control And Data Acquisition (SCADA) data. The former kind of data has a sampling time of the order of one second, while the latter has a sampling time of some minutes (typically, ten). In [30], it is shown that, using the time-resolved data, it is possible to obtain results similar to those from the Kernel-plus method of [24], but with a much simpler method: the so-called power-power or side-by-side method, which is based on the study of the power difference between the target (upgraded) wind turbine and a reference wind turbine, before and after the upgrade of the wind turbine of interest. In [25], three test cases of wind turbine power curve upgrades are considered: pitch angle optimization near the cut-in, installation of vortex generators and passive flow control devices, and cut-out management optimization. The first two test cases are studied by modeling the pre-upgrade power of the wind turbines of interest using an Artificial Neural Network (ANN) model having as input some operation variables of the nearby wind turbines.
A control upgrade, dealing with the optimization of the rotor revolutions per minute in order to reach the most appropriate induction level, is studied in [26]: in that work, the power-power method is generalized by modeling the power of the upgraded wind turbine through a multivariate linear model, employing as input variables some operation parameters of the nearby wind turbines. For other issues regarding this topic, see also [31][32][33].
On these grounds, the objective of the present work was to provide further contributions to the topic of wind turbine power curve upgrade assessment. To this end, two test cases were considered:

• The first test case deals with the yaw management optimization on a 2 MW wind turbine. There is a considerable literature about the potential of wind turbine efficiency improvement through advances in yaw management (see, for example, [34]), but, at this stage, the available studies mainly deal with simulation estimates (for example, recently, in [35], a yaw control strategy based on reinforcement learning is designed). To the best of the authors' knowledge, the study in this work is the first in the literature that is based on wind turbines in operation.
• The second test case deals with a control upgrade on an 850 kW wind turbine. Since the technology of this kind of device is gradually becoming obsolete, the re-powering on this wind turbine has had a larger impact and has dealt with pitch, yaw, and cut-out management optimization.
An interesting point about this upgrade is that the measuring chain was improved, through the installation of a sonic anemometer. Furthermore, the wind farm manufacturer arranged a testing period of the upgrade for some months, by alternating half-hour intervals characterized by the operation with the pre-and post-upgrade control logic. Therefore, it was possible to compare the two power curves quite reliably, because the wind speed data have good quality and because the data were collected in the same period and seasonal biases were therefore avoidable. This gives the possibility of verifying the model-based estimate of the production improvement through another, independent, approach.
The two above test cases were studied with particular attention to the methodology. The selected model for the power of the wind turbines of interest was a multivariate linear model and it was decided that several operation parameters of the nearby wind turbines could in principle be input variables for the model. This can be considered a generalization of the concept of rotor-equivalent wind speed [36]: the conditions on site can be described, for example, through the blade pitches, the rotor revolutions per minute, and the power output of the wind turbines constituting the wind farm. As discussed in detail throughout the manuscript, remarkable collinearity between the possible covariates of the models was observed and for this reason a principal component regression [37] was employed, in contrast, for example, to [33], where a stepwise regression algorithm [38] was used to select the input variables of an ordinary least squares regression. This approach is general and does not depend on the test case: therefore, it can be considered a contribution to the methodologies for wind turbine performance control and monitoring.
As regards the selected test cases, the results of this work are that the yaw control optimization on the 2 MW wind turbine provided a production improvement of 1.3% of the annual energy production (AEP), and the 850 kW wind turbine re-powering provided an improvement of 2.5% of the AEP.
The structure of the manuscript is as follows. In Section 2, the test cases and the datasets are described. Section 3 is devoted to the methods: the employed model is discussed in general and implemented in particular for the two selected test cases. In Section 4, the results for the production improvement are collected and discussed. Section 5 is devoted to the conclusions and to some further directions of the present work.

The Test Cases and the Datasets
One wind turbine for each test case wind farm underwent the corresponding upgrade (WTG02 in Wind Farm 1 and WTG022 in Wind Farm 2). Actually, the wind farm owner has been adopting the following approach as regards power upgrades: selecting some test wind turbines and, after some months of operation, assessing the impact of the upgrade on the grounds of studies such as the present one. Subsequently, the wind farm owner decides if it is worth extending the upgrade to the other wind turbines in the wind farm.
The employed datasets were obtained from the SCADA databases of the wind turbines. Their quality was checked as follows:

• Data were filtered on the condition that all wind turbines in the wind farm were productive. This was done using the appropriate operation time counter available in the dataset.
• The quality of the anemometer data was crosschecked overall for each wind turbine by comparing the average power curve against the theoretical one, and no relevant anomalies were detected.
• The quality of the data for each time step for each wind turbine was crosschecked by comparing the actual power production for the measured nacelle wind speed against the theoretical power curve. If a deviation larger than 30% was detected, the measurement was rejected.
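The per-record check in the last item can be sketched as follows. This is only an illustration: the function names, the tabulated curve, and the choice of measuring the 30% deviation relative to the theoretical power are assumptions, since the text does not specify them.

```python
from bisect import bisect_left

def theoretical_power(wind_speed, curve):
    """Linearly interpolate a tabulated power curve [(wind speed m/s, power kW), ...]."""
    speeds = [s for s, _ in curve]
    if wind_speed <= speeds[0]:
        return curve[0][1]
    if wind_speed >= speeds[-1]:
        return curve[-1][1]
    k = bisect_left(speeds, wind_speed)          # first tabulated speed >= wind_speed
    (s0, p0), (s1, p1) = curve[k - 1], curve[k]
    return p0 + (p1 - p0) * (wind_speed - s0) / (s1 - s0)

def keep_record(wind_speed, power, curve, max_rel_dev=0.30):
    """Return True if the measured power is within max_rel_dev of the theoretical one."""
    expected = theoretical_power(wind_speed, curve)
    if expected <= 0.0:
        return False  # assumption: records with no expected production are discarded
    return abs(power - expected) / expected <= max_rel_dev

# Hypothetical, coarse power curve for a 2 MW machine (illustrative values only)
demo_curve = [(3.5, 0.0), (8.0, 700.0), (14.5, 2000.0), (25.0, 2000.0)]
print(keep_record(8.0, 650.0, demo_curve))   # small deviation: kept
print(keep_record(8.0, 400.0, demo_curve))   # large deviation: rejected
```

In practice the tabulated curve would be the manufacturer's theoretical power curve, sampled much more finely than in this toy example.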

Test Case 1: Yaw Control Optimization, 2 MW Wind Turbine
The wind farm of interest is composed of six horizontal-axis three-bladed wind turbines, each having 2 MW of rated power and a rotor diameter of 92.5 m. The cut-in is 3.5 m/s and the cut-out is 25 m/s. The nominal wind speed is 14.5 m/s. The layout of the wind farm is reported in Figure 1 and the wind turbine of interest (WTG02) is indicated in red. The wind farm is sited onshore in a gentle terrain in southern Italy. The inter-turbine distances go from the order of 7 rotor diameters (between nearest neighbors) up to the order of 19 rotor diameters. The data available were organized in two datasets as follows:

• The first dataset is denoted as $D_{bef}$ and contains the data collected from 1 January 2017 to 20 August 2018, a period prior to the yaw control upgrade on turbine WTG02. It is composed of 35,971 data points.
• The second dataset is denoted as $D_{aft}$ and contains the data collected from 1 September 2018 to 1 January 2019, a period after the control optimization on turbine WTG02. It is composed of 9288 data points.
In Figure 2, the normalized autocovariance of the power output of WTG02 is reported as a function of the lag (up to 20) for the $D_{bef}$ dataset. This was done to crosscheck the assumption that each measurement can be considered independent with respect to the others.
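A normalized autocovariance of this kind can be computed as in the following minimal sketch. The biased estimator (division by the full sample size at every lag) is an assumption; the text does not state which estimator was used.

```python
def normalized_autocovariance(series, max_lag=20):
    """Autocovariance at lags 0..max_lag, divided by the lag-0 value (biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]
    c0 = sum(c * c for c in centered) / n
    result = []
    for lag in range(max_lag + 1):
        ck = sum(centered[t] * centered[t + lag] for t in range(n - lag)) / n
        result.append(ck / c0)
    return result

# Toy series alternating between +1 and -1: strongly anti-correlated at lag 1
demo_acf = normalized_autocovariance([1.0, -1.0] * 50, max_lag=2)
print(demo_acf)
```

Values that decay quickly towards zero with increasing lag support treating the ten-minute records as approximately independent.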
The wind direction roses, measured at WTG02, during $D_{bef}$ and $D_{aft}$ are reported in Figure 3 and it arises that the distributions are very similar before and after the upgrade of WTG02. Therefore, it can be argued that, as far as can be analyzed from the data available, the model formulation and use are not biased by differences in climatology. This is supported also by the fact that the ratio between the average nacelle wind speeds at WTG02 during $D_{bef}$ and $D_{aft}$ is 1.04.
The SCADA collected data have ten minutes of sampling time. The available validated measurements are:
The effect of the control upgrade was an improvement of the yaw management, especially for low and moderate wind intensities. This resulted in a decrease of the occurrence of high yaw misalignment. Qualitatively, this was assessed as follows: the yaw misalignment of WTG02 was computed as the difference between the wind direction measured at the nacelle and the position of the nacelle itself. As a prerequisite, data were filtered on the condition that WTG02 was in production. The plot of the yaw misalignment (in degrees) against the power is reported in Figure 4 for the datasets $D_{bef}$ and $D_{aft}$.
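The yaw misalignment defined above is a difference of two angles, so some care is needed around the 0°/360° discontinuity. A minimal sketch follows; wrapping the result to the interval [-180°, 180°) is an assumption of this example, as is the function name.

```python
def yaw_misalignment(wind_direction_deg, nacelle_position_deg):
    """Signed difference between wind direction and nacelle position, in [-180, 180)."""
    return (wind_direction_deg - nacelle_position_deg + 180.0) % 360.0 - 180.0

print(yaw_misalignment(5.0, 355.0))    # wind slightly clockwise of the nacelle
print(yaw_misalignment(350.0, 10.0))   # wind slightly counter-clockwise of the nacelle
```

Without the wrapping, the first case would be reported as a misalignment of -350° instead of 10°.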

Test Case 2: Control Re-Powering, 850 kW Wind Turbine
The wind farm of interest is composed of twenty-three horizontal-axis three-bladed wind turbines, each having 850 kW of rated power and a rotor diameter of 58 m. The cut-in is 3 m/s and the cut-out is 20 m/s. The nominal wind speed is 12.5 m/s. The layout of the wind farm is reported in Figure 5 and the wind turbine of interest (WTG022) is indicated in red. The wind farm is sited onshore in a gentle terrain in northern France. The inter-turbine distances go from the order of four rotor diameters (between nearest neighbors) to the order of 100 rotor diameters. The wind direction rose on-site is quite uniform. The data available were organized into two datasets as follows:

• The first dataset is denoted as $D_{bef}$ and contains the data collected from 1 February 2018 to 20 August 2018, a period prior to the control upgrade on turbine WTG022. It is composed of 15,353 data points.
• The second dataset is denoted as $D_{aft}$ and contains the data collected from 24 August 2018 to 1 April 2019, a period after the control optimization on turbine WTG022. In this period, half-hour intervals of operation according to the pre- and post-upgrade logic were alternated. The former subset is indicated as $D_{aft}^{non-up}$ and is composed of 4245 data points. The latter subset is indicated as $D_{aft}^{up}$ and is composed of 4265 data points. Only $D_{aft}^{up}$ is employed for the model-based estimate of the upgrade, while both subsets are employed for the power curve study in Section 4.2.
In Figure 6, the normalized autocovariance of the power output of WTG022 is reported as a function of the lag (up to 20) for the $D_{bef}$ dataset. This was done to crosscheck the assumption that each measurement can be considered independent with respect to the others.
The wind direction roses, measured at WTG022, during $D_{bef}$ and $D_{aft}$ are reported in Figure 7 and it arises that the distributions are similar before and after the upgrade of WTG022. Therefore, it can be argued that, as far as can be analyzed from the data available, the model formulation and use are not remarkably biased by climatology effects. This is supported also by the fact that the ratio between the average nacelle wind speeds at WTG022 during $D_{bef}$ and $D_{aft}$ is 0.96.
The SCADA collected data have ten minutes of sampling time. The available validated measurements are:

• nacelle wind speed;
• temperature outside the nacelle; and
• active power.

It should be noticed that, after the upgrade intervention, WTG022 has a sonic anemometer at the nacelle. As discussed in detail in Section 4.2, the operation of WTG022 during $D_{aft}$ was as follows: half-hour intervals of operation according to the pre- and post-upgrade control logic were alternated. Therefore, to assess the upgrade using the techniques proposed in Section 3, only the data in $D_{aft}$ characterized by operation according to the upgraded control were selected. Instead, for the power curve study of Section 4.2, all data in $D_{aft}$ were used, after dividing them according to the pre- or post-upgrade behavior.

The Methods
This section presents the formulation of a reliable model for the pre-upgrade power of the wind turbines of interest (WTG02 in Wind Farm 1 and WTG022 in Wind Farm 2). Section 4 is devoted in detail to the use of these models for the performance improvement estimate. For the moment, it is important to recall that a good model for the power of the wind turbines was needed because it was trained with pre-upgrade data and validated against a pre-upgrade dataset; the upgrade was quantified by simulating, through the adopted model, what the post-upgrade power would have been if the upgrade had not taken place. In other words, the performance improvement was derived from how the residuals between measurements and model estimates changed after the upgrade with respect to before.
As anticipated in Section 1, the critical point was selecting the model type and the input variables. The discussion in [33], in relation to the work in [25], indicates that a linear model can be adequate for this objective. In other words, it is possible to approximate reliably the power of a wind turbine as a linear function of operation variables measured at the nearby wind turbines in the farm. This makes sense not only from a statistical point of view, but also from the point of view of wind energy practice: since a wind turbine acts as a filter on the wind fluctuations, the blade pitch, the rotor revolutions per minute and the active power of a wind turbine can likely be used to account for the on-site wind conditions [36].
The possible variables fed as input to the model are those indicated in Section 2.1 for Test Case 1 and those indicated in Section 2.2 for Test Case 2, for all the wind turbines in the wind farms except the upgraded ones. The decision to exclude the variables of the upgraded wind turbines from the model inputs was motivated by the fact that the wind sensors might change after the upgrade (as in Test Case 2, see Section 2.2), or the upgrade might affect the measuring chain of the wind conditions (as discussed in [25,33]), or in general the relation between the power and the control (pitch, rotor revolutions per minute, etc.) might change as a consequence of the upgrade. Therefore, since the employed method assumes that the input variables to the model are "probes" of the external conditions whose behavior does not change after the upgrade of the wind turbine of interest, the variables of the upgraded wind turbine can only be the target (i.e., the output) of the model. Similar to Astolfi et al. [33], a linear model was considered adequate for the objectives of this work. The critical point is the selection of the input variables: Tables 1 and 2 indicate that the possible covariates of the model can be highly correlated and this would lead to a non-optimal standard linear regression. On these grounds, Principal Component Regression (PCR) [39] was selected for this study. The use of this method for control and monitoring purposes in wind energy has been growing [40]. The procedure is as follows. Let $Y_{n,1} = (y_1, \dots, y_n)^T$ be the vector of measured output and $X_{n,p} = (x_1, \dots, x_n)^T$ be the matrix of covariates, where $n$ is the number of observations and $p$ is the number of covariates. In the following, it is assumed that $X$ is normalized such that each covariate has zero mean.
The standard least squares regression poses that

$$Y = X\beta + \varepsilon, \tag{1}$$

where $\beta$ is the vector of regression coefficients that must be estimated from the data and $\varepsilon$ is a vector of random errors. The ordinary least squares estimate of $\beta$ is given by

$$\hat{\beta}_{OLS} = (X^T X)^{-1} X^T Y. \tag{2}$$

The principal component estimate of $\beta$ is obtained as follows. The principal component transformation of the covariates matrix can easily be expressed in terms of the singular value matrix factorization. Therefore, let

$$X = U \Delta V^T \tag{3}$$

be the singular value decomposition of $X$. This means that the columns of $U$ and $V$ are orthonormal sets of vectors denoting the left and right singular vectors of $X$, and $\Delta$ is a diagonal matrix whose elements are the singular values of $X$. This allows decomposing $X^T X$ as:

$$X^T X = V \Delta^2 V^T = V \Lambda V^T, \tag{4}$$

where $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_p)$ and $\lambda_1 \geq \dots \geq \lambda_p \geq 0$. $X V_i$ is the $i$th principal component and $V_i$ is the $i$th loading, corresponding to the $i$th principal value $\lambda_i$. Therefore, $W = XV$ can be viewed as a new covariates data matrix and the principal component regression basically is an ordinary least squares regression between $Y$ and $W$. A powerful aspect of the principal component regression is that the decomposition in Equation (4) indicates a sort of regularization scheme: namely, the matrix $W$ can be truncated to include a desired number of principal components. This regularization addresses the problem of multicollinearity of covariates because, when two or more covariates are highly correlated, $X$ tends to lose its full rank and this implies that $X^T X$ has some eigenvalues tending to 0: excluding the principal components associated with the smallest eigenvalues $\lambda_i$ means regularizing the covariates matrix so that it has full rank.
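The truncated regression on principal components described above can be sketched with NumPy as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and centering the output and carrying an explicit intercept are assumptions added so that the sketch can be applied to non-centered data.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Principal component regression keeping the first n_components components."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean                                   # zero-mean covariates
    # Thin SVD: Xc = U diag(s) Vt, singular values sorted in decreasing order
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T                           # retained loadings
    W = Xc @ V                                        # retained principal components
    gamma, *_ = np.linalg.lstsq(W, y - y_mean, rcond=None)  # OLS of y on W
    beta = V @ gamma                                  # back to the original covariates
    intercept = y_mean - x_mean @ beta
    return beta, intercept

def pcr_predict(X, beta, intercept):
    return X @ beta + intercept

# Toy data: the third covariate is exactly the sum of the first two, so the
# centered matrix has rank 2 and ordinary least squares would be ill-posed.
rng = np.random.default_rng(0)
c1, c2 = rng.normal(size=200), rng.normal(size=200)
X_demo = np.column_stack([c1, c2, c1 + c2])
y_demo = 2.0 * c1 + 3.0 * c2 + 1.0
beta, intercept = pcr_fit(X_demo, y_demo, n_components=2)
y_fit = pcr_predict(X_demo, beta, intercept)
print(np.max(np.abs(y_fit - y_demo)))
```

With two retained components the fit reproduces the noise-free toy output, even though the three covariates are perfectly collinear.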
Finally, the PCR estimate of $\beta$ is given as

$$\hat{\beta}_{PCR} = V (W^T W)^{-1} W^T Y, \tag{5}$$

where it is assumed that the matrices can be truncated to a desired number of columns, i.e., principal components. The selection of an adequate number of principal components for the regression is performed through K-fold cross-validation [41]. $D_{bef}$ is divided randomly in two fractions: $(K - 1)/K$ of the data are used for training and the remaining $1/K$ are used for validation. $K$ was selected to be 10 for this study. The training data are employed for estimating $\beta$ through principal component regression (Equation (5)) and the model estimate of the validation data is given by

$$\hat{Y}_{valid} = X_{valid} \hat{\beta}_{PCR}. \tag{6}$$

For each fold selection, the root mean square error is used as a metric for the goodness of the regression: it is given in general by

$$RMSE = \sqrt{\frac{1}{n_{valid}} \sum_{i=1}^{n_{valid}} \left(y_i - \hat{y}_i\right)^2}, \tag{7}$$

where $n_{valid}$ is the number of rows of $X_{valid}$. The RMSE values are subsequently averaged over the fold selections and, therefore, for a given number $j$ of principal components included in the regression, one obtains a unique metric $RMSE_j$ for estimating the quality of the regression. The final selection of the number of principal components to be kept is performed as follows: given $RMSE_j$, the error estimate for the model with $j$ principal components, if $RMSE_j - RMSE_{j+1} < 10\%$, $j$ is selected. It should be noticed that, as discussed in Section 4, the results for both test cases do not depend appreciably on this choice, as long as a certain minimum number of principal components is included in the model.
A test can be formulated for inquiring into the statistical significance of the change in performance of the wind turbine of interest after the upgrade. One can pose that the output can be modeled through a linear model before and after the upgrade and inquire whether there has been a structural change, i.e., whether the linear models before and after the upgrade are different.
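The cross-validated choice of the number of principal components can be sketched as below. Several details are assumptions of this example rather than statements from the text: the 10% threshold is read as a relative decrease of the averaged RMSE, the same random folds are reused for every candidate number of components, and all names are hypothetical.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """PCR on centered data; returns coefficients and intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    _, _, Vt = np.linalg.svd(X - x_mean, full_matrices=False)
    V = Vt[:n_components].T
    gamma, *_ = np.linalg.lstsq((X - x_mean) @ V, y - y_mean, rcond=None)
    beta = V @ gamma
    return beta, y_mean - x_mean @ beta

def cv_rmse(X, y, n_components, n_folds=10, seed=0):
    """RMSE on the held-out fold, averaged over the K folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    errors = []
    for i in range(n_folds):
        valid = folds[i]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        beta, b0 = pcr_fit(X[train], y[train], n_components)
        resid = y[valid] - (X[valid] @ beta + b0)
        errors.append(np.sqrt(np.mean(resid ** 2)))
    return float(np.mean(errors))

def select_components(X, y, max_components, tol=0.10):
    """Smallest j whose RMSE improves by less than tol when a component is added."""
    rmse = [cv_rmse(X, y, j) for j in range(1, max_components + 1)]
    for j in range(1, max_components):
        if (rmse[j - 1] - rmse[j]) / rmse[j - 1] < tol:   # RMSE_j vs RMSE_{j+1}
            return j, rmse
    return max_components, rmse

# Toy data: six collinear covariates driven by two latent factors
rng = np.random.default_rng(1)
z = rng.normal(size=(500, 2))
a = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])
b = np.array([0.0, 0.0, 0.0, 0.5, 0.5, 0.5])
X_demo = (np.outer(z[:, 0] + z[:, 1], a) + np.outer(z[:, 0] - z[:, 1], b)
          + 0.01 * rng.normal(size=(500, 6)))
y_demo = 3.0 * z[:, 0] - 2.0 * z[:, 1] + 0.05 * rng.normal(size=500)
n_selected, rmse_curve = select_components(X_demo, y_demo, max_components=6)
print(n_selected, np.round(rmse_curve, 3))
```

In the toy example the error drops sharply when the second component is added and then flattens, which is the behavior the selection criterion looks for.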
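Suppose therefore that

$$Y_{bef} = X_{bef} \beta_{bef} + \varepsilon_{bef}, \qquad Y_{aft} = X_{aft} \beta_{aft} + \varepsilon_{aft}, \tag{8}$$

where the subscripts denote the data before and after the upgrade, $X$ is the matrix of explanatory variables, $\beta$ is the vector of regression coefficients and $\varepsilon$ are vectors of random errors.
The hypothesis test about the structural change of the regression regards the null hypothesis:

$$H_0: \beta_{bef} = \beta_{aft}. \tag{9}$$

This is known as the Chow test and is based on the fact that, denoting by RSS the residual sum of squares between measurements and model estimates, the quantity

$$F = \frac{\left[RSS_{TOT} - \left(RSS_{bef} + RSS_{aft}\right)\right] / K}{\left(RSS_{bef} + RSS_{aft}\right) / (N - 2K - 2)} \tag{10}$$

is distributed as $F(K, N - 2K - 2)$, where $K$ is the number of covariates, $N$ is the number of data points, $RSS_{TOT}$ refers to a single regression on all the data, and $RSS_{bef}$ and $RSS_{aft}$ refer to separate regressions before and after the upgrade. Practically, the Chow test is performed as follows. The covariates matrices and the output vectors before and after the upgrade are vertically juxtaposed to form a total covariates matrix $X_{TOT}$ and a total output vector $Y_{TOT}$. The test is performed with the assumption that the break point, where the structural change can happen, is located where the data before the upgrade end and the data after the upgrade start.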
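The practical procedure can be sketched as follows. The example uses the textbook form of the Chow statistic with ordinary least squares fits that include an intercept, so each regime has one parameter more than the number of covariates; this degrees-of-freedom convention for the numerator is an assumption of the sketch and may differ from the one quoted in the text. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares of an OLS fit with intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ coef
    return float(r @ r)

def chow_statistic(X_bef, y_bef, X_aft, y_aft):
    """Chow F statistic with the break point between the two datasets."""
    X_tot = np.vstack([X_bef, X_aft])            # vertically juxtaposed covariates
    y_tot = np.concatenate([y_bef, y_aft])
    k = X_bef.shape[1] + 1                       # parameters per regime, intercept included
    n = len(y_tot)
    rss_split = rss(X_bef, y_bef) + rss(X_aft, y_aft)
    rss_tot = rss(X_tot, y_tot)
    return ((rss_tot - rss_split) / k) / (rss_split / (n - 2 * k))

# Synthetic check: same linear relation vs. a shifted relation after the break
rng = np.random.default_rng(0)
X1, X2 = rng.normal(size=(300, 2)), rng.normal(size=(200, 2))
coef = np.array([1.0, -2.0])
y1 = X1 @ coef + rng.normal(0.0, 0.1, 300)
y2_same = X2 @ coef + rng.normal(0.0, 0.1, 200)
y2_shift = y2_same + 1.0
f_same = chow_statistic(X1, y1, X2, y2_same)
f_shift = chow_statistic(X1, y1, X2, y2_shift)
print(f_same, f_shift)
```

A p-value would follow from the survival function of the corresponding F distribution, for instance through `scipy.stats.f.sf` if SciPy is available.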

Test Case 1
In Table 1, some sample results are reported to support the selection of the principal component regression: the correlation coefficients between the rotor rotational speeds of WTG01 and WTG03-WTG06 are reported. These covariates were selected because the rotor basically acts as a filter, smoothing the fluctuations caused by the turbulence, and it is therefore likely that in a wind farm the rotor speeds of nearby wind turbines are highly collinear. The structure of the model for the test case of interest was selected as follows. The output $Y$ is the power of WTG02; the covariates matrix $X$ was selected to be composed of power, rotor rotational speed, generator rotational speed, blade pitches, nacelle position and ambient temperature at each wind turbine of the wind farm, except WTG02. Therefore, if one considers the filtered $D_{bef}$ dataset, $Y$ is a vector of 25,044 data points and $X$ is a matrix having 25,044 rows and 30 columns (six variables for five wind turbines).
The results for the model K-fold cross-validation are reported in Figure 8 and, with the criterion described in Section 3, five principal components were selected. It should be noticed that a sensitivity analysis was carried out and it was observed that the results do not change substantially by including more than five principal components. As regards the Chow test, the matrix $X_{TOT}$ is composed of 31,392 rows (25,044 before the upgrade and 6348 after the upgrade) and 30 columns, and the vector $Y_{TOT}$ is composed of 31,392 elements. The break point position for the Chow test is 25,044 and the computed p-value is lower than $10^{-32}$. This clearly indicates that the linear relation between the covariates matrix and the target has a structural change after the upgrade of WTG02.

Test Case 2
In Table 2, the correlation coefficients between some sample covariates are reported. The powers of WTG018-WTG021 and WTG023 were selected for reporting in Table 2: these covariates were selected for readability of the table and mostly because those wind turbines are the nearest to the target WTG022 and therefore those covariates are likely to be selected for a standard least squares regression. The remarkably high values reported in Table 2 support the selection of the principal component regression as model type. The structure of the model for the test case of interest was selected as follows. The output $Y$ is the power of WTG022. The covariates matrix $X$ is composed of the available validated measurements: nacelle wind speed, power and ambient temperature at each wind turbine of the wind farm, except WTG022. Therefore, if one considers the filtered $D_{bef}$ dataset, $Y$ is a vector of 15,353 data points and $X$ is a matrix having 15,353 rows and 66 columns (three variables for 22 wind turbines).
The results for the model K-fold cross-validation are reported in Figure 9 and, through the criterion described in Section 3, the number of selected principal components is six. It should be noticed that the results do not change significantly by including more than six principal components in the model. As regards the Chow test, the matrix $X_{TOT}$ is composed of 19,618 rows (15,353 before the upgrade and 4265 after the upgrade) and 66 columns, and the vector $Y_{TOT}$ is composed of 19,618 elements. The break point position for the Chow test is 15,353 and the computed p-value is lower than $10^{-32}$. This clearly indicates that the linear relation between the covariates matrix and the target has a structural change after the upgrade of WTG022.

The Results
The procedure for assessing the upgrade was based on the following idea. After an upgrade installation, through the SCADA collected data, it is possible to know the power production of the upgraded wind turbine. To estimate the impact of the upgrade, one should know how much the wind turbine would have produced if the upgrade had not taken place. The most reliable and practical way to obtain this kind of estimate is through a data-driven model, based on the pre-upgrade datasets. A reliable model was achieved (Section 3) and it was used for the upgrade assessment presented below.
The procedure is as follows. The datasets available were organized in this way:

• $D_{bef}$ was randomly divided in two subsets: D0 (2/3 of the data) and D1 (1/3 of the data). D0 was used for training the model and constructing the weight matrix $W$, and D1, the pre-upgrade dataset, was employed for validating the model.
• $D_{aft}$, the post-upgrade dataset, was employed for estimating the power upgrade. For notation consistency, it is also referred to equivalently as D2.
Notice that, for Test Case 2, the dataset $D_{aft}^{up}$ was employed as D2. The residuals between the measurement $y$ and the simulation $\hat{y}$, for the datasets D1 and D2, were studied. The focus was on how the residuals varied after the upgrade. Therefore, consider Equation (11), where $x_i \in \mathrm{D}i$:

$$R(x_i) = y(x_i) - \hat{y}(x_i), \qquad i = 1, 2. \tag{11}$$

A Student's t-test was performed to inquire if there was any statistically significant change in the turbine output after the upgrade. The t statistic is computed as

$$t = \frac{\bar{R}_2 - \bar{R}_1}{\sigma_R \sqrt{\frac{1}{N_1} + \frac{1}{N_2}}}. \tag{12}$$

In Equation (12), $N_1$ and $N_2$ are the numbers of data points in, respectively, D1 and D2; $\bar{R}_1$ and $\bar{R}_2$ are the average residuals in datasets D1 and D2, respectively; and $\sigma_R$ is given in Equation (13):

$$\sigma_R = \sqrt{\frac{(N_1 - 1) S_1^2 + (N_2 - 1) S_2^2}{N_1 + N_2 - 2}}, \tag{13}$$

where $S_1$ and $S_2$ are the standard deviations of the residuals in datasets D1 and D2, respectively. As regards the upgrade estimate, for $i = 1, 2$, one computes

$$\Delta_i = 100 \cdot \frac{\sum_{x \in \mathrm{D}i} \left(y(x) - \hat{y}(x)\right)}{\sum_{x \in \mathrm{D}i} y(x)} \tag{14}$$

and

$$\delta_i = \frac{1}{N_i} \sum_{x \in \mathrm{D}i} \left(y(x) - \hat{y}(x)\right), \tag{15}$$

where $N_i$ is the number of data points in the datasets D1 and D2, respectively. Notice that, if the model is reliable, one should have $\delta_1 \simeq 0$ and $\Delta_1 \simeq 0$, differently from what should happen for $\delta_2$ and $\Delta_2$ if the upgrade is really effective. Finally, the quantity

$$\Delta = \Delta_2 - \Delta_1 \tag{16}$$

can be taken as a percentage estimate of the production improvement. In the case the datasets D1 and D2 are characterized by considerably different $y$ distributions, it might be appropriate to take this into account by renormalizing Equation (16): a reasonable correction factor can be the ratio between the $y$ averages in datasets D2 and D1. The above procedure can be repeated several times to emulate experiment repetition: at each run of the model, a different D0 (training set) and therefore D1 (pre-upgrade validation dataset) can be selected. Notice that this basically corresponds to repeating the K-fold cross-validation. The difference with respect to the procedure described in Section 3 is that in this case the model structure was always the same and it was exactly the one selected on the grounds of the discussion in Section 3. The way the pre-upgrade data were divided was also changed with respect to Section 3.
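The residual statistics of this section can be illustrated with a short sketch. The formulas coded here are assumptions about the intended definitions: a pooled two-sample t statistic on the residuals, the average residual as the absolute indicator, and the cumulative residual divided by the cumulative measured production as the percentage indicator. All names and the toy numbers are hypothetical.

```python
import math

def residual_summary(y, y_hat):
    """Mean, sample variance and percentage sum of the residuals y - y_hat."""
    res = [a - b for a, b in zip(y, y_hat)]
    n = len(res)
    mean = sum(res) / n
    var = sum((r - mean) ** 2 for r in res) / (n - 1)
    percent = 100.0 * sum(res) / sum(y)      # percentage of the cumulative production
    return {"n": n, "mean": mean, "var": var, "percent": percent}

def upgrade_estimate(y1, y1_hat, y2, y2_hat):
    """Compare residuals on D1 (pre-upgrade validation) and D2 (post-upgrade)."""
    s1, s2 = residual_summary(y1, y1_hat), residual_summary(y2, y2_hat)
    pooled = math.sqrt(((s1["n"] - 1) * s1["var"] + (s2["n"] - 1) * s2["var"])
                       / (s1["n"] + s2["n"] - 2))
    t = (s2["mean"] - s1["mean"]) / (pooled * math.sqrt(1 / s1["n"] + 1 / s2["n"]))
    return {"t": t,
            "mean_residual_1": s1["mean"], "mean_residual_2": s2["mean"],
            "percent_1": s1["percent"], "percent_2": s2["percent"],
            "improvement_percent": s2["percent"] - s1["percent"]}

# Toy power values in kW (illustrative only)
demo = upgrade_estimate(
    y1=[100.0, 200.0, 300.0, 400.0], y1_hat=[101.0, 199.0, 301.0, 399.0],
    y2=[100.0, 200.0, 300.0, 400.0], y2_hat=[95.0, 191.0, 290.0, 384.0],
)
print(demo)
```

In the toy data the pre-upgrade residuals average to zero, while after the upgrade the measured power exceeds the model estimate by 4% of the cumulative production.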
The selection of D0 and D1 actually corresponded to K = 3. This was done because it agrees with most of the rules of thumb for data partition for this kind of task and because, with this selection, the dimensions of D1 and D2 (the post-upgrade dataset) have the same order of magnitude. Therefore, the $\Delta$ estimate varied at each run of the model, because the training data changed and, therefore, $\Delta_1$ and $\Delta_2$ changed. In principle, it could be possible to select randomly a subset of D2 as post-upgrade simulation dataset, but in this work this choice was avoided. The reason was that typically D2 is shorter than D0 and D1, because for practical reasons an upgrade is assessed as soon as it can be done with good reliability. The above bootstrap technique therefore allowed having several estimates of $\Delta$ with the same data: the final estimate was always the average and it is presented below with its standard deviation. This corresponds to the procedure of Section 3 with J repetitions: for this part of the study, J was selected based on when the $\Delta$ average and standard deviation became fairly stable. It was observed that J = 30 is sufficient for this task.
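The repeated random partition can be sketched as follows. To keep the example short, an ordinary least squares fit is used as a stand-in for the principal component regression, and the percentage indicator is assumed to be the cumulative residual over the cumulative measured production; the names and the synthetic data, in which the post-upgrade output is 4% higher, are hypothetical.

```python
import numpy as np

def fit_linear(X, y):
    """OLS with intercept, used here as a stand-in for the PCR model."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Z: np.column_stack([np.ones(len(Z)), Z]) @ coef

def percent_residual(y, y_hat):
    return 100.0 * np.sum(y - y_hat) / np.sum(y)

def repeated_estimate(X_bef, y_bef, X_aft, y_aft, runs=30, train_frac=2 / 3, seed=0):
    """Mean and standard deviation of the improvement over random D0/D1 splits."""
    rng = np.random.default_rng(seed)
    n = len(y_bef)
    n_train = int(round(train_frac * n))
    estimates = []
    for _ in range(runs):
        idx = rng.permutation(n)
        d0, d1 = idx[:n_train], idx[n_train:]       # training and validation subsets
        predict = fit_linear(X_bef[d0], y_bef[d0])
        pre = percent_residual(y_bef[d1], predict(X_bef[d1]))
        post = percent_residual(y_aft, predict(X_aft))   # D2 is never resampled
        estimates.append(post - pre)
    return float(np.mean(estimates)), float(np.std(estimates, ddof=1))

# Synthetic example with a known 4% increase of the output after the upgrade
rng = np.random.default_rng(0)
x_bef = rng.uniform(0.0, 5.0, size=(900, 1))
x_aft = rng.uniform(0.0, 5.0, size=(300, 1))
y_bef = 500.0 + 100.0 * x_bef[:, 0] + rng.normal(0.0, 10.0, 900)
y_aft = 1.04 * (500.0 + 100.0 * x_aft[:, 0]) + rng.normal(0.0, 10.0, 300)
mean_delta, std_delta = repeated_estimate(x_bef, y_bef, x_aft, y_aft)
print(round(mean_delta, 2), round(std_delta, 2))
```

Because the percentage is referred to the measured post-upgrade production, a 4% higher output shows up as slightly less than 4% in this indicator.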

Test Case 1
Since the effect of the upgrade mainly concerns low to moderate wind intensity, data were filtered by requiring that the power of WTG02 be less than 1 MW. After this further filter, the number of data was 25,044 for D_bef and 6348 for D_aft.
The p-value of the t-test (Equation (12)) was of the order of 10^−10, which indicates that the probability that the upgrade was ineffective was correspondingly low. Table 3 reports the averages (over the J model runs) of ∆_i and δ_i with i = 1, 2 (Equations (14) and (15)). These results show that the upgrade could be detected as an average absolute increase of 13.5 kW in the difference between WTG02 power measurements and model estimates. Notice that the average value of the residuals for dataset D1 was extremely low (0.1 kW) and, correspondingly, the average estimate of ∆_1 (Equation (16)), i.e., the percentage error on the cumulative production, was extremely low as well. This indicates that the model was particularly reliable in simulating the pre-upgrade behavior of WTG02.
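The statistics reported here, and their averaging over J model runs, can be sketched in a few lines. This is a minimal illustration under explicit assumptions, not the authors' implementation: the t statistic is taken in the standard pooled-variance two-sample form, δ is the mean residual, ∆ is the cumulative residual as a percentage of the cumulative measured power, an ordinary least squares fit stands in for the principal component regression, and all function names and the default validation fraction of one third are hypothetical.

```python
import numpy as np


def upgrade_statistics(y1, y1_hat, y2, y2_hat):
    """Residual statistics on D1 (pre-upgrade validation) and D2 (post-upgrade)."""
    y1, y1_hat = np.asarray(y1, float), np.asarray(y1_hat, float)
    y2, y2_hat = np.asarray(y2, float), np.asarray(y2_hat, float)
    r1, r2 = y1 - y1_hat, y2 - y2_hat        # residuals: measurement minus model
    n1, n2 = r1.size, r2.size
    s1, s2 = r1.std(ddof=1), r2.std(ddof=1)  # sample standard deviations
    # Assumed pooled form of sigma_R and of the two-sample t statistic.
    sigma_r = np.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    t = (r2.mean() - r1.mean()) / (sigma_r * np.sqrt(1.0 / n1 + 1.0 / n2))
    delta = (r1.mean(), r2.mean())                                    # kW
    big_delta = (100.0 * r1.sum() / y1.sum(), 100.0 * r2.sum() / y2.sum())  # %
    return {"t": t, "delta": delta, "Delta": big_delta,
            "improvement": big_delta[1] - big_delta[0]}


def fit_linear(X, y):
    """Ordinary least squares with intercept (a stand-in for the PCR model)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Model estimate of the power for the covariates X."""
    return np.column_stack([np.ones(len(X)), X]) @ coef


def repeated_estimate(X_pre, y_pre, X_post, y_post, runs=30,
                      validation_fraction=1.0 / 3.0, seed=0):
    """Mean and standard deviation of the improvement over random D0/D1 splits."""
    rng = np.random.default_rng(seed)
    n = len(y_pre)
    n_valid = int(round(validation_fraction * n))
    estimates = []
    for _ in range(runs):
        order = rng.permutation(n)
        valid, train = order[:n_valid], order[n_valid:]   # D1 and D0
        coef = fit_linear(X_pre[train], y_pre[train])
        stats = upgrade_statistics(
            y_pre[valid], predict_linear(coef, X_pre[valid]),
            y_post, predict_linear(coef, X_post))
        estimates.append(stats["improvement"])
    return float(np.mean(estimates)), float(np.std(estimates, ddof=1))
```

The whole post-upgrade dataset is used at every run and only the pre-upgrade split is redrawn, which mirrors the choice described in the text of not subsampling D2.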
In Figure 10, the plot of R(x_1) and R(x_2) on a sample model run is reported. The data were averaged in power production intervals, whose amplitude was 5% of the rated power. From this plot, the effect of the upgrade can be read as an increase of the difference between the WTG02 power measurements and the WTG02 power model estimates.

Table 3. Average absolute and percentage residuals between measurement and model estimation.

From Table 3 and Equation (14), the average production improvement was estimated as ∆ = 4.3%, with a standard deviation of 0.4%: in other words, with the proposed method it was computed that WTG02, during the dataset D2, produced below 1 MW 4.1% more than it would have done if the upgrade had not been adopted. A reference long-term power or wind speed distribution can be employed to estimate what this corresponds to in terms of annual energy production (AEP), and the average result is ∆_AEP = 1.3 ± 0.1%. This result is consistent with the test case studies in [25]: the order of magnitude of the impact of multi-megawatt wind turbine control optimization can typically be estimated as 1% of the AEP. It is interesting to notice that, to the best of the authors' knowledge, this is the first estimate in the literature, based on operation data, of the impact of yaw control optimization.
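The averaging of residuals in power production intervals, as used for the residual plot of this test case, can be sketched as follows. This is a minimal sketch under assumptions: the function name is hypothetical, and the data are binned by the measured power y, which the text does not state explicitly (binning by the model estimate would be an equally plausible reading).

```python
import numpy as np


def binned_residuals(y, y_hat, rated_power, width_fraction=0.05):
    """Average residual y - y_hat in power intervals of width_fraction * rated_power.

    Assumption: data are assigned to intervals according to the measured power y.
    Returns the centers of the non-empty intervals and the mean residual in each.
    """
    y = np.asarray(y, float)
    y_hat = np.asarray(y_hat, float)
    width = width_fraction * rated_power
    bin_index = np.floor(y / width).astype(int)
    centers, means = [], []
    for k in np.unique(bin_index):
        in_bin = bin_index == k
        centers.append((k + 0.5) * width)
        means.append(np.mean(y[in_bin] - y_hat[in_bin]))
    return np.array(centers), np.array(means)
```

Calling the function once on the validation dataset D1 and once on the post-upgrade dataset D2 gives the two curves whose separation is read as the effect of the upgrade.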

Test Case 2
The p-value of the t-test (Equation (12)) was of the order of 10^−15, which indicates that the probability that the upgrade was ineffective was correspondingly low. Table 4 reports the averages (over the J model runs) of ∆_i and δ_i with i = 1, 2 (Equations (14) and (15)). The upgrade could be detected as an average absolute increase of 3.9 kW in the difference between WTG022 power measurements and model estimates. The average value of the residuals for dataset D1 was very low (0.2 kW) and, correspondingly, the average estimate of ∆_1 (Equation (16)), i.e., the percentage error on the cumulative production, was extremely low as well. This indicates that the model was reliable in simulating the pre-upgrade behavior of WTG022.

Table 4. Average absolute and percentage residuals between measurement and model estimation.

Residual δ (kW)    ∆ (%)
3.9                2.5

In Figure 11, the plot of R(x_1) and R(x_2) on a sample model run is reported. The data were averaged in power production intervals, whose amplitude was 10% of the rated power. From this plot, the effect of the upgrade can be read as an increase of the difference between the WTG022 power measurements and the WTG022 power model estimates, especially for moderately low wind intensities and approaching rated power.

From Table 4 and Equation (14), the average production improvement was estimated as ∆ = 2.5%, with a standard deviation of 0.2%: in other words, with the proposed method it was computed that WTG022, during the dataset D2, produced 2.5% more than it would have done if the upgrade had not been adopted.

Power Curve Analysis
As anticipated in Section 2, the post-upgrade operation during dataset D_aft was organized as follows: WTG022 operated in alternating half-hour intervals according to the pre-upgrade and the post-upgrade control logic, respectively. This was done to assess the effect of the upgrade practically in real time. With these data available, and taking into account that a sonic anemometer was collecting data at the WTG022 nacelle during D_aft, it was reasonable to study the power curve.
In Figure 12, the two power curves measured during D_aft are reported. Data were averaged in wind speed intervals of 0.5 m/s amplitude. In Figure 13, the difference between these two curves is plotted. It can interestingly be observed in Figure 13 that the upgraded operation mode actually lost performance around 10 m/s: the same behavior was observed in the residuals presented in Figure 11. Since this study was performed with only a few months of data in D_aft, it is plausible to expect that this behavior was subsequently adjusted, to obtain a performance improvement along the whole power curve.

The production improvement was estimated as follows: the power curve according to the pre-upgrade logic in Figure 12 and the power difference in Figure 13 were each weighted against the wind distribution during the whole D_aft dataset. The ratio between these two quantities provided an estimate of how much the production would have improved during D_aft if the power curve had always been the improved one, with respect to the production that would have been obtained if the power curve had always been the non-improved one. The improvement computed in this way amounted to 2.3% of the production. Even though it was computed with a different approach, it is interesting to notice that it agreed fairly well with the estimate reported in Section 4.2.
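The weighting described above can be sketched as follows. This is an illustrative sketch rather than the authors' code: the function names, the upper wind speed limit of 25 m/s, and the choice to skip intervals where either control mode has no data are assumptions; the 0.5 m/s interval width is taken from the text.

```python
import numpy as np


def binned_power_curve(wind_speed, power, edges):
    """Mean power in each wind speed interval (NaN where an interval is empty)."""
    wind_speed = np.asarray(wind_speed, float)
    power = np.asarray(power, float)
    index = np.digitize(wind_speed, edges) - 1
    curve = np.full(len(edges) - 1, np.nan)
    for k in range(len(edges) - 1):
        in_bin = index == k
        if in_bin.any():
            curve[k] = power[in_bin].mean()
    return curve


def power_curve_improvement(v_pre, p_pre, v_post, p_post,
                            bin_width=0.5, v_max=25.0):
    """Percentage production gain of the post-upgrade over the pre-upgrade curve."""
    edges = np.arange(0.0, v_max + bin_width, bin_width)
    curve_pre = binned_power_curve(v_pre, p_pre, edges)
    curve_post = binned_power_curve(v_post, p_post, edges)
    # Wind speed distribution over the whole period, both control modes together.
    all_speeds = np.concatenate([np.asarray(v_pre, float), np.asarray(v_post, float)])
    counts, _ = np.histogram(all_speeds, bins=edges)
    # Assumption: only intervals populated in both modes contribute.
    valid = ~np.isnan(curve_pre) & ~np.isnan(curve_post)
    weights = counts[valid] / counts[valid].sum()
    gain = np.sum(weights * (curve_post[valid] - curve_pre[valid]))
    baseline = np.sum(weights * curve_pre[valid])
    return 100.0 * gain / baseline
```

Because the two operation modes are sampled in alternating intervals of the same period, weighting both curves by the same overall wind speed distribution keeps the comparison free of seasonal effects.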

Conclusions
In this study, two test cases of wind turbine power curve upgrades were analyzed: the common ground between them is that the upgrade regards the control of the wind turbines. The difference between the two test cases is that one wind turbine (Test Case 1) has a quite recent technology (it is a 2 MW wind turbine) and the control upgrade deals only with one aspect (the management of the yaw); the other wind turbine under investigation (Test Case 2) belongs instead to a less recent technology (the rated power is 850 kW) and the upgrade has consequently involved several aspects of the control (yaw, pitch, and cut-out) and included the update of the anemometer sensors at the nacelle.
Despite being organized as a test case discussion, this study was strongly characterized by a methodological approach. The key difficulty in studying wind turbine power curve upgrades is that they are hard to assess reliably using operation data analysis techniques such as the power curve, because of the multivariate dependence of the power of a wind turbine on climate conditions and working parameters. The study of wind turbine power curve upgrades therefore translates into the following question: how can the power of a wind turbine be modeled reliably? The answer to this question can be exploited for several problems regarding the control and monitoring of wind turbines and, in general, of complex systems. As regards wind turbines, for example, similar approaches are employed in [42] to study how much pitch misalignment impacts the performance.
The turning point for the present study was practically adopting the other wind turbines in the wind farm as probes of on-site conditions. This somehow generalized the concept of rotor-equivalent wind speed, discussed, for example, in [36]: since the wind turbine acts as a filter, some working parameters such as active power, blade pitches, rotor or generator revolutions per minute can robustly describe the wind farm at the micro-scale level. Therefore, the idea of this study was modeling the power of the wind turbines of interest, according to their pre-upgrade behavior, as a linear function of the wind and operation conditions measured at the nearby wind turbines: this can basically be considered a generalization of the so-called power-power method, adopted, for example, in [30]. Since for the test cases considered in this work the possible covariates for a linear model displayed a remarkable collinearity, a principal component regression was adopted.
Using this modeling technique, the impact of the upgrades could be estimated from how the residuals between power measurements and power model estimates vary after the upgrade with respect to before. The results for the selected test cases are the following: the improvement from the yaw control optimization on the 2 MW wind turbine was estimated as 1.3% of the AEP, and the improvement from the re-powering on the 850 kW wind turbine was estimated as 2.5% of the AEP.
There are at least two other remarkable aspects as regards the selected test cases. To the best of the authors' knowledge, Test Case 1 is the first assessment in the literature of yaw control optimization using operation data, and the obtained results indicate that yaw management optimization is a promising direction for improving the power production of wind turbines. It is therefore valuable to push forward this line of research, as recently done, for example, in [35]. As regards Test Case 2, it was possible to obtain another estimate of the impact of the upgrade using the power curve study. With the re-powering, the anemometer sensors were updated and a sonic anemometer was installed. Furthermore, in the post-upgrade period examined for this study, the operation of the wind turbine was alternated: half an hour according to the pre-upgrade logic and half an hour according to the post-upgrade logic, and so on. The quality of the data and the fact that they were collected in the same period (avoiding seasonal biases) allowed studying the power curve reliably, and the improvement estimate was shown to be in good agreement with the computation from the multivariate model.
There are several further directions for the present work. Currently, some test cases are under study for which a linear model is not adequate, probably because of complex climatological conditions on site. It is therefore planned to investigate nonlinear approaches for this kind of study and to inquire what site characteristics call for nonlinearity. An interesting development is the use of the methods of this work for other control and monitoring issues related to wind turbine operation: for example, monitoring the effect of blade pitch re-alignment according to the technique proposed in [43], or monitoring the operation of the wind turbines [40]. Furthermore, a very promising direction for the study of wind turbine power curve upgrades is the use of time-resolved data, with a sampling time of the order of one second: this kind of data has considerable potential for performance control and monitoring [44], but its time scale calls for more advanced time-series analysis [45].