Wind Turbine Systematic Yaw Error: Operation Data Analysis Techniques for Detecting It and Assessing Its Performance Impact

The widespread availability of wind turbine operation data has considerably boosted research on, and applications of, wind turbine monitoring. It is well established that a systematic misalignment of the wind turbine nacelle with respect to the wind direction causes a remarkable under-performance, because the extracted power is, to a first approximation, proportional to the cube of the cosine of the yaw angle. However, because in wind farm practice the wind field facing the rotor is estimated through anemometers placed behind the rotor, it is challenging to robustly detect systematic yaw errors without additional upwind sensory systems. This objective is nevertheless valuable because it involves the use of data that are available to wind farm practitioners at zero cost. On these grounds, the present work is a two-step test case discussion. First, a new method for systematic yaw error detection through operation data analysis is presented and applied to identify a misaligned multi-MW wind turbine. After the yaw error correction on the test case wind turbine, operation data of the whole wind farm are employed in an innovative method for assessing the performance improvement at the target wind turbine. The other wind turbines in the farm are employed as references and their operation data are used as input for a multivariate kernel regression whose target is the power of the wind turbine of interest. By training the model with pre-correction data and validating on post-correction data, it is estimated that a systematic yaw error of 4° affects the performance up to the order of 1.5% of the Annual Energy Production.


Introduction
Horizontal-axis wind turbines (HAWTs) represent one of the most promising renewable energy technologies, especially as the energy density increases with the recent developments in rotor size [1].
Robust wind turbine control and appropriately optimized blade aerodynamics are fundamental for improving the energy yield by up to a few percent of the annual energy production (AEP) [2]. Robust control also means that the wind turbine should operate for as much time as possible with the rotor plane perpendicular to the wind flow: the yaw angle therefore has a theoretical target of 0°. In practice, this is not possible because of the mismatch between the time scale of the meandering wind direction and the large inertia of the nacelle, which has to be counteracted through the yaw motors.
Furthermore, the impact of the yaw angle on the energy yield is remarkable: a theoretical estimation [3] is that the extracted power is proportional to the cube of the cosine of the yaw angle. In [4], the effect of yaw error depending on the operation regime has been investigated through a wind turbine model, a yaw error model and an equivalent wind speed model that includes the wind shear and tower shadow effects: the conclusion is that the cosine cube law is simplistic and that yaw error can remarkably affect the power output; for example, an average yaw error of 10° can cause a power loss of up to 10%. The cosine cube law is also discussed in [5], where it is argued that the law is rather $\cos^X$, with the exponent X depending on the operation regime and ranging from 1.88 to 5.14. In [6], the aeroelastic damping under yawed conditions has been analyzed and it has been observed that the aeroelastic stability of the horizontal axis wind turbine blade is adversely affected by yaw errors: therefore, the yaw error potentially impacts not only the energy yield, but also the wind turbine lifetime. Wind turbines in operation are analyzed in [7], from the point of view that the energy capture characteristics must be investigated under the joint effect of non-vanishing yaw angle and internal wind turbine control: a yaw index in relation to the power curve is formulated and tested on the operation data of a 2 MW wind turbine. The main result is that the yaw effect has an increasing trend up to 8 m/s and a decreasing trend thereafter because, approaching rated power, the pitch control has a more important effect than the yaw control.
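As a worked illustration of the cosine laws quoted above, the following minimal sketch evaluates the idealized power loss for a given static yaw error. The function names are hypothetical and the sketch assumes a constant misalignment with no control or wind shear effects, which, as argued in [4,5], is a simplification.

```python
import math


def yaw_power_ratio(yaw_deg: float, exponent: float = 3.0) -> float:
    """Power at a given yaw error relative to perfect alignment,
    under the idealized cos^X law (X = 3 is the classical cube law)."""
    return math.cos(math.radians(yaw_deg)) ** exponent


def yaw_power_loss_percent(yaw_deg: float, exponent: float = 3.0) -> float:
    """Percentage of power lost with respect to perfect alignment."""
    return 100.0 * (1.0 - yaw_power_ratio(yaw_deg, exponent))


if __name__ == "__main__":
    # Exponent range reported in [5], plus the classical cube law.
    for exponent in (1.88, 3.0, 5.14):
        loss = yaw_power_loss_percent(4.0, exponent)
        print(f"X = {exponent}: loss at 4 deg = {loss:.2f}%")
```

Under the cube law, a 4° error corresponds to a loss of roughly 0.7% of the instantaneous power, and the value grows with the exponent X.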
The results in [7] provide a clear explanation of the fact that wind turbine yaw control optimization has remarkable potentiality for wind energy technology development. It should be noticed that this issue is at least twofold: on one hand, the point can be improving the control algorithm; on the other hand, the point can be improving the measurement chain basing on which the control algorithm operates. Actually, the use of cup anemometers placed behind the rotor limits the quality of wind direction measurements and in [8] it is estimated that this can negatively affect the power production up to 5%; the necessity of upwind (instead of downwind) wind flow measurements for improving yaw alignment is discussed also in [9]. The design of optimized yaw controls has been addressed in several studies, like [10][11][12][13][14][15]. An interesting aspect is that, in recent years, it has become a common practice to update the technology of operating wind turbines and this also involves the yaw control: the study in [16] is devoted to the estimate of energy yield improvement in an industrial wind farm, where a pilot wind turbine has installed a yaw control optimization. The control upgrade considered in [16] acts by diminishing the operation time with high yaw misalignment and the estimate provided in that study is that the AEP improves up to the order of 1%. For completeness, it should be mentioned that yaw control is considered the keystone for wind farm control and wake redirection [17][18][19][20].
The above cited studies deal with the dynamic analysis of the yaw error, but in wind turbine practice it can happen that, for reasons of installation or operation in a complex environment, there is a systematic zero-point shift in the yaw angle measurements. If such a systematic error occurs, the control system will regulate the wind turbine to achieve a set point that is believed to be 0°, but is not. Therefore, if the zero-point shift of the yaw angle reaches the order of several degrees, the energy yield of the wind turbine will be systematically affected up to the order of some percent of the AEP.
The systematic yaw error can most reliably be diagnosed if additional sensory systems for measuring upwind wind flow are employed: for example, the studies in [21][22][23][24] deal with this objective through the use of Lidar anemometers. The point with this approach is its application in wind energy practice: employing a Lidar is costly and the outcome about the presence of a relevant systematic yaw error is uncertain. For this reason, a recommended practice is employing in the most intelligent way the data that are available at zero cost, i.e., the Supervisory Control And Data Acquisition (SCADA) data [25,26]. There are some studies dealing with the use of SCADA data for systematic yaw error diagnosis: in [27], the approach is analyzing the power output of the target wind turbine as a function of the yaw error; in [28], the power factor C p (rather than the power) is selected as indicator. In [29], a wind farm approach has been developed: it is based on comparing the distribution of yaw errors for each wind turbine against the average of the wind farm.
On these grounds, the objectives of the present study are mainly two:
• contributing to the SCADA-based methodologies for diagnosing systematic yaw error;
• assessing the impact of systematic yaw error through innovative techniques based on SCADA data analysis.
The above objectives can be achieved in the present study because a very appropriate test case has been selected. This work has been organized as a cooperation between the University of Perugia and the Renvico company. A multi-MW wind turbine sited in southern Italy, owned by the Renvico company, has been diagnosed with a systematic yaw error through the SCADA-based method proposed in this work.
The innovation proposed in this work, as regards the diagnosis of systematic yaw error through operation data, is that the rotor speed has been selected as a target because it does not depend on the nacelle anemometer measurements, it only depends on the torque exerted on the rotor and it is the fundamental quantity at the basis of wind turbine control [30]. The rationale for this choice is therefore that torque is lost if the yaw angle is systematically non-zero when it should be zero and, by analyzing the symmetry of the yaw error-rotor speed curve at the target wind turbine and by comparing the curve for the target and for reference wind turbines, it is expected that the yaw error diagnosis should be effective and clear.
Subsequently, for corroborating the SCADA-based analysis, upwind wind flow measurements have been conducted using a spinner anemometer [31] and a 4° systematic yaw error has been corrected by the wind turbine manufacturer. Therefore, for the present study, operation data before and after the correction of the 4° yaw error have been at disposal and this has allowed studying how the performance of the target wind turbine has changed.
A further innovation of this study deals with the estimation of energy yield improvement: it has been pursued through the approach employed, for example, in [32][33][34]. The rationale is that in wind farm operation there are situations when the performance of a target wind turbine is expected to change, while there is no reason why the performance of the remainder wind turbines should change: this happens for example when a technology upgrade is installed on a pilot wind turbine [32][33][34], or when a systematic error at a wind turbine is corrected [35]. Therefore, the change in the relative performance between the target wind turbine and the reference wind turbines can be codified by employing the operation variables of the reference wind turbines as inputs for a data-driven model whose output is the power of the target wind turbine. The correction of the yaw error at one wind turbine is expected to produce a performance improvement and therefore, if the model is trained with pre-correction data, validating the model on pre and post correction data sets, different behaviors in the residuals between measurements and model estimates should be observed.
In summary, this study supports the conclusion that the use of operation data for diagnosing systematic wind turbine yaw errors is fruitful and that this effort is strongly encouraged in wind energy practice, because the correction of a 4° yaw error provided an energy yield improvement of the order of 1.5% of the AEP on a multi-MW wind turbine.
The structure of the manuscript is the following: in Section 2, the test case and the data sets at disposal are described; Section 3 is devoted to the methods for yaw error detection and for performance assessment; the results are collected and discussed in Section 4; conclusions and further directions are indicated in Section 5.

The Test Case and the Data Set
The wind farm features six multi-MW wind turbines; it is sited in southern Italy and the owner company is Renvico. The lowest inter-turbine distance is of the order of more than six rotor diameters.
The target wind turbine for the diagnosis of systematic yaw error is WTG05 and it is indicated in red in the layout reported in Figure 1. A possible systematic yaw error on WTG05 has been at first diagnosed using the methods proposed in this study; subsequently, the results based on operation data analysis have been corroborated by upwind wind flow measurements from a spinner anemometer [31], and in the last days of 2019 the wind turbine manufacturer has corrected a 4° systematic yaw error.
The data sets at disposal are the following:
• $D_{bef}$ goes from 1st January 2017 to 1st January 2018: it is a data set prior to the yaw error correction on WTG05;
• $D_{aft}$ goes from 1st January 2020 to 1st April 2020: it is a data set posterior to the yaw error correction on WTG05.
The validated data at disposal, for each wind turbine in the wind farm, are: The data have a sampling time of 10 min. The wind turbine grid operation time counters were available too and have been used to filter the data on the request that the wind turbines of interest were operating properly and producing output. Upon data filtering, $D_{bef}$ consists of 13,569 records and $D_{aft}$ of 4915 records.

Yaw Error Detection
The general idea proposed in this work for the diagnosis of systematic yaw error is analyzing the yaw error-rotor speed curve. As hinted in Section 1, the rationale for the selection of this target is that the rotor speed of a wind turbine does not depend on the wind flow measurements from the nacelle anemometer placed behind the rotor span. The wind turbine regulates the rotor speed basing on the torque exerted on the rotor and the wind turbine control follows consequently. Therefore, if a wind turbine is losing torque because there is a systematic yaw error, it is losing rotor speed.
The yaw error $\gamma$ is defined in Equation (1) and represented in Figure 2. The proposed procedure is the following:
• filter the data on wind turbine operation time, using the appropriate time counter;
• filter the data on a narrow wind speed interval in Region II;
• compute the average yaw error-rotor speed curve. Data can be averaged on yaw error intervals of, for example, 1° or 2°.
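The three steps of the procedure can be sketched as follows. This is a minimal illustration and not the implementation used in this work: the record layout (dictionaries with the keys `operating`, `wind_speed`, `yaw_error` and `rotor_speed`) and the function name are assumptions made for the example.

```python
import math
from collections import defaultdict
from statistics import mean


def yaw_rotor_speed_curve(records, ws_range=(7.0, 8.0), bin_width=2.0):
    """Average rotor speed per yaw error bin.

    records: iterable of dicts with hypothetical keys
        'operating' (bool), 'wind_speed' (m/s),
        'yaw_error' (deg), 'rotor_speed' (rpm).
    Returns {bin centre in deg: mean rotor speed in rpm}.
    """
    ws_min, ws_max = ws_range
    bins = defaultdict(list)
    for rec in records:
        # Step 1: keep only records where the turbine was operating.
        if not rec["operating"]:
            continue
        # Step 2: keep only a narrow wind speed interval in Region II.
        if not (ws_min <= rec["wind_speed"] < ws_max):
            continue
        # Step 3: assign the record to a yaw error bin centred on a
        # multiple of bin_width.
        k = math.floor(rec["yaw_error"] / bin_width + 0.5)
        bins[k * bin_width].append(rec["rotor_speed"])
    return {centre: mean(values) for centre, values in sorted(bins.items())}
```

A curve computed this way for the target and for a reference wind turbine can then be inspected for its maximum position and symmetry.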
According to the reasoning in [7], it is appropriate to filter data on a wind speed interval that is not too close to rated power because, approaching rated, the pitch control is more important than the yaw control and because the rotor speed saturates. For a multi-MW wind turbine, an appropriate choice can be, for example, the interval [7, 8] m/s. Two aspects of the yaw error-rotor speed curve provide meaningful indications about the presence of a systematic yaw error: the symmetry of the curve and the comparison between the curves of the wind turbines in the farm:
• the curve should have its maximum for $\gamma \simeq 0$ and diminish for increasing absolute value of $\gamma$. Conversely, if there is a systematic yaw error, the curve is clearly asymmetric and has its maximum at the value of $\gamma$ corresponding to the systematic error;
• if all the wind turbines in the farm are well aligned, the observed average $\omega_r$ should have the same order of magnitude for all the wind turbines. If a wind turbine is losing rotor speed, it is under-performing.
The above two aspects can be summarized with two indexes: an asymmetry index $\eta_{asym}$ and a discrepancy index $\eta_{disc}$. The former is defined in Equation (2), given the yaw error-rotor speed curve for the target wind turbine, as the difference between the average rotor speed for $\gamma < 0$ ($\omega_-$) and for $\gamma > 0$ ($\omega_+$). The discrepancy index $\eta_{disc}$ is defined in Equation (3), where $\bar{\omega}_{ref}$ is the average rotor speed of one (or more) reference wind turbines and $\bar{\omega}_{tar}$ is the average rotor speed of the target wind turbine. Notice that the above indexes can be computed on the data sets $D_{bef}$ and $D_{aft}$: if the line of reasoning is correct, after the yaw error correction (therefore in $D_{aft}$), the indexes should highlight a remarkable difference.
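The two indexes can be sketched as follows. Since the exact form of Equations (2) and (3) is not reproduced here, the sign convention and the absence of any normalization are assumptions of this example, as are the function names.

```python
from statistics import mean


def asymmetry_index(samples):
    """Difference between the average rotor speed for negative and for
    positive yaw error (assumed form: omega_minus - omega_plus).

    samples: iterable of (yaw_error_deg, rotor_speed_rpm) pairs,
    already filtered on operation time and wind speed.
    """
    samples = list(samples)
    omega_minus = mean(w for g, w in samples if g < 0)
    omega_plus = mean(w for g, w in samples if g > 0)
    return omega_minus - omega_plus


def discrepancy_index(omega_ref, omega_tar):
    """Difference between the average rotor speed of the reference
    wind turbine(s) and of the target one (assumed form, not normalized)."""
    return omega_ref - omega_tar
```

With these conventions, a well-aligned wind turbine is expected to give both indexes close to zero, while a systematic yaw error should push them away from zero.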

Support Vector Regression and Data Set Analysis
The objective of this part of the work is formulating a reliable method for quantifying the performance improvement of wind turbine WTG05 after the 4° yaw error correction. This translates into the need for precise wind turbine performance monitoring through operation data analysis: this is in general a complex task because the power of a wind turbine has a multivariate dependence on climate conditions and working parameters.
The power curve in its simplest and widely accepted conception [36] can often provide meaningful indications about wind turbine performance, but in general it can be vexed by season bias [37], wind profile [38] and by data quality issues, due to the fact that the wind speed is measured through cup anemometers placed behind the rotors.
For this reason, there is a vast amount of literature about the modelling of wind turbine power: in [39,40], the drawbacks of the standard binned power curve modeling method [36] are discussed and it is argued that more input variables (as, for example, rotor speed) and a more sophisticated modelling procedure (as, for example, Gaussian process) are needed for precision modelling of wind turbine power.
The point behind the intuition in [32][33][34] is that in wind energy practice there are situations where it is known that one (or more) wind turbines are expected to change their performance, while the rest of the wind farm is not. An example is when a technology upgrade is installed on a pilot wind turbine from a wind farm and only at a later stage, upon assessment of the energy yield improvement, the decision about extending the upgrade to the whole wind farm is taken. Another similar situation is when a systematic error [35] (as in the present test case) is corrected on a wind turbine: after the yaw error correction of 4°, WTG05 is expected to improve its performance, while the other wind turbines are not.
The idea is therefore that in this case it is possible not only to improve the model structure (with respect to binned power curves), but also to enrich the model inputs on a wind farm level because some reference wind turbines can be selected, whose operation parameters can be input variables for a regression for the power of the target wind turbine of interest (WTG05, in this case). This can be conceived as a wind farm generalization of the concept of rotor equivalent wind speed [41,42].
For the present test case, WTG02, WTG04 and WTG06 have been selected as reference wind turbines because, on the grounds of the knowledge of the history of the interventions on the wind farm, it has been evaluated that they are the most appropriate.
The input variables have been selected based on their Pearson correlation coefficient with the target Y (power of WTG05). Given in general a pair of variables x and y, the Pearson correlation coefficient is defined in Equation (4):
$$\rho_{x,y} = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{N} (y_i - \bar{y})^2}} \tag{4}$$
where N is the number of x and y observations and $\bar{x}$, $\bar{y}$ are the sample means. Based on this criterion, the following input variables have been selected for the model whose target Y is the power of WTG05: the powers, the rotor speeds and the generator speeds of the reference wind turbines. These variables constitute the covariates matrix X.
In Table 1, the statistical properties of the covariates matrix and the Pearson coefficient with the output for the data set $D_{bef}$ are reported. The data set $D_{bef}$ is further represented in Figure 3, where the time series of the covariates and of the output are reported: these are grouped in subfigures, each representing homogeneous quantities (i.e., powers, rotor speeds, generator speeds of the reference wind turbines; power of the target wind turbine). Furthermore, in Figure 4, two scatter plots are reported for the data set $D_{bef}$: the power of WTG05 is plotted against the powers and the rotor speeds, respectively, of the reference wind turbines.
In order to appreciate qualitatively the performance improvement of WTG05 after yaw error correction, in Figure 5, the power of WTG05 has been represented as a function of the power of WTG02 for the data sets $D_{bef}$ and $D_{aft}$: on the left, the scatter plot is represented and on the right the average curve, with data averaged in intervals of 10% of the rated power, is reported. From the plot on the right in Figure 5, it can be qualitatively seen that the performance of WTG05 with respect to WTG02 slightly increases during $D_{aft}$, in particular in Region II as expected. Furthermore, the relation between the power of WTG05 and the power of WTG02 is in general non-linear.
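The selection criterion can be illustrated with a short sketch that computes the Pearson correlation coefficient and ranks candidate covariates by its absolute value. The function names and the dictionary layout of the candidates are assumptions made for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


def rank_covariates(candidates, target):
    """Sort candidate covariates by decreasing |correlation| with the target.

    candidates: dict mapping a (hypothetical) variable name to its series.
    Returns a list of (name, coefficient) pairs.
    """
    scores = {name: pearson(series, target) for name, series in candidates.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)
```

For example, `rank_covariates({"P_WTG02": p02, "rpm_WTG02": w02}, p05)` would list first the series most strongly correlated with the power of the target wind turbine.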
A coherent procedure for estimating the performance improvement is needed, taking into account the multivariate non-linear dependence of the power of WTG05 on the model inputs: for this reason, the selected model structure is Support Vector Regression with Gaussian Kernel.
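In order to understand the principles of Support Vector Regression [43], consider at first a linear model (Equation (5)):
$$f(X) = X'\beta + b \tag{5}$$
The objective is finding $f(X)$ with the minimum norm value $\beta'\beta$, subject to the residuals being lower than a threshold $\varepsilon$ for each observation (Equation (6)):
$$\left| Y_n - \left( X_n'\beta + b \right) \right| \le \varepsilon \quad \forall n \tag{6}$$
where $X_n$ and $Y_n$ denote the n-th observation of the covariates and of the target. The optimization of the regression is a compromise between the flatness of $f(X)$ and the tolerance on the occurrence of residuals higher than $\varepsilon$. This problem can be rephrased through the Lagrange dual formulation, in terms of non-negative multipliers $\alpha_n$ and $\alpha_n^*$ for each observation: the function to be minimized is $L(\alpha)$ (Equation (7)):
$$L(\alpha) = \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*) X_i' X_j + \varepsilon \sum_{i=1}^{N} (\alpha_i + \alpha_i^*) - \sum_{i=1}^{N} Y_i (\alpha_i - \alpha_i^*) \tag{7}$$
with the constraints (Equation (8))
$$\sum_{n=1}^{N} (\alpha_n - \alpha_n^*) = 0, \qquad 0 \le \alpha_n \le C, \qquad 0 \le \alpha_n^* \le C \tag{8}$$
where C is the box constraint. The $\beta$ parameters are given in Equation (9):
$$\beta = \sum_{n=1}^{N} (\alpha_n - \alpha_n^*) X_n \tag{9}$$
If $\alpha_n$ or $\alpha_n^*$ are non-vanishing, the corresponding observation is called a support vector. Once the model has been trained with a data set, it can be used for predicting new values, given the inputs, through the function (Equation (10)):
$$f(X) = \sum_{n=1}^{N} (\alpha_n - \alpha_n^*) X_n' X + b \tag{10}$$
The production improvement can be elaborated by studying how the discrepancy between predicted new values and observed new values changes after the yaw error correction.
For this work a non-linear regression has been selected: in general, it is obtained by replacing in the above formulas the dot products between observations with a nonlinear kernel function (Equation (11)):
$$G(X_1, X_2) = \langle \phi(X_1), \phi(X_2) \rangle \tag{11}$$
where $\phi$ is a transformation mapping the X observations into a feature space.
In this work, a Gaussian kernel (Equation (12)) has been selected:
$$G(X_j, X_k) = \exp\left( -\left\| X_j - X_k \right\|^2 \right) \tag{12}$$
Then, for the nonlinear case, Equation (7) rewrites as in Equation (13):
$$L(\alpha) = \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*) G(X_i, X_j) + \varepsilon \sum_{i=1}^{N} (\alpha_i + \alpha_i^*) - \sum_{i=1}^{N} Y_i (\alpha_i - \alpha_i^*) \tag{13}$$
and Equation (10) for predicting rewrites as in Equation (14):
$$f(X) = \sum_{n=1}^{N} (\alpha_n - \alpha_n^*) G(X_n, X) + b \tag{14}$$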
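The prediction step of a trained Gaussian-kernel Support Vector Regression can be sketched as follows. The sketch assumes the standard forms of the Gaussian kernel and of the kernel expansion; the training step (solving the dual problem) is not shown. The `scale` parameter and the function names are assumptions of the example, and `dual_coefs` is assumed to hold the differences $\alpha_n - \alpha_n^*$ of the support vectors.

```python
import math


def gaussian_kernel(x1, x2, scale=1.0):
    """Gaussian kernel exp(-||x1 - x2||^2), with predictors optionally
    divided by a (hypothetical) kernel scale."""
    squared_distance = sum(((a - b) / scale) ** 2 for a, b in zip(x1, x2))
    return math.exp(-squared_distance)


def svr_predict(x, support_vectors, dual_coefs, bias, scale=1.0):
    """Kernel expansion of a trained SVR model.

    support_vectors: observations with non-vanishing multipliers.
    dual_coefs: alpha_n - alpha_n^* for each support vector.
    bias: the intercept b.
    """
    return bias + sum(
        coef * gaussian_kernel(sv, x, scale)
        for sv, coef in zip(support_vectors, dual_coefs)
    )
```

Only the support vectors contribute to the sum, which is why the trained model can be stored compactly even when the training set is large.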

Performance Analysis
The data sets at disposal are employed as follows for estimating the energy yield improvement subsequent to the yaw error correction:
• $D_{bef}$ is randomly divided into two subsets: D0 (a random selection of 2/3 of the data set) and D1 (the remaining 1/3 of the data set). D0 is used for training the regression, D1 is used for testing the regression. The convergence of model training is verified through the MATLAB fitrsvm routine;
• $D_{aft}$ (also named D2 for notation consistency) is used to quantify the performance deviation with respect to D1 (and therefore $D_{bef}$).
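The random division of the pre-correction data set into D0 and D1 can be sketched as follows. The function name and the use of a seed for reproducibility are assumptions of the example; the study itself relies on MATLAB.

```python
import random


def split_train_test(records, train_fraction=2 / 3, seed=None):
    """Randomly split records into a training set (D0) and a test set (D1)."""
    rng = random.Random(seed)
    indices = list(range(len(records)))
    rng.shuffle(indices)
    n_train = round(train_fraction * len(records))
    d0 = [records[i] for i in indices[:n_train]]
    d1 = [records[i] for i in indices[n_train:]]
    return d0, d1
```

Calling the function with different seeds gives different random choices of D0 and D1, which is a simple way to replicate the analysis several times and average the results.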
Once the Support Vector Machine model has been trained with the D0 data set, the output is simulated using Equation (14) (based on the observations of the input variables) for the data sets D1 and D2. If the performance of the wind turbine has improved in D2 with respect to D1, the behavior of the difference between measured and simulated target should have changed. Therefore, consider the residuals in Equation (15), with i = 1, 2:
$$R(X_i) = Y(X_i) - f(X_i) \tag{15}$$
For i = 1, 2, one computes $\Delta_i$ (Equation (16)), and the quantity $\Delta = \Delta_2 - \Delta_1$ provides an estimate of the performance deviation from data set D1 to D2 [44,45]. In other words, if during D1 and D2 the wind turbine performance does not change significantly, the residuals (Equation (15)) have similar statistical properties: therefore, $f(X)$ will on average deviate from $Y(X)$ in a similar way in D1 and D2, so that $\Delta_2$ and $\Delta_1$ will be similar and their difference will approximately be 0. If, instead, the performance improves during D2, the residuals (Equation (15)) are on average higher than in D1, because $f(X)$ is estimated through a model trained on a data set characterized by a lower performance. Therefore, one would reasonably obtain $\Delta$ non-negligibly higher than 0.
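The computation of the performance deviation can be sketched as follows. The residual is taken as measurement minus model estimate, consistently with the discussion above. The exact form of Equation (16) is not reproduced here, so the normalization used in `delta_percent` (sum of the residuals divided by the sum of the measured power, in percent) is an assumption of this example, as are the function names.

```python
def residuals(measured, predicted):
    """Residuals between measurements and model estimates."""
    return [y - f for y, f in zip(measured, predicted)]


def delta_percent(measured, predicted):
    """Assumed form of Delta_i: total residual relative to the total
    measured power, expressed in percent."""
    return 100.0 * sum(residuals(measured, predicted)) / sum(measured)


def performance_deviation(y1, f1, y2, f2):
    """Delta = Delta_2 - Delta_1, from the test set D1 to the
    post-correction set D2."""
    return delta_percent(y2, f2) - delta_percent(y1, f1)
```

If the model reproduces D1 without bias and under-estimates the measurements in D2, `performance_deviation` returns a positive value, which is the signature of a performance improvement.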

Yaw Error Diagnosis
The method described in Section 3.1 has been applied to the data sets $D_{bef}$ and $D_{aft}$. Results are reported for the [7, 8] m/s wind speed interval, but it has been crosschecked that the method does not depend on this particular selection, as long as rated power (and therefore rotor speed saturation) is not reached.
In Figures 6 and 7, WTG05 is compared to a sample reference wind turbine (WTG06) in the data sets $D_{bef}$ and $D_{aft}$, respectively. Comparing Figures 6 and 7, it is possible to observe that Figure 6 describes the wind turbine WTG05 affected by a systematic yaw error. This can be argued because, in Figure 6, the highest rotor speed is achieved when the yaw error is of the order of 5° and the curve is clearly asymmetric. Furthermore, the highest torque should be exerted on the wind turbine when the yaw error is around 0° and it is therefore expected that the curves of WTG05 and WTG06 should be comparable, especially at 0° yaw error, where the data population is the highest: instead, around 0°, the rotor speed deficit of WTG05 with respect to WTG06 is of the order of 10% of the rotor speed working range for that model of wind turbine.
This kind of analysis has inspired further investigation of the present test case and a spinner anemometer has been installed on site for upwind wind flow measurements. The spinner anemometer data analysis has corroborated the present conclusions and a systematic yaw error of 4° has been corrected by the wind turbine manufacturer. $D_{aft}$ therefore describes WTG05 after the yaw error correction and the expectation is that the yaw error-rotor speed curves for WTG05 and the reference wind turbine (WTG06) should be comparable. This is indeed the case, as can be seen in Figure 7: the working rotor speed ranges of WTG05 and WTG06 are very similar and the WTG05 curve is not severely asymmetric as it was in Figure 6. In other words, Figure 7 shows that WTG05 is no longer severely losing rotor speed (and therefore torque) as it was before the yaw error correction.
The above considerations can be summarized in Table 2, displaying how the asymmetry index $\eta_{asym}$ (Equation (2)) and the discrepancy index $\eta_{disc}$ (Equation (3)) change from $D_{bef}$ to $D_{aft}$. Both indexes are of the order of 10 times higher in $D_{bef}$ than in $D_{aft}$.
Table 2. Asymmetry index $\eta_{asym}$ (Equation (2)) and discrepancy index $\eta_{disc}$ (Equation (3)) before and after yaw error correction.
As far as can be argued through operation data analysis, therefore, it can be stated that WTG05 has recovered normal operation after the yaw error correction.

Performance Assessment
As a preliminary analysis, the SVM regression has been subjected to K-fold cross-validation [46]: (K − 1)/K of the data are used for training and the remaining 1/K are used for validation. K has been selected to be 10, as is typical in this kind of application. Each model is therefore trained on nine folds of observations out of 10, leaving one fold out for validation. The generalization error can be estimated as the out-of-sample root mean-squared error and the result is 62 kW. Subsequently, the whole training data set D0 has been used for the training and the convergence of the model has been achieved after 5761 iterations. The resubstitution (in-sample) root mean-squared error is 53 kW.
Following Equation (15), for i = 1, 2, one can write the average residual between measurements and model estimates as
$$\bar{R}_i = \frac{1}{N_i} \sum_{x \in X_i} \left( Y(x) - f(x) \right) \tag{17}$$
where $N_i$ is the sample size in D1 and D2. The WTG05 performance improvement after the yaw error correction should be reflected in different statistical properties of the residuals $R(X_2)$ with respect to $R(X_1)$. If the model $f(X)$ is trained with pre-correction data, when the output is simulated on the post-correction data set D2, the model estimate $f(X)$ should be systematically lower than the measurements, in a manner which should be distinguishable with respect to D1. This is indeed the case, as can be seen in Table 3: after the yaw correction, the amount by which the measured WTG05 power exceeds the model estimate is on average 13.4 kW higher than before the correction. It should be noticed that the selected data-driven approach allows replicating the experiment by running the model with different random choices of D0 (and consequently of D1). The results in Table 3 and all the following results should therefore be intended as the average over 1000 iterations, although a few iterations are sufficient for obtaining substantially stable results.
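The K-fold estimate of the generalization error can be sketched as follows. This is a generic illustration that pools the squared errors of all the held-out folds; the `fit` argument stands for any training routine returning a prediction function, and `fit_mean_predictor` is a hypothetical stand-in used only to keep the example self-contained (it is not the Support Vector Regression of this work).

```python
import math
import random


def k_fold_rmse(X, y, fit, k=10, seed=0):
    """Out-of-sample root mean-squared error from K-fold cross-validation.

    fit(X_train, y_train) must return a callable predict(x).
    """
    indices = list(range(len(y)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    squared_error = 0.0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in indices if i not in held_out]
        # Train on K-1 folds, validate on the remaining fold.
        predict = fit([X[i] for i in train], [y[i] for i in train])
        squared_error += sum((y[i] - predict(X[i])) ** 2 for i in fold)
    return math.sqrt(squared_error / len(y))


def fit_mean_predictor(X_train, y_train):
    """Hypothetical stand-in model: always predicts the training mean."""
    mean_value = sum(y_train) / len(y_train)
    return lambda x: mean_value
```

Replacing `fit_mean_predictor` with a routine that trains the regression of interest gives the cross-validated error of that model.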
From this analysis, the obtained result is $\Delta$ = 1.9%. Recalling that data have been filtered below rated power, because no performance improvement can be seen when the power is already rated, the result for $\Delta$ means that on average WTG05 has improved its performance by 1.9% below rated power after the correction of the yaw error. This result should be converted into an estimate for the AEP improvement; for a detailed analysis of this aspect in relation to the Weibull wind intensity distribution, refer to [2]. For the purposes of the present work, a simple method can be employed: it is based on the analysis of the wind intensity statistics on site, because the resulting improvement in the AEP depends on how much time the wind turbine operates at rated power in that particular site. Therefore, Equation (18) provides a simple way to estimate $\Delta_{AEP}$, where $E_{sub-rated}$ is the yearly sub-rated energy yield and $E_{rated}$ is the yearly rated energy yield. Equation (18) has been employed on several yearly test data sets and the resulting order of magnitude for $\Delta_{AEP}$ is 1.5%.
Table 3. Statistical behavior of the residuals between measurement and estimation, for the D0 and D1 data sets.
Finally, it is interesting to analyze the dependence of the energy yield improvement on the different operation regions of the wind turbines. This can be appreciated in Figure 8: $R(X_1)$ and $R(X_2)$, computed on a sample model run, are displayed. The data are averaged in power production intervals, whose amplitude is 10% of the rated power. From Figure 8 it can be seen that the correction of the yaw error improves the wind turbine performance especially between 35% and 75% of the rated power, with an increasing effect with increasing power. This corroborates the observations in [7] that the yaw behavior of a wind turbine must be analyzed taking into account the control and the operation regime. When rated power is approached, the pitch behavior becomes important and the yaw error becomes less relevant. Based on the results in Figure 8, it is also supported that the yaw error is less relevant near the cut-in, because in this regime the control also involves the pitch. The yaw error is therefore particularly relevant in Region II, when the rotor speed increases and the blade pitch is practically constant. From the estimate of $\Delta_{AEP}$ and the result in Figure 8, it is argued that, although the effects of systematic yaw error are concentrated in Region II, they are definitely non-negligible with respect to the energy yield of the wind turbine. Furthermore, it is a useful crosscheck to observe that the quantitative result of Figure 8 is coherent with the qualitative analysis of Figure 5.
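The conversion from the sub-rated improvement to the AEP improvement can be illustrated with a short sketch. The form used here, which weights the sub-rated improvement by the share of the yearly energy produced below rated power, is an assumption of this example (Equation (18) is not reproduced here), and the energy split in the usage note is hypothetical.

```python
def delta_aep(delta_sub_rated_percent, e_sub_rated, e_rated):
    """Assumed conversion: the improvement applies only to the energy
    produced below rated power, so it is weighted by its yearly share.

    e_sub_rated and e_rated must be expressed in the same energy unit.
    """
    return delta_sub_rated_percent * e_sub_rated / (e_sub_rated + e_rated)
```

For example, with a hypothetical site where 79% of the yearly energy is produced below rated power, `delta_aep(1.9, 79.0, 21.0)` gives about 1.5%.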

Conclusions
The present study has been devoted to innovative operation data analysis methodologies for the diagnosis of wind turbine systematic yaw error and for the assessment of the impact of this kind of error on wind turbine energy yield. It has been organized as a collaboration between academia (University of Perugia) and industry (Renvico company) and it has the structure of a test case discussion.
The selected test case is a multi-MW wind turbine sited in Italy that has been diagnosed with a remarkable systematic yaw error through the operation data analysis of the present work. These results have been corroborated by upwind flow measurements through a spinner anemometer and a systematic yaw error of 4° has been corrected. Therefore, operation data describing the wind turbine in operation before and after the correction of this yaw error have been at disposal for this study and this has allowed estimating how much the wind turbine performance has improved after the yaw error correction or, on the other hand, how much energy yield is lost when there is a yaw error.
The two main results of this study are the following:
• Despite the fact that SCADA data do not provide upwind flow measurements, it is nevertheless possible (and recommended) to employ them for reliably diagnosing systematic yaw errors. The main innovation of this study has been targeting the rotor speed, because in Region II it directly depends on the torque exerted on the rotor and torque is lost if the yaw angle is systematically non-zero when it should be zero. Analyzing the properties of the yaw error-rotor speed curve, for each single wind turbine and comparing wind turbines in a wind farm, it is possible to obtain meaningful indications about the presence of systematic yaw errors.
• As discussed in [7], the effect of yaw error should be read in light of the wind turbine control. The energy yield is affected by the systematic yaw error in particular in Region II, when the torque and the rotor speed increase with the wind and the blade pitch angle is practically set to 0. Near the cut-in and near rated power, the importance of pitch control increases and the yaw error is less important. The net effect of this on wind turbine operation can be estimated from the test case considered in this work: a systematic yaw error of 4° can affect the AEP of the wind turbine by the order of 1.5%.
Furthermore, an interesting aspect of the present work is that it contributes in general to the problem of wind turbine performance monitoring through operation data analysis. It is well known in the literature that the power curve binning method (as recommended by the IEC [36]) is simplistic for precision monitoring and that reliable data-driven models for the power of a wind turbine should have a more complicated structure (as, for example, PCR, GP, LASSO regression and so on) and most of all should be multivariate, in order to account for the complex dependence of wind turbine power on climate conditions and working parameters.
The present study has contributed to the above issue with the innovative intuition that, in the wind farm practice, there are situations in which it is known that the performance of some wind turbines could (or should) have changed and that there is no reason why the performance of other wind turbines should have changed. The idea of this study is that this information should be exploited for the data-driven modelling of the target wind turbine of interest and the input variables for the model can include the operation variables of the reference wind turbines.
An interesting further direction of the present work is applying the present methodologies to a vast number of wind turbines, possibly operating in the field for a considerable number of years; this kind of study could contribute to shedding light on the occurrence of systematic yaw errors in a wind turbine lifetime, possibly depending on installation, human management interventions, wind turbine technology, and operation in a harsh environment.
Writing, review and editing: F.C., A.L. and L.T. All authors have read and agreed to the published version of the manuscript.