Research on Anomaly Detection of Wind Farm SCADA Wind Speed Data

Abstract: Supervisory control and data acquisition (SCADA) systems are critical for wind power grid integration and wind farm operation and maintenance. However, wind turbines are affected by regulation, severe weather, and mechanical failures, resulting in abnormal SCADA data that seriously affect the use of SCADA systems. Thus, strict and effective quality control of SCADA data is crucial. Traditional anomaly detection methods based on either the "power curve" or statistical evaluation cannot comprehensively detect abnormal data. In this study, a multi-approach abnormal data detection method for SCADA wind speed data quality control is developed. It is mainly composed of the EEMD (Ensemble Empirical Mode Decomposition)-BiLSTM network model, the wind speed correlation between adjacent wind turbines, and a deviation detection model based on dynamic power curve fitting. The proposed method is tested on SCADA data from a real wind farm, and statistical analysis of the results verifies that it can effectively detect abnormal SCADA wind data. The proposed method can be readily applied in real-time operation to support the effective use of SCADA data for wind turbine control and wind power prediction.


Introduction
Wind energy has become one of the fastest-growing energy sources. According to the estimate by the World Wind Energy Association, by 2020, approximately 12% of the world's electricity will be generated by wind power (GLOBAL WIND REPORT 2019). Supervisory control and data acquisition (SCADA) systems, as comprehensive monitoring systems that remotely connect each wind turbine with the main control room, have been widely used in wind power grid connection, power prediction, and wind farm operation [1][2][3] and maintenance [4]. During the operation of a wind turbine, a SCADA system typically samples wind turbine data at a high frequency (e.g., every second). Due to the high sampling frequency, SCADA data are not fully understood or utilized [5][6][7][8].
SCADA systems record numerous types of operating data, including historical operating status, and some data can be converted into characteristic curves reflecting the performance of the wind turbine, which has great utilization value [9]. High-quality SCADA data are the basis of data assimilation and post-processing of model forecasts for error correction. However, there are often abnormal data in SCADA data, including abnormal wind turbine status information, abnormal data collection, human intervention, and abnormal weather conditions. These anomalies sometimes destroy the data trends in the normal state of the wind turbine and complicate the use of data, especially for wind power prediction. Therefore, it is very important to detect and analyze anomalies in SCADA data.
At present, the prevailing methods for anomaly detection are based on statistical correlation, distance relationships, deviation from physical relationships, and deviation from prediction. Anomaly detection based on statistical relationships mainly tests the inconsistency of each point in the sample set [10,11], finding abnormal behavior of an individual sample relative to the dataset. The distance-based method detects anomalies through the distance between a single data sample and the center of the dataset [12]. These two methods are not effective for more complex abnormal conditions. Anomaly detection based on deviation establishes a group of subsets of a dataset and determines the outliers by calculating the dissimilarity between subsets. In actual data processing, this is complex and difficult to deploy [13,14]. The prediction-based method learns from a large amount of historical data, feeds the data into a prediction model, and compares the test data with the predicted data to identify abnormal characteristics [15,16]. This method sometimes labels normal abrupt changes as abnormal. For example, the data contain both abnormal information caused by changes in turbine blade performance and sudden changes caused by severe weather. It is difficult to detect anomalies comprehensively by relying on any one of these methods alone. Therefore, this study refines and integrates three anomaly detection methods into a comprehensive detection method for filtering abnormal SCADA data.
The rest of this article is organized as follows. In Section 2, three detection methods are introduced: the EEMD-BiLSTM network, wind speed correlation detection between adjacent wind turbines, and dynamic power curve fitting deviation detection. In Section 3, the design of a comprehensive detection method combining the three detection methods is presented, and the feasibility of the method is verified using historical data from several wind turbines. The results of running the method in real time on a medium-sized wind farm are then analyzed to determine its effectiveness for real-time wind speed anomaly detection in SCADA data. Finally, in Section 4, the experimental results are summarized and conclusions are drawn. The major finding of this study is that the proposed detection method is capable of effectively filtering abnormal SCADA data. The method can be used for cleaning historical data records and for real-time SCADA data quality control, supporting the reliable use of SCADA data.

EEMD-BiLSTM Network
In recent years, long short-term memory (LSTM), a deep learning technology, has been widely used for wind power prediction in the field of wind energy [17][18][19]. LSTM is a recurrent neural network specially designed to solve the long-term dependence problem existing in general RNNs (recurrent neural networks) and CNNs (convolutional neural networks).
However, a time series model alone cannot effectively expose the detailed information in the data, so the ensemble empirical mode decomposition (EEMD) method is adopted. When the original signal is split, the decomposed components automatically match their own scale [20]. If a decomposed component can still be split, decomposition continues until no further decomposition is possible. At that point, all components of the original signal have been obtained [21]. This decomposition method can extract more detailed information from the signal and is well suited to non-stationary data. EEMD has two steps (Figure 1):
Step 1: Add normally distributed noise to the wind speed time series to enhance EMD performance (Huang et al. [21]).
Step 2: Apply the EMD (Empirical Mode Decomposition) method to obtain N IMF (intrinsic mode function) components and a residual. This process includes obtaining the local extrema of the data, finding the upper and lower envelopes of the waveform, and obtaining each IMF by subtracting the mean of the upper and lower envelopes from the original time series.
The decomposition steps of EEMD are shown in Figure 1.
The prediction process is shown in Figure 2. First, EEMD is used to split the wind speed signal into signals of different scales, which greatly reduces the fluctuation and volatility of the wind speed signal. The decomposed component signals are used to optimize the batch_size parameter and the number of neurons of the Bi-LSTM model. After the optimal parameters are found, the Bi-LSTM model is initialized, and each component signal is sent to the Bi-LSTM model for training to obtain its own prediction result. The predicted wind speed is finally obtained by summing the results of all components.
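The ensemble part of EEMD (Step 1 repeated over many noise realizations, followed by averaging) can be sketched as below. This is a minimal illustration under stated assumptions, not the implementation used in this study: `ensemble_decompose`, `moving_average_split`, `n_trials`, and `noise_std` are hypothetical names, and the decomposer is a simple moving-average split standing in for the EMD sifting of Step 2. A real EMD routine would replace it.

```python
import random


def moving_average_split(signal, window=3):
    """Stand-in decomposer (NOT real EMD): returns [detail, trend]."""
    n = len(signal)
    half = window // 2
    trend = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        trend.append(sum(signal[lo:hi]) / (hi - lo))
    detail = [s - t for s, t in zip(signal, trend)]
    return [detail, trend]


def ensemble_decompose(signal, decompose, n_trials=50, noise_std=0.1, seed=0):
    """Average the components obtained from noise-perturbed copies of the signal.

    Assumes `decompose` always returns the same number of components;
    a real EMD may not, and would need extra handling.
    """
    rng = random.Random(seed)
    sums = None
    for _ in range(n_trials):
        # Step 1: add normally distributed noise to the series.
        noisy = [s + rng.gauss(0.0, noise_std) for s in signal]
        # Step 2 (stand-in): decompose the noisy series.
        components = decompose(noisy)
        if sums is None:
            sums = [[0.0] * len(signal) for _ in components]
        for acc, comp in zip(sums, components):
            for i, value in enumerate(comp):
                acc[i] += value
    # Ensemble mean of each component over all trials.
    return [[value / n_trials for value in acc] for acc in sums]


wind = [5.1, 5.4, 6.0, 5.8, 6.3, 7.1, 6.9, 6.5]
components = ensemble_decompose(wind, moving_average_split)
# Summing the components gives an approximation of the original series,
# mirroring how the per-component predictions are summed in the text.
reconstructed = [sum(c[i] for c in components) for i in range(len(wind))]
```

With `noise_std=0.0` the components add back to the input up to rounding error, which is a convenient sanity check for any decomposer plugged in here.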
The EEMD-BiLSTM model is trained with SCADA data for each wind turbine independently. Figure 3 shows test results for a sample wind turbine from a mid-size wind farm located in northern Inner Mongolia. The model was trained with 10 months of data. With this short period of training data, the EEMD-BiLSTM network model can infer the true value of the wind speed with good accuracy. It can be expected that the model could be further improved with a longer accumulation of data samples.
The time series anomaly detection model assumes that the observation to be checked is a missing value and uses the EEMD-BiLSTM model to estimate the wind speed at that time point from the rest of the time series. The estimate is then used as a reference to judge the reliability of the wind speed observed by the SCADA system. To evaluate this method, we must first evaluate the accuracy of the deep learning scheme in estimating wind speed. We used the historical observations collected from 1 January to 31 October 2020 for a wind turbine in northern Inner Mongolia. Values were withheld from the test sample and treated as missing, the model filled them in, and the filled estimates were compared with the observations to calculate the estimation accuracy. The RMSE is used as the evaluation metric.
The RMSE is the square root of the mean of the squared deviations between the estimated and observed values, where m is the number of observations. It measures the deviation between the estimates and the observations. If $\hat{y}^{(test)}$ represents the predicted value of the model in the testing set and $y^{(test)}$ the corresponding observed value, then the RMSE can be expressed as:
$$\mathrm{RMSE} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(\hat{y}_i^{(test)} - y_i^{(test)}\right)^2}$$
Energies 2022, 15, 5869
Table 1 shows the statistical results for different missing rates of the data. When the missing data are within 10%, the root mean square error between the filled data and the original data is less than 0.3 m/s, indicating high accuracy. Since continuous data abnormalities often occur in the production and operation of wind turbines, we conducted time series estimation tests with EEMD-BiLSTM for different numbers of consecutive abnormal points. Table 2 shows the experimental comparison results. It can be seen from Table 2 that when one to three consecutive points are missing, the data filled by the time series model accurately reproduce the original observations (root mean square error < 0.4 m/s), so they can be used as a reference to identify and monitor abnormal measurement data. Figure 4 shows how the time series model can be used to fill in data to monitor abnormal data points. It can be seen from the figure that this method can effectively detect abnormal time series data.
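As a small illustration of the evaluation and of the reference-based check described above, the sketch below computes the RMSE between filled and observed values and flags observations that deviate strongly from the model estimate. The function names and the `threshold` value are assumptions made for this example. The text does not state the decision threshold used in the study.

```python
import math


def rmse(predicted, observed):
    """Root mean square error over m paired values."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    m = len(predicted)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / m)


def flag_anomalies(observed, predicted, threshold):
    """Indices where the observation deviates from the model estimate
    by more than `threshold` (a hypothetical tolerance, in m/s)."""
    return [
        i
        for i, (o, p) in enumerate(zip(observed, predicted))
        if abs(o - p) > threshold
    ]


observed = [5.0, 5.2, 12.0, 5.4]   # SCADA wind speeds (m/s), made-up values
estimated = [5.1, 5.0, 5.3, 5.5]   # model estimates (m/s), made-up values
error = rmse(estimated, observed)
suspect = flag_anomalies(observed, estimated, threshold=1.0)
```

Here only the third sample is flagged, because its observed value is far from the estimate while the others differ by at most 0.2 m/s.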

Detection of Wind Speed Correlation between Adjacent Wind Turbines
Under the conditions of global atmospheric circulation and weather system circulation, the near-surface airflow of wind farms is determined by the local topography and other underlying surfaces, and it is highly correlated over distances of several hundred meters to several kilometers. Thus, the correlation of wind speed between two adjacent wind turbines contains crucial information on anomalies in one or both wind turbines. Correlation analysis of wind speed between adjacent wind turbines measures how closely the two wind speed series vary together. The correlation coefficient reflects the direction and degree of the change trend between two variables. Its value ranges from −1 to +1, where 0 means that the two variables are not correlated. A positive value means a positive correlation, and a negative value means a negative correlation. The larger the absolute value, the stronger the correlation.
The correlation coefficient, first designed by the statistician Karl Pearson, is a quantity measuring the degree of linear correlation between variables and is usually denoted by the letter r. Depending on the subject of study, the correlation coefficient can be defined in several ways. Among them, the Pearson correlation coefficient is the most commonly used:
$$r = \frac{\sum_{i=1}^{n}\left(x_i - \bar{x}\right)\left(y_i - \bar{y}\right)}{\sqrt{\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}\,\sqrt{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}}$$
where $x_i$ and $y_i$ are the paired samples of the two variables, $\bar{x}$ and $\bar{y}$ are their means, and n is the number of samples.
Pearson correlation analysis is widely used in the field of wind power. Selwyn [22] applied the correlation method to analyze the reliability of wind turbine components. Mostafa [23] applied wind load correlation analysis to wind farm reliability assessment. Shin [24] applied a structural correlation evaluation method for wind farms to estimate the reliability of wind farms.
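For concreteness, the Pearson coefficient between two wind speed series can be computed as in the following minimal sketch. The function name `pearson_r` and the sample values are illustrative only.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Made-up 15-min mean wind speeds (m/s) for two neighbouring turbines.
turbine_a = [4.8, 5.6, 6.1, 7.0, 6.4, 5.9]
turbine_b = [5.0, 5.5, 6.3, 7.2, 6.6, 5.7]
r_ab = pearson_r(turbine_a, turbine_b)
```

Because the coefficient is normalized by both standard deviations, a constant offset or scale difference between two anemometers does not lower it. Only a change in the shape of the series does.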
The wind speed correlation of two wind turbines is mainly affected by the distance between the wind turbines and the micro-scale topography of the area. First, we used the wind speed data of the wind turbines from January to December 2019 to verify the correlation. Here, we selected a medium-sized wind farm in central Inner Mongolia, China, and computed the wind speed correlations between its turbines. The verification results are shown in Table 3.
Table 3. Relationship between wind turbine correlation coefficients and turbine (short distance). A1-A10 are the IDs of the turbine sample.
Table 4. Relationship between wind turbine correlation coefficients and turbine (short distance). B1-B10 are the IDs of the turbine sample.
Note: The lower left part of the table is the distance between the wind turbines, expressed in kilometers, and the upper right part of the table is the corresponding correlation coefficient.
Tables 3 and 4 show that wind turbines situated close to each other are highly correlated, whereas those situated farther apart are weakly correlated. Therefore, the correlation between nearby wind turbines, namely those less than 3-5 km apart, can be used to infer anomalies in the data of the two turbines. For a turbine of concern, applying the algorithm to two or three nearby wind turbines generally makes it possible to determine whether the target turbine is anomalous. In other words, if the correlation between a wind turbine and several surrounding wind turbines is very poor, the wind turbine may be abnormal.
To demonstrate the correlation-based anomaly detection method, we selected four wind turbines (turbine A and its three adjacent turbines A1, A2, and A3) from a wind farm in northern Inner Mongolia to test the ability of the method to detect abnormal data. The data period was from 1 November to 31 December 2020, and the data are the 15-min average SCADA wind speeds of the four wind turbines. The calculated correlation coefficients of the four turbines are shown in Table 5.
Table 5. SCADA wind speed correlation coefficients of four wind turbines.

ID Correlation Coefficient
The statistical results based on long-term (two months) samples, presented in Table 5, show that if the wind turbine is normal most of the time, the correlation between adjacent wind turbines is very good. There are certain differences in the correlation between different wind turbines (for example, the correlation coefficient between the A1 and A2 wind turbines and that between the A1 and A3 wind turbines are relatively large). Nevertheless, degrading data quality can be detected by focusing on the short-term correlation, for example over one to three days. Short-term correlation detection thus supports the quality control of wind speed observations. Table 6 shows the correlation calculation results for a three-day period.
It can be seen from Table 6 that turbine A has a poor correlation with its adjacent wind turbines A1, A2, and A3 during this period. The wind speed time series of these four wind turbines during this period (Figure 5) shows that the wind speed of wind turbine A began to be abnormal 3600 min after the start of the detection period, which is consistent with the correlation analysis result. In real-time operational applications, a rolling correlation evaluation of the wind turbine data is performed over the most recent three days.
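The rolling short-term check could look roughly like the sketch below. It is an illustration under assumptions rather than the operational code: `min_r` is a hypothetical cut-off (the text does not give one), a constant series is treated as having zero correlation, and the window is expressed in samples. With 15-min averages, three days correspond to 3 × 96 = 288 samples.

```python
import math


def _pearson(x, y):
    """Pearson r; returns 0.0 if either series is constant (assumption)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def window_correlations(target, neighbors, window):
    """Correlation of the last `window` samples of the target turbine
    with each neighbouring turbine, keyed by neighbour ID."""
    recent = target[-window:]
    return {
        name: _pearson(recent, series[-window:])
        for name, series in neighbors.items()
    }


def is_suspect(target, neighbors, window, min_r=0.5):
    """Flag the target when it correlates poorly with every neighbour."""
    scores = window_correlations(target, neighbors, window)
    return all(r < min_r for r in scores.values())


# Tiny made-up example with a 4-sample window.
neighbors = {
    "A1": [3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "A2": [3.2, 4.1, 5.3, 6.0, 7.2, 8.1],
}
healthy = [2.9, 4.2, 5.1, 6.2, 6.9, 8.3]
faulty = [2.9, 4.2, 9.0, 1.0, 9.0, 1.0]
flags = (is_suspect(healthy, neighbors, 4), is_suspect(faulty, neighbors, 4))
```

Requiring poor correlation with all neighbours, rather than any one of them, reduces the chance of blaming the target turbine for a fault that actually sits in a single neighbour.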

Dynamic Power Curve Fitting
The power generated by a wind turbine is proportional to the third power of the wind speed, which satisfies a certain functional relationship, that is, a power curve. A wind turbine power curve will be provided by the wind turbine manufacturer, but due to many external factors, the actual power curve of the wind turbine installed in the wind farm will be different from the manufacturer's calibrated power curve, and the actual power curves of identical wind turbines will not be completely consistent at different sites. SCADA wind speed and wind power monitoring data can be used to establish a dynamic power curve, which can then be used to detect abnormal wind speeds of wind turbines.
The actual power curve can be established from the measured SCADA data by statistical fitting. The actual power curve and the wind speed-power scatter plot are general measures of wind turbine performance and contain important information about the overall health of the wind turbine. Many failures and performance degradation processes are manifested in the measured power curve. The power curve generated from SCADA data can be used to detect wind turbine failures or give early indications of severe performance degradation. It also reveals abnormal SCADA data due to human interference, operation of the wind turbine, and other factors such as icing and strong turbulence.
The power curve modeling methods for wind turbines are divided into three categories, namely discrete methods, parametric methods, and non-parametric methods. Discrete methods mainly adopt a standardization algorithm based on Taylor series expansion and turbulence intensity [25]. Parametric methods mainly include the piecewise average method (IEC) [26], the piecewise linear model method [27], polynomial fitting [28], exponential fitting [29], and the four-parameter logistic function [30]. Non-parametric methods mainly include support vector machines, k-nearest neighbors, decision trees, and extremely randomized trees [31][32][33]. The accuracy of the parametric methods is generally worse than that of the non-parametric methods, but the parametric methods are easier to deploy. Therefore, parametric methods are often used to model the wind speed-power characteristic curve in practical applications. The three methods used in this study are described below.
(1) Polynomial fitting method
Polynomial fitting uses a polynomial expansion to fit all the observation points in the analysis area to obtain the objective analysis field of the observation data. The expansion coefficients (a) are determined by least squares fitting. For the wind speed and power data points $(x_i, y_i)$, $1 \le i \le N$, of a given wind turbine, the following n-order polynomial can be used for fitting:
$$y = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n = \sum_{k=0}^{n} a_k x^k$$
The polynomial fitting method is simple to deploy and is currently in common use for power curve modeling of wind turbines. However, the regional polynomial fitting of this method is not stable, and missing data will cause severe distortion of the fitted curves.
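To make the least squares step concrete, the sketch below fits the polynomial coefficients by solving the normal equations with Gaussian elimination, using only the standard library. It is illustrative rather than the fitting code used in this study; the function names are hypothetical, and in practice a library routine such as `numpy.polyfit` would normally be preferred for numerical stability.

```python
def polyfit_least_squares(xs, ys, degree):
    """Return [a_0, ..., a_n] minimizing sum((y_i - sum_k a_k x_i^k)^2)."""
    m = degree + 1
    # Normal equations (A^T A) a = A^T y for the Vandermonde matrix A.
    ata = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(ata[r][col]))
        if abs(ata[pivot][col]) < 1e-12:
            raise ValueError("singular system: too few distinct x values")
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for row in range(col + 1, m):
            factor = ata[row][col] / ata[col][col]
            for c in range(col, m):
                ata[row][c] -= factor * ata[col][c]
            aty[row] -= factor * aty[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in range(m - 1, -1, -1):
        tail = sum(ata[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (aty[i] - tail) / ata[i][i]
    return coeffs


def polyval(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(a * x ** k for k, a in enumerate(coeffs))


# Synthetic check: points generated from y = 1 + 2x + 3x^2.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x + 3.0 * x * x for x in xs]
coeffs = polyfit_least_squares(xs, ys, degree=2)
```

The instability mentioned in the text shows up here as an ill-conditioned normal-equation matrix when the degree is high or the wind speed range is sparsely sampled.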
(2) Exponential fitting
The wind power curve is the most intuitive expression of the generating capacity of the unit. The power curve used for wind turbine performance analysis and evaluation is calculated from the measured wind power according to a certain algorithm.
In the exponential model, $p(v)$ is the power value, $p_{\max}$ is the rated maximum power value of the wind turbine, $v$ is the wind speed value, and α, β, and k are the fitting curve parameters.
(3) Four-parameter logistic function
The shape of the curve is determined by the parameter vector v = (h, m, q, τ) of the logistic function. The parameters of the logistic function can be estimated by the least squares method, the maximum likelihood method, or evolutionary programming. Parameters can also be obtained using the genetic algorithm, the particle swarm optimization algorithm, and the differential evolution algorithm. The accuracy of the power curve model based on these methods is much higher than that obtained by the non-parametric method.

(4) Comparison of different power-curve fitting methods
The polynomial fitting curve is not robust and is easily affected by abnormal data. At very high and very low wind speeds, oscillation is likely to occur. Feasible solutions require manual reconstruction of the data, which is more complicated. The four-parameter logistic function is relatively more stable, but the part near the maximum value is prone to deviation. Non-parametric functions can better fit the wind power curve, but the parameters of the fitted curve cannot be obtained directly, which is inconvenient for deployment in actual projects.
The maximum value of the curve of the exponential function method is closer to the actual curve, and the model has a maximum coefficient that can be easily determined from the nominal power of the wind turbine or the maximum power value in operation. However, the maximum value of the actual power may itself be abnormal data, so the maximum value is optimized as
$$p_{\max} = p_{\mathrm{median}}\left(p_{\max-1\%}\right)$$
in which $p_{\max-1\%}$ represents the power data whose values are in the top 1% of the wind speed-power data, and $p_{\mathrm{median}}$ represents the median of these data. The exponential fitting formula is improved accordingly (Equation (7)), in which $p(v)$ is the power value, $v$ is the wind speed value, and α, β, and k are the fitting parameters. The best-fitting parameters α, β, and k are found by fitting Equation (7) to the wind farm data with the Python curve_fit function, providing Equation (7) as the objective function and the historical data as input. The fitting curve parameters obtained from one year of historical data reflect the curve trend well. When performing power curve fitting detection, it is necessary to verify the validity of the wind speed. Therefore, the power-wind speed relationship can be written as Equation (7).
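The optimized maximum described above (the median of the top 1% of power values) might be computed as in this short sketch. The function name `robust_p_max` and the `top_fraction` parameter are illustrative, and the handling of very small samples (always keeping at least one value) is an assumption.

```python
import statistics


def robust_p_max(power, top_fraction=0.01):
    """Median of the largest `top_fraction` of the power values.

    Less sensitive to a single spurious spike than max(power).
    """
    if not power:
        raise ValueError("power series is empty")
    ranked = sorted(power, reverse=True)
    k = max(1, int(len(ranked) * top_fraction))  # keep at least one value
    return statistics.median(ranked[:k])


# Made-up power record (kW): 999 plausible values plus one spurious spike.
power = [float(p) for p in range(1, 1000)] + [99999.0]
naive = max(power)            # dominated by the spike
robust = robust_p_max(power)  # median of the 10 largest values
```

In this example the plain maximum is the spike itself, whereas the median of the ten largest values stays within the plausible range of the record.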
For the purpose of comparison, the above methods are employed to fit the same data of a wind turbine with a rated power of 1500 kW in the selected wind farm in northern Inner Mongolia, and the fitting results are shown in Figure 6. Table 7 gives the trained parameters of the four curve-fitting methods and the corresponding standard deviation. Although the optimized exponential fitting method does not significantly improve the STD compared with the polynomial fitting method, it has fewer control parameters and is convenient for practical deployment.
Method Name | Estimated Power Curve Model | STD
Polynomial fitting curve | | 100.71
Four-parameter logistic function fitting curve | | 92.80

Figure 6 shows that for a well-performing wind turbine, all methods achieve reasonably good fitting curves, although there are evident differences in the wind speed ranges below the turbine cut-in speed and above the nameplate power capacity.

Mahalanobis Distance
Since the unit scales of wind speed and power are inconsistent, the Mahalanobis distance is used to describe the similarity between points. It is an effective method for calculating the similarity of two unknown sample sets. Unlike the Euclidean distance, it accounts for the relationships between the different features.
If two vectors $x_1, x_2 \in \mathbb{R}^n$ are two groups of samples of the dataset, U is the mean of vector $x_1$, V is the mean of vector $x_2$, and Σ is the covariance of $x_1$ and $x_2$, the Mahalanobis distance between $x_1$ and $x_2$ is:
$$D_M(x_1, x_2) = \sqrt{(x_1 - x_2)^{T}\,\Sigma^{-1}\,(x_1 - x_2)}$$
Table 8 shows that with a Mahalanobis distance of 1.5, a better set of retained points can be obtained. When the distance is greater than 2, the number of retained points changes little.
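For the two-feature case used here (wind speed and power), the Mahalanobis distance can be written out with an explicit 2×2 covariance inverse, as in the sketch below. This assumes the standard definition of the distance between two points under a shared covariance matrix estimated from the data. The function names are illustrative.

```python
import math


def covariance_2d(points):
    """Sample covariance (s_xx, s_xy, s_yy) of (wind speed, power) pairs."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return sxx, sxy, syy


def mahalanobis_2d(a, b, cov):
    """sqrt((a - b)^T Sigma^{-1} (a - b)) for a 2x2 covariance matrix."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix is not positive definite")
    dx, dy = a[0] - b[0], a[1] - b[1]
    # Inverse of [[sxx, sxy], [sxy, syy]] is [[syy, -sxy], [-sxy, sxx]] / det.
    quad = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(quad)


# Made-up (wind speed in m/s, power in kW) samples.
samples = [(4.0, 150.0), (6.0, 420.0), (8.0, 900.0), (10.0, 1350.0), (12.0, 1500.0)]
cov = covariance_2d(samples)
d = mahalanobis_2d((8.0, 900.0), (8.5, 300.0), cov)
```

With an identity covariance the result reduces to the Euclidean distance, which is a quick way to check an implementation. With real data, the covariance rescales the kW and m/s axes so that neither dominates.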

To solve this problem, a multi-step dynamic power curve fitting method was implemented. First, based on the wind turbine operating mechanism and operating strategy, a coarse-grained confidence equivalent wind speed boundary model was established to identify and eliminate obvious abnormalities, that is, "power curve impossible" data. The cleaning process is shown in Figure 8.

Figure 8. Same as Figure 7, but for the first round of data cleaning and power curve fitting. (a) Areas (blue frames) of "power curve impossible" data (black dots) are marked and removed; (b) power curve fitting (red curve) after removing the bad data. The blue lines enclose the "impossible power curve" areas, which are determined empirically.

The left panel of Figure 8 illustrates the first round of data cleaning. The blue boxes indicate the "impossible power curve area", and the data inside them are eliminated directly. The right panel of Figure 8 shows the fit after the first round of cleaning, where the red line is the power curve generated using the cleaned data. This curve reflects the turbine's wind speed-power relationship better than a curve fitted to all data. Based on the first-round fitted power curve, a second step of cleaning is performed to improve the accuracy of the power curve. The SCADA power value is substituted into the first fitted curve to inversely calculate the wind speed. If the wind speed value in the SCADA data is within the range of ±2 of the inverse wind speed value, the data are considered normal. Verification at a wind farm indicates that the restriction conditions used in the second round of cleaning are reasonable. The fitting curve after cleaning is shown in Figure 9.
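The first-round removal of "power curve impossible" data described above can be sketched as a simple region filter. The rectangular regions, their numerical bounds, and the name `first_round_mask` are hypothetical; the study determines its regions empirically and does not list them in this section.

```python
import numpy as np

# Hypothetical "impossible" regions as (v_low, v_high, p_low, p_high) boxes.
# These bounds are illustrative assumptions, not values from the study.
IMPOSSIBLE_BOXES = [
    (0.0, 3.0, 100.0, 1600.0),   # noticeable power at very low wind speed
    (12.0, 25.0, -50.0, 300.0),  # almost no power in strong wind
]

def first_round_mask(wind, power, boxes=IMPOSSIBLE_BOXES):
    """Return True for samples that lie outside every impossible region."""
    wind = np.asarray(wind, dtype=float)
    power = np.asarray(power, dtype=float)
    bad = np.zeros(wind.shape, dtype=bool)
    for v_lo, v_hi, p_lo, p_hi in boxes:
        bad |= (wind >= v_lo) & (wind <= v_hi) & (power >= p_lo) & (power <= p_hi)
    return ~bad

wind = np.array([2.0, 8.0, 15.0, 15.0])       # m/s
power = np.array([500.0, 700.0, 50.0, 1500.0])  # kW
keep = first_round_mask(wind, power)
print(keep)  # samples flagged False would be dropped before the first fit
```

Only the samples retained by such a mask would then be passed to the first power curve fit.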
The left panel of Figure 9 illustrates the second round of data cleaning. The black points are the abnormal data points, which are eliminated directly. The right panel of Figure 9 shows the fit after the second round of cleaning, where the red line is the power curve generated using the cleaned data. After the second step of data cleaning, the fitted power curve is more accurate. The third step repeats the cleaning process of the second step, based on the power curve fitted in the second step: the SCADA power value is substituted into the second fitted curve to inversely calculate the wind speed. If the wind speed value in the SCADA data is within the range of ±1.5 of the inverse wind speed value, the data are considered normal. The process is shown in Figure 10.

The left panel of Figure 10 shows the third round of data cleaning. The black points are designated as the abnormal data points. The right panel of Figure 10 shows the fit after the third round of cleaning, where the red line is the power curve generated using the cleaned data. After the third step of data cleaning, the fitted power curve expresses the wind speed-power relationship of the wind turbine much more accurately.
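The second and third cleaning rounds share one rule: invert the fitted curve at the recorded power and compare the result with the recorded wind speed. The sketch below illustrates that rule under stated assumptions. The logistic stand-in `logistic_curve` and its parameters are hypothetical and are not the fitted curve from the study, the inverse is obtained numerically by tabulating the curve, and the refit between rounds is omitted for brevity.

```python
import numpy as np

def inverse_wind_speed(power, curve, v_min=0.0, v_max=25.0, n_grid=2501):
    """Numerically invert a monotonically increasing power curve.

    `curve` is any callable mapping wind speed to power; it is tabulated on
    a grid and the inverse is read off by linear interpolation.
    """
    v_grid = np.linspace(v_min, v_max, n_grid)
    p_grid = curve(v_grid)
    return np.interp(power, p_grid, v_grid)

def keep_mask(wind, power, curve, tol):
    """True where the recorded wind speed is within +/- tol of the inverse value."""
    return np.abs(wind - inverse_wind_speed(power, curve)) <= tol

def logistic_curve(v, rated=1500.0, v_mid=8.0, scale=1.5):
    # Assumed stand-in for a fitted power curve (illustrative parameters).
    return rated / (1.0 + np.exp(-(v - v_mid) / scale))

wind = np.array([6.0, 8.0, 8.0, 12.0])
# The third sample reports the power expected at 11 m/s while recording 8 m/s.
power = logistic_curve(np.array([6.0, 8.0, 11.0, 12.0]))

round2 = keep_mask(wind, power, logistic_curve, tol=2.0)
# In the procedure described in the text, the curve would be refitted on the
# retained samples before this tighter round; the same curve is reused here.
round3 = keep_mask(wind, power, logistic_curve, tol=1.5)
print(round2, round3)
```

Near rated power the curve is almost flat, so the inverse wind speed becomes poorly determined there; this is consistent with the limitation for strong winds noted in the conclusions.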
The advantage of dynamic power curve fitting is that it can accurately fit the wind speed-power relationship of wind turbines in the normal operating state of wind farms, readily support monitoring of wind turbine performance, and aid in quality control of wind turbine SCADA wind speed data. In practical applications, the data length (time period) and the update frequency for power curve fitting must be determined experimentally. The time period must provide a representative number of data samples for the power curve fitting, and the update frequency is mainly affected by seasonal changes in the performance of the wind turbine mechanical equipment and in meteorological conditions. For the wind farm selected in this study, the SCADA data of the past two months meet the requirements, and the fit can be updated at intervals of up to one month.

Summary of Data
In this study, SCADA data obtained from a real wind farm in northern Inner Mongolia were used to evaluate the proposed detection method. The wind farm is located in a mountainous area with steep terrain, and its 42 wind turbines are distributed along the ridges at heights of around 1770-1960 m. The distribution of the wind turbines is shown in Figure 11. We note that due to proprietary requirements, the geophysical information (e.g., latitude and longitude) of the wind turbines and the domain is not shown. The data include 15-min average wind speed and power output from 1 January to 31 October 2020 for all 42 wind turbines.

Figure 11. Distribution of wind turbines used for real-time study. The color shades show the terrain height (meters above sea level).

Evaluation Method
A two-class confusion matrix (Table 9) is used to quantitatively assess the anomaly detection state. Based on the matrix in Table 9, the following three statistical scores can be calculated, where $TP$ is the number of anomalies correctly detected, $FP$ is the number of normal samples incorrectly flagged as anomalies, and $FN$ is the number of anomalies that are missed:

(1) Precision
Precision is defined as follows:

$$\mathrm{Precision} = \frac{TP}{TP + FP}$$

Precision is the ratio of correctly detected anomaly samples to the total number of detected anomaly samples.

(2) Recall
Recall is defined as follows:

$$\mathrm{Recall} = \frac{TP}{TP + FN}$$

Recall is the proportion of the number of anomalies correctly detected to the actual number of anomalies in the testing dataset.

(3) F1-Score
These quantities are also related to the F1-Score, which is defined as the harmonic mean of precision and recall:

$$\mathrm{F1} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$

A system can obtain a high precision score while a number of anomalies are being missed. Similarly, it can obtain a high recall score while the false positive rate is high. Therefore, the F1-Score provides more general information on the accuracy of the proposed approach because it balances the precision and recall rates.
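The three scores can be computed directly from the confusion-matrix counts. The following is a minimal sketch; the function name `detection_scores` and the example counts are illustrative assumptions and are not results from the study.

```python
def detection_scores(tp, fp, fn):
    """Precision, recall and F1-Score from confusion-matrix counts.

    tp: anomalies correctly detected
    fp: normal samples incorrectly flagged as anomalies
    fn: anomalies that were missed
    A score is reported as 0.0 when its denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2.0 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1

# Hypothetical counts: 8 true detections, 2 false alarms, 8 missed anomalies.
p, r, f = detection_scores(tp=8, fp=2, fn=8)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

In this example the precision is high while half of the anomalies are missed, and the F1-Score falls between the two values, closer to the lower one.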

Evaluation Results
In this section, we present the results of the three anomaly detection schemes and the combined approach. The data period is from 1 November 2020 to 31 December 2020, with a total of 245,952 15-min mean value samples.

Efficiency of the Three Detection Methods
The EEMD-BiLSTM, correlation, and power-curve based abnormal data detection methods attempt to identify anomalous data from very different perspectives of the SCADA data. The EEMD-BiLSTM model can detect sudden abnormal information of the wind turbine in the time series, but it cannot effectively detect abnormal changes in wind speed trends or abnormal data that do not meet the normal state of the wind turbine. The statistics in Table 10 show that correlation detection finds a larger proportion of the abnormal data, whereas the other two methods cannot effectively find abnormal data. Therefore, it is difficult to comprehensively monitor all abnormalities using the three methods individually. A combined application of the EEMD-BiLSTM model, the wind speed correlation detection approach, and the dynamic power curve fitting and deviation detection algorithm would enable an effective integrated anomaly detection model. Figure 12 lays out the dataflow of the combination scheme that allows the three methods to collaboratively discover abnormal data.
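The exact dataflow of Figure 12 is not reproduced in the text. As one simple possibility, the sketch below assumes that a sample is treated as abnormal when any of the three detectors flags it (a logical OR). This combination rule, the function name `combine_flags`, and the example flags are assumptions made for illustration and may differ from the scheme used in the study.

```python
import numpy as np

def combine_flags(*flags):
    """Mark a sample as abnormal if at least one detector flags it (assumed OR rule)."""
    arrays = [np.asarray(f, dtype=bool) for f in flags]
    return np.logical_or.reduce(arrays)

# Hypothetical per-sample anomaly flags from the three detectors.
eemd_bilstm_flags = [True, False, False, False]
correlation_flags = [False, False, True, False]
power_curve_flags = [False, False, True, False]

combined = combine_flags(eemd_bilstm_flags, correlation_flags, power_curve_flags)
print(combined)
```

An OR rule favors recall, since each detector covers a different type of abnormality; a stricter rule, such as requiring agreement between detectors, would favor precision instead.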

Overall Evaluation
For an overall evaluation, all 42 stably running wind turbines of a medium-sized wind farm were selected for evaluating the abnormal wind speed detection methods. The data from 1 January 2019 to 31 October 2020 were used for training the models, and the data from 1 November 2020 to 31 December 2020 were used for the anomaly detection test. We computed the Precision, Recall, and F1-Score of each of the three anomaly detection methods developed in this study and of their combined usage. The results are presented in Table 11. The table shows that the correlation detection method achieves the best precision, while the F1-Scores of the three methods are close. However, the combined approach, which uses the three methods collaboratively, yields significantly superior performance.

Summary and Conclusions
There are many reasons for, and different manifestations of, SCADA data abnormalities, and a single detection method cannot effectively and comprehensively evaluate abnormal conditions. This study introduces a combined approach for detecting the quality and anomalies of SCADA data, based on EEMD-BiLSTM deep learning data-fitting anomaly detection, correlation anomaly detection, and dynamic power-curve fitting anomaly detection. The characteristics and performance of the anomaly detection approaches are studied with wind farm SCADA data. The main conclusions of this study are as follows:
(1) The wind speed correlation detection of nearby wind turbines can be used to effectively determine whether the wind speed data are abnormal for a certain period, but it cannot be used to determine individual data abnormalities.
(2) The wind power curve is effective for determining whether a discrete wind speed value is consistent with the expected power generation of the wind turbine, but it is less useful for weak winds (below the cut-in wind speed) and strong winds (above the wind speed of the rated power). It also requires continuous monitoring of the overall health of the wind turbines to make sure that the automatic power-curve fitting is working properly.
(3) The EEMD-BiLSTM model can be used to determine whether the wind speed value follows the time series pattern, but it is only effective for short-term abnormalities.
(4) Through the coordinated, combined monitoring of the three methods, abnormalities of SCADA data can be effectively detected.
The SCADA anomaly detection method developed in this study can be readily deployed for real-time use. It has been applied by the Inner Mongolia Electric Power Company, China, for real-time operation, and the system plays a critical role in supporting the data assimilation and model output bias correction of a numerical weather prediction model operated by the company for wind forecasting at over 100 wind farms.