Fault Diagnosis and Reconstruction of Wind Turbine Anemometer Based on RWSSA-AANN

Abstract: When the state of a wind turbine sensor, especially the anemometer, becomes abnormal, it causes unnecessary wind energy loss and affects the correctness of other parameters of the whole system. It is therefore very important to build a simple and accurate fault diagnosis model. In this paper, a model is established in which the Random Walk Improved Sparrow Search Algorithm is used to optimize an auto-associative neural network (RWSSA-AANN), and it is applied to the fault diagnosis of the anemometers of a wind turbine group. Using cluster analysis, six wind turbines are selected as the wind turbine group. A total of 20,000 sets of normal historical data are used to train and simulate the model, and single and multiple fault states of the anemometer are simulated. Applied to the wind speed data from the supervisory control and data acquisition (SCADA) system of six wind turbines in a wind farm from 2013 to 2017, the model can effectively diagnose the fault state and reconstruct the fault data. The results are also compared with those of an AANN without optimization and an AANN optimized by a genetic algorithm. The comparison indicates that the model has a higher accuracy and detection rate than AANN, the genetic algorithm auto-associative neural network (GA-AANN), and principal component analysis (PCA).
Author Contributions: Conceptualization, L.Z. and Q.Z.; methodology, L.Z. and Q.Z.; software, L.Z.; validation, L.Z. and Q.Z.; formal analysis, L.Z.; investigation, L.Z.; resources, L.Z.; data curation, L.Z.; writing (original draft preparation), L.Z.; writing (review and editing), Q.Z.; visualization, X.W.; supervision, X.W.; project administration, A.Z.;


Introduction
A wind turbine is a complex piece of electromechanical equipment, usually located in a remote area with a harsh environment, and its nacelle is generally installed at a great height. Abnormalities and failures are frequent in daily operation but difficult to detect, which leads to high maintenance costs. As wind turbine capacity continues to increase, the frequency of component failures is also increasing, which hinders stable operation and can even force the unit to stop. Among all components of the turbine, the electrical system has the highest failure frequency, followed by the sensors [1].
Large-scale wind turbines must be equipped with monitoring systems such as a stand-alone main control system, a stand-alone condition monitoring system (CMS), and SCADA. Various sensor systems obtain the necessary real-time environmental and status parameters and support tasks such as wind turbine start-up, pitch, yaw, over-limit alarm monitoring, and brake protection. Hence, the wind turbine monitoring system contains many sensors, such as anemometers, wind vanes, speed decoders, position encoders, temperature sensors, vibration sensors, and voltage and current sensors. Statistics show that more than 14% of wind turbine failures are caused by sensor failure, and more than 40% are caused by sensor-related system failures [2,3]. Among the faulty components, the failure rate of sensors is relatively high, and a sensor failure directly causes the control system to receive erroneous information and issue wrong control instructions, thereby endangering the safety of equipment as well as personnel [4]. As the most important component of the entire SCADA system, the anemometer affects the function and efficiency of the whole system when it fails or its performance declines. Therefore, it is very important to diagnose the status of the anemometer and deal with it accordingly. However, a wind farm is usually composed of dozens or even hundreds of wind turbines, which differ in location, altitude, and environment, although there is a certain correlation among them. Based on this correlation, when analyzing the state of the anemometer, it is necessary to identify a wind turbine group with high similarity for research.
Energies 2021, 14, 6905
Many scholars have carried out extensive research on fault diagnosis technology for wind turbine sensors. In reference [5], a model-based fault detection and isolation technique is proposed and applied to the sensors of a wind turbine benchmark model. In reference [6], a wavelet-transform-based fault diagnosis method for the stator current sensor of a doubly fed generator is proposed: the nonlinear model of the doubly fed generator is transformed into an equivalent Takagi-Sugeno (T-S) model, and the residual vector generated by a Luenberger observer is used to diagnose current sensor faults. In reference [7], an H/h observer of the wind turbine system is established in a limited frequency range using the dynamic model of the wind power system, and fault detection is realized through the residual generated by the observer. However, because of the structural complexity of wind turbine systems, physical models are impractical in engineering practice. At the same time, with the improvement of SCADA systems, wind turbine fault detection can rely on abundant measurement data, and many SCADA-based fault diagnosis methods exist. For example, in reference [8], a gearbox detection model under the normal state is established from SCADA temperature data and the heat transfer principle for wind turbine gearbox fault detection, and the model is used to analyze one month of data from a faulty wind turbine so as to monitor the operating condition of the gearbox. Corley et al. [9] similarly established a gearbox detection model under the normal state from SCADA temperature data and the heat transfer principle, and used it to analyze one month of data from a faulty wind turbine to detect gearbox faults. Pei et al. [10] proposed a zero-drift fault detection method based on SCADA data, which detects the zero-drift fault by analyzing the power characteristics under different yaw angles, and verified its effectiveness on multiple actual wind farms. In [11], a multi-fault detection and classification strategy based on SCADA data is proposed. The SCADA data are grouped, scaled, and feature-transformed through multi-direction principal component analysis, and a support vector machine is used for fault classification. The results show that the accuracy of fault detection and classification is 98.2%. In reference [12], a fault detection and isolation method based on classifier fusion is proposed, in which the data-driven fusion of multiple classifiers extracts features from the measured signals; this enriches the state information of wind turbines and improves the decision-making ability of fault detection and isolation schemes. However, these methods do not consider the correlation and nonlinear mapping between sensors of the same type on different wind turbines in a wind farm, so their detection performance is limited. Considering the increasing capacity of wind turbines and the continuous improvement of SCADA systems, more attention should be paid to the relationship between similar sensors within a wind turbine group in order to further explore the complex relationship between related sensors.
The application of AANN to fault diagnosis was first proposed by Kramer [13] and has been widely used in other fields ever since. Zhang et al. [14] proposed a bearing residual life prediction model based on an associative neural network. The model randomly selects 4 of 17 bearing data sets as the verification set, and the remaining 13 are used as the training set. A learning rate attenuation mechanism was used. The results show that, compared with LASSO, Random Forest Regression (RFR), Support Vector Regression (SVR), and deep learning, the model achieves significant improvements in both RMSE and MAE. Huang et al. [15] proposed an AANN-based sensor fault diagnosis and reconstruction method for the engine control system, which can diagnose hard and soft sensor faults and reconstruct data. In the study by Vanini et al. [16], AANN was used for fault diagnosis of gas turbine sensors and components, and it could also be used for verification and correction of sensor data. Hamidreza et al. [17] proposed a reconfigurable AANN sensor fault diagnosis method that isolates and reconstructs the faulty sensors. Huang et al. [18] proposed an AANN-based online fault detection method for production processes, which can effectively extract the hidden information contained in multi-dimensional process variables, and applied it to fault detection in the production of Virginiamycin M and S by Streptomyces virginiae. LV et al. [19], aiming at sensor fault diagnosis of a dual-channel aviation turbofan engine, proposed an SDQ algorithm in which a genetic algorithm optimizes the AANN to realize multi-sensor fault diagnosis and noise removal. Elnour et al. [20] proposed AANN-based sensor fault diagnosis for a multi-area HVAC system and realized fault diagnosis of single and multiple sensors. They also studied improvements to the AANN structure.
It can be seen that the AANN model is widely applicable to the fault diagnosis of various sensors and achieves good results, but it is rarely used in the field of wind power. Therefore, using AANN for sensor fault diagnosis of wind turbines has research value.
This study focuses on the fault diagnosis of the wind turbine anemometer based on the RWSSA-AANN model. The remainder of this paper is organized as follows. Section 2 elaborates the principle of cluster analysis and determines the wind turbine group. Section 3 describes the structure of the AANN and establishes the RWSSA-AANN model. In Section 4, the RWSSA-AANN model is used to diagnose several simulated anemometer faults. Section 5 presents the detection results on actual SCADA data. Section 6 compares the results of the AANN, GA-AANN, and RWSSA-AANN models in anemometer fault diagnosis. Section 7 concludes the paper.

Determining the Wind Turbine Groups of a Wind Farm
There are many wind turbines in a wind farm. Because of differences in geographical location, altitude, and other factors, not all wind turbines are in the same state. However, the wind speeds of wind turbines in the same wind farm are correlated. Therefore, it is necessary to identify the wind turbines with high correlation and treat them as a wind turbine group for research. The Euclidean distance used in cluster analysis measures the correlation between the wind speeds of the wind turbines, and the wind turbine group is obtained according to this distance.

Introduction to Cluster Analysis
Let $x_i$ and $x_j$ denote samples $i$ and $j$, let $d_{ij}$ denote the distance between $x_i$ and $x_j$, and let $G_p$ and $G_q$ denote two classes containing $n_p$ and $n_q$ samples, respectively. The clustering procedure is as follows:
1. Assume there are $n$ samples. Start with $n$ classes, each containing one sample, and calculate the distance between each pair of samples to obtain the $n$-order distance matrix $D_0$.
2. Find the minimum off-diagonal element of $D_0$ and denote it $d_{pq}$. Merge $G_p$ and $G_q$ into a new class $G_r = (G_p, G_q)$, remove the two rows and two columns of $D_0$ in which $G_p$ and $G_q$ are located, and add the distances between the new class and the other classes to obtain the $(n-1)$-order matrix $D_1$.
3. Starting from $D_1$, repeat step 2 to obtain $D_2$, and continue in the same way from $D_2$ until all the samples are grouped into one class.
4. During merging, record the numbers of the merged samples and the level at which each pair of classes is merged, and draw a cluster pedigree diagram.
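The merging procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the toy wind-speed series, and the choice of average linkage for the distance between classes are all assumptions, since the text does not state which linkage was used.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equally long series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def agglomerate(samples):
    """Merge the two closest classes repeatedly until one class remains.

    Returns the merge history as (members_p, members_q, distance) tuples,
    which is the information a cluster pedigree diagram is drawn from.
    """
    # Step 1: every sample starts as its own class; d plays the role of D0.
    n = len(samples)
    d = [[euclidean(samples[i], samples[j]) for j in range(n)] for i in range(n)]
    classes = [[i] for i in range(n)]
    history = []

    def class_distance(p, q):
        # Assumption: average linkage (mean of all pairwise sample distances).
        return sum(d[i][j] for i in p for j in q) / (len(p) * len(q))

    while len(classes) > 1:
        # Step 2: find the closest pair of classes.
        best = None
        for a in range(len(classes)):
            for b in range(a + 1, len(classes)):
                dist = class_distance(classes[a], classes[b])
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        dist, a, b = best
        # Step 4: record which classes merged and at what level.
        history.append((tuple(classes[a]), tuple(classes[b]), dist))
        merged = classes[a] + classes[b]
        # Step 3: drop the two old classes, add the new one, and repeat.
        classes = [c for k, c in enumerate(classes) if k not in (a, b)] + [merged]
    return history

# Toy wind-speed series for four hypothetical turbines (not the paper's data).
toy = [[5.0, 6.0, 7.0], [5.1, 6.1, 7.1], [9.0, 9.5, 10.0], [5.0, 6.2, 6.9]]
merges = agglomerate(toy)
```

Cutting the merge history at a chosen distance level, as done with the dividing line in the next subsection, yields the final groups.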

Determination of the Wind Turbine Group
The data source used in this study is the SCADA data of 23 wind turbines provided by a wind power company. The SCADA data were recorded at a wind farm at 10 min intervals in 2017. A total of 10,000 sets of data were selected from 05:05 on 1 April to 10:35 on 9 June 2017. The average distance between the wind speeds of the 23 wind turbines was calculated and the turbines were classified to obtain a clustering pedigree diagram, as shown in Figure 1. With an average distance of 0.035 m/s as the dividing line, the turbines are divided into 7 categories, and wind turbines Nos. 1, 2, 3, 4, 5, and 6 are taken as a group for analysis.
Next, the correctness of the classification of the wind turbine group was verified. For wind turbines in the same group, the geographical location, altitude, and other environmental factors are similar, and the wind speed values do not differ much. Back propagation (BP) simulation models with three hidden layers were established, with the wind speeds of wind turbines Nos. 1-4 as the input and the wind speed of wind turbine No. 5 as the output. The training data consisted of 2000 points. Without any replacement, the mean square error (MSE) between the predicted value and the actual wind speed of wind turbine No. 5 was 0.3645. Following that, wind turbines No. 6 and No. 10 were each substituted for wind turbines Nos. 1-4 in the input, and simulation values were obtained under four different situations. The simulated values were then compared with the actual values of wind turbine No. 5 to estimate the MSE in each situation. These results are given in Tables 1 and 2.
It can be seen from Table 1 that the MSE values obtained by replacing wind turbines Nos. 1-4 with wind turbine No. 6 are close to the MSE under normal conditions without replacement. Therefore, wind turbine No. 6 and wind turbines Nos. 1-5 can be considered one wind turbine group. As can be seen from Table 2, the MSE values obtained by replacing wind turbines Nos. 1-4 with wind turbine No. 10 are quite different from the normal MSE without replacement. If wind turbine No. 10 were also included in the group of wind turbines Nos. 1-5, relatively large errors would be introduced into the study of the wind turbine group. Therefore, the wind turbine group classification is feasible.

Introduction to the RWSSA-AANN
This section introduces the principles of the AANN model and the SSA algorithm. The RWSSA algorithm is then used to optimize the weights and offset parameters of the AANN, which establishes the RWSSA-AANN prediction model.

AANN
The AANN consists of five layers, namely, the input layer, the mapping layer, the bottleneck layer, the de-mapping layer, and the output layer [13]. A schematic of the model is shown in Figure 2. The characteristic of the AANN model is that the output approximates the input and both have the same dimension. The numbers of nodes in the mapping layer, the de-mapping layer, and the bottleneck layer are determined iteratively, and the evaluation index can be the running time, the error value, etc. The working process is as follows: the input data are mapped to a high-dimensional space through the mapping layer and then compressed by the bottleneck layer, whose number of nodes should be as small as possible. The compressed data are then de-mapped through the de-mapping layer, and the reconstructed data are output in the original dimension.

In Figure 2, $\omega_1$ represents the network weight iw(1,1) between the input layer and the mapping layer, $\omega_2$ is the weight lw(2,1) between the mapping layer and the bottleneck layer, $\omega_3$ is the weight lw(3,2) between the bottleneck layer and the de-mapping layer, and $\omega_4$ is the weight lw(4,3) between the de-mapping layer and the output layer. $b_1$ is the offset parameter b(1) of the mapping layer, $b_2$ is the offset parameter b(2) of the bottleneck layer, $b_3$ is the offset parameter b(3) of the de-mapping layer, and $b_4$ is the offset parameter b(4) of the output layer.
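According to Figure 2, writing $X$ for the input vector and $Y_1$ to $Y_4$ for the outputs of the mapping, bottleneck, de-mapping, and output layers, the output vector of each layer can be obtained as follows:
$$Y_1 = f_1(\omega_1 X + b_1) \tag{2}$$
$$Y_2 = f_2(\omega_2 Y_1 + b_2) \tag{3}$$
$$Y_3 = f_3(\omega_3 Y_2 + b_3) \tag{4}$$
$$Y_4 = f_4(\omega_4 Y_3 + b_4) \tag{5}$$
According to Equations (2) to (5), the relationship between the output and the input of the AANN can be obtained as
$$Y_4 = f_4\big(\omega_4 f_3\big(\omega_3 f_2\big(\omega_2 f_1(\omega_1 X + b_1) + b_2\big) + b_3\big) + b_4\big) \tag{6}$$
In Equation (6), $f_1$ and $f_3$ are the transfer functions of the mapping layer and the de-mapping layer, respectively, for which the logsig function is used [21]:
$$f_1(x) = f_3(x) = \frac{1}{1 + e^{-x}} \tag{7}$$
$f_2$ is the transfer function of the bottleneck layer, for which the tansig function is used:
$$f_2(x) = \frac{2}{1 + e^{-2x}} - 1 \tag{8}$$
$f_4$ is the transfer function of the output layer, for which the purelin function is used:
$$f_4(x) = x \tag{9}$$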
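To make the layer-by-layer mapping concrete, the following is a minimal pure-Python sketch of an AANN forward pass with the logsig, tansig, and purelin transfer functions named above. The helper names, the random initial parameters, and the sample input are assumptions for illustration; the 6-23-5-23-6 layer sizes are the structure the paper later selects from Table 3.

```python
import math
import random

def logsig(x):
    """Log-sigmoid transfer function, used for the mapping and de-mapping layers."""
    return 1.0 / (1.0 + math.exp(-x))

def tansig(x):
    """Hyperbolic tangent sigmoid; tanh(x) equals 2 / (1 + exp(-2x)) - 1."""
    return math.tanh(x)

def purelin(x):
    """Linear transfer function of the output layer."""
    return x

def layer(weights, bias, inputs, transfer):
    """One fully connected layer: transfer(W * inputs + b), row by row."""
    return [transfer(sum(w * v for w, v in zip(row, inputs)) + b)
            for row, b in zip(weights, bias)]

def random_params(sizes, rng):
    """Random weights and offsets for consecutive layer sizes (illustrative only)."""
    params = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        w = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        b = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
        params.append((w, b))
    return params

def aann_forward(x, params):
    """Input -> mapping -> bottleneck -> de-mapping -> output."""
    transfers = [logsig, tansig, logsig, purelin]  # f1, f2, f3, f4
    out = x
    for (w, b), f in zip(params, transfers):
        out = layer(w, b, out, f)
    return out

rng = random.Random(0)
params = random_params([6, 23, 5, 23, 6], rng)
# Six standardized wind speeds in, six reconstructed values out (untrained network).
reconstruction = aann_forward([0.1, 0.2, 0.3, 0.4, 0.5, 0.6], params)
```

With untrained random parameters the output is not a meaningful reconstruction; the sketch only shows how the four weight matrices and offset vectors are applied in sequence.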

RWSSA-AANN Model Establishment
AANN is a feedforward neural network trained on a large amount of data. During training, the weights and offset parameters of the nodes in the hidden layers are adjusted automatically according to an error function. The initial values of the weights and offset parameters affect the performance of the whole network, and because they are generated randomly, the results differ each time. Using the Random Walk Improved Sparrow Search Algorithm (RWSSA) to optimize the AANN weights and offset parameters before training the network improves the performance of the neural network and helps it avoid local optima. RWSSA is superior to the Genetic Algorithm (GA), Particle Swarm Optimization (PSO), and other algorithms in terms of accuracy, convergence speed, stability, and robustness. It can provide high accuracy and reduce errors in high-dimensional problems. To improve the search ability, a random walk is used to perturb the optimal sparrow. At the beginning of the iterations, the random walk boundary is large, which improves the global search ability. After several iterations, the random walk boundary narrows, which improves the local search around the optimal position.

Mathematical Model of SSA
Sparrows are gregarious birds whose main activities are foraging and breeding the next generation. These two behaviors are likened to two roles, predators and breeders (called joiners below). The rules for establishing the model can be found in the literature [22].
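The position matrix of the sparrows is expressed as:
$$M = \begin{bmatrix} m_{1,1} & m_{1,2} & \cdots & m_{1,j} \\ m_{2,1} & m_{2,2} & \cdots & m_{2,j} \\ \vdots & \vdots & & \vdots \\ m_{i,1} & m_{i,2} & \cdots & m_{i,j} \end{bmatrix} \tag{10}$$
where $i$ is the number of sparrows and $j$ is the dimension of the variables to be optimized. The fitness values of all sparrows are expressed as:
$$F_M = \begin{bmatrix} f([m_{1,1}\ m_{1,2}\ \cdots\ m_{1,j}]) \\ f([m_{2,1}\ m_{2,2}\ \cdots\ m_{2,j}]) \\ \vdots \\ f([m_{i,1}\ m_{i,2}\ \cdots\ m_{i,j}]) \end{bmatrix} \tag{11}$$
The value of each row in Equation (11) represents the fitness value of one individual. $P_{nmun}$ is the number of predators, whose positions are constantly updated. Following the standard SSA formulation [22], the position of a predator is updated in each iteration as follows:
$$M_{a,b}^{t+1} = \begin{cases} M_{a,b}^{t} \cdot \exp\left(\dfrac{-a}{\rho \cdot G}\right), & P_2 < T \\ M_{a,b}^{t} + Q \cdot L, & P_2 \ge T \end{cases} \tag{12}$$
where $t$ is the current iteration, $b = 1, 2, \ldots, j$, and $M_{a,b}^{t}$ represents the value of the $b$-th dimension of the $a$-th sparrow at iteration $t$. $\rho$ is a random number, $\rho \in (0, 1)$; $P_2$ is the alarm value, $P_2 \in (0, 1)$; $T$ is the safety threshold; $G$ is the maximum number of iterations; $n$ is the current number of iterations, $n \in (1, G)$. $Q$ is a random number and $L$ represents a $1 \times j$ matrix.
If $P_2 < T$, there are no threats around the sparrow flock, and the predator can enter a large-scale search mode. When $P_2 \ge T$, there are threats nearby, and all sparrows need to fly quickly to other safe areas. The position of a joiner is updated as follows:
$$M_{a,b}^{t+1} = \begin{cases} Q \cdot \exp\left(\dfrac{M_{w}^{t} - M_{a,b}^{t}}{a^{2}}\right), & a > i/2 \\ M_{q}^{t} + \left| M_{a,b}^{t} - M_{q}^{t} \right| \cdot A^{+} \cdot L, & \text{otherwise} \end{cases} \tag{13}$$
where $M_{w}^{t}$ is the worst position in the current population, $M_{q}^{t}$ is the best position occupied by the predator, $A$ is a $1 \times j$ matrix whose elements are randomly assigned to $-1$ or $1$, and $A^{+} = A^{T}(AA^{T})^{-1}$.
When $a > i/2$, the $a$-th joiner, which has poor fitness, is most likely to starve. $S_{nmun}$ is the number of sparrows that are aware of danger, and its value is 10% to 20% of the total number $i$. Their initial positions are generated randomly, and their positions are updated as follows:
$$M_{a,b}^{t+1} = \begin{cases} M_{best}^{t} + \beta \cdot \left| M_{a,b}^{t} - M_{best}^{t} \right|, & f_a > f_g \\ M_{a,b}^{t} + k \cdot \left( \dfrac{\left| M_{a,b}^{t} - M_{w}^{t} \right|}{(f_a - f_w) + \varepsilon} \right), & f_a = f_g \end{cases} \tag{14}$$
where $M_{best}^{t}$ is the current global best position, which is the position of the center of the group. $\beta$ is a step-size coefficient, a normally distributed random number with mean 0 and variance 1. $k$ is the control coefficient indicating the direction of sparrow movement, $k \in [-1, 1]$. $f_a$ is the current fitness value of the sparrow, and $f_g$ and $f_w$ are the global best and worst fitness values. $\varepsilon$ is a very small constant that avoids a zero denominator. $f_a > f_g$ indicates that the current sparrow is at the edge of the group. $f_a = f_g$ indicates that a sparrow in the center of the group is aware of the danger and needs to move closer to the others.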
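As an illustration of the two search modes described for the predator, the sketch below implements the producer update of the standard SSA [22] for a single sparrow. It is a hedged example rather than the authors' code: the function name and arguments are hypothetical, $\rho$ is drawn once per sparrow, $Q$ is taken as a standard normal random number, and $L$ as a vector of ones, all of which follow the standard formulation rather than anything stated in this text.

```python
import math
import random

def update_predator(position, rank, max_iter, alarm, threshold, rng):
    """Standard SSA producer ('predator') update for one sparrow.

    position  : current coordinates of the sparrow (list of floats)
    rank      : 1-based index a of the sparrow in the population
    max_iter  : maximum number of iterations G
    alarm     : alarm value P2 in (0, 1)
    threshold : safety threshold T
    rng       : random.Random instance supplying the random numbers
    """
    if alarm < threshold:
        # No threat nearby: wide search, every coordinate is scaled by
        # exp(-a / (rho * G)), which lies strictly between 0 and 1.
        rho = 1.0 - rng.random()  # in (0, 1], so the division is always defined
        factor = math.exp(-rank / (rho * max_iter))
        return [value * factor for value in position]
    # Threat detected: shift every coordinate by the same random step Q
    # (assumed standard normal; L is assumed to be a vector of ones).
    q = rng.gauss(0.0, 1.0)
    return [value + q for value in position]

rng = random.Random(1)
start = [1.0, -2.0, 3.0]
searching = update_predator(start, rank=1, max_iter=30, alarm=0.2, threshold=0.8, rng=rng)
fleeing = update_predator(start, rank=1, max_iter=30, alarm=0.9, threshold=0.8, rng=rng)
```

The joiner and guard updates follow the same pattern, with the worst, best, and current positions of the population passed in as additional arguments.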

Random Walk Strategy
The mathematical expression of the random walk process is:
$$M(t) = \left[0,\ \mathrm{cumsum}(2r(t_1) - 1),\ \ldots,\ \mathrm{cumsum}(2r(t_n) - 1)\right] \tag{15}$$
In Equation (15), $M(t)$ is the set of random walk steps, cumsum is the cumulative sum, $t$ is the number of random walk steps, and $r(t)$ is a random function defined as:
$$r(t) = \begin{cases} 1, & \mathrm{rand} > 0.5 \\ 0, & \mathrm{rand} \le 0.5 \end{cases} \tag{16}$$
In Equation (16), $\mathrm{rand} \in [0, 1]$. Since the feasible region is bounded, Equation (15) cannot be used directly to update the position of the sparrow. To ensure that the random walk stays within the feasible region, it is normalized according to Equation (17):
$$M_i^t = \frac{(M_i^t - E_i)(N_i^t - J_i^t)}{H_i - E_i} + J_i^t \tag{17}$$
In Equation (17), $E_i$ is the minimum value of the random walk of the $i$-th dimension variable, $H_i$ is the maximum value of the random walk of the $i$-th dimension variable, $J_i^t$ is the minimum value of the $i$-th dimension variable in the $t$-th iteration, and $N_i^t$ is the maximum value of the $i$-th dimension variable in the $t$-th iteration.
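The random walk and its normalization can be sketched as follows. This is an illustrative example under stated assumptions: the walk is the cumulative sum of steps of +1 or -1 as described for Equations (15) and (16), the normalization is ordinary min-max scaling into the current bounds of one variable, and the function names, seed, and bounds are hypothetical.

```python
import random

def random_walk(steps, rng):
    """Cumulative sum of +1/-1 steps, starting from 0."""
    walk = [0.0]
    for _ in range(steps):
        r = 1.0 if rng.random() > 0.5 else 0.0   # the random function r(t)
        walk.append(walk[-1] + (2.0 * r - 1.0))  # step is 2r - 1, i.e. +1 or -1
    return walk

def normalize_walk(walk, lower, upper):
    """Scale a walk into [lower, upper] so it stays inside the feasible region.

    min(walk) and max(walk) play the roles of E_i and H_i; lower and upper
    play the roles of J_i^t and N_i^t.
    """
    e, h = min(walk), max(walk)
    if h == e:  # degenerate walk with no spread: place it at the lower bound
        return [lower for _ in walk]
    return [(m - e) * (upper - lower) / (h - e) + lower for m in walk]

rng = random.Random(42)
walk = random_walk(100, rng)
# Hypothetical bounds for one weight of the network in the current iteration.
scaled = normalize_walk(walk, -0.5, 0.5)
```

Shrinking `lower` and `upper` toward the best sparrow as the iterations proceed gives the narrowing boundary described earlier, which shifts the search from global to local.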

RWSSA-AANN
The RWSSA-AANN model is established and used for fault diagnosis in the following steps:
Step 1. Data processing: The selected historical measurement data were standardized to eliminate the influence of differences in absolute value between the data during model training.
Step 2. Setting the number of hidden layer nodes of the AANN and determining the network structure: Because the wind turbine group determined earlier contains six turbines, the number of nodes in the input layer and in the output layer was six. The training data of the six wind turbines were used with three evaluation indicators: the first is the MSE, the second is the running time, and the third is the reduced noise level (RNL), which is defined in terms of $\sigma_O^2$, the average noise variance of the output values, and $\sigma_I^2$, the average noise variance of the input values. The Levenberg-Marquardt algorithm was used as the training method, with the goal set to 0.00001, the number of epochs to 100, and the learning rate to 0.01. The training results of the different network structures are shown in Table 3, and the selected structure is 6-23-5-23-6.
Step 3. Defining the dimension $k$ of the RWSSA algorithm: $R$ was set as the number of input layer nodes, $S_1$ as the number of mapping layer nodes, $S_2$ as the number of bottleneck layer nodes, $S_3$ as the number of de-mapping layer nodes, and $T$ as the number of output layer nodes. The search space dimension can thus be written as $k = R \times S_1 + S_1 \times S_2 + S_2 \times S_3 + T \times S_3 + S_1 + S_2 + S_3 + T$, which gives 563.
Step 4. Setting the parameters of RWSSA: On the basis of multiple experiments, the number of iterations $G$ was set to 30, the population size $i$ to 50, and the proportion of predators $P_{nmun}$ in $i$ to 70%, with the rest being joiners. The safety threshold $T_1$ was 0.8, and the number of guard sparrows $D$, which is 10-20% of the population $i$, was taken as 10 in this study. $P_2$ is a random value generated by rand(1).
Step 5. Determining the RWSSA fitness value: Using the set parameters, the RWSSA algorithm was run iteratively; at the end of each iteration the fitness values were sorted and the best fitness value was updated.
Step 6. Location update: Update the positions of the predator sparrows, the joiner sparrows, and the guard sparrows, calculate the fitness values, and use the random walk to update the sparrow positions.
Step 7. Optimal solution generation: Check whether the stopping condition is satisfied. If it is, the optimal fitness value $f_g$ and the global optimal position set $M_{best}$ are obtained, which give the optimal weights and offset parameters. If it is not, repeat Step 6.
Step 8. Optimizing the AANN: The optimal weights and offset parameters obtained are assigned to $\omega$ and $b$ in Equation (6), which improves the prediction accuracy of the AANN model.
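The search-space dimension in Step 3 and the assignment in Step 8 can be checked with a short sketch. The function names are hypothetical, and the ordering used to split a flat position vector (all weight matrices first, then all offset vectors) is an assumption, since the paper does not state the layout.

```python
def search_dimension(r, s1, s2, s3, t):
    """Number of weights plus offsets in an r-s1-s2-s3-t AANN."""
    weights = r * s1 + s1 * s2 + s2 * s3 + s3 * t
    offsets = s1 + s2 + s3 + t
    return weights + offsets

def unpack(position, sizes):
    """Split one flat sparrow position into per-layer (weights, offsets) pairs.

    Assumed layout: the four weight matrices in layer order (row-major),
    followed by the four offset vectors in layer order.
    """
    pairs = list(zip(sizes, sizes[1:]))
    cursor = 0
    weights = []
    for n_in, n_out in pairs:
        rows = [position[cursor + k * n_in: cursor + (k + 1) * n_in]
                for k in range(n_out)]
        weights.append(rows)
        cursor += n_in * n_out
    offsets = []
    for _, n_out in pairs:
        offsets.append(position[cursor: cursor + n_out])
        cursor += n_out
    if cursor != len(position):
        raise ValueError("position length does not match the network structure")
    return list(zip(weights, offsets))

k = search_dimension(6, 23, 5, 23, 6)   # 138 + 115 + 115 + 138 + 57
layers = unpack([0.0] * k, [6, 23, 5, 23, 6])
```

For the 6-23-5-23-6 structure this reproduces the value of 563 given in Step 3, so each sparrow position is a vector of 563 numbers.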

Evaluation Criteria
The correctness and effectiveness of the constructed model were evaluated using two criteria, namely, the MSE and the fault detection rate (FDR). The MSE was calculated using the following expression:
$$\mathrm{MSE} = \frac{1}{mn} \sum_{j=1}^{m} \sum_{i=1}^{n} \left( o_i^j - p_i^j \right)^2 \tag{19}$$
where $m$ is the number of training (or testing) samples, $n$ is the input dimension, $o_i^j$ is the $i$-th network target input value of the $j$-th training/testing sample, and $p_i^j$ is the $i$-th network predicted output value of the $j$-th training/testing sample. The smaller the MSE, the higher the accuracy of the model.
The FDR can be expressed as:
$$\mathrm{FDR} = \frac{G}{N} \times 100\% \tag{20}$$
where $G$ is the number of correct fault predictions, and $N$ is the total number of faults.
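A small sketch of the two criteria follows. It assumes that the MSE is averaged over both the samples and the input dimension, and that the FDR is the fraction of truly faulty points that the model flags; the function names and the toy values are illustrative only.

```python
def mse(targets, predictions):
    """Mean square error over m samples of dimension n."""
    m = len(targets)
    n = len(targets[0])
    total = sum((o - p) ** 2
                for row_o, row_p in zip(targets, predictions)
                for o, p in zip(row_o, row_p))
    return total / (m * n)

def fault_detection_rate(true_faults, flagged):
    """Share of actual fault points that were flagged as faults."""
    total = sum(1 for t in true_faults if t)
    if total == 0:
        raise ValueError("no faults in the reference labels")
    correct = sum(1 for t, f in zip(true_faults, flagged) if t and f)
    return correct / total

# Toy values: two samples of dimension two, and four labelled points.
error = mse([[1.0, 2.0], [3.0, 4.0]], [[1.0, 2.0], [3.0, 6.0]])
rate = fault_detection_rate([True, True, False, True], [True, False, False, True])
```

Multiplying `rate` by 100 gives the FDR as a percentage.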

Realization of Fault Diagnosis of Anemometer
According to the characteristics of the AANN model, several sensor faults are simulated and diagnosed. The detection results show that the model can detect sensor faults accurately and quickly.

Fault Diagnosis Instructions
The training data are 20,000 sets of data selected from 0:05 on 1 January 2017 to 21:15 on 19 May 2017. The test data are 2000 sets of data from 13:25 on 26 May to 10:35 on 9 June. Suppose that $O_1$ to $O_6$ represent the actual data of the six anemometers, whereas $P_1$ to $P_6$ represent the six predicted outputs of the model. $e_i$ is the difference between input $O_i$ and output $P_i$, where $i = 1, \ldots, 6$. Whether a fault has occurred is judged by comparing $e_i$ with the threshold $\varepsilon$.
The threshold can be selected depending on the testing time, reliability, sensitivity, etc. Here, the residual value corresponding to the 95% confidence limit of each variable in the difference matrix is taken as the fault discrimination threshold, and the thresholds of the six wind turbines are shown in Table 4. When the residual value exceeds the threshold, a failure is judged to have occurred. Because the AANN model captures the correlation between the input parameters, it can use the values of the other, fault-free sensors to reconstruct the values of the faulty sensors.
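The thresholding and reconstruction logic can be sketched as below. This is a simplified illustration, not the authors' procedure: the 95% confidence limit is approximated here by the empirical 95th percentile of the absolute residuals on fault-free data, and the function names and toy values are hypothetical.

```python
import math

def residual_threshold(residuals, confidence=0.95):
    """Value below which the given share of absolute residuals falls."""
    ordered = sorted(abs(e) for e in residuals)
    if not ordered:
        raise ValueError("no residuals given")
    # Nearest-rank percentile (an assumption about how the limit is computed).
    index = min(len(ordered) - 1, int(math.ceil(confidence * len(ordered))) - 1)
    return ordered[index]

def flag_faults(observed, predicted, threshold):
    """True wherever the residual |O - P| exceeds the threshold."""
    return [abs(o - p) > threshold for o, p in zip(observed, predicted)]

def reconstruct(observed, predicted, flags):
    """Replace flagged measurements with the model output, keep the rest."""
    return [p if f else o for o, p, f in zip(observed, predicted, flags)]

# Toy example for one anemometer (values are not from the paper).
limit = residual_threshold(range(1, 101))
flags = flag_faults([5.0, 7.0], [5.1, 5.0], 0.5)
repaired = reconstruct([5.0, 7.0], [5.1, 5.0], flags)
```

In the multi-sensor case the same check is applied to each of the six residuals $e_1$ to $e_6$ with its own threshold from Table 4.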

Analog Fault Diagnosis
The sensor faults considered include a single fixed deviation fault, a single failure (invalidation) fault, a single drift fault, and multiple simultaneous deviation and failure faults. The fault states were simulated by injecting fault data into the 2000 sets of normal test data. A fixed deviation fault means that the difference between the test value and the real value is an approximately constant number; a deviation of 2 m/s was added to the NO. 2 wind turbine during the period from points 1500 to 1750. A failure fault means that the test value drops to a certain value and remains unchanged; the wind speed of the NO. 3 wind turbine was slowly reduced to 3 m/s during the period from points 500 to 750. A drift fault means that the difference between the test value and the true value increases with time; an offset with a gain of 0.002 was injected into the NO. 5 wind turbine during the period from points 1000 to 1280. For the multiple fault setting, the wind speed of the NO. 1 wind turbine was reduced to 1.8 m/s from points 851 to 1200, and a fixed deviation with an amplitude of 2 m/s was added to the NO. 6 wind turbine from points 750 to 1019.

The residual curve in Figure 3b shows a deviation fault in the anemometer of the NO. 2 wind turbine. The single deviation fault occurred at the 1500th data point, i.e., at 23:15 on 5 June 2017, and was detected quickly. The residuals of the fault-free wind turbine anemometers remain below the detection threshold. Figure 4 shows the faulty wind speed and the reconstructed wind speed. The reconstruction error is 0.1878.

Figure 5 shows the single invalidation fault. Figure 5c shows the invalidation fault of the NO. 3 wind turbine anemometer. The fault was detected at the 500th data point, i.e., at 0:35 on 30 May 2017, and can be identified as an invalidation fault. The residuals of the other, fault-free sensors remain below the detection threshold. A few points are false positives; they can be ignored because their number is very small. Figure 6 shows the faulty wind speed and the reconstructed wind speed of the NO. 3 wind turbine. The reconstruction error is 0.2929.

Figure 7 shows the failure of multiple anemometers. In Figure 7a, the failure of the NO. 1 wind turbine anemometer is detected at point 851, i.e., at 11:05 on 1 June 2017. In Figure 7f, the fixed deviation fault of the NO. 6 wind turbine anemometer is detected quickly at the 750th data point, i.e., at 18:15 on 31 May 2017. The residuals of the other, fault-free wind turbine anemometers remain below the threshold, apart from a very small number of brief exceedances that can be ignored. Figure 8 shows the faulty wind speed and the reconstructed wind speed of the NO. 1 and NO. 6 wind turbines. The reconstruction error is 0.3772 for NO. 1 and 0.1812 for NO. 6.
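The three simulated fault types can be sketched as simple transformations of a wind speed series. This is an illustrative sketch, not the authors' injection code. It assumes 0-based indices with an exclusive end, models the failure fault as an immediate hold at a constant value (the gradual reduction mentioned above is not modeled), and interprets the drift gain as an offset that grows linearly by `gain` per sample. All function names are hypothetical.

```python
def inject_bias(series, start, end, bias):
    """Fixed deviation fault: add a constant offset on [start, end)."""
    out = list(series)
    for i in range(start, end):
        out[i] += bias
    return out


def inject_stuck(series, start, end, value):
    """Failure fault: hold the reading at a constant value on [start, end)."""
    out = list(series)
    for i in range(start, end):
        out[i] = value
    return out


def inject_drift(series, start, end, gain):
    """Drift fault: add an offset that grows by `gain` per sample on [start, end)."""
    out = list(series)
    for i in range(start, end):
        out[i] += gain * (i - start)
    return out


# Example with a constant 8 m/s series of 2000 points (synthetic data).
normal = [8.0] * 2000
biased = inject_bias(normal, 1500, 1750, 2.0)    # like the NO. 2 setting
stuck = inject_stuck(normal, 500, 750, 3.0)      # like the NO. 3 setting
drifted = inject_drift(normal, 1000, 1280, 0.002)  # like the NO. 5 setting
```

Each function returns a new list and leaves the input untouched, so several fault scenarios can be derived from the same normal test set.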

Actual Data Verification
To verify the feasibility and effectiveness of the model, SCADA data for wind turbines Nos. 1-6 of a wind turbine group, provided by a company, were used for testing.

Actual Data Processing
The actual data cover five years, from 0:05 on 1 January 2013 to 23:55 on 31 December 2017, sampled at 10 min intervals. The data for each wind turbine therefore consist of 52,560 groups per 365-day year. In practice, not every year contains the full number of groups, because the data at some intermediate time points are missing. To ensure the reliability of the test, Lagrangian interpolation was used to fill in each missing value from the data before and after it.
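The gap-filling step can be sketched as follows. This is a minimal illustration, assuming that missing samples are marked as `None` and that each gap is filled from up to two valid neighbours on each side; the paper does not state the window size, so `window=2` is an assumption, and the function names are hypothetical.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate the Lagrange polynomial through the points (xs[k], ys[k]) at x."""
    total = 0.0
    for k, (xk, yk) in enumerate(zip(xs, ys)):
        term = yk
        for m, xm in enumerate(xs):
            if m != k:
                term *= (x - xm) / (xk - xm)
        total += term
    return total


def _nearest_valid(series, start, step, count):
    """Collect up to `count` indices of non-missing samples, walking from `start`."""
    idx = []
    j = start
    while 0 <= j < len(series) and len(idx) < count:
        if series[j] is not None:
            idx.append(j)
        j += step
    return idx


def fill_missing(series, window=2):
    """Fill None entries using valid neighbours before and after each gap."""
    filled = list(series)
    for i, value in enumerate(series):
        if value is not None:
            continue
        idx = sorted(_nearest_valid(series, i - 1, -1, window)
                     + _nearest_valid(series, i + 1, 1, window))
        if not idx:
            raise ValueError("series contains no valid samples")
        filled[i] = lagrange_interpolate(idx, [series[j] for j in idx], i)
    return filled
```

A small window keeps the polynomial degree low, which limits the oscillation that high-degree Lagrange polynomials can show over long gaps.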

Fault Diagnosis of the Actual Data
Based on the five-year data described above, fault diagnosis was carried out with the established model. Figure 9 shows the invalidation fault of the anemometers of wind turbines Nos. 1-6 in the 2014 data. The residual curve in Figure 9d shows that the invalidation fault of the NO. 4 wind turbine anemometer occurs at the 27,618th data point, which corresponds to 18:55 on 11 July 2014. The corresponding wind speed plot clearly shows that the anemometer failed during this period. Because various factors influence a sensor during its operation, a few points of the other, fault-free sensors exceed the threshold; no processing is required in such cases. Figure 10 shows the faulty wind speed and the reconstructed wind speed of the NO. 4 wind turbine in 2014.

Figure 11 shows the fixed deviation fault of the anemometers of wind turbines Nos. 1-6 in the 2016 data. Figure 11c shows that the fault of the NO. 3 wind turbine anemometer is identified at the 18,132nd data point, i.e., at 21:55 on 5 May 2016; the curve indicates that it can be approximated as a fixed deviation fault. Figure 12c shows the power curve of the NO. 3 wind turbine. The theoretical power is the minimum power fitted by the cubic spline interpolation method according to the measured wind speed. In Figure 12c, the theoretical power is greater than the actual power from the 18,132nd point to the 18,269th point, which is consistent with an anemometer reading a wind speed higher than the true value. In contrast, the power curves of the other five wind turbines, which did not fail, show that their actual power is greater than the theoretical power over the same period. Figure 13 shows the faulty wind speed and the reconstructed wind speed of the NO. 3 wind turbine in 2016.
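The mapping between a data point index and its timestamp follows from the 10 min sampling interval. The sketch below assumes that indices are 1-based and restart each year with the first sample at 0:05 on 1 January; this assumption reproduces the two timestamps quoted above. The function name `sample_time` is hypothetical.

```python
from datetime import datetime, timedelta


def sample_time(year, index, step_minutes=10):
    """Return the timestamp of the index-th sample (1-based) of a given year.

    Assumption: the first sample of each year is taken at 0:05 on 1 January.
    """
    first = datetime(year, 1, 1, 0, 5)
    return first + timedelta(minutes=step_minutes * (index - 1))


print(sample_time(2014, 27618))  # 2014-07-11 18:55:00
print(sample_time(2016, 18132))  # 2016-05-05 21:55:00
```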

Comparison of the Three Methods
To test the accuracy of the RWSSA-AANN model, it and three other methods, namely AANN, GA-AANN, and PCA, were applied to the actual data for comparison. The MSE and FDR values obtained with each model are compared in Table 5. The results show that the MSE value of the RWSSA-AANN model is lower than that of the AANN, GA-AANN, and PCA models, whereas its FDR value is higher than that of the AANN, GA-AANN, and PCA models. The RWSSA-AANN model therefore exhibits high accuracy and small error, which is beneficial for improving the accuracy of fault diagnosis of wind turbine anemometers.
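The two comparison metrics can be sketched as follows. The definitions used here are assumptions, since this section does not restate them: MSE is taken as the mean squared difference between the reference and reconstructed wind speeds, and FDR is taken as the fault detection rate, i.e., the fraction of truly faulty samples that the model flags. The function names are hypothetical.

```python
def mean_squared_error(actual, predicted):
    """Mean of the squared differences between two equally long sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def fault_detection_rate(is_faulty, is_flagged):
    """Fraction of truly faulty samples that were flagged as faulty.

    Assumption: FDR = detected faulty samples / all faulty samples.
    """
    faulty_total = sum(1 for f in is_faulty if f)
    if faulty_total == 0:
        raise ValueError("no faulty samples to evaluate")
    detected = sum(1 for f, g in zip(is_faulty, is_flagged) if f and g)
    return detected / faulty_total
```

Under these definitions, a lower MSE indicates a closer reconstruction and a higher FDR indicates that fewer faulty samples are missed, which matches the direction of the comparison reported in Table 5.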

Conclusions
2. Given the complex characteristics of wind speed, such as long time delay and time variation, it is feasible to group wind turbines with high similarity through cluster analysis.
3. An AANN model is used to establish the wind speed fault diagnosis model of the wind turbines, and RWSSA is used to optimize the weights and offset parameters of the model. The simulation results show that the model can be applied to wind turbine sensor fault diagnosis.
4. Faults are detected in the actual SCADA data, and the correctness of the diagnosis results is verified by the correlation between the wind speed and the power of the wind turbine.
5. A comparison of the AANN, GA-AANN, and RWSSA-AANN models on the actual data shows that the RWSSA-AANN model has smaller error, higher detection accuracy, and better stability.
6. Later research will focus on optimizing the model, reducing the running time, and improving the accuracy. The invalidation fault detection rate of the model needs to be further improved.