Fault Diagnosis Method for Wind Turbine Gearboxes Based on IWOA-RF

A fault diagnosis method for wind turbine gearboxes based on undersampling, XGBoost feature selection, and an improved whale optimization algorithm-random forest (IWOA-RF) is proposed to address the high false negative and false positive rates in wind turbine gearbox fault diagnosis. First, the normal samples in the raw data were undersampled, and XGBoost feature selection was used to analyze the importance of each feature with respect to the data labels, so that the features most strongly correlated with the labels could be selected. Two parameters of the random forest algorithm were then optimized with the whale optimization algorithm (WOA), using a fitness function with the false negative rate (FNR) and false positive rate (FPR) as evaluation indexes, and the minimum fitness value within the given parameter ranges was sought. The step size of the WOA is controlled by the hyper-parameter α. This article uses a variant of the sigmoid function to change the trend of α from a linear decline to a rapid decline followed by a slow decline. In the initial stage, a larger step size and step-size change rate move the model toward the optimization target faster, while in the later stage a smaller step size and step-size change rate allow the model to locate the minimum of the fitness function more accurately. Finally, the two hyper-parameters corresponding to the minimum fitness value were substituted into the random forest algorithm for model training. The results showed that the proposed method significantly reduces the false negative and false positive rates compared with other optimization classification methods.


Introduction
Wind power generation [1] is of great importance in the clean energy power generation sector. In recent years, driven by policies of carbon neutrality and carbon emission peaks [2], China has focused on the development of clean energy, in which wind power is one of the most mature technologies and is expanding in scale [3]. However, as the use of wind turbines expands, the number of wind turbine accidents is also increasing [4]. Wind turbine faults occur frequently because of variable working conditions and year-round exposure to sun, rain, sandstorms, and other severe weather [5]. As a result, faults such as gearbox faults [6], main bearing faults [7], and generator faults [8] lead to maintenance downtime. Wind turbine maintenance is difficult and expensive because it involves high-altitude operations, which require considerable manpower and material resources and can incur huge economic losses. For this reason, monitoring and maintenance of wind turbines is of the utmost importance to avoid secondary damage and reduce maintenance cost and difficulty [9].
The wind turbine gearbox [10] is an important transmission part, but it has the highest fault frequency. Data [11] show that gearbox faults lead to the longest maintenance downtime. Common wind turbine gearbox faults occur in gears and bearings [12]. A changeable running environment and complex operating conditions [13] require long-term adjustment of the load operation, causing excessive wear of the mechanical equipment, including wear and pitting corrosion of the gearbox gears, tooth surface deformation and tooth cracks, and an increased likelihood of cracks in the inner ring, outer ring, and cage of the bearing [14].
Machine learning algorithms are widely used in wind turbine fault diagnosis. Xiang [15] cascaded a convolutional neural network with a long short-term memory (LSTM) network to give a warning when a wind turbine enters an abnormal state. Kordestani [16] combined dynamic principal component analysis (DPCA) with the support vector machine (SVM) to identify dynamic fault states. Trizoglou [17] proposed an extreme gradient boosting model for fault diagnosis of wind turbine gearboxes with better prediction accuracy than LSTM and a lower calculation cost. Liu [18] proposed a hybrid model to predict the oil temperature of wind turbine gearboxes. Pan [19] utilized the deep belief network (DBN) and self-organizing map (SOM) to reduce the noise of the sensor's characteristic signal and then improved the particle filter (PF) algorithm via the fruit fly optimization algorithm (FOA) to predict the service life of the gearbox. The above methods improved the accuracy of the model, but they also made the model more complex, increased training time, and worsened real-time performance, which makes some of them impractical to apply.
The random forest (RF) [20] algorithm is an extended variant of bagging and belongs to the family of ensemble algorithms. It is simple, easy to implement, computationally inexpensive, and has strong classification performance. The random forest algorithm is widely used in machine learning for fault diagnosis. In the field of medicine, Asadi [21] used an improved RF to diagnose patients with heart disease, and P. K. P [22] used RF with Bayesian optimization to realize an accurate classification of breast cancer. In traffic, a mixed method combining random forest regression and the maximum information coefficient was used to predict whether a flight would be delayed [23]. In agriculture, Makungwe [24] combined the linear mixture model with RF to predict soil pH range. The RF algorithm was used in this paper for the fault diagnosis of wind turbine gearboxes.
Feature selection [25] is necessary to find the optimal feature subset, as redundant features will increase model training time and irrelevant features will reduce the accuracy of the model in the dataset. Common feature selection methods include Pearson's correlation coefficient, chi-square test, XGBoost [26], variance selection, mutual information and maximum information coefficient, etc. In this study, XGBoost feature selection was used to process the data and identify the correlation between each feature and gearbox fault more conveniently and intuitively.
An appropriate classification algorithm selected for fault diagnosis may not achieve the expected results. For this reason, the hyper-parameters in the classification algorithm should be adjusted, namely, subject to optimization. Long [27] adopted the grey wolf optimization algorithm and proposed an exploration-enhanced grey wolf optimization algorithm (EEGWO) [28]. Long [29] also proposed an enhanced adoptive butterfly optimization algorithm for PV model parameter identification. Tang [30] proposed a cost-sensitive large margin distributor (CLDM) for fault diagnosis of wind turbine generators to reduce the influence of data category imbalance, and the cost-sensitive extreme random forest algorithm (CS-ERF) was proposed [31] to further study the fault of wind turbine generators.
In recent years, a large number of researchers have begun to use machine learning methods to realize the intelligent diagnosis of wind turbine gearbox faults. McKinnon [32], using SCADA data, compared three algorithms, one-class support vector machine (OCSVM), isolation forest (IF), and elliptical envelope (EE), on the same fault. The average accuracy of OCSVM was 82%, which was better than IF and EE, but the accuracy of fault classification required improvement. Yang [33] proposed a method based on a deep joint variational autoencoder to detect wind turbine gearbox faults. This method reduced the FNR, but the improvement of the FPR was not obvious. Corley [34] combined thermal modeling and machine learning methods to detect gearbox faults. The method had less conclusive results, which may have affected the accuracy of fault detection. Tang [35] improved the LightGBM method, which solved the problem of low computational efficiency and poor real-time performance of the traditional GBDT algorithm, but there was also a risk that it would fall into a local optimum.
However, there are still some problems to be solved in practical engineering applications. Overfitting and local optima can easily occur due to the strong capability of the random forest algorithm. A fault diagnosis model based on a random forest algorithm whose hyper-parameters are optimized by WOA was proposed to find the globally optimal parameters of the wind turbine gearbox fault diagnosis model and thereby improve its diagnostic performance.
The whale optimization algorithm (WOA) [36] is a bionic algorithm imitating whale hunting. It mainly includes three search mechanisms: (1) a shrinkage encirclement mechanism and (2) a spiral mechanism, which realize the local search of the algorithm, and (3) a random learning strategy, which realizes the global search. With the advantages of a simple process, a fast rate of convergence, and excellent performance in solving optimization problems, the WOA is widely used in various applications.
This article exploits the advantages of WOA, flexibly controls the optimization speed by changing the hyper-parameter of WOA, and changes the location update strategy to make it more random and reduce the risk of WOA falling into a local optimum. The hyper-parameters to be optimized in the random forest algorithm were substituted into the improved whale optimization algorithm (IWOA). The optimal value was quickly found through the three search methods, and the optimal parameter values were then returned to the RF for training. Finally, fault diagnosis was carried out with the trained model.

Random Forest Algorithm
RF builds a bagging ensemble with decision trees as the base learners and further introduces random attribute selection into the training of each decision tree. For attribute selection and partition, a traditional decision tree selects an optimal attribute among all attributes of the current node, whereas RF first randomly selects a subset of k attributes from the attribute set of the node and then selects an optimal attribute from that subset for the partition. The training subsets, drawn by sampling with replacement, are fed into different base learners for training. The classification results of all learners are combined by voting, and the result with the largest number of votes is the final prediction of the RF. Figure 1 shows the flow chart of the random forest algorithm.
The diversity of the base learners in the random forest is achieved by sample disturbance and attribute disturbance. The increased difference among individual learners further improves the generalization performance of the final ensemble.
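The two sources of diversity described above, sample disturbance (bootstrap sampling) and attribute disturbance (a random subset of k attributes), together with majority voting, can be sketched in plain Python. This is a minimal illustration rather than the implementation used in this study; the helper names are hypothetical and the base learners themselves are left abstract.

```python
import random
from collections import Counter

def bootstrap_sample(n_samples, rng):
    """Sample disturbance: draw n_samples row indices with replacement."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]

def random_attribute_subset(n_attributes, k, rng):
    """Attribute disturbance: pick k distinct candidate attributes for a node."""
    return rng.sample(range(n_attributes), k)

def majority_vote(predictions):
    """Return the label predicted by the largest number of base learners."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(0)
rows = bootstrap_sample(10, rng)                      # training subset of one tree
attrs = random_attribute_subset(n_attributes=8, k=3, rng=rng)
label = majority_vote(["fault", "normal", "fault"])   # votes of three trees
```

Each tree would be trained on its own `rows` and would consider only `attrs` when splitting a node, so no two trees see exactly the same data or attributes.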
The hyper-parameters of the random forest include the number of decision trees in the forest, the depth of the decision trees, the number of features considered at the optimal splitting point, the criterion used to measure the splitting, and the minimum number of samples on the leaf node, all of which directly affect the accuracy of the random forest classification and the time required [37]. Based on experience, the number of decision trees and the minimum number of samples on the leaf node were selected for optimization in this study.

Whale Optimization Algorithm (WOA)
Derived through the imitation of whale hunting, WOA [38] is a meta-heuristic algorithm based on swarm intelligence, which mainly includes three search mechanisms: realizing the local search of the algorithm via shrinkage encirclement mechanism and spiral mechanism, and the global search of the algorithm via random learning strategy. It has the advantage of being a simple process with a fast rate of convergence.

Search Prey Model
The mathematical model of the whale searching for prey is as follows:

$$D = |C \cdot X_{rand}(t) - X(t)|$$

$$X(t+1) = X_{rand}(t) - A \cdot D$$

where t is the current iteration number, t ≥ 1, and $X_{rand}$ is a randomly selected whale position vector. When |A| ≥ 1, a search location is randomly selected to update the positions of the other whales, forcing the whale to move away from the current prey and thereby find a more appropriate prey. In this case, the exploration ability of the algorithm is strengthened so that the WOA can perform a global search.

Mathematical Model
The mathematical model of WOA for encircling prey is as follows:

$$D = |C \cdot X^*(t) - X(t)|$$

$$X(t+1) = X^*(t) - A \cdot D$$

In the above equations, t is the current iteration number; A and C are coefficient vectors; $X^*(t)$ represents the best position for whale hunting, namely the current optimal solution; $X(t)$ represents the current position vector of the whale; and D represents the distance between the whale and the current optimal position. The coefficient vectors A and C can be represented in the following form:

$$A = 2\alpha \cdot r_1 - \alpha$$

$$C = 2 \cdot r_2$$

where $r_1$ and $r_2$ are two random numbers in (0, 1) and α is the hyper-parameter that controls the whale's moving step length. The value of α decreases curvilinearly from 2 to 0.
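The role of α can be illustrated with a short Python sketch. The exact sigmoid variant used for α is not given in this section, so `alpha_sigmoid` below is only an assumed form with the stated shape (it starts at 2, drops quickly, then flattens toward 0), and the steepness `k` is a hypothetical constant. For simplicity the sketch draws one scalar A and C per whale; `coefficients` uses A = 2α·r1 − α and C = 2·r2.

```python
import math
import random

def alpha_linear(t, max_iter):
    """Schedule of the original WOA: linear decline from 2 to 0."""
    return 2.0 * (1.0 - t / max_iter)

def alpha_sigmoid(t, max_iter, k=6.0):
    """Assumed sigmoid-type schedule: equals 2 at t = 0 and approaches 0.

    k is a hypothetical steepness constant, not a value from the paper.
    """
    return 4.0 / (1.0 + math.exp(k * t / max_iter))

def coefficients(alpha, rng):
    """Return (A, C) with A in [-alpha, alpha] and C in [0, 2]."""
    r1, r2 = rng.random(), rng.random()
    return 2.0 * alpha * r1 - alpha, 2.0 * r2

def encircle(x, x_best, A, C):
    """Shrinking encirclement: D = |C*X* - X|, X(t+1) = X* - A*D, per dimension."""
    return [xb - A * abs(C * xb - xi) for xi, xb in zip(x, x_best)]

rng = random.Random(0)
A, C = coefficients(alpha_sigmoid(10, 100), rng)
new_position = encircle([1.0, -2.0], [0.5, 0.5], A, C)
```

Because |A| never exceeds α, a faster early decline of α shortens the phase with large steps and leaves more iterations for fine adjustment near the best position.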

Hunting Mathematical Model
In order to reduce the possibility of the model falling into a local optimum, this article uses the randomness of the parameters in the hunting model to obtain three random moving positions of the whale and takes the average of these three positions as the next position. According to the whale's spiral movement for hunting, the mathematical model of the whale's position update is as follows:

$$X(t+1) = D_p \cdot e^{bl} \cdot \cos(2\pi l) + X^*(t)$$

where $D_p = |X^*(t) - X(t)|$ represents the distance between the whale and the prey; b is a constant that defines the shape of the helix; and l is a random number in (−1, 1). The whale swims toward the prey along a helix with increasing curvature. Suppose that the shrinkage encirclement mechanism is selected with probability σ and the spiral mechanism with probability 1 − σ to update the whale's location. The mathematical model is as follows:

$$X(t+1) = \begin{cases} X^*(t) - A \cdot D, & \text{with probability } \sigma \\ D_p \cdot e^{bl} \cdot \cos(2\pi l) + X^*(t), & \text{with probability } 1 - \sigma \end{cases}$$

When the whale attacks the prey, α is set to decrease as the whale gets closer to the prey. The fluctuation range of A decreases along with α: as α decreases from 2 to 0 during iteration, A is a random value within [−α, α]. When A lies within [−1, 1], the whale may appear at any position between its current position and the position of the prey. When |A| < 1, the whale attacks the prey.

Table 1 presents the functions and value ranges of the two parameters optimized by the IWOA-RF, the main optimization model used in this experiment: the number τ of decision trees and the minimum number δ of samples on the leaf node. Algorithm 1 is the original WOA. This paper combines the improved WOA with RF to form the IWOA-RF; the specific details of the improvement are reflected in its pseudo-code.
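The averaging of three random moving positions can be sketched as follows. This is one possible reading of the description above, not the paper's pseudo-code: each of the three candidates independently picks the encirclement or the spiral mechanism, and the defaults `sigma=0.5` and `b=1.0` are assumed values.

```python
import math
import random

def spiral_move(x, x_best, b, rng):
    """Spiral update: X(t+1) = D_p * exp(b*l) * cos(2*pi*l) + X*, l in (-1, 1)."""
    l = rng.uniform(-1.0, 1.0)
    factor = math.exp(b * l) * math.cos(2.0 * math.pi * l)
    return [abs(xb - xi) * factor + xb for xi, xb in zip(x, x_best)]

def encircle_move(x, x_best, alpha, rng):
    """Shrinking encirclement with freshly drawn coefficients A and C."""
    A = 2.0 * alpha * rng.random() - alpha
    C = 2.0 * rng.random()
    return [xb - A * abs(C * xb - xi) for xi, xb in zip(x, x_best)]

def hunting_step(x, x_best, alpha, sigma=0.5, b=1.0, rng=random):
    """Average three independently randomized candidate positions."""
    candidates = []
    for _ in range(3):
        if rng.random() < sigma:
            candidates.append(encircle_move(x, x_best, alpha, rng))
        else:
            candidates.append(spiral_move(x, x_best, b, rng))
    # Component-wise mean of the three candidate positions.
    return [sum(values) / 3.0 for values in zip(*candidates)]

next_position = hunting_step([1.0, 4.0], [0.0, 2.0], alpha=1.2, rng=random.Random(1))
```

Averaging damps the effect of any single extreme draw of A, C, or l while still injecting more randomness per step than a single update.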

Optimization Algorithm Flow
The optimization process of WOA (Algorithm 1) is as follows:
4: [...] with the smallest fitness value as the optimal position;
5: Update the location of the next generation;
6: If t = max_iter, output the optimal individual, namely the optimal solution found by the algorithm. Otherwise (t < max_iter), set t = t + 1 and return to step 4;
Output: the best parameter vector X* as the optimal position of dimension dim.
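The loop structure of the original WOA can be sketched in Python as below. This is a compact illustration of the standard algorithm on a toy objective, not the code used in this study; the population size, the selection probability of 0.5, the bound clipping, and the function name `woa_minimize` are assumptions.

```python
import math
import random

def woa_minimize(fitness, bounds, n_whales=20, max_iter=100, b=1.0, seed=0):
    """Minimal standard WOA; returns (best_position, best_fitness)."""
    rng = random.Random(seed)
    dim = len(bounds)

    def clip(value, i):
        low, high = bounds[i]
        return min(max(value, low), high)

    # Initialize the population and record the best whale found so far.
    whales = [[rng.uniform(low, high) for low, high in bounds] for _ in range(n_whales)]
    best = list(min(whales, key=fitness))
    best_fit = fitness(best)

    for t in range(max_iter):
        alpha = 2.0 * (1.0 - t / max_iter)  # linear decline of the original WOA
        for w in range(n_whales):
            x = whales[w]
            A = 2.0 * alpha * rng.random() - alpha
            C = 2.0 * rng.random()
            if rng.random() < 0.5:
                # |A| >= 1: search around a random whale; |A| < 1: encircle the best.
                ref = whales[rng.randrange(n_whales)] if abs(A) >= 1.0 else best
                new = [ref[i] - A * abs(C * ref[i] - x[i]) for i in range(dim)]
            else:
                l = rng.uniform(-1.0, 1.0)
                s = math.exp(b * l) * math.cos(2.0 * math.pi * l)
                new = [abs(best[i] - x[i]) * s + best[i] for i in range(dim)]
            whales[w] = [clip(new[i], i) for i in range(dim)]  # next generation
        # The whale with the smallest fitness becomes the optimal position.
        candidate = min(whales, key=fitness)
        if fitness(candidate) < best_fit:
            best, best_fit = list(candidate), fitness(candidate)
    return best, best_fit  # output once t reaches max_iter

def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)

best, best_fit = woa_minimize(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
```

In the IWOA-RF setting the toy objective would be replaced by a fitness built from the FNR and FPR of a random forest trained with the candidate hyper-parameters.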

IWOA-RF Pseudo-Code
The implementation of the proposed RF hyper-parameter optimization is detailed in Algorithm 2.
Algorithm 2: Implementation of IWOA-RF fault detection method.
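One step that any such implementation needs is the mapping from a continuous whale position to the two integer RF hyper-parameters τ and δ. A minimal sketch is given below; the function name and the default ranges are placeholders, not the value ranges of Table 1.

```python
def decode_position(position, tau_range=(10, 200), delta_range=(1, 20)):
    """Map a 2-D continuous position to integer (tau, delta) within bounds.

    tau_range and delta_range are placeholder bounds, not the values of Table 1.
    """
    def to_int(value, bounds):
        low, high = bounds
        return int(min(max(round(value), low), high))

    tau = to_int(position[0], tau_range)      # number of decision trees
    delta = to_int(position[1], delta_range)  # minimum samples on a leaf node
    return tau, delta

tau, delta = decode_position([57.6, 3.2])
```

With scikit-learn, for example, the pair could be passed as `RandomForestClassifier(n_estimators=tau, min_samples_leaf=delta)` before the fitness of the position is evaluated.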

Results
The wind turbine gearbox was taken as the experimental object. Due to the large number of sensors in the turbine, which receive data, the complex relationship among data features, and the different size of the relationship between features and faults, the

Data Description
The operation data of a 1.5 MW wind turbine in a wind farm in Inner Mongolia was used in this experiment. This wind farm contains 33 wind turbines. The data of a randomly selected wind turbine with a gearbox fault was used. The data interval was 1 min. For the rigor of the experiment, the data from 5 h before the failure, the data at the time of the failure, and the data from 5 h after the failure were selected. The gearbox is an important variable-speed transmission in wind turbines, mainly composed of gears, bearings, and transmission shafts. Most faults occur on the gearbox's gears and bearings. The gearbox and other components were monitored with sensors that measure the bearing temperature, gear speed, generated power, and other parameters of the turbine. Figure 3 presents the overall system diagram of the wind turbine.

Figure 2 is the overall flow chart of IWOA-RF. The original data was cleaned and under-sampled to obtain a standard dataset. The standard dataset was divided into a training set and a test set with a split function, and both were substituted into the IWOA. The optimal parameter vector of the current RF was obtained in each cycle, and the final optimal parameter vector was obtained after the maximum number of cycles. Finally, the test set was substituted into the RF to obtain the FNR and FPR.

The original data of the turbine shown in Table 2 includes a portion of the data of the No. 8 wind turbine collected in January.

Data Cleaning and Preprocessing
Because the dataset is large, incomplete, null, and otherwise invalid values in the original data, caused by sensor problems and other reasons, can be discarded directly to obtain a non-null dataset. The wind turbine runs normally the majority of the time, leading to an excess of normal samples and unbalanced categories. In this paper, the number of normal samples in the original data was reduced by undersampling so that the numbers of normal samples and fault samples were balanced. The dataset also contained pitch faults, yaw faults, and other faults at certain points in time. This experiment aimed to study the gearbox fault, so the data of other, disturbing faults was filtered out to obtain a standard dataset containing only a single fault type and normal samples. After cleaning, the data was standardized.
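The cleaning steps above (dropping incomplete rows, undersampling the normal class, and standardizing) can be sketched with the Python standard library. This is an illustrative sketch under assumptions: random undersampling to a 1:1 ratio and z-score standardization are assumed, since the exact variants are not stated here, and all function names are hypothetical.

```python
import random
import statistics

def drop_incomplete(rows, labels):
    """Discard every row that contains a missing (None) value."""
    keep = [i for i, row in enumerate(rows) if all(v is not None for v in row)]
    return [rows[i] for i in keep], [labels[i] for i in keep]

def undersample(rows, labels, normal_label=0, seed=0):
    """Randomly drop normal samples until they match the number of fault samples."""
    rng = random.Random(seed)
    normal = [i for i, y in enumerate(labels) if y == normal_label]
    fault = [i for i, y in enumerate(labels) if y != normal_label]
    kept = sorted(rng.sample(normal, min(len(normal), len(fault))) + fault)
    return [rows[i] for i in kept], [labels[i] for i in kept]

def standardize(rows):
    """Scale each column to zero mean and unit variance (z-score)."""
    columns = list(zip(*rows))
    means = [statistics.mean(c) for c in columns]
    stds = [statistics.pstdev(c) or 1.0 for c in columns]  # guard constant columns
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

raw = [[1.0, 10.0], [2.0, None], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]
tags = [0, 0, 0, 0, 1]  # 0 = normal, 1 = gearbox fault
clean_rows, clean_tags = drop_incomplete(raw, tags)
balanced_rows, balanced_tags = undersample(clean_rows, clean_tags)
scaled = standardize(balanced_rows)
```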
Each feature has different sensitivity to different faults of wind turbines. XGBoost was used to screen all features once in this experiment to select the more important features more accurately and reduce the complexity of the model.

Feature Selection
More features do not necessarily lead to higher training accuracy in machine learning [39]. On the contrary, a large number of redundant features in the data will not only slow down classification, but also reduce the classification accuracy of the model and increase the risk of missed and false alarms. XGBoost was used for feature selection; it evaluates candidate split features in parallel. Multiple threads were used to try each feature as the feature to be split and to find the optimal splitting point of each feature. After the gain produced by each split was calculated, the feature with the largest gain was selected as the feature to be split. Table 3 shows the importance of each feature obtained after XGBoost feature selection. Figure 4 visualizes part of the feature importance to distinguish more clearly the importance of each feature to the label.

From Figure 4, it is evident that the generator power, generator shaft torque, and generator inlet oil temperature are the main factors affecting gearbox faults. Feature screening was performed by calculating the importance of each feature. Features with low importance not only affected the accuracy of the model, but also consumed a considerable amount of computing power, causing poor real-time performance. For this reason, the features whose importance fell below the mean value were discarded and those above the mean value were retained. The mean feature importance was 0.026316, and, thus, 12 features were retained.
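The mean-threshold screening rule can be written in a few lines of Python. The sketch assumes the importance scores are already available as a mapping from feature name to score (for instance from the `feature_importances_` attribute of a fitted XGBoost model); the feature names and scores below are made up for illustration.

```python
def select_above_mean(importances):
    """Keep the features whose importance exceeds the mean importance."""
    threshold = sum(importances.values()) / len(importances)
    kept = [name for name, score in importances.items() if score > threshold]
    return kept, threshold

# Hypothetical scores, not the values of Table 3.
scores = {
    "generator_power": 0.50,
    "shaft_torque": 0.30,
    "inlet_oil_temperature": 0.15,
    "ambient_temperature": 0.05,
}
kept, threshold = select_above_mean(scores)
```

When the importances are normalized to sum to 1, the threshold is simply one divided by the number of features; the reported mean of 0.026316 is consistent with 38 candidate features.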

Performance Evaluation Index of Fault Diagnosis
The unimproved whale optimization algorithm (WOA), particle swarm optimization (PSO) [40], and salp swarm algorithm (SSA) were compared with the IWOA to verify the effectiveness and superiority of the IWOA-RF in the diagnosis of wind turbine gearbox faults. The three random forest models optimized by these algorithms and the model optimized by the improved whale optimization algorithm (IWOA) were applied to three new datasets to test the reliability of the model.
The false negative rate (FNR) and false positive rate (FPR) were taken as performance evaluation indexes in this paper. FNR and FPR can be represented in the following form:

$$FNR = \frac{FN}{TP + FN}$$

$$FPR = \frac{FP}{FP + TN}$$

where TP, TN, FP, and FN are the entries of the confusion matrix shown in Table 4.

Table 4. Confusion matrix of classification results.

| Actual situation | Predicted failure | Predicted normal |
|---|---|---|
| Actual fault | TP | FN |
| Actual normal | FP | TN |
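Both indexes follow directly from the four confusion-matrix counts, as the short Python sketch below shows. The `fitness` function is only a hypothetical way of combining the two rates into a single value to minimize; the equal weights are an assumption, since the exact form of the fitness function is not given in this section.

```python
def fnr_fpr(tp, fn, fp, tn):
    """FNR = FN / (TP + FN); FPR = FP / (FP + TN)."""
    fnr = fn / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return fnr, fpr

def fitness(tp, fn, fp, tn, w_fnr=0.5, w_fpr=0.5):
    """Hypothetical fitness: weighted sum of FNR and FPR (weights are assumptions)."""
    fnr, fpr = fnr_fpr(tp, fn, fp, tn)
    return w_fnr * fnr + w_fpr * fpr

# 100 actual faults (2 missed) and 200 actual normal samples (1 false alarm).
rates = fnr_fpr(tp=98, fn=2, fp=1, tn=199)
score = fitness(tp=98, fn=2, fp=1, tn=199)
```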

Experimental Results
The FNR and FPR of dataset 1 were obtained after multiple iterations of the model, as shown in Figure 5. Under the same amount of training, Figure 5a shows that the average FNR of the unimproved random forest model exceeds 8%, while the average FNR of PSO-RF, SSA-RF, and WOA-RF is about 4%, an obvious improvement. The FNR of IWOA-RF, however, is kept below 2%, which is better than the other three optimization methods. Figure 5b shows that the average FPR of all five models is less than 1%. The average FPR of PSO-RF, SSA-RF, and WOA-RF is higher than 0.5% in each case, and their classification results fluctuate greatly; this instability may indicate that the algorithms have fallen into a local optimum. In contrast, the FPR of IWOA-RF is kept between 0.1% and 0.3%, with small fluctuation, and its classification effect is better than that of the other four algorithms.
The FNR and FPR of dataset 2 are shown in Figure 6. Comparing Figures 5 and 6, the general tendency of the FNR and FPR for dataset 2 is similar to that for dataset 1. The FNR of IWOA-RF is lower than that of the other four algorithms and is kept below 4%. In Figure 6b, the average FPR of SSA-RF is slightly higher than that of IWOA-RF, and the FPR of SSA-RF fluctuates over too large a range. Overall, the classification performance of IWOA-RF is better than that of the other three optimization methods.
The FNR and FPR of dataset 3 are shown in Figure 7.
The FPR of all five models was almost zero, which may be because the data quality of dataset 3 was relatively good and it contained less fault data, which had a greater impact on the model. Figure 7a,b compares the FNR and FPR of IWOA-RF with those of the other three optimized models. There is little difference among the optimized models in FNR and FPR, especially in FPR: the FPR of all five models reaches zero, and the overall average FPR is close to zero. The average FNR of the four optimization algorithms is less than 1%, and the FNR of IWOA-RF is even closer to zero. This indicates that the diagnostic effect of IWOA-RF is better than that of the other three optimization algorithms.
Figure 6. (a) Boxplot of FNR for dataset 2; (b) Boxplot of FPR for dataset 2.

Conclusions
A fault diagnosis method for wind turbine gearboxes based on undersampling, XGBoost feature selection, and IWOA-RF was proposed in this paper to solve the problem of high FNR and FPR in wind turbine gearbox fault diagnosis. The main contributions of this paper are as follows: (1) undersampling and XGBoost feature selection were adopted to reduce the dimension of the original data and eliminate the negative impact of data category imbalance and redundant features on the model; and (2) an improved whale optimization algorithm was used to optimize the number of classifiers and the minimum number of samples on leaf nodes in the random forest. The IWOA-RF was compared with the WOA-RF, PSO-RF, and SSA-RF in terms of FNR and FPR on three datasets. The results showed that the proposed method can effectively reduce the FNR and FPR during fault diagnosis under