Research on Misalignment Fault Isolation of Wind Turbines Based on the Mixed-Domain Features

Abstract: The misalignment of the drive system of the DFIG (Doubly Fed Induction Generator) wind turbine is one of the important factors that cause damage to the gears and bearings of the high-speed gearbox and to the generator bearings. How to use limited information to accurately determine the type of failure remains a difficult research problem. In this paper, time-domain and frequency-domain indexes are extracted from the vibration signals of the wind turbine drive system under various simulated misalignment conditions, and the time-frequency domain feature, energy entropy, is extracted by IEMD (Improved Empirical Mode Decomposition). Together they form a mixed-domain feature set. Then, SVM (Support Vector Machine) is used as the classifier with the mixed-domain features as its inputs, and PSO (Particle Swarm Optimization) is used to optimize the parameters of SVM. The misalignment fault types are classified successfully. Compared with other methods, the accuracy of the proposed fault isolation model is improved.


Introduction
With the rapid development of wind power, many wind turbines develop faults in operation because most of them are installed in remote areas and their loads are unstable. Operation and maintenance costs of up to 25-30% seriously restrict the development of the wind power industry [1,2]. Therefore, how to reduce the risk of wind turbine operation and decrease the cost of wind power generation has become a research topic for many scholars. Fault isolation of the wind turbine is the bottleneck in solving these problems; it is the key to ensuring long-term stable operation of the wind turbine, and it is a necessary means of realizing condition-based maintenance of wind turbines [3].
The variable speed constant frequency doubly fed wind turbine is the main type in wind farms [4], and its simplified model is shown in Figure 1. The fault rate of the gearbox and the generator is relatively high [5]. There are many reasons for gearbox and generator faults, and the misalignment between them is one of the most important [6]. This is mostly because:
1. Precise alignment in the wind turbine is very difficult;
2. Wind speed fluctuation causes the wind turbine to start and stop frequently, and, as time goes on, this causes the generator to shift or some parts to deform, resulting in misalignment of the generator and the gearbox.
When misalignment occurs, it easily damages the gearbox gears and bearings, the generator bearings and so on. Therefore, the study and fault isolation of the transmission system in the wind turbine is the key to ensuring its long-term stable operation.
Misalignment is a latent fault. When the fault has accumulated to a certain extent, the equipment is seriously damaged, leading to the outage of the wind turbine, seriously affecting power generation and resulting in huge economic losses. For many years, how to isolate latent faults accurately has been a difficult problem. This is mainly because the features of latent faults are not obvious, and researchers lack an understanding of how latent faults develop. In order to identify and isolate the misalignment fault accurately and to overcome the difficulty of obtaining a large number of misalignment fault samples in reality, in this paper, Solidworks and Adams models are established according to the 1.5 MW wind turbine. The transmission system is simulated in Adams under normal operation, parallel misalignment, angle misalignment and comprehensive misalignment. The vibration signal (angular acceleration) of the high-speed shaft is extracted as the information (the established model and its effectiveness are shown in [7]).
The methods of extracting fault features from vibration signals are mainly based on time-domain analysis, frequency-domain analysis and time-frequency domain analysis:
1. Time-domain analysis is one of the simplest and most direct analysis methods, and it is effective in isolating faults to a certain extent. However, its anti-interference ability is poor and, except for serious faults, the time-domain values are close to those in the normal state, so misjudgments are easily made when only time-domain analysis is used.
2. Frequency-domain analysis is the most commonly used method of mechanical equipment fault analysis. It gives more intuitive fault information than time-domain analysis. However, its theoretical basis is Fourier analysis, so it is one-sided and cannot extract the features of the vibration signal comprehensively.
3. Time-frequency domain analysis expresses the vibration signal in the time-domain and the frequency-domain at the same time, which gives it very prominent advantages. The main feature extraction methods based on time-frequency analysis are the Short-Time Fourier Transform, the Wigner-Ville Distribution, the Wavelet Transform, Blind Source Separation, Empirical Mode Decomposition (EMD) and so on.
Due to the complexity of the fault mechanism of the wind turbine transmission system, different feature indexes carry different state information, and they differ in their sensitivity to and regularity with the running state, as well as in their clustering and separability in the model space. Single features or single-domain features can hardly reflect the different conditions of wind turbines with different degrees and types of faults. Therefore, it is necessary to construct a mixed feature library covering the time-domain, the frequency-domain and the time-frequency domain, which has become one of the development trends of fault feature extraction.
Many scholars have applied one or two of the above domains to extract fault isolation features from vibration signals. For example, Ref. [8] extracted the root mean square (RMS), peak value and kurtosis coefficient of the vibration signal, and combined them with the time-domain waveform to judge whether there was a fault in the equipment. In Ref. [9], the kurtosis and peak value of the vibration signal were selected as the time-domain indexes, the wavelet packet algorithm was used to extract the frequency band energy and the 2-norm as the time-frequency domain indexes, and PCA (Principal Component Analysis) was used to confirm the principal components, so that the crack fault of the wind turbine gearbox was diagnosed effectively. In Ref. [10], the EMD method was used to judge the working states and fault types of the gear by calculating the EMD energy entropy of different vibration signals as the input features of a Support Vector Machine, but only the time-frequency domain feature was considered. Ref. [11] used a wind turbine vibration testing system to collect vibration signals, and time-domain and frequency-domain analysis of the wind turbine generator were carried out to obtain the fault features of rotor misalignment, but time-frequency analysis was not considered. In Ref. [12], wavelet analysis was used to denoise the vibration signals of the wind turbine gearbox under normal, wear fault and broken tooth fault conditions. Five feature parameters were extracted and fed into the LVQ (Learning Vector Quantization) neural network, and the results showed that the method can identify the fault quickly and accurately. In Ref. [13], the peak, RMS and kurtosis of the vibration signal were extracted as the time-domain features of the faulty gearbox. By observing and analyzing the frequency-domain features of the normal and fault signals, the possible location of the failure was determined. In Ref. [14], the vibration signals of three typical gearbox states in the wind turbine (normal condition, tooth wear and tooth breakage) were analyzed, and the margin index, kurtosis index, peak index and pulse index in the time-domain and the power spectrum entropy in the frequency-domain were extracted; the time-domain and frequency-domain features were the inputs of the fault isolation. In Refs. [15] and [16], a mixed-domain feature set was constructed to characterize the properties of each fault by combining Empirical Mode Decomposition (EMD) with Autoregression (AR) model coefficients.
A survey of the literature shows that there are few reports on combining time-domain, frequency-domain and time-frequency domain features into a mixed-domain feature library when researching faults of the wind turbine transmission system. In order to extract feature parameters that reflect the vibration signal as fully as possible, and to make the fault isolation more reliable and accurate, in this paper, the time-domain, frequency-domain and time-frequency domain features of the wind turbine vibration signal are first combined to construct the mixed-domain feature library, which provides more comprehensive and accurate fault isolation information. The information entropy of the intrinsic mode function (IMF) components decomposed by EMD is used as the time-frequency feature. Then, the support vector machine, which is suitable for small samples, is used as the isolation tool, and the PSO algorithm is used to optimize the relevant parameters of the support vector machine to obtain better classification performance with high diagnostic accuracy. The results show that, compared with other methods, the proposed method can identify the types of misalignment effectively.

Feature Extraction of Vibration Signal in the Time-Domain
The vibration signal of the transmission system (assumed to be a discrete time series $x_0, x_1, x_2, \ldots, x_{N-1}$ of finite length) contains the running status information, so representative time-domain indexes can be selected as fault feature parameters to help determine whether the drive system has a fault and what type of misalignment has occurred. The time-domain features of the signal include dimensional and dimensionless time-domain indexes [17].

Dimensional Index
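In this paper, the dimensional time-domain indexes selected include the variance, the square root amplitude, the RMS, the standard deviation (SD) and the kurtosis. Their formulas are as follows:

Variance:
$$\sigma^2 = \frac{1}{N-1}\sum_{i=0}^{N-1}\left(x_i - \bar{x}\right)^2$$
where $\bar{x}$ is the mean of the signal.

Square root amplitude:
$$x_r = \left(\frac{1}{N}\sum_{i=0}^{N-1}\sqrt{|x_i|}\right)^2$$

RMS:
$$x_{\mathrm{rms}} = \sqrt{\frac{1}{N}\sum_{i=0}^{N-1} x_i^2}$$

SD:
$$\sigma = \sqrt{\frac{1}{N-1}\sum_{i=0}^{N-1}\left(x_i - \bar{x}\right)^2}$$

Kurtosis:
$$\beta = \frac{1}{N}\sum_{i=0}^{N-1} x_i^4$$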

Dimensionless Index
The dimensionless index is not so sensitive to changes in the amplitude and frequency of the signal; that is, it is not directly related to the working parameters of the wind turbine. The kurtosis index, the waveform index, the peak index, the pulse index and the margin index are commonly used as dimensionless indexes [18].
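An impact component is produced when there is a fault in the transmission system of the wind turbine. The kurtosis index reflects the degree of steepness, and the expression is:
$$K = \frac{\beta}{x_{\mathrm{rms}}^{4}}$$
The waveform index reflects the shape of the time-domain waveform, which is sensitive to the waveform change caused by the fault, and the expression is:
$$S = \frac{x_{\mathrm{rms}}}{\overline{|x|}}$$
The peak index reflects the distribution of large data components in the vibration signal, that is, the degree to which the vibration amplitude distribution deviates from the normal (Gaussian) distribution, and the expression is:
$$C = \frac{x_{\max}}{x_{\mathrm{rms}}}$$
The pulse index represents the magnitude of the impact component in the signal, which is sensitive at the beginning of the fault, and the expression is:
$$I = \frac{x_{\max}}{\overline{|x|}}$$
The margin index is sensitive to the impact signal, and the expression is:
$$L = \frac{x_{\max}}{x_r}$$
Here $\beta$ is the kurtosis, $x_{\mathrm{rms}}$ is the RMS and $x_r$ is the square root amplitude from the dimensional indexes, $x_{\max} = \max_i |x_i|$ is the peak amplitude, and $\overline{|x|} = \frac{1}{N}\sum_{i=0}^{N-1}|x_i|$ is the mean absolute amplitude.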
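As a rough illustration, the ten time-domain indexes can be computed as in the sketch below. It assumes the conventional definitions (sample variance normalized by N − 1, kurtosis as the raw fourth moment, peak taken as the maximum absolute amplitude); the function name `time_domain_features` and the dictionary keys are hypothetical and do not come from the paper.

```python
import math


def time_domain_features(x):
    """Return dimensional and dimensionless time-domain indexes of a sequence x.

    Assumed conventions (not confirmed by the paper): variance uses N - 1,
    kurtosis is the raw fourth moment, peak is max |x_i|.
    """
    n = len(x)
    mean = sum(x) / n
    abs_mean = sum(abs(v) for v in x) / n                    # mean absolute amplitude
    variance = sum((v - mean) ** 2 for v in x) / (n - 1)
    sra = (sum(math.sqrt(abs(v)) for v in x) / n) ** 2       # square root amplitude
    rms = math.sqrt(sum(v * v for v in x) / n)
    kurtosis = sum(v ** 4 for v in x) / n                    # dimensional kurtosis
    peak = max(abs(v) for v in x)
    return {
        # dimensional indexes
        "variance": variance,
        "square_root_amplitude": sra,
        "rms": rms,
        "sd": math.sqrt(variance),
        "kurtosis": kurtosis,
        # dimensionless indexes (ratios of the quantities above)
        "kurtosis_index": kurtosis / rms ** 4,
        "waveform_index": rms / abs_mean,
        "peak_index": peak / rms,
        "pulse_index": peak / abs_mean,
        "margin_index": peak / sra,
    }


features = time_domain_features([0.2, -0.1, 0.4, -0.3, 0.9, -0.2])
```

Because the dimensionless indexes are ratios of amplitude-like quantities, scaling the whole signal by a constant leaves them unchanged, which is consistent with their being insensitive to amplitude.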

Feature Extraction of Vibration Signal in the Frequency-Domain
For a complex vibration signal, the information that can be extracted in the time-domain is limited. Therefore, the time-domain signal is often transformed into a frequency-domain signal by a mathematical method to reveal the frequency components of the signal, so as to extract more information from it. The mathematical basis for the transition from the time-domain to the frequency-domain is the Fourier Transform, and the process is called spectrum analysis. Commonly used spectral analyses include amplitude spectrum analysis, phase spectrum analysis, logarithmic spectrum analysis and power spectrum analysis. The power spectrum is commonly used for frequency-domain feature extraction. It describes the change of the frequency-domain features of the signal through the change of the power centroid and the dispersion degree of the power distribution [19].
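For a discrete time series $(x_0, x_1, x_2, \ldots, x_{N-1})$ of finite length, assuming the sampling frequency is $f_s$, three commonly used frequency-domain parameters are as follows [20]:

Centroid frequency (FC):
$$FC = \frac{\int_0^{+\infty} \omega\, S(\omega)\, d\omega}{\int_0^{+\infty} S(\omega)\, d\omega}$$

Mean square frequency (MSF):
$$MSF = \frac{\int_0^{+\infty} \omega^2\, S(\omega)\, d\omega}{\int_0^{+\infty} S(\omega)\, d\omega}$$

Variance frequency (VF):
$$VF = \frac{\int_0^{+\infty} (\omega - FC)^2\, S(\omega)\, d\omega}{\int_0^{+\infty} S(\omega)\, d\omega}$$

where $S(\omega)$ is the power spectrum of the discrete signal, obtained from the Fourier transform of $x(i)$, and $\omega$ is the angular frequency.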
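The sketch below shows one way these three parameters could be estimated from a sampled signal. It is an illustration under assumptions, not the paper's implementation: the power spectrum is taken as a one-sided periodogram computed with a naive DFT, the integrals are replaced by sums over frequency bins, and frequency is expressed in Hz rather than as angular frequency (which rescales FC by 2π and MSF and VF by 4π²). The function names are hypothetical.

```python
import cmath
import math


def power_spectrum(x, fs):
    """One-sided periodogram via a naive DFT; returns (freqs_hz, power)."""
    n = len(x)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        acc = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        freqs.append(k * fs / n)
        power.append(abs(acc) ** 2 / n)
    return freqs, power


def frequency_domain_features(x, fs):
    """Return (FC, MSF, VF) as power-weighted moments of frequency."""
    freqs, power = power_spectrum(x, fs)
    total = sum(power)
    fc = sum(f * s for f, s in zip(freqs, power)) / total
    msf = sum(f * f * s for f, s in zip(freqs, power)) / total
    vf = sum((f - fc) ** 2 * s for f, s in zip(freqs, power)) / total
    return fc, msf, vf


# A pure 8 Hz sine sampled at 128 Hz: the power sits in a single bin,
# so the centroid lies at that frequency and the spread is close to zero.
signal = [math.sin(2 * math.pi * 8 * i / 128) for i in range(128)]
fc, msf, vf = frequency_domain_features(signal, fs=128)
```

With these definitions VF equals MSF − FC², so only two of the three parameters are independent; all three are still commonly reported together.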

Feature Extraction of Vibration Signal in the Time-Frequency Domain
In this paper, EMD is improved by the method of mirror extension, and the energy entropy of the vibration signal is extracted by the improved EMD (IEMD).
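Assume that the IEMD decomposes the signal into n IMF components and that the energy of the i-th component is [21]:
$$E_i = \sum_{k} |x_{ik}|^2$$
where i = 1, 2, 3, ..., n, and $x_{ik}$ is the amplitude of each discrete point of the i-th component. The expression of the IEMD energy entropy is
$$H_{EN} = -\sum_{i=1}^{n} p_i \log p_i, \qquad p_i = \frac{E_i}{E}, \qquad E = \sum_{i=1}^{n} E_i$$
According to Ref. [7], the energy entropy of the first eight IMF components of the IEMD decomposition is used as the time-frequency domain indexes.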
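To make the energy entropy concrete, the following sketch computes it from IMF components that are assumed to be already available (the decomposition itself, including the mirror extension, is not shown). It assumes the usual definition in which each IMF's energy is normalized by the total energy and the natural logarithm is used; the function name `imf_energy_entropy` is hypothetical.

```python
import math


def imf_energy_entropy(imfs):
    """Return (per-IMF entropy terms, total energy entropy).

    imfs: list of IMF components, each a list of sample amplitudes.
    Assumes p_i = E_i / sum(E) and the natural logarithm.
    """
    energies = [sum(v * v for v in imf) for imf in imfs]
    total = sum(energies)
    probs = [e / total for e in energies]
    # A component with zero energy contributes nothing to the entropy.
    terms = [-p * math.log(p) if p > 0 else 0.0 for p in probs]
    return terms, sum(terms)


# Two components with equal energy give the maximum entropy for n = 2.
terms, entropy = imf_energy_entropy([[1.0, -1.0], [1.0, 1.0]])
```

The entropy is largest when the energy is spread evenly over the IMFs and drops toward zero when one component dominates, which is why it can separate operating conditions whose energy distributions differ.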
The vibration signal of the high-speed shaft is collected under the normal condition and the three misalignment conditions at speeds of 81.3 °/s, 104.4 °/s, 110.7 °/s and 117.0 °/s. With 114 samples for each condition at each speed, the total number of samples is 4 × 4 × 114. The time-domain, frequency-domain and time-frequency domain features of the samples are extracted to construct the mixed-domain fault feature library, which is shown in Table 1.

Fault Isolation of Transmission System Based on PSO-SVM
After the mixed-domain fault feature library of the vibration signal has been established, the samples are divided into two groups: the training set and the testing set. In this paper, there are 4 × 304 training samples and 4 × 152 testing samples. SVM is selected as the classifier and is trained on the training samples.
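As a quick consistency check on these counts, reading the leading factor 4 as the four operating conditions (an interpretation, not stated explicitly in the text), the split accounts for all 4 × 4 × 114 collected samples and corresponds to a 2:1 training-to-testing ratio:

```latex
\begin{aligned}
\text{total}    &= 4 \times 4 \times 114 = 1824, \\
\text{training} &= 4 \times 304 = 1216, \\
\text{testing}  &= 4 \times 152 = 608, \\
1216 + 608      &= 1824, \qquad 304 : 152 = 2 : 1 .
\end{aligned}
```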

The Principles of SVM
SVM was proposed by Vapnik in 1995. The principle of SVM can be briefly described as follows [7]. The given training samples $(x_1, y_1), (x_2, y_2), \ldots, (x_l, y_l)$ (where $x_i \in R^n$, $y_i \in \{1, -1\}$, and $l$ is the number of training samples) can be separated by a hyperplane, which can be expressed as:
$$\omega \cdot x + b = 0$$
where $\omega$ and $b$ represent the normal vector and the constant term, respectively. When $\omega$ and $b$ take their best values, the optimal hyperplane has been found, which makes the distance between the positive and negative samples the largest.
For linearly inseparable problems, the samples can be mapped into a high-dimensional feature space by the kernel function $K(x_i, x_j)$ to realize linear separability. The objective function is as follows:
$$\max_{a}\; \sum_{i=1}^{l} a_i - \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l} a_i a_j y_i y_j K(x_i, x_j), \qquad \text{s.t.}\; \sum_{i=1}^{l} a_i y_i = 0,\; 0 \le a_i \le C$$
where $a_i$ is the Lagrange multiplier; $C$ is the punishment factor, whose basic function is to control the penalty on wrongly classified samples; and $K(x_i, x_j)$ is the kernel function, whose basic function is to replace the inner product of the vectors in the high-dimensional space by a computation on the low-dimensional vectors.
The kernel functions used in SVM are mainly the linear kernel function, the polynomial kernel function, the radial basis kernel function and the sigmoid kernel function. Many studies and experiments show that the radial basis function (RBF) is a better choice. Therefore, in this paper, the RBF is used as the kernel function. The function is as follows:
$$K(x_i, x_j) = \exp\left(-g\,\|x_i - x_j\|^2\right)$$
where $g$ is the kernel parameter.
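For intuition about the kernel parameter, the short sketch below evaluates an RBF kernel on two feature vectors. It assumes the common form K(x_i, x_j) = exp(−g‖x_i − x_j‖²); the function name `rbf_kernel` is hypothetical.

```python
import math


def rbf_kernel(xi, xj, g):
    """RBF kernel value, assuming K = exp(-g * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-g * sq_dist)


# The same pair of points looks less similar as g grows.
wide = rbf_kernel([0.0, 0.0], [1.0, 1.0], g=0.1)
narrow = rbf_kernel([0.0, 0.0], [1.0, 1.0], g=10.0)
```

A small g makes distant samples still appear similar, giving a smoother decision boundary, while a large g makes the kernel very local; this is one reason the choice of g matters for generalization.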
In SVM, the parameter C determines the training error and the generalization ability of the classifier, and the parameter g affects the distribution of the samples in the feature space. Their selection is very important for the performance of the classifier. In this paper, PSO is used to select C and g to improve the classification performance.

Particle Swarm Optimization
PSO is a swarm intelligence optimization algorithm, alongside the ant colony algorithm and the fish swarm algorithm, and it was first proposed by Kennedy and Eberhart in 1995 [7]. The principle can be briefly described as follows.
Suppose that in a D-dimensional search space there is a population $X = (X_1, X_2, \ldots, X_n)$ consisting of n particles. The i-th particle is represented as a D-dimensional vector $X_i = [x_{i1}, x_{i2}, \ldots, x_{iD}]$, which gives the position of the i-th particle in the D-dimensional search space and also represents a potential solution to the problem. According to the objective function, the fitness value of each particle's position $X_i$ can be calculated. If the velocity of the i-th particle is $V_i = [V_{i1}, V_{i2}, \ldots, V_{iD}]$, the individual extremum is $P_i = [P_{i1}, P_{i2}, \ldots, P_{iD}]$ and the population extremum is $P_g = [P_{g1}, P_{g2}, \ldots, P_{gD}]$, then the velocity and position of the particle are updated according to the following equations:
$$V_{id}^{k+1} = V_{id}^{k} + c_1 r_1 \left(P_{id}^{k} - x_{id}^{k}\right) + c_2 r_2 \left(P_{gd}^{k} - x_{id}^{k}\right)$$
$$x_{id}^{k+1} = x_{id}^{k} + V_{id}^{k+1}$$
where d = 1, 2, ..., D; i = 1, 2, ..., n; k is the number of iterations; $c_1$ and $c_2$ are learning factors, which are non-negative constants that make the particles learn from their own best position and from the best position of the whole group so as to move toward them; and $r_1$ and $r_2$ are random numbers distributed in [0, 1] that maintain the diversity of the group. In order to prevent a blind search by the particles, it is generally recommended to limit the position and velocity to the ranges $[-X_{\max}, X_{\max}]$ and $[-V_{\max}, V_{\max}]$.
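The update scheme described above can be sketched as follows. This is a minimal illustration, assuming the basic velocity update with two learning factors and clamping of position and velocity; the function name `pso_minimize`, the default parameter values and the test objective are all hypothetical. In a PSO-SVM setting the objective would presumably be some validation error of the SVM as a function of (C, g); a simple quadratic stands in for it here.

```python
import random


def pso_minimize(objective, dim, n_particles=20, iters=100,
                 c1=1.5, c2=1.5, x_max=5.0, v_max=1.0, seed=0):
    """Basic PSO sketch: returns (best position, best objective value)."""
    rng = random.Random(seed)

    def clip(value, limit):
        return max(-limit, min(limit, value))

    X = [[rng.uniform(-x_max, x_max) for _ in range(dim)] for _ in range(n_particles)]
    V = [[rng.uniform(-v_max, v_max) for _ in range(dim)] for _ in range(n_particles)]
    P = [x[:] for x in X]                       # individual extrema
    P_fit = [objective(x) for x in X]
    g = min(range(n_particles), key=P_fit.__getitem__)
    Pg, Pg_fit = P[g][:], P_fit[g]              # population extremum

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                V[i][d] = clip(V[i][d]
                               + c1 * r1 * (P[i][d] - X[i][d])
                               + c2 * r2 * (Pg[d] - X[i][d]), v_max)
                X[i][d] = clip(X[i][d] + V[i][d], x_max)
            fit = objective(X[i])
            if fit < P_fit[i]:                  # update individual extremum
                P[i], P_fit[i] = X[i][:], fit
                if fit < Pg_fit:                # update population extremum
                    Pg, Pg_fit = X[i][:], fit
    return Pg, Pg_fit


def sphere(p):
    """Stand-in objective with its minimum at the origin."""
    return sum(v * v for v in p)


best_pos, best_val = pso_minimize(sphere, dim=2)
```

The sketch minimizes its objective; to maximize a fitness such as classification accuracy, the negated fitness can be passed instead.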
The realization of the fault isolation based on PSO-SVM (in the model, PSO is used to select the parameters of the SVM classifier) is shown in Figure 2.
The results of optimizing the SVM parameters using PSO and the mixed-domain features of the wind turbine are shown in Figure 3: the optimal fitness of the PSO algorithm is 83.3882%, and the optimal parameters C and g are obtained. The classification results of the testing set using the optimized SVM classifier are shown in Figure 4, where the category label "0" indicates the normal condition, "1" the parallel misalignment, "2" the angle misalignment and "3" the comprehensive misalignment. The classification accuracy of the training set obtained with the optimized SVM classifier is 97.9441%, and the classification accuracy of the testing set is 92.1053%, so the fault classification accuracy is high. In order to better illustrate the superiority of the PSO-SVM algorithm proposed in this paper, the same fault features are used with SVM, GridSearch-SVM (the parameters of SVM are optimized by GridSearch) and GA-SVM (the parameters of SVM are optimized by a Genetic Algorithm), and the testing results are shown in Figure 5a-c and Table 5.
From the simulation results, it can be seen that fault isolation by SVM alone has a poor recognition effect. With GridSearch-SVM, the training accuracy and testing accuracy are better, but not higher than those of PSO-SVM. The training accuracy of GA-SVM is very high, but its generalization ability is lower than that of PSO-SVM. Therefore, it is better to use the PSO-SVM algorithm to isolate the fault.

Comparison of Fault Isolation Results with Different Fault Features
In this paper, the time-domain, frequency-domain and time-frequency domain features are used to isolate the misalignment fault. In order to assess the superiority of the extracted fault features, the paper also adopts the time-domain indexes, the frequency-domain indexes, the IEMD energy entropy and their pairwise combinations as the inputs of SVM, and the recognition results are shown in Figure 6 and Table 6. From Figure 6 and Table 6, it can be seen that when the time-domain, frequency-domain and time-frequency domain features are used separately as inputs of the PSO-SVM isolation model, the accuracies of the training set and the testing set are low, and the recognition results are not good. When two kinds of features are combined as the inputs, the training set accuracy is above 91%, the testing set accuracy is more than 84%, and the recognition rate is increased. When the three kinds of features are combined, the accuracy of the training set reaches 97.9441%, and the accuracy of the testing set reaches 92.1053%. Therefore, it can be concluded that the mixed-domain features proposed in this paper, which combine the three kinds of domain extractions, can effectively reflect the characteristics of the misalignment fault and can be used to classify the fault types. It is also interesting to note that, whether used individually or combined with energy entropy, the time-domain features seem to provide better identification results than the frequency-domain features. This is because, when the wind turbine transmission system fails, it produces impact components, and time-domain features reflect them well, so better results are obtained when using time-domain features.
Algorithms 2017, 10, 67

Conclusions
This paper is based on the time-domain, frequency-domain and time-frequency domain analysis of vibration signals of the wind turbine in four different operating conditions with four kinds of rotating speeds.The mixed-domain features are extracted in the paper, which are the inputs of SVM classifiers.The parameters of SVM are optimized using the PSO algorithm.The testing results show that the proposed model for misalignment fault isolation is effective compared with other commonly used algorithms.
When the wind turbine operates under misalignment, the temperature and electrical signals also change in addition to the mechanical response, so ANSYS, MATLAB and other software should be included in further simulations to obtain temperature and electrical parameters and thus study the fault characteristics more comprehensively.
At the same time, it should be pointed out that the method presented in this paper is a general method and can be used elsewhere.

Figure 1. Simplified model of a doubly fed wind turbine.

Figure 6. Testing results of PSO-SVM using different input features: (a) using time-domain features only; (b) using frequency-domain features only; (c) using IEMD energy entropy features only; (d) using time-domain + frequency-domain features; (e) using frequency-domain + IEMD energy entropy features; and (f) using time-domain + IEMD energy entropy features.

Table 1. Mixed-domain feature library. Tables 2-4 are partial data of the extracted time-domain, frequency-domain and time-frequency domain indexes. Type 0, 1, 2, 3 represent normal state, parallel misalignment, angle misalignment and comprehensive misalignment, respectively.

Table 2. The partial data of the time-domain indexes.

Table 3. The partial data of the frequency-domain indexes.

Table 4. The partial data of the energy entropy of the first eight IMF components of IEMD.

Table 5. Comparison of PSO-SVM with other commonly used classifiers.

Table 6. PSO-SVM classification results of different fault features.