Diagnosis of Broken Bars in Wind Turbine Squirrel Cage Induction Generator: Approach Based on Current Signal and Generative Adversarial Networks

To keep the wind industry profitable, one of the most important objectives is to minimize maintenance costs. For this reason, the components of wind turbines are continuously monitored, and failures are detected by analyzing the signals measured by the sensors of the condition monitoring system. Most proposals for the detection and diagnosis of faults based on signal processing and artificial intelligence models use a fault-free signal and a signal acquired on a system in which a fault has been deliberately provoked. However, detection and diagnosis become difficult when the failures are incipient, the frequency components associated with them lie very close to the fundamental component, and the data are incomplete. Therefore, the purpose of this research is to detect and diagnose failures of the electric generator of wind turbines in operation, using the current signal and applying generative adversarial networks to obtain synthetic data that counteract the problem of an unbalanced dataset. The proposal is useful for the detection of broken bars in squirrel cage induction generators that, according to the control system, were in a healthy state.


Introduction. Fault Diagnosis in Wind Turbines by Means of the Current Signal
Due to the remote locations where wind farms are installed and the considerable height of the wind turbines (WTs), condition-based maintenance predominates in the wind industry [1][2][3]. The detection and diagnosis of failures of WT components is usually performed using signal processing techniques applied to the vibration signal [4]. However, according to [5], the vibration signal has some disadvantages that can be overcome using the current signal, since with this signal it is possible to detect not only electrical faults, but also mechanical faults.
When a current flows through the stator of the induction machine, a flux is created in the air gap that depends on the design parameters of the motor. This flux induces currents in the rotor bars, which will create their own field. According to [6], when the rotor is in good condition, there is only one field that rotates in the same direction as the rotor, at the slip frequency. However, when there is an asymmetry in the rotor, a current and a field appear in the opposite direction of the normal current. The electromotive force due to the fault current induces a current in the stator, whose frequency is given by $f_{bb} = (1 - 2ks)f$ Hz [7].
A broken rotor bar causes a torque pulsation at twice the slip frequency ($2sf$), together with a speed oscillation that also depends on the inertia of the rotor. The speed oscillation can reduce the magnitude of the lower sideband $(1 - 2s)f$, but it induces an upper sideband current component at $(1 + 2s)f$ in the stator winding, which is reinforced by the nonlinearity of the magnetic core. In summary, a failure due to broken bars or other oscillations induces additional frequencies in the stator current, given by Equation (1):
$f_{bb} = (1 \pm 2ks)f, \quad k = 1, 2, 3, \ldots$ (1)
Equation (1) is what is known as a double slip frequency sideband due to broken rotor bars, which modulates the amplitude and phase of the line current. According to [8], as the oscillations of the load torque are also manifested in the spectrum around the fundamental frequency, the components of Equation (1) would not be useful to diagnose broken bars. For this, it is necessary to use the higher order components given by Equation (2).
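As a small numerical illustration of the double slip frequency sidebands discussed above, the following sketch evaluates $(1 \pm 2ks)f$ for the first values of $k$. It assumes that Equation (1) has this form, which is what the lower and upper sidebands named in the text imply. The function name, the 50 Hz supply frequency and the slip value are hypothetical choices made only for the example, and the slip is treated as a magnitude.

```python
def broken_bar_sidebands(f_supply, slip, k_max=1):
    """Return [(lower, upper), ...] sideband frequencies (1 -/+ 2ks)f for k = 1..k_max."""
    pairs = []
    for k in range(1, k_max + 1):
        lower = (1 - 2 * k * slip) * f_supply
        upper = (1 + 2 * k * slip) * f_supply
        pairs.append((lower, upper))
    return pairs


# Hypothetical operating point: 50 Hz grid, slip magnitude of 1/150 (about 0.67 %).
for k, (lower, upper) in enumerate(broken_bar_sidebands(50.0, 1 / 150, k_max=2), start=1):
    print(f"k={k}: lower={lower:.2f} Hz, upper={upper:.2f} Hz")
# k=1: lower=49.33 Hz, upper=50.67 Hz
# k=2: lower=48.67 Hz, upper=51.33 Hz
```

With such a small slip, the first pair of sidebands lies less than 1 Hz from the fundamental, which illustrates why these components are hard to separate from it.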
Having analyzed the way in which the components associated with failures are produced, such as broken bars, now the problem is to find the appropriate method for their detection and diagnosis. For the detection and diagnosis of faults in rotating electrical machines, the prevalent approach is based on working in the frequency domain, identifying the frequency components of failure in the vibration spectrum. Although many proposals in this regard can be found in the specialized literature, there are still challenges to overcome, especially in the detection of incipient faults and in operation at low load. Besides, if the slip varies due to change in speed, load, or grid instability, there is no constant spectrum, and signal processing techniques must be applied, such as: time-frequency transforms, filtering techniques to eliminate the components that are not of interest, and algorithms to obtain the spectrum corresponding to a specific speed [9,10].
Several examples can be mentioned of the use of the current signal to detect faults in the electric generator of WTs and in the prime mover coupled to its shaft. In [11], the faulty gears of the gearbox are detected using power spectral density (PSD). In [12], a model is proposed that detects broken gear teeth by applying the fast Fourier transform (FFT), and [13] also detects broken gear teeth, using the FFT prior to the application of a support vector machine (SVM). In [14], PSD is used to extract the characteristics of the signal that, after passing through a model called particle filtering, feed an adaptive neuro-fuzzy inference system (ANFIS) capable of detecting broken teeth in gears. In [15], a multiphase imbalance separation technique (EMIST) is used together with the FFT to detect faults in the inner and outer bearing races of a gearbox, and [16] proposes the detection of roughness in bearings by decomposing the current signal by means of wavelets. In [17], faults of the stator and rotor windings are detected by analyzing the spectrum obtained with the FFT. Through a comparative study, [18] determined that, to detect bearing failures, broken bars, and eccentricity, the wavelet-based methodology is superior to the Welch periodogram, PSD and short-time Fourier transform (STFT), while according to [19], the Hilbert transform (HT) is superior to the Park transform and the Teager energy operator when it comes to detecting generator bearing failures. In summary, once the signal processing techniques have been applied, the components associated with the failures are identified according to equations such as those included in Table 1.
Table 1. Frequency components associated with faults [5].

Columns: Fault; Mathematical Model. First row: Inter-turns short circuit.
From a historical point of view, methodologies based on signal processing techniques were the first to be used regarding the detection and diagnosis of faults in rotating electrical machines. After the appearance of the AI models, it would not take long before they were applied to the diagnosis and detection of faults in rotating induction machines. At present, signal processing techniques constitute a previous step for the application of AI models.
AI has been the subject of a huge number of publications and research papers, which have allowed for the construction of sophisticated equipment to control, monitor and detect the presence of faults, especially using ANN, fuzzy logic and neuro-fuzzy systems [20,21]. As an example of the basic methodology used, we can mention the work published in [22] approximately twenty-two years ago. According to this proposal, the frequency components that are relevant for the diagnosis depend on the slip or speed, so two measurements are necessary: a sampling at high frequencies to determine the slip by means of the first slot harmonic, and another sampling of greater duration at low frequencies to find the components associated with the faults. The inference engine of an expert system compares the spectrum obtained with a database containing the components associated with different types of faults and filters out the part of the spectrum that is not of interest. A simple way to carry out the diagnosis is to compare the values obtained against a previously established threshold. The other alternative is ANFIS with a first-order Sugeno-type inference system. The set of fault components obtained is the input of the membership functions that form the adaptive nodes of the first layer of a feedforward multilayer neural network. The first layer is associated with the membership functions of the linguistic variables (small, medium, and large), while the last layer provides the diagnosis expressed as: no failure, incipient failure, one broken bar, and two broken bars. To diagnose broken bars, the input variables are the components associated with these faults, whilst the negative and positive sequence components are used to detect short circuits between turns.
According to [21], as the ANFIS model only has one output, it can detect only one fault. For this reason, [21] proposes the co-active ANFIS (CANFIS) model capable of detecting several faults at the same time. The previously filtered and demodulated signal is decomposed by wavelet transform to obtain the coefficients that are affected by the faults. These coefficients are the input of the CANFIS model, which by having multiple outputs, can detect several faults at the same time. For [23,24], when fault diagnosis is done applying wavelet transform, it is not necessary to use slip. However, according to [25], as wavelets have the drawback of energy loss and edge distortion, it is preferable to use empirical mode decomposition (EMD) to obtain intrinsic mode functions (IMF). Each IMF represents a frequency band in which a type of failure can be found using Support Vector Machines (SVM), but after the optimization of the parameters by means of a genetic immune algorithm.
Although most of the proposals are based on models that have been previously trained with data containing the faults to be detected, according to the proposal of [26], this requirement would not be necessary. According to [27,28], the diagnosis can be made based only on data from systems such as Supervisory Control and Data Acquisition (SCADA). In [29], the signal is filtered to obtain the root mean square (RMS), kurtosis, skewness, standard deviation and crest factor. These parameters are the input data of a neural network trained with the Bayesian Regularization algorithm; the ANN has a hidden layer with 46 neurons and for the output layer it uses the tan-sigmoid transfer function.
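The statistical parameters listed for [29] can be computed directly from a sampled record. The following is a minimal sketch using NumPy only; the function name and the test signal are hypothetical, the filtering step mentioned in the text is omitted, and the kurtosis is the non-excess definition (a normal distribution gives 3), which may differ from the convention used in [29].

```python
import numpy as np


def time_domain_features(x):
    """RMS, kurtosis, skewness, standard deviation and crest factor of a record."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    std = x.std()                                   # population standard deviation
    rms = np.sqrt(np.mean(x ** 2))
    return {
        "rms": rms,
        "kurtosis": np.mean(centered ** 4) / std ** 4,   # non-excess kurtosis
        "skewness": np.mean(centered ** 3) / std ** 3,
        "std": std,
        "crest_factor": np.max(np.abs(x)) / rms,         # peak value over RMS
    }


# Hypothetical record: 1 s of a 50 Hz sine sampled at 10 kHz.
t = np.arange(10_000) / 10_000
features = time_domain_features(np.sin(2 * np.pi * 50 * t))
print({name: round(float(value), 4) for name, value in features.items()})
```

For a pure sine the crest factor is close to 1.414 and the kurtosis close to 1.5, so departures from these values give a rough indication of impulsive content in the signal.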
According to [30], once the rotor current signal of a doubly fed induction generator (DFIG) has been sampled, the instantaneous rotation frequency of the shaft is obtained and the signal is demodulated using HT to obtain its envelope, but as the speed and shaft rotation frequency are not constant, an angular resampling algorithm is applied to obtain a constant envelope. Once this constant envelope is obtained, its PSD can be obtained to detect faults. In each phase, the amplitudes of the rotation frequencies corresponding to the input shaft, pinion and output shaft are obtained, plus RMS, kurtosis, peak value, and signal-to-noise ratio (SNR). These variables become the input of a deep learning model called a stacked autoencoder. Finally, the characteristics obtained from the learning of the ANN feed an SVM algorithm that classifies the failures. In [13], the frequency components that correspond to the faults are derived as a function of the interaction between the rotor and stator currents. When the gearbox suffers a failure, the vibration produces a torque on the shaft, Equation (3), altering its speed and frequency, which will be reflected in the current spectrum. Thus, when the FFT is applied to rotor and stator signals, characteristic frequencies of gearbox failures can be detected. In total, 16 variables of the rotor and 9 of the stator are extracted, of which 19 are in the frequency domain and 6 are in the time domain (RMS, kurtosis, and peak value of both signals). These variables feed an SVM whose output offers a diagnosis of the failures expressed in probability form, unlike the traditional SVM model that only classifies the failures.
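The envelope demodulation step described for [30] can be illustrated with an FFT-based Hilbert transform. The sketch below is only an illustration under simplifying assumptions: the test signal is a hypothetical 50 Hz carrier with a 5 Hz amplitude modulation at constant speed, so the angular resampling step is not needed and is not shown. The function name is hypothetical; `scipy.signal.hilbert` computes the same analytic signal.

```python
import numpy as np


def envelope(x):
    """Amplitude envelope obtained from the analytic signal (FFT-based Hilbert transform)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    weights = np.zeros(n)
    weights[0] = 1.0                       # keep the DC term
    if n % 2 == 0:
        weights[n // 2] = 1.0              # keep the Nyquist term
        weights[1:n // 2] = 2.0            # double the positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    analytic = np.fft.ifft(np.fft.fft(x) * weights)
    return np.abs(analytic)


fs = 5000                                   # hypothetical sampling rate, Hz
t = np.arange(fs) / fs                      # 1 s record
modulation = 1.0 + 0.2 * np.cos(2 * np.pi * 5 * t)
current = modulation * np.cos(2 * np.pi * 50 * t)

env = envelope(current)
# Spectrum of the envelope (mean removed) reveals the modulating frequency.
env_spectrum = np.abs(np.fft.rfft(env - env.mean()))
freqs = np.fft.rfftfreq(env.size, d=1.0 / fs)
print("dominant envelope frequency:", freqs[np.argmax(env_spectrum)], "Hz")
```

In this idealized case the recovered envelope coincides with the modulating function, and its spectrum shows a single peak at the modulation frequency.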
Another way to visualize the state of a component is by means of its remaining useful life (RUL), as applied in [14] to analyze the state of the bearings of a gearbox. According to this study, whilst gears cause torsional vibrations, bearings cause radial ones. For this reason, in a healthy state the phase current will contain the fundamental component $f$ and sidebands $f_i$ caused by the normal vibrations of the gearbox. However, as the faults will be reflected in the amplitude of the frequency components, if there are bearings with localized faults, new components will appear in the current signal, according to Equation (4). Once the signal has been sampled and before obtaining the PSD, the high frequency noise is eliminated by means of a forward-backward filter. The variable used to predict the state of the gearbox and its RUL is the SNR calculated according to Equation (5). The SNR values are the input of a five-layer ANFIS model, whose inference system is formed by a set of 16 fuzzy rules of the IF-THEN form based on a first-order Sugeno model, with the parameters optimized using an ANN.
According to [31], although rotor imbalance is manifested in the amplitude of the sideband harmonics of the fundamental frequency, the FFT and STFT fail to detect faults when the SNR is low and the speed is not constant. As a single ANN would not be sufficient to cover the entire speed range of the electric generator, the authors propose dividing the interval in which the speed varies into several ranges and using one ANN for each range. The variables used correspond to wind speed, wind direction, pitch angle, rotational speed and power output for one year, all of them obtained from the SCADA. To test the proposal, [31] simulates in Simulink a WT whose current signal is sampled at 5 kHz for 300 s. Applying the FFT to 2 s signal segments yields a spectrum formed by 250 components, which become the input of the ANN. Having trained the model first with the signal in a healthy state and then with the signal containing the fault, the authors state that it is possible to detect the frequency components associated with rotor eccentricity, which according to classical spectral analysis are given by Equation (6).
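The segmentation step described for [31] can be sketched as follows. The text does not state which 250 spectral components are retained, so this example assumes, purely for illustration, that they are the 250 lowest-frequency bins of each 2 s segment. The function name is hypothetical and the test signal is a shortened 10 s record rather than the 300 s simulation.

```python
import numpy as np


def segment_spectra(signal, fs, segment_seconds=2.0, n_components=250):
    """Split a record into fixed-length segments and return their amplitude spectra."""
    signal = np.asarray(signal, dtype=float)
    seg_len = int(round(fs * segment_seconds))
    n_segments = signal.size // seg_len              # incomplete tail is discarded
    segments = signal[:n_segments * seg_len].reshape(n_segments, seg_len)
    # Single-sided amplitude scaling (not exact for the DC bin).
    amplitude = np.abs(np.fft.rfft(segments, axis=1)) * 2.0 / seg_len
    freqs = np.fft.rfftfreq(seg_len, d=1.0 / fs)
    return freqs[:n_components], amplitude[:, :n_components]


fs = 5000                                            # Hz, as in the simulation described
t = np.arange(10 * fs) / fs                          # hypothetical 10 s record
signal = np.cos(2 * np.pi * 50 * t)

freqs, spectra = segment_spectra(signal, fs)
print(spectra.shape)                                 # one row of 250 inputs per segment
print(freqs[1] - freqs[0], "Hz resolution up to", freqs[-1], "Hz")
```

Under this assumption each 2 s segment has a resolution of 0.5 Hz, so 250 components cover the band from 0 to 124.5 Hz around the fundamental.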
In [32], through the equations that relate voltage, current, flux, mutual inductance and torque between stator and rotor, a WT with DFIG is simulated in Matlab, but unlike [31], the faults studied are the one-phase fault and the inter-turns short circuit of the stator, while the model used is fuzzy logic. The current signals from the three phases of the stator feed the fuzzy system, which interprets them as linguistic variables (zero, small, medium, and large). From the database obtained with the measurements, the membership functions and 14 fuzzy rules are built. In another simulation proposed in [33], the current signal from a DFIG and an ANN are used to monitor islanding events of WTs that are part of a wind farm connected to the grid. Other proposals address the detection of broken bars and inter-turns short circuits, using equations from Table 1 [34].
Appl. Sci. 2021, 11, 6942
Compared with the publications on the monitoring, detection, and diagnosis of failures in induction motors using the current signal and AI models, there is less research that deals with WTs. Some proposals are shown in Table 2. The components that have received the most attention are bearings, gearbox, blades and electric generator. It can also be observed that, among the models used, the following predominate: SVM, ANN, fuzzy logic and ANFIS. However, at the time of writing this research, only the works of [15,35,36,37] deal with the use of the current signal based on data from WTs in operation, and none of them applies AI models. By way of summary, it can also be stated that:
• Generally, signal processing and analysis techniques are combined with AI;
• Not just one AI model is used, but a combination of them;
• The component of the WTs that has received the most attention is the gearbox;
• The types of failures most analyzed are broken gears and broken teeth;
• In most cases, the current signal is from a DFIG; and
• The stator current signal, the rotor current signal, or both can be used.
In most of the proposals, the AI model is trained with the signal from the fault-free machine, and a sample of the signal containing some type of failure caused on purpose is used afterwards (so the type of failure is known in advance). In these conditions, diagnosis is relatively easy, but due to the related costs, it would be very difficult to proceed in this way in the wind industry, so only the signal acquired during WT operation is available. However, if we only have the signal from the electric generator of WTs during operation, the diagnosis process is complicated because, assuming that there are few faults or that the faults are in an incipient state, the data are unbalanced. Since generative adversarial networks (GANs) can produce synthetic data that are difficult to distinguish from real data, samples obtained by GANs could be used to compensate for unbalanced datasets.
Considering what has been stated previously, the main objectives of this work are to investigate the detection and diagnosis of faults of the SCIG installed in an operating WT (on which there are very few field studies), using electrical current signal and GANs, which is a methodology little explored thus far. The rest of this investigation is organized as follows: in Section Two, a brief conceptual analysis is made on the use of the current signal and GANs for fault detection and diagnosis in WTs. Section Three includes the methods and materials. In Section Four, the results obtained when applying the proposed methodology are shown and discussed. Conclusions and recommendations are included in Section Five.

Fault Diagnosis by Means of GANs
From a computational point of view, there are several methods to improve the efficiency of AI models trained with an unbalanced dataset [39]. However, according to [40], statistical, regression, clustering and reconstruction models are not efficient when it comes to unbalanced datasets with very few outliers. Non-parametric models require large amounts of data and computational resources, while proximity-based models are affected by the volume and dimensionality of the data. Therefore, to overcome the drawbacks of unbalanced datasets and the lack of information caused by the curse of dimensionality, [40] proposes Artificially Generating Potential Outliers (AGPO), whose main idea is to apply generative adversarial networks (GANs) to detect outliers in unbalanced datasets.
Among the most widely used AI models are the so-called generative models, whose main objective is to learn the exact distribution of the data with which they are trained, so that new data similar to the original can be generated, simulated or predicted. Although the main application of these models has been the treatment of images, they have also been used in the fields of video games, cinema, graphic design, audio analysis and body language. However, the main difficulty has been finding the function that allows modeling the input data, and for this purpose, these models use the Monte Carlo method based on Markov chains, which is computationally very expensive. To overcome this drawback, several proposals, such as Variational Autoencoders (VAEs), have incorporated the use of ANNs capable of obtaining a powerful approximation function through backpropagation. VAEs obtain the probability function using Bayesian statistics and two generative networks. The first ANN generates a probability function and random values on the studied phenomenon. The second ANN performs the discriminator function and provides a model only for the variables labeled conditional on the observed variables [41].
Until a few years ago, VAEs were among the most powerful and popular models used for the unsupervised learning of complicated distributions, but they have the drawback of using Bayesian networks and Markov chains to determine probability distributions. In 2014, Ian Goodfellow [42] proposed GANs, a model composed of two multilayer perceptron (MLP) ANNs. The first ANN plays the role of generator (NNG), since, after obtaining the distribution function of the dataset, it is capable of generating synthetic data very similar to the originals. The second ANN works as a discriminator (NND), because it determines whether a sample is real or generated by the NNG. The two ANNs compete with each other until they find the Nash equilibrium of the non-cooperative game between two players, each trying to minimize its own cost function. When the optimal point has been reached, the synthetic samples are so similar to the real ones that the NND is not able to determine whether a sample is real or fake. All this is achieved without using Markov chains or inference systems, and both ANNs are trained simultaneously with the same dataset, unlike models such as VAE, in which the NNG and NND are trained separately [42].
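The two-player game described above is usually written with the value function $V(D, G) = \mathbb{E}_x[\log D(x)] + \mathbb{E}_z[\log(1 - D(G(z)))]$, which the discriminator maximizes and the generator minimizes. The following sketch evaluates the two costs from discriminator outputs only; it does not train any network, and the function names and numerical values are hypothetical.

```python
import numpy as np


def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Negative value function: small when real samples score near 1 and fakes near 0."""
    d_real = np.clip(d_real, eps, 1.0 - eps)
    d_fake = np.clip(d_fake, eps, 1.0 - eps)
    return -(np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake)))


def generator_loss(d_fake, eps=1e-12):
    """Minimax generator cost: mean of log(1 - D(G(z))), which the generator minimizes."""
    d_fake = np.clip(d_fake, eps, 1.0 - eps)
    return np.mean(np.log(1.0 - d_fake))


# Early training (hypothetical scores): the discriminator separates real from synthetic.
print(discriminator_loss(np.full(8, 0.9), np.full(8, 0.1)))

# At equilibrium the discriminator outputs 0.5 for every sample.
half = np.full(8, 0.5)
print(discriminator_loss(half, half))    # equals 2 * log(2)
print(generator_loss(half))              # equals -log(2)
```

The equilibrium values correspond to the situation described in the text, in which the discriminator can no longer tell real and synthetic samples apart.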
According to [43], the gradient-based minimization technique used in [42] to lower the cost of each player simultaneously can fail to converge, so it is preferable to train the generator to match the expected value of the features on an intermediate layer of the discriminator. This strategy, known as feature matching, is excellent as a semi-supervised learning classifier, since, according to [40], its strength is based on the fact that: "Instead of directly maximizing the output of the discriminator, the new objective requires the generator to generate data that matches the statistics of the real data, where we use the discriminator only to specify the statistics that we think are worth matching." Currently, one of the most recent alternatives to solve the problem of classifying outliers from an unbalanced dataset is to generate synthetic data using semi-supervised and unsupervised models based on GANs [44]. The applications of GANs are no longer limited to image processing, as they are also applied to tabular data. In addition, to improve efficiency, proposals can be found that combine GANs with other models [45], while other investigations even modify the structure of the original GAN proposal. In [40], the Single-Objective Generative Adversarial Active Learning (SO-GAAL) algorithm is proposed to detect outliers from an unbalanced tabular dataset. The SO-GAAL model, which is based on the Generative Adversarial Active Learning (GAAL) proposed by [46], is basically a GAN that performs the classification when the NND separates the real data from the potential synthetic outliers generated by the NNG. However, as the training progresses and the min-max game reaches Nash equilibrium, the potential outliers become too close to the real data and the NND fails to distinguish between real data and outliers, causing the accuracy of SO-GAAL to drop dramatically.
The SO-GAAL model is unable to obtain a distribution function that represents the whole dataset and fails to detect the outliers because it does not stop the training when the outliers provide the necessary information, which is due to the fact that the GAN proposed by [42] has no prior information. To correct this problem, [40] proposes to modify the structure of the GANs, adding multiple NNGs with different objectives (MO-GAAL). The real data are divided into subsets of similar samples and each subset feeds an NNG, which learns to generate outliers similar to the real data. In this way, a set of distributions is obtained that represents the whole dataset, which allows the NND to classify the outliers.
Regarding the detection and diagnosis of faults in rotating electrical machines, based on unbalanced datasets, several approaches are available, such as: [47], which proposes the Adaptive Boosting (AdaBoost) method to detect broken bars, and [48], which proposes a multiclass support vector machine (SVM) based on the one-vs-one strategy to detect broken bars in the case of speeds close to synchronism. In general, it can be said that, when it comes to fault detection and diagnosis, proposals based on an unbalanced dataset combine various statistical and AI models, but there is no standardized methodology. The application of GANs is still limited and there are not many references: [49] uses the current signal and synthetic data generated by GANs to correct the overtraining of a deep neural network used to detect the faults of an induction motor, while, to detect incipient faults in a gearbox, [50] obtains a synthetic dataset through GANs that are added to the original dataset to properly train a Stacked Denoising Autoencoder (SDAE) using the vibration signal. In general, it can be said that no references have been found on the use of the current signal together with GANs for fault detection in WTs, and especially in squirrel cage induction generators (SCIG) of WTs. Furthermore, the mentioned proposals for induction motors have been demonstrated on test benches and using small power electric motors. The GANs could also be applied to other types of signals, in such a way that the results can be compared with other proposals, such as [51].

Materials and Methods
For this study, measurements were made in a wind farm located in the region of Castilla y León, Spain. The wind farm has 33 WTs (NEG Micon brand) that use two-winding SCIGs: one winding of 750 kW that operates when the wind speed exceeds 6.5 m/s and another of 200 kW for wind speeds between 3 and 6.5 m/s (see Table 3). As the WTs must be turned off to install the measurement equipment, permission was only obtained to perform the measurements at four WTs (WT-3, WT-4, WT-16 and WT-25). It must also be noted that the tests were done on the highest power winding, since the wind speed was relatively high at the time the signal was sampled. For the measurements, three Fluke i3000s FLEX-36 current clamps (one for each phase) were connected to the main panel of the WT. The other end of each current clamp was connected to a PicoScope® 4424, which must be connected to a computer on which the software that configures the signal acquisition has previously been installed (see Figure 1). Although the sampling rate generally used to obtain a good resolution in the frequency domain fluctuates between 2 and 5 kHz, in this work a 10 kHz sampling rate was applied, since the frequencies that could be found were unknown. The total measurement time in each WT was approximately 8.5 min and, to reduce the effects of spectrum variation, the sampled signal was recorded every two seconds. The sampling software simultaneously records one file in mat format for each phase. Under these conditions and considering the four WTs, a total of 1006 signal files were obtained for each phase, or 3018 files if the three phases are considered.
The signal records are processed in Matlab to obtain the power spectral density. The difference in magnitude between the fundamental frequency and its sidebands is then calculated, and depending on the magnitude of this difference, the data are labeled as 0 (healthy) or 1 (broken bars) [52]. Through Equation (2), the lateral components of the fifth, seventh, eleventh and thirteenth harmonics are also obtained.
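The labeling rule above can be sketched as follows. This is only an illustration in Python, not the Matlab processing actually used: the PSD is a plain periodogram without windowing, the sideband of the test signals is placed at 49.5 Hz so that it falls exactly on a frequency bin of a 2 s record, and the 45 dB threshold is a hypothetical value, since the threshold taken from [52] is not given in the text. The function names are also hypothetical.

```python
import numpy as np


def sideband_difference_db(signal, fs, f_fund, f_side):
    """Difference in dB between the PSD at the fundamental and at a sideband."""
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    psd = np.abs(np.fft.rfft(signal)) ** 2 / (fs * n)    # plain periodogram
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    p_fund = psd[np.argmin(np.abs(freqs - f_fund))]
    p_side = psd[np.argmin(np.abs(freqs - f_side))]
    return 10.0 * np.log10(p_fund / p_side)


def label_record(difference_db, threshold_db=45.0):
    """Return 1 (broken bars) when the sideband is closer than threshold_db to the fundamental."""
    return int(difference_db < threshold_db)


fs = 10_000                                   # Hz, sampling rate used in the measurements
t = np.arange(2 * fs) / fs                    # one 2 s record
healthy = np.cos(2 * np.pi * 50 * t) + 0.001 * np.cos(2 * np.pi * 49.5 * t)
faulty = np.cos(2 * np.pi * 50 * t) + 0.01 * np.cos(2 * np.pi * 49.5 * t)

for name, record in (("healthy", healthy), ("faulty", faulty)):
    diff = sideband_difference_db(record, fs, 50.0, 49.5)
    print(name, round(float(diff), 1), "dB -> label", label_record(diff))
```

On measured signals the sideband rarely falls on a bin, so a window function and a peak search around the expected frequency would normally be needed.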
According to what was seen in Section Two, the identification of the frequency components associated with the faults must be done at a fixed speed and slip. However, as the wind speed has a stochastic behavior, the spectrum of the generator will vary, and it is necessary to apply a method that allows the analysis in the frequency range at which the faults occur [53]. To solve the inconvenience described, one of the most accepted alternatives is the wavelet transform, since it allows for analyzing the signal in both the time and frequency domain [23]. Using the wavelet transform, a signal can be represented as a sum of small waves or wavelets throughout the time domain, which is known as a continuous wavelet transform (CWT). Each wavelet is a wavelet function that represents the original signal but scaled and shifted. However, as CWT involves too many calculations, another alternative is to apply the discrete wavelet transform (DWT), which can be seen as a downsampling process to decompose a signal into two sequences called cA1 (approximation coefficients) and cD1 (detail coefficients). cA1 corresponds to the lower frequency range, while cD1 consists of the high-frequency noise of the original signal. If we decompose cA1, a second level of decomposition formed by cA2 and cD2 will be obtained. Decomposing cA2, we will obtain cA3 with cD3, and so on until the frequency level we are trying to analyze is reached [54].
Applying Equation (1) for broken bars, a frequency component of 49.33 Hz is obtained. Then, following the methodology proposed in [55], the signal is decomposed using the discrete wavelet transform (DWT) in 8 levels (see Figure 2). Since the 49.33 Hz frequency is contained within the frequency range of level d8 (38-78.16 Hz), the signal power is obtained from level d8. In addition, the maximum and minimum values of the signal power are obtained, as well as the median, mean, mode, standard deviation, and variance. With all these data obtained in Matlab, a CSV file is created, which becomes the dataframe used in Python and TensorFlow. The data are scaled to the range 0 to 1 to improve the behavior of the neural networks.
Figure 2. Signal decomposition using DWT [55,56].
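The feature extraction described above can be sketched in Python as follows. Several assumptions are made for the sake of a self-contained example: the mother wavelet is not named in the text, so an orthonormal Haar wavelet is used; the statistics are computed on the squared d8 coefficients as a stand-in for the signal power; the mode is omitted because it is not well defined for continuous values; and the records are random numbers rather than measured currents. All function names are hypothetical. With PyWavelets, `pywt.wavedec(x, wavelet, level=8)` would provide the same kind of decomposition for other wavelets.

```python
import numpy as np


def haar_dwt(signal, levels):
    """Multilevel orthonormal Haar DWT: returns ([cD1, ..., cD_levels], cA_levels)."""
    approx = np.asarray(signal, dtype=float)
    details = []
    for _ in range(levels):
        if approx.size % 2:                       # drop the last sample if the length is odd
            approx = approx[:-1]
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))   # detail coefficients of this level
        approx = (even + odd) / np.sqrt(2.0)          # approximation passed to the next level
    return details, approx


def power_features(coeffs):
    """Mean, maximum, minimum, median, standard deviation and variance of the squared coefficients."""
    power = np.asarray(coeffs, dtype=float) ** 2
    return np.array([power.mean(), power.max(), power.min(),
                     np.median(power), power.std(), power.var()])


def min_max_scale(features):
    """Scale each column of a 2-D feature table to the range 0 to 1."""
    features = np.asarray(features, dtype=float)
    lowest = features.min(axis=0)
    span = features.max(axis=0) - lowest
    span = np.where(span == 0, 1.0, span)         # avoid division by zero for constant columns
    return (features - lowest) / span


rng = np.random.default_rng(0)
records = rng.normal(size=(4, 4096))              # four hypothetical records
table = np.array([power_features(haar_dwt(r, 8)[0][7]) for r in records])   # index 7 is d8
scaled = min_max_scale(table)
print(scaled.round(3))
```

Each row of the scaled table plays the role of one line of the CSV file mentioned in the text.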
In Python, Kmeans is first applied to obtain three clusters, so that the dataset is separated into healthy and faulty samples. Once the faulty samples have been identified, and because they are relatively few, GANs are used to generate synthetic faulty samples that compensate for the unbalanced dataset. The synthetic data obtained in this way are combined with the original samples to retrain an ANN (see Figure 3).
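The clustering step can be illustrated with a from-scratch k-means (Lloyd's algorithm) on toy data. This is a sketch under assumptions, not the authors' implementation: the source does not say which Kmeans implementation was used, the farthest-point initialization is chosen here only to make the example deterministic, and the two-dimensional points are hypothetical scaled features.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_point_init(points, k):
    """Deterministic start: first point, then repeatedly the point farthest from all chosen centroids."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until labels stop changing."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

# Hypothetical scaled features: two dense groups and a small group of outliers.
points = [
    (0.10, 0.10), (0.12, 0.11), (0.11, 0.09), (0.13, 0.12),
    (0.50, 0.52), (0.52, 0.50), (0.51, 0.49),
    (0.95, 0.90), (0.97, 0.93),
]
labels, centroids = kmeans(points, k=3)
sizes = {c: labels.count(c) for c in set(labels)}
print(labels, sizes)
```

In this toy case the smallest cluster holds the outliers, which is the kind of minority group that would then be passed to the GAN for augmentation.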
Appl. Sci. 2021, 11, x FOR PEER REVIEW 10 of 18
The hyper-parameters of the ANNs are 100 training cycles (epochs) and a batch size of 10 samples. Accuracy is used as the initial results metric. To guarantee the independence of the training and test data, the ANNs are trained using 5-fold cross-validation. The ANN models were implemented with TensorFlow and Keras. The computer used was the same one used to sample the signal, that is, a Toshiba laptop with an Intel Core i3-3120M processor at 2.50 GHz, 8 GB of RAM and an Intel HD Graphics 4000 graphics card.
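To make the 5-fold cross-validation and the batch size concrete, the following sketch splits sample indices into five disjoint test folds and counts how many batches of 10 fit in each training set. The helper names and the sample count of 23 are hypothetical, and no shuffling is done here, although in practice the samples would usually be shuffled first.

```python
import math

def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; fold sizes differ by at most one sample."""
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

def batches_per_epoch(n_train, batch_size):
    """Number of weight updates per epoch; the last batch may be smaller."""
    return math.ceil(n_train / batch_size)

# Hypothetical dataset of 23 samples, 5 folds, batch size 10 as in the text.
folds = list(k_fold_indices(23, 5))
for train_idx, test_idx in folds:
    print(len(train_idx), len(test_idx), batches_per_epoch(len(train_idx), 10))
```

Every sample appears in exactly one test fold, so each prediction used for evaluation comes from a model that never saw that sample during training.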
To compare the results obtained with the proposed procedure, the proposal of [40] is applied. In addition to generating synthetic data using GANs to compensate for unbalanced tabular data, [40] proposes using several generating neural networks (GNNs) instead of the single GNN used in the original GAN proposal. The research code from [40] is available in a GitHub repository, but since it is written for earlier versions of Python and TensorFlow, it is necessary to create a separate virtual environment in Anaconda PowerShell.

Results and Discussion
Applying some basic AI models, it can be observed that, for all the models, the convergence is excellent, the RMSE is very small, and the accuracy is very high (see Table 4 and Figure 4). The same does not happen with the Precision, Recall and F1 metrics, since the uniformity of the data and the very small number of outliers make the outliers difficult to identify. Applying Kmeans and segmenting the dataset into three clusters (see Figure 5), the uniformity of the data can be appreciated, and the dispersion of the outliers can also be observed in one of the clusters. Outliers reduce the effectiveness of Kmeans, as can be seen in the confusion matrix in Figure 6. The samples showing signs of failure are listed in Table 5, where 78% of the failures indicated correspond to WT-3 and WT-4. The small number of samples with failures makes the dataset available to train the AI model unbalanced, which is also the likely reason why the metrics of the first Kmeans model need to be improved.
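The gap between a high accuracy and poor Precision, Recall and F1 values on an unbalanced dataset can be reproduced with a small worked example. The labels below are hypothetical (95 healthy samples, 5 faulty ones) and are not taken from the paper's tables; the function name is also an illustration only.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 computed from raw label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical unbalanced case: 95 healthy (0) and 5 faulty (1) samples.
y_true = [0] * 95 + [1] * 5
# The classifier raises one false alarm and finds only one of the five faults.
y_pred = [0] * 94 + [1] + [1] + [0] * 4
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Here the accuracy is 0.95 even though only one fault in five is detected (recall 0.2), which is why accuracy alone is a weak indicator when faulty samples are scarce.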
The low number of outliers (incomplete dataset) reduces the efficiency of the AI models. So, to improve the quality of the prediction, we apply the strategy of generating synthetic data using GANs. First, we tested the methodology proposed by [40]. When the SCIG signal is processed using the algorithm that generates synthetic samples using GANs with only one GNN, the precision in the detection of outliers is as shown in Figure 7a and the area under the curve (AUC) is 0.52. When the GAN model with multiple GNNs is used, which according to [40] is superior to models such as kNN, FastABOD, Parzen and k-means, the precision in the detection of outliers improves considerably (see Figure 7b) and the AUC increases to 0.84.
As can be seen in Figure 7, the ROC curves are very unstable and the predictions fall below the non-discrimination line, which means that the model has difficulty converging and is ineffective at predicting failures. When only one GNN is used, the instability of the model persists even after 17,000 iterations (see Figure 7a), whereas with several GNNs the model stabilizes after 7000 iterations (see Figure 7b), and the sensitivity also improves markedly.
Applying the methodology proposed in this research, as described in the fourth section, to the samples showing signs of failure (see Table 5), we train the GANs to generate 100 synthetic samples, which are added to the original data to retrain a neural network. Proceeding in this way, the receiver operating characteristic (ROC) curve (see Figure 8) and an AUC value of 0.95 are obtained. Compared with the ROC curve obtained by applying the proposal of [40] (see Figure 7), the ROC curve in Figure 8 is much smoother and indicates better convergence of the models used in this research. The AUC is also higher (0.95 versus 0.84).
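For readers less familiar with the AUC, the sketch below computes it directly from its rank interpretation: the probability that a randomly chosen faulty sample receives a higher score than a randomly chosen healthy one, with ties counted as one half. The labels and scores are hypothetical and unrelated to the values reported for Figures 7 and 8.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical scores: 4 healthy samples (0) and 2 faulty samples (1).
y_true = [0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.8, 0.7, 0.9]
auc = roc_auc(y_true, scores)
# A classifier that gives every sample the same score sits on the non-discrimination line.
auc_constant = roc_auc(y_true, [0.5] * 6)
print(auc, auc_constant)
```

An AUC of 0.5 corresponds to the non-discrimination line, so a value near 0.52 indicates a model barely better than chance, while values approaching 1 indicate good separation of faulty and healthy samples.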
Another way to visualize the efficiency of the proposed model is through the confusion matrix (see Figure 9). From Figure 9 it can be seen that the proposed model is capable of appropriately classifying the synthetic data and in this way improves the accuracy of the prediction.

Conclusions
Signal processing techniques have represented an important advance in the detection and diagnosis of faults; however, in many cases they are not enough and must be combined with other mathematical models that generally assume some idealizations. Another alternative is artificial intelligence (AI) models, which from a conceptual point of view are characterized by their ability to adapt to uncertainty and to work with incomplete data. These AI models are used individually or together with signal processing techniques; however, when only the current signal of a WT in operation is available, and the signal has not been previously sampled in a healthy state or with some type of fault, detection and diagnosis are complicated. It must also be noted that when failures are incipient and there are few failure records, the efficiency of AI models is reduced.
In the usual procedures, the diagnosis algorithms are trained using data from tests with healthy equipment and with equipment in which failures have been provoked. However, these test conditions may not extrapolate to real operating conditions, and the performance of the classification algorithms decreases when they work with unbalanced datasets.
To improve the efficiency of AI models and wind turbine fault diagnosis procedures in cases of unbalanced data, this research proposes generating synthetic data using GANs. The methodology has shown its effectiveness for the early detection of failures due to broken bars in SCIGs, and it also improves the metrics of the AI model used.
Although this study has focused on broken bars, the proposed model could be applied to detect other faults. At the output of the proposed model, another model based on ANN or fuzzy logic could be added to obtain a more precise diagnosis of the failure studied (half section broken bar, one broken bar, two broken bars, many broken bars). It would also be advisable to continue the research by building GANs not only with MLP but also with other AI models, and by using GANs not to generate a synthetic dataset but to carry out the diagnosis exclusively through GANs. In this study, tabular data have been used; however, the proposed methodology could be tested with images. In addition, other types of signals could also be used, such as vibration, acoustic and thermal signals.
Acknowledgments: The authors would like to thank the University of Valladolid and University of Guayaquil for the assistance in the preparation of this research. We would also like to thank the company CETASA for allowing the acquisition of the signals and providing the necessary equipment. Thanks also to the anonymous reviewer for their assistance in the improvement of the study.