Article

A Data-Mining Approach for Wind Turbine Fault Detection Based on SCADA Data Analysis Using Artificial Neural Networks

1 DEIM School of Engineering, University of Tuscia, 01100 Viterbo, Italy
2 Department of Enterprise Engineering, University of Rome Tor Vergata, 00133 Rome, Italy
* Author to whom correspondence should be addressed.
Energies 2021, 14(7), 1845; https://doi.org/10.3390/en14071845
Submission received: 1 February 2021 / Revised: 13 February 2021 / Accepted: 20 February 2021 / Published: 26 March 2021
(This article belongs to the Special Issue Future Maintenance Management in Renewable Energies)

Abstract

Wind energy has shown significant growth in terms of installed power in the last decade. However, one of the most critical problems for a wind farm is represented by Operation and Maintenance (O&M) costs, which can represent 20–30% of the total costs related to power generation. Various monitoring methodologies targeted to the identification of faults, such as vibration analysis or analysis of oils, are often used. However, they have the main disadvantage of involving additional costs as they usually entail the installation of other sensors to provide real-time control of the system. In this paper, we propose a methodology based on machine learning techniques using data from SCADA systems (Supervisory Control and Data Acquisition). Since these systems are generally already implemented on most wind turbines, they provide a large amount of data without requiring extra sensors. In particular, we developed models using Artificial Neural Networks (ANN) to characterize the behavior of some of the main components of the wind turbine, such as gearbox and generator, and predict operating anomalies. The proposed method is tested on real wind turbines in Italy to verify its effectiveness and applicability, and it was demonstrated to be able to provide significant help for the maintenance of a wind farm.

1. Introduction

The increasingly evident climate change and the need to increase the amount of energy produced from renewable sources, dictated by national and international strategic objectives [1], have led to growing interest in the development of technology that allows the utilization of wind as an energy resource. For instance, in the last decade in Europe, wind has been the source characterized by the most significant growth in terms of installed power [2] and, although this number is already very high today, it still seems destined to increase. At the same time, the development of wind technology has been characterized by a continuous growth in the size of turbines, up to over 10 MW. Such large investments require ever-higher levels of reliability and availability.
Furthermore, the search for profitable wind conditions leads to new investments in remote locations that are usually difficult to reach, such as offshore and high-altitude sites. In these conditions, intervention times are long, so unexpected critical failures can generate, in addition to the standard cost of intervention, very high costs related to non-availability. In fact, one of the biggest problems for a wind farm is represented by O&M (Operation and Maintenance) costs, which can reach 20–30% of the total costs related to power generation [3].
The wind turbine is a complex system made up of numerous components and subcomponents, each of which can incur different failure types that are often difficult to locate and may affect the health of other components.
In recent years, research has paid great attention to specific components such as the gearbox and generator, which are characterized by high replacement and repair costs and longer downtime in case of failure [4].
Long downtime for the wind turbine, in a scenario where the demand for availability is increasingly high, can lead to unacceptable production losses. To prevent such situations, it is important to research methodologies that can identify malfunctions and failures in their initial state, so that the loss of productivity can be minimized.
This paper aims to develop a monitoring system based on the use of data from SCADA systems (“Supervisory Control and Data Acquisition”). In condition monitoring and fault detection for wind turbines, the use and analysis of SCADA data have recently been one of the most investigated methods. These systems are nowadays implemented on all wind turbines and make available a large amount of data, sampled with a relatively high frequency (typically 1 Hz) and recorded with their average values every 10 min [5].
We proceed by identifying, among these data, extended periods where the turbine has been free from failures to develop a model that is representative of its “healthy state”. This model will be taken as a reference during normal operating conditions and used to highlight the presence of abnormal behavior.
In this work, we address all the phases leading to the development of a methodology aimed at identifying failures: data pre-processing, model development and data post-processing. Several levels of wind turbine models are proposed: the turbine as a whole, and components such as the generator and the gearbox, which are among the most critical.
The main tools used are ANN (Artificial Neural Networks) to develop models to describe the natural behavior of the system and its components and control charts to support the prompt identification of malfunctions.
The developed methodology is tested on a real case study represented by a wind farm currently operating in an Italian location.
To conclude the introduction of the paper, the diagram in Figure 1 describes the methodological steps followed in the proposed research.

2. Background

At present, the maintenance policy typically adopted in wind farms is preventive, either scheduled or condition-based according to the development of simple static threshold alerts, with the aim of identifying possible faults in a timely manner to avoid further problems.
Typical scheduled maintenance interventions include, for example, purging the generator bearings, cleaning the slip ring, cleaning the lubrication system filters and inspecting the gearbox with a video endoscope. This type of approach does not, however, guarantee that critical failures are avoided. Besides, a failure of a component, albeit not yet critical, could cause the turbine to work in non-optimal conditions, causing significant efficiency losses that might not be detected by the normal performance monitoring systems.
In recent years, various established monitoring methodologies targeted to the identification of faults have been transferred to the wind turbine sector: vibration analysis [6,7,8,9,10,11,12,13], the analysis of acoustic emissions [14,15], MCSA (Machine Current Signature Analysis) [16,17,18,19,20] and the analysis of oils [21,22] have been applied with interesting results. These techniques have the main disadvantage of involving additional costs, as they entail the installation of additional sensors if real-time control of the system is required [23,24]. Moreover, these methods have proven effective on high-speed rotating machines, but their sensitivity, validity and feasibility still need to be further verified on wind turbines, in which some components are characterized by slow variation speeds and large dynamic loadings [25].
Also, techniques based on image detection, such as thermal image analysis [26], or microscope analysis [27] can find application in wind turbine fault detection [25]. Despite their effectiveness, the images of the failure modes need to be captured, stored and analyzed, and this requires an extra set up as well as advanced data analysis techniques [28].
Another type of data-driven approach, on which the methodology proposed in this article is based, uses the data from SCADA systems. Most MW-scale wind turbines are already equipped with SCADA systems; therefore, one of the main advantages of these methods compared to those previously mentioned is that they do not require extra sensors. They are thus highly cost-effective and are considered to be one of the most efficient solutions for wind turbine condition monitoring [29].
Typical condition-based maintenance through SCADA control systems, which normally support the maintenance function by generating alert signals (for example, when monitored parameters exceed static threshold values), is not always effective because it often does not leave enough time to intervene before critical scenarios arise.
To overcome these limitations, it is therefore necessary to move towards condition-based maintenance developed through more complex models, able to assess the interaction between the operating variables and the boundary conditions and to identify anomalous operating conditions predictively, before significant performance losses are generated.
Physical models, regression-based models, artificial neural networks and other machine learning techniques are widely used [30].
Unlike physical models, which use physical and thermodynamic relations to derive exactly determined output variables, data-driven models use historical data to identify the relationships between the defined input and output variables. Approaches based on physical models therefore require a thorough knowledge of the specific structure of the system and of its behavior in different operating conditions, which is often very difficult to obtain [31,32], making them hard to apply.
In contrast, data-driven models have the advantage of getting good accuracy in modeling without the need for large interaction with the end-user of the instrument [31].
The success of such an approach aimed at identifying failures is determined by the accuracy of the model developed. Several tools, such as K-nearest Neighbors [29,33], clustering algorithms [34], Support Vector Machines [35,36,37,38], both static and dynamic neural networks [39,40,41,42,43,44,45,46,47,48,49], and even deep learning approaches [50,51], have been proven very effective in modeling the relations between the parameters of a wind turbine.
In the methodology proposed in this paper, the principal tools used to model the turbine behavior are ANNs, as these have been shown to be very promising in numerous applications, especially in those where different methods are compared [52,53].
For more information about the different applications and techniques used in condition monitoring and fault detection for wind turbines, refer to Appendix A, which reports a more in-depth analysis of the publications investigated.
Additionally, one of the objectives of this work is also to contribute to the scientific literature by addressing the current absence of methodological support that explains in detail the configuration of the tools and models to be developed as these phases significantly determine the system’s ability to detect operating anomalies. These phases are described in the next section.

3. Methodology

This paper aims to propose a comprehensive methodology to design and apply a clear and effective approach based on the use of ANN and SPC (Statistical Process Control) for the fault detection of wind turbines.
The methodology defines all the steps to follow to create and deploy a fault detection control system, integrating tools from different fields (i.e., supervised and unsupervised machine learning techniques to develop data-driven models and techniques and multiple control charts from statistical process control), and which can be reliably applied to different scenarios.
While only a few of the scientific contributions highlighted in Appendix A present a general approach and only a few researchers have tried to improve their analysis with the support of statistical control charts, as has been done in this work, none of them have integrated all these steps and tested the final applicability of the resulting methodology on a real case study application.
The main steps of the approach here presented are the following:
  • Data acquisition and data pre-processing: the data are acquired, cleaned and prepared to be suitable for subsequent processing;
  • Model processing: the different models of the turbine and its components are developed and configured;
  • Post-processing: the deviations are evaluated using the control chart.
Large databases, generally covering several years of operation of the wind turbine, and information about the maintenance interventions carried out are required. We then proceed by identifying among these data extended periods where the turbine has been free from failures to develop a model that is representative of its state of health (model training phase). These models will be the reference in the testing phase, where, with a new dataset, this time representative of general operating conditions of the turbine (therefore not excluding any possible failure), the system’s ability to identify the presence of anomalies will be validated. Only after this last phase is the model deemed ready to be used on real-time data. Figure 2 shows, respectively, the process used to build the model and the application of the model in the control phase.

3.1. Data Acquisition and Data Pre-Processing

In order to build representative models for the system, it is necessary to have the historical monitoring data of an adequate timeframe for the customization of the models. The width of the time interval will be influenced by the frequency of available data and the typical behavior of the system (i.e., the data used should be representative of all possible conditions of the system examined).
It is also important to have accurate and detailed information on maintenance interventions (both preventive and corrective) performed on the system for the same period.
Finally, any alarm signals generated automatically by the measurement system are considered useful support for model processing.
During this activity, an assessment of the quality and quantity of information available is also made in order to highlight any need for additional information before the start of the model processing phase.

3.1.1. Data Cleaning

During the training of the models, when the relationship that binds inputs and outputs is identified, the presence of outliers is particularly dangerous. Indeed, in the definition of the “healthy behavior” of the system examined, the presence of an anomaly in the data can strongly affect the accuracy of the resulting models, and therefore operations aimed at cleaning the dataset are necessary.
The data cleaning phase can consist of several steps.
Once the potentially relevant variables for the model processing phase have been identified, the following steps should be performed:
  • Removal of samples in which at least one input or output signal is missing;
  • Removal of samples in which the wind turbine output power is zero;
  • Removal of samples where one or more variables are out of the range of normal variation (it is also essential to identify the cause of such an occurrence).
In addition to sensor errors, a good part of the anomalous behaviors can be due to the artificial power reductions to which the wind farms can be subjected. The power limitations may be due to maintenance requirements, but mostly they are due to constraints imposed by the national power grid to overcome dispatching problems. These behaviors are not considered normal, and samples affected by these restrictions have to be removed (information about the power limitation is generally present in SCADA data).
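As a minimal sketch of the preliminary filter described above, the following Python function applies the three removal rules together with the exclusion of power-limited samples. The record layout, the field names (`power_kw`, `power_limited`) and the numeric ranges are illustrative assumptions, not the SCADA channel names or limits used in this work.

```python
# Hypothetical record layout: one dict per 10-min SCADA sample.
# All field names and ranges below are assumptions made for illustration.

def preliminary_filter(samples, required, valid_ranges):
    """Return the samples that survive the preliminary cleaning rules.

    required     -- names of the input/output signals that must be present
    valid_ranges -- {name: (low, high)} normal variation range per signal;
                    every name is assumed to also appear in `required`
    """
    kept = []
    for s in samples:
        # Rule 1: at least one input or output signal is missing.
        if any(s.get(name) is None for name in required):
            continue
        # Rule 2: the wind turbine output power is zero.
        if s["power_kw"] == 0:
            continue
        # Rule 3: one or more variables are outside their normal range.
        if any(not (low <= s[name] <= high)
               for name, (low, high) in valid_ranges.items()):
            continue
        # Power-limitation rule: the sample was recorded under curtailment.
        if s.get("power_limited", False):
            continue
        kept.append(s)
    return kept


samples = [
    {"wind_speed": 8.0, "power_kw": 900.0, "power_limited": False},
    {"wind_speed": None, "power_kw": 500.0, "power_limited": False},   # missing signal
    {"wind_speed": 3.0, "power_kw": 0.0, "power_limited": False},      # zero power
    {"wind_speed": 60.0, "power_kw": 1200.0, "power_limited": False},  # out of range
    {"wind_speed": 10.0, "power_kw": 700.0, "power_limited": True},    # curtailed
]
clean = preliminary_filter(
    samples,
    required=["wind_speed", "power_kw"],
    valid_ranges={"wind_speed": (0.0, 40.0), "power_kw": (0.0, 2100.0)},
)
```

In this toy dataset only the first sample passes all four checks.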
Although a simple preliminary filter is often sufficient to remove most of the outliers, it is suggested to consider using more specific techniques in case the preliminary cleaning phase fails to exclude the presence of all outliers. Thus, the data set for training can be further cleaned up to ensure better accuracy of the model.

3.1.2. Clustering and Mahalanobis Distance

For this purpose, we propose a clustering method that, albeit with some variations, has been applied with excellent results to similar problems [34,45]. The method removes outliers by evaluating the Mahalanobis distance. The Mahalanobis distance is the distance of a point from a distribution, measured in terms of standard deviations from the mean; it takes into account the correlation in the data, since it is calculated using the inverse of the variance-covariance matrix of the data set of interest [54].
To improve the identification of outliers, it is useful to divide the dataset into smaller groups [34]. The characteristic curve of a turbine reported in Figure 3 shows how the turbine behaves differently in the different regions of the power curve. The criterion proposed by [43,45] recommends dividing the parameters considered in clustering into intervals where the turbine behavior changes. A simple method of dividing observations into a given number of groups is K-means clustering.
After subdividing the samples, to determine the outliers, we proceed to calculate the Mahalanobis distance for each observation with the following formula [54]:
$$MD_i = \sqrt{(x_i - C_i^n)\,\mathrm{cov}(x)^{-1}\,(x_i - C_i^n)^T},$$
where $MD_i$ represents the Mahalanobis distance of the i-th sample $x_i$, $C_i^n$ represents the coordinates of the center of the n-th cluster (with $n = 1, \dots, k$) corresponding to the sample $x_i$, while $\mathrm{cov}(x)$ is the covariance matrix of $x$.
To determine the outliers, we can use a simple method proposed by [34]. This method consists of setting a threshold value δ of the distance $MD$ so that about 10–15% of the points are considered anomalous.
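The procedure above can be sketched with NumPy as follows. This is an illustration under stated assumptions rather than the implementation used in this work: K-means is written as a plain Lloyd iteration on unscaled data, the covariance matrix is computed on the whole dataset (the text does not say whether it is global or per cluster), the threshold δ is taken as an empirical quantile of the distances, and the (wind speed, power) data are synthetic.

```python
import numpy as np


def nearest_center(x, centers):
    """Index of the closest center (Euclidean) for every row of x."""
    dists = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(x, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        labels = nearest_center(x, centers)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # leave an empty cluster's center unchanged
                centers[j] = members.mean(axis=0)
    return nearest_center(x, centers), centers


def mahalanobis_to_centers(x, labels, centers):
    """Distance of each sample from the center of its own cluster."""
    # Assumption: cov(x) is the covariance of the whole dataset (needs >= 2 columns).
    cov_inv = np.linalg.pinv(np.cov(x, rowvar=False))
    diff = x - centers[labels]
    squared = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.maximum(squared, 0.0))  # guard against round-off below zero


def flag_outliers(md, fraction=0.10):
    """Choose delta so that roughly `fraction` of the points exceed it."""
    delta = np.quantile(md, 1.0 - fraction)
    return md > delta, delta


# Synthetic (wind speed, power) samples, for illustration only.
rng = np.random.default_rng(1)
wind = rng.uniform(3.0, 20.0, size=200)
power = np.clip(wind, None, 12.0) ** 3 + rng.normal(0.0, 20.0, size=200)
x = np.column_stack([wind, power])

labels, centers = kmeans(x, k=4)
md = mahalanobis_to_centers(x, labels, centers)
mask, delta = flag_outliers(md, fraction=0.10)
x_clean = x[~mask]  # training samples kept after outlier removal
```

In practice the variables would usually be standardized before clustering, since on raw data the variable with the largest scale dominates the Euclidean distance used by K-means.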

3.2. Model Processing

In the model processing phase, an Artificial Neural Network is defined.
Neural networks are a powerful modeling tool. The choice of their use is dictated by the fact that they have been proven to be very capable of modeling the complex non-linear relationships between the characteristic parameters of a wind turbine.
Regardless of the type of architecture chosen, the configuration of models based on neural networks has essential common points: the choice of the number of layers, the number of neurons in each layer and the selection of the algorithm to be used in the training phase.
In general, increasing the number of layers and using a more significant number of neurons within them increases the capacity of the neural network; however, it requires greater computational complexity and increases the probability of overfitting [55].
Overfitting or overtraining is one of the most typical problems encountered in the creation of models based on neural networks: the network performs well during the training phase but fails to replicate as good results when it is working with new data.
There are no rules that allow choosing which architecture is the best, nor the number of layers, nor the number of neurons that compose them. The best method, albeit expensive in computational terms, is to rely on a trial and error procedure: different configurations are tested, and the one that gives the best results is selected [55].
Since the space of possible configurations is effectively unbounded, it is impossible to test them all to find the best solution. It is therefore essential to know which configurations are most used for the application considered, and the bibliographic research carried out provides fundamental support in this phase.
For wind turbine applications, feed-forward neural networks (FFNN) are by far the most used. There are also numerous applications of recursive networks, which in several cases have shown better performance. As regards the configuration of the structure, the most widely adopted one has a single hidden layer with a number of neurons varying between 10 and 30, which, despite being a very simple configuration, has shown satisfactory results. In all the applications reviewed, the Marquardt-Levenberg algorithm is used to train the network. It is an evolution of the backpropagation algorithm that has been proven to be faster and more efficient than other standard algorithms for neural networks composed of a few hundred neurons [56].
Our approach was to begin experimenting with feed-forward architectures, both static and dynamic, chosen for their simplicity, and with non-linear auto-regressive networks with exogenous inputs (NARX) as representative of recursive networks, selected for their successful applications [43,45].
In order to perform the training of the model, the available dataset will be separated into three parts:
  • Training set (used to train the model, i.e., to fit the weights and biases of the ANN);
  • Validation set (necessary to overcome the overfitting problem);
  • Test set (final set, never seen by the trained model, used instead to assess its real performance).
A typical division of the dataset is 70%, 15% and 15% for the training, validation and test sets, respectively.
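A minimal sketch of such a three-way division is shown below. The random shuffling and the fixed seed are assumptions made for illustration; the text does not specify how samples are assigned to each set.

```python
import random


def split_dataset(samples, fractions=(0.70, 0.15, 0.15), seed=0):
    """Randomly divide samples into training, validation and test sets."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)  # assumption: random assignment
    n_train = round(fractions[0] * len(samples))
    n_val = round(fractions[1] * len(samples))
    train = [samples[i] for i in indices[:n_train]]
    val = [samples[i] for i in indices[n_train:n_train + n_val]]
    test = [samples[i] for i in indices[n_train + n_val:]]  # remainder
    return train, val, test


train_set, val_set, test_set = split_dataset(list(range(100)))
```

The validation set is the one monitored during training to decide when to stop, while the test set is touched only once, after training, to estimate performance on unseen data.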
The model thus created will then be used in the monitoring and control phase: receiving as inputs the measurements of some variables of the system (both operational and environmental; their choice is to be determined through the study of technical and scientific literature), the model generates the value of the output variable, characterizing the system in its healthy condition.
The output variable generated by the model is then compared with the actual value measured by the measurement system at the same time. The deviation between the two values (actual and estimated by the model) is evaluated statistically using control charts to identify anomalies in place in the system. The preliminary analysis of control charts is necessary to have a first evaluation of the performance of the models in highlighting anomalous behaviors in the past.

Feature Selection

The choice of variables should be made in order to be able to characterize the “healthy” behavior of the system completely. Inputs and outputs will significantly characterize the system’s ability to monitor components and identify faults. The inputs and outputs that can allow adequate visibility of abnormal behaviors are not known a priori and are not easy to determine. A careful analysis of the system is needed, and important skills are required to estimate the mutual influence to which the parameters of a wind turbine are subjected.
For this phase, the support of bibliographic research is essential to understand which of the variables made available by SCADA systems are the best for implementing a turbine monitoring system through its components. The determination of the link between these quantities is generally based on the combined use of data reduction techniques (e.g., Principal Component Analysis) and engineering knowledge.
In Table 1, the different inputs and outputs for the proposed models are presented. Their choice is the result of research in the scientific literature aimed at identifying the possible variables to use.

3.3. Post-Processing

After having processed the model and having evaluated its accuracy, it is necessary to analyze the behavior of the resulting deviations.
The deviations are calculated as:
$$\Delta = \text{actual value} - \text{estimated value}.$$
The deviation between each pair of values (actual value and value estimated by the model) is evaluated statistically through the use of a control chart to identify anomalies in the system. In particular, in this approach, the Shewhart control chart is used.
The deviations are plotted on the chart showing their evolution over time. Two control limits are added to simplify the evaluation of anomalous behaviors [58]:
  • The Upper Control Limit (UCL);
  • The Lower Control Limit (LCL).
These values define the sensitivity of the control chart and are often set as multiples of the standard deviation of the deviations’ distribution, σ. The standard deviation is estimated from the moving range MR, defined as the absolute difference between the i-th deviation and the previous one:
$$MR = |\Delta_i - \Delta_{i-1}|,$$
$$\sigma = \frac{\overline{MR}}{1.128},$$
$$UCL = +3\sigma,$$
$$LCL = -3\sigma,$$
where $\overline{MR}$ is the average of the moving ranges. When the system examined shows healthy behavior (compliant with the model), the deviations on the chart follow a normal distribution with a mean equal to zero. Conversely, the presence of non-random patterns (e.g., points outside the control limits, mixtures or shifts of the average) signals non-conformity with the model and therefore anomalous behavior.
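The control limits above can be computed in a few lines of Python. The sketch assumes that the limits are estimated on deviations from a healthy reference period and then applied to new deviations; the numeric values are invented for illustration.

```python
from statistics import mean


def control_limits(reference_deviations):
    """Return (LCL, UCL) from the average moving range of the deviations.

    Needs at least two deviations; the chart is centered on zero.
    """
    moving_ranges = [
        abs(curr - prev)
        for prev, curr in zip(reference_deviations, reference_deviations[1:])
    ]
    sigma = mean(moving_ranges) / 1.128
    return -3.0 * sigma, 3.0 * sigma


def out_of_control(deviations, lcl, ucl):
    """Indices of the deviations that fall outside the control limits."""
    return [i for i, d in enumerate(deviations) if d < lcl or d > ucl]


# Invented deviations (actual value minus estimated value).
reference = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2, -0.3, 0.1]
lcl, ucl = control_limits(reference)

new_deviations = [0.2, -0.4, 1.5, 0.1, -1.2]
flagged = out_of_control(new_deviations, lcl, ucl)
```

With these numbers the average moving range is about 0.343, so the limits are roughly ±0.91, and the third and fifth new deviations are flagged.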
Once the model has been validated, by ascertaining that the anomalies detected are real faults, the model can be used to enact real-time fault detection.
Figure 4 summarizes the proposed method to perform fault detection in wind turbines.

4. Case Study Application

The proposed methodology is applied to a selection of wind turbines from a wind farm in southern Italy. The turbines are Vestas V90 2 MW models.
The data available are in different formats:
  • SCADA data recorded every 10 min, from 1 January 2015 to 9 January 2018, for a total of 192 sampled variables;
  • Service report, in which for each month from January 2015 to October 2016, the records of the maintenance interventions carried out are collected.
The models were created with historical data and were then applied to the following period to assess the tool. Maintenance information was therefore essential for this assessment.
In the following paragraphs all the considerations that emerged in the different phases are reported—the issues of data pre-processing, the choice of ideal configurations and the selection of the most suitable models. Finally, the capacity of the developed system has been tested for the identification of faults that, in the period investigated, were found in the turbines of the wind farm.
As previously specified, we followed two different approaches—the first, at a higher level, based on the turbine monitoring as a whole entity via the power output, and then the development of more specific models for the major components, in particular for the generator and the gearbox. The input and output variables of the developed models, determined thanks to the literature review, are shown in Table 1.
In Table 2, it is possible to observe a list with the most critical faults concerning the components on which we have concentrated. Below, for each component, we will see how the system reacts.

4.1. Data Pre-Processing

The first general filtering operation is aimed at avoiding the presence of anomalous points that can have a negative impact on the accuracy of the model. The following categories of data have been excluded:
  • Instances in which the output power is zero;
  • Instances in which at least one of the measures of the relevant variables is missing;
  • Instances in which the turbine is working under a regime of limited power.
Figure 5a shows the initial state of the data for one of the turbines, where the presence of numerous outliers is obvious. Already in this first phase, one of the characteristics that will have a great impact on our system emerges: about 50% of the data are affected by a limited power regime.
The power limitations are mainly due to constraints imposed by the national power grid to overcome dispatching problems and, since these behaviors are not considered normal, the samples affected by these restrictions have to be removed.
An example of applying the first general filter is shown in Figure 5b.
For the selection of an appropriate set of data to be used for the training of the different models, the maintenance records were analyzed. Through the service reports the history of the individual turbines in the wind farm can be reconstructed in terms of the failures that occurred. For the different models, a part of the dataset is manually selected where the turbine has been free from faults that may have had repercussions on the monitored variables. There are no general rules for establishing the ideal size of the dataset to be used in training, but it should contain all the natural variability of the quantities used as inputs and outputs. In this regard, where possible, an annual interval of operation for the wind turbine is used.
To increase the performance of the models, the training set is subjected to a second filtering operation: data outside the normal operating range of the turbine are removed with a clustering method that uses the K-means algorithm to divide the data into subgroups and then the Mahalanobis distance to detect and delete abnormal points.
Figure 5c shows an example of outlier removal using this method. The number of clusters was set to 12, while the threshold distance was chosen so that 5% of the points are considered anomalies.
Finally, Figure 5d shows the training dataset clean and ready to be used in training the model.
Figure 6 shows a graphic representation of the output variables chosen for the four models developed for one of the turbines analyzed during the training phase.

4.2. Model Processing

To develop the models, both static and dynamic FFNN and recursive networks (NARX) have been considered.
As there are no general rules for the optimal configuration to be used, the number of layers and the number of neurons inside them are determined following an experimental campaign, widely varying these two characteristics for the different types of networks tested. Although increasing the size of the network generally results in better performance, the results obtained show how using neural networks with more than two hidden layers and with a number of neurons greater than 30 does not lead to a substantial improvement of the models.
To facilitate training and avoid overfitting, to which large neural networks are particularly prone, in this application we prefer neural networks with a maximum of one hidden layer and no more than 30 neurons; the exact values are established case by case through an iterative procedure. Furthermore, to improve the network’s ability to generalize, in addition to the standard early stopping methodology, with a division of the train, validation and test sets of 70%, 15% and 15% respectively, the network has been tested on an additional independent test set equal to 20% of the set used in the entire training. The network characterized by the best performance has been selected.
Although the literature shows that dynamic feed-forward networks or recursive networks such as NARX are particularly suitable for this problem [43,45], in our application these tools did not produce the expected results: the best model performance was in fact obtained with the simplest neural networks, the static feed-forward ones. One reason why these regressive approaches did not prove suitable is certainly their sensitivity to “missing” data, in this case caused by the copious, but inevitable, removal of the numerous outliers.
All neural network models were developed using MATLAB.

4.3. Wind Turbine Model

The model used for monitoring the output power is an FFNN that takes as inputs, in addition to wind speed and ambient temperature, the wind direction and the standard deviation of wind speed, which contribute to the performance of the model, albeit to a lesser extent. Figure 7 presents a graphical representation of the FFNN model developed.
An example of the use of this model is reported in Figure 8, where a control chart of Power Output deviations is shown.
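The structure of a static FFNN with one hidden layer and the four inputs listed above can be sketched as a simple forward pass. This is only an illustration in Python (the authors worked in MATLAB): the tanh hidden activation, the linear output, the hidden-layer size of 10, the random untrained weights and the names `init_network` and `predict` are all assumptions of this sketch.

```python
import math
import random


def init_network(n_inputs, n_hidden, seed=0):
    """Create random (untrained) weights for a one-hidden-layer FFNN."""
    rng = random.Random(seed)
    return {
        "w_hidden": [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                     for _ in range(n_hidden)],
        "b_hidden": [0.0] * n_hidden,
        "w_out": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b_out": 0.0,
    }


def predict(network, x):
    """Forward pass: tanh hidden layer followed by a linear output neuron."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(weights, x)) + bias)
        for weights, bias in zip(network["w_hidden"], network["b_hidden"])
    ]
    return sum(w * h for w, h in zip(network["w_out"], hidden)) + network["b_out"]


# Normalized inputs, in the order: wind speed, ambient temperature,
# wind direction, standard deviation of wind speed (illustrative values).
net = init_network(n_inputs=4, n_hidden=10)
print(predict(net, [0.4, 0.1, -0.3, 0.05]))
```

In a monitoring setting, the trained network's output would be compared with the measured power, and the difference (the deviation) would be plotted on the control chart.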
The proposed model has been tested on several turbines and compared with another type of model common in the literature, based on a non-linear regression (see the formula below from [59]):
$$P(v) = \frac{P_{max}}{\left(1 + \left(\frac{b}{v}\right)^{a}\right)^{c}}, \qquad a, b, c > 0,$$
where $v$ represents the wind speed and $P_{max}$ the nominal wind turbine power.
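As a numerical illustration, the reference power curve can be evaluated as below, reading the formula as the nominal power divided by (1 + (b/v)^a)^c, so that the output rises from zero and saturates at the nominal power. The function name `power_curve` and the parameter values are assumptions chosen for the example, not fitted values from the case study.

```python
def power_curve(v, p_max, a, b, c):
    """Non-linear regression power curve: p_max / (1 + (b / v) ** a) ** c.

    v is the wind speed, p_max the nominal power; a, b, c must be positive.
    """
    if min(a, b, c) <= 0:
        raise ValueError("a, b and c must be positive")
    if v <= 0:
        return 0.0  # no production at zero wind speed (limit of the formula)
    return p_max / (1.0 + (b / v) ** a) ** c


# Illustrative parameters: 2000 kW nominal power, a = 3, b = 8 m/s, c = 1.
for speed in (4.0, 8.0, 12.0, 20.0):
    print(speed, round(power_curve(speed, 2000.0, 3.0, 8.0, 1.0), 1))
```

With c = 1, the parameter b is the wind speed at which the curve reaches half of the nominal power, which gives a quick sanity check on fitted values.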
Figure 9 presents a control chart of Power Output deviations using this other reference model.
In Table 3, a comparison between the two models is reported.
The comparison makes it clear that the first one, the ANN model, is able to overcome the issue of seasonality, ensuring a better representation of the behavior of the wind turbine.
Although the output power model performs, in terms of Root-Mean Squared Error (RMSE), Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE), in line with results from the literature, it was not able to identify the occurrence of faults in specific components. Therefore, specific models were created for the critical components: gearbox and generator.
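For reference, the three error metrics named above can be computed as in the following sketch. The function names and the sample values are assumptions made for illustration; they are not data from the case study.

```python
import math


def rmse(measured, predicted):
    """Root-mean squared error."""
    return math.sqrt(
        sum((m - p) ** 2 for m, p in zip(measured, predicted)) / len(measured)
    )


def mae(measured, predicted):
    """Mean absolute error."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


def mape(measured, predicted):
    """Mean absolute percentage error, in percent (measured values must be non-zero)."""
    return 100.0 * sum(
        abs(m - p) / abs(m) for m, p in zip(measured, predicted)
    ) / len(measured)


# Illustrative temperatures in degrees Celsius.
measured = [50.0, 60.0, 70.0]
predicted = [51.0, 59.0, 70.0]
print(rmse(measured, predicted), mae(measured, predicted), mape(measured, predicted))
```

MAPE divides by the measured value, so it is better suited to quantities that stay away from zero, such as bearing temperatures, than to power output near cut-in.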

4.4. Gearbox Model

The ANN model is an FFNN with one hidden layer and 27 neurons (with RMSE = 0.68 °C, MAE = 0.48 °C, MAPE = 0.93% calculated during the training phase). The control chart of the gearbox bearing temperature model is reported in Figure 10.
To assess its application, we refer to the maintenance history of the wind turbine:
  • Gearbox repair from 26 April 2016 to 30 April 2016;
  • IMS bearing replacement from 27 September 2016 to 28 September 2016;
  • Gearbox repair from 8 February 2017 to 11 February 2017.
The control chart shows two points outside the control limits, on 28 February 2016 and 16 March 2016. In addition, from 19 May 2016 the deviations remain out of control for an extended period, with a positive shift of the average, until the replacement of the Intermediate Shaft (IMS) bearings on 27 September 2016. It therefore seems that this maintenance intervention could have been predicted.
After the replacement, the value of the deviations decreases, but there are still points outside the control limits. However, these are not enough to predict the last maintenance action on the gearbox in February 2017.
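A control chart of this kind can be sketched as follows, assuming Shewhart-style limits placed at the mean plus or minus three standard deviations of the deviations observed in a reference (healthy) period. The three-sigma width, the function names and the sample values are assumptions of this example; the text above does not restate how the limits were computed.

```python
import statistics


def control_limits(reference_deviations, n_sigma=3.0):
    """Return (lower, center, upper) limits estimated from a reference period."""
    center = statistics.mean(reference_deviations)
    sigma = statistics.stdev(reference_deviations)  # sample standard deviation
    return center - n_sigma * sigma, center, center + n_sigma * sigma


def out_of_control(deviations, lower, upper):
    """Return the indices of the points falling outside the control limits."""
    return [i for i, d in enumerate(deviations) if d < lower or d > upper]


# Deviations = measured temperature minus model prediction (illustrative values).
reference = [-1.0, 0.0, 1.0, 0.0, -1.0, 1.0, 0.0, 0.0]
lower, center, upper = control_limits(reference)
print(out_of_control([0.5, 3.0, -2.5, 1.0], lower, upper))
```

Isolated points outside the limits, such as the two mentioned above, are weaker evidence than a sustained shift of the average, which is why run-based rules are usually applied alongside the limits.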

4.5. Generator Model

4.5.1. Wind Turbine WT01

The ANN model for the generator bearing temperature is an FFNN with one hidden layer and 28 neurons (with RMSE = 3.80 °C, MAE = 2.11 °C, MAPE = 4.54% calculated during the training phase). The control chart for this application is reported in Figure 11.
To assess its application, we refer to the maintenance history of the wind turbine:
  • Non-Drive End (NDE) and Drive End (DE) bearings replacement from 16 May 2016 to 19 May 2016;
  • Replacement of the generator from 16 August 2016 to 28 August 2016.
The turbine underwent a replacement of the bearings and a replacement of the generator. The latter was one of the most critical scenarios for wind farm maintenance, keeping the turbine stationary for fourteen days. There is no information on alarms or interventions except for the request to replace the bearings, which was issued on 12 May 2016 and carried out from 16 May 2016 to 19 May 2016.
No alarm was detected in the following months, but the turbine was stopped from 16 August 2016 to 28 August 2016, a period during which, following a request submitted on 22 August 2016, the generator was replaced.
The control chart of the deviations of the generator bearing temperature model shows an evident change in variability with numerous points out of control from 5 November 2015, to the replacement of the bearings of 16 May 2016. In particular, the last point out of control dates back to 12 May 2016, when the replacement was ordered.
The proposed monitoring system, therefore, seems to predict the anomaly well in advance. Following the replacement of the bearings, the deviations change significantly, presenting a shift of the average, while the variability seems to have returned to normal. Although the shift of the average can be justified by the use of a different type of bearing characterized by different specifications, the growing trend that stops only after replacing the generator is still anomalous.
To assess how early the system could have predicted the anomaly, the first three weeks after the replacement of the bearings were considered “normal” and used to calculate a new average of the deviations: 100% of the subsequent points lie above this average. According to the Western Electric rules, which consider eight consecutive points on the same side of the center line to be anomalous, the proposed system would have identified the anomaly on 25 June 2016, approximately two months before the generator’s replacement.
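The run rule used here can be sketched as follows: given the deviations and a center line (for instance, the average over the period considered normal), the function returns the index at which a run of eight consecutive points on the same side is first completed. The function name `first_run_signal` and the sample data are assumptions made for illustration.

```python
def first_run_signal(deviations, center, run_length=8):
    """Return the index completing the first run of `run_length` consecutive
    points on the same side of `center`, or None if no such run exists."""
    run = 0
    side = 0  # +1 above the center line, -1 below, 0 exactly on it
    for i, d in enumerate(deviations):
        current = (d > center) - (d < center)
        if current != 0 and current == side:
            run += 1
        else:
            side = current
            run = 1 if current != 0 else 0
        if run >= run_length:
            return i
    return None


# Two points scattered around the center, then a sustained positive shift.
deviations = [0.2, -0.1] + [0.5] * 8
print(first_run_signal(deviations, center=0.0))
```

Mapping the returned index back to its timestamp gives the date on which the rule would have raised the alert.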

4.5.2. Wind Turbine WT02

A second application concerns a different wind turbine.
The ANN models for the generator bearing temperature and the generator slip ring temperature are FFNNs with one hidden layer and, respectively, 24 neurons (with RMSE = 2.46 °C, MAE = 1.28 °C, MAPE = 3.00% calculated during the training phase) and 30 neurons (with RMSE = 0.88 °C, MAE = 0.70 °C, MAPE = 3.14% calculated during the training phase). The control charts for these applications are reported in Figure 12 (generator bearing temperature model) and in Figure 13 (generator slip ring temperature model).
To assess their application, we refer to the maintenance history of the wind turbine:
  • Purging of the exhausted grease channel of the generator bearings on 1 August 2016;
  • NDE and DE bearings replacement from 11 May 2017 to 12 May 2017.
From the analysis of the control charts of the generator bearing temperature deviations (Figure 12), in the first period, several points beyond the upper limit are noted.
The first alarm dates back to 8 June 2016 and was repeated fourteen more times until 1 August 2016, the date on which the clogged grease drain channel was cleaned. In this case, the system showed good forecasting capacity, also anticipating the second alarm of 5 September 2016 concerning the high temperature of the bearings.
From October 2016, there are repeated points beyond the upper limit with very high deviation values. The anomalous behavior appeared to end on 18 February 2017, but the deviations then went out of control several more times until the bearings were replaced on 11 May 2017.
Unfortunately, as of October 2016, maintenance reports were no longer available, preventing more specific considerations regarding the actual detection of anomalies.
However, it is reasonable to assume that there were interventions on the turbine, probably at the times when the deviations drop, since the generator temperatures exceeded 100 °C and certainly generated alarms from the SCADA system. The numerous points out of control in the months preceding the replacement of the bearings seem to be non-random and are potentially signals able to predict the need for intervention.
Although to a much lesser extent, the temperature of the slip ring also shows several points out of control between October 2016 and February 2017, as reported in Figure 13. Moreover, from 12 March 2017 the deviations are stably beyond the control limits, and they fall back within the limits exactly when the bearings are replaced.

5. Discussion

In this experimental application, all the steps of the proposed fault detection methodology have been tested.
In the first steps, data acquisition and pre-processing, some difficulties were encountered because of the numerous outliers. This situation is typical of the operational context of the investigated application. However, despite the large number of outliers, the proposed clustering method, which combines the K-Means algorithm with the Mahalanobis distance, proved quite efficient. Indeed, it yielded an adequate dataset for the subsequent phases, and the procedure has the additional advantage of being easily automatable to support large-scale applications on wind farms.
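The Mahalanobis step of such a filter can be sketched for a single cluster of two-dimensional points, for example standardized (wind speed, power) pairs already assigned to one cluster by K-Means. The clustering itself is not shown, and the threshold of 3, the function names and the synthetic points are assumptions of this example rather than the settings used in the study.

```python
import math


def mahalanobis_2d(points):
    """Mahalanobis distance of each 2-D point from the centroid of `points`."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^2 = [dx dy] * inverse(covariance) * [dx dy]^T for a 2x2 matrix
        d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
        distances.append(math.sqrt(max(d2, 0.0)))
    return distances


def filter_cluster(points, threshold=3.0):
    """Keep only the points whose Mahalanobis distance is within `threshold`."""
    distances = mahalanobis_2d(points)
    return [p for p, d in zip(points, distances) if d <= threshold]


# Synthetic cluster: a 5 x 5 grid of regular points plus one distant point.
cluster = [(float(x), float(y)) for x in range(-2, 3) for y in range(-2, 3)]
cluster.append((10.0, -10.0))
print(len(cluster), len(filter_cluster(cluster)))
```

Because the centroid and covariance are estimated from the same points being screened, a few extreme outliers can inflate the covariance and mask themselves; applying the filter per cluster, as the combination with K-Means suggests, limits this effect.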
Two different monitoring approaches were undertaken. The first, at a higher level, monitors the turbine through its power output. It showed excellent results in terms of model performance but did not prove capable of signaling the presence of anomalies in the turbine, which motivated the development of more specific models for the major components, in particular the generator and the gearbox.
The best results for the detection of faults and operating anomalies were obtained for the generator, where the proposed approach proved applicable to predicting the occurrence of critical failures.
In addition, the system was able to predict minor interventions, such as purging and cleaning of the bearings, and failures in the ventilation system.
The proposed approach has the notable advantage of being tailored to only use SCADA data that are generally present and are already transmitted in real-time, whereas the other relevant predictive techniques cited before require additional measurement systems that cannot be continuously performed.
Besides, the use of data-driven models, as opposed to physical models, makes it possible to achieve good modeling accuracy without extensive knowledge of the specific structure of the system and its behavior in all operating conditions.
The experimental application was carried out successfully in the case study presented, but the fundamental importance of the data acquisition and data collection phases in this approach should be highlighted. Indeed, historical data that are not extensive enough to be representative of the wind turbine’s healthy state would prove detrimental to the application’s success.

6. Conclusions

In this paper, all the phases that lead to the realization of a system aimed at the monitoring and the identification of anomalies of a wind turbine and its main components, such as the generator and the gearbox, have been described. The proposed approach is based on the use of data collected by SCADA acquisition systems. The main tools used for the development of the fault detection methodology are ANN for the development of the models and SPC for the identification and analysis of operating anomalies.
The proposed methodology has the objective to implement a fault detection system for wind turbines on several levels: monitoring the performance of the turbine as a whole, while also monitoring two critical components such as the generator and the gearbox.
The methodological approach has been applied to a real case study regarding two wind turbines to test its effectiveness, and it was successful in identifying abnormal behaviors before the onset of faults.
Thus, the system developed has been proven to be a valuable support to the maintenance of a wind farm, providing additional information to evolve the current maintenance policy based on time-based scheduled inspections and alarms related to the exceedance of static threshold values, which often do not allow sufficient time to prevent critical scenarios.
The proposed method has the possibility of being extended to other major components.
A future development of this approach is its evolution towards fault diagnosis. Beyond identifying abnormal behavior, defining its cause precisely requires associating the faults that occurred with specific patterns on the control charts, thus laying the foundations for the subsequent automation of fault diagnosis. To do so, however, deep cooperation with industrial actors is necessary, since not only is a long-term trial mandatory, but the support of experts and maintenance personnel is also deemed critical to tailor this further step successfully. This aspect would also have interesting consequences for the scheduling optimization of maintenance jobs.
More generally, the increased awareness of the real health of the system generated by such tools would also benefit the highly topical issue of power forecasting for renewable energy sources [60].

Author Contributions

All authors contributed equally to the idea and the design of the methodology proposed; A.S. and D.D. were responsible for the case study application; A.S. and D.D. prepared the original draft; A.S. and V.I. contributed to the review and editing and V.I. was responsible for the project supervision. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

This appendix reports the most relevant contributions examined in the scientific literature review performed in this research. It aims to summarize the different applications and techniques used in condition-monitoring and fault detection for wind turbines. The literature review process was performed by using keywords such as “wind turbines”, “condition monitoring”, “fault detection”, “neural networks”.
Table A1 reports a list of the relevant scientific publications analyzed. For each application, it reports the investigated components, the main tools and methods used, the presence of a real case study in which the proposed tools and techniques are validated, and the type of approach followed.
In particular, for the column “Real Case Study”, we counted as real case studies only those involving wind turbines operating in real conditions (no prototypes, single components or simulation approaches).
Table A1. Summary of relevant applications about fault detection and condition monitoring in the wind turbine.
| Ref. | Components | Tools/Methods | Real Case Study | Approach Type |
|---|---|---|---|---|
| [33] Kusiak et al., 2009 | Wind turbine | k-NN¹ | Yes | SCADA data |
| [39] Zaher et al., 2009 | Gearbox, generator | Multilayer auto regressive FFNN | Yes | SCADA data |
| [40] Schlechtingen and Santos, 2011 | Gearbox, generator | Auto-regressive FFNN, Linear regression | Yes | SCADA data |
| [61] Simani et al., 2011 | Sensors/Actuators | Fuzzy Logic | No | Not specified |
| [41] Kusiak and Verma, 2012 | Gearbox, generator | Multilayer FFNN | Yes | SCADA data |
| [7] Zhang et al., 2012 | Gearbox | Fast Fourier Transformation | No | Vibration Analysis |
| [52] Schlechtingen et al., 2013 | Wind turbine | ANFIS², K-NN¹, FFNN, CCFL³ | Yes | SCADA data |
| [34] Kusiak and Verma, 2013 | Wind turbine | K-means clustering | Yes | SCADA data |
| [22] Feng et al., 2013 | Gearbox | Mathematical model | Yes | SCADA data |
| [22] Feng et al., 2013 | Gearbox | Spectral Analysis | Yes | Vibration signal and oil debris count |
| [8] Liu, 2013 | Tower | Mathematical model | No | Vibration Analysis |
| [17] Ling and Cai, 2013 | Generator | Mathematical model | No | MCSA |
| [57] Schlechtingen and Santos, 2014 | Gearbox, generator | ANFIS² | Yes | SCADA data |
| [42] Zhang and Wang, 2014 | Main bearing | Multilayer FFNN | Yes | SCADA data |
| [43] Karlsson, 2015 | Wind turbine | NARX | Yes | SCADA data |
| [18] Merabet et al., 2015 | Generator | Fuzzy Logic | No | MCSA |
| [19] El Bouchikhi et al., 2015 | Generator | Maximum likelihood estimator | No | MCSA |
| [35] Leahy et al., 2016 | Wind turbine | SVM⁴ | Yes | SCADA data |
| [62] Zhang and Ma, 2016 | Wind turbine | Parallel factor analysis, K-means | Yes | SCADA data |
| [14] Gómez Muñoz and García Márquez, 2016 | Blades | Graphical method | No | Acoustic Emission Analysis |
| [44] Sun et al., 2016 | Gearbox, generator | FFNN², Fuzzy synthetic evaluation | Yes | SCADA data |
| [63] Pozo and Vidal, 2016 | Sensors/Actuators | PCA⁵, Statistical hypothesis testing | No | SCADA data |
| [64] Bi et al., 2017 | Pitch system | Mathematical model | Yes | SCADA data |
| [36] Ouyang et al., 2017 | Wind turbine | SVM⁴ | Yes | SCADA data |
| [65] Wang et al., 2017 | Gearbox | Deep Neural Network | Yes | SCADA data |
| [66] Nazir et al., 2017 | Sensors/Actuators | Mathematical model | No | Not specified |
| [45] Bangalore et al., 2017 | Gearbox | NARX | Yes | SCADA data |
| [59] Marčiukaitis et al., 2017 | Wind turbine | Non-linear regression | Yes | SCADA data |
| [46] Nithya et al., 2017 | Rotor | FFNN² | No | SCADA data |
| [53] Zhao et al., 2017 | Generator | SVM⁴, ANN, K-NN¹, Naive Bayesian | Yes | SCADA data |
| [67] Yu et al., 2018 | Sensors/Actuators | Deep Belief Network | No | Not specified |
| [68] Alvarez and Ribaric, 2018 | Gearbox | Mathematical model | Yes | SCADA data |
| [69] González-González et al., 2018 | Pitch system | Mathematical model | Yes | Not specified |
| [20] Artigao et al., 2018 | Generator | Fast Fourier Transformation | Yes | MCSA |
| [70] Dao et al., 2018 | Wind turbine | Cointegration analysis | Yes | SCADA data |
| [47] Manobel et al., 2018 | Wind turbine | Gaussian Processes, ANN | Yes | SCADA data |
| [48] Bangalore and Patriksson, 2018 | Gearbox | ANN | Yes | SCADA data |
| [37] Vidal et al., 2018 | Sensors/Actuators | SVM⁴ | No | SCADA data |
| [71] Zhao, 2018 | Gearbox, generator | Deep auto-encoder network | Yes | SCADA data |
| [72] Yang et al., 2018 | Wind turbine | Multivariate EWMA⁶ | Yes | SCADA data |
| [73] Wen et al., 2018 | Various components | CNN⁷ | No | SCADA data |
| [49] Wang et al., 2018 | Gearbox, generator | PCA⁵, ANN | Yes | SCADA data |
| [74] Zhang and Lang, 2018 | Bearings | Wavelet energy transmissibility functions | Yes | Vibration analysis |
| [11] Li et al., 2019 | Gearbox bearing | Stochastic resonance | Yes | Vibration analysis |
| [12] Gu and Chen, 2019 | Gearbox bearing | Stochastic resonance | Yes | Vibration analysis |
| [13] Li et al., 2019 | Gearbox bearing | Hidden-Markov model | Yes | Vibration analysis |
| [75] Qian et al., 2019 | Gearbox | HELM⁸ algorithm, cloud computing | Yes | SCADA data |
| [50] Fu et al., 2019 | Gearbox | CNN⁷, LSTM⁹ networks | Yes | SCADA data |
| [76] Lei et al., 2019 | Various components | LSTM⁹ networks | No | Not specified |
| [77] Saari et al., 2019 | Bearings | SVM⁴ | Yes | Vibration analysis |
| [9] Jiang et al., 2019 | Gearbox | Multiscale CNN⁷ | No | Vibration analysis |
| [78] Bakdi et al., 2019 | Wind turbine | PCA⁵, EWMA⁶ | No | Not specified |
| [79] Rizk et al., 2020 | Blades | Hyperspectral imaging technique | No | Image Analysis |
| [80] Dong et al., 2020 | Wind turbine | Mathematical model | Yes | SCADA data |
| [38] Liu et al., 2020 | Generator, converter, pitch system | Convolutional Neural Network, SVM⁴ | Yes | SCADA data |
| [81] Chang et al., 2020 | Gearbox | Concurrent CNN⁷ | No | Vibration analysis |
| [82] Pujol-Vazquez et al., 2020 | Pitch actuator | Mathematical model | No | Not specified |
| [83] Stetco et al., 2020 | Generator | CNN⁷ | No | Data-driven using current signals |
| [84] Zhang and Lang, 2020 | Wind turbine, generator | Dynamic model sensor | Yes | SCADA data |
| [85] Chen et al., 2020 | Generator | Modulation signal bispectrum | Yes | Current signals analysis |
| [86] Yang et al., 2021 | Blades | Deep learning model | Yes | Image Analysis |
| [10] Chen et al., 2021 | Generator bearings | DCGAN¹⁰ | Yes | Data-driven using vibration data |
| [29] Wang and Liu, 2021 | Gearbox, Generator | CMI¹¹, K-NN¹ | Yes | SCADA data |

¹ k-Nearest Neighbors (k-NN); ² Adaptive Neuro-Fuzzy Inference System (ANFIS); ³ Cluster Center Fuzzy Logic (CCFL); ⁴ Support Vector Machine (SVM); ⁵ Principal Component Analysis (PCA); ⁶ Exponentially Weighted Moving Average (EWMA); ⁷ Convolutional Neural Networks (CNN); ⁸ Hierarchical Extreme Learning Machine (HELM); ⁹ Long Short Term Memory (LSTM); ¹⁰ Deep Convolutional Generative Adversarial Networks (DCGAN); ¹¹ Conditional Mutual Information (CMI).

References

  1. Digital Science & Research Solutions, Inc. Renewable Energy Sources and Climate Change Mitigation: Special Report of the Intergovernmental Panel on Climate Change. Choice Rev. Online 2012, 49, 49-6309.
  2. Wind Europe: Wind Energy in Europe in 2019 – Trends and Statistics. Available online: https://windeurope.org/data-and-analysis/product/wind-energy-in-europe-in-2019-trends-and-statistics/ (accessed on 15 October 2020).
  3. Blanco, M.I. The Economics of Wind Energy. Renew. Sustain. Energy Rev. 2009, 13, 1372–1382.
  4. Hahn, B.; Durstewitz, M.; Rohrig, K. Reliability of Wind Turbines. In Wind Energy; Springer: Berlin/Heidelberg, Germany, 2007; pp. 329–333.
  5. Tautz-Weinert, J.; Watson, S.J. Using SCADA Data for Wind Turbine Condition Monitoring – A Review. IET Renew. Power Gener. 2017, 11, 382–394.
  6. Liu, Z.; Zhang, L.; Carrasco, J. Vibration Analysis for Large-Scale Wind Turbine Blade Bearing Fault Detection with an Empirical Wavelet Thresholding Method. Renew. Energy 2020, 146, 99–110.
  7. Zhang, Z.; Verma, A.; Kusiak, A. Fault Analysis and Condition Monitoring of the Wind Turbine Gearbox. IEEE Trans. Energy Convers. 2012, 27, 526–535.
  8. Liu, W.Y. The Vibration Analysis of Wind Turbine Blade–Cabin–Tower Coupling System. Eng. Struct. 2013, 56, 954–957.
  9. Jiang, G.; He, H.; Yan, J.; Xie, P. Multiscale Convolutional Neural Networks for Fault Diagnosis of Wind Turbine Gearbox. IEEE Trans. Ind. Electron. 2019, 66, 3196–3207.
  10. Chen, P.; Li, Y.; Wang, K.; Zuo, M.J.; Heyns, P.S.; Baggeröhr, S. A Threshold Self-Setting Condition Monitoring Scheme for Wind Turbine Generator Bearings Based on Deep Convolutional Generative Adversarial Networks. Measurement 2021, 167, 108234.
  11. Li, J.; Li, M.; Zhang, J.; Jiang, G. Frequency-Shift Multiscale Noise Tuning Stochastic Resonance Method for Fault Diagnosis of Generator Bearing in Wind Turbine. Measurement 2019, 133, 421–432.
  12. Gu, X.; Chen, C. Adaptive Parameter-Matching Method of SR Algorithm for Fault Diagnosis of Wind Turbine Bearing. J. Mech. Sci. Technol. 2019, 33, 1007–1018.
  13. Li, J.; Zhang, X.; Zhou, X.; Lu, L. Reliability Assessment of Wind Turbine Bearing Based on the Degradation-Hidden-Markov Model. Renew. Energy 2019, 132, 1076–1087.
  14. Gómez Muñoz, C.; García Márquez, F. A New Fault Location Approach for Acoustic Emission Techniques in Wind Turbines. Energies 2016, 9, 40.
  15. Caso, E.; Fernandez-del-Rincon, A.; Garcia, P.; Iglesias, M.; Viadero, F. Monitoring of Misalignment in Low Speed Geared Shafts with Acoustic Emission Sensors. Appl. Acoust. 2020, 159, 107092.
  16. Faiz, J.; Moosavi, S.M.M. Eccentricity Fault Detection – From Induction Machines to DFIG – A Review. Renew. Sustain. Energy Rev. 2016, 55, 169–179.
  17. Ling, Y.; Cai, X. Rotor Current Dynamics of Doubly Fed Induction Generators during Grid Voltage Dip and Rise. Int. J. Electr. Power Energy Syst. 2013, 44, 17–24.
  18. Merabet, H.; Bahi, T.; Halem, N. Condition Monitoring and Fault Detection in Wind Turbine Based on DFIG by the Fuzzy Logic. Energy Procedia 2015, 74, 518–528.
  19. El Bouchikhi, E.H.; Choqueuse, V.; Benbouzid, M. Induction Machine Faults Detection Using Stator Current Parametric Spectral Estimation. Mech. Syst. Signal. Process. 2015, 52–53, 447–464.
  20. Artigao, E.; Honrubia-Escribano, A.; Gomez-Lazaro, E. Current Signature Analysis to Monitor DFIG Wind Turbine Generators: A Case Study. Renew. Energy 2018, 116, 5–14.
  21. Hamilton, A.; Quail, F. Detailed State of the Art Review for the Different Online/Inline Oil Analysis Techniques in Context of Wind Turbine Gearboxes. J. Tribol. 2011, 133, 044001.
  22. Feng, Y.; Qiu, Y.; Crabtree, C.J.; Long, H.; Tavner, P.J. Monitoring Wind Turbine Gearboxes. Wind Energ. 2013, 16, 728–740.
  23. Lu, B.; Li, Y.; Wu, X.; Yang, Z. A Review of Recent Advances in Wind Turbine Condition Monitoring and Fault Diagnosis. In Proceedings of the 2009 IEEE Power Electronics and Machines in Wind Applications, Lincoln, NE, USA, 24–26 June 2009; pp. 1–7.
  24. Salameh, J.P.; Cauet, S.; Etien, E.; Sakout, A.; Rambault, L. Gearbox Condition Monitoring in Wind Turbines: A Review. Mech. Syst. Signal. Process. 2018, 111, 251–264.
  25. Liu, Z.; Zhang, L. A Review of Failure Modes, Condition Monitoring and Fault Diagnosis Methods for Large-Scale Wind Turbine Bearings. Measurement 2020, 149, 107002.
  26. Glowacz, A. Fault Diagnosis of Electric Impact Drills Using Thermal Imaging. Measurement 2021, 171, 108815.
  27. Gong, Y.; Fei, J.-L.; Tang, J.; Yang, Z.-G.; Han, Y.-M.; Li, X. Failure Analysis on Abnormal Wear of Roller Bearings in Gearbox for Wind Turbine. Eng. Fail. Anal. 2017, 82, 26–38.
  28. AlShorman, O.; Masadeh, M.; Alkahtani, F.; AlShorman, A. A Review of Condition Monitoring and Fault Diagnosis and Detection of Rotating Machinery Based on Image Aspects. In Proceedings of the 2020 International Conference on Data Analytics for Business and Industry: Way Towards a Sustainable Economy (ICDABI), Sakheer, Bahrain, 26–27 October 2020; pp. 1–5.
  29. Wang, Z.; Liu, C. Wind Turbine Condition Monitoring Based on a Novel Multivariate State Estimation Technique. Measurement 2021, 168, 108388.
  30. Stetco, A.; Dinmohammadi, F.; Zhao, X.; Robu, V.; Flynn, D.; Barnes, M.; Keane, J.; Nenadic, G. Machine Learning Methods for Wind Turbine Condition Monitoring: A Review. Renew. Energy 2019, 133, 620–635.
  31. Benedetti, M.; Cesarotti, V.; Introna, V.; Serranti, J. Energy Consumption Control Automation Using Artificial Neural Networks and Adaptive Algorithms: Proposal of a New Methodology and Case Study. Appl. Energy 2016, 165, 60–71.
  32. Carvalho, T.P.; Soares, F.A.A.M.N.; Vita, R.; Francisco, R.D.P.; Basto, J.P.; Alcalá, S.G.S. A Systematic Literature Review of Machine Learning Methods Applied to Predictive Maintenance. Comput. Ind. Eng. 2019, 137, 106024.
  33. Kusiak, A.; Zheng, H.; Song, Z. Models for Monitoring Wind Farm Power. Renew. Energy 2009, 34, 583–590.
  34. Kusiak, A.; Verma, A. Monitoring Wind Farms With Performance Curves. IEEE Trans. Sustain. Energy 2013, 4, 192–199.
  35. Leahy, K.; Hu, R.L.; Konstantakopoulos, I.C.; Spanos, C.J.; Agogino, A.M. Diagnosing Wind Turbine Faults Using Machine Learning Techniques Applied to Operational Data. In Proceedings of the 2016 IEEE International Conference on Prognostics and Health Management (ICPHM), Ottawa, ON, Canada, 20–22 June 2016; pp. 1–8.
  36. Ouyang, T.; Kusiak, A.; He, Y. Modeling Wind-Turbine Power Curve: A Data Partitioning and Mining Approach. Renew. Energy 2017, 102, 1–8.
  37. Vidal, Y.; Pozo, F.; Tutivén, C. Wind Turbine Multi-Fault Detection and Classification Based on SCADA Data. Energies 2018, 11, 3018.
  38. Liu, Z.; Xiao, C.; Zhang, T.; Zhang, X. Research on Fault Detection for Three Types of Wind Turbine Subsystems Using Machine Learning. Energies 2020, 13, 460.
  39. Zaher, A.; McArthur, S.D.J.; Infield, D.G.; Patel, Y. Online Wind Turbine Fault Detection through Automated SCADA Data Analysis. Wind Energ. 2009, 12, 574–593.
  40. Schlechtingen, M.; Ferreira Santos, I. Comparative Analysis of Neural Network and Regression Based Condition Monitoring Approaches for Wind Turbine Fault Detection. Mech. Syst. Signal. Process. 2011, 25, 1849–1875.
  41. Kusiak, A.; Verma, A. Analyzing Bearing Faults in Wind Turbines: A Data-Mining Approach. Renew. Energy 2012, 48, 110–116.
  42. Zhang, Z.-Y.; Wang, K.-S. Wind Turbine Fault Detection Based on SCADA Data Analysis Using ANN. Adv. Manuf. 2014, 2, 70–78.
  43. Karlsson, D. Wind Turbine Performance Monitoring Using Artificial Neural Networks. Master’s Thesis, Chalmers University of Technology, Göteborg, Sweden, 2015.
  44. Sun, P.; Li, J.; Wang, C.; Lei, X. A Generalized Model for Wind Turbine Anomaly Identification Based on SCADA Data. Appl. Energy 2016, 168, 550–567.
  45. Bangalore, P.; Letzgus, S.; Karlsson, D.; Patriksson, M. An Artificial Neural Network-Based Condition Monitoring Method for Wind Turbines, with Application to the Monitoring of the Gearbox. Wind Energ. 2017, 20, 1421–1438.
  46. Nithya, M.; Nagarajan, S.; Navaseelan, P. Fault Detection of Wind Turbine System Using Neural Networks. In Proceedings of the 2017 IEEE Technological Innovations in ICT for Agriculture and Rural Development (TIAR), Chennai, India, 7–8 April 2017; pp. 103–108.
  47. Manobel, B.; Sehnke, F.; Lazzús, J.A.; Salfate, I.; Felder, M.; Montecinos, S. Wind Turbine Power Curve Modeling Based on Gaussian Processes and Artificial Neural Networks. Renew. Energy 2018, 125, 1015–1020.
  48. Bangalore, P.; Patriksson, M. Analysis of SCADA Data for Early Fault Detection, with Application to the Maintenance Management of Wind Turbines. Renew. Energy 2018, 115, 521–532.
  49. Wang, Y.; Ma, X.; Qian, P. Wind Turbine Fault Detection and Identification Through PCA-Based Optimal Variable Selection. IEEE Trans. Sustain. Energy 2018, 9, 1627–1635.
  50. Fu, J.; Chu, J.; Guo, P.; Chen, Z. Condition Monitoring of Wind Turbine Gearbox Bearing Based on Deep Learning Model. IEEE Access 2019, 7, 57078–57087.
  51. Helbing, G. Deep Learning for Fault Detection in Wind Turbines. Renew. Sustain. Energy Rev. 2018, 98, 189–198.
  52. Schlechtingen, M.; Santos, I.F.; Achiche, S. Using Data-Mining Approaches for Wind Turbine Power Curve Monitoring: A Comparative Study. IEEE Trans. Sustain. Energy 2013, 4, 671–679.
  53. Zhao, Y.; Li, D.; Dong, A.; Kang, D.; Lv, Q.; Shang, L. Fault Prediction and Diagnosis of Wind Turbine Generators Using SCADA Data. Energies 2017, 10, 1210.
  54. De Maesschalck, R.; Jouan-Rimbaud, D.; Massart, D.L. The Mahalanobis Distance. Chemom. Intell. Lab. Syst. 2000, 50, 1–18.
  55. Haykin, S.S. Neural Networks and Learning Machines, 3rd ed.; Prentice Hall: New York, NY, USA, 2009; ISBN 978-0-13-147139-9.
  56. Hagan, M.T.; Menhaj, M.B. Training Feedforward Networks with the Marquardt Algorithm. IEEE Trans. Neural Netw. 1994, 5, 989–993.
  57. Schlechtingen, M.; Santos, I.F. Wind Turbine Condition Monitoring Based on SCADA Data Using Normal Behavior Models. Part 2: Application Examples. Appl. Soft Comput. 2014, 14, 447–460.
  58. Benedetti, M.; Bonfà, F.; Introna, V.; Santolamazza, A.; Ubertini, S. Real Time Energy Performance Control for Industrial Compressed Air Systems: Methodology and Applications. Energies 2019, 12, 3935.
  59. Marčiukaitis, M.; Žutautaitė, I.; Martišauskas, L.; Jokšas, B.; Gecevičius, G.; Sfetsos, A. Non-Linear Regression Model for Wind Turbine Power Curve. Renew. Energy 2017, 113, 732–741.
  60. Wang, J.; Song, Y.; Liu, F.; Hou, R. Analysis and Application of Forecasting Models in Wind Power Integration: A Review of Multi-Step-Ahead Wind Speed Forecasting Models. Renew. Sustain. Energy Rev. 2016, 60, 960–981.
  61. Simani, S.; Castaldi, P.; Tilli, A. Data-Driven Approach for Wind Turbine Actuator and Sensor Fault Detection and Isolation. IFAC Proc. Vol. 2011, 44, 8301–8306.
  62. Zhang, W.; Ma, X. Simultaneous Fault Detection and Sensor Selection for Condition Monitoring of Wind Turbines. Energies 2016, 9, 280.
  63. Pozo, F.; Vidal, Y. Wind Turbine Fault Detection through Principal Component Analysis and Statistical Hypothesis Testing. AST 2016, 101, 45–54.
  64. Bi, R.; Zhou, C.; Hepburn, D.M. Detection and Classification of Faults in Pitch-Regulated Wind Turbine Generators Using Normal Behaviour Models Based on Performance Curves. Renew. Energy 2017, 105, 674–688.
  65. Wang, L.; Zhang, Z.; Long, H.; Xu, J.; Liu, R. Wind Turbine Gearbox Failure Identification With Deep Neural Networks. IEEE Trans. Ind. Inf. 2017, 13, 1360–1368.
  66. Nazir, M.; Khan, A.Q.; Mustafa, G.; Abid, M. Robust Fault Detection for Wind Turbines Using Reference Model-Based Approach. J. King Saud Univ. Eng. Sci. 2017, 29, 244–252.
  67. Yu, D.; Chen, Z.M.; Xiahou, K.S.; Li, M.S.; Ji, T.Y.; Wu, Q.H. A Radically Data-Driven Method for Fault Detection and Diagnosis in Wind Turbines. Int. J. Electr. Power Energy Syst. 2018, 99, 577–584.
  68. Alvarez, E.J.; Ribaric, A.P. An Improved-Accuracy Method for Fatigue Load Analysis of Wind Turbine Gearbox Based on SCADA. Renew. Energy 2018, 115, 391–399.
  69. González-González, A.; Cortadi, A.J.; Galar, D.; Ciani, L. Condition Monitoring of Wind Turbine Pitch Controller: A Maintenance Approach. Measurement 2018, 123, 80–93.
  70. Dao, P.B.; Staszewski, W.J.; Barszcz, T.; Uhl, T. Condition Monitoring and Fault Detection in Wind Turbines Based on Cointegration Analysis of SCADA Data. Renew. Energy 2018, 116, 107–122.
  71. Zhao, H. Anomaly Detection and Fault Analysis of Wind Turbine Components Based on Deep Learning Network. Renew. Energy 2018, 127, 825–834.
  72. Yang, H.-H.; Huang, M.-L.; Lai, C.-M.; Jin, J.-R. An Approach Combining Data Mining and Control Charts-Based Model for Fault Detection in Wind Turbines. Renew. Energy 2018, 115, 808–816.
  73. Wen, L.; Li, X.; Gao, L.; Zhang, Y. A New Convolutional Neural Network-Based Data-Driven Fault Diagnosis Method. IEEE Trans. Ind. Electron. 2018, 65, 5990–5998. [Google Scholar] [CrossRef]
  74. Zhang, L.; Lang, Z.-Q. Wavelet Energy Transmissibility Function and Its Application to Wind Turbine Bearing Condition Monitoring. IEEE Trans. Sustain. Energy 2018, 9, 1833–1843. [Google Scholar] [CrossRef] [Green Version]
  75. Qian, P.; Zhang, D.; Tian, X.; Si, Y.; Li, L. A Novel Wind Turbine Condition Monitoring Method Based on Cloud Computing. Renew. Energy 2019, 135, 390–398. [Google Scholar] [CrossRef]
  76. Lei, J.; Liu, C.; Jiang, D. Fault Diagnosis of Wind Turbine Based on Long Short-Term Memory Networks. Renew. Energy 2019, 133, 422–432. [Google Scholar] [CrossRef]
  77. Saari, J.; Strömbergsson, D.; Lundberg, J.; Thomson, A. Detection and Identification of Windmill Bearing Faults Using a One-Class Support Vector Machine (SVM). Measurement 2019, 137, 287–301. [Google Scholar] [CrossRef]
  78. Bakdi, A.; Kouadri, A.; Mekhilef, S. A Data-Driven Algorithm for Online Detection of Component and System Faults in Modern Wind Turbines at Different Operating Zones. Renew. Sustain. Energy Rev. 2019, 103, 546–555. [Google Scholar] [CrossRef]
  79. Rizk, P.; Al Saleh, N.; Younes, R.; Ilinca, A.; Khoder, J. Hyperspectral Imaging Applied for the Detection of Wind Turbine Blade Damage and Icing. Remote Sens. Appl. Soc. Environ. 2020, 18, 100291. [Google Scholar] [CrossRef]
  80. Dong, X.; Gao, D.; Li, J.; Jincao, Z.; Zheng, K. Blades Icing Identification Model of Wind Turbines Based on SCADA Data. Renew. Energy 2020, 162, 575–586. [Google Scholar] [CrossRef]
  81. Chang, Y.; Chen, J.; Qu, C.; Pan, T. Intelligent Fault Diagnosis of Wind Turbines via a Deep Learning Network Using Parallel Convolution Layers with Multi-Scale Kernels. Renew. Energy 2020, 153, 205–213. [Google Scholar] [CrossRef]
  82. Pujol-Vazquez, G.; Acho, L.; Gibergans-Báguena, J. Fault Detection Algorithm for Wind Turbines’ Pitch Actuator Systems. Energies 2020, 13, 2861. [Google Scholar] [CrossRef]
  83. Stetco, A.; Ramirez, J.M.; Mohammed, A.; Djurović, S.; Nenadic, G.; Keane, J. An End-to-End, Real-Time Solution for Condition Monitoring of Wind Turbine Generators. Energies 2020, 13, 4817. [Google Scholar] [CrossRef]
  84. Zhang, S.; Lang, Z.-Q. SCADA-Data-Based Wind Turbine Fault Detection: A Dynamic Model Sensor Method. Control. Eng. Pract. 2020, 102, 104546. [Google Scholar] [CrossRef]
  85. Chen, X.; Xu, W.; Liu, Y.; Islam, M.R. Bearing Corrosion Failure Diagnosis of Doubly Fed Induction Generator in Wind Turbines Based on Stator Current Analysis. IEEE Trans. Ind. Electron. 2020, 67, 3419–3430. [Google Scholar] [CrossRef] [Green Version]
  86. Yang, X.; Zhang, Y.; Lv, W.; Wang, D. Image Recognition of Wind Turbine Blade Damage Based on a Deep Learning Model with Transfer Learning and an Ensemble Learning Classifier. Renew. Energy 2021, 163, 386–397. [Google Scholar] [CrossRef]
Figure 1. Step-by-step outline of the research proposed in this paper.
Figure 2. The methodology proposed in the paper. (a) Model development starts from the acquisition of historical Supervisory Control and Data Acquisition (SCADA) data from the wind turbines, with the aim of obtaining a trained Artificial Neural Network (ANN) model able to represent the turbine in its “healthy state”. (b) Use of the trained model in the control phase: given the new measurements as inputs, the model predicts the values of the output variable, which are then compared with the actual values recorded by the measurement system at the same time. The deviation between the two values (actual and estimated by the model) is evaluated statistically to assess the health state of the turbine.
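The control phase described in Figure 2b can be illustrated with a minimal Python sketch. It is not the authors' implementation: `toy_model` is a hypothetical stand-in for the trained ANN, the sample values are invented, and the ±46.54 kW threshold is borrowed from the FFNN control limits in Table 3 purely as an example. Flagging single points against a fixed limit is also an assumption; the caption only states that the deviation is evaluated statistically.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Check:
    actual: float
    predicted: float
    deviation: float
    out_of_control: bool


def monitor(
    model: Callable[[Sequence[float]], float],
    inputs: Sequence[Sequence[float]],
    actual: Sequence[float],
    limit: float,
) -> List[Check]:
    """Compare measured outputs with the healthy-state model's predictions."""
    results = []
    for x, y in zip(inputs, actual):
        y_hat = model(x)          # what a healthy turbine is expected to produce
        dev = y - y_hat           # deviation between actual and estimated value
        results.append(Check(y, y_hat, dev, abs(dev) > limit))
    return results


def toy_model(x: Sequence[float]) -> float:
    """Hypothetical stand-in for the trained ANN: power grows linearly with wind speed."""
    wind_speed = x[0]
    return 100.0 * wind_speed


# Invented measurements: (wind speed,) inputs and measured power in kW.
checks = monitor(
    toy_model,
    inputs=[[5.0], [6.0], [7.0]],
    actual=[510.0, 590.0, 820.0],
    limit=46.54,  # illustrative threshold taken from Table 3 (FFNN)
)
flags = [c.out_of_control for c in checks]
```

Only the third sample, whose measured power is 120 kW above the prediction, falls outside the limits in this toy data.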
Figure 3. The relation between the output power of a wind turbine and wind speed.
Figure 4. Proposed fault detection methodology.
Figure 5. (a) SCADA data before filtering operations; (b) Data after removing points subject to power limitation; (c) Application of clustering on the training set; (d) Data cleaned and ready for the training stage (after the elimination of abnormal points identified through the evaluation of the Mahalanobis distance).
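The last cleaning step in Figure 5d, removing abnormal points through the Mahalanobis distance, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: it handles a single group of two-dimensional points (for example wind speed and power), it omits the clustering step of panel (c), and the threshold value and sample data are invented.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def mahalanobis_2d(points: List[Point]) -> List[float]:
    """Mahalanobis distance of each 2-D point from the sample mean."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Sample covariance matrix S = [[sxx, sxy], [sxy, syy]].
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^2 = [dx dy] * inv(S) * [dx dy]^T, using the closed-form 2x2 inverse.
        d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
        distances.append(math.sqrt(max(d2, 0.0)))
    return distances


def drop_outliers(points: List[Point], threshold: float) -> List[Point]:
    """Keep only the points whose Mahalanobis distance does not exceed the threshold."""
    distances = mahalanobis_2d(points)
    return [p for p, d in zip(points, distances) if d <= threshold]


# Invented (wind speed in m/s, power in kW) samples; the last one lies far off the trend.
samples = [
    (5.0, 500.0), (6.0, 600.0), (7.0, 700.0), (8.0, 800.0), (9.0, 900.0),
    (5.5, 560.0), (6.5, 640.0), (7.5, 760.0), (8.5, 840.0), (7.0, 100.0),
]
cleaned = drop_outliers(samples, threshold=2.5)  # hypothetical threshold
```

In practice the threshold would be chosen from the distribution of distances in each cluster; the value 2.5 here only serves the toy data.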
Figure 6. Graphical representation of the chosen output variables for the four models elaborated from one of the turbines analyzed in the time interval used for the training phase. (a) Power Output (kW); (b) Gearbox Bearing Temperature; (c) Generator DE Bearing Temperature; (d) Generator Slip Ring Temperature.
Figure 7. Representation of the feed-forward neural network (FFNN) model developed: the model has 4 neurons in the input layer, 28 neurons in the single hidden layer, and one neuron in the output layer.
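The 4-28-1 layout in Figure 7 can be made concrete with a small forward-pass sketch in plain Python. Only the layer sizes come from the caption; the `tanh` hidden activation, the linear output, the random initial weights, and the class name `FFNN` are assumptions made for illustration, and no training is performed.

```python
import math
import random
from typing import List, Sequence


class FFNN:
    """Feed-forward network with one hidden layer (4 -> 28 -> 1 by default)."""

    def __init__(self, n_in: int = 4, n_hidden: int = 28, n_out: int = 1, seed: int = 0):
        rng = random.Random(seed)
        # Untrained random weights; a real model would be fitted to healthy SCADA data.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x: Sequence[float]) -> List[float]:
        if len(x) != len(self.w1[0]):
            raise ValueError("input size does not match the input layer")
        # Hidden layer: weighted sum followed by tanh (assumed activation).
        hidden = [
            math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(self.w1, self.b1)
        ]
        # Output layer: plain weighted sum (linear output for regression).
        return [
            sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(self.w2, self.b2)
        ]

    def n_parameters(self) -> int:
        """Total number of weights and biases."""
        weights = sum(len(r) for r in self.w1) + sum(len(r) for r in self.w2)
        return weights + len(self.b1) + len(self.b2)


net = FFNN()
# Four inputs, scaled to arbitrary illustrative values.
prediction = net.forward([0.6, 0.2, -0.1, 0.05])
```

With these sizes the network has 4 × 28 + 28 hidden-layer parameters and 28 + 1 output-layer parameters, 169 in total.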
Figure 8. Control chart of Power Output deviations using the ANN model.
Figure 9. Control chart of Power Output deviations using the non-linear regression model.
Figure 10. The control chart of Gearbox Bearing Temperature deviations (WT01).
Figure 11. The control chart of Generator Bearing Temperature deviations (WT01).
Figure 12. The control chart of Generator Bearing Temperature deviations (WT02).
Figure 13. The control chart of Generator Slip Ring Temperature deviations (WT02).
Table 1. Inputs and outputs used for the different models.

| Component | Inputs | Output | Ref. |
|---|---|---|---|
| Wind turbine | Wind Speed; Ambient Temperature; Wind Direction; Wind Speed Standard Deviation | Power Output | [43] |
| Gearbox | Nacelle Temperature; Rotor Speed; Power Output; Ambient Temperature; Gearbox Oil Temperature | Gearbox Bearing Temperature | [40,45] |
| Generator | Nacelle Temperature; Power Output; Generator Speed; Generator Stator Temperature | Generator DE ¹ Bearing Temperature | [40] |
| Generator | Generator Speed; Power Output; Rotor Grid Inverter Temp. (Ph.1); Nacelle Temperature | Generator Slip Ring Temperature | [57] |

¹ Drive End (DE).
Table 2. Main maintenance interventions on gearbox and generator in the investigated period.

| Component | Turbine | Maintenance Work | Start | End |
|---|---|---|---|---|
| Gearbox | WT01 | Gearbox repair | 26 April 2016 | 30 April 2016 |
| Gearbox | WT01 | IMS ¹ bearings replacement | 27 September 2016 | 28 September 2016 |
| Gearbox | WT01 | Gearbox repair | 08 February 2017 | 11 February 2017 |
| Generator | WT01 | NDE ² and DE ³ bearings replacement | 16 May 2016 | 19 May 2016 |
| Generator | WT01 | Generator replacement | 16 August 2016 | 26 August 2016 |
| Generator | WT02 | NDE ² and DE ³ bearings replacement | 11 May 2017 | 12 May 2017 |

¹ Intermediate Shaft (IMS); ² Non-Drive End (NDE); ³ Drive End (DE).
Table 3. Performance parameters comparison for the two models analyzed.

| Model | RMSE ¹ | MAE ² | MAPE ³ | Control Limits |
|---|---|---|---|---|
| FFNN | 36.08 kW | 26.72 kW | 2.78% | ±46.54 kW |
| Non-linear Regression | 65.54 kW | 48.24 kW | 4.67% | ±79.74 kW |

¹ Root-Mean Squared Error (RMSE); ² Mean Absolute Error (MAE); ³ Mean Absolute Percentage Error (MAPE).
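For readers who want to reproduce the error measures reported in Table 3, the standard definitions of RMSE, MAE, and MAPE can be written in a few lines of Python. The sample values below are invented and unrelated to the turbines in the study; rejecting zero actual values in `mape` is an assumption, since the text does not say how such points are handled.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root-mean squared error, in the unit of the data (kW for power)."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error, in the unit of the data."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent."""
    if any(a == 0 for a in actual):
        # Assumption: points with zero actual value must be filtered out beforehand.
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Invented power readings (kW) and model estimates.
measured = [100.0, 200.0, 400.0]
estimated = [110.0, 190.0, 380.0]

metrics = {
    "rmse": rmse(measured, estimated),
    "mae": mae(measured, estimated),
    "mape": mape(measured, estimated),
}
```

RMSE penalizes large deviations more heavily than MAE, which is why the two values in Table 3 differ for the same model.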
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Santolamazza, A.; Dadi, D.; Introna, V. A Data-Mining Approach for Wind Turbine Fault Detection Based on SCADA Data Analysis Using Artificial Neural Networks. Energies 2021, 14, 1845. https://doi.org/10.3390/en14071845
