A Data-Mining Approach for Wind Turbine Fault Detection Based on SCADA Data Analysis Using Artificial Neural Networks

Annalisa Santolamazza; Daniele Dadi; Vito Introna

doi:10.3390/en14071845

¹ DEIM School of Engineering, University of Tuscia, 01100 Viterbo, Italy

² Department of Enterprise Engineering, University of Rome Tor Vergata, 00133 Rome, Italy

* Author to whom correspondence should be addressed.

Energies 2021, 14(7), 1845; https://doi.org/10.3390/en14071845

This article belongs to the Special Issue Future Maintenance Management in Renewable Energies

Abstract

Wind energy has shown significant growth in terms of installed power in the last decade. However, one of the most critical problems for a wind farm is represented by Operation and Maintenance (O&M) costs, which can represent 20–30% of the total costs related to power generation. Various monitoring methodologies targeted to the identification of faults, such as vibration analysis or analysis of oils, are often used. However, they have the main disadvantage of involving additional costs as they usually entail the installation of other sensors to provide real-time control of the system. In this paper, we propose a methodology based on machine learning techniques using data from SCADA systems (Supervisory Control and Data Acquisition). Since these systems are generally already implemented on most wind turbines, they provide a large amount of data without requiring extra sensors. In particular, we developed models using Artificial Neural Networks (ANN) to characterize the behavior of some of the main components of the wind turbine, such as gearbox and generator, and predict operating anomalies. The proposed method is tested on real wind turbines in Italy to verify its effectiveness and applicability, and it was demonstrated to be able to provide significant help for the maintenance of a wind farm.

Keywords: condition monitoring; fault detection; wind turbine; artificial neural networks; predictive maintenance; gearbox; generator

1. Introduction

The increasingly evident climate change and the need to increase the amount of energy produced from renewable sources, dictated by national and international strategic objectives [1], have led to growing interest in the development of technology that allows the utilization of wind as an energy resource. For instance, in the last decade in Europe, wind has been the source characterized by the most significant growth in terms of installed power [2] and, although this number is already very high today, it still seems destined to increase. At the same time, the development of wind technology has been characterized by a continuous growth in the size of turbines, up to over 10 MW. Such large investments require ever-higher levels of reliability and availability.

Furthermore, the search for profitable wind conditions leads to new investments in remote locations that are usually difficult to reach, such as offshore and high-altitude sites. In these conditions, intervention times are long, so the occurrence of unexpected critical failures can generate, in addition to the standard cost of intervention, very high costs related to non-availability. In fact, one of the biggest problems for a wind farm is represented by O&M (Operation and Maintenance) costs, which can reach 20–30% of the total costs related to power generation [3].

The wind turbine is a complex system made up of numerous components and subcomponents, each subject to different failure types that are often difficult to locate and that may affect the health of other components.

In recent years, scientific research has paid great attention to the study of specific components such as the gearbox and generator, characterized by high replacement and repair costs and longer downtime in case of failure [4].

Long downtime for the wind turbine, in a scenario where the demand for availability is increasingly high, can lead to unacceptable production losses. To prevent such situations, it is important to research methodologies that can identify malfunctions and failures in their initial state, so that the impact of the loss of productivity can be minimized.

This paper aims to develop a monitoring system based on the use of data from SCADA systems (“Supervisory Control and Data Acquisition”). In condition monitoring and fault detection for wind turbines, the use and analysis of SCADA data have recently been one of the most investigated methods. These systems are nowadays implemented on all wind turbines and make available a large amount of data, sampled with a relatively high frequency (typically 1 Hz) and recorded with their average values every 10 min [5].

We proceed by identifying, among these data, extended periods where the turbine has been free from failures to develop a model that is representative of its “healthy state”. This model will be taken as a reference during normal operating conditions and used to highlight the presence of abnormal behavior.

In this work, we address all the phases leading to the development of a methodology aimed at identifying failures: data pre-processing, model development and data post-processing. Several levels of wind turbine models are proposed: the turbine as a whole, and components such as the generator and the gearbox, which are among the most critical.

The main tools used are ANN (Artificial Neural Networks) to develop models to describe the natural behavior of the system and its components and control charts to support the prompt identification of malfunctions.

The developed methodology is tested on a real case study represented by a wind farm currently operating in an Italian location.

To conclude the introduction of the paper, the diagram in Figure 1 describes the methodological steps followed in the proposed research.

Figure 1. The figure describes the research proposed in this paper step by step.

2. Background

At present, the maintenance policy typically adopted in wind farms is preventive, either scheduled or condition-based according to the development of simple static threshold alerts, with the aim of identifying possible faults in a timely manner to avoid further problems.

Typical scheduled maintenance interventions include, for example, purging the generator bearings, cleaning the slip ring, cleaning the lubrication system filters and inspecting the gearbox with a video endoscope. This type of approach does not, however, guarantee that critical failures are avoided. Besides, a failure of a component, albeit not yet critical, could cause the turbine to work in non-optimal conditions, causing significant efficiency losses that could go undetected by the normal performance monitoring systems.

In recent years, various established monitoring methodologies targeted at the identification of faults have been transferred to the wind turbine sector: vibration analysis [6,7,8,9,10,11,12,13], the analysis of acoustic emissions [14,15], MCSA (Machine Current Signature Analysis) [16,17,18,19,20] and the analysis of oils [21,22] have all been applied with interesting results. The main disadvantage of these techniques is their additional cost, as they entail the installation of additional sensors if real-time control of the system is required [23,24]. Moreover, these methods have proven effective on high-speed rotating machines, but their sensitivity, validity and feasibility still need to be further verified on wind turbines, in which some components are characterized by slow variation speeds and large dynamic loadings [25].

Also, techniques based on image detection, such as thermal image analysis [26], or microscope analysis [27] can find application in wind turbine fault detection [25]. Despite their effectiveness, the images of the failure modes need to be captured, stored and analyzed, and this requires an extra set up as well as advanced data analysis techniques [28].

Another type of data-driven approach, on which the methodology proposed in this article is based, utilizes the data from SCADA systems. Most MW-scale wind turbines are already equipped with SCADA systems; therefore, one of the main advantages of these methods compared to those previously mentioned is that they do not require extra sensors. They are thus highly cost-effective and are considered to be one of the most efficient solutions for wind turbine condition monitoring [29].

Typical condition-based maintenance through SCADA control systems, which normally support the maintenance function by generating alert signals (for example, when monitored parameters exceed static threshold values), is not always effective because it often does not leave enough time for intervention to prevent critical scenarios.

To overcome these limitations, it is therefore necessary to move towards condition-based maintenance built on more complex models, able to assess the interaction between the operating variables and the boundary conditions and to identify anomalous operating conditions predictively, before significant performance losses are generated.

Physical models, regression-based models, artificial neural networks and other machine learning techniques are widely used [30].

Unlike physical models, which use physical and thermodynamic relations to derive exactly determined output variables, data-driven models use historical data to identify the relationships between the defined input and output variables. Approaches based on physical models therefore require a thorough knowledge of the specific structure of the system and its behavior in different operating conditions, which is often very difficult to obtain [31,32], and so they are not easily feasible.

In contrast, data-driven models have the advantage of achieving good modeling accuracy without the need for extensive interaction with the end-user of the instrument [31].

The success of such an approach aimed at identifying failures is determined by the accuracy of the model developed. Several tools, such as K-nearest Neighbors [29,33], clustering algorithms [34], Support Vector Machines [35,36,37,38], both static and dynamic neural networks [39,40,41,42,43,44,45,46,47,48,49], and even deep learning approaches [50,51], have been proven very effective in modeling the relations between the parameters of a wind turbine.

In the methodology proposed in this paper, the principal tools used to model the turbine behavior are ANNs, as these have been shown to be very promising in numerous applications, especially in those where different methods are compared [52,53].

For more information about the different applications and techniques used in condition monitoring and fault detection for wind turbines, refer to Appendix A, which reports a more in-depth analysis of the publications investigated.

Additionally, one of the objectives of this work is also to contribute to the scientific literature by addressing the current absence of methodological support that explains in detail the configuration of the tools and models to be developed as these phases significantly determine the system’s ability to detect operating anomalies. These phases are described in the next section.

3. Methodology

This paper aims to propose a comprehensive methodology to design and apply a clear and effective approach based on the use of ANN and SPC (Statistical Process Control) for the fault detection of wind turbines.

The methodology defines all the steps to follow to create and deploy a fault detection control system, integrating tools from different fields (i.e., supervised and unsupervised machine learning techniques to develop data-driven models and techniques and multiple control charts from statistical process control), and which can be reliably applied to different scenarios.

While only a few of the scientific contributions highlighted in Appendix A present a general approach and only a few researchers have tried to improve their analysis with the support of statistical control charts, as has been done in this work, none of them have integrated all these steps and tested the final applicability of the resulting methodology on a real case study application.

The main steps of the approach here presented are the following:

Data acquisition and data pre-processing: the data are acquired, cleaned and prepared to be suitable for subsequent processing;
Model processing: the different models of the turbine and its components are developed and configured;
Post-processing: the deviations are evaluated using the control chart.

Large databases, generally covering several years of operation of the wind turbine, and information about the maintenance interventions carried out are required. We then proceed by identifying among these data extended periods where the turbine has been free from failures to develop a model that is representative of its state of health (model training phase). These models will be the reference in the testing phase, where, with a new dataset, this time representative of general operating conditions of the turbine (therefore not excluding any possible failure), the system’s ability to identify the presence of anomalies will be validated. Only after this last phase is the model deemed ready to be used on real-time data. Figure 2 represents, respectively, the process to elaborate the model and the application of the model in the control phase.

Figure 2. The figure describes the methodology proposed in the paper. (a) The elaboration of the model starts from the acquisition of the historical Supervisory Control and Data Acquisition (SCADA) data from the wind turbines with the aim to obtain a trained Artificial Neural Network (ANN) model able to represent the turbine in its “healthy state”. (b) Use of the trained model in the control phase: receiving as inputs the new measurements, the model elaborates the predicted values of the output variable that is then compared with the actual values measured by the measurement system at the same time. The deviation between the two values (actual and estimated by the model) is evaluated statistically to assess the health state of the turbine.

3.1. Data Acquisition and Data Pre-Processing

In order to build representative models for the system, it is necessary to have the historical monitoring data of an adequate timeframe for the customization of the models. The width of the time interval will be influenced by the frequency of available data and the typical behavior of the system (i.e., the data used should be representative of all possible conditions of the system examined).

It is also important to have accurate and detailed information on maintenance interventions (both preventive and corrective) performed on the system for the same period.

Finally, any alarm signals generated automatically by the measurement system are considered useful support for model processing.

During this activity, an assessment of the quality and quantity of information available is also made in order to highlight any need for additional information before the start of the model processing phase.

3.1.1. Data Cleaning

During the training of the models, when the relationship that binds inputs and outputs is identified, the presence of outliers is particularly dangerous. Indeed, in the definition of the “healthy behavior” of the system examined, the presence of an anomaly in the data can strongly affect the accuracy of the resulting models, and therefore operations aimed at cleaning the dataset are necessary.

The data cleaning phase can consist of several steps.

Once the potentially relevant variables for the model processing phase have been identified, the following steps should be performed:

Removal of samples in which at least one input or output signal is missing;
Removal of samples in which the wind turbine output power is zero;
Removal of samples where one or more variables are out of the range of normal variation (it is also essential to identify the cause of such an occurrence).

In addition to sensor errors, a good part of the anomalous behaviors can be due to the artificial power reductions to which the wind farms can be subjected. The power limitations may be due to maintenance requirements, but mostly they are due to constraints imposed by the national power grid to overcome dispatching problems. These behaviors are not considered normal, and samples affected by these restrictions have to be removed (information about the power limitation is generally present in SCADA data).
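The preliminary filters described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the field names (`wind_speed`, `ambient_temp`, `power`, `power_limited`) and the numeric ranges are assumptions chosen for the example, since real SCADA exports use vendor-specific tags and turbine-specific limits.

```python
from math import isnan

# Hypothetical signal names; real SCADA exports use vendor-specific tags.
REQUIRED = ("wind_speed", "ambient_temp", "power")

# Assumed plausible ranges, for illustration only (not values from the paper).
VALID_RANGE = {
    "wind_speed": (0.0, 40.0),      # m/s
    "ambient_temp": (-30.0, 50.0),  # deg C
    "power": (0.0, 2100.0),         # kW
}


def is_missing(value):
    """Return True for None or NaN."""
    return value is None or (isinstance(value, float) and isnan(value))


def keep_sample(sample):
    """Apply the preliminary filters to one 10-min SCADA record (a dict)."""
    # 1. Drop records with any missing input or output signal.
    if any(is_missing(sample.get(name)) for name in REQUIRED):
        return False
    # 2. Drop records where the output power is zero.
    if sample["power"] == 0:
        return False
    # 3. Drop records with any variable outside its normal range.
    for name, (low, high) in VALID_RANGE.items():
        if not low <= sample[name] <= high:
            return False
    # 4. Drop records flagged as power-limited (flag name is an assumption).
    if sample.get("power_limited", False):
        return False
    return True


def clean(samples):
    """Return only the records that pass every preliminary filter."""
    return [s for s in samples if keep_sample(s)]
```

Records rejected by the range check are worth logging separately, since the text stresses that the cause of out-of-range values should also be identified.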

Although a simple preliminary filter is often sufficient to remove most of the outliers, it is suggested to consider using more specific techniques in case the preliminary cleaning phase fails to exclude the presence of all outliers. Thus, the data set for training can be further cleaned up to ensure better accuracy of the model.

3.1.2. Clustering and Mahalanobis Distance

For this purpose, we propose a clustering method that, with some variations, has been applied with excellent results to similar problems [34,45]. The method is based on the removal of outliers using the evaluation of the Mahalanobis distance. The Mahalanobis distance is defined as the distance, measured in terms of standard deviation from the average, of a point from a distribution, and it takes into account the correlation in the data since it is calculated using the inverse of the variance-covariance matrix of the data set of interest [54].

To improve the identification of outliers, it is useful to divide the dataset into smaller groups [34]. Looking at the characteristic curve of a turbine reported in Figure 3, we can see that the turbine behaves differently in the different areas of the power curve. In the criterion proposed by [43,45], it is recommended to divide the parameters considered in clustering into intervals where the turbine behavior changes. A simple method of dividing observations into a given number of groups is to use K-means clustering.

Figure 3. The relation between the output power of a wind turbine and wind speed.

After subdividing the samples, to determine the outliers, we proceed to calculate the Mahalanobis distance for each observation following the subsequent formula [54]:

MD_{i} = \sqrt{(x_{i} - C_{i}^{n}) \, \mathrm{cov}(x)^{-1} \, (x_{i} - C_{i}^{n})^{T}},

(1)

where MD_{i} represents the Mahalanobis distance of the i-th sample x_{i}, C_{i}^{n} represents the coordinates of the center of the n-th cluster (with n = 1 … k) corresponding to the sample x_{i}, while cov(x) is the covariance matrix of x.

To determine the outliers, we can use a simple method proposed by [34]. This method consists of setting a threshold value δ of the distance MD to consider about 10–15% of the points as anomalous.
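As an illustration of Equation (1) and the threshold δ, the sketch below clusters the samples with a plain K-means loop, computes each sample's Mahalanobis distance to its cluster center, and flags the most distant fraction of points. It is a sketch under stated assumptions, not the authors' code: the function names are hypothetical, the covariance is taken over the whole dataset as Equation (1) is written (a per-cluster covariance is a common variant), and δ is set as an empirical quantile of the distances.

```python
import numpy as np


def assign(x, centers):
    """Index of the nearest center (Euclidean distance) for each row of x."""
    dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
    return dist.argmin(axis=1)


def kmeans(x, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct samples chosen at random.
    centers = x[rng.choice(len(x), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        labels = assign(x, centers)
        for j in range(k):
            # Move each center to the mean of its members; keep it if empty.
            if np.any(labels == j):
                centers[j] = x[labels == j].mean(axis=0)
    return assign(x, centers), centers


def mahalanobis_to_center(x, labels, centers):
    """MD_i of Eq. (1): distance of each sample to its own cluster center."""
    inv_cov = np.linalg.inv(np.cov(x, rowvar=False))  # cov(x)^-1, whole dataset
    diff = x - centers[labels]
    # Row-wise quadratic form diff_i * inv_cov * diff_i^T.
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, inv_cov, diff))


def flag_outliers(x, k=3, fraction=0.10):
    """Flag roughly `fraction` of the samples with the largest MD as outliers."""
    labels, centers = kmeans(x, k)
    md = mahalanobis_to_center(x, labels, centers)
    delta = np.quantile(md, 1.0 - fraction)  # threshold on MD
    return md > delta, delta
```

Here `x` is a two-dimensional array with one row per sample and one column per variable (for example wind speed and power). Choosing `fraction` between 0.10 and 0.15 corresponds to the range suggested in the text.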

3.2. Model Processing

In the model processing phase, an Artificial Neural Network is defined.

Neural networks are a powerful modeling tool. The choice of their use is dictated by the fact that they have been proven to be very capable of modeling the complex non-linear relationships between the characteristic parameters of a wind turbine.

Regardless of the type of architecture chosen, the configuration of models based on neural networks has essential common points: the choice of the number of layers, the number of neurons in each layer and the selection of the algorithm to be used in the training phase.

In general, increasing the number of layers and using a more significant number of neurons within them increases the capacity of the neural network; however, it requires greater computational complexity and increases the probability of overfitting [55].

Overfitting or overtraining is one of the most typical problems encountered in the creation of models based on neural networks: the network performs well during the training phase but fails to replicate as good results when it is working with new data.

There are no rules that allow choosing which architecture is the best, nor the number of layers, nor the number of neurons that compose them. The best method, albeit expensive in computational terms, is to rely on a trial and error procedure: different configurations are tested, and the one that gives the best results is selected [55].

Since the space of possible configurations is effectively unbounded, it is impossible to test all of them to find the best solution: it is therefore of fundamental importance to have an idea of which configurations are most used for the application considered. In this phase, the support of the bibliographic research carried out is therefore fundamental.

For wind turbine applications, feed-forward neural networks (FFNN) are by far the most used. There are also numerous applications of recursive networks, which in several cases have shown better performance. As regards the structure, the most widely adopted configurations use a single hidden layer with between 10 and 30 neurons, which, despite being very simple, has shown satisfactory results. In all the applications reviewed, the Levenberg-Marquardt algorithm is used to train the network. It is an evolution of the backpropagation algorithm that has been proven to be faster and more efficient than other standard algorithms for neural networks composed of a few hundred neurons [56].

The idea of our work was to begin experimenting with these tools using feed-forward architectures, both static and dynamic, chosen for their simplicity, and non-linear auto-regressive networks with exogenous inputs (NARX) as representative of recursive networks, selected for their positive applications [43,45].

In order to perform the training of the model, the available dataset will be separated into three parts:

Training set (used to effectively train the model, defining the weights and biases of the ANN);
Validation set (necessary to overcome the overfitting problem);
Test set (final set, never seen by the trained model, used instead to assess its real performance).

Typical percentages of division for the dataset are 70-15-15.
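A 70-15-15 division can be sketched as below. The source does not state whether samples are assigned at random or in chronological order, so the random shuffle here is an assumption, and `split_dataset` is a hypothetical helper name.

```python
import random


def split_dataset(samples, fractions=(0.70, 0.15, 0.15), seed=0):
    """Randomly split samples into training, validation and test sets."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    indices = list(range(len(samples)))
    # Assumption: random assignment; a chronological split is also possible.
    random.Random(seed).shuffle(indices)
    n_train = round(fractions[0] * len(samples))
    n_val = round(fractions[1] * len(samples))
    train = [samples[i] for i in indices[:n_train]]
    val = [samples[i] for i in indices[n_train:n_train + n_val]]
    # The test set takes whatever remains, so no sample is lost to rounding.
    test = [samples[i] for i in indices[n_train + n_val:]]
    return train, val, test
```

Fixing the seed keeps the split reproducible, which makes it easier to compare different network configurations on the same data.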

The model thus created will then be used in the monitoring and control phase: receiving as inputs the measurements of some variables of the system (both operational and environmental; their choice is to be determined through the study of technical and scientific literature), the model generates the value of the output variable, characterizing the system in its healthy condition.

The output variable generated by the model is then compared with the actual value measured by the measurement system at the same time. The deviation between the two values (actual and estimated by the model) is evaluated statistically using control charts to identify anomalies in place in the system. The preliminary analysis of control charts is necessary to have a first evaluation of the performance of the models in highlighting anomalous behaviors in the past.

Feature Selection

The choice of variables should be made in order to be able to characterize the “healthy” behavior of the system completely. Inputs and outputs will significantly characterize the system’s ability to monitor components and identify faults. The inputs and outputs that can allow adequate visibility of abnormal behaviors are not known a priori and are not easy to determine. A careful analysis of the system is needed, and important skills are required to estimate the mutual influence to which the parameters of a wind turbine are subjected.

For this phase, the support of bibliographic research is essential to understand which of the variables made available by SCADA systems are the best for implementing a turbine monitoring system through its components. The determination of the link between these quantities is generally based on the combined use of data reduction techniques (e.g., Principal Component Analysis) and engineering knowledge.

In Table 1, the different inputs and outputs for the proposed models are presented. Their choice is the result of research in the scientific literature to identify the possible variables to use.

Table 1. In the table inputs and outputs used for the different models are identified.

3.3. Post-Processing

After having processed the model and having evaluated its accuracy, it is necessary to analyze the behavior of the resulting deviations.

The deviations are calculated as:

\Delta = \text{actual value} - \text{estimated value}.

(2)

The deviation between each pair of values (actual value and value estimated by the model) is evaluated statistically through the use of a control chart to identify anomalies in the system. In particular, in this approach, the Shewhart control chart is used.

The deviations are plotted on the chart showing their evolution over time. Two control limits are added to simplify the evaluation of anomalous behaviors [58]:

The Upper Control Limit (UCL);
The Lower Control Limit (LCL).

These values define the sensitivity of the control chart and are often set as multiples of the standard deviation of the deviations’ distribution, σ. The standard deviation is calculated from the moving range MR, as the difference between the i-th deviation and the previous one:

MR = | \Delta_{i} - \Delta_{i-1} |,

(3)

\sigma = \frac{\overline{MR}}{1.128},

(4)

UCL = +3\sigma,

(5)

LCL = -3\sigma,

(6)

When the system examined shows healthy behavior (compliant with the model), the deviations on the chart follow a normal statistical distribution with a mean equal to zero. On the contrary, the presence of non-random patterns (e.g., points outside the control limits, mixtures or shifts of the average) signals non-conformity with the model and therefore anomalous behavior.
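Equations (2) to (6) can be sketched in a few lines. The example below is a minimal illustration with hypothetical function names; it estimates the limits from a reference series of deviations (assumed to come from a healthy period) and checks only the simplest rule, points outside the control limits. Detecting shifts or mixtures would need additional run rules.

```python
from statistics import mean

D2 = 1.128  # constant used in Eq. (4) for moving ranges of two observations


def deviations(actual, estimated):
    """Eq. (2): measured value minus model estimate, sample by sample."""
    return [a - e for a, e in zip(actual, estimated)]


def control_limits(reference_deviations):
    """Eqs. (3)-(6): return (LCL, UCL) from the average moving range."""
    moving_ranges = [
        abs(curr - prev)
        for prev, curr in zip(reference_deviations, reference_deviations[1:])
    ]
    sigma = mean(moving_ranges) / D2
    return -3.0 * sigma, 3.0 * sigma


def out_of_control(devs, lcl, ucl):
    """Indices of the deviations that fall outside the control limits."""
    return [i for i, d in enumerate(devs) if d < lcl or d > ucl]
```

For instance, a reference series whose moving ranges all equal 1.128 gives σ = 1 and limits of ±3, so a deviation of 4 between a measured and an estimated temperature would be flagged.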

Once the model has been validated, by ascertaining that the anomalies detected are real faults, the model can be used to enact real-time fault detection.

Figure 4 summarizes the proposed method to perform fault detection in wind turbines.

Figure 4. Proposed fault detection methodology.

4. Case Study Application

The proposed methodology is applied to a selection of wind turbines from a wind farm in southern Italy. The turbines are Vestas V90 2 MW models.

The data available are in different formats:

SCADA data recorded every 10 min, from 1 January 2015 to 9 January 2018, for a total of 192 sampled variables;
Service report, in which for each month from January 2015 to October 2016, the records of the maintenance interventions carried out are collected.

The models were created with historical data and were then applied to the following period to assess the tool. In order to do so, therefore, maintenance information has been essential.

In the following paragraphs all the considerations that emerged in the different phases are reported—the issues of data pre-processing, the choice of ideal configurations and the selection of the most suitable models. Finally, the capacity of the developed system has been tested for the identification of faults that, in the period investigated, were found in the turbines of the wind farm.

As previously specified, we followed two different approaches—the first, at a higher level, based on the turbine monitoring as a whole entity via the power output, and then the development of more specific models for the major components, in particular for the generator and the gearbox. The input and output variables of the developed models, determined thanks to the literature review, are shown in Table 1.

In Table 2, it is possible to observe a list with the most critical faults concerning the components on which we have concentrated. Below, for each component, we will see how the system reacts.

Table 2. Main maintenance interventions on gearbox and generator in the investigated period.

4.1. Data Pre-Processing

The first general filtering operation is aimed at avoiding the presence of anomalous points that can have a negative impact on the accuracy of the model. The following categories of data have been excluded:

Output power is zero;
Instances in which at least one of the measures of the relevant variables is missing;
Instances in which the turbine is working under a regime of limited power.

Figure 5a shows the initial state of the data for one of the turbines, where the presence of numerous outliers is obvious. Already in this first phase, one of the characteristics that will have a great impact on our system emerges: about 50% of the data are affected by a limited power regime.

Figure 5. (a) SCADA data before filtering operations; (b) Data after removing points subjected to power limitation; (c) Application of clustering on the train set; (d) Data cleaned and ready for the training stage (after the elimination of abnormal points identified through the evaluation of the Mahalanobis distance).

The power limitations are mainly due to constraints imposed by the national power grid to overcome dispatching problems and, since these behaviors are not considered normal, the samples affected by these restrictions have to be removed.

An example of applying the first general filter is shown in Figure 5b.

For the selection of an appropriate set of data to be used for the training of the different models, the maintenance records were analyzed. Through the service reports the history of the individual turbines in the wind farm can be reconstructed in terms of the failures that occurred. For the different models, a part of the dataset is manually selected where the turbine has been free from faults that may have had repercussions on the monitored variables. There are no general rules for establishing the ideal size of the dataset to be used in training, but it should contain all the natural variability of the quantities used as inputs and outputs. In this regard, where possible, an annual interval of operation for the wind turbine is used.

With the aim of increasing the performance of the models, the training set is subjected to a second filtering operation: data outside the normal operating range of the turbine are removed with a clustering method, in which the K-means algorithm divides the data into subgroups and the Mahalanobis distance is then used to detect and delete abnormal points.

Figure 5c shows an example of outlier removal using this method. The number of clusters was set to 12, while the threshold distance was set so that 5% of the points are considered anomalies.

Finally, Figure 5d shows the training dataset clean and ready to be used in training the model.

Figure 6 shows a graphical representation of the output variables chosen for the four models developed for one of the turbines analyzed, over the training phase.

Figure 6. Graphical representation of the chosen output variables for the four models elaborated from one of the turbines analyzed in the time interval used for the training phase. (a) Power Output (kW); (b) Gearbox Bearing Temperature; (c) Generator DE Bearing Temperature; (d) Generator Slip Ring Temperature.

4.2. Model Processing

To develop the models, both static and dynamic FFNN and recursive networks (NARX) have been considered.

As there are no general rules for the optimal configuration to be used, the number of layers and the number of neurons inside them are determined following an experimental campaign, widely varying these two characteristics for the different types of networks tested. Although increasing the size of the network generally results in better performance, the results obtained show how using neural networks with more than two hidden layers and with a number of neurons greater than 30 does not lead to a substantial improvement of the models.

To facilitate training and avoid overfitting, to which large neural networks are particularly prone, this application favors neural networks with a single hidden layer and no more than 30 neurons; the exact configuration is established case by case through an iterative procedure. Furthermore, to improve the network’s ability to generalize, in addition to the standard early stopping methodology, with a division of the training, validation and test sets of 70%, 15% and 15% respectively, the network has been tested on an additional independent test set equal to 20% of the set used in the entire training. The network characterized by the best performance has been selected.
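The early stopping idea mentioned above can be illustrated independently of any toolbox. The sketch below takes a sequence of validation losses, one per training epoch, and returns the epoch whose weights would be kept. The function name and the `patience` value are assumptions for illustration; the source does not state the stopping criterion used.

```python
def early_stopping_epoch(val_losses, patience=6):
    """Return the epoch with the best validation loss before training stops.

    Training is considered stopped once the validation loss has failed to
    improve for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best: remember it and reset the counter.
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # validation error no longer improves: stop training
    return best_epoch
```

In a real training loop the losses would be computed on the validation set after each epoch, and the network weights saved at the best epoch would be restored at the end.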

Although the literature shows that dynamic feed-forward networks or recurrent networks such as NARX are particularly suitable for this problem [43,45], in our application these tools did not produce the expected results: the best model performance was in fact obtained with the simplest neural networks, the static feed-forward ones. One reason why these autoregressive approaches did not prove suitable is certainly their sensitivity to “missing” data, in this case caused by the extensive, but inevitable, removal of the numerous outliers.

All neural network models were developed using MATLAB.

4.3. Wind Turbine Model

The model used for monitoring the output power is an FFNN. In addition to wind speed and ambient temperature, its inputs include wind direction and the standard deviation of wind speed, which contribute to the performance of the model, albeit to a lesser extent. Figure 7 presents a graphical representation of the FFNN model.

Figure 7. Representation of the feed-forward neural networks (FFNN) model elaborated: the model presents 4 neurons in the input layer, 28 neurons in the only hidden layer and one neuron in the output layer.

An example of the use of this model is reported in Figure 8, where a control chart of Power Output deviations is shown.

Figure 8. Control chart of Power Output deviations using the ANN model.
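A control chart of deviations such as the one in Figure 8 can be sketched as follows. This is a minimal illustration that assumes Shewhart-style limits placed `k` standard deviations around the mean deviation observed on healthy training data; the function names, the default `k = 3` and the sample values are hypothetical and are not taken from this study.

```python
from statistics import mean, stdev


def control_limits(training_deviations, k=3.0):
    """Return (lower, center, upper) limits estimated from the deviations
    (measured - predicted) observed during the healthy training period."""
    center = mean(training_deviations)
    sigma = stdev(training_deviations)
    return center - k * sigma, center, center + k * sigma


def out_of_control(measured, predicted, limits):
    """Return the indices whose deviation falls outside the control limits."""
    lower, _, upper = limits
    flagged = []
    for i, (y, y_hat) in enumerate(zip(measured, predicted)):
        deviation = y - y_hat
        if deviation < lower or deviation > upper:
            flagged.append(i)
    return flagged


# Hypothetical values, for illustration only.
limits = control_limits([-1, 0, 1, 0, -1, 1, 0, 0])
alarms = out_of_control([10, 12, 15], [10, 11, 11], limits)
```

In this example only the third observation, whose deviation is 4, lies beyond the upper limit and is flagged.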

The proposed model has been tested on several turbines and compared with another type of model common in literature, based on a non-linear regression (see the formula below from [59]):

P(v) = P_{max} \left(1 + \left(\frac{b}{v}\right)^{a}\right)^{-c}, \qquad a, b, c > 0, \qquad (7)

where v represents the wind speed and P_{max} the nominal wind turbine power.
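For reference, Equation (7) can be evaluated with a few lines of Python. This is only a sketch: the function name and the parameter values in the example are hypothetical, and in practice a, b and c would be fitted to the SCADA data of each turbine.

```python
def power_curve(v, p_max, a, b, c):
    """Evaluate P(v) = p_max * (1 + (b / v)**a)**(-c) for a, b, c > 0.

    v is the wind speed and p_max the nominal power. The value at v <= 0
    is set to 0, which is the limit of the expression as v tends to 0.
    """
    if v <= 0:
        return 0.0
    return p_max * (1.0 + (b / v) ** a) ** (-c)


# Hypothetical parameters: with c = 1, the curve reaches p_max / 2 at v = b.
half_power = power_curve(8.0, p_max=2000.0, a=5.0, b=8.0, c=1.0)
```

The curve increases monotonically with v and approaches p_max asymptotically, so b sets the position of the rising part of the curve while a and c shape its steepness.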

Figure 9 presents a control chart of Power Output deviations using this other reference model.

Figure 9. Control chart of Power Output deviations using the non-linear regression model.

In Table 3, a comparison between the two models is reported.

Table 3. Performance parameters comparison for the two models analyzed.

The comparison between the two models shows that the first one, the ANN model, overcomes the issue of seasonality and provides a better representation of the behavior of the wind turbine.

Although the performance of the output power model, measured in terms of Root Mean Squared Error (RMSE), Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE), is in line with results from the literature, the model was not able to identify the occurrence of faults in specific components. Therefore, specific models were created for the critical components: gearbox and generator.
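The three performance indicators mentioned above follow their standard definitions and can be computed as in the Python sketch below. The function names and sample values are illustrative. Skipping samples whose measured value is zero in the MAPE is a choice made for this example, not necessarily the one adopted in this study.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root Mean Squared Error."""
    errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(errors) / len(errors))


def mae(actual, predicted):
    """Mean Absolute Error."""
    errors = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    Samples with a measured value of zero are skipped, because the
    percentage error is undefined there (assumption of this sketch).
    """
    ratios = [abs(a - p) / abs(a) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical temperatures in degrees Celsius.
measured = [50.0, 60.0, 40.0]
modelled = [52.0, 59.0, 40.0]
scores = (rmse(measured, modelled), mae(measured, modelled), mape(measured, modelled))
```

RMSE and MAE are expressed in the unit of the modelled variable, whereas MAPE is dimensionless, which makes it the more convenient figure when comparing models with different output variables.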

4.4. Gearbox Model

The ANN model for the gearbox bearing temperature is an FFNN with one hidden layer and 27 neurons (with RMSE = 0.68 °C, MAE = 0.48 °C, MAPE = 0.93% calculated during the training phase). The control chart of the gearbox bearing temperature model is reported in Figure 10.

Figure 10. The control chart of Gearbox Bearing Temperature deviations (WT01).

To assess its application, we should refer to the maintenance history of the wind turbine:

Repair gearbox from 26 April 2016 to 30 April 2016;
IMS bearings replacement from 27 September 2016 to 28 September 2016;
Repair gearbox from 8 February 2017 to 11 February 2017.

In the control chart, two points fall outside the control limits, on 28 February 2016 and 16 March 2016. In addition, from 19 May 2016 the deviations remain out of control for an extended period, with a positive shift of the average, before the replacement of the Intermediate Shaft (IMS) bearings on 27 September 2016. It therefore seems that this maintenance intervention might have been predicted.

After this replacement, the deviations decrease, although some points remain outside the control limits. However, these elements are not sufficient to predict the last maintenance action on the gearbox in February 2017.

4.5. Generator Model

4.5.1. Wind Turbine WT01

The ANN model for the generator bearing temperature is an FFNN with one hidden layer and 28 neurons (with RMSE = 3.80 °C, MAE = 2.11 °C, MAPE = 4.54% calculated during the training phase). The control chart for this application is reported in Figure 11.

Figure 11. The control chart of Generator Bearing Temperature deviations (WT01).

To assess its application, we should refer to the maintenance history of the wind turbine:

Non-Drive End (NDE) and Drive End (DE) bearings replacement from 16 May 2016 to 19 May 2016;
Replacement of the generator from 16 August 2016 to 28 August 2016.

The turbine underwent a replacement of the bearings and a replacement of the generator. The latter generated one of the most critical scenarios for wind farm maintenance, keeping the turbine stationary for fourteen days. There is no information on alarms or interventions except for the request to replace the bearings, which was submitted on 12 May 2016 and carried out from 16 May 2016 to 19 May 2016.

No alarm is detected in the following months, but the turbine is in a stopped state from 16 August 2016 to 28 August 2016, a period in which, following a request submitted on 22 August 2016, the replacement of the generator is conducted, ending on 28 August 2016.

The control chart of the deviations of the generator bearing temperature model shows an evident change in variability, with numerous points out of control from 5 November 2015 to the replacement of the bearings on 16 May 2016. In particular, the last point out of control dates back to 12 May 2016, when the replacement was ordered.

The proposed monitoring system, therefore, seems to predict the anomaly well in advance. Following the replacement of the bearings, the deviations change significantly, presenting a shift of the average, while the variability seems to have returned to normal. Although the shift of the average can be justified by the use of a different type of bearing characterized by different specifications, the growing trend that stops only after replacing the generator is still anomalous.

To assess how early the system could have predicted the anomaly, the first three weeks after the replacement of the bearings were considered “normal” and used to calculate a new average of the deviations; 100% of the following points lie above this average. According to the Western Electric rules, which consider eight consecutive points on the same side of the center line of the control chart to be anomalous, the proposed system would have identified the anomaly on 25 June 2016, approximately two months before the generator’s replacement.
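The run rule applied above can be sketched in Python as follows. The function name and the sample series are hypothetical; `center` stands for the average recomputed on the period taken as “normal”, and points lying exactly on the center line are assumed to interrupt a run.

```python
def first_run_signal(deviations, center, run_length=8):
    """Return the index of the point that completes the first run of
    `run_length` consecutive points on the same side of `center`,
    or None if no such run occurs."""
    run = 0
    side = 0  # +1 above the center line, -1 below, 0 on the line
    for i, d in enumerate(deviations):
        current = (d > center) - (d < center)
        if current != 0 and current == side:
            run += 1
        else:
            side = current
            run = 1 if current != 0 else 0
        if run >= run_length:
            return i
    return None


# Hypothetical series: the run above the center line starts at index 2.
signal_index = first_run_signal([0.5, -0.2, 1, 1, 1, 1, 1, 1, 1, 1], center=0.0)
```

Here the eighth consecutive point above the center line is at index 9, so the rule signals there. Applied to daily deviations, the returned index would correspond to the date on which the anomaly is first reported.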

4.5.2. Wind Turbine WT02

Now, it is possible to observe another application on a different wind turbine.

The ANN models for the generator bearing temperature and the generator slip ring temperature are FFNNs with one hidden layer and, respectively, 24 neurons (with RMSE = 2.46 °C, MAE = 1.28 °C, MAPE = 3.00% calculated during the training phase) and 30 neurons (with RMSE = 0.88 °C, MAE = 0.70 °C, MAPE = 3.14% calculated during the training phase). The control charts for these applications are reported in Figure 12 (generator bearing temperature model) and in Figure 13 (generator slip ring temperature model).

Figure 12. The control chart of Generator Bearing Temperature deviations (WT02).

Figure 13. The control chart of Generator Slip Ring Temperature deviations (WT02).

To assess their application, we should refer to the maintenance history of the wind turbine:

Purging of exhausted grease channel of the generator bearings on 1 August 2016;
NDE and DE bearings replacement from 11 May 2017 to 12 May 2017.

From the analysis of the control charts of the generator bearing temperature deviations (Figure 12), in the first period, several points beyond the upper limit are noted.

The first alarm dates back to 8 June 2016 and was repeated fourteen more times until 1 August 2016, the date on which the clogged grease drain channel was cleaned. In this case, the system showed good forecasting capacity, and it also anticipated the second alarm of 5 September 2016 concerning the high temperature of the bearings.

From October 2016, there are repeated points beyond the upper limit with very high deviation values. The anomalous behavior appeared to end on 18 February 2017, but the deviations then went out of control again several times until the bearings were replaced on 11 May 2017.

Unfortunately, as of October 2016, maintenance reports were no longer available, preventing more specific considerations regarding the actual detection of anomalies.

However, it is possible to assume that there were interventions on the turbine, probably at the times when the deviations drop, since the generator temperatures exceeded 100 °C and certainly generated alarms from the SCADA system. The numerous points out of control in the months preceding the replacement of the bearings seem to be non-random and are potentially signals able to predict the need for intervention.

Although to a much lesser extent, the temperature of the slip ring shows several points out of control between October 2016 and February 2017, as reported in Figure 13. Besides, from 12 March 2017, the deviations are stably beyond the control limits. They return within the limits exactly when the bearings are replaced.

5. Discussion

In this experimental application, all the steps of the proposed fault detection methodology have been tested.

In the first steps, data acquisition and pre-processing, some difficulties were caused by the numerous outliers. This situation is typical in the operational context of the investigated application. However, despite the large number of outliers, the proposed clustering method, which combines the K-Means algorithm with the Mahalanobis distance, was quite efficient. Indeed, it yielded an adequate dataset for the subsequent phases, and the procedure has the additional advantage of being easily automatable to support large-scale applications on wind farms.
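To give an idea of how a Mahalanobis-distance filter operates inside a single cluster, the Python sketch below handles two-dimensional points, for example (wind speed, power) pairs. It is a simplified illustration and not the procedure implemented in this study: the K-Means step is omitted, and the function names, the two-dimensional restriction and the threshold of 3 are assumptions.

```python
from math import sqrt
from statistics import mean


def mahalanobis_2d(points):
    """Mahalanobis distance of each (x, y) point from the mean of the
    cluster, using the sample covariance matrix of the cluster."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    n = len(points)
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Quadratic form with the inverse of the 2x2 covariance matrix.
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        distances.append(sqrt(d2))
    return distances


def remove_outliers(points, threshold=3.0):
    """Keep the points whose distance from the cluster mean is within
    `threshold` (hypothetical value)."""
    distances = mahalanobis_2d(points)
    return [p for p, d in zip(points, distances) if d <= threshold]
```

Unlike the Euclidean distance, the Mahalanobis distance accounts for the spread and correlation of the data, so a point is judged relative to the shape of its cluster. In a complete pipeline this filter would be applied to each cluster returned by K-Means.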

Two different monitoring approaches were undertaken. The first, at a higher level, monitors the turbine through its power output. It showed excellent results in terms of model performance but did not prove capable of signaling the presence of anomalies in the turbine, which motivated the development of more specific models for the major components, in particular the generator and the gearbox.

The best results for the detection of faults and operating anomalies were obtained for the generator, where the proposed approach showed evidence of the applicability in the prediction of the occurrence of critical failures.

In addition, the system was able to predict minor interventions, such as purging and cleaning of the bearings, and failures in the ventilation system.

The proposed approach has the notable advantage of being tailored to use only SCADA data, which are generally available and already transmitted in real time, whereas the other relevant predictive techniques cited before require additional measurement systems, and their measurements cannot be performed continuously.

Besides, the use of data-driven models, as opposed to physical models, makes it possible to achieve good modeling accuracy without extensive knowledge of the specific structure of the system and of its behavior in every operating condition.

The experimental application was successfully carried out in the case study presented, but the fundamental importance of the data acquisition and data collection phases in this approach should be highlighted. Indeed, historical data that are not extensive enough to be representative of the wind turbine’s healthy state would prove detrimental to the application’s success.

6. Conclusions

In this paper, all the phases that lead to the realization of a system aimed at the monitoring and the identification of anomalies of a wind turbine and its main components, such as the generator and the gearbox, have been described. The proposed approach is based on the use of data collected by SCADA acquisition systems. The main tools used for the development of the fault detection methodology are ANN for the development of the models and SPC for the identification and analysis of operating anomalies.

The proposed methodology has the objective to implement a fault detection system for wind turbines on several levels: monitoring the performance of the turbine as a whole, while also monitoring two critical components such as the generator and the gearbox.

The methodological approach was applied to a real case study regarding two wind turbines to test its effectiveness, and it was successful in identifying abnormal behaviors before the occurrence of faults.

Thus, the system developed has been proven to be a valuable support to the maintenance of a wind farm, providing additional information to evolve the current maintenance policy based on time-based scheduled inspections and alarms related to the exceedance of static threshold values, which often do not allow sufficient time to prevent critical scenarios.

The proposed method has the possibility of being extended to other major components.

A future development of this approach is the evolution of said methodology towards fault diagnosis. In addition to identifying the abnormal behavior, in order to define with precision the cause, faults that occurred should be associated with specific patterns on the control charts, thus laying the foundations for the subsequent automation of the fault diagnosis. To do so, however, deep cooperation with industrial actors is necessary since not only a long-term trial is mandatory but also the support of experts and maintenance personnel is deemed critical to successfully tailor this further step. This aspect would have interesting consequences, even regarding the scheduling optimization of maintenance jobs.

In more general terms, then, the increased awareness of the real health of the system generated by the implementation of such tools would have advantages also concerning the very current theme of power forecasting for renewable energy sources [60].

Author Contributions

All authors contributed equally to the idea and the design of the methodology proposed; A.S. and D.D. were responsible for the case study application; A.S. and D.D. prepared the original draft; A.S. and V.I. contributed to the review and editing and V.I. was responsible for the project supervision. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

This appendix reports the most relevant contributions examined in the scientific literature review performed in this research. It aims to summarize the different applications and techniques used in condition-monitoring and fault detection for wind turbines. The literature review process was performed by using keywords such as “wind turbines”, “condition monitoring”, “fault detection”, “neural networks”.

Table A1 reports a list of the relevant scientific publications analyzed. For each application, the table reports the investigated components, the main tools and methods used, the presence of a real case study in which the proposed tools and techniques are validated, and the type of approach followed.

In particular, for the column “Real Case Study”, we considered as real case studies only those that involved wind turbines operating in real conditions (no prototypes, single components or simulation approaches).

Table A1. Summary of relevant applications of fault detection and condition monitoring for wind turbines.

Ref.	Authors, Year	Components	Tools/Methods	Real Case Study	Approach Type
[33]	Kusiak et al., 2009	Wind turbine	k-NN ¹	Yes	SCADA data
[39]	Zaher et al., 2009	Gearbox, generator	Multilayer auto regressive FFNN	Yes	SCADA data
[40]	Schlechtingen and Santos, 2011	Gearbox, generator	Auto-regressive FFNN, Linear regression	Yes	SCADA data
[61]	Simani et al., 2011	Sensors/Actuators	Fuzzy Logic	No	Not specified
[41]	Kusiak and Verma, 2012	Gearbox, generator	Multilayer FFNN	Yes	SCADA data
[7]	Zhang et al., 2012	Gearbox	Fast Fourier Transformation	No	Vibration Analysis
[52]	Schlechtingen et al., 2013	Wind turbine	ANFIS ², K-NN ¹, FFNN, CCFL ³	Yes	SCADA data
[34]	Kusiak and Verma, 2013	Wind turbine	K-means clustering	Yes	SCADA data
[22]	Feng et al., 2013	Gearbox	Mathematical model	Yes	SCADA data
[22]	Feng et al., 2013	Gearbox	Spectral Analysis	Yes	Vibration signal and oil debris count
[8]	Liu, 2013	Tower	Mathematical model	No	Vibration Analysis
[17]	Ling and Cai, 2013	Generator	Mathematical model	No	MCSA
[57]	Schlechtingen and Santos, 2014	Gearbox, generator	ANFIS ²	Yes	SCADA data
[42]	Zhang and Wang, 2014	Main bearing	Multilayer FFNN	Yes	SCADA data
[43]	Karlsson, 2015	Wind turbine	NARX	Yes	SCADA data
[18]	Merabet et al., 2015	Generator	Fuzzy Logic	No	MCSA
[19]	El Bouchikhi et al., 2015	Generator	Maximum likelihood estimator	No	MCSA
[35]	Leahy et al., 2016	Wind turbine	SVM ⁴	Yes	SCADA data
[62]	Zhang and Ma, 2016	Wind turbine	Parallel factor analysis, K-means	Yes	SCADA data
[14]	Gómez Muñoz and García Márquez, 2016	Blades	Graphical method	No	Acoustic Emission Analysis
[44]	Sun et al., 2016	Gearbox, generator	FFNN, Fuzzy synthetic evaluation	Yes	SCADA data
[63]	Pozo and Vidal, 2016	Sensors/Actuators	PCA ⁵, Statistical hypothesis testing	No	SCADA data
[64]	Bi et al., 2017	Pitch system	Mathematical model	Yes	SCADA data
[36]	Ouyang et al., 2017	Wind turbine	SVM ⁴	Yes	SCADA data
[65]	Wang et al., 2017	Gearbox	Deep Neural Network	Yes	SCADA data
[66]	Nazir et al., 2017	Sensors/Actuators	Mathematical model	No	Not specified
[45]	Bangalore et al., 2017	Gearbox	NARX	Yes	SCADA data
[59]	Marčiukaitis et al., 2017	Wind turbine	Non-linear regression	Yes	SCADA data
[46]	Nithya et al., 2017	Rotor	FFNN	No	SCADA data
[53]	Zhao et al., 2017	Generator	SVM ⁴, ANN, K-NN ¹, Naive Bayesian	Yes	SCADA data
[67]	Yu et al., 2018	Sensors/Actuators	Deep Belief Network	No	Not specified
[68]	Alvarez and Ribaric, 2018	Gearbox	Mathematical model	Yes	SCADA data
[69]	González-González et al., 2018	Pitch system	Mathematical model	Yes	Not specified
[20]	Artigao et al., 2018	Generator	Fast Fourier Transformation	Yes	MCSA
[70]	Dao et al., 2018	Wind turbine	Cointegration analysis	Yes	SCADA data
[47]	Manobel et al., 2018	Wind turbine	Gaussian Processes, ANN	Yes	SCADA data
[48]	Bangalore and Patriksson, 2018	Gearbox	ANN	Yes	SCADA data
[37]	Vidal et al., 2018	Sensors/Actuators	SVM ⁴	No	SCADA data
[71]	Zhao, 2018	Gearbox, generator	Deep auto-encoder network	Yes	SCADA data
[72]	Yang et al., 2018	Wind turbine	Multivariate EWMA ⁶	Yes	SCADA data
[73]	Wen et al., 2018	Various components	CNN ⁷	No	SCADA data
[49]	Wang et al., 2018	Gearbox, generator	PCA ⁵, ANN	Yes	SCADA data
[74]	Zhang and Lang, 2018	Bearings	Wavelet energy transmissibility functions	Yes	Vibration analysis
[11]	Li et al., 2019	Gearbox bearing	Stochastic resonance	Yes	Vibration analysis
[12]	Gu and Chen, 2019	Gearbox bearing	Stochastic resonance	Yes	Vibration analysis
[13]	Li et al., 2019	Gearbox bearing	Hidden-Markov model	Yes	Vibration analysis
[75]	Qian et al., 2019	Gearbox	HELM ⁸ algorithm, cloud computing	Yes	SCADA data
[50]	Fu et al., 2019	Gearbox	CNN ⁷, LSTM ⁹ networks	Yes	SCADA data
[76]	Lei et al., 2019	Various components	LSTM ⁹ networks	No	Not specified
[77]	Saari et al., 2019	Bearings	SVM ⁴	Yes	Vibration analysis
[9]	Jiang et al., 2019	Gearbox	Multiscale CNN ⁷	No	Vibration analysis
[78]	Bakdi et al., 2019	Wind turbine	PCA ⁵, EWMA ⁶	No	Not specified
[79]	Rizk et al., 2020	Blades	Hyperspectral imaging technique	No	Image Analysis
[80]	Dong et al., 2020	Wind turbine	Mathematical model	Yes	SCADA data
[38]	Liu et al., 2020	Generator, converter, pitch system	Convolutional Neural Network, SVM ⁴	Yes	SCADA data
[81]	Chang et al., 2020	Gearbox	Concurrent CNN ⁷	No	Vibration analysis
[82]	Pujol-Vazquez et al., 2020	Pitch actuator	Mathematical model	No	Not specified
[83]	Stetco et al., 2020	Generator	CNN ⁷	No	Data-driven using current signals
[84]	Zhang and Lang, 2020	Wind turbine, generator	Dynamic model sensor	Yes	SCADA data
[85]	Chen et al., 2020	Generator	Modulation signal bispectrum	Yes	Current signals analysis
[86]	Yang et al., 2021	Blades	Deep learning model	Yes	Image Analysis
[10]	Chen et al., 2021	Generator bearings	DCGAN ¹⁰	Yes	Data-driven using vibration data
[29]	Wang and Liu, 2021	Gearbox, Generator	CMI ¹¹, K-NN ¹	Yes	SCADA data

¹ k-Nearest Neighbors (k-NN), ² Adaptive Neuro-Fuzzy Inference System (ANFIS), ³ Cluster Center Fuzzy Logic (CCFL), ⁴ Support Vector Machine (SVM), ⁵ Principal Component Analysis (PCA), ⁶ Exponentially Weighted Moving Average (EWMA), ⁷ Convolutional Neural Networks (CNN), ⁸ Hierarchical Extreme Learning Machine (HELM), ⁹ Long Short Term Memory (LSTM), ¹⁰ Deep Convolutional Generative Adversarial Networks (DCGAN), ¹¹ Conditional Mutual Information (CMI).

References

Digital Science & Research Solutions, Inc. Renewable Energy Sources and Climate Change Mitigation: Special Report of the Intergovernmental Panel on Climate Change. Choice Rev. Online 2012, 49, 49-6309.
Wind Europe: Wind Energy in Europe in 2019—Trends and Statistics. Available online: https://windeurope.org/data-and-analysis/product/wind-energy-in-europe-in-2019-trends-and-statistics/ (accessed on 15 October 2020).
Blanco, M.I. The Economics of Wind Energy. Renew. Sustain. Energy Rev. 2009, 13, 1372–1382.
Hahn, B.; Durstewitz, M.; Rohrig, K. Reliability of Wind Turbines. In Wind Energy; Springer: Berlin/Heidelberg, Germany, 2007; pp. 329–333.
Tautz-Weinert, J.; Watson, S.J. Using SCADA Data for Wind Turbine Condition Monitoring—A Review. IET Renew. Power Gener. 2017, 11, 382–394.
Liu, Z.; Zhang, L.; Carrasco, J. Vibration Analysis for Large-Scale Wind Turbine Blade Bearing Fault Detection with an Empirical Wavelet Thresholding Method. Renew. Energy 2020, 146, 99–110.
Zhang, Z.; Verma, A.; Kusiak, A. Fault Analysis and Condition Monitoring of the Wind Turbine Gearbox. IEEE Trans. Energy Convers. 2012, 27, 526–535.
Liu, W.Y. The Vibration Analysis of Wind Turbine Blade–Cabin–Tower Coupling System. Eng. Struct. 2013, 56, 954–957.
Jiang, G.; He, H.; Yan, J.; Xie, P. Multiscale Convolutional Neural Networks for Fault Diagnosis of Wind Turbine Gearbox. IEEE Trans. Ind. Electron. 2019, 66, 3196–3207.
Chen, P.; Li, Y.; Wang, K.; Zuo, M.J.; Heyns, P.S.; Baggeröhr, S. A Threshold Self-Setting Condition Monitoring Scheme for Wind Turbine Generator Bearings Based on Deep Convolutional Generative Adversarial Networks. Measurement 2021, 167, 108234.
Li, J.; Li, M.; Zhang, J.; Jiang, G. Frequency-Shift Multiscale Noise Tuning Stochastic Resonance Method for Fault Diagnosis of Generator Bearing in Wind Turbine. Measurement 2019, 133, 421–432.
Gu, X.; Chen, C. Adaptive Parameter-Matching Method of SR Algorithm for Fault Diagnosis of Wind Turbine Bearing. J. Mech. Sci. Technol. 2019, 33, 1007–1018.
Li, J.; Zhang, X.; Zhou, X.; Lu, L. Reliability Assessment of Wind Turbine Bearing Based on the Degradation-Hidden-Markov Model. Renew. Energy 2019, 132, 1076–1087.
Gómez Muñoz, C.; García Márquez, F. A New Fault Location Approach for Acoustic Emission Techniques in Wind Turbines. Energies 2016, 9, 40.
Caso, E.; Fernandez-del-Rincon, A.; Garcia, P.; Iglesias, M.; Viadero, F. Monitoring of Misalignment in Low Speed Geared Shafts with Acoustic Emission Sensors. Appl. Acoust. 2020, 159, 107092.
Faiz, J.; Moosavi, S.M.M. Eccentricity Fault Detection—From Induction Machines to DFIG—A Review. Renew. Sustain. Energy Rev. 2016, 55, 169–179.
Ling, Y.; Cai, X. Rotor Current Dynamics of Doubly Fed Induction Generators during Grid Voltage Dip and Rise. Int. J. Electr. Power Energy Syst. 2013, 44, 17–24.
Merabet, H.; Bahi, T.; Halem, N. Condition Monitoring and Fault Detection in Wind Turbine Based on DFIG by the Fuzzy Logic. Energy Procedia 2015, 74, 518–528.
El Bouchikhi, E.H.; Choqueuse, V.; Benbouzid, M. Induction Machine Faults Detection Using Stator Current Parametric Spectral Estimation. Mech. Syst. Signal. Process. 2015, 52–53, 447–464.
Artigao, E.; Honrubia-Escribano, A.; Gomez-Lazaro, E. Current Signature Analysis to Monitor DFIG Wind Turbine Generators: A Case Study. Renew. Energy 2018, 116, 5–14.
Hamilton, A.; Quail, F. Detailed State of the Art Review for the Different Online/Inline Oil Analysis Techniques in Context of Wind Turbine Gearboxes. J. Tribol. 2011, 133, 044001.
Feng, Y.; Qiu, Y.; Crabtree, C.J.; Long, H.; Tavner, P.J. Monitoring Wind Turbine Gearboxes. Wind Energ. 2013, 16, 728–740.
Lu, B.; Li, Y.; Wu, X.; Yang, Z. A Review of Recent Advances in Wind Turbine Condition Monitoring and Fault Diagnosis. In Proceedings of the 2009 IEEE Power Electronics and Machines in Wind Applications, Lincoln, NE, USA, 24–26 June 2009; pp. 1–7.
Salameh, J.P.; Cauet, S.; Etien, E.; Sakout, A.; Rambault, L. Gearbox Condition Monitoring in Wind Turbines: A Review. Mech. Syst. Signal. Process. 2018, 111, 251–264.
Liu, Z.; Zhang, L. A Review of Failure Modes, Condition Monitoring and Fault Diagnosis Methods for Large-Scale Wind Turbine Bearings. Measurement 2020, 149, 107002.
Glowacz, A. Fault Diagnosis of Electric Impact Drills Using Thermal Imaging. Measurement 2021, 171, 108815.
Gong, Y.; Fei, J.-L.; Tang, J.; Yang, Z.-G.; Han, Y.-M.; Li, X. Failure Analysis on Abnormal Wear of Roller Bearings in Gearbox for Wind Turbine. Eng. Fail. Anal. 2017, 82, 26–38.
AlShorman, O.; Masadeh, M.; Alkahtani, F.; AlShorman, A. A Review of Condition Monitoring and Fault Diagnosis and Detection of Rotating Machinery Based on Image Aspects. In Proceedings of the 2020 International Conference on Data Analytics for Business and Industry: Way Towards a Sustainable Economy (ICDABI), Sakheer, Bahrain, 26–27 October 2020; pp. 1–5.
Wang, Z.; Liu, C. Wind Turbine Condition Monitoring Based on a Novel Multivariate State Estimation Technique. Measurement 2021, 168, 108388.
Stetco, A.; Dinmohammadi, F.; Zhao, X.; Robu, V.; Flynn, D.; Barnes, M.; Keane, J.; Nenadic, G. Machine Learning Methods for Wind Turbine Condition Monitoring: A Review. Renew. Energy 2019, 133, 620–635.
Benedetti, M.; Cesarotti, V.; Introna, V.; Serranti, J. Energy Consumption Control Automation Using Artificial Neural Networks and Adaptive Algorithms: Proposal of a New Methodology and Case Study. Appl. Energy 2016, 165, 60–71.
Carvalho, T.P.; Soares, F.A.A.M.N.; Vita, R.; Francisco, R.D.P.; Basto, J.P.; Alcalá, S.G.S. A Systematic Literature Review of Machine Learning Methods Applied to Predictive Maintenance. Comput. Ind. Eng. 2019, 137, 106024.
Kusiak, A.; Zheng, H.; Song, Z. Models for Monitoring Wind Farm Power. Renew. Energy 2009, 34, 583–590.
Kusiak, A.; Verma, A. Monitoring Wind Farms With Performance Curves. IEEE Trans. Sustain. Energy 2013, 4, 192–199.
Leahy, K.; Hu, R.L.; Konstantakopoulos, I.C.; Spanos, C.J.; Agogino, A.M. Diagnosing Wind Turbine Faults Using Machine Learning Techniques Applied to Operational Data. In Proceedings of the 2016 IEEE International Conference on Prognostics and Health Management (ICPHM), Ottawa, ON, Canada, 20–22 June 2016; pp. 1–8.
Ouyang, T.; Kusiak, A.; He, Y. Modeling Wind-Turbine Power Curve: A Data Partitioning and Mining Approach. Renew. Energy 2017, 102, 1–8.
Vidal, Y.; Pozo, F.; Tutivén, C. Wind Turbine Multi-Fault Detection and Classification Based on SCADA Data. Energies 2018, 11, 3018.
Liu, Z.; Xiao, C.; Zhang, T.; Zhang, X. Research on Fault Detection for Three Types of Wind Turbine Subsystems Using Machine Learning. Energies 2020, 13, 460.
Zaher, A.; McArthur, S.D.J.; Infield, D.G.; Patel, Y. Online Wind Turbine Fault Detection through Automated SCADA Data Analysis. Wind Energ. 2009, 12, 574–593.
Schlechtingen, M.; Ferreira Santos, I. Comparative Analysis of Neural Network and Regression Based Condition Monitoring Approaches for Wind Turbine Fault Detection. Mech. Syst. Signal. Process. 2011, 25, 1849–1875.
Kusiak, A.; Verma, A. Analyzing Bearing Faults in Wind Turbines: A Data-Mining Approach. Renew. Energy 2012, 48, 110–116.
Zhang, Z.-Y.; Wang, K.-S. Wind Turbine Fault Detection Based on SCADA Data Analysis Using ANN. Adv. Manuf. 2014, 2, 70–78.
Karlsson, D. Wind Turbine Performance Monitoring Using Artificial Neural Networks. Master’s Thesis, Chalmers University of Technology, Göteborg, Sweden, 2015.
Sun, P.; Li, J.; Wang, C.; Lei, X. A Generalized Model for Wind Turbine Anomaly Identification Based on SCADA Data. Appl. Energy 2016, 168, 550–567.
Bangalore, P.; Letzgus, S.; Karlsson, D.; Patriksson, M. An Artificial Neural Network-Based Condition Monitoring Method for Wind Turbines, with Application to the Monitoring of the Gearbox. Wind Energ. 2017, 20, 1421–1438.
Nithya, M.; Nagarajan, S.; Navaseelan, P. Fault Detection of Wind Turbine System Using Neural Networks. In Proceedings of the 2017 IEEE Technological Innovations in ICT for Agriculture and Rural Development (TIAR), Chennai, India, 7–8 April 2017; pp. 103–108.
Manobel, B.; Sehnke, F.; Lazzús, J.A.; Salfate, I.; Felder, M.; Montecinos, S. Wind Turbine Power Curve Modeling Based on Gaussian Processes and Artificial Neural Networks. Renew. Energy 2018, 125, 1015–1020.
Bangalore, P.; Patriksson, M. Analysis of SCADA Data for Early Fault Detection, with Application to the Maintenance Management of Wind Turbines. Renew. Energy 2018, 115, 521–532.
Wang, Y.; Ma, X.; Qian, P. Wind Turbine Fault Detection and Identification Through PCA-Based Optimal Variable Selection. IEEE Trans. Sustain. Energy 2018, 9, 1627–1635.
Fu, J.; Chu, J.; Guo, P.; Chen, Z. Condition Monitoring of Wind Turbine Gearbox Bearing Based on Deep Learning Model. IEEE Access 2019, 7, 57078–57087.
Helbing, G. Deep Learning for Fault Detection in Wind Turbines. Renew. Sustain. Energy Rev. 2018, 98, 189–198.
Schlechtingen, M.; Santos, I.F.; Achiche, S. Using Data-Mining Approaches for Wind Turbine Power Curve Monitoring: A Comparative Study. IEEE Trans. Sustain. Energy 2013, 4, 671–679.
Zhao, Y.; Li, D.; Dong, A.; Kang, D.; Lv, Q.; Shang, L. Fault Prediction and Diagnosis of Wind Turbine Generators Using SCADA Data. Energies 2017, 10, 1210.
De Maesschalck, R.; Jouan-Rimbaud, D.; Massart, D.L. The Mahalanobis Distance. Chemom. Intell. Lab. Syst. 2000, 50, 1–18.
Haykin, S.S. Neural Networks and Learning Machines, 3rd ed.; Prentice Hall: New York, NY, USA, 2009; ISBN 978-0-13-147139-9.
Hagan, M.T.; Menhaj, M.B. Training Feedforward Networks with the Marquardt Algorithm. IEEE Trans. Neural Netw. 1994, 5, 989–993.
Schlechtingen, M.; Santos, I.F. Wind Turbine Condition Monitoring Based on SCADA Data Using Normal Behavior Models. Part 2: Application Examples. Appl. Soft Comput. 2014, 14, 447–460.
Benedetti, M.; Bonfà, F.; Introna, V.; Santolamazza, A.; Ubertini, S. Real Time Energy Performance Control for Industrial Compressed Air Systems: Methodology and Applications. Energies 2019, 12, 3935.
Marčiukaitis, M.; Žutautaitė, I.; Martišauskas, L.; Jokšas, B.; Gecevičius, G.; Sfetsos, A. Non-Linear Regression Model for Wind Turbine Power Curve. Renew. Energy 2017, 113, 732–741.
Wang, J.; Song, Y.; Liu, F.; Hou, R. Analysis and Application of Forecasting Models in Wind Power Integration: A Review of Multi-Step-Ahead Wind Speed Forecasting Models. Renew. Sustain. Energy Rev. 2016, 60, 960–981.
Simani, S.; Castaldi, P.; Tilli, A. Data—Driven Approach for Wind Turbine Actuator and Sensor Fault Detection and Isolation. IFAC Proc. Vol. 2011, 44, 8301–8306.
Zhang, W.; Ma, X. Simultaneous Fault Detection and Sensor Selection for Condition Monitoring of Wind Turbines. Energies 2016, 9, 280.
Pozo, F.; Vidal, Y. Wind Turbine Fault Detection through Principal Component Analysis and Statistical Hypothesis Testing. AST 2016, 101, 45–54.
Bi, R.; Zhou, C.; Hepburn, D.M. Detection and Classification of Faults in Pitch-Regulated Wind Turbine Generators Using Normal Behaviour Models Based on Performance Curves. Renew. Energy 2017, 105, 674–688.
Wang, L.; Zhang, Z.; Long, H.; Xu, J.; Liu, R. Wind Turbine Gearbox Failure Identification With Deep Neural Networks. IEEE Trans. Ind. Inf. 2017, 13, 1360–1368.
Nazir, M.; Khan, A.Q.; Mustafa, G.; Abid, M. Robust Fault Detection for Wind Turbines Using Reference Model-Based Approach. J. King Saud Univ.Eng. Sci. 2017, 29, 244–252. [Google Scholar] [CrossRef]
Yu, D.; Chen, Z.M.; Xiahou, K.S.; Li, M.S.; Ji, T.Y.; Wu, Q.H. A Radically Data-Driven Method for Fault Detection and Diagnosis in Wind Turbines. Int. J. Electr. Power Energy Syst. 2018, 99, 577–584. [Google Scholar] [CrossRef]
Alvarez, E.J.; Ribaric, A.P. An Improved-Accuracy Method for Fatigue Load Analysis of Wind Turbine Gearbox Based on SCADA. Renew. Energy 2018, 115, 391–399. [Google Scholar] [CrossRef]
González-González, A.; Cortadi, A.J.; Galar, D.; Ciani, L. Condition Monitoring of Wind Turbine Pitch Controller: A Maintenance Approach. Measurement 2018, 123, 80–93. [Google Scholar] [CrossRef]
Dao, P.B.; Staszewski, W.J.; Barszcz, T.; Uhl, T. Condition Monitoring and Fault Detection in Wind Turbines Based on Cointegration Analysis of SCADA Data. Renew. Energy 2018, 116, 107–122. [Google Scholar] [CrossRef]
Zhao, H. Anomaly Detection and Fault Analysis of Wind Turbine Components Based on Deep Learning Network. Renew. Energy 2018, 127, 825–834. [Google Scholar] [CrossRef]
Yang, H.-H.; Huang, M.-L.; Lai, C.-M.; Jin, J.-R. An Approach Combining Data Mining and Control Charts-Based Model for Fault Detection in Wind Turbines. Renew. Energy 2018, 115, 808–816. [Google Scholar] [CrossRef]
Wen, L.; Li, X.; Gao, L.; Zhang, Y. A New Convolutional Neural Network-Based Data-Driven Fault Diagnosis Method. IEEE Trans. Ind. Electron. 2018, 65, 5990–5998. [Google Scholar] [CrossRef]
Zhang, L.; Lang, Z.-Q. Wavelet Energy Transmissibility Function and Its Application to Wind Turbine Bearing Condition Monitoring. IEEE Trans. Sustain. Energy 2018, 9, 1833–1843. [Google Scholar] [CrossRef]
Qian, P.; Zhang, D.; Tian, X.; Si, Y.; Li, L. A Novel Wind Turbine Condition Monitoring Method Based on Cloud Computing. Renew. Energy 2019, 135, 390–398. [Google Scholar] [CrossRef]
Lei, J.; Liu, C.; Jiang, D. Fault Diagnosis of Wind Turbine Based on Long Short-Term Memory Networks. Renew. Energy 2019, 133, 422–432. [Google Scholar] [CrossRef]
Saari, J.; Strömbergsson, D.; Lundberg, J.; Thomson, A. Detection and Identification of Windmill Bearing Faults Using a One-Class Support Vector Machine (SVM). Measurement 2019, 137, 287–301. [Google Scholar] [CrossRef]
Bakdi, A.; Kouadri, A.; Mekhilef, S. A Data-Driven Algorithm for Online Detection of Component and System Faults in Modern Wind Turbines at Different Operating Zones. Renew. Sustain. Energy Rev. 2019, 103, 546–555. [Google Scholar] [CrossRef]
Rizk, P.; Al Saleh, N.; Younes, R.; Ilinca, A.; Khoder, J. Hyperspectral Imaging Applied for the Detection of Wind Turbine Blade Damage and Icing. Remote Sens. Appl. Soc. Environ. 2020, 18, 100291. [Google Scholar] [CrossRef]
Dong, X.; Gao, D.; Li, J.; Jincao, Z.; Zheng, K. Blades Icing Identification Model of Wind Turbines Based on SCADA Data. Renew. Energy 2020, 162, 575–586. [Google Scholar] [CrossRef]
Chang, Y.; Chen, J.; Qu, C.; Pan, T. Intelligent Fault Diagnosis of Wind Turbines via a Deep Learning Network Using Parallel Convolution Layers with Multi-Scale Kernels. Renew. Energy 2020, 153, 205–213. [Google Scholar] [CrossRef]
Pujol-Vazquez, G.; Acho, L.; Gibergans-Báguena, J. Fault Detection Algorithm for Wind Turbines’ Pitch Actuator Systems. Energies 2020, 13, 2861. [Google Scholar] [CrossRef]
Stetco, A.; Ramirez, J.M.; Mohammed, A.; Djurović, S.; Nenadic, G.; Keane, J. An End-to-End, Real-Time Solution for Condition Monitoring of Wind Turbine Generators. Energies 2020, 13, 4817. [Google Scholar] [CrossRef]
Zhang, S.; Lang, Z.-Q. SCADA-Data-Based Wind Turbine Fault Detection: A Dynamic Model Sensor Method. Control. Eng. Pract. 2020, 102, 104546. [Google Scholar] [CrossRef]
Chen, X.; Xu, W.; Liu, Y.; Islam, M.R. Bearing Corrosion Failure Diagnosis of Doubly Fed Induction Generator in Wind Turbines Based on Stator Current Analysis. IEEE Trans. Ind. Electron. 2020, 67, 3419–3430. [Google Scholar] [CrossRef]
Yang, X.; Zhang, Y.; Lv, W.; Wang, D. Image Recognition of Wind Turbine Blade Damage Based on a Deep Learning Model with Transfer Learning and an Ensemble Learning Classifier. Renew. Energy 2021, 163, 386–397. [Google Scholar] [CrossRef]

Figure 1. Step-by-step outline of the research proposed in this paper.

Figure 2. Methodology proposed in the paper. (a) Model development: historical Supervisory Control and Data Acquisition (SCADA) data from the wind turbines are used to train an Artificial Neural Network (ANN) model that represents the turbine in its “healthy state”. (b) Control phase: the trained model receives the new measurements as inputs and predicts the value of the output variable, which is then compared with the actual value recorded by the measurement system at the same time. The deviation between the two values (actual and estimated by the model) is evaluated statistically to assess the health state of the turbine.
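
The control phase in Figure 2(b) can be illustrated with a minimal sketch. It assumes the control limits are set at the mean ± k standard deviations of the residuals observed on healthy-state data; this section does not state how the paper derives its limits, so `k`, the function names, and the sample values are all hypothetical.

```python
from statistics import mean, stdev


def control_limits(healthy_residuals, k=3.0):
    """Return (lower, upper) limits as mean ± k·std of healthy-state residuals.

    The multiplier k is an assumption made for illustration only.
    """
    centre = mean(healthy_residuals)
    spread = stdev(healthy_residuals)  # sample standard deviation
    return centre - k * spread, centre + k * spread


def flag_deviations(actual, predicted, limits):
    """Return the indices where (actual - predicted) falls outside the limits."""
    lower, upper = limits
    return [
        i
        for i, (a, p) in enumerate(zip(actual, predicted))
        if not lower <= a - p <= upper
    ]


# Hypothetical residuals (kW) of the model on healthy training data.
healthy = [-2.0, 1.0, 0.0, 2.0, -1.0]
limits = control_limits(healthy, k=3.0)

# Hypothetical new measurements and the corresponding model predictions (kW).
actual = [500.0, 510.0, 480.0]
predicted = [501.0, 509.0, 500.0]
flagged = flag_deviations(actual, predicted, limits)
print(limits, flagged)
```

In this toy data only the third sample, with a residual of −20 kW, lies outside the limits; the first two residuals (−1 kW and +1 kW) stay inside.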

Figure 3. The relation between the output power of a wind turbine and wind speed.

Figure 4. Proposed fault detection methodology.

Figure 5. (a) SCADA data before filtering operations; (b) Data after removing points subject to power limitation; (c) Application of clustering on the training set; (d) Data cleaned and ready for the training stage (after the elimination of abnormal points identified through the evaluation of the Mahalanobis distance).
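
The outlier-removal step in Figure 5(d) can be sketched for the two-dimensional case. The example below is an illustration under stated assumptions, not the authors' implementation: it treats a single cluster of (wind speed, power) points, uses the sample covariance, and applies a hypothetical distance threshold; the function names are also invented for this sketch.

```python
from math import sqrt
from statistics import mean


def mahalanobis_2d(points):
    """Return the Mahalanobis distance of each (x, y) point from the sample mean."""
    n = len(points)
    mx = mean(x for x, _ in points)
    my = mean(y for _, y in points)
    # Entries of the 2x2 sample covariance matrix.
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Quadratic form d^T S^{-1} d, with the 2x2 inverse written out explicitly.
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        distances.append(sqrt(max(d2, 0.0)))
    return distances


def remove_outliers(points, threshold):
    """Keep only the points whose Mahalanobis distance does not exceed threshold."""
    distances = mahalanobis_2d(points)
    return [p for p, d in zip(points, distances) if d <= threshold]


# Hypothetical cluster of (wind speed in m/s, power in kW); the last point is abnormal.
cluster = [(4, 100), (5, 200), (6, 300), (7, 400), (8, 500), (6, 50)]
cleaned = remove_outliers(cluster, threshold=2.0)  # threshold is an assumption
print(cleaned)
```

With so few points the distances are bounded by the sample size, so the threshold here is chosen only to make the toy example work; on real SCADA data it would have to be tuned per cluster.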

Figure 6. Output variables chosen for the four models, shown for one of the analyzed turbines over the time interval used for the training phase. (a) Power Output (kW); (b) Gearbox Bearing Temperature; (c) Generator DE Bearing Temperature; (d) Generator Slip Ring Temperature.

Figure 7. Representation of the feed-forward neural network (FFNN) model developed: 4 neurons in the input layer, 28 neurons in the single hidden layer, and one neuron in the output layer.

Figure 8. Control chart of Power Output deviations using the ANN model.

Figure 9. Control chart of Power Output deviations using the non-linear regression model.

Figure 10. Control chart of Gearbox Bearing Temperature deviations (WT01).

Figure 11. Control chart of Generator Bearing Temperature deviations (WT01).

Figure 12. Control chart of Generator Bearing Temperature deviations (WT02).

Figure 13. Control chart of Generator Slip Ring Temperature deviations (WT02).

Table 1. Inputs and outputs used for the different models.

Component	Inputs	Output	Ref.
Wind turbine	Wind Speed; Ambient Temperature; Wind Direction; Wind Speed Standard Deviation	Power Output	[43]
Gearbox	Nacelle Temperature; Rotor Speed; Power Output; Ambient Temperature; Gearbox Oil Temperature	Gearbox Bearing Temperature	[40,45]
Generator	Nacelle Temperature; Power Output; Generator Speed; Generator Stator Temperature	Generator DE ¹ Bearing Temperature	[40]
Generator	Generator Speed; Power Output; Rotor Grid Inverter Temp. (Ph.1); Nacelle Temperature	Generator Slip Ring Temperature	[57]

¹ Drive End (DE).

Table 2. Main maintenance interventions on gearbox and generator in the investigated period.

Component	Turbine	Maintenance Work	Start	End
Gearbox	WT01	Gearbox repair	26 April 2016	30 April 2016
Gearbox	WT01	IMS ¹ bearings replacement	27 September 2016	28 September 2016
Gearbox	WT01	Gearbox repair	08 February 2017	11 February 2017
Generator	WT01	NDE ² and DE ³ bearings replacement	16 May 2016	19 May 2016
Generator	WT01	Generator replacement	16 August 2016	26 August 2016
Generator	WT02	NDE ² and DE ³ bearings replacement	11 May 2017	12 May 2017

¹ Intermediate Shaft (IMS); ² Non-Drive End (NDE); ³ Drive End (DE).

Table 3. Performance parameters comparison for the two models analyzed.

Model	RMSE ¹	MAE ²	MAPE ³	Control Limits
FFNN	36.08 kW	26.72 kW	2.78%	$\pm$ 46.54 kW
Non-linear Regression	65.54 kW	48.24 kW	4.67%	$\pm$ 79.74 kW

¹ Root Mean Squared Error (RMSE); ² Mean Absolute Error (MAE); ³ Mean Absolute Percentage Error (MAPE).
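
For reference, the three error measures in Table 3 can be computed as in the sketch below. It uses the standard textbook definitions; this section does not show the exact formulas the authors used, so details such as how MAPE handles near-zero values are assumptions, and the sample values are hypothetical.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error, in the units of the data."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error, in the units of the data."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error in percent; every actual value must be nonzero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical measured and predicted power output values (kW).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 380.0]
print(rmse(actual, predicted), mae(actual, predicted), mape(actual, predicted))
```

Because RMSE squares the errors before averaging, it weights large deviations more heavily than MAE, which is why the two values in Table 3 differ for the same model.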

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
