Use of Artificial Neural Networks and SCADA Data for Early Detection of Wind Turbine Gearbox Failures

Puruncajas, Bryan; Castellani, Francesco; Vidal, Yolanda; Tutivén, Christian

doi:10.3390/machines13080746

by Bryan Puruncajas ¹,², Francesco Castellani ³,*, Yolanda Vidal ⁴ and Christian Tutivén ¹,⁵,*

¹ Facultad de Ingeniería en Mecánica y Ciencias de la Producción, Escuela Superior Politécnica del Litoral, ESPOL, Campus Gustavo Galindo, Km. 30.5 Vía Perimetral, Guayaquil 090902, Ecuador

² Faculty of Industrial Engineering, Universidad de Guayaquil, Av. Dr. Gómez Lince y Av. Juan Tanca Marengo, Guayaquil 090150, Ecuador

³ Department of Engineering, University of Perugia, Via G. Duranti 93, 06125 Perugia, Italy

⁴ Control, Data, and Artificial Intelligence, Department of Mathematics, Escola d’Enginyeria de Barcelona Est, Universitat Politècnica de Catalunya, Campus Diagonal-Besós (CDB), 08019 Barcelona, Spain

⁵ Centro de Energías Renovables y Alternativas, Escuela Superior Politécnica del Litoral, ESPOL, Campus Gustavo Galindo, Km. 30.5 Vía Perimetral, Guayaquil 090902, Ecuador

* Authors to whom correspondence should be addressed.

† This paper is an extended version of our paper published in 2024 at the 5th International Conference of IFToMM ITALY Turin, Italy, 11–13 September 2024.

Machines 2025, 13(8), 746; https://doi.org/10.3390/machines13080746

Submission received: 30 June 2025 / Revised: 16 August 2025 / Accepted: 18 August 2025 / Published: 20 August 2025

(This article belongs to the Special Issue Advances in Mechanism and Machine Science Within the IFIT 2024 Conference)


Abstract

This paper investigates the use of artificial neural networks (ANNs) for the proactive identification of gearbox failures in wind turbines, leveraging operational SCADA data for predictive analysis. Avoiding gearbox failures, which can strongly impact the functioning of wind turbines, is crucial for ensuring high reliability and efficiency within wind farms. Early detection can be achieved through the development of a normal behavior model based on ANNs, which are trained with data from healthy conditions derived from selected SCADA variables that are closely associated with gearbox operations. The objective of this model is to forecast the gear bearing temperature, so that deviations from the forecast serve as an early warning alert for potential failures. The research employs extensive SCADA data collected from January 2018 to February 2022 from a wind farm with multiple turbines. The study supports the robustness of the model through a thorough data cleaning process, normalization, and splitting into training, validation, and testing sets. The findings reveal that the model is able to effectively identify anomalies in gear bearing temperatures several months prior to failure, outperforming simple data processing methods, thereby offering a significant lead time for maintenance actions. This early detection capability is highlighted by a case study involving a gearbox failure in one of the turbines, where the proposed ANN model detected the issue months ahead of the actual failure. The present paper is an extended version of the work presented at the 5th International Conference of IFToMM ITALY 2024.

Keywords: SCADA; gearbox failure; normal behavior model; artificial neural networks

1. Introduction

The reliability of wind farm operations has become a crucial aspect in boosting the energy transition. Consequently, there has been a growing interest from both academia and the industry in the complex task of early fault detection in wind turbines (WTs). The capacity to identify damage in these non-stationary machines can significantly decrease downtime and optimize maintenance programs; these objectives are essential for ultimately reducing the Levelized Cost Of Energy (LCOE). Researchers are employing a range of methodologies and data sources to accomplish this aim, always considering that the time scale of the data may vary significantly based on the specific damaged component. Gearboxes, which transfer mechanical power from the rotor to the generator, are among the components most prone to failure [1]. Effective condition monitoring (CM) is essential for the early detection of faults and optimal maintenance planning. This can be achieved using different experimental data and approaches.

The most commonly used methods are the following:

1. Signal processing (or sensor-based) methods;
2. Physical model approaches;
3. SCADA data-driven techniques.

The first method is generally considered the most effective as it is based on specific measurements, such as vibrations [2], acoustic emissions [3], or electrical signals [4]. Unfortunately, such approaches are costly and difficult to operate, as a large number of sensors are needed, and the data acquisition can be challenging as different time scales must be applied to each component.

Physical models can also be very helpful [5] but their application is often hampered by the lack of specific technical data.

Ultimately, utilizing SCADA data-driven techniques is the most flexible approach, as they can be applied to a wide range of components. The main advantage of these methods is that they can be used by any wind turbine (WT) operator without the need to install new sensors, as they simply use the standard SCADA database.

Condition monitoring generally includes the prognosis and estimation of the degradation of key components [6]; such activities can be very challenging, especially when applied to real-world cases, and can also call for additional measurements [7]. In practice, it can be very difficult to estimate the real "end of life" of components due to the non-stationarity of the machine operation [8]. When working with SCADA, there is also the problem of data imbalance [9], which can successfully be addressed using machine learning techniques [10].

Analyzing a group of WTs rather than a single unit can be very helpful, particularly for monitoring mid-to-long-term performance. This method can help detect systematic errors [11] as well as mechanical faults [12]. Among all the SCADA data parameters, temperature signals are considered the most important for condition monitoring approaches [13]. The first symptom of an incipient fault is an increase in the temperature of the specific component. However, using temperature measurements from SCADA can also present challenges due to the following issues:

Temperature has a very slow dynamic, meaning faults can only be discovered by analyzing trends over the medium term;
Temperature drifts can be affected by other components of the machine and may be subject to unstable operation;
Seasonality should also be considered, as the temperature of different components may be affected by changes induced by climatology.

Notwithstanding these critical points, temperature analysis is a cost-effective and universally applicable solution for WT condition monitoring and early fault diagnosis [14].

For long-term monitoring, SCADA data generally offer an extensive continuous record (spanning years) of the operational parameters of the WT averaged over a 10 min interval. The literature provides a robust foundation for analyzing fault diagnosis using SCADA data, with a multitude of approaches documented. One of the most used approaches is normal behavior modeling [15]. Engaging with SCADA data frequently entails managing high-dimensional data series [16]. Thus, it is common practice to implement dimension reduction (DR) or feature selection techniques, as demonstrated in [17], where diffusion map features based on geodesic distance are integrated with a DR methodology. Additionally, in [18], DR is effectively utilized to enhance a fault diagnosis method for WT generators based on support vector regression (SVR). It is important to highlight that SCADA datasets [19,20,21], due to their extensive history and low temporal resolution, can be highly beneficial for monitoring long-term variations. A pertinent example is the observation of the aging process of a wind farm detailed in [22], where data were gathered from a wind turbine following 13 years of operation and subsequently analyzed through a support vector regression method. The findings indicated that the machine had suffered a 5% reduction in production by that point.

A robust method for fault diagnosis utilizing SCADA data involves the application of artificial neural networks (ANNs), as presented in [23]. This methodology generally revolves around constructing a data-driven model that represents normal behavior [24], which serves as a benchmark for detecting anomalies and can be employed for any parameter linked to the compromised component. As mentioned previously, temperatures are often regarded as the most critical of the numerous parameters stored for fault diagnosis, since a damaged component typically begins to generate excess heat. Ref. [25] presents a significant study on WT fault diagnosis based on temperature data analysis. In this work, an ANN was trained to model the normal behavior of gearbox oil temperature, gearbox bearing temperature, and generator winding temperature. The input variables used for training include produced power, ambient temperature, and the target variable itself at one and two previous time steps.

An advanced methodology utilizing convolutional neural networks (CNN) is presented in [26], where a scalable CNN framework is used to analyze high-dimensional raw condition monitoring data for automatic identification of various electromechanical faults in WTs.

In [27], a hybrid approach combining a CNN and a long short-term memory (LSTM) network is utilized to develop a model representing the normal behavior of internal temperatures in WTs. The trend and extent of variation in the root mean square error (RMSE) are leveraged to establish the alarm threshold for fault detection.

Convolutional autoencoders have also proven effective for early fault detection; for instance, in [28], the prediction of a main bearing fault was made several months in advance through the application of a neural network designed with a convolutional autoencoder and SCADA data.

The implementation of normal behavior models in condition monitoring typically involves trend analysis of residuals; however, this task can be complex due to the non-stationary nature of machine operations. The issue is addressed in [29] through the use of change point detection for various SCADA parameters chosen for monitoring the primary components.

The research presented in [30] analyzes the selection of thresholds and training durations to model the normal behavior of temperatures in different subcomponents. Following feature selection, a regressive random forest model is constructed and the distribution of temperature error residuals is used to determine alarm thresholds.

Physical thermal modeling has also demonstrated its effectiveness in fault diagnosis, both independently and in combination with machine learning, as discussed in [5].

The integration of neural networks for condition monitoring can be effectively combined with other methodologies; for example, in [31], ANNs are employed as classifiers for the condition monitoring of mechanical systems by evaluating the outcomes of various signal processing techniques.

This study explores the application of ANNs for the early detection of faults utilizing actual operational SCADA data [32]. The effectiveness of the selected methodology was validated through data derived from a failure event involving a planetary bearing within the gearbox of a multi-megawatt wind turbine. The organization of the manuscript is as follows:

Section 2 outlines the test case being examined;
Section 3 provides a description of the approach, while Section 4 offers a brief introduction to the dataset utilized for the analysis;
Section 5 introduces some results from standard post-processing;
Section 6 elaborates on the ANN model, and Section 7 details the processing involved in error detection;
Finally, Section 8 discusses the results, and Section 9 presents the conclusions drawn from the study.

2. Test Case

The current study examines a gearbox failure in a large horizontal-axis multi-MW WT which is part of a wind farm consisting of nine turbines.

The characteristics of the turbine are detailed in Table 1.

All WTs are equipped with a SCADA system, while the vibration condition monitoring system (CMS) for gearbox monitoring is optional. The turbine under investigation in this work does not have a vibration CMS. The scheme of the gearbox is detailed in Figure 1. In the test case under investigation, the gearbox condition is typically assessed through an annual offline oil analysis.

In this test case, the oil analysis was only able to issue an alarm at a significantly delayed time. This was due to the inadequate time resolution (oil samples are collected once a year) and the delay between the sampling and the reporting of the analysis results. The fault in the planetary bearing was ultimately identified through an on-site invasive inspection. Following the conclusive diagnosis, the machine was taken offline for maintenance. It was estimated that the total energy loss during the downtime amounted to approximately 13% of the WT’s total annual output. The fault detection approach developed in this work is based solely on the use of SCADA data. This is a significant advantage because SCADA data are standard in multi-MW WTs, meaning that the method can be widely applied. Nevertheless, the coarse time resolution of SCADA data represents a significant drawback, which may limit some specific applications. In the analysis, only eight of the nine operational WTs on the farm were considered, as one WT was affected by an electrical fault during the period under investigation. All the WTs had been operating for six years before the gearbox fault, with the same settings and the same grid connection solution.

The SCADA system reliably collects statistics on a 10 min time scale (min, max, avg…), ensuring continuous monitoring of a large number of machine parameters, making it ideal for mid-to-long-term monitoring.

The main parameters that are included in the standard SCADA database are shown in Figure 2 and can be grouped into environmental, machine, and electrical parameters. Depending on the WT’s technology, other channels can be added to the database. However, the temperatures of the main mechanical and electrical components are always included and can be efficiently used for monitoring purposes, as in the present study.

3. Fault Detection Approach

The proposed fault detection approach is based on the construction of a normal behavior model using an ANN, with the objective of enabling the early identification of gearbox failures. The overall workflow is illustrated in Figure 3, which comprises three key stages: (1) preprocessing of SCADA data; (2) prediction of the gear bearing temperature using the ANN model; and (3) residual analysis and anomaly detection.

(1) Input preprocessing: The process begins with the selection of input variables from the SCADA dataset. The primary selection criteria are the physical proximity and diagnostic relevance of each sensor to the gearbox, the mechanical subsystem affected by the failure.

Once the set of candidate variables is identified based on domain knowledge, the Pearson correlation coefficient is used to quantify the synchronous linear association between each variable and the target signal (gear bearing temperature). This allows for the prioritization of variables that are not only physically relevant but also statistically aligned with the fault-sensitive signal under normal operating conditions.
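The correlation-based ranking described above can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' code: the column names (`gear_bearing_temp`, `grid_power`, `unrelated_noise`) and the helper `rank_by_pearson` are hypothetical, and ranking by the absolute value of the coefficient is an assumption.

```python
import numpy as np
import pandas as pd


def rank_by_pearson(df: pd.DataFrame, target: str) -> pd.Series:
    """Rank candidate columns by |Pearson correlation| with the target column."""
    corr = df.corr(method="pearson")[target].drop(target)
    # Assumption: strong negative correlations are as informative as positive ones.
    return corr.abs().sort_values(ascending=False)


# Synthetic stand-in for healthy SCADA records (hypothetical column names).
rng = np.random.default_rng(0)
n = 1000
power = rng.uniform(0, 2000, n)
demo = pd.DataFrame(
    {
        "gear_bearing_temp": 40 + 0.01 * power + rng.normal(0, 1, n),
        "grid_power": power,
        "unrelated_noise": rng.normal(0, 1, n),
    }
)

ranking = rank_by_pearson(demo, "gear_bearing_temp")
print(ranking)
```

In practice the ranking would be computed only on candidates that already passed the physical-relevance screening, and only on data from normal operation.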

After variable selection, the data are cleaned by identifying and removing outliers and imputing missing values to ensure data completeness. All input variables are normalized using Min–Max scaling to standardize input ranges, which facilitates stable and efficient convergence during ANN training. The resulting dataset is chronologically divided into three subsets: training, validation, and testing. Only healthy operational data are used for training and validation to ensure the model learns the baseline behavior of the system under fault-free conditions.

(2) Model prediction: The ANN is trained to forecast the gear bearing temperature based on historical input data. This training setup enables the model to learn temporal patterns and dynamic correlations inherent in the system’s normal operation. Once validated, the model can be deployed in real time to generate expected temperature values under healthy conditions, serving as a reference baseline for anomaly detection.

(3) Residual analysis and anomaly detection: Once the model has been trained and validated, it is deployed to predict the gear bearing temperature using real-time SCADA data, which may correspond to either healthy or degraded operational states. The deviation between the predicted and measured values—referred to as the residual—is computed as their absolute difference and serves as the basis for anomaly detection. To reduce the impact of transient fluctuations and enhance detection reliability, the residuals are smoothed using a simple moving average (SMA) over a one-week window. A statistical threshold is then defined based on the mean and standard deviation of the residuals observed during the training phase. When the smoothed residual exceeds this threshold, an alert is triggered, indicating a potential thermal anomaly that may be associated with an incipient gearbox fault. The exact formulation of the residual and threshold is provided in Section 7.

This study adopts a semi-supervised, regression-based strategy, which is particularly well suited for wind farm applications where labeled fault data are limited or unavailable. Since the SCADA database contains only healthy operational data, supervised classification approaches—which require labeled examples of both normal and faulty conditions—are not feasible. Instead, the proposed methodology constructs a model of normal operational behavior using regression techniques, allowing for the detection of anomalies based on deviations from expected values. This approach eliminates the need for manual data labeling, increases scalability, and facilitates the deployment of WT-specific models tailored to the unique dynamics of each unit.

It is important to note that, although classification algorithms such as support vector machines, random forests, or deep classifiers (e.g., CNNs or LSTM) have been explored in other contexts, their effectiveness strongly depends on the availability of fault-labeled data—something rarely available on operational wind farms. In this case study, no prior labeled failure events were available for most WTs. Consequently, a regression-based anomaly detection approach was the most viable solution.

Nonetheless, the integration of classification algorithms could be beneficial in hybrid frameworks where anomaly detection is followed by classification stages to identify the type or severity of faults. This avenue is considered promising and will be explored in future work to enhance model generalizability and diagnostic specificity. The method demonstrated its practical relevance and robustness by detecting an incipient gearbox failure nine months before the actual event.

4. Data Preprocessing

The SCADA data cover the period from 1 January 2018 to 28 February 2022, and offer extensive insights into the operation of the WTs, encompassing environmental, electrical, and control data, among various other parameters. These data are collected at a frequency of 1 Hz but are recorded in 10 min intervals which include the mean, maximum, minimum, and standard deviation for most variables derived from the SCADA data.

Moreover, maintenance and repair records are essential as they provide information regarding the nature of the failure, the start and end dates of work orders, the subsystems involved, and the actions undertaken. For example, it was noted through the work orders that WT7 encountered a gearbox failure on 23 February 2022.

4.1. Data Selection

The analysis focused on SCADA variables associated with measurements from the lubrication system, thermal monitoring, and structural elements located near the gearbox. These signals are widely recognized as effective indicators for early fault detection, particularly in mechanical systems where overheating or changes in fluid dynamics often precede structural failures [33]. Table 2 presents the variables that exhibited the highest Pearson’s correlation coefficients in relation to the target signal, along with their corresponding operational ranges. Pearson’s coefficient is the ratio of the covariance of two variables to the product of their standard deviations; it therefore serves as a normalized measure of covariance, with values ranging from −1 to 1 [34]. This selection includes thermal variables, oil pressure differentials, and other operational parameters deemed diagnostically relevant.

The variable “Gear Bearing Temperature” was selected as the primary reference signal due to its physical proximity to the damaged component and its high sensitivity to thermal anomalies. The dataset also includes a second parameter, “Gear Bearing Temperature B,” which corresponds to the temperature reading at the bearing located on the opposite side of the gearbox. These two parameters provide measurements from bearings situated at either end of the gearbox. However, in this study, “Gear Bearing Temperature” was prioritized because it is located nearest to the area where the failure occurred, making it more representative of the local thermal behavior and thus more suitable for detecting incipient mechanical anomalies.

In addition to variables directly related to temperature and pressure, other operational parameters were incorporated due to their indirect influence on the thermal and hydraulic behavior of the system. These included rotor speed (Rotor RPM), ambient wind speed, grid production power, and blade pitch angle. Such parameters reflect the operational load and dynamic behavior of the WT, which, in turn, affect the mechanical stress and thermal conditions within the gearbox, thus justifying their inclusion in the model. The selected variables were prioritized not only based on their statistical correlation with the target signal, but also because of their spatial proximity to the gearbox.

In this study, the feature selection stage focused exclusively on thermal, lubrication system, and mechanical load-related parameters, as these are known to exhibit the earliest and most direct sensitivity to gearbox degradation. Electrical variables from the generator, such as voltage, current, or frequency, were not included in the present analysis, since the primary objective was to model the thermal behavior of the gearbox under healthy operating conditions. Nonetheless, the integration of electrical SCADA parameters remains a relevant avenue for future work, as these signals could complement the current model by providing additional indicators of mechanical or electromechanical faults.

4.2. Data Cleaning and Data Imputation

Data cleaning serves as an essential initial phase in any analytical process, as it removes extraneous data that could skew the findings. When dealing with real-world datasets, it is common to encounter outliers and missing values. In this study, the identification of outliers was based on the established operational ranges for the selected variables, as shown in Table 2. However, these outliers were not systematically discarded, since they may contain valuable information for early fault detection. Instead, values falling outside the defined limits were initially marked as missing and subsequently imputed using the same methodology applied to the originally missing values.

Data imputation was performed using Piecewise Cubic Hermite Interpolating Polynomial (pchip). This polynomial technique is well suited for handling data points with both values and specified slopes at interpolation nodes. Its key advantages lie in its ability to preserve the overall structure of the data, ensure continuity of at least the first derivative, and maintain monotonicity [35]. In cases where missing values occur at the beginning or end of the dataset, the closest available preceding or succeeding value is used, respectively. Figure 4 illustrates the imputation process applied to outlier values detected in the grid production power variable.
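A minimal sketch of this cleaning-plus-imputation step is given below, assuming SciPy's `PchipInterpolator` as the pchip implementation (the source does not name a library). The function name `impute_pchip`, the range limits, and the sample values are hypothetical.

```python
import numpy as np
from scipy.interpolate import PchipInterpolator


def impute_pchip(values, lower, upper):
    """Mark out-of-range samples as missing, then fill all gaps with pchip.

    Gaps at the start or end of the series are filled with the nearest
    valid value, as described in the text.
    """
    y = np.asarray(values, dtype=float).copy()
    with np.errstate(invalid="ignore"):
        y[(y < lower) | (y > upper)] = np.nan  # outliers become missing values
    valid = ~np.isnan(y)
    if valid.sum() < 2:
        raise ValueError("need at least two valid samples to interpolate")
    idx = np.arange(y.size)
    # extrapolate=False leaves leading/trailing gaps as NaN for the edge rule below.
    interp = PchipInterpolator(idx[valid], y[valid], extrapolate=False)
    filled = y.copy()
    filled[~valid] = interp(idx[~valid])
    first, last = idx[valid][0], idx[valid][-1]
    filled[:first] = y[first]      # leading gap: closest succeeding value
    filled[last + 1:] = y[last]    # trailing gap: closest preceding value
    return filled


# Hypothetical power samples in kW with one outlier and missing edge values.
raw = [np.nan, 10.0, 20.0, 5000.0, 40.0, 50.0, np.nan]
clean = impute_pchip(raw, lower=0.0, upper=100.0)
print(clean)
```

Because pchip preserves monotonicity, the filled value stays between its valid neighbours instead of overshooting, which is the property the text highlights.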

4.3. Data Split and Data Normalization

For the development of the ANN, the dataset was partitioned into three subsets (training, validation, and testing) while preserving chronological order [36]. In the proposed methodology, only data corresponding to normal operating conditions were used during the training and validation phases, since the main aim was to model the system’s normal behavior. Because seasonality can affect temperature variability and turbine operation, the training and validation sets together spanned 18 months, ensuring coverage of at least one full annual cycle of seasonal trends and operational variability. After training, the model generalization performance was evaluated using an independent testing dataset which included the operational periods during which gearbox failure was documented in turbine WT7. The SCADA dataset was divided as follows:

Training: 1 January 2018 to 7 May 2019;
Validation: 8 May 2019 to 30 June 2019;
Testing: 1 July 2019 to 28 February 2022;
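The split listed above can be reproduced with date-based slicing. The sketch below assumes the records sit in a pandas `DataFrame` with a 10 min `DatetimeIndex`; the column name `gear_bearing_temp` and the helper `chronological_split` are hypothetical.

```python
import numpy as np
import pandas as pd

# Date ranges taken from the split listed above (both ends inclusive).
SPLITS = {
    "train": ("2018-01-01", "2019-05-07"),
    "val": ("2019-05-08", "2019-06-30"),
    "test": ("2019-07-01", "2022-02-28"),
}


def chronological_split(df: pd.DataFrame) -> dict:
    """Slice a time-indexed frame into train/val/test without shuffling."""
    df = df.sort_index()
    # Partial-string slicing on a DatetimeIndex includes the whole end day.
    return {name: df.loc[start:end] for name, (start, end) in SPLITS.items()}


# Synthetic 10 min index covering the full study period.
index = pd.date_range("2018-01-01 00:00", "2022-02-28 23:50", freq="10min")
data = pd.DataFrame({"gear_bearing_temp": np.zeros(len(index))}, index=index)

parts = chronological_split(data)
for name, part in parts.items():
    print(name, part.index[0], part.index[-1], len(part))
```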

The selected variables exhibited distinct physical units and numerical ranges, resulting in heterogeneous magnitudes. Although Min–Max normalization is a widely known preprocessing technique, it was explicitly applied in this study to prevent feature dominance during training and to facilitate stable convergence of the learning algorithm [37]. This step is particularly relevant in multivariate regression tasks involving SCADA data, where input variables can span different physical domains (e.g., temperature, pressure, power).

The normalization parameters were computed exclusively from the training set to preserve consistency and avoid information leakage. The inclusion of this normalization step is also critical to allow reproducibility of the methodology, especially in industrial applications where model transferability and robustness are desired.
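To illustrate why the scaling parameters come only from the training set, the sketch below fits a Min–Max scaler on training rows and reuses it on later data. The class `TrainMinMaxScaler` and the numeric values are hypothetical; scaled test values may fall outside [0, 1], which is expected when later data exceed the training range.

```python
import numpy as np


class TrainMinMaxScaler:
    """Min-Max scaling whose parameters are learned from training data only."""

    def fit(self, x_train):
        x_train = np.asarray(x_train, dtype=float)
        self.min_ = x_train.min(axis=0)
        span = x_train.max(axis=0) - self.min_
        # Guard against constant columns to avoid division by zero.
        self.span_ = np.where(span == 0.0, 1.0, span)
        return self

    def transform(self, x):
        return (np.asarray(x, dtype=float) - self.min_) / self.span_


# Two hypothetical features (e.g., a temperature and a power signal).
x_train = np.array([[0.0, 10.0], [10.0, 30.0]])
x_test = np.array([[20.0, 20.0]])

scaler = TrainMinMaxScaler().fit(x_train)
train_scaled = scaler.transform(x_train)
test_scaled = scaler.transform(x_test)
print(train_scaled)
print(test_scaled)
```

Fitting the scaler on the full dataset instead would let the test period's extremes influence the training inputs, which is the information leakage the text warns about.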

Figure 5 illustrates the effect of normalization applied to the grid production power variable.

5. Detection with SCADA Processing

SCADA data analysis is a fundamental activity for checking the operation of wind farms and planning maintenance interventions. Much of the literature emphasizes the importance of data analysis for fault detection (see, for example, references [38,39]) or for monitoring performance and systematic errors (with a particular focus on yaw issues, as discussed in references [40,41,42]).

The turbine analyzed in the present study suffered a severe gearbox fault, which was ultimately clearly visible in the SCADA data. In this case, timely detection is also crucial, as it allows the maintenance crew enough time to organize an important repair intervention.

This section presents results from the standard data post-processing approach, which show that, without more advanced methods, detection would only have been possible around three months before the unit stopped operating.

In principle, a fault can be detected through temperature analysis, since a damaged component often produces heat at an increasing rate as the failure approaches. Figure 6 shows the trend over time of the differential gear bearing temperature for the faulty unit and three healthy turbines, calculated relative to the temperature of a healthy reference unit. The data were resampled on a daily basis to create a smoother averaged plot. Although this temperature sensor was closest to the damaged component, the plot does not clearly indicate any anomaly. Due to the specific installation conditions of the temperature sensors, the trends are offset by a few degrees. From the plot, it is not possible to distinguish a specific trend that would justify raising an alarm for the faulty unit, and setting an alarm threshold that excludes false alarms is not feasible.
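The differential-temperature trend discussed here can be computed along the following lines. This is a sketch on synthetic data; the column names `WT_ref` and `WT7` and the helper `daily_differential` are hypothetical, and daily averaging is taken from the text.

```python
import numpy as np
import pandas as pd


def daily_differential(temps: pd.DataFrame, reference: str) -> pd.DataFrame:
    """Subtract the reference turbine's temperature and average per day."""
    diff = temps.drop(columns=reference).sub(temps[reference], axis=0)
    return diff.resample("1D").mean()


# Three days of synthetic 10 min gear bearing temperatures.
index = pd.date_range("2021-01-01", periods=3 * 144, freq="10min")
ref = 50.0 + 5.0 * np.sin(np.linspace(0.0, 6.0 * np.pi, len(index)))
temps = pd.DataFrame({"WT_ref": ref, "WT7": ref + 2.0}, index=index)

trend = daily_differential(temps, reference="WT_ref")
print(trend)
```

Subtracting a reference unit removes the load and ambient effects shared by turbines on the same site, but, as noted above, a constant sensor offset of a few degrees remains in the result.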

Processing the oil pressure data at different points in the lubrication circuit can produce a more valuable result. For instance, a fault in the gearbox will produce metal debris that accumulates in the oil filter. This results in higher hydraulic resistance for the oil filter, which consequently raises the pressure drop across the filtering unit.

The results of this analysis are shown in Figure 7, where the pressure drop trend of the faulty unit is compared with that of the other two healthy turbines.

Unlike the temperature plot, the pressure drop plot appears to indicate the fault, but it is unclear when the upward trend exactly began. In any case, the fault appears to be clearly visible around three months before the machine finally stops. In this kind of plot, a threshold can be useful for triggering an alarm, but it should be defined carefully to avoid false alarms.

Analyzing the receiver operating characteristic (ROC) curve shown in Figure 8, it can be seen that the balance between true and false positives only becomes favorable three months before the failure.
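For readers unfamiliar with ROC analysis, the sketch below shows how true and false positive rates can be obtained by sweeping an alarm threshold over an indicator such as the filter pressure drop. How the source labels samples as "pre-failure" is not stated, so the labels and scores here are purely illustrative, and the helpers `roc_points` and `auc` are hypothetical.

```python
import numpy as np


def roc_points(scores, labels):
    """Return (fpr, tpr) arrays obtained by sweeping a threshold over scores."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    thresholds = np.unique(scores)[::-1]  # from strictest to loosest
    n_pos = labels.sum()
    n_neg = (~labels).sum()
    fpr, tpr = [0.0], [0.0]
    for thr in thresholds:
        flagged = scores >= thr
        tpr.append((flagged & labels).sum() / n_pos)
        fpr.append((flagged & ~labels).sum() / n_neg)
    return np.array(fpr), np.array(tpr)


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))


# Illustrative indicator values; True marks samples assumed to precede the failure.
scores = [0.1, 0.2, 0.8, 0.9]
labels = [False, False, True, True]
fpr, tpr = roc_points(scores, labels)
area = auc(fpr, tpr)
print(fpr, tpr, area)
```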

These results suggest that standard SCADA data processing techniques are not sufficient for achieving efficient early fault diagnosis, and that more complex, data-driven approaches should be developed instead.

6. The Neural Network Model

An ANN is a computational model composed of interconnected nodes organized into three main layers: input, hidden, and output. Within this framework, unsupervised learning constitutes one of the training paradigms used to analyze unlabeled data. This approach enables the system to identify and represent underlying patterns in the input signals, thereby revealing latent structures within the dataset [43]. The present study applied this methodology to the early detection of gearbox failures by analyzing unlabeled operational data.

The ANN architecture proposed in this work was built upon the 11 selected variables listed in Table 2. The model’s output corresponds to the gear bearing temperature at time $t$, while the inputs are defined as the remaining ten variables, each measured at time $t - 1$.

The ANN configuration was experimentally determined in order to balance model complexity with prediction accuracy by minimizing the mean squared error (MSE) on the validation set. Several experiments were conducted to evaluate different configurations in terms of hidden layer size and neuron count. The optimal architecture was identified based on performance metrics such as the MSE and computational efficiency. The selected design includes two hidden layers: the first layer consists of 150 neurons, allowing for a rich representation of input features to capture complex patterns; the second layer, reduced to 30 neurons, consolidates these representations into more abstract forms, thereby improving generalization to new data and mitigating overfitting risks.
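The layer sizes above (10 inputs, hidden layers of 150 and 30 neurons, one output) can be made concrete with a small NumPy forward pass. This sketch is not the authors' implementation: the ReLU activation, the linear output, and the He-style initialization are assumptions, since the text here does not specify them.

```python
import numpy as np

LAYER_SIZES = [10, 150, 30, 1]  # inputs, hidden 1, hidden 2, output (from the text)


def init_params(sizes, seed=0):
    """Create (weights, bias) pairs for each dense layer."""
    rng = np.random.default_rng(seed)
    return [
        (rng.normal(0.0, np.sqrt(2.0 / n_in), (n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def forward(params, x):
    """Forward pass: ReLU on hidden layers (assumed), linear output."""
    h = np.asarray(x, dtype=float)
    for weights, bias in params[:-1]:
        h = np.maximum(h @ weights + bias, 0.0)
    weights, bias = params[-1]
    return h @ weights + bias


def count_params(params):
    return sum(w.size + b.size for w, b in params)


params = init_params(LAYER_SIZES)
batch = np.zeros((5, 10))       # five samples of ten normalized inputs
prediction = forward(params, batch)
n_trainable = count_params(params)
print(prediction.shape, n_trainable)
```

With these sizes the network has 10·150 + 150 + 150·30 + 30 + 30·1 + 1 = 6211 trainable parameters, which is small relative to the amount of 10 min training data available.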

Figure 9 illustrates the final structure of the network. Although the overall methodology is unsupervised—since only healthy operational data were used for training and no explicit classification between normal and faulty states was provided—the ANN was trained using supervised learning principles, as it was tasked with predicting a specific target variable from historical input data.
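The supervised input/target pairs (ten variables at the previous time step, gear bearing temperature at the current one) can be assembled by shifting the feature columns. The sketch below uses hypothetical column names and a hypothetical helper `make_lagged_pairs`; it assumes evenly spaced, gap-free records.

```python
import pandas as pd


def make_lagged_pairs(df: pd.DataFrame, target: str, lag: int = 1):
    """Pair features observed `lag` steps earlier with the current target."""
    features = df.drop(columns=target).shift(lag)
    # The first `lag` rows have no earlier observation and are dropped.
    return features.iloc[lag:], df[target].iloc[lag:]


demo = pd.DataFrame(
    {
        "rotor_rpm": [1.0, 2.0, 3.0, 4.0],
        "gear_bearing_temp": [10.0, 20.0, 30.0, 40.0],
    }
)
x_pairs, y_pairs = make_lagged_pairs(demo, target="gear_bearing_temp")
print(x_pairs)
print(y_pairs)
```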

Model training was conducted on a laptop running Windows 11 equipped with 12 GB of RAM and a dedicated GPU with 6 GB of video memory. The Adam optimization algorithm was employed with a learning rate $\alpha_0 = 0.0001$, exponential decay rates $\beta_1 = 0.9$ and $\beta_2 = 0.992$, and $\varepsilon = 10^{-7}$. A mini-batch size of 64 samples was used per iteration, and training was performed over 600 epochs.
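For readers unfamiliar with how these hyperparameters enter the update, the following sketch implements one Adam step in its textbook form with the values quoted above. It is illustrative only: the text does not name the framework used, and library implementations may place $\varepsilon$ slightly differently. The function name `adam_step` and the toy gradient are hypothetical.

```python
import numpy as np

ALPHA, BETA1, BETA2, EPS = 1e-4, 0.9, 0.992, 1e-7  # values quoted in the text

def adam_step(theta, grad, m, v, t):
    """One Adam update (textbook form with bias correction); t starts at 1."""
    m = BETA1 * m + (1.0 - BETA1) * grad          # first-moment estimate
    v = BETA2 * v + (1.0 - BETA2) * grad ** 2     # second-moment estimate
    m_hat = m / (1.0 - BETA1 ** t)                # bias-corrected moments
    v_hat = v / (1.0 - BETA2 ** t)
    theta = theta - ALPHA * m_hat / (np.sqrt(v_hat) + EPS)
    return theta, m, v

theta0 = np.array([1.0, -1.0])
grad = np.array([2.0, -0.5])  # synthetic gradient
theta1, m1, v1 = adam_step(theta0, grad, np.zeros(2), np.zeros(2), t=1)
print(theta1 - theta0)  # each entry is close to -ALPHA * sign(grad)
```

On the first step the bias-corrected ratio reduces to roughly the sign of the gradient, so each parameter moves by about the learning rate regardless of the gradient's magnitude.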

7. The Metrics for Fault Detection

The detection of gearbox faults is based on the formulation of a fault indicator (FI), constructed from the discrepancy between the actual measurements of a monitored physical variable, $y_t$, and their estimated values, $\hat{y}_t$. In this study, the residual is defined as the absolute difference $|y_t - \hat{y}_t|$, which serves as the basis for identifying deviations from the expected behavior.

However, due to the inherent uncertainty of the model and the presence of short-term fluctuations, using the raw residual directly as a trigger criterion tends to generate a high rate of false positives. To mitigate this noise sensitivity, the residual is smoothed using an SMA calculated over a window of 1008 points, equivalent to one week of operational data. This technique effectively eliminates fast and irrelevant fluctuations while preserving medium- and long-term variations, which are typically associated with the progressive degradation of gearbox components [44]. Consequently, the smoothed FI becomes a more stable and reliable signal for identifying consistent anomalous patterns.
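The residual and its smoothing can be sketched as below. The window of 1008 points comes from the text; reading it as 7 days × 24 h × 6 samples per hour (10-minute sampling) is an inference from the stated one-week equivalence. The function name is hypothetical and the data are synthetic.

```python
import numpy as np

WINDOW = 1008  # one week of data; 7 * 24 * 6 if samples are 10 minutes apart (assumed)

def smoothed_fault_indicator(y_true, y_pred, window=WINDOW):
    """Absolute residual |y_t - y_hat_t| followed by a simple moving average."""
    residual = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    if residual.size < window:
        raise ValueError("need at least `window` samples")
    kernel = np.ones(window) / window
    # 'valid' keeps only full windows; output length is len(residual) - window + 1
    return np.convolve(residual, kernel, mode="valid")

# Tiny check with a window of 2: residuals are [0, 1, 2, 3].
demo = smoothed_fault_indicator([1, 2, 3, 4], [1, 1, 1, 1], window=2)
print(demo)  # [0.5 1.5 2.5]

# A single large spike is strongly attenuated by the one-week window.
y_true = np.zeros(3000)
y_true[1500] = 1008.0
spike_fi = smoothed_fault_indicator(y_true, np.zeros(3000))
print(spike_fi.max())  # close to 1.0, i.e. the spike divided by the window length
```

Each output value is the mean of one full window, so in a trailing-window reading the value at output index `i` belongs to the timestamp of input sample `i + window - 1`.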

The alarm condition is triggered when the smoothed FI exceeds a threshold determined from the training data under normal conditions. This threshold is defined as follows:

$$\text{threshold} = \mu + \kappa \sigma,$$

where $\mu$ and $\sigma$ are the mean and standard deviation of the FI calculated on the healthy training set of each WT, and $\kappa$ is a tuning parameter that modulates the sensitivity of the detection system. Adjusting the value of $\kappa$ enables a trade-off between early fault detection and the minimization of false positives.
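A minimal sketch of the alarm rule follows. It assumes the population standard deviation (`ddof=0`), which the text does not specify, and uses hypothetical function names and synthetic FI values.

```python
import numpy as np

def alarm_threshold(fi_healthy, kappa):
    """threshold = mu + kappa * sigma, from the healthy-period FI of one turbine."""
    fi = np.asarray(fi_healthy, dtype=float)
    return fi.mean() + kappa * fi.std()  # population std (ddof=0) is an assumption

def first_alarm_index(fi_monitored, threshold):
    """Index of the first sample above the threshold, or None if never exceeded."""
    above = np.flatnonzero(np.asarray(fi_monitored, dtype=float) > threshold)
    return int(above[0]) if above.size else None

fi_healthy = [1.0, 2.0, 3.0, 4.0, 5.0]   # synthetic healthy FI: mu = 3, sigma = sqrt(2)
fi_monitored = [3.0, 12.0, 16.0, 20.0]   # synthetic FI during monitoring

for kappa in (6, 9):
    thr = alarm_threshold(fi_healthy, kappa)
    print(kappa, round(thr, 3), first_alarm_index(fi_monitored, thr))
# 6 11.485 1
# 9 15.728 2
```

In this toy series the larger $\kappa$ delays the alarm by one sample, which is the sensitivity trade-off the paragraph describes.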

Although the threshold expression resembles the common three-sigma criterion, the parameter $\kappa$ is selected empirically, rather than based on a theoretical statistical rule. This choice reflects the non-Gaussian distribution of residuals and the non-stationary nature of WT degradation. The final value of $\kappa = 9$ was determined through validation across multiple WTs to balance early detection and false alarm rates. A detailed analysis of this calibration is provided in Section 8.
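As a side note that is not part of the authors' calibration, a distribution-free bound helps explain why a multiplier well above three is plausible when Gaussianity cannot be assumed. For any random variable $X$ with mean $\mu$ and finite standard deviation $\sigma$, Cantelli's one-sided inequality gives

$$P(X \geq \mu + \kappa \sigma) \leq \frac{1}{1 + \kappa^{2}},$$

so that

$$\kappa = 3:\ \frac{1}{10} = 0.1, \qquad \kappa = 6:\ \frac{1}{37} \approx 0.027, \qquad \kappa = 9:\ \frac{1}{82} \approx 0.012.$$

By comparison, a Gaussian variable exceeds $\mu + 3\sigma$ with probability of about 0.00135. Without the Gaussian assumption, three standard deviations guarantee only a 10% worst-case exceedance rate, whereas $\kappa = 9$ brings the worst case near 1%. The bound concerns the marginal distribution of the healthy FI only and says nothing about how long an exceedance lasts in an autocorrelated signal.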

8. Results and Discussion

This section presents the outcomes derived from the proposed methodology, with emphasis on alarm triggering when the FI exceeds a specific threshold. It is worth noting that, among all evaluated WTs, only WT7 experienced a documented failure, which occurred on 23 February 2022.

Table 3 summarizes the alarm activations for two values of the threshold parameter $\kappa$: 6 and 9. When $\kappa = 6$ was applied, three activations were observed. Two of these were false positives, detected in WT2 and WT8, while the third corresponded to WT7, the only WT with a confirmed failure. Conversely, with $\kappa = 9$, the alarm was triggered exclusively in WT7, with no false alarms in any other WT. Accordingly, and given the homogeneous nature of the studied farm (technology and operating settings), a single fleet-wide sensitivity value of $\kappa = 9$ is adopted, while per-WT normalization $(\mu, \sigma)$ absorbs unit-level offsets and variability.

Figure 10 provides a detailed visualization of the smoothed FI, computed using an SMA over a window of 1008 points (equivalent to one week of operational data). This technique effectively eliminates short-term fluctuations and noise while preserving the medium- and long-term trends typically associated with progressive mechanical deterioration.

In the case of WT7, the FI begins to rise consistently several months before the actual failure, surpassing the $\kappa = 9$ threshold in June 2021. This early detection, approximately nine months in advance, provides strong evidence that the monitoring system is capable of identifying incipient faults before critical failure occurs. Such gradual progression is characteristic of internal structural damage, such as microcracks, that generate heat well before the gearbox undergoes complete mechanical breakdown.
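The lead time can be checked with simple date arithmetic. The failure date (23 February 2022) is documented in the text, but the exact day of the threshold crossing in June 2021 is not given, so the sketch below brackets it with the first and last day of that month; both dates are assumptions used only to bound the result.

```python
from datetime import date

FAILURE = date(2022, 2, 23)         # documented WT7 failure date
EARLIEST_ALARM = date(2021, 6, 1)   # assumed lower bound for the June 2021 crossing
LATEST_ALARM = date(2021, 6, 30)    # assumed upper bound for the June 2021 crossing
AVG_DAYS_PER_MONTH = 365.25 / 12

def lead_time_days(alarm: date, failure: date = FAILURE) -> int:
    """Days between the alarm and the failure."""
    return (failure - alarm).days

for alarm in (EARLIEST_ALARM, LATEST_ALARM):
    days = lead_time_days(alarm)
    print(alarm.isoformat(), days, round(days / AVG_DAYS_PER_MONTH, 1))
# 2021-06-01 267 8.8
# 2021-06-30 238 7.8
```

Depending on the crossing day, the lead time is on the order of eight to nine months, in line with the approximate figure quoted above.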

Unlike WT7, the remaining WTs did not exhibit similar behavior. Although the $\kappa = 6$ threshold is briefly exceeded in WT2 and WT8, the FI elevations are short-lived and do not show a sustained upward trend. This suggests that these activations are likely linked to noise or non-structural events, highlighting high sensitivity but limited specificity at $\kappa = 6$.

The selection of the threshold parameter $\kappa = 9$ does not adhere to conventional statistical rules such as the three-sigma principle, which assumes Gaussian-distributed residuals and linear degradation dynamics. Instead, $\kappa = 9$ was chosen based on empirical validation using historical data from multiple WTs in the wind farm. This selection process involved systematic analysis of both false positives and true-positive detections under different threshold configurations. As shown in Table 3, a lower value such as $\kappa = 6$ led to early activation in WT7 but also triggered spurious alarms in WT2 and WT8, reducing the system’s specificity. On the other hand, setting $\kappa = 9$ eliminated the false alarms while still preserving the ability to detect the actual failure in WT7 with a lead time of approximately nine months.
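The kind of trade-off analysis described here can be sketched as a sweep over $\kappa$. The per-turbine values below are entirely synthetic: they stand for a hypothetical peak of the smoothed FI expressed in healthy-period standard deviations above the mean, and were chosen only so that the outcome reproduces the pattern of Table 3. They are not measurements from the wind farm.

```python
FAILED = {"WT7"}  # the only turbine with a documented failure

# Hypothetical peak of (FI - mu) / sigma per turbine; synthetic illustration only.
peak_z = {
    "WT1": 3.1, "WT2": 7.2, "WT3": 2.4, "WT4": 4.0,
    "WT5": 3.6, "WT6": 2.9, "WT7": 15.0, "WT8": 6.5,
}

def evaluate(kappa, peaks=peak_z, failed=FAILED):
    """Turbines whose peak exceeds mu + kappa * sigma, split into hits and false alarms."""
    alarms = {wt for wt, z in peaks.items() if z > kappa}
    return {
        "detected": sorted(alarms & failed),
        "false_alarms": sorted(alarms - failed),
    }

for kappa in (6, 9):
    print(kappa, evaluate(kappa))
# 6 {'detected': ['WT7'], 'false_alarms': ['WT2', 'WT8']}
# 9 {'detected': ['WT7'], 'false_alarms': []}
```

With only one confirmed failure in the fleet, such a sweep can rank candidate thresholds but cannot yield statistically robust detection rates.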

This behavior is corroborated by the smoothed FI visualizations (Figure 10), where the anomaly in WT7 shows sustained growth well before failure, contrasting with the short-lived, noisy peaks in other WTs. Thus, the choice of $\kappa = 9$ resulted from a trade-off analysis aiming to optimize early detection while minimizing false positives. In industrial monitoring systems, such trade-offs are necessary when data distributions deviate from idealized assumptions. Therefore, $\kappa$ is interpreted not as a rigid statistical parameter, but as a calibration factor tailored to the real-world dynamics of gearbox degradation in WTs.

In this context, the parameter $\kappa$ functions not merely as a statistical constant but as a tunable element aligned with the actual dynamics of the system. Its selection is informed by both quantitative and visual analysis of the FI’s behavior, particularly the persistence and trend differentiation between real failures and spurious oscillations.

9. Conclusions

This research has effectively illustrated the practicality of utilizing ANNs for the prompt identification of gearbox malfunctions in WTs. Specifically, the fault prognosis approach relies exclusively on SCADA data and requires only data from healthy systems for implementation. A comprehensive and dependable model of normal operational behavior has been constructed through a careful selection of SCADA variables that correlate with gear bearing temperature, alongside a thorough process of data cleansing, imputation, and normalization. The efficacy and performance of the developed methodology were validated on an operational wind farm consisting of nine WTs, with one WT excluded due to a different failure type. The findings underscore the model’s capability to detect anomalies in gear bearing temperature up to nine months prior to failure, thereby providing a lead time window sufficient for proactive and cost-effective maintenance planning. This method outperforms traditional SCADA data processing approaches. Its application can therefore be useful for organizing timely and optimal maintenance interventions. This result is especially relevant in the case of WT7, where the proposed model successfully issued an early warning of the gearbox failure approximately nine months in advance, demonstrating its practical value for anticipatory maintenance planning. Moreover, the study tackles and resolves the difficulties related to generating dependable alarms for fault prognosis, utilizing a simple moving average filter to smooth the residuals’ profile and setting a detection threshold based on the mean and standard deviation of the training residuals. The results obtained suggest that the implementation of the ANN methodology facilitates the early detection of gearbox faults, exceeding the predictive capabilities typically achieved through traditional offline oil analysis methods. 
In summary, the study validates the capability of ANNs to markedly improve early fault detection in WTs, which has significant implications for minimizing downtime and optimizing maintenance strategies.

The utilization of this model serves as a crucial instrument for managing maintainability in wind farms, thereby facilitating the transition to more sustainable and dependable energy sources.

Future research will focus on examining analogous test cases and ultimately applying the current methodology for real-time monitoring of an operational wind farm. This could represent a pivotal advancement in improving the reliability of wind energy conversion systems and optimizing all associated maintenance efforts.

From a modeling perspective, future developments will explore advanced neural network architectures designed for temporal sequence modeling, including recurrent networks and attention-based mechanisms. Moreover, hybrid strategies that integrate anomaly detection with classification algorithms will be investigated to improve diagnostic specificity, especially in scenarios where labeled fault data become available.

In addition, future studies will include a comprehensive exploratory data analysis (EDA) stage aimed at comparing a broader set of SCADA signals. Although this study focused on mechanical and thermal parameters due to their physical proximity to the gearbox and known diagnostic relevance, electrical variables such as generator current and voltage will be examined to assess their potential contribution to early fault detection. This expansion will strengthen the model’s generalizability and ensure the inclusion of all potentially informative features, increasing the rigor and completeness of the feature selection process.

Author Contributions

Conceptualization, B.P., F.C., Y.V. and C.T.; Data Curation, B.P. and F.C.; Formal analysis, B.P., F.C., Y.V. and C.T.; Investigation, B.P., F.C., Y.V. and C.T.; Methodology, B.P., F.C., Y.V. and C.T.; Software, B.P.; Supervision, B.P., F.C., Y.V. and C.T.; Validation, B.P. and F.C.; Writing—original draft, B.P. and F.C.; Writing—review and editing, B.P., F.C., Y.V. and C.T. All authors have read and agreed to the published version of the manuscript.

Funding

This work is partially funded by: (i) grant PID2021-122132OB-C21 funded by MCIN/AEI/10.13039/501100011033 and by “ERDF A way of making Europe”, by the “European Union”; (ii) grant TED2021-129512B-I00 funded by MCIN/AEI/10.13039/501100011033 and by the “European Union NextGenerationEU/PRTR”.

Data Availability Statement

Data are available upon request.

Acknowledgments

The authors acknowledge the Lucky Wind spa company for their technical support and for providing the data used in this work.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Hussain, M.; Mirjat, N.H.; Shaikh, F.; Dhirani, L.L.; Kumar, L.; Sleiti, A.K. Condition monitoring and fault diagnosis of wind turbine: A systematic literature review. IEEE Access 2024, 12, 190220–190239.
2. Teng, W.; Ding, X.; Tang, S.; Xu, J.; Shi, B.; Liu, Y. Vibration analysis for fault detection of wind turbine drivetrains—A comprehensive investigation. Sensors 2021, 21, 1686.
3. Mollasalehi, E.; Wood, D.; Sun, Q. Indicative fault diagnosis of wind turbine generator bearings using tower sound and vibration. Energies 2017, 10, 1853.
4. Qiao, W.; Lu, D. A survey on wind turbine condition monitoring and fault diagnosis—Part II: Signals and signal processing methods. IEEE Trans. Ind. Electron. 2015, 62, 6546–6557.
5. Corley, B.; Koukoura, S.; Carroll, J.; McDonald, A. Combination of thermal modelling and machine learning approaches for fault detection in wind turbine gearboxes. Energies 2021, 14, 1375.
6. Altinpulluk, N.B.; Altinpulluk, D.; Yildirim, M.; Zhao, S.; Qiu, F.; Greco, A. A survey on degradation modeling, prognosis, and prognostics-driven maintenance in wind energy systems. Renew. Sustain. Energy Rev. 2025, 211, 115281.
7. Wang, Y.; He, R.; Schünemann, W.; Tian, Z.; Pan, J.; Schelenz, R. Degradation assessment of wind turbine based on additional load measurements. Renew. Energy 2024, 235, 121271.
8. Cao, L.; Qian, Z.; Zareipour, H.; Wood, D.; Mollasalehi, E.; Tian, S.; Pei, Y. Prediction of remaining useful life of wind turbine bearings under non-stationary operating conditions. Energies 2018, 11, 3318.
9. Oliveira-Filho, A.; Comeau, M.; Cave, J.; Nasr, C.; Côté, P.; Tahan, A. Wind turbine SCADA data imbalance: A review of its impact on health condition analyses and mitigation strategies. Energies 2024, 18, 59.
10. Milani, A.E.; Zappalá, D.; Castellani, F.; Watson, S. Boosting field data using synthetic SCADA datasets for wind turbine condition monitoring. J. Phys. Conf. Ser. 2024, 2767, 032033.
11. Astolfi, D.; Iuliano, S.; Vasile, A.; Pasetti, M.; Castellani, F.; Riva Sanseverino, E. Fleet-Wide Knowledge-Discovery-Based Methods for Wind Turbine Performance Monitoring: A Test Case Discussion. Smart Grids Sustain. Energy 2025, 10, 45.
12. Daems, P.J.; Peeters, C.; Matthys, J.; Verstraeten, T.; Helsen, J. Fleet-wide analytics on field data targeting condition and lifetime aspects of wind turbine drivetrains. Forsch. Im Ingenieurwesen 2023, 87, 285–295.
13. Murgia, A.; Verbeke, R.; Tsiporkova, E.; Terzi, L.; Astolfi, D. Discussion on the suitability of SCADA-based condition monitoring for wind turbine fault diagnosis through temperature data analysis. Energies 2023, 16, 620.
14. Liu, Y.; Wu, Z.; Wang, X. Research on fault diagnosis of wind turbine based on SCADA data. IEEE Access 2020, 8, 185557–185569.
15. Chesterman, X.; Verstraeten, T.; Daems, P.J.; Nowé, A.; Helsen, J. Overview of normal behavior modeling approaches for SCADA-based wind turbine condition monitoring demonstrated on data from operational wind farms. Wind Energy Sci. 2023, 8, 893–924.
16. Xiao, X.; Liu, J.; Liu, D.; Tang, Y.; Zhang, F. Condition monitoring of wind turbine main bearing based on multivariate time series forecasting. Energies 2022, 15, 1951.
17. Ma, R.; Li, W.; Qi, Y. Visualization methodology of the health state for wind turbines based on dimensionality reduction techniques. Sustain. Energy Technol. Assess. 2022, 49, 101762.
18. Castellani, F.; Astolfi, D.; Natili, F. SCADA data analysis methods for diagnosis of electrical faults to wind turbine generators. Appl. Sci. 2021, 11, 3307.
19. Tautz-Weinert, J.; Watson, S.J. Using SCADA data for wind turbine condition monitoring–a review. IET Renew. Power Gener. 2017, 11, 382–394.
20. Astolfi, D.; Castellani, F.; Terzi, L. Mathematical methods for SCADA data mining of onshore wind farms: Performance evaluation and wake analysis. Wind Eng. 2016, 40, 69–85.
21. Castellani, F.; Garinei, A.; Terzi, L.; Astolfi, D.; Gaudiosi, M. Improving windfarm operation practice through numerical modelling and supervisory control and data acquisition data analysis. IET Renew. Power Gener. 2014, 8, 367–379.
22. Astolfi, D.; Byrne, R.; Castellani, F. Analysis of wind turbine aging through operation curves. Energies 2020, 13, 5623.
23. Encalada-Dávila, Á.; Puruncajas, B.; Tutivén, C.; Vidal, Y. Wind turbine main bearing fault prognosis based solely on scada data. Sensors 2021, 21, 2228.
24. Santolamazza, A.; Dadi, D.; Introna, V. A data-mining approach for wind turbine fault detection based on SCADA data analysis using artificial neural networks. Energies 2021, 14, 1845.
25. Zaher, A.; McArthur, S.D.J.; Infield, D.G.; Patel, Y. Online wind turbine fault detection through automated SCADA data analysis. Wind Energy Int. J. Prog. Appl. Wind Power Convers. Technol. 2009, 12, 574–593.
26. Stone, E.; Giani, S.; Zappalá, D.; Crabtree, C. Convolutional neural network framework for wind turbine electromechanical fault detection. Wind Energy 2023, 26, 1082–1097.
27. Xiang, L.; Wang, P.; Yang, X.; Hu, A.; Su, H. Fault detection of wind turbine based on SCADA data analysis using CNN and LSTM with attention mechanism. Measurement 2021, 175, 109094.
28. Tutivén, C.; Benalcazar-Parra, C.; Escuela, A.E.; Vidal, Y.; Puruncajas, B.; Fajardo, M. Wind turbine main bearing condition monitoring via convolutional autoencoder neural networks. In Proceedings of the 2021 International Conference on Electrical, Computer, Communications and Mechatronics Engineering (ICECCME), Mauritius, 7–8 October 2021.
29. Letzgus, S. Change-point detection in wind turbine SCADA data for robust condition monitoring with normal behaviour models. Wind Energy Sci. Discuss. 2020, 5, 1375–1397.
30. Turnbull, A.; Carroll, J.; McDonald, A. A comparative analysis on the variability of temperature thresholds through time for wind turbine generators using normal behaviour modelling. Energies 2022, 15, 5298.
31. Tiboni, M.; Incerti, G.; Remino, C.; Lancini, M. Comparison of signal processing techniques for condition monitoring based on artificial neural networks. In Proceedings of the Advances in Condition Monitoring of Machinery in Non-Stationary Operations, Santander, Spain, 20–22 June 2018.
32. Puruncajas, B.; Castellani, F.; Vidal, Y.; Tutivén, C. Early detection of gearbox failures in wind turbines using artificial neural networks and scada data. In Proceedings of the International Conference of IFToMM ITALY, Turin, Italy, 11–13 September 2024; pp. 337–350.
33. Gu, H.; Liu, W.; Gao, Q.; Zhang, Y. A review on wind turbines gearbox fault diagnosis methods. J. Vibroeng. 2021, 23, 26–43.
34. Sedgwick, P. Pearson’s correlation coefficient. BMJ 2012, 345, e4483.
35. Fritsch, F.N.; Carlson, R.E. Monotone Piecewise Cubic Interpolation. SIAM J. Numer. Anal. 1980, 17, 238–246.
36. Montgomery, D.C.; Jennings, C.L.; Kulahci, M. Introduction to Time Series Analysis and Forecasting; John Wiley & Sons: Hoboken, NJ, USA, 2015.
37. Chen, H.; Chen, J.; Ding, J. Data evaluation and enhancement for quality improvement of machine learning. IEEE Trans. Reliab. 2021, 70, 831–847.
38. Ma, J.; Yuan, Y. Application of SCADA data in wind turbine fault detection—A review. Sens. Rev. 2023, 43, 1–11.
39. Udo, W.; Muhammad, Y. Data-driven predictive maintenance of wind turbine based on SCADA data. IEEE Access 2021, 9, 162370–162388.
40. Harrou, F.; Kini, K.R.; Madakyaru, M.; Sun, Y. Uncovering sensor faults in wind turbines: An improved multivariate statistical approach for condition monitoring using SCADA data. Sustain. Energy Grids Netw. 2023, 35, 101126.
41. Gao, L.; Hong, J. Data-driven yaw misalignment correction for utility-scale wind turbines. J. Renew. Sustain. Energy 2021, 13, 063302.
42. Castellani, F.; Astolfi, D.; Sdringola, P.; Proietti, S.; Terzi, L. Analyzing wind turbine directional behavior: SCADA data mining techniques for efficiency and power assessment. Appl. Energy 2017, 185, 1076–1086.
43. Dike, H.; Zhou, Y.; Deveerasetty, K.K.; Wu, Q. Unsupervised learning based on artificial neural network: A review. In Proceedings of the 2018 IEEE International Conference on Cyborg and Bionic Systems (CBS), Shenzhen, China, 25–27 October 2018; pp. 322–327.
44. McKinnon, C.; Turnbull, A.; Koukoura, S.; Carroll, J.; McDonald, A. Effect of time history on normal behaviour modelling using SCADA data to predict wind turbine failures. Energies 2020, 13, 4745.

Figure 1. Scheme of the gearbox of the WT under investigation.

Figure 2. Main parameters included in the SCADA database.

Figure 3. Overview of the fault detection methodology: (1) SCADA input preprocessing, (2) ANN-based temperature prediction, and (3) residual computation and threshold-based anomaly detection.

Figure 4. Imputed data example (WT power output in kW).

Figure 5. Normalization example (grid production power output).

Figure 6. Trend of the differential gear bearing temperature for the faulty unit and three healthy turbines calculated with respect to the same healthy unit (resampled on a daily basis).

Figure 7. Trend of the differential oil pressure (before–after inline filter) for the faulty unit and two healthy machines.

Figure 8. Receiver operating characteristic (ROC) curve for fault detection using a threshold for the pressure drop on the inline filter.

Figure 9. Neural network architecture used in the proposed methodology.

Figure 10. Results obtained for the wind farm, with the thresholds for $\kappa = 6$ and $\kappa = 9$ shown for each WT.

Table 1. Characteristics of the wind turbine.

Rated Power (kW)	2000
Rotor Diameter (m)	100
Gear Ratio	113
Gear Configuration	One planetary and two parallel stages

Table 2. Selected SCADA variables used to develop the normality model.

Monitored SCADA Parameter	Operational Range	Unit	Pearson Corr.
Gear Bearing Temperature	[0, 90]	°C	1.00
Gear Bearing Temperature B	[0, 90]	°C	0.99
Rotor RPM	[0, 30]	rpm	0.94
Gear Oil Temperature Inlet	[0, 60]	°C	0.93
Gear Oil Temperature	[0, 70]	°C	0.92
Ambient Wind Speed	[0, 25]	m/s	0.85
Grid Production Power	[0, 2000]	kW	0.83
Gear Filter After Inline Pressure Oil	[0, 9]	bar	0.76
Gear Filter Before Inline Pressure Oil	[0, 9]	bar	0.76
Hydraulic Oil Temperature	[0, 50]	°C	0.62
Blade Pitch Angle	[−5, 91]	degrees	−0.56

Pearson Corr. = Pearson correlation coefficient with the gear bearing temperature.

Table 3. Summary of activated alarms, marked with ×.

WT	$\mu + 6\sigma$	$\mu + 9\sigma$
WT1
WT2	×
WT3
WT4
WT5
WT6
WT7	×	×
WT8	×

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
