Design of an Efficient Deep Learning-Based Diagnostic Model for Wind Turbine Gearboxes Using SCADA Data

Mai, Xuan-Kien; Lee, Jun-Yeop; Lee, Jae-In; Go, Byeong-Soo; Lee, Seok-Ju; Dinh, Minh-Chau

doi:10.3390/en18112814

by Xuan-Kien Mai ¹, Jun-Yeop Lee ¹, Jae-In Lee ², Byeong-Soo Go ², Seok-Ju Lee ³ and Minh-Chau Dinh ²,*

¹ Department of Electrical Engineering, Changwon National University, Changwon 51140, Republic of Korea

² Institute of Mechatronics, Changwon National University, Changwon 51140, Republic of Korea

³ School of Aerospace Engineering, Glocal Advanced Institute of Science & Technology, Changwon National University, Changwon 51140, Republic of Korea

* Author to whom correspondence should be addressed.

Energies 2025, 18(11), 2814; https://doi.org/10.3390/en18112814

Submission received: 21 March 2025 / Revised: 24 April 2025 / Accepted: 7 May 2025 / Published: 28 May 2025

(This article belongs to the Special Issue Renewable Energy and Power Electronics Technology)


Abstract

Global efforts to address climate change have intensified the transition from fossil fuels to renewable energy sources, positioning wind power as a critical player due to its advanced technology, scalability, and environmental benefits. Despite this potential, the reliability of wind turbines, particularly their gearboxes, remains a persistent challenge. Gearbox failures lead to significant downtime, high maintenance costs, and reduced operational efficiency, threatening the economic competitiveness of wind energy. This study proposes a condition monitoring model for wind turbine gearboxes that combines Supervisory Control and Data Acquisition (SCADA) data with deep learning techniques. The model analyzes historical operating data from a wind turbine to classify gearbox conditions into normal and abnormal states. With the dataset optimized for deep neural networks through advanced data processing methods, the model achieves a fault detection accuracy of 98.8%. Designed for integration into real-time monitoring systems, this approach enables early fault prediction and supports proactive maintenance strategies. By enhancing gearbox reliability, reducing unplanned downtime, and lowering maintenance expenses, the model improves the overall economic viability of wind farms. This advancement reinforces wind energy’s role in driving a sustainable, low-carbon future, in line with global climate goals and renewable energy adoption.

Keywords:

deep neural network; DBSCAN algorithm; machine learning; operation and maintenance; principal component analysis; SCADA data; wind turbine gearbox

1. Introduction

The urgent need to address climate change has driven nations worldwide to transition from fossil fuel-based energy systems to renewable energy sources, aiming to reduce CO₂ emissions and achieve global sustainability goals. In this context, wind energy has emerged as one of the most significant renewable energy sources, propelled by rapid advancements in wind turbine technology and a growing demand for clean energy solutions [1]. According to the Global Wind Energy Council (GWEC) 2024 report, the wind industry installed a record 117 GW of new capacity in 2023, its most successful year to date [2]. The International Energy Agency (IEA) forecasts that renewables, with wind playing a key role, will account for nearly 30% of global electricity production by 2030, driven by the expansion of offshore wind farms [3]. Offshore wind farms harness stable, high-speed winds far from shore, minimizing land-use conflicts and environmental impacts such as noise pollution while preserving natural landscapes [4]. This growth underscores wind energy’s pivotal role in the global energy transition and creates an urgent need to enhance the reliability and efficiency of wind turbine systems to meet ambitious climate targets, such as the 45% greenhouse gas emission reduction by 2030 outlined in the Intergovernmental Panel on Climate Change (IPCC) report [5,6].

Despite significant progress in wind energy, the reliability of wind turbines remains a major challenge, particularly concerning gearboxes, a critical yet vulnerable component. Gearboxes are responsible for converting the slow rotation of turbine blades into the high-speed input required by generators, but they operate under harsh conditions such as variable loads, high humidity, and corrosive marine environments, especially in offshore wind farms [7,8]. Operation and Maintenance (O&M) costs account for 20–25% of the total cost per kWh for new turbines, rising to 20–35% toward the end of their lifecycle, with gearbox-related failures contributing up to 40% of these costs due to high repair expenses and prolonged downtime [7,9].

Traditional condition monitoring methods, such as Condition Monitoring Systems (CMSs) based on vibration analysis, struggle with noise, nonlinearity, and instability in vibration data, resulting in low diagnostic accuracy (often below 85%), even with optimized Support Vector Machine (SVM) models enhanced by the Artificial Bee Colony algorithm [10,11]. Similarly, Supervisory Control and Data Acquisition (SCADA) systems, while offering real-time monitoring with diverse data, generate high-frequency datasets that demand substantial computational resources and are prone to false positives without effective preprocessing [12,13]. Recent studies, such as Verma et al. (2022), indicate that current SCADA-based fault prognosis models often detect issues only in advanced failure stages, missing early warning signs essential for preventive maintenance [14]. Md Liton Hossain et al. reviewed condition monitoring and early fault diagnosis techniques for wind turbines, focusing on component-specific faults, signal analysis techniques, and signal processing tools [15]. Their review highlights the need for trustworthy, accessible online monitoring systems while noting drawbacks such as high costs, limited sensor reliability under difficult conditions, difficulty identifying early faults, and the data dependency of AI methods. Guo, Wei, and colleagues proposed a hybrid fault diagnosis approach for wind turbine gearboxes that combines convolutional neural networks (CNNs) and symmetric dot pattern (SDP) visualization [16]. To detect faults with high accuracy, the method transforms vibration signals into snowflake-shaped SDP images, which are subsequently classified by a CNN. Given the drawbacks of conventional condition monitoring techniques, including CMS and SCADA systems, early and accurate fault diagnosis remains an important need in wind energy systems.
These challenges call for an advanced diagnostic model that can process complex and noisy data, predict gearbox failures early and correctly, and optimize scheduled maintenance. In addition to enhancing wind turbine dependability, such a model would address the pressing cost and efficiency challenges in the wind energy industry.

This study aims to address the identified gap in gearbox diagnostics by developing a current condition diagnostic (CCD) model for wind turbine gearboxes to significantly improve fault detection accuracy and enable proactive maintenance. Specifically, the objectives are as follows:

(1): To design a model that leverages SCADA data to detect gearbox anomalies with superior precision, targeting an accuracy exceeding 95%.
(2): To overcome the limitations of noise and false positives in traditional CMS and SCADA methods through advanced data processing techniques.
(3): To integrate the model with the health monitoring system for real-time monitoring and predictive maintenance, thereby extending gearbox lifespan and optimizing the economic efficiency of wind farms.

This study advances the use of deep learning for gearbox diagnosis with multidimensional SCADA data, achieving high accuracy. The approach shows how deep learning can be optimized for high-dimensional, noisy datasets by using techniques such as Principal Component Analysis (PCA) for dimensionality reduction and DBSCAN for noise filtering. In practice, this study provides wind farms with a scalable, effective predictive maintenance solution that could lower expenses and downtime. The proposed model promotes proactive maintenance by enabling early fault detection and diagnosis, thereby improving the efficiency and sustainability of wind energy systems. These objectives are pursued to enhance wind turbine reliability, supporting the scalable adoption of wind energy as a sustainable resource. The proposed approach integrates recent advancements in artificial intelligence and data science to overcome the shortcomings of traditional diagnostic methods. Using SCADA data from a wind turbine in the Republic of Korea as the foundation, the model follows a multi-step process:

(1): Principal Component Analysis (PCA) is applied to reduce data dimensionality and extract key features, improving computational efficiency.
(2): Density-Based Spatial Clustering of Applications with Noise (DBSCAN) removes noise, enhancing data clarity for both normal and anomalous states.
(3): A Deep Neural Network (DNN) with an optimized architecture—employing ReLU and Sigmoid activation functions—classifies gearbox conditions, achieving a confirmed accuracy of 98.8%.

The model is validated against historical fault data and designed for compatibility with digital twin ecosystems, offering a solution for predictive maintenance that goes beyond the reactive nature of traditional CMS and SCADA methods. The diagnostic model developed in this study represents a significant advancement in wind turbine technology, addressing a critical barrier to the widespread adoption of wind energy. Moreover, the model’s high accuracy and scalability position it as a benchmark for future research, with the potential to influence global wind farm operations and contribute to the United Nations’ Sustainable Development Goal 7 (Affordable and Clean Energy) by 2030 [17,18]. This research not only enhances technical reliability but also boosts the economic viability of renewable energy, delivering a practical model for industry adoption.

The remainder of this article is organized as follows. Section 2, “Materials and Data Processing Methods”, introduces the SCADA dataset collected from a wind turbine, details the principles of data collection, and describes the preprocessing techniques, including data processing and analysis. Section 3, “Design of the CCD Models”, elaborates on the deep neural network (DNN) workflow, outlining the model’s architecture, the application of ReLU and Sigmoid activation functions, and the development of two specialized diagnostic approaches, temperature prediction and condition labeling, optimized to enhance fault detection accuracy. Section 4, “Accuracy Verification Results of the CCD Models”, presents a detailed evaluation of the model’s performance, highlighting key accuracy metrics, such as the achieved 98.8%, and validating its effectiveness against historical fault data to underscore its practical applicability in real-world wind turbine operations. Finally, Section 5, “Discussions and Conclusions”, synthesizes the research outcomes, explores their implications for improving wind turbine reliability, and proposes future research directions, including the potential for scaling the model across multiple turbines and integrating it with other AI technologies.

2. Materials and Data Processing Methods

2.1. SCADA and Data Collection

The SCADA dataset used in this study provides a robust foundation for analyzing gearbox performance under real-world conditions. Table 1 outlines the specifications of the wind turbine model, designed for medium to large-scale wind energy production. The gearbox features a rated power of 2 MW, with a cut-in/cut-out wind speed range of 3–25 m/s and a rated wind speed of 11 m/s, ensuring optimal performance across varying wind conditions. The mechanical design includes a transmission ratio of 1.83:83 (±1%), achieved through a combination of two planetary stages and a parallel stage, which effectively manages torque and rotational speed transformations. The gearbox supports a rated rotor speed of 16 rpm and a rated generator speed of 1400 rpm, facilitated by high-strength materials and advanced lubrication systems that enhance durability and minimize wear. The dataset captures a comprehensive range of operational parameters, including rotor speed, gearbox temperatures, and oil pressures, ensuring a detailed representation of turbine behavior across diverse scenarios. This data collection, conducted at 10 min intervals, facilitates the identification of both normal and anomalous operational patterns, laying the groundwork for the development of the diagnostic model [19]. By leveraging the SCADA dataset, this study ensures that the proposed model is trained and validated on high-fidelity data, enhancing its reliability and applicability for predictive maintenance in wind energy systems [20].

The SCADA system recorded gearbox fault conditions. The 2 MW wind turbine gearbox is designed for durability, efficiency, and compactness; this configuration ensures consistent energy conversion under variable wind conditions, reduces maintenance costs, and extends service life [21,22].

Complementing this mechanical design, SCADA systems are widely adopted in the wind energy sector, collecting critical operational data such as 10 min average time series, turbine status codes, and fault records [23]. These datasets integrate maintenance logs and optimization parameters, supporting fault diagnosis and enabling advanced applications such as training deep learning models to classify normal vs. faulty states. As shown in Table 2, the fault data were split into two sets: cases 1 to 7 for model training and cases 8 to 11 for evaluating the accuracy of the two diagnostic methods.

2.2. Data Processing Methods

As illustrated in Figure 1, the data processing workflow is a critical preliminary step that must be rigorously followed to ensure the quality and reliability of the subsequent analysis and model training. Data processing before training an AI model is essential because it significantly enhances the model’s performance by reducing noise and irrelevant features, thereby improving prediction accuracy [24]. The sequence (data collection, followed by feature data selection, data normalization, data filtering, and data analysis) helps mitigate the risk of misdiagnosis, which could lead to misunderstandings and costly errors during maintenance and repair operations. After filtering and normalization, the data are split into four folds for cross-validation: each fold is used once for validation while the remaining three are used for training, ensuring robust performance evaluation. This technique assesses the model’s generalization ability more effectively than a single train-test split during the model evaluation phase. For instance, improper handling of noisy SCADA data can result in false positives, as noted by Maldonado-Correa et al. (2020), emphasizing the need for effective preprocessing to ensure actionable insights [25]. By adhering to this structured approach, this study ensures that the AI model, here the DNN, is trained on high-quality, well-prepared data, laying a solid foundation for accurate fault detection in wind turbine gearboxes.
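The four-fold scheme described above can be sketched as follows. This is a minimal illustration in plain Python rather than the authors' implementation: the helper name `k_fold_indices` and the 12-sample toy size are hypothetical, and shuffling or stratification, which the text does not specify, is left out.

```python
def k_fold_indices(n_samples, k=4):
    """Yield (train_idx, val_idx) pairs; each fold is used once for validation."""
    # Distribute samples as evenly as possible across the k folds.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    indices = list(range(n_samples))
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


# Toy example: 12 samples, 4 folds -> 9 training and 3 validation samples per fold.
folds = list(k_fold_indices(12, k=4))
for train_idx, val_idx in folds:
    print(f"train={len(train_idx)} samples, validation={val_idx}")
```

Every sample appears in exactly one validation fold, so the averaged validation score uses all of the data without ever scoring a sample the model was trained on in that round.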

2.2.1. The Feature Data Selection

The fault classifications presented in Figure 2 categorize gearbox conditions into two primary groups: normal operations and anomalies. Mechanical faults, such as gear slippage, broken teeth, bearing failures, or load imbalance, often manifest as distinctive abnormal patterns within mechanical data. When cross-referenced with historical repair logs, these anomalies provide a comprehensive context for accurately diagnosing the condition of the gearbox.

Based on the observed gearbox failures, characteristic data were selected from the SCADA system, which provides detailed time series capturing the operating behavior of the wind turbine gearbox under a variety of operating conditions. The dataset includes variables recorded during normal gearbox operation, covering key mechanical, thermal, and environmental parameters, as detailed in Table 3. Each parameter was selected based on the problems that the gearbox encountered during operation.

Gearbox temperature serves as a generalized indicator of anomalies, as faults like insufficient lubrication or gear slippage cause abnormal temperature spikes, making it useful for training temperature-based diagnostic models. However, condition diagnosis using labeled operational points is more complex: SCADA systems typically trigger alerts only during late-stage failures, when signals exceed thresholds, requiring careful label selection and filtering to avoid compromising model accuracy.

2.2.2. Data Analysis

When a wind turbine operates, wind speed and active power are closely interrelated. Figure 3 illustrates the relationship between wind speed and gearbox operational power data over one year. Below the cut-in wind speed, the turbine cannot generate sufficient power to overcome frictional losses in the drivetrain. At rated wind speed, the turbine operates at maximum power output. Above the cut-out wind speed, the turbine must shut down to avoid damage. Consequently, the scattered points at derated power levels (regions 1 and 2) serve as indicators of potential component failures within the wind turbine, including the gearbox.

Figure 4 illustrates the cyclic lifecycle of a wind turbine gearbox, progressing through normal operation, failure, fault, and repair phases. During normal periods, the gearbox functions efficiently within design parameters. Operational stress or wear can trigger a failure period marked by gradual degradation, as identified by SCADA data [26]. Early detection during this phase enables proactive maintenance to minimize downtime. SCADA datasets are categorized into normal operation data and abnormal data (encompassing failure and fault periods), which provide critical insights for training models to predict and mitigate failures effectively.

By visualizing key data features correlated with wind speed signals, as illustrated in Figure 5, this approach provides deeper insights into the complex interplay between critical parameters such as generator speed, gearbox oil tank pressure, and gearbox bearing temperature. These insights lay the groundwork for identifying both operational trends and potential anomalies in gearbox performance, thereby ensuring a robust understanding of system behavior.

2.2.3. Data Filtering

To ensure accurate anomaly detection, the dataset includes a reference period for capturing abnormal data before the repair period. This classification framework is essential during labeling and data preparation, enabling the precise identification of abnormal patterns for model training. By correlating parameters such as oil pressure, bearing temperature, and rotational speed with specific fault types, an information-rich dataset is developed. This methodology enhances the model’s predictive accuracy and diagnostic precision, providing a robust foundation for fault detection and classification, ultimately improving the reliability and efficiency of wind turbine gearboxes. Given the high dimensionality and complexity of SCADA data, Principal Component Analysis (PCA) was employed to reduce dimensionality while preserving critical information. PCA transforms high-dimensional data into a lower-dimensional space by identifying principal components that capture the maximum variance in the data [27]. As shown in Figure 6, projecting the original multi-dimensional data into a two-dimensional space using PCA enables effective anomaly detection. DBSCAN, a density-based clustering algorithm, was applied to the PCA-transformed space to filter noise and identify meaningful clusters. DBSCAN detects clusters based on local point density, marking outliers as noise, making it particularly effective for SCADA datasets where noise can obscure significant patterns [28].

Figure 6 highlights DBSCAN’s noise-filtering results: panel (a) clarifies anomalous operational states, while panel (b) refines normal dataset patterns. Noise clusters (e.g., cluster 2) are flagged as potential misclassifications. These cleaned datasets reduce erroneous signals, boosting predictive model accuracy. Domain expertise was integrated during preprocessing, incorporating operational thresholds and gearbox failure mechanisms to ensure practical relevance. The refined dataset, enhanced by PCA, DBSCAN, and domain knowledge, offers accurate, consistent representations of real-world gearbox conditions. It enables actionable insights for predictive maintenance and fault diagnosis in wind turbines, improving operational efficiency and reliability through scalable, robust models.
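The PCA-then-DBSCAN filtering step can be sketched with scikit-learn as shown below. This is an illustrative sketch under stated assumptions, not the study's pipeline: the data are synthetic stand-ins for standardized SCADA features, and the `eps` and `min_samples` values are arbitrary choices made for this toy example, since the text does not report the settings used.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic stand-in for standardized SCADA features (NOT real turbine data):
# 200 samples from one dense operating regime plus 3 isolated outliers.
dense = rng.normal(loc=0.0, scale=0.1, size=(200, 4))
outliers = np.array([[8.0, 8.0, 0.0, 0.0],
                     [-8.0, 8.0, 0.0, 0.0],
                     [8.0, -8.0, 0.0, 0.0]])
X = np.vstack([dense, outliers])

# Step 1: project onto the first two principal components.
Z = PCA(n_components=2).fit_transform(X)

# Step 2: density-based clustering; DBSCAN assigns the label -1 to noise points.
# eps and min_samples are illustrative values, not the paper's settings.
labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(Z)

# Step 3: keep only the samples that belong to a cluster.
keep = labels != -1
X_clean = X[keep]
print(f"kept {keep.sum()} of {len(X)} samples; flagged {np.sum(~keep)} as noise")
```

In practice `eps` and `min_samples` would need tuning against the density of the real PCA projection, and any cluster flagged for removal would be checked against domain knowledge, as the text describes.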

2.2.4. Data Normalization

Data normalization is a crucial preprocessing step that transforms raw data into a standardized format, ensuring all features contribute equally to the model without being influenced by their original scales. The StandardScaler method, a widely used normalization technique, standardizes data by removing the mean and scaling to unit variance, making it suitable for AI model training where features vary widely in magnitude. This method transforms each feature in the dataset to have a mean of zero and a standard deviation of one, using the following formula:

x^{'} = \frac{x - μ}{σ}

(1)

where x is the original feature value, μ is the mean of the feature, σ is the standard deviation, and x′ is the normalized value.

In practice, the StandardScaler (Scikit-learn 1.6.0) calculates μ and σ from the training data and applies the same transformation to both training and testing sets, ensuring consistency across the dataset. This standardization is particularly beneficial for training AI models such as DNNs, as it accelerates convergence by preventing features with larger scales from dominating the learning process. For the wind turbine gearbox diagnosis model in this study, the StandardScaler was applied to SCADA data parameters such as temperature and pressure, ensuring that the DNN can effectively learn patterns and improve fault detection accuracy [29]. By removing scale differences between features, this method enhances model robustness and prevents misinterpretations during training, in line with the need for precise predictive maintenance.
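A small worked example of Equation (1) is given below in plain Python, so the arithmetic is visible. It is a sketch only: the helper names and the temperature values are hypothetical, and the population standard deviation is used because that is what scikit-learn's StandardScaler computes.

```python
from statistics import fmean, pstdev


def fit_standardizer(train_values):
    """Return (mu, sigma) estimated from the training data only."""
    mu = fmean(train_values)
    sigma = pstdev(train_values, mu)  # population standard deviation
    return mu, sigma


def standardize(values, mu, sigma):
    """Apply Equation (1): x' = (x - mu) / sigma."""
    return [(x - mu) / sigma for x in values]


# Hypothetical bearing temperatures in degrees Celsius (illustrative numbers only).
train_temps = [50.0, 60.0, 70.0]
test_temps = [55.0, 80.0]

mu, sigma = fit_standardizer(train_temps)
train_scaled = standardize(train_temps, mu, sigma)
test_scaled = standardize(test_temps, mu, sigma)  # reuse the training statistics

print(f"mu={mu:.1f}, sigma={sigma:.3f}")
print(train_scaled, test_scaled)
```

Reusing the training statistics for the test set keeps information about the test data from leaking into preprocessing, which is why μ and σ are not recomputed there.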

3. Design of the CCD Models

3.1. Preprocessing of the Activation Functions

The integration of SCADA systems with deep learning and AI improves wind turbine gearbox diagnostics by enabling early fault detection, predictive accuracy, and proactive maintenance. Deep learning uncovers intricate patterns in SCADA datasets that traditional methods fail to detect [30]. The DNN procedure described in Algorithm 1 analyzes time-series data to identify early-stage degradation, helping to prevent catastrophic failures. Building on the preprocessing from Section 2.2, the procedure covers model architecture, training, validation, and inference, through which the network learns hierarchical data representations for tasks such as classification and regression. Two training approaches are applied: temperature prediction learning (forecasting thermal anomalies) and condition label learning (classifying operational states as normal/abnormal), forming a dual framework to optimize fault detection accuracy.

Algorithm 1 DNN algorithm for training CCD model

Initialize W and b
Load SCADA dataset T
Define DNN architecture with layers
Set activation functions: ReLU for hidden layers, Sigmoid for the output layer
while convergence criterion is not met:
    for each training example i in T:
        z = W·x + b
        a = σ(z)
        calculate loss L using cross-entropy
        compute gradients ∇L
        update weights and biases using an optimization algorithm (e.g., Adam):
            θ_(t+1) = θ_t − η∇L
    end for
end while
Validate model performance on the validation set
Test model on unseen data for inference

The DNN architecture is designed as a series of layers, each performing linear transformations followed by activation functions. A typical layer operation is as follows:

z = W \cdot x + b

(2)

a = σ (z)

(3)

where W is the weight matrix, b is the bias vector, x is the input vector, and σ is an activation function (e.g., ReLU, sigmoid, or softmax).

A loss function quantifies the difference between the predicted outputs of the DNN and the actual target values, serving as a guide for optimizing model parameters. During training, minimizing a loss function, such as cross-entropy for classification tasks, drives the learning process by adjusting the weights to improve prediction accuracy:

L = -\frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{C} y_{ij} \log(\hat{y}_{ij})

(4)

where y_{ij} is the true label, \hat{y}_{ij} is the predicted probability, N is the number of samples, and C is the number of classes.

Optimization algorithms like stochastic gradient descent or Adam update parameters iteratively:

θ_{t + 1} = θ_{t} - η \nabla L

(5)

where θ represents the model parameters, η is the learning rate, and ∇L is the gradient of the loss function.
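To make Equations (2) to (5) concrete, the sketch below trains a single sigmoid unit with plain gradient descent on a four-sample toy problem. It is an illustration under simplifying assumptions, not the study's training code: the data are invented, the two-class form of the cross-entropy is used, and the update is vanilla gradient descent rather than Adam.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(y_true, y_prob):
    """Equation (4) for two classes, averaged over the N samples."""
    n = len(y_true)
    return -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                for y, p in zip(y_true, y_prob)) / n


# Toy one-feature dataset (hypothetical): label 1 = abnormal, 0 = normal.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

w, b = 0.0, 0.0   # model parameters theta = (w, b)
eta = 0.1         # learning rate
losses = []

for step in range(200):
    # Forward pass: z = w*x + b (Eq. 2), a = sigmoid(z) (Eq. 3).
    probs = [sigmoid(w * x + b) for x in xs]
    losses.append(binary_cross_entropy(ys, probs))
    # For a sigmoid output with cross-entropy loss, dL/dz = a - y per sample.
    grad_w = sum((p - y) * x for p, y, x in zip(probs, ys, xs)) / len(xs)
    grad_b = sum(p - y for p, y in zip(probs, ys)) / len(xs)
    # Equation (5): theta <- theta - eta * gradient.
    w -= eta * grad_w
    b -= eta * grad_b

print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}, w={w:.3f}, b={b:.3f}")
```

With all parameters at zero the unit predicts 0.5 for every sample, so the first recorded loss equals ln 2; each update then moves the parameters against the gradient and the loss falls.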

The gearbox fault diagnosis model was trained using the architecture previously described, ensuring the model’s robustness and efficiency in predicting gearbox conditions. The model’s training process relied heavily on the effective use of activation functions, specifically ReLU for hidden layers and Sigmoid for the output layer, to enhance learning and ensure accurate binary classification. These activation functions played a critical role in determining how the model processed and transformed data throughout its layers.

The ReLU activation function, f(x) = \max(0, x), introduces non-linearity by zeroing negative inputs, enabling the network to capture complex data patterns and mitigating the vanishing gradient problem in deep networks [31]. Its efficiency in accelerating convergence makes it ideal for hidden layers in fault diagnosis tasks. For output layers, temperature prediction uses baseline thermal data to detect anomalies, while condition label learning employs labeled operational states (normal/abnormal). The Sigmoid function, f(x) = \frac{1}{1 + e^{-x}}, maps the output to a probability range of 0 to 1, enabling the precise binary classification of gearbox conditions (0 = normal, 1 = abnormal) [32].
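The way the two activations work together in a forward pass can be sketched as follows. The tiny network, its weights, and the 0.5 decision threshold are all hypothetical choices made for illustration; the text does not state which cut-off converts the Sigmoid output into a label.

```python
import math


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases, activation):
    """One layer: a_k = activation(sum_j W[k][j] * x[j] + b[k])."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


# Hypothetical weights for a tiny 2-input, 2-hidden-unit, 1-output network.
W1, b1 = [[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]
W2, b2 = [[2.0, -2.0]], [0.0]


def predict(x, threshold=0.5):
    hidden = dense(x, W1, b1, relu)            # negative pre-activations become 0
    prob = dense(hidden, W2, b2, sigmoid)[0]   # probability in (0, 1)
    return prob, int(prob >= threshold)        # 0 = normal, 1 = abnormal


print(predict([3.0, 1.0]))
print(predict([1.0, 3.0]))
```

The hidden layer passes only the positive part of each weighted sum, and the output unit squashes the result into a probability that is then thresholded into the 0/1 condition label.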

During the training process, the effectiveness of these activation functions directly influenced the model’s performance. ReLU’s ability to accelerate learning in the hidden layers allowed the model to quickly identify key features and patterns related to gearbox health. Meanwhile, Sigmoid’s probabilistic interpretation ensured that the model provided clear and reliable predictions for classification tasks. These two functions complemented each other, creating a balanced and efficient training pipeline. Iterative training on SCADA data enhances the model’s ability to differentiate normal and abnormal gearbox conditions. Combining ReLU (for feature learning) and Sigmoid (for precise classification) ensures reliable fault diagnosis. The model’s efficiency and interpretability make it ideal for predictive maintenance in wind turbine gearboxes.

The need for hyperparameter optimization was addressed by carefully adjusting the DNN architecture parameter values through grid search to maximize model performance. The following hyperparameters were evaluated:

Learning rate: {0.001, 0.01}.
Epochs: Training is stopped early via TensorFlow/Keras’ EarlyStopping callback if no improvement occurs after a predefined number of epochs.
Batch size: {32, 64, 128}.
Optimizer: {Adam, SGD, RMSprop}.
Dropout rate: {0.2, 0.3, 0.4}.

A total of 54 hyperparameter combinations were tested to find a configuration that improved model performance while reducing overfitting. This study used a two-step approach to deal with overfitting. First, the model was trained for a fixed 1000 epochs to ensure that the loss was reduced to an optimal level. After that, early stopping with a patience of 10 epochs was applied while monitoring the validation loss, so that training stopped dynamically when no improvement was seen. TensorFlow/Keras’ EarlyStopping callback makes it possible to find a suitable number of epochs for the DNN architecture: by halting training when the validation metric has not improved for 10 epochs, it limits overfitting and keeps training efficient.
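The size of the search space and the patience rule can both be checked with a few lines of plain Python. This is a sketch of the logic only: `early_stopping_epoch` is a hypothetical helper that mimics a patience-based stopping rule and is not the Keras callback itself, and the validation curve is invented.

```python
from itertools import product

# Search space as listed above (the epoch count is handled by early stopping).
grid = {
    "learning_rate": [0.001, 0.01],
    "batch_size": [32, 64, 128],
    "optimizer": ["Adam", "SGD", "RMSprop"],
    "dropout_rate": [0.2, 0.3, 0.4],
}
combinations = [dict(zip(grid, values)) for values in product(*grid.values())]
print(len(combinations))  # 2 * 3 * 3 * 3 = 54


def early_stopping_epoch(val_losses, patience=10):
    """Return the 1-based epoch at which training would stop.

    Training stops once the validation loss has failed to improve for
    `patience` consecutive epochs; otherwise it runs through all epochs.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses)


# Hypothetical validation curve: improves for 5 epochs, then plateaus.
curve = [1.0, 0.8, 0.6, 0.5, 0.4] + [0.45] * 30
stop_epoch = early_stopping_epoch(curve, patience=10)
print(stop_epoch)
```

The four listed value sets multiply out to the 54 combinations reported, and on the toy curve training halts 10 epochs after the last improvement instead of running the full schedule.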

3.2. The Temperature Prediction Learning-Based CCD Model

The design of the CCD model based on temperature prediction learning follows a structured implementation process, as depicted in Figure 7, which outlines the workflow from training to real-time diagnosis. The process begins with the preprocessing of raw historical SCADA data, as detailed in Section 2.2, where data normalization, feature selection, and filtering are applied to ensure high-quality input. These preprocessed data, comprising temperature-related parameters from the wind turbine, are then used in the training phase to develop the DNN architecture. The model training involves feeding the input features into the DNN, with the target output representing predicted temperature values.

The architectural configuration of the DNN, as specified in Table 4, includes eight input nodes, one output node, three hidden layers with 128 neurons each, ReLU activation functions in hidden layers, and a Sigmoid function in the output layer. The training employs a Mean Squared Error (MSE) loss function, a learning rate of 0.001, 276 epochs, and a batch size of 64, ensuring robust learning and convergence. During the real-time operation phase, the trained model processes measured inputs to predict output, which is compared with measured output to assess temperature deviations. In the real-time diagnosis phase, these predictions are analyzed against a predefined warning level to generate diagnosis results, facilitating early fault detection.
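The comparison between predicted and measured temperature in the diagnosis phase can be sketched as a simple residual check. All names and numbers here are hypothetical: the 5-degree warning level is an arbitrary illustrative value, and flagging only positive deviations (measured hotter than predicted) is an assumption, since the text does not define the rule in detail.

```python
def diagnose(measured, predicted, warning_level):
    """Flag samples whose measured temperature exceeds the model's
    prediction by more than `warning_level` (same units as the inputs)."""
    results = []
    for m, p in zip(measured, predicted):
        deviation = m - p
        results.append("abnormal" if deviation > warning_level else "normal")
    return results


# Hypothetical gearbox bearing temperatures in degrees Celsius.
predicted_temps = [61.0, 62.5, 63.0, 64.0]
measured_temps = [60.5, 63.0, 70.5, 74.0]

status = diagnose(measured_temps, predicted_temps, warning_level=5.0)
print(status)
```

The choice of `warning_level` governs the trade-off between false alarms and missed detections, which is the threshold-setting difficulty raised in the discussion of this method.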

This temperature prediction-based approach offers several advantages, including its ability to detect thermal anomalies early, which is critical for preventive maintenance, and its compatibility with real-time monitoring systems like digital twins. However, it also presents notable disadvantages: the method is more complex, requiring two distinct steps (prediction and threshold-based classification), which increases computational complexity; it heavily depends on prediction accuracy, where inaccurate temperature forecasts by the DNN can lead to unreliable diagnosis results; and it necessitates defining a temperature deviation threshold to classify abnormalities, a process that can be challenging and inflexible due to varying operational conditions and the subjectivity in setting appropriate thresholds. Furthermore, the reliance on static thresholds may not adequately account for dynamic environmental factors, potentially leading to missed detections or false alarms in fluctuating conditions.

3.3. The Condition Label Learning-Based CCD Model

The design of the CCD model based on condition label learning, as depicted in Figure 8, adopts a distinct implementation process tailored to classify gearbox states directly, diverging from the temperature prediction approach outlined in Section 3.2, “The temperature prediction learning-based CCD model”. While both methods share similarities in leveraging preprocessed SCADA data from the wind turbine and utilizing a DNN framework with training, real-time operation, and diagnosis phases, the condition label learning focuses on binary classification (normal or abnormal) rather than predicting temperature values. The process begins with the trained DNN processing measured inputs derived from multiple operational parameters, to produce predicted labels for real-time diagnosis.

The DNN architecture, detailed in Table 5, features nine input nodes, one output node, three hidden layers with 512, 256, and 128 neurons, ReLU and Sigmoid activation functions, and is trained using an MSE loss function, a learning rate of 0.001, 248 epochs, and a batch size of 64. Unlike the temperature prediction method, which requires an additional step to define temperature thresholds, this approach delivers a direct classification output, streamlining the diagnostic workflow.
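To illustrate how a Sigmoid output trained with an MSE loss yields a direct class label, the sketch below computes the MSE against binary targets and converts output probabilities into labels. The 0.5 cutoff, the encoding (1 for abnormal, 0 for normal), the sample values, and the function names are all assumptions made for illustration; the paper does not state them.

```python
# Nine inputs, hidden layers of 512, 256, and 128 neurons, one output (Table 5).
LABEL_MODEL_SIZES = [9, 512, 256, 128, 1]

def count_params(sizes):
    """Number of trainable weights and biases in a fully connected network."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))

def mse(y_true, y_pred):
    """Mean squared error between target labels and Sigmoid outputs."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def to_label(probability, cutoff=0.5):
    """Map a Sigmoid output to a class label (assumed: 1 = abnormal, 0 = normal)."""
    return 1 if probability >= cutoff else 0

# Hypothetical Sigmoid outputs for four SCADA samples and their true labels.
outputs = [0.02, 0.97, 0.40, 0.81]
targets = [0, 1, 0, 1]

loss = mse(targets, outputs)
predicted = [to_label(p) for p in outputs]
n_params = count_params(LABEL_MODEL_SIZES)
print(n_params, loss, predicted)
```

No deviation threshold on temperature is involved here: the only decision rule is the cutoff on the network output, which is why the workflow is shorter than in the temperature prediction model.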

The condition label learning method offers several advantages: it is simple and efficient because it avoids intermediate steps such as temperature prediction; it is fast because no prior temperature forecast is needed; and it generalizes well because it uses diverse feature data (beyond temperature) to detect abnormalities from multiple perspectives. However, it is exposed to bias from imbalanced datasets and relies on high-quality labeled data, which can be hard to obtain consistently in operational settings. These weaknesses differ from those of the temperature-based method, which depends on accurate thermal predictions and on threshold setting.

4. Accuracy Verification Results of the CCD Models

4.1. Accuracy Results of the CCD Model Training

The model’s high accuracy reflects its effectiveness in minimizing false alarms while ensuring the early detection of potential gearbox faults. This capability is particularly important in real-world operation, where early and reliable fault detection can significantly reduce downtime and maintenance costs. By identifying anomalies early, the model enables timely preventive actions, enhancing the operational reliability and longevity of wind turbine gearboxes.

Accuracy, a commonly used metric for evaluating classification models, is calculated as follows:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (6)$$

where

TP is the number of true positives.
TN is the number of true negatives.
FP is the number of false positives.
FN is the number of false negatives.

The performance of both CCD models, based on temperature prediction learning and condition label learning, was evaluated using the confusion matrix defined in Table 6. The test dataset, used to evaluate model accuracy after training, contained 500 SCADA data points: 214 normal and 286 abnormal. The temperature prediction learning model achieved 196 true positives (normal) and 281 true negatives (abnormal), with 18 false negatives and 5 false positives, yielding an accuracy of 95.4%. While this demonstrates reliable discrimination of operational states, the condition label learning model performed better, with 209 true positives and 285 true negatives and only 5 false negatives and 1 false positive, resulting in an accuracy of 98.8%. The improvement is attributed to the model's use of multiple operational features beyond temperature, which enables more robust and precise gearbox condition predictions. This is critical for reducing diagnostic uncertainty and optimizing predictive maintenance in industrial applications.
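As a quick arithmetic check of Equation (6), the reported accuracies can be reproduced from the counts above. This is a minimal sketch; the function name `accuracy` is chosen here for illustration. For the condition label model only the number of correct classifications (209 + 285) and the test set size (500) are used, since accuracy does not depend on how the remaining errors split between false positives and false negatives.

```python
def accuracy(tp, tn, fp, fn):
    """Equation (6): share of correctly classified samples."""
    return (tp + tn) / (tp + tn + fp + fn)

# Temperature prediction learning model: counts reported for the 500-point test set.
temperature_accuracy = accuracy(tp=196, tn=281, fp=5, fn=18)

# Condition label learning model: 209 + 285 correct classifications out of 500.
label_accuracy = (209 + 285) / 500

print(round(temperature_accuracy, 3), round(label_accuracy, 3))
```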

Model convergence was assessed using four-fold cross-validation to ensure consistency and generalizability. Accuracy, precision, recall, and F1-score were calculated for each fold, and their mean values are reported in Table 7 and Table 8 for both the temperature prediction and condition label prediction models.
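The four-fold procedure can be sketched as follows. This is an illustrative outline only: the shuffling scheme, the sample count of 20, and the confusion counts passed to `fold_metrics` are hypothetical, and the helper names are not from the paper. The per-fold accuracies at the end are the values listed in Table 7.

```python
import random
import statistics

def k_fold_indices(n_samples, k=4, seed=0):
    """Yield (train, test) index lists for k disjoint, roughly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

def fold_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F1-score from one fold's confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }

# Each sample appears in exactly one test fold.
splits = list(k_fold_indices(n_samples=20, k=4))

# Hypothetical confusion counts for a single fold.
example = fold_metrics(tp=90, tn=95, fp=5, fn=10)

# Per-fold accuracies of the temperature prediction model (Table 7).
fold_accuracy = [0.967, 0.968, 0.966, 0.967]
mean_accuracy = statistics.mean(fold_accuracy)
spread = statistics.pstdev(fold_accuracy)
print(example, mean_accuracy, spread)
```

A small spread across folds, as in Table 7, indicates that the result does not hinge on one particular train/test partition.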

The goal was to maximize cross-validation accuracy while minimizing variance across folds. For the temperature prediction model, the optimal configuration was a learning rate of 0.01, two hidden layers, and a batch size of 64, yielding the reported average accuracy of 0.9672. The condition label prediction model achieved its best performance (average accuracy of 0.9881) with the same configuration. These parameters were selected because they consistently delivered the highest performance metrics while maintaining low variability, ensuring robust generalization to unseen data.

The training and testing losses are illustrated in Figure 9 for both the temperature prediction learning-based CCD model and the condition label learning-based CCD model. Following refinement, the temperature prediction model obtained a validation accuracy of 96.72%; at epoch 276, its testing loss (6.38 × 10⁻⁵) was slightly greater than its training loss (6.26 × 10⁻⁵). The condition label model performed better, attaining a validation accuracy of 98.81%; at epoch 248, its testing loss (1.53 × 10⁻⁵) was marginally lower than its training loss (1.84 × 10⁻⁵). The cross-validation tables (Table 7 and Table 8) and the loss plots (Figure 9) together describe the stability and performance of both models.

4.2. Accuracy Results of the CCD Model

The performance of the condition diagnosis models was thoroughly evaluated using real-time SCADA data from four distinct cases (Cases 8–11), as depicted in Figure 10 and Figure 11, to compare the effectiveness of the temperature prediction learning and condition label learning approaches.

Each case compares predicted gearbox temperatures with actual values and plots the deviations against a standard threshold of 7.5 °C to detect anomalies. The threshold is usually set as a multiple of the standard deviation of the deviation between actual and predicted data [33], where the standard deviation is given by the following equation:

$$\mathrm{threshold} = \sqrt{\frac{\sum {(\mathrm{Deviation} - \mathrm{mean})}^{2}}{\text{Number of data points}}} \qquad (7)$$
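Equation (7) is the population standard deviation of the deviations, so it can be sketched with the Python standard library. In the sketch below, the temperature values, the `multiple` parameter, and the use of the absolute deviation for flagging are assumptions for illustration; only the 7.5 °C threshold comes from the text.

```python
import statistics

def deviation_threshold(measured, predicted, multiple=1.0):
    """Equation (7): population standard deviation of the deviations, times a chosen multiple."""
    deviations = [m - p for m, p in zip(measured, predicted)]
    return multiple * statistics.pstdev(deviations)

def flag_anomalies(measured, predicted, threshold):
    """Flag samples whose absolute deviation exceeds the threshold (assumed rule)."""
    return [abs(m - p) > threshold for m, p in zip(measured, predicted)]

# Hypothetical gearbox temperatures in degC.
measured = [60.0, 62.0, 61.0, 75.0, 63.0]
predicted = [59.0, 61.0, 62.0, 64.0, 62.0]

sigma = deviation_threshold(measured, predicted)
flags = flag_anomalies(measured, predicted, threshold=7.5)
print(sigma, flags)
```

With a fixed threshold, the single sample that deviates by 11 °C is flagged, while smaller excursions are ignored; this is the behavior that makes the method sensitive to the threshold choice.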

In the temperature prediction learning model (Figure 10), Cases 8, 10, and 11 showed significant deviations above the threshold prior to the repair dates, appropriately initiating maintenance actions; however, deviations also appeared during normal operation periods, indicating potential issues with the sensitivity of the fixed threshold. In Case 9, the temperature deviation did not exceed the threshold because the model's predicted values were too close to the actual temperatures, so the anomaly was missed before the repair time. Thus, although the temperature prediction-based CCD model can diagnose gearbox faults before repairs, it struggles to distinguish true faults from false alarms during normal post-repair operation, which can stem from its reliance on a fixed threshold. In Cases 8, 10, and 11, the frequent threshold exceedances reflect sensitivity to external factors rather than recurring faults; for example, transient operational states can skew temperature readings and cause false positives. In contrast, the condition label learning model (Figure 11) demonstrated superior predictive capability across all cases, detecting abnormalities well in advance of the gearbox fault and repair time. Specifically, abnormal signs were identified 2 days prior in Case 8, 3 days prior in Case 9, 1 day prior in Case 10, and 3 days prior in Case 11, with the frequency of abnormal signs increasing toward the repair period, in agreement with the historical maintenance records. The condition label-based CCD model flags fault conditions almost exclusively during fault periods, which is consistent with the 98.8% accuracy obtained on the test dataset.
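The lead times quoted above can be derived from timestamped abnormal labels and the recorded repair date. The sketch below shows one way to do this; the dates, the seven-day look-back window, and the function name `lead_time_days` are hypothetical and not taken from the paper.

```python
from collections import Counter
from datetime import date

def lead_time_days(abnormal_dates, repair_date, window_days=7):
    """Days between the earliest abnormal label inside the look-back window and the repair."""
    in_window = [d for d in abnormal_dates
                 if 0 < (repair_date - d).days <= window_days]
    if not in_window:
        return None  # no early warning was raised before the repair
    return (repair_date - min(in_window)).days

# Hypothetical repair date and dates on which samples were labeled abnormal.
repair = date(2024, 3, 10)
abnormal = [date(2024, 3, 8), date(2024, 3, 9), date(2024, 3, 9), date(2024, 3, 9)]

lead = lead_time_days(abnormal, repair)
daily_counts = Counter(abnormal)  # abnormal labels per day
print(lead, daily_counts)
```

In this made-up example the first abnormal label appears two days before the repair, and the daily count rises from one to three as the repair date approaches, mirroring the pattern described for Figure 11.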

5. Discussions and Conclusions

This study introduces a novel approach for wind turbine gearbox condition diagnosis that combines a DNN architecture with SCADA data and condition labeling. Advanced data processing techniques, including PCA for feature reduction and DBSCAN for data filtering, significantly enhance diagnostic accuracy while minimizing false positives and negatives. The approach uses deep learning to automatically extract complex patterns from SCADA data, in contrast with standard techniques such as SVMs, which rely on handcrafted features and frequently struggle with high-dimensional data. The preprocessing techniques built into the architecture improve accuracy and efficiency compared with other deep learning models, which is critical for optimizing operation and maintenance processes in wind energy systems. Validation on four test cases with historical fault data demonstrates the model’s capability to accurately diagnose gearbox failures and to predict fault conditions 1–3 days in advance, underscoring the importance of context-specific data collection and processing strategies. The proposed methodology bridges theoretical advancements and practical applications, illustrating how AI-driven condition-based maintenance models can transform SCADA data into actionable insights. By improving diagnostic precision and reducing downtime, this approach enhances the reliability and sustainability of wind turbines, reinforcing their role in global decarbonization efforts. While deep learning models offer significant advantages in predictive accuracy, their inherent lack of interpretability can limit trust and adoption in industrial applications.

To mitigate this issue, future work could incorporate explainable AI (XAI) techniques, such as SHAP or LIME, to provide insight into the model’s decision-making process. These methods can help identify which input features (such as wind speed, rotor speed, or oil temperature) most strongly influence the predicted outcomes, thereby improving transparency and stakeholder confidence. Visualizing feature importance and decision pathways through XAI tools can also help operators interpret results and make informed maintenance decisions. A further limitation is that the data cover a limited set of wind turbine operating conditions, and variations across geographical contexts may affect the generalizability of the model. However, by retraining or calibrating the AI model with SCADA data from a new turbine, the method described in this paper can be applied to other turbines. Future research will address class imbalances in fault types to better capture rare failure modes, and will collect samples from underrepresented settings, such as extreme climate zones and distinct turbine operating practices, to improve data diversity. It will also develop adaptive preprocessing pipelines for real-time SCADA integration that handle dynamic model updates, noise elimination, and streaming data synchronization, which is necessary for seamless deployment in operational facilities for preventive maintenance. Integration with real-time SCADA data streams is also planned to enable continuous and timely diagnostics. Furthermore, cross-industry collaboration and standardized protocols will be essential to accelerate the adoption of AI-enhanced diagnostic solutions in the renewable energy sector. These advancements have the potential to position wind energy as a cornerstone of sustainable energy systems, contributing to global efforts toward a low-carbon future.

Author Contributions

Conceptualization and methodology, X.-K.M. and M.-C.D.; software, X.-K.M. and J.-Y.L.; validation, J.-I.L., S.-J.L., B.-S.G. and M.-C.D.; investigation, X.-K.M. and M.-C.D.; writing—original draft preparation, X.-K.M.; writing—review and editing, X.-K.M., B.-S.G. and M.-C.D.; project administration, S.-J.L. and M.-C.D.; supervision, M.-C.D. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Korea Institute of Energy Technology Evaluation and Planning (KETEP) grant funded by the Korea government (MOTIE) (20223030020180, Development of durability evaluation and remaining useful life prediction technology for wind turbine life extension).

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

CCD	Current condition diagnosis
CMS	Condition Monitoring System
DBSCAN	Density-Based Spatial Clustering of Applications with Noise
DNN	Deep neural network
MSE	Mean Squared Error
O&M	Operations and Maintenance
PCA	Principal Component Analysis
ReLU	Rectified Linear Unit
SCADA	Supervisory Control and Data Acquisition
SDP	Symmetric dot pattern
SVM	Support Vector Machine

References

1. IPCC. Synthesis Report of The IPCC Sixth Assessment Report (AR6) Summary for Policymakers 4; Intergovernmental Panel on Climate Change: Geneva, Switzerland, 2023.
2. GWEC. Global Wind Report 2024. Available online: https://img.saurenergy.com/2024/05/gwr-2024_digital-version_final-1-compressed.pdf (accessed on 19 March 2025).
3. International Energy Agency (IEA). Renewables 2024. 2024. Available online: https://www.iea.org/reports/renewables-2024 (accessed on 19 March 2025).
4. Li, C.; Mogollón, J.M.; Tukker, A.; Steubing, B. Environmental Impacts of Global Offshore Wind Energy Development until 2040. Environ. Sci. Technol. 2022, 56, 11567–11577.
5. Revathi, L.V.; Sudalaimani, M.; Jeyadevi, S. An ABC-SVM Based Fault Prognosis of Wind Turbines Using SCADA Data. Available online: https://scientiairanica.sharif.edu/article_23593.html (accessed on 19 March 2025).
6. Ela, E.; Gevorgian, V.; Tuohy, A.; Kirby, B.; Milligan, M.; O’Malley, M. Market Designs for the Primary Frequency Response Ancillary Service-Part I: Motivation and Design. IEEE Trans. Power Syst. 2014, 29, 421–431.
7. Mutale, S.; Wang, Y.; Yasir, J. Enhanced Efficiency and Quality in Wind Turbine Gearbox Assembly: A New Parallel Assembly Sequence Planning (PASP) Model. Int. J. Sustain. Eng. 2024, 17, 1048–1065.
8. Elusakin, T.; Shafiee, M. Fault Diagnosis of Offshore Wind Turbine Gearboxes Using a Dynamic Bayesian Network. Int. J. Sustain. Energy 2022, 41, 1849–1867.
9. Costa, Á.M.; Orosa, J.A.; Vergara, D.; Fernández-Arias, P. New Tendencies in Wind Energy Operation and Maintenance. Appl. Sci. 2021, 11, 1–26.
10. Liu, Z.; Yang, P.; Zhang, P.; Lin, X.; Wei, J.; Li, N. Optimization of Fuzzy Control Parameters for Wind Farms and Battery Energy Storage Systems Based on an Enhanced Artificial Bee Colony Algorithm under Multi-Source Sensor Data. Sensors 2024, 24, 5115.
11. Yang, W.; Tavner, P.J.; Crabtree, C.J.; Wilkinson, M. Cost-Effective Condition Monitoring for Wind Turbines. IEEE Trans. Ind. Electron. 2010, 57, 263–271.
12. Javier Maseda, F.; López, I.; Martija, I.; Alkorta, P.; Garrido, A.J.; Garrido, I. Sensors Data Analysis in Supervisory Control and Data Acquisition (Scada) Systems to Foresee Failures with an Undetermined Origin. Sensors 2021, 21, 2762.
13. Popescu, V.F.; Scarlat, C. Supervisory Control and Data Acquisition (SCADA) Systems for Industrial Automation and Control Systems in Industry 4.0. Land Forces Acad. Rev. 2022, 27, 309–315.
14. Verma, A.; Zappalá, D.; Sheng, S.; Watson, S.J. Wind Turbine Gearbox Fault Prognosis Using High-Frequency SCADA Data. In Proceedings of the Journal of Physics: Conference Series, Institute of Physics, Delft, The Netherlands, 1–3 June 2022; Volume 2265.
15. Liton Hossain, M.; Abu-Siada, A.; Muyeen, S.M. Methods for Advanced Wind Turbine Condition Monitoring and Early Diagnosis: A Literature Review. Energies 2018, 11, 1309.
16. Wang, M.H.; Chen, F.H.; Lu, S.D. Research on Fault Diagnosis of Wind Turbine Gearbox with Snowflake Graph and Deep Learning Algorithm. Appl. Sci. 2023, 13, 1416.
17. Jamaludin, H.; Achlison, U.; Rokhman, N. Enhancing AI Model Accuracy and Scalability Through Big Data and Cloud Computing. J. Technol. Inform. Eng. 2024, 3, 296–307.
18. Qu, E.; Krishnapriyan, A.S. The Importance of Being Scalable: Improving the Speed and Accuracy of Neural Network Interatomic Potentials Across Chemical Domains. Adv. Neural Inf. Process. Syst. 2024, 37, 139030–139053.
19. Mostafavi, A.; Friedmann, A. Wind Turbine Condition Monitoring Dataset of Fraunhofer LBF. Sci. Data 2024, 11, 1108.
20. Menezes, D.; Mendes, M.; Almeida, J.A.; Farinha, T. Wind Farm and Resource Datasets: A Comprehensive Survey and Overview. Energies 2020, 13, 4702.
21. Firoozi, A.A.; Hejazi, F.; Firoozi, A.A. Advancing Wind Energy Efficiency: A Systematic Review of Aerodynamic Optimization in Wind Turbine Blade Design. Energies 2024, 17, 2919.
22. Kandemir, E.; Hasan, A.; Kvamsdal, T.; Abdel-Afou Alaliyat, S. Predictive Digital Twin for Wind Energy Systems: A Literature Review. Energy Inform. 2024, 7, 68.
23. Olabi, A.G.; Obaideen, K.; Abdelkareem, M.A.; AlMallahi, M.N.; Shehata, N.; Alami, A.H.; Mdallal, A.; Hassan, A.A.M.; Sayed, E.T. Wind Energy Contribution to the Sustainable Development Goals: Case Study on London Array. Sustainability 2023, 15, 4641.
24. Lipu, M.S.H.; Miah, M.S.; Hannan, M.A.; Hussain, A.; Sarker, M.R.; Ayob, A.; Saad, M.H.M.; Mahmud, M.S. Artificial Intelligence Based Hybrid Forecasting Approaches for Wind Power Generation: Progress, Challenges and Prospects. IEEE Access 2021, 9, 102460–102489.
25. Maldonado-Correa, J.; Martín-Martínez, S.; Artigao, E.; Gómez-Lázaro, E. Using SCADA Data for Wind Turbine Condition Monitoring: A Systematic Literature Review. Energies 2020, 13, 3132.
26. Carroll, J.; Koukoura, S.; McDonald, A.; Charalambous, A.; Weiss, S.; McArthur, S. Wind Turbine Gearbox Failure and Remaining Useful Life Prediction Using Machine Learning Techniques. Wind. Energy 2019, 22, 360–375.
27. Principal Component Analysis (PCA). GeeksforGeeks. Available online: https://www.geeksforgeeks.org/principal-component-analysis-pca/ (accessed on 19 March 2025).
28. DBSCAN Clustering in ML: Density Based Clustering. GeeksforGeeks. Available online: https://www.geeksforgeeks.org/dbscan-clustering-in-ml-density-based-clustering/ (accessed on 19 March 2025).
29. Feng, Y.; Zhang, X.; Jiang, H.; Li, J. Compound Fault Diagnosis of a Wind Turbine Gearbox Based on MOMEDA and Parallel Parameter Optimized Resonant Sparse Decomposition. Sensors 2022, 22, 8017.
30. Alikhashashneh, E.A.; Nahar, K.M.O.; Abual-Rub, M.; Alkhaldy, H.M. A Robust Method for Detecting Fake News Using Both Machine and Deep Learning Algorithms. Indones. J. Electr. Eng. Comput. Sci. 2024, 36, 1816–1826.
31. Nagamine, T.; Seltzer, M.L.; Mesgarani, N. On the Role of Nonlinear Transformations in Deep Neural Network Acoustic Models. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH; International Speech and Communication Association, San Francisco, CA, USA, 8–12 September 2016; Volume 8, pp. 803–807.
32. Sigmoid Activation Function. Deep Learning Basics. Coursera. Available online: https://www.coursera.org/articles/sigmoid-activation-function (accessed on 19 March 2025).
33. Park, J.; Kim, C.; Dinh, M.C.; Park, M. Design of a Condition Monitoring System for Wind Turbines. Energies 2022, 15, 464.

Figure 1. Preprocessing workflow for the AI model training.

Figure 2. The fault description of the gearboxes.

Figure 3. The relationship between wind speed and grid active power in a wind turbine.

Figure 4. A lifecycle failure in the wind turbine gearbox.

Figure 5. The relationship between wind speed and (a) gearbox temperature; (b) rotor bearing temperature; (c) rotor speed; (d) oil tank temperature; (e) oil pump pressure; (f) oil tank inlet pressure.

Figure 6. Noise filtering results using the DBSCAN algorithm: (a) Filtering out noise in an anomalous dataset; (b) filtering out noise in a normal dataset.

Figure 7. Process of designing the CCD model based on temperature prediction learning.

Figure 8. Process of designing the CCD model based on condition label learning.

Figure 9. Training and testing losses for (a) the temperature prediction learning-based CCD model and (b) the condition label learning-based CCD model.

Figure 10. Verification accuracy of the CCD model based on temperature prediction learning.

Figure 11. Verification accuracy of the CCD model based on condition label learning.

Table 1. Specifications of the 2 MW wind turbine gearbox.

Parameter	Value
Rated power	2 MW
Cut-in/cut-out wind speed	3–25 m/s
Rated wind speed	11 m/s
Bending strength (fatigue)	$S_{F} > 1.5$
Surface durability (fatigue)	$S_{H} > 1.2$
Rated rotor speed	16 rpm
Rated generator speed	1400 rpm
Transmission ratio	1:83.83 (±1%)
Type	Two planetary stages and one parallel stage

Table 2. The failure history of the gearbox in the wind turbine.

Case	Contents	Repair
1	Gearbox bearing temperature rise	Gearbox thermo by-pass valve replacement
2	Gearbox filter clogged	Gearbox oil refill
3	Gearbox oil temperature rise	Gearbox oil filter replacement
4	Gearbox pump high-speed alarm occurs	Gearbox oil filter, thermo by-pass valve replacement
5	Gearbox internal gear broken	Gearbox electrical pump motor, oil filter replacement
6	Gearbox pump pressure drop	Gearbox repair
7	Gearbox oil pump cable is exposed	Cable taping and conduit fixing
8	Gearbox filter clogged	Gearbox oil filter, thermo by-pass valve replacement
9	Gearbox pump high-speed alarm occurs	Internal rewiring
10	Gearbox bearing temperature rise	Gearbox thermo by-pass valve replacement
11	Gearbox internal gear broken	Gearbox repair

Table 3. Normal operating sensor data range of gearboxes.

Parameter	Units	Reason for Selection
Rotor speed	Rpm	Indicator of gearbox load/stress.
Gearbox oil tank temperature	degC	Reflects lubrication/cooling efficiency.
Oil tank inlet pressure	Pa	Low pressure signals blockages/leaks.
Oil pump pressure	Pa	Unstable pressure signals system issues.
Gearbox temp	degC	Overheating signals wear, lubrication, or bearing faults.
Gearbox bearing temperature	degC	Indicates temperature abnormalities, wear, fatigue, or cracking.
Environment temperature	degC	Impacts cooling/lubrication efficiency.
Wind speed	m/s	Links power output/rotor speed to operational context.
Active power	kW	Deviations indicate inefficient gearbox performance.

Table 4. Specifications of the DNN architecture for the temperature prediction learning-based CCD model.

Specifications	Value
Number of inputs	8
Number of outputs	1
Number of hidden layers	3
Number of neurons in each hidden layer	128
Activation function in hidden layers	ReLU
Activation function in output layers	Sigmoid
Loss function	MSE
Learning rate	0.001
Epochs	276
Batch size	64

Table 5. Specifications of the DNN architecture for the condition label learning-based CCD model.

Specifications	Value
Number of inputs	9
Number of outputs	1
Number of hidden layers	3
Number of neurons in each hidden layer	512; 256; 128
Activation function in hidden layers	ReLU
Activation function in output layers	Sigmoid
Loss function	MSE
Learning rate	0.001
Epochs	248
Batch size	64

Table 6. Confusion matrix for testing.

	Actual Positive	Actual Negative
Predicted Positive	TP	FP
Predicted Negative	FN	TN

Table 7. The accuracy results of the temperature prediction learning-based CCD model.

Folds	Accuracy	Precision	Recall	F1-Score
1	0.967	0.970	0.965	0.967
2	0.968	0.971	0.966	0.968
3	0.966	0.969	0.964	0.966
4	0.967	0.970	0.965	0.967
Average	0.9672	0.9701	0.9651	0.9675

Table 8. The accuracy results of the condition label learning-based CCD model.

Folds	Accuracy	Precision	Recall	F1-Score
1	0.988	0.990	0.986	0.988
2	0.988	0.990	0.987	0.988
3	0.987	0.989	0.985	0.987
4	0.988	0.990	0.986	0.988
Average	0.9881	0.9901	0.9863	0.9881

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
