GMM-Based Lightning Damage Detection for Wind Turbines Under De-Rated Operation Using the Scaled Power Curve

Matsui, Takuto; Naito, Koki; Yamamoto, Kazuo

doi:10.3390/en19071790


by Takuto Matsui *, Koki Naito and Kazuo Yamamoto

Department of Electrical and Electronic Engineering, Chubu University, 1200 Matsumoto, Kasugai 487-8501, Aichi, Japan

* Author to whom correspondence should be addressed.

Energies 2026, 19(7), 1790; https://doi.org/10.3390/en19071790

Submission received: 15 February 2026 / Revised: 3 April 2026 / Accepted: 4 April 2026 / Published: 6 April 2026

(This article belongs to the Section A3: Wind, Wave and Tidal Energy)


Abstract

Many countries are actively promoting the large-scale deployment of wind power generation, both onshore and offshore. However, damage to wind turbines caused by winter lightning has become a growing concern in Japan. Japan has made efforts since an early stage to establish legal frameworks for reducing lightning damage; nevertheless, lightning damage to wind turbines remains a problem that has not been completely eradicated. After a wind turbine has been struck by lightning, it is restarted only after its structural integrity has been verified; however, the current method relies on visual inspection by workers, making accurate and rapid inspections difficult. One approach to solving this problem is to use anomaly detection techniques based on SCADA data. Research is currently underway to implement this approach. However, anomaly detection methods based on SCADA data have been criticized for their limited ability to accommodate multiple operating modes, including de-rated operation. In this study, we propose the “scaled power curve” as a robust feature that is less affected by operating modes, with its effectiveness verified through anomaly detection. This method showed improved anomaly detection accuracy compared to using the original power curve as a feature; moreover, in the present case, the method remained effective under de-rated operation. By using this feature, it is expected that a lightning damage detection model can be developed, contributing to improved availability of wind turbines.

Keywords:

AI technology; anomaly detection; GMM; lightning detection system; lightning protection; power curve; de-rated operation; SCADA; wind turbine

1. Introduction

The focus on wind power generation technology has increased significantly on a global scale, with its large-scale deployment progressing in numerous countries, both onshore and offshore. At the 2023 Conference of the Parties of the UNFCCC (COP28), an ambitious target of tripling the global installed capacity of renewable energy was set, further strengthening the role of wind power [1]. Similar trends are also being observed in Japan. In the 7th Strategic Energy Plan proposed by the Ministry of Economy, Trade and Industry, offshore wind power is positioned as a “key driver” for making renewable energy a major power source [2].

On the other hand, Japan faces a distinctive challenge in wind power development, as winter lightning frequently damages wind turbines, thereby hindering the sustainable expansion of wind power generation projects. Winter lightning is a frequent phenomenon along the coast of the Sea of Japan and is characterized by long-lasting currents and high energy levels [3]. Approximately one-quarter of Japan’s wind turbines are situated along the coast of the Sea of Japan, where wind conditions are favorable; consequently, many of these turbines are exposed to the threat of winter lightning [4]. According to a study, lightning strikes caused 58% of wind turbine accidents in Japan that resulted in insurance claims [5]. This indicates that more than half of all major wind turbine accidents covered by insurance are caused by lightning strikes.

Against this backdrop of challenges, Japan has been developing legal frameworks to mitigate damage to wind turbines caused by lightning strikes. The most fundamental regulations on lightning protection for wind turbines are set out in the “Interpretation of Technical Standards for Wind Power Generation Facilities” [6]. This regulation mandates that all wind turbines in Japan be equipped with Lightning Detection Systems (LDSs) and requires that they be immediately stopped in the event of a lightning strike. This allows for the safe stoppage of wind turbines struck by lightning, thereby reducing the risk of severe damage such as blade breakage [7]. Wind turbines stopped due to lightning strikes are typically inspected by workers to verify their soundness before being restarted. From 2024 onwards, inspections using digital technologies such as AI and drones were legally permitted, which was expected to lead to more sophisticated and efficient inspection methods [8].

Nevertheless, these regulations alone have not been sufficient to eliminate the risk of lightning damage to wind turbines; therefore, there is a need to develop a lightning damage detection system that operates rapidly and accurately. An investigation report [7] indicates that minor lightning damage may go undetected by visual inspection, potentially leading to severe failures after turbine restart. Moreover, the current operational practice tends to be overly conservative, as turbines are stopped after any lightning strike regardless of actual damage. This approach, combined with the time required for site access and inspection, often results in prolonged downtime and reduced availability [9]. For these reasons, the negative impact of lightning strikes on wind turbine operation remains significant; hence, there is an urgent need to develop technologies for rapid and accurate remote monitoring of turbine soundness.

A large volume of research has been dedicated to methods for detecting and locating faults in wind power plants [10]. As an IoT-based method, Liu et al. proposed a novel iterative nonlinear filter for the detection of bearing faults in wind turbines. This approach was shown to accurately identify a naturally damaged inner-race fault in a 15-year-old in-service bearing, outperforming conventional filtering and Hilbert-envelope methods [11]. As a data-driven technique, Wang et al. proposed a multi-channel CNN for wind-turbine blade and pitch-angle fault diagnosis, achieving up to 87.8% accuracy on four FRP-blade states using real vibration data from a laboratory-scale test turbine [12]. These studies indicate the need for technology capable of automatically monitoring faults in remote wind turbines.

One method for verifying the soundness of wind turbines involves the use of a Supervisory Control and Data Acquisition (SCADA) system. Because SCADA systems [13] are already installed on all turbines, turbine soundness can be expected to be monitored without additional equipment. Several methods have been proposed for detecting anomalies in wind turbines based on SCADA data, including approaches using feature selection and support vector regression [14], feature extraction through clustering and dimensionality reduction combined with deep learning [15], and change-point detection techniques [16]. However, attempts to detect lightning damage have not yet been put to practical use. Our research group is currently investigating methods for the rapid and accurate detection of lightning damage to wind turbines using anomaly detection techniques based on SCADA data. These studies have identified wind speed, rotational speed, and power as effective features for detecting lightning damage [17]; moreover, anomaly detection models such as the Gaussian Mixture Model (GMM) and autoencoders have been proposed [18,19].

Figure 1 illustrates the conceptual flow of a wind turbine lightning protection system based on SCADA data, which represents the ultimate objective of this study. After a lightning strike, the turbine is automatically restarted in de-rated operation, during which it is immediately stopped if an anomaly is detected. It is then operated in rated mode, where anomaly detection is applied. This process enables turbines to be restarted faster after lightning strikes while maintaining safety, thereby improving their availability.

However, the present method cannot accommodate multiple operating modes of wind turbines, including de-rated operation [19,20]. The anomaly detection model needs to be robust to variations between rated operation and de-rated operation, as the switching of modes is independent of the structural condition of the wind turbine. Moreover, it is difficult to achieve robustness solely by training the model with the available data, because de-rated operation is less frequent and its data are scarcer than those for rated operation.

This study proposes an anomaly detection method based on a “scaled power curve” for detecting lightning damage in wind turbines under de-rated operation. The scaled power curve, normalized by the maximum power limit, aligns power curves across operating modes and provides a feature robust to such variations. Accordingly, anomaly detection based on this feature enables robust performance under multiple operating modes. In addition, a GMM is employed as a simple and effective method for SCADA-based anomaly detection to clearly evaluate the effectiveness of the proposed feature [18,21].

The structure of this paper is as follows. Section 2 describes the SCADA data used in this study and the wind turbine accident. Section 3 describes the process and the evaluation methods used for anomaly detection. Section 4 presents the results of anomaly detection and compares them with those obtained without using the scaled power curve. Section 5 is dedicated to a discussion of the feasibility of this method based on the results obtained. Finally, Section 6 provides the conclusions of the paper.

2. SCADA Data and the Accident of a Wind Turbine

2.1. SCADA Data

SCADA is a system that monitors and records the information necessary for controlling wind turbines [13]. Although the type of data recorded varies depending on the manufacturer and model of the wind turbines, most turbines record basic information such as wind conditions, rotational speed, and power output. Modern systems can record this information every second, whereas many operators store only 10-min average data due to storage capacity limitations.

2.2. De-Rated Operation

In this study, “de-rated operation” is defined as operating a wind turbine with its output limit set below its rated output. De-rated operation is automatically activated when a temporary output limitation is deemed necessary based on vibration levels and temperature conditions during operation; it can also be manually activated during inspection. During de-rated operation, the pitch control system reduces rotational speed of the blade to prevent the output or rotational speed from exceeding the set limits.

It is essential to develop an anomaly detection model that can be applied even during de-rated operation in order to improve the accuracy of lightning damage detection for wind turbines. Conventional anomaly detection methods typically define anomalies as data with a low probability of occurrence. However, this approach may erroneously classify infrequent de-rated operation as abnormal, leading to false positives [19,20]. Accordingly, the model must use features that are robust to multiple operating modes, including de-rated operation.

2.3. Lightning Accident of a Wind Turbine

This subsection introduces the case of a wind turbine accident that was used to test the proposed method of anomaly detection. In one reported case, a wind turbine was struck by lightning during operation, causing the detachment of the tip receptor at the blade. The primary technical specifications of the wind turbine are enumerated in Table 1, while Figure 2 provides the wind speed distribution in the area where the turbine has been installed. Additionally, the details of this accident are as follows:

The wind turbine was under de-rated operation at the time of the accident.
The accident was revealed during a routine inspection when the tip receptor and its appurtenant components in the proximity of the wind turbine were identified by workers.
As recorded by the lightning location system, a total of nine lightning strikes were observed within a radius of 3 km of the wind turbine, occurring during the period between the most recent inspection and the identification of the damage.
Subsequent inspection revealed delamination in the trailing edge of the blade, accompanied by evidence of lightning strike marks. Consequently, the observed damage is probably attributable to lightning strikes.
It has been estimated that the air within the blade may expand rapidly as a result of the lightning discharge penetrating the blade. This resulted in the receptor detaching from the tip of the blade. The occurrence of such damage is influenced by the condition of the insulation of the down conductor and the receptor block in the blade, as well as the strength of the adhesive along the leading and trailing edges.

Table 1. Technical specifications of the wind turbine.

Rated power	2 MW
Rotor diameter	83.3 m
Nacelle height	70 m
Rated wind speed	13 m/s
Cut-in wind speed	3.5 m/s
Cut-out wind speed	25 m/s

Figure 2. The wind speed distribution in the area where the turbine has been installed [22].

In this study, SCADA data from this wind turbine were used to verify whether the proposed method could detect this type of accident. Detecting the accident as abnormal would demonstrate that the method is applicable even under de-rated operation. Table 2 shows the temporal resolution and size of the data used in this study. The training dataset includes only normal operating data, encompassing both rated and de-rated operation, from the wind turbine before the accident. The test dataset includes three types of data: accident scenario data, normal rated operation data, and normal de-rated operation data. This makes it possible to verify whether the system detects the accident as abnormal while ensuring that normal data from multiple operating modes are not mistakenly identified as abnormal.

The accident-scenario data refer to the observations collected from the time the blade’s soundness was last confirmed until the detachment of the receptor was discovered. Because the exact time at which the receptor fell off is unknown, the data recorded before the first of the nine lightning strikes—observed at 1571 min—were defined as normal, whereas the data recorded after the ninth strike—observed at 17,039 min—were defined as abnormal.
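The labeling rule above can be sketched in a few lines of Python. This is only an illustration: the function name and the integer label codes are hypothetical, and treating samples recorded exactly at a strike time as undefined is an assumption, since the text does not specify the boundary handling.

```python
import numpy as np

FIRST_STRIKE_MIN = 1571    # first of the nine strikes (value from the text)
NINTH_STRIKE_MIN = 17039   # ninth strike (value from the text)

def label_accident_data(minutes):
    """Return 0 (normal), 1 (abnormal) or -1 (undefined, excluded) per sample.

    Samples between the first and ninth strikes are undefined because the
    exact time of receptor detachment is unknown. Samples exactly at a strike
    time are also treated as undefined (an assumption of this sketch).
    """
    minutes = np.asarray(minutes)
    labels = np.full(minutes.shape, -1, dtype=int)
    labels[minutes < FIRST_STRIKE_MIN] = 0
    labels[minutes > NINTH_STRIKE_MIN] = 1
    return labels

labels = label_accident_data([0, 1571, 5000, 17039, 17040])
```

Only the samples labeled 0 or 1 would enter the evaluation; the undefined interval is left out, as described in the text.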

3. Lightning Damage Detection Method

3.1. Workflow

Figure 3 shows the workflow of the proposed method. As illustrated in Figure 3, the method consists of a training process and a test process. These processes were implemented in Python 3.9 using the scikit-learn library.

The training process is conducted in the following sequence. First, the scaled power curve is calculated from the training dataset and used as a feature (see Section 3.2). Second, pre-processing is applied to these features (see Section 3.2). Third, the Gaussian Mixture Model (GMM) is trained using the pre-processed features (see Section 3.3). Fourth, the trained GMM is used to calculate the anomaly scores of the training data, in order to determine the threshold. The anomaly threshold was set to the 99.9th percentile of the training scores, corresponding to a false-alarm rate of 0.1%. This value was selected to minimize unnecessary turbine shutdowns, as frequent false alarms increase downtime and maintenance costs. Therefore, a 0.1% false-alarm rate was considered an appropriate trade-off between detection sensitivity and operational reliability.
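As a minimal sketch of the threshold step, the 99.9th percentile of the training scores can be taken with NumPy. The scores below are synthetic stand-ins (not the paper's data), and the function name is hypothetical.

```python
import numpy as np

def anomaly_threshold(train_scores, percentile=99.9):
    """Threshold such that roughly (100 - percentile)% of training scores exceed it."""
    return float(np.percentile(train_scores, percentile))

# Synthetic training scores, used only to illustrate the resulting false-alarm rate.
rng = np.random.default_rng(0)
train_scores = rng.normal(size=100_000)

thr = anomaly_threshold(train_scores)
false_alarm_rate = float(np.mean(train_scores > thr))  # fraction of training data flagged
```

By construction, about 0.1% of the training samples lie above the threshold, which is the false-alarm rate quoted in the text for normal data.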

The test process is conducted in the following sequence. First, the scaled power curve is calculated from the test dataset using the same method as in the training process, after which pre-processing is applied. Second, the extracted features are input into the trained GMM to calculate the anomaly score. Finally, data are detected as abnormal if their anomaly scores exceed the threshold defined during the training process.

3.2. Feature Extraction: Scaled Power Curve

In this study, the scaled power curve was calculated from SCADA data and used as a feature. The scaled power curve is a normalized representation that aligns the power curves during both rated and de-rated operation. The calculation of the scaled power curve assumes that the power curve of the wind turbine follows Equation (1) below [23].

P = \begin{cases} K_{w} C_{P} v_{w}^{3} & (K v_{w}^{3} < P_{max}) \\ P_{max} & (K v_{w}^{3} \geq P_{max}) \end{cases}, \quad K_{w} = \frac{1}{2} ρ A

(1)

Here, P denotes the output, v_w denotes the wind speed, and P_{max} denotes the maximum allowable output. In addition, C_P is the power coefficient, which depends on factors such as blade shape; ρ is the air density; A is the rotor area of the wind turbine; and K is assumed to be a constant.

P_{max} varies depending on the operating mode of the wind turbine and can range from 0 kW to the rated output. At this stage, the power curve (i.e., the v_w-P characteristic) can be normalized by dividing P by P_{max} and v_w by \sqrt[3]{P_{max}}, thereby obtaining a scaled power curve that is independent of the value of P_{max}.
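The independence from P_{max} can be checked directly from Equation (1). The short derivation below is a sketch that assumes K equals the product K_w C_P, is constant across operating modes, and that P_{max} > 0; the starred symbols are the scaled quantities P* and v_w*.

```latex
P^{*} = \frac{P}{P_{max}}, \qquad
v_{w}^{*} = \frac{v_{w}}{\sqrt[3]{P_{max}}}
\;\Rightarrow\;
K v_{w}^{3} = K \left(v_{w}^{*}\right)^{3} P_{max}

P^{*} =
\begin{cases}
K \left(v_{w}^{*}\right)^{3} & \left(K \left(v_{w}^{*}\right)^{3} < 1\right) \\
1 & \left(K \left(v_{w}^{*}\right)^{3} \geq 1\right)
\end{cases}
```

Under these assumptions P_{max} no longer appears in the scaled relation, so rated and de-rated data fall on the same curve and saturate at 1.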

The scaled power curve can be calculated using the flowchart shown in Figure 4. P_{max} may be determined by either the upper limit of the power or the upper limit of the rotational speed. Consequently, as shown in Figure 4, the upper limit of the power is compared with the power corresponding to the upper limit of the rotational speed, and the lower value is adopted. Note that f_{N \to P} in Figure 4 is a function that returns P_{max} when the upper limit of the rotational speed is given, and it must be approximated in advance from the SCADA data. In this study, linear regression was performed on the N_lim − P characteristics of the normal data to obtain an approximate equation. The calculated P* and v_w* are then entered into the GMM.
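The calculation described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names, the constant `K`, the limit values, and the three regression points are all hypothetical, and `np.polyfit` stands in for the linear regression used to approximate f_{N→P}.

```python
import numpy as np

def fit_speed_to_power(n_lim, p_sat):
    """Least-squares line P = a * N_lim + b, standing in for f_{N->P}."""
    a, b = np.polyfit(n_lim, p_sat, deg=1)
    return lambda n: a * np.asarray(n, dtype=float) + b

def scaled_power_curve(v_w, p, p_lim, n_lim, f_n_to_p):
    """Return (v_w*, P*), taking P_max as the lower of the two limits (P_max > 0 assumed)."""
    p_max = np.minimum(p_lim, f_n_to_p(n_lim))
    return v_w / np.cbrt(p_max), p / p_max

# Hypothetical regression data: speed limits and the power at which output saturated.
f_n_to_p = fit_speed_to_power([10.0, 15.0, 20.0], [1000.0, 1500.0, 2000.0])

K = 0.9                                  # hypothetical constant in P = K * v_w**3
v_w = np.array([5.0, 8.0, 12.0, 20.0])   # wind speeds in m/s
modes = [(2000.0, 20.0), (1000.0, 20.0), (2000.0, 10.0)]  # (power limit, speed limit)

curves = []
for p_lim, n_lim in modes:
    p_max = min(p_lim, float(f_n_to_p(n_lim)))
    p = np.minimum(K * v_w**3, p_max)    # idealized power curve of Equation (1)
    curves.append(scaled_power_curve(v_w, p, p_lim, n_lim, f_n_to_p))
```

In this idealized setting, the scaled points from all three modes lie on the single curve P* = min(K v_w*^3, 1), whether the limit comes from the power setting or from the rotational speed.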

Figure 5 compares the normal power curve and the scaled power curve. It is evident that the (a) normal power curve exhibits multiple saturation values. The presence of these saturation values is due to differences in operation modes, and the saturation value of around 1000 kW indicates de-rated operation. By contrast, in the (b) scaled power curve, these saturation values are scaled to 1, thus demonstrating a uniform distribution irrespective of the operating modes. The utilisation of scaled wind speed and scaled power as features is anticipated to facilitate robust anomaly detection, independent of operating modes.

In this study, data points corresponding to periods of inactivity were excluded from the extracted features as part of pre-processing. This is because the study focuses on anomaly detection in wind turbines under operating conditions. Here, data with power less than 1 kW were excluded.
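A minimal sketch of this pre-processing step is shown below. The function name and array layout are hypothetical; only the 1 kW cut-off comes from the text.

```python
import numpy as np

def drop_inactive(features, power_kw, min_power_kw=1.0):
    """Keep only samples whose power is at least min_power_kw.

    Returns the filtered features and the boolean mask, so that excluded
    samples can still be marked as missing in time-series plots.
    """
    features = np.asarray(features)
    keep = np.asarray(power_kw) >= min_power_kw
    return features[keep], keep

power = np.array([0.0, 0.5, 1.0, 500.0])               # kW
feats = np.column_stack([np.arange(4.0), power])       # placeholder feature rows
kept, mask = drop_inactive(feats, power)
```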

3.3. Gaussian Mixture Model

The GMM is an unsupervised learning method commonly employed for anomaly detection [24]. The GMM estimates the probability distribution of normal data using a weighted linear combination of multiple Gaussian distributions, thereby enabling the calculation of data likelihoods. Abnormal data have a low probability of occurrence, resulting in a low likelihood; this property can be exploited for anomaly detection.

The basic equation for the GMM is shown in Equation (2) below. In Equation (2), p denotes the probability distribution approximated by the GMM, \mathcal{N} denotes the multivariate Gaussian distribution, and x denotes a d-dimensional random variable. Equation (2) shows that the probability distribution is approximated as a weighted sum of K Gaussian distributions. Here, μ_{k} and Σ_{k} denote the mean vector and covariance matrix of the k-th Gaussian distribution, respectively, and π_{k} denotes its weight. The parameter set {μ_{k}, Σ_{k}, π_{k}} is denoted by Θ, and these parameters constitute the model parameters of the GMM. The optimal value of Θ is estimated using the EM algorithm [25] by maximizing the log-likelihood function shown in Equation (3), where x_{i} (i = 1, ..., N) are the training samples. Note that the value of K must be determined before executing the EM algorithm.

p (x | Θ) = \sum_{k = 1}^{K} π_{k} \mathcal{N} (x | μ_{k}, Σ_{k})
\mathcal{N} (x | μ_{k}, Σ_{k}) = \frac{\exp ( - \frac{1}{2} (x - μ_{k})^{T} Σ_{k}^{- 1} (x - μ_{k}) )}{(2 π)^{\frac{d}{2}} | Σ_{k} |^{\frac{1}{2}}}
0 \leq π_{k} \leq 1, \quad \sum_{k = 1}^{K} π_{k} = 1

(2)

L (Θ) = \ln \prod_{i = 1}^{N} p (x_{i} | Θ)

(3)

Equation (4) below gives the anomaly score for a given data point x. The score is derived from the likelihood output by the trained GMM. As shown in Equation (4), the greater the deviation of x from the distribution of the training data, the higher the anomaly score. Whether x is classified as normal or abnormal is determined by comparing this anomaly score with a threshold.

anomaly score = - \ln p (x | Θ)

(4)
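Equation (4) maps directly onto scikit-learn, where `GaussianMixture.score_samples` returns the per-sample log-likelihood. The sketch below uses synthetic two-dimensional features and only 4 components to stay small; the paper uses K = 128 on the scaled wind speed and scaled power, so the data and settings here are illustrative assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic "normal" features standing in for (v_w*, P*) pairs.
rng = np.random.default_rng(0)
X_train = rng.normal(loc=[0.5, 0.5], scale=0.1, size=(2000, 2))

# Fit the mixture with the EM algorithm (n_components is K in Equation (2)).
gmm = GaussianMixture(n_components=4, random_state=0).fit(X_train)

def anomaly_score(model, X):
    """Equation (4): negative log-likelihood under the trained GMM."""
    return -model.score_samples(X)

# Threshold from the training scores, then classify two test points.
threshold = np.percentile(anomaly_score(gmm, X_train), 99.9)
X_test = np.array([[0.5, 0.5],     # close to the training distribution
                   [3.0, -2.0]])   # far from it
is_abnormal = anomaly_score(gmm, X_test) > threshold
```

The point near the centre of the training data receives a low score, while the distant point exceeds the threshold and is flagged.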

The GMM settings utilised in this study are delineated in Table 3 below. The GMM was defined using scikit-learn: an open-source machine learning library for the Python programming language. The hyperparameter K was set to 128 based on previous research [21], as it must be determined prior to anomaly detection. A sensitivity analysis with respect to K was also conducted, as described in Section 3.4. The remaining parameters were set to their default values provided by the implementation library, as these settings are widely used and ensure stable convergence.

3.4. Evaluation Method

In this study, the performance of the anomaly detection was evaluated using a confusion matrix, the Precision–Recall (PR) curve, Average Precision (AP), Receiver Operating Characteristic (ROC) curve, and Area Under the Curve (AUC). This subsection provides an overview of these evaluation methods.

The confusion matrix is a table that represents the results of anomaly detection, categorized into the four events shown in Figure 6. Since test data used for anomaly detection are typically highly imbalanced, with abnormal samples being extremely scarce relative to normal data, a confusion matrix is required to evaluate performance separately for each class.

Recall and precision are defined by Equation (5) using the values in the confusion matrix [26]. Recall indicates the proportion of actual abnormal data that are correctly detected, whereas precision indicates the proportion of detected data that are actually abnormal. Recall and precision exhibit a trade-off relationship: lowering the threshold increases recall, whereas it decreases precision.

Recall = \frac{TP}{TP + FN}, \quad Precision = \frac{TP}{TP + FP}

(5)
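As a small worked example of Equation (5), the function below computes both metrics from confusion-matrix counts. The counts in the usage line are hypothetical and are not taken from Figure 10.

```python
def recall_precision(tp, fp, fn):
    """Recall and precision from confusion-matrix counts (Equation (5)).

    Returns NaN for a metric whose denominator is zero.
    """
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    return recall, precision

# Hypothetical counts: 80 detected anomalies, 20 missed, 20 false alarms.
recall, precision = recall_precision(tp=80, fp=20, fn=20)
```

Note that true negatives do not enter either metric, which is why both remain informative when normal samples vastly outnumber abnormal ones.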

The PR curve illustrates the relationship between recall and precision as the threshold varies [26]. AP represents the average precision calculated from the PR curve: a value closer to 1 indicates higher anomaly detection performance. Calculating the AP does not require setting a threshold, thereby allowing the evaluation of anomaly detection performance without relying on a single threshold value.

The ROC curve plots the True Positive Rate (TPR) against the False Positive Rate (FPR) as the threshold varies [27]. Equation (6) defines TPR and FPR. The area under the ROC curve is defined as the AUC. The more accurate the anomaly detection model, the larger the AUC.

TPR = \frac{TP}{TP + FN}, \quad FPR = \frac{FP}{FP + TN}

(6)
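The threshold-free metrics can be computed with scikit-learn's `average_precision_score` and `roc_auc_score`, which take the true labels and the continuous anomaly scores. The eight labels and scores below are made up for illustration.

```python
import numpy as np
from sklearn.metrics import (average_precision_score, precision_recall_curve,
                             roc_auc_score, roc_curve)

# Toy data: 1 = abnormal, 0 = normal; a higher score means more anomalous.
y_true = np.array([0, 0, 0, 0, 1, 0, 1, 1])
scores = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8])

# Curves obtained by sweeping the threshold over the scores.
precision, recall, _ = precision_recall_curve(y_true, scores)
fpr, tpr, _ = roc_curve(y_true, scores)

# Scalar summaries that do not depend on a single threshold.
ap = average_precision_score(y_true, scores)
auc = roc_auc_score(y_true, scores)
```

In this toy case one normal sample (score 0.6) outranks one abnormal sample (score 0.5), so both AP and AUC fall slightly below 1.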

In addition, a sensitivity analysis of the GMM hyperparameter K was conducted by varying K (32, 64, 128, and 256) to evaluate its effect on anomaly detection results.

4. Detection Result

This section presents the findings of a test process that aimed to ascertain the method’s capacity to detect blade anomalies resulting from lightning strikes. The test process utilized wind turbine SCADA data, as outlined in Section 2.3.

4.1. During Lightning Strike Accident

Figure 7 illustrates the anomaly detection results for a wind turbine during a lightning strike event, focusing on the temporal evolution of detection performance. Unlike a confusion matrix, which provides only aggregated classification performance, this figure visualizes how anomalies are detected over time, particularly before and after lightning strikes. Figure 7a–d illustrate the power curves and temporal characteristics of power, while Figure 7e shows the temporal characteristics of the anomaly score. In Figure 7a,b, the plot colors indicate the true labels, whereas in Figure 7c,d, they indicate the predicted labels from the anomaly detection. Red plots represent abnormal or predicted-abnormal data, blue plots represent normal or predicted-normal data, and gray plots represent undefined data, which are excluded from the evaluation. Black plots represent missing data, that is, data that were excluded during pre-processing.

As illustrated in Figure 7e, the anomaly score increases after the 9th lightning strike, demonstrating that the proposed method can accurately detect abnormal data. This result suggests that the scaled power curve may be useful for detecting lightning damage in the present case.

4.2. During Normal Rated Operation

Next, Figure 8 shows the results of anomaly detection using SCADA data during normal rated operation. The legend rules in Figure 8 are the same as those in Figure 7; however, the data in Figure 8 include only the normal class. As shown in Figure 8, all data points except six were correctly classified as normal. This result suggests that the anomaly detection model is robust to unseen normal data and does not overfit the training data. The false positives occur when the wind turbine stops and restarts, as shown in Figure 8d, indicating that transient changes during these events may be misclassified as abnormal.

4.3. During Normal De-Rated Operation

Finally, Figure 9 shows the results of anomaly detection using SCADA data during normal de-rated operation. The legend rules in Figure 9 are the same as those in Figure 7; however, the data in Figure 9 include only the normal class. As shown in Figure 9, all data points except fourteen were correctly classified as normal. This indicates that the method does not misinterpret data during de-rated operation as abnormal, suggesting that the anomalies detected in Figure 7 are likely to reflect only damage caused by lightning strikes. It has also been confirmed that the scaled power curve is a robust feature applicable across multiple operating modes, including de-rated operation. The false positives occur when the wind turbine stops and restarts, as shown in Figure 9d, indicating that transient changes during these events may be misclassified as abnormal, similar to the results in Section 4.2.

5. Discussion

5.1. Evaluation and Comparison

This subsection provides a comparison between the results obtained using the scaled power curve as a feature and those obtained using the simple power curve. Note that the only difference between the two methods lies in the features; all other parameters are kept identical.

First, Figure 10 shows the anomaly detection results for three scenarios (during a lightning strike accident, during normal rated operation, and during normal de-rated operation), presented as a confusion matrix. Figure 10a shows the result of the conventional method, obtained using the power curve as a feature. Figure 10b shows the result of the proposed method, obtained using the scaled power curve as a feature.

In the conventional method, 226 false positives were observed, whereas in the proposed method, the number was reduced to 20. In this case, the scaled power curve led to a substantial reduction in false positives. Note, however, that the results of the proposed method show an increase in false negatives compared to the conventional method. Furthermore, the period between the first and ninth lightning strikes was excluded from the evaluation because the exact timing of blade damage is unknown in this study. Therefore, it should be noted that the number of false negatives in both results may be underestimated.

Next, Figure 11 presents the power curve along with the results of anomaly detection based on data obtained during de-rated operation. Figure 11a,c,e show the results of the conventional method, obtained using the power curve as a feature. Figure 11b,d,f show the results of the proposed method, obtained using the scaled power curve as a feature. Note that Figure 11b,d,f reproduce Figure 9c,d,e, respectively.

As illustrated in Figure 11a,c, a total of 212 false positives were observed when the power curve is used as a feature. This is because the anomaly detection model is mistakenly identifying the upper limit value that appears during de-rated operation as abnormal. The anomaly score continues to exceed the threshold at its saturation values during de-rated operation, as shown in Figure 11e. Accordingly, power-curve-based anomaly detection methods cannot be employed during de-rated operation.

By contrast, only 14 false positives were observed when the scaled power curve was used as a feature, as illustrated in Figure 11b,d. Therefore, using the scaled power curve as a feature ensures that the saturation values observed during de-rated operation are correctly classified as normal. Furthermore, the anomaly score remains below the threshold, except for the periods immediately before and after the stoppage or restart of the wind turbine. These results suggest that the scaled power curve may be less sensitive to operating modes in the present case.

Figure 12 shows the anomaly detection results for three scenarios (during a lightning strike accident, during normal rated operation, and during normal de-rated operation), presented as a PR curve and an ROC curve. In Figure 12, the black line represents the results of the conventional method, obtained using the power curve, whereas the red line represents the results of the proposed method, obtained using the scaled power curve.

Figure 12 demonstrates that the proposed method’s performance in detecting anomalies is superior to that of the conventional method. Compared to using the power curve as a feature, using the scaled power curve as a feature yields an improved PR curve and an AP higher by 0.114. It likewise yields an improved ROC curve and an AUC higher by 0.062. These results demonstrate that the use of the scaled power curve as a feature enables more accurate anomaly detection.

Table 4 presents the recall, precision, AP and AUC of the conventional and proposed methods for each value of K. In comparison with the conventional method, the proposed method demonstrates enhanced anomaly detection accuracy, as evidenced by the elevated precision, AP and AUC values. It is important to note that precision improved while maintaining a comparable level of recall. The present findings exceed the results of previous studies [20] that used the same dataset, where the maximum AUC achieved was 0.803, whereas the proposed method attains a higher maximum AUC of 0.868. Therefore, these findings provide preliminary evidence that the scaled power curve may be useful for detecting lightning damage.

5.2. Feasibility Studies and Issues

The scaled power curve is likely to be useful for the lightning protection system shown in Figure 1, as it is an effective way of detecting lightning damage to wind turbines, as discussed in Section 5.1. In addition, with an average calculation time of 0.122 ms for training and 0.105 ms for testing (on a 13th Gen Intel Core i7-13700KF CPU, Intel Corporation, Santa Clara, CA, USA), the method is fast enough for real-time use. As the practice of stopping wind turbines when lightning strikes is already implemented in Japan, introducing this method in conjunction with an LDS and a SCADA system, with anomalies detected based on the scaled power curve, is expected to enable safer and more efficient stopping of wind turbines. Furthermore, because anomaly detection allows operations to be restarted rapidly without the need for inspection, the availability of the wind turbines is expected to improve.

On the other hand, the implementation of the scaled power curve as a feature for anomaly detection led to an increase in false negatives and a decrease in recall. It is believed that this is a limitation of the GMM employed in this method. GMM is a simple anomaly detection method, characterized by a lower modelling expressiveness than more recent methods. Consequently, the model exhibits a delayed response to anomalies and is susceptible to overlooking them. The scaled power curve can serve as an input feature for other anomaly detection models. Future work will be required to integrate it with more advanced anomaly detection models.
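
To make the role of the GMM concrete, the sketch below shows one common way of turning a two-dimensional feature vector into an anomaly score, namely the negative log-likelihood under a mixture of Gaussians with full covariance matrices. Defining the score as the negative log-likelihood is an assumption here, and the two components, their parameters and the sample points are hypothetical values chosen for illustration rather than the fitted model from this study.

```python
import math


def gaussian_logpdf_2d(x, mean, cov):
    """Log-density of a 2-D Gaussian with a full 2x2 covariance matrix."""
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Mahalanobis distance via the closed-form inverse of a 2x2 matrix
    maha = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return -0.5 * (maha + math.log(det)) - math.log(2.0 * math.pi)


def anomaly_score(x, weights, means, covs):
    """Negative log-likelihood of x under the mixture (higher = more anomalous)."""
    logs = [math.log(w) + gaussian_logpdf_2d(x, m, c)
            for w, m, c in zip(weights, means, covs)]
    top = max(logs)  # log-sum-exp trick for numerical stability
    return -(top + math.log(sum(math.exp(v - top) for v in logs)))


# Hypothetical mixture over (wind speed in m/s, scaled power)
weights = [0.6, 0.4]
means = [(5.0, 0.3), (12.0, 1.0)]
covs = [((4.0, 0.2), (0.2, 0.02)), ((4.0, 0.0), (0.0, 0.001))]

score_on_curve = anomaly_score((5.0, 0.3), weights, means, covs)
score_off_curve = anomaly_score((12.0, 0.2), weights, means, covs)  # low output at high wind
```

A sample lying far from every component, such as low scaled power at high wind speed, receives a much larger score than a sample on the learned curve; a threshold on this score then separates normal from anomalous data.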

In practical applications, it is presumed that false positives can be eliminated by stopping the wind turbines only when consecutive anomalies are detected. For instance, consider an operational rule under which the wind turbine is stopped when the anomaly score exceeds the threshold continuously for 5 min, that is, for five consecutive 1 min samples. First, Figure 13b enlarges the temporal variation in the anomaly score shown in Figure 7e. The periods during which the threshold is exceeded continuously for 5 min are highlighted in red. For comparison, the result obtained using the original power curve is shown in Figure 13a. A comparison of Figure 13a,b shows that more intervals are identified as abnormal when the scaled power curve is used than when the original power curve is used. However, because the wind turbine is stopped the first time the threshold is exceeded for five consecutive samples, the turbine can be safely stopped by either method.
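
The operational rule described above can be sketched as follows. This is a minimal illustration under the assumption of 1 min SCADA samples; the function name `first_stop_index`, the threshold of 1.0 and the score sequences are hypothetical and do not come from the study.

```python
def first_stop_index(scores, threshold, n_consecutive=5):
    """Return the index of the sample at which the anomaly score has exceeded
    the threshold for n_consecutive samples in a row, or None if it never does."""
    run = 0
    for i, score in enumerate(scores):
        run = run + 1 if score > threshold else 0
        if run >= n_consecutive:
            return i
    return None


# Hypothetical 1 min anomaly scores
isolated = [0.2, 1.5, 0.3, 1.4, 1.6, 0.1, 1.7, 0.2]   # short, isolated exceedances
sustained = [0.2, 1.5, 1.6, 1.4, 1.8, 1.9, 2.0, 0.3]  # sustained exceedance

stop_isolated = first_stop_index(isolated, threshold=1.0)
stop_sustained = first_stop_index(sustained, threshold=1.0)
```

With these toy inputs, the isolated exceedances never trigger a stop, while the sustained run triggers one at the fifth consecutive exceedance. The trade-off is a detection delay of at least `n_consecutive` samples.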

Second, Figure 14 enlarges the temporal variation in the anomaly score shown in Figure 9e and Figure 11e under normal de-rated operation. The periods during which the threshold is exceeded continuously for 5 min are highlighted in red. Because Figure 14 uses data obtained during normal operation, it is desirable that all data be classified as normal. However, in Figure 14a, the threshold is exceeded for five consecutive samples during many intervals, which would cause unnecessary stoppage of the wind turbine. This indicates that practical operation is infeasible when the original power curve is used. In contrast, in Figure 14b, the threshold is not exceeded for five consecutive samples at any time. This result indicates that practical operation is feasible when the scaled power curve is employed.

In addition, the reliability of the model's anomaly detection performance must be improved before practical implementation. This study is a case study that examines only a single wind turbine accident; therefore, further validation is required in future work. However, it is rare for lightning strikes to cause major accidents involving wind turbines, and very few cases of SCADA data from such accidents are stored in a format that can be analyzed. This scarcity of data constitutes an important limitation of the present study. To verify the applicability of this method to wind turbines in different conditions, further data collection and investigation are required. As a specific example, the proposed method illustrated in Figure 1 may be applied in conjunction with conventional visual inspections, thereby enabling the collection of SCADA data with ground-truth labels while maintaining safety. Because the method uses the power curve as a feature, it is expected to be applicable without retraining to other wind turbines in the same wind farm as that used for validation, where the wind conditions and the manufacturer are the same.

Another issue is the need for a more rigorous definition of the scaled power curve. In this study, the air density and the rotor area of the wind turbine were assumed to be constant; however, the air density varies depending on location, season, and time of day. By incorporating temperature and atmospheric pressure as features and subsequently defining the scaled power curve with air density as a variable, features robust to variations in air density can be obtained. However, it should be noted that, to obtain these results, the threshold must be set to at least the 99th percentile.
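
To indicate why air density matters, the following sketch estimates dry-air density from temperature and atmospheric pressure using the ideal gas law and compares the kinetic power per unit rotor area at two temperatures. It is a generic illustration under stated assumptions (dry air, hypothetical pressure and temperature values) and is not the definition of the scaled power curve used in this study; the function names are chosen for this example only.

```python
R_DRY_AIR = 287.05  # specific gas constant of dry air, J/(kg*K)


def air_density(pressure_pa, temperature_c):
    """Dry-air density in kg/m^3 from the ideal gas law (humidity ignored)."""
    return pressure_pa / (R_DRY_AIR * (temperature_c + 273.15))


def wind_power_density(wind_speed, rho):
    """Kinetic power per unit rotor area in W/m^2: 0.5 * rho * v^3."""
    return 0.5 * rho * wind_speed ** 3


# Hypothetical conditions, not measurements from the paper
rho_mild = air_density(101_325.0, 15.0)  # about 1.225 kg/m^3
rho_cold = air_density(101_325.0, 0.0)   # colder air is denser

# Same wind speed, different air density
ratio = wind_power_density(10.0, rho_cold) / wind_power_density(10.0, rho_mild)
```

Under these assumed conditions, the same wind speed carries roughly 5% more power at 0 °C than at 15 °C, which suggests how a constant-density assumption could shift the power curve between seasons.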

6. Conclusions

This paper reports our findings on a lightning damage detection model based on SCADA data, which is applicable even to wind turbines under de-rated operation. First, we proposed the “scaled power curve” as a robust feature that is less affected by multiple operating modes. The scaled power curve is normalized using the maximum allowable output P_max, thereby enabling the analysis of both de-rated and rated operation on a comparable scale. The GMM was employed for anomaly detection, after which experiments were conducted using both the scaled power curve and the original power curve.

The experimental results demonstrated that, in the field data used in this study, the use of the scaled power curve as a feature yielded more accurate anomaly detection performance than using the original power curve as a feature. The anomaly detection model using the scaled power curve correctly identified normal conditions, resulting in a significant reduction in false positives, equivalent to 206 min of false positive events. Furthermore, the PR and ROC curves also improved, with higher AP and AUC values. These results suggest that the scaled power curve may be less sensitive to operating modes in the present case.

Meanwhile, several issues were identified in anomaly detection when the scaled power curve was used as a feature. The primary issue is an increase in false negatives and a corresponding decrease in recall. An operational countermeasure is effective for addressing this problem in practical implementation. Specifically, it was demonstrated that anomaly detection based on the scaled power curve becomes practically feasible if an operational rule is introduced such that the wind turbine is stopped only when the threshold is exceeded for five consecutive 1 min samples. The second issue is the reliability of the anomaly detection model. Further data collection and analysis are required to address this issue.

Based on the findings from this study, it is expected that a lightning damage detection model applicable to wind turbines under de-rated operation can be developed. This technology enables the rapid and accurate remote assessment of wind turbine soundness after lightning strikes, allowing quicker restart and thereby improving overall availability.

Author Contributions

Conceptualization, K.Y. and T.M.; methodology, T.M.; software, T.M. and K.N.; validation, T.M. and K.N.; formal analysis, T.M.; data curation, T.M. and K.N.; writing—original draft preparation, T.M.; writing—review and editing, K.Y.; visualization, T.M.; supervision, K.Y.; project administration, K.Y.; funding acquisition, K.Y. All authors have read and agreed to the published version of the manuscript.

Funding

New Energy and Industrial Technology Development Organization: JPNP21015.

Data Availability Statement

The datasets presented in this article are not readily available because they are subject to confidentiality agreements with collaborative research partners. Requests to access the datasets should be directed to the first author.

Acknowledgments

This paper is based on results obtained from a project, Green Innovation Fund Projects (JPNP21015), subsidized by the New Energy and Industrial Technology Development Organization (NEDO).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

LDS	Lightning detection system
SCADA	Supervisory Control and Data Acquisition
GMM	Gaussian Mixture Model
EM	Expectation–Maximization
PR	Precision-Recall
AP	Average Precision
TP	True positive
FP	False positive
TN	True negative
FN	False negative

References

1. United Nations Framework Convention on Climate Change (UNFCCC). Summary of Global Climate Action at COP 28. Available online: https://unfccc.int/sites/default/files/resource/Summary_GCA_COP28.pdf?utm_source=chatgpt.com (accessed on 9 May 2025).
2. Ministry of Economy, Trade and Industry. Strategic Energy Plan. Available online: https://www.enecho.meti.go.jp/category/others/basic_plan/pdf/2025_strategic_energy_plan.pdf (accessed on 1 October 2025).
3. Study Committee of Lightning Protection for Electrical and Electronic Equipment. Lightning Protection for Electrical and Electronic Equipment; The Institute of Electrical Installation Engineers of Japan: Tokyo, Japan, 2016; ISBN 978-4-9902110-7-3.
4. Takada, Y. Wind Farm and Lightning; Seizandou Shoten: Tokyo, Japan, 2015.
5. Adachi, S. Insurance for Lightning Strikes in Wind Turbine Generator Systems. J. Inst. Electr. Eng. Jap. 2019, 139, 530–533.
6. Ministry of Economy, Trade and Industry. Interpretation of Technical Standards for Wind Power Generation Facilities. Available online: https://www.meti.go.jp/policy/safety_security/industrial_safety/law/20241001_huugikaisyaku.pdf (accessed on 26 August 2025).
7. Ministry of Economy, Trade and Industry. New Energy Power Generation Facility Accident Response and Structural Strength Working Group. Available online: https://www.meti.go.jp/shingikai/sankoshin/hoan_shohi/denryoku_anzen/newenergy_hatsuden_wg/index.html (accessed on 26 August 2025).
8. Ministry of Economy, Trade and Industry. Clarification of the Interpretation of Laws and Regulations under Our Jurisdiction in Light of the Digital Principles. Available online: https://www.meti.go.jp/policy/safety_security/industrial_safety/law/files/digitalgensoku-denryokuanzen.pdf (accessed on 1 October 2025).
9. Yamamoto, K.; Kazui, H.; Izuchi, S. Influence of Lightning Strike on Availability of Wind Turbine and Its Damage Analysis. In Proceedings of the 2022 36th International Conference on Lightning Protection (ICLP); IEEE: New York, NY, USA, 2022.
10. Kou, L.; Li, Y.; Zhang, F.; Gong, X.; Hu, Y.; Yuan, Q.; Ke, W. Review on Monitoring, Operation and Maintenance of Smart Offshore Wind Farms. Sensors 2022, 22, 2822.
11. Liu, Z.; Zhang, L. Naturally Damaged Wind Turbine Blade Bearing Fault Detection Using Novel Iterative Nonlinear Filter and Morphological Analysis. IEEE Trans. Ind. Electron. 2020, 67, 8713–8722.
12. Wang, M.; Lu, S.; Hsieh, C.; Hung, C. Fault Detection of Wind Turbine Blades Using Multi-Channel CNN. Sustainability 2022, 14, 1781.
13. Anderson, M. Supervisory Control and Data Acquisition. Available online: https://www.realpars.com/blog/scada (accessed on 10 April 2025).
14. Tao, L.; Siqi, Q.; Zhang, Y.; Shi, H. Abnormal Detection of Wind Turbine Based on SCADA Data Mining. Math. Probl. Eng. 2019, 2019, 5976843.
15. Liu, X.; Lu, S.; Ren, Y.; Wu, Z. Wind Turbine Anomaly Detection Based on SCADA Data Mining. Electronics 2020, 9, 751.
16. Letzgus, S. Change-Point Detection in Wind Turbine SCADA Data for Robust Condition Monitoring with Normal Behaviour Models. Wind Energy Sci. 2020, 5, 1375–1397.
17. Matsui, T.; Yamamoto, K.; Sumi, S.; Triruttanapiruk, N. Detection of Lightning Damage on Wind Turbine Blades Using the SCADA System. IEEE Trans. Power Delivery 2021, 36, 777–784.
18. Matsui, T.; Yamamoto, K.; Ogata, J. Anomaly Detection for Wind Turbine Damaged Due to Lightning Strike. Electr. Power Syst. Res. 2022, 209, 107918.
19. Matsui, T.; Matsuoka, K.; Yamamoto, K. Lightning Damage Detection Method Using Autoencoder: A Case Study on Wind Turbines with Different Blade Damage Patterns. Wind 2025, 5, 12.
20. Matsui, T.; Matsuoka, K.; Yamamoto, K. Lightning Damage Detection for Wind Turbine Blade Based on SCADA Data Analysis Using Autoencoder. In Proceedings of the XVII SIPDA International Symposium on Lightning Protection, Thessaloniki, Greece, 21–26 September 2025.
21. Matsui, T.; Yamamoto, K.; Ogata, J. Study on Improvement of Lightning Damage Detection Model for Wind Turbine Blade. Machines 2021, 10, 9.
22. New Energy and Industrial Technology Development Organization. Local Wind Condition Map. Available online: https://localwind.infop.nedo.go.jp/nedo/index.html (accessed on 13 November 2025).
23. Burton, T.; Sharpe, D.; Jenkins, N.; Bossanyi, E. Wind Energy Handbook; John Wiley & Sons: New York, NY, USA, 2001; ISBN 9780471489979.
24. Bishop, C.M. Pattern Recognition and Machine Learning; Springer: Berlin/Heidelberg, Germany, 2007; ISBN 9780387310732.
25. Fraley, C.; Raftery, A.E. How Many Clusters? Which Clustering Method? Answers Via Model-Based Cluster Analysis. Comput. J. 1998, 41, 578–588.
26. Davis, J.; Goadrich, M. The Relationship between Precision-Recall and ROC Curves. In Proceedings of the 23rd International Conference on Machine Learning (ICML '06); ACM Press: New York, NY, USA, 2006.
27. Fawcett, T. An Introduction to ROC Analysis. Pattern Recognit. Lett. 2006, 27, 861–874.

Figure 1. SCADA-based anomaly detection framework for turbine restart control after a lightning strike.

Figure 3. Workflow of the proposed SCADA-based lightning damage detection method: (a) processing flow, (b) pseudocode.

Figure 4. The flowchart for calculating scaled power curve.

Figure 5. Power curve of the wind turbine during normal operation: (a) normal power curve, (b) scaled power curve.

Figure 6. Confusion matrix.

Figure 7. Results of the anomaly detection for data during lightning strike accident: (a) power curve with true label, (b) temporal characteristics of power with true label, (c) power curve with predicted label, (d) temporal characteristics of power with predicted label, and (e) temporal characteristics of the anomaly score.

Figure 8. Results of the anomaly detection for data during normal rated operation: (a) power curve with true label, (b) temporal characteristics of power with true label, (c) power curve with predicted label, (d) temporal characteristics of power with predicted label, and (e) temporal characteristics of the anomaly score.

Figure 9. Results of the anomaly detection for data during normal de-rated operation: (a) power curve with true label, (b) temporal characteristics of power with true label, (c) power curve with predicted label, (d) temporal characteristics of power with predicted label, and (e) temporal characteristics of the anomaly score.

Figure 10. Confusion matrices of anomaly detection result: (a) Result of conventional method using the power curve, and (b) Result of proposed method using the scaled power curve.

Figure 11. Results of the anomaly detection for data during normal de-rated operation: (a) Result using power curve, (b) Result using scaled power curve, (c) Temporal result using power curve, (d) Temporal result using scaled power curve, (e) Time characteristics of anomaly score by using power curve, and (f) Time characteristics of anomaly score by using scaled power curve.

Figure 12. PR curves and ROC curves of anomaly detection results: (a) PR curves, (b) ROC curves.

Figure 13. Visualization of the period during which the threshold is exceeded continuously for five minutes: (a) Result obtained using a power curve, (b) Result obtained using a scaled power curve.

Figure 14. Visualization of the period during which the threshold is exceeded continuously for five minutes during normal de-rated operation: (a) Result obtained using a power curve, (b) Result obtained using a scaled power curve.

Table 2. SCADA data obtained from the wind turbines.

	Train Dataset	Test Dataset
Type of SCADA	1 min average data	1 min average data
Data Period	The data recorded before accident for 41,979 min (about 33 days) (including rated and de-rated operation)	The data recorded before and after accident for 20,520 min (about 16 days) (de-rated operation)
		The data recorded before accident for about 1000 min (rated operation)
		The data recorded before accident for about 1000 min (de-rated operation)

Table 3. Setting parameters for the GMM.

Parameter	Value/Description
The Gaussian components: K	128
The covariance type of Gaussian distributions	Each component has its own general covariance matrix.
The convergence threshold	0.001
The number of EM iterations to perform	100
The number of initializations to perform	1
The initializing method for the weights, the means and the precisions	k-means

Table 4. Evaluation comparison between conventional method and proposed method.

	The Gaussian Components: K	Recall	Precision	AP	AUC
Conventional method	32	0.005	0.084	0.800	0.800
	64	0.052	0.458	0.805	0.804
	128	0.196	0.752	0.803	0.803
	256	0.037	0.349	0.792	0.793
Proposed method	32	0.188	0.984	0.918	0.868
	64	0.160	0.989	0.918	0.868
	128	0.171	0.968	0.917	0.865
	256	0.215	0.947	0.907	0.853
