Recent Advances in the Use of eXplainable Artificial Intelligence Techniques for Wind Turbine Systems Condition Monitoring


Wind Turbine Condition Monitoring in a Nutshell
There is a good probability that wind turbines will emerge as one of the predominant technologies for electricity production in the upcoming decades. As representative examples, consider that more electricity has been produced from wind than from natural gas in the first quarter of 2023 in the U.K. and that the European Commission has set as an objective that half of the electricity produced in Europe should derive from wind. Projections indicate that the global wind power capacity is poised to more than double within a decade [1].
To reduce the Levelized Cost Of Energy (LCOE), it is imperative to systematically cut the costs of exploiting wind energy. Notably, a major expense in a wind farm project is operation and maintenance, which can reach approximately 20-30% of the overall costs, particularly for offshore installations, where on-site access is difficult.
Hence, a pivotal avenue of innovation in wind farm operation and maintenance (O&M) is predictive maintenance [1]. This paradigm primarily encompasses precise early-stage fault diagnosis, prognosis of the time to failure, and subsequent decision-making aimed at cost minimization. An analysis conducted by [2] reported that, assuming 25% of generator and gearbox failures are diagnosed in time, the O&M costs and the lost energy production can decrease by on the order of 10%.
Unfortunately, effective early diagnosis can be challenging since wind turbines are complex machines subjected to non-stationary operating conditions. These harsh conditions may hinder the accuracy of physical models for wind turbine condition monitoring, which are seldom built, leading to the predominance of data-driven methods in the scientific literature [3].
The adoption of data-driven models is supported by the massive amount of data collected by the Supervisory Control And Data Acquisition (SCADA) systems, which are used for remote control and monitoring and equip practically all modern wind turbines. In particular, for each wind turbine, modern SCADA systems collect hundreds of measurements at ten-minute intervals, offering a comprehensive overview of environmental conditions, operational behavior, mechanical responses, thermal trends, electrical characteristics, and power conversion dynamics. Moreover, depending on cost considerations, modern wind turbines can be outfitted with dedicated Turbine Condition Monitoring (TCM) systems that measure and record vibrations in critical rotating components. These systems operate at sampling frequencies that can reach the order of tens of kHz.
In particular, the growth in dataset size promotes advances in the development of wind turbine condition monitoring methods, which have been mostly based on Machine Learning (ML) techniques. As an example, the study conducted by [4] identified more than 700 works from the 2010-2020 decade dealing with Artificial Intelligence applications for wind turbine condition monitoring. A common workflow emerges from that study, characterized by the following main steps:
• Utilize data depicting normal operational conditions to construct a regression or classification model.
• Employ a designated testing period to continuously monitor for any deviations from the established normal behavior.
• If such anomalies are detected, trigger an alarm.
A drawback of these approaches is their heavy reliance on black-box models, which exposes them to risks such as over-parametrization and limited generality. However, it is important to note that such concerns are not unique to the wind turbine condition monitoring domain. Therefore, the recent attention of researchers to the concepts of interpretability and explainability is not surprising.
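The three steps of this common workflow can be illustrated with a minimal sketch. It is not taken from any of the cited studies: the linear least-squares model, the synthetic data, the function names, and the three-sigma residual threshold `k` are all assumptions chosen for brevity.

```python
import numpy as np

def fit_normal_behavior(X_train, y_train):
    """Fit a linear model on healthy data; return coefficients and residual std."""
    A = np.column_stack([X_train, np.ones(len(X_train))])  # add intercept column
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    residuals = y_train - A @ coef
    return coef, residuals.std(ddof=1)

def detect_anomalies(X_test, y_test, coef, sigma, k=3.0):
    """Flag test samples whose residual exceeds k times the healthy residual std."""
    A = np.column_stack([X_test, np.ones(len(X_test))])
    residuals = y_test - A @ coef
    return np.abs(residuals) > k * sigma

# Synthetic stand-in for SCADA data (assumption): two inputs, one target.
rng = np.random.default_rng(0)

def simulate(n):
    X = rng.uniform(0.0, 1.0, size=(n, 2))
    y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0.0, 0.05, n)
    return X, y

X_train, y_train = simulate(500)   # step 1: healthy reference period
X_test, y_test = simulate(100)     # step 2: monitored period
y_test[60:] += 0.5                 # simulated fault: offset in the target

coef, sigma = fit_normal_behavior(X_train, y_train)
alarms = detect_anomalies(X_test, y_test, coef, sigma)  # step 3: raise alarms
print("alarm rate before fault:", alarms[:60].mean())
print("alarm rate after fault: ", alarms[60:].mean())
```

In practice, the regression model would typically be non-linear and the alarm logic more elaborate, but the structure (train on healthy data, monitor residuals, raise an alarm) stays the same.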

The Rise of eXplainable Artificial Intelligence Methods
Interpretability and explainability are closely related concepts, and their demarcation is not always clear-cut. Interpretability primarily revolves around understanding the relationship between causes and effects [5], whereas explainability encompasses a broader notion of how to elucidate the contribution of a model's parameters in influencing its output [6].
A comprehensive examination of eXplainable Artificial Intelligence (XAI) techniques for industrial applications is presented in [7], which provides a set of guidelines for constructing XAI-based regression models. Notably, [7] recommends retaining the physical units of both input and output variables (deviating from the conventional practice with black-box models), because doing so helps in interpreting Shapley coefficients. In particular, the computation of the Shapley coefficients [8] is a powerful XAI technique, which has recently been applied, for example, in [9,10] in the context of wind turbine condition monitoring.
The utility of Shapley coefficients is pronounced when an output, such as the power of a wind turbine in this instance, exhibits a multivariate dependence on various input variables. In the context of [10], solely environmental variables were taken into account, whereas [9] encompassed a broader range of variables, considering aspects related to operational behavior, mechanical response, and internal temperatures.
In greater detail, the Shapley value corresponding to the i-th instance of the j-th input variable quantifies how much the discrepancy between the model's estimation and the actual measurement depends on that particular i-th instance of the j-th covariate. In practice, this is achieved through Monte Carlo sampling. The predicted output value is calculated when incorporating all features, and then again when excluding the j-th variable. The difference between these two values represents the contribution of the j-th input variable to the model's error, and is taken as the Shapley value.
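A minimal sketch of how such a Monte Carlo estimate can be obtained is given below. It uses the common permutation-sampling approximation, which is an assumption here and not necessarily the procedure followed in [9,10]. It also attributes the model prediction; to attribute the model error as described above, one could pass a function returning the residual instead. The names `shapley_mc`, `predict`, and `background` are hypothetical.

```python
import numpy as np

def shapley_mc(predict, x, background, j, n_samples=4000, rng=None):
    """Monte Carlo estimate of the Shapley value of feature j for instance x.

    predict    : callable mapping an (n, d) array to n predictions
    x          : 1-D array, the instance to explain
    background : (m, d) array of reference data used to 'exclude' features
    """
    rng = np.random.default_rng() if rng is None else rng
    n_features = x.shape[0]
    total = 0.0
    for _ in range(n_samples):
        z = background[rng.integers(len(background))]  # random reference row
        perm = rng.permutation(n_features)             # random feature ordering
        pos = int(np.where(perm == j)[0][0])
        preceding = perm[:pos]                         # features ordered before j
        x_minus = z.copy()
        x_minus[preceding] = x[preceding]              # j still taken from z
        x_plus = x_minus.copy()
        x_plus[j] = x[j]                               # j now taken from x
        total += predict(x_plus[None, :])[0] - predict(x_minus[None, :])[0]
    return total / n_samples

# Toy check with a linear model (assumption), for which the exact Shapley value
# of feature j is w[j] * (x[j] - mean of background[:, j]).
rng = np.random.default_rng(1)
w = np.array([3.0, -1.0, 0.5])
predict = lambda X: X @ w
background = rng.normal(size=(200, 3))
x = np.array([1.0, 2.0, 3.0])

phi_0 = shapley_mc(predict, x, background, j=0, rng=rng)
exact_0 = w[0] * (x[0] - background[:, 0].mean())
print(phi_0, exact_0)
```

For a linear model the estimate can be checked against the closed-form value, which makes this toy case a convenient sanity test before applying the same estimator to a black-box model.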
As exemplified in [10], the significance of environmental variables is assessed in terms of their predictive relevance for wind turbine power. The analysis reveals that wind speed overwhelmingly holds the most substantial influence, while wind direction, turbulence intensity, wind shear, and ambient temperature also contribute meaningfully.
On the other hand, in [9], a combination of Sequential Feature Selection (SFS) and computation of the Shapley coefficients is employed for ranking the most important features in a multivariate data-driven model for the power of a wind turbine. One of the main peculiarities of this work is the cross-testing workflow on a fleet of similar wind turbines from a farm, which allows the identification of anomalous wind turbines and of the input variables (that is, the sub-components of the system) mainly linked to such anomalies. In particular, the case study highlights incipient faults related to the hydraulic or electrical control of the blade pitch, which had not been identified using state-of-the-art methods.
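As a rough illustration of the feature selection step, the following sketch implements a greedy forward variant of sequential feature selection. The linear least-squares model, the hold-out validation error used as the selection criterion, the synthetic data, and all function names are assumptions; the exact SFS configuration of [9] is not reproduced here.

```python
import numpy as np

def validation_mse(X_tr, y_tr, X_va, y_va, cols):
    """Fit a linear model on the chosen columns; return the validation MSE."""
    A_tr = np.column_stack([X_tr[:, cols], np.ones(len(X_tr))])
    A_va = np.column_stack([X_va[:, cols], np.ones(len(X_va))])
    coef, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)
    return float(np.mean((y_va - A_va @ coef) ** 2))

def forward_selection(X_tr, y_tr, X_va, y_va, n_select):
    """Greedily add, at each step, the feature that most reduces validation MSE."""
    selected = []
    remaining = list(range(X_tr.shape[1]))
    for _ in range(n_select):
        scores = {c: validation_mse(X_tr, y_tr, X_va, y_va, selected + [c])
                  for c in remaining}
        best = min(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
    return selected

# Synthetic example (assumption): only features 2 and 0 drive the target.
rng = np.random.default_rng(3)
X = rng.normal(size=(400, 5))
y = 4.0 * X[:, 2] + 1.0 * X[:, 0] + rng.normal(0.0, 0.1, 400)
X_tr, y_tr, X_va, y_va = X[:300], y[:300], X[300:], y[300:]

selected = forward_selection(X_tr, y_tr, X_va, y_va, n_select=2)
print(selected)
```

The order in which features are selected gives a first ranking of their importance, which can then be compared with the ranking given by the Shapley coefficients.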
A further example is outlined in [11], where the employed method is a supervised implementation of the Variational AutoEncoder (VAE) model. The outcomes of the work are an indicator of the wind turbine health state, a classifier that outputs the diagnosis, and a 2D plot that projects the wind turbine system behavior onto a low-dimensional representation space. The evolution of the wind turbine behavior is therefore an interpretable trajectory in this 2D space, and the Mahalanobis distance is used to compute the statistical difference between a given state and the cluster of healthy data. An alarm may then be raised through the Exponentially Weighted Moving Average (EWMA) method. The method is shown to be effective in two test cases: a main bearing degradation and a wind turbine stop due to icing.
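The distance-and-alarm part of this pipeline can be sketched as follows. The VAE itself is omitted: the 2D latent points are simulated, and the smoothing factor `lam`, the alarm threshold, and the function names are hypothetical choices, not values from [11].

```python
import numpy as np

def mahalanobis_to_healthy(points, healthy):
    """Mahalanobis distance of each point from the cluster of healthy points."""
    mu = healthy.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(healthy, rowvar=False))
    diff = points - mu
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))

def ewma(series, lam=0.2):
    """Exponentially weighted moving average with smoothing factor lam."""
    out = np.empty(len(series), dtype=float)
    out[0] = series[0]
    for t in range(1, len(series)):
        out[t] = lam * series[t] + (1.0 - lam) * out[t - 1]
    return out

# Simulated 2D latent space (assumption): healthy cluster around the origin,
# then a trajectory that jumps away from it after 50 time steps.
rng = np.random.default_rng(4)
healthy = rng.normal(size=(500, 2))
trajectory = np.vstack([
    rng.normal(size=(50, 2)),                 # healthy behavior
    rng.normal(size=(50, 2)) + [6.0, 6.0],    # degraded behavior
])

distance = mahalanobis_to_healthy(trajectory, healthy)
smoothed = ewma(distance, lam=0.2)
threshold = 4.0                               # hypothetical alarm threshold
alarms = smoothed > threshold
print("first alarm at step:", int(np.argmax(alarms)))
```

Smoothing the distance before thresholding trades a short detection delay for fewer false alarms caused by isolated outliers.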
The interpretability of input variables is also assessed by using deep learning techniques, as proposed by [12,13,14]. In particular, in [12], an interpretable normal-behavior model is set up for wind turbine condition monitoring by modeling internal temperatures (namely, the gearbox oil temperature) based on the spatial-temporal attention module and the gated recurrent unit network (STAGN). The interpretability is given by expressing the learned spatial-temporal correlations through attention weights, which allow comprehension of the relation between the working variables (generator speed, impeller speed, active power, ambient temperature, and grid current) and the target temperature.
In [13], a pattern mining data fusion algorithm is employed for wind turbine condition monitoring. The general idea is to combine multiple data sources, namely the SCADA-collected measurements with ten minutes of averaging time and the alarm logs. Test cases of generator bearing drive-end and non-drive-end faults are analyzed. Interpretable rules for diagnosing the faults are formulated, which are based on conditions on meaningful SCADA-collected measurements (for example, subcomponent temperatures) and on the operating behavior (which is extracted from the alarm logs).
In [14], data originating from TCM systems (thus, with high sampling frequency) are processed for diagnosing wind turbine electromechanical faults. A scalable and lightweight Convolutional Neural Network (CNN) framework is employed, which can combine data from a variety of signals collected at the most important wind turbine subcomponents. The employed interpretability techniques are multidimensional scaling and layer-wise relevance propagation. These techniques allow for identifying the signal features that are relevant for fault identification (alarm raising and subcomponent identification).

Research Directions
The aforementioned review of literature underscores the promising potential of XAI techniques within the realm of wind turbine systems condition monitoring. In the research field of wind energy, the current trajectory is undoubtedly data-centric, with a pronounced emphasis on effective data utilization for predictive maintenance, which is a significant impending challenge.
The domain of wind farm management is rich with problem scenarios that entail establishing relationships between pivotal output parameters and an array of input variables, spanning diverse categories such as environmental factors, grid-related aspects, and various machine components. Similar to the scenario involving black-box models, the formulation of universal methodologies for data-driven, XAI-based wind turbine condition monitoring could prove challenging. Nonetheless, the overarching advantage of XAI techniques lies in their capacity to elucidate a model's output, thereby facilitating the establishment of cause-and-effect relationships. Moreover, the inherent explainability nurtured through such techniques offers the prospect of transferring knowledge garnered from successful test cases into a broader context.
Consequently, the potential hazards associated with the proliferation of Machine Learning literature, particularly the pitfalls of over-parameterization and limited generalization, may be effectively mitigated through the strategic adoption of XAI methodologies.