Combination of Thermal Modelling and Machine Learning Approaches for Fault Detection in Wind Turbine Gearboxes

This research aims to bring together thermal modelling and machine learning approaches to improve the understanding of the operation and fault detection of a wind turbine gearbox. Recent fault detection research has focused on machine learning, black box approaches. Although these can be successful, they provide no indication of the physical behaviour. In this paper, thermal network modelling was applied to two datasets using SCADA (Supervisory Control and Data Acquisition) temperature data, with the aim of detecting a fault one month before failure. A machine learning approach was used on the same data to compare the results to thermal modelling. The results found that thermal network modelling could successfully detect a fault in many of the turbines examined and was validated by the machine learning approach for one of the datasets. For that same dataset, it was found that combining the thermal model losses and the machine learning approach by using the modelled losses as a feature in the classifier resulted in the engineered feature becoming the most important feature in the classifier. It was also found that the results from thermal modelling had a significantly greater effect on successfully classifying the health of a turbine compared to temperature data. The other dataset gave less conclusive results, suggesting that the location of the fault and the temperature sensors could impact the fault-detection ability.


Introduction
Wind energy has an increasing share of the installed capacity of energy generation in the UK, in Europe and globally. The projection of global power generation from wind energy is expected to continue to increase from around 3% in 2015 to 8% in 2030 [1]. This transition to low carbon energy generation is vital for sustainable development and to solve the climate crisis. To enable this, the EU has set a target that 32% of energy must be from renewable sources by 2030 [2].
To accelerate the transition to renewable, low carbon energy, the cost of wind energy generation must be reduced. Operation and Maintenance (O&M) can contribute up to 40% of the total LCOE (levelised cost of energy) for offshore wind energy [3]. The percentage of electricity production lost due to gearbox downtime is the highest of all subassemblies [4]. Approximately 75% of all wind turbines (WTs) are geared [3], potentially because they have a lower capital cost. Therefore, WT gearbox condition monitoring and fault detection are important to improve reliability and to reduce the LCOE of wind.
This work is part of a wider research programme in which the authors proposed using a detailed understanding of the physical behaviour of a healthy gearbox and how this behaviour changes when a fault occurs. This is especially useful when historical operational data are unavailable and/or models are transferred from other gearbox types. This paper explores existing condition-monitoring research and highlights some of the obstacles that existing techniques face. It details the methods of three different approaches used to analyse SCADA temperature data: one based on physical modelling, one using machine learning techniques and one combining the physical model and machine learning approach through feature engineering. The results from these three approaches will be compared to see if they provide a better understanding of how the thermal behaviour of a gearbox is affected by a fault and if temperature data can reliably be used for fault detection. The novelty of this research stems from the following:
• Applying a thermal network modelling approach to WT gearbox failure identification.
• Comparing the output of thermal network modelling with temperature measurements to determine which has the greatest effect on gearbox health classification.
• Combining physical modelling and machine learning approaches through feature engineering to improve understanding of WT gearbox thermal behaviour for failure prediction.

Literature Review: Existing Fault Detection Methods
WTs use SCADA (Supervisory Control and Data Acquisition) data to report operating information (rotor speed, power and system conditions) as well as environmental conditions (wind speed, ambient temperature, etc.) [5,6]. With appropriate techniques, SCADA data can be used as an effective tool for condition monitoring [7]. Many WTs also collect additional data for a condition monitoring system (CMS). The data available depend on the sensors installed, turbine type and turbine operator. The application of these data is important to enable WT operators to transition from preventive to predictive maintenance policies [8], which has the potential to improve reliability and to reduce LCOE. Recent condition monitoring research predominantly applies data science techniques to detect faults. Prevalent techniques will be briefly discussed.

SCADA and CMS Data Analysis Techniques
Data analysis techniques have progressed considerably in recent years, with advances in computing technology, whereby large amounts of data can be processed in relatively short time frames. SCADA and CMS data can be analysed using different approaches. Trending and clustering are arguably less complex methods, but the results tend to be case-specific, which makes them difficult to interpret, requires manual interpretation or generates false alarms [9]. Normal behaviour modelling (NBM) is a popular data analysis approach. It is a regression-based anomaly detection method whereby a model of normal behaviour is established using training data and then compared against actual parameter values. The difference between the predicted normal behaviour and the actual behaviour is calculated and used to track changes in damage-sensitive features [8]. A prevalent form of NBM that has been applied to wind turbine fault detection is neural networks, which have been shown to detect anomalies successfully [10][11][12].
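The NBM idea described above can be illustrated with a minimal sketch. It assumes a single predictor (power) and a simple least-squares line standing in for the neural networks cited; the data values, the variable names and the three-sigma alarm threshold are all hypothetical and are not taken from the studies referenced.

```python
from statistics import mean, pstdev

def fit_line(x, y):
    """Ordinary least-squares fit of y = a + b*x for one predictor."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def residuals(model, x, y):
    """Actual value minus the value predicted by the normal behaviour model."""
    a, b = model
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Hypothetical healthy training data: power (kW) against bearing temperature (degC)
power_train = [200.0, 600.0, 1000.0, 1400.0, 1800.0]
temp_train = [41.0, 48.5, 56.0, 64.5, 71.0]

model = fit_line(power_train, temp_train)
train_res = residuals(model, power_train, temp_train)

# Assumed alarm rule: residual larger than three standard deviations
# of the residuals seen on healthy data
threshold = 3 * pstdev(train_res)

# Hypothetical new observations to be checked against normal behaviour
power_new = [500.0, 1500.0]
temp_new = [47.0, 75.0]
new_res = residuals(model, power_new, temp_new)
alarms = [abs(r) > threshold for r in new_res]
```

In this sketch the second new observation is far hotter than the healthy model predicts for its power level, so it is flagged, while the first is not.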
One of the shortcomings of SCADA analysis being based on statistical analysis is that it requires vast data sets of historical operation data and failure history [13]. Damage modelling is an alternative to this "black box" approach. It is based on replicating the physical behaviour of a system and how it changes as a result of damage. There are few studies where this approach has been used successfully [14,15], and as a result, application is yet to be established [9].

Vibration for Condition Monitoring
Vibration analysis has gained significant attention in recent fault detection research. Vibration data are collected from accelerometer sensors installed in the WT gearbox in addition to SCADA systems. Vibration data are useful for condition monitoring as they can reveal changes in the structural properties of a component. For example, a gear defect, such as a crack, can affect the stiffness of the neighbouring teeth. This changes the vibration signal, which is reflected in amplitude and frequency modulation [16]. Moreover, the vibration signals induced by gear tooth damage such as pitting or spalling will exhibit spectral features that are different from those caused by manufacturing errors [17].
The diagnosis of gears through vibration signals can be performed using time, frequency or time-frequency methods; however, data analysis relies on advanced signal processing techniques due to noise in signals, high sampling rate, and variable speed. Vibration-based fault detection is difficult to apply at low speed (LS) because the fault signals are weak [18], and it is reported that the average detection accuracy of existing vibration monitoring systems is only about 50% [10], although continuing research is likely to improve this accuracy.
Advanced signal processing and statistical analysis are also required for other CMS data, including but not limited to acoustic measurement, electrical effect monitoring and power quality.

Temperature for Condition Monitoring
WT SCADA systems commonly include temperature sensors installed on components such as the main bearing, high speed shaft bearing and gearbox oil [19]. Temperature can be a good indicator of a fault since functioning mechanical systems experience power loss, and power loss tends to increase with the occurrence of a fault in the system as a result of increased friction or a reduced efficiency of energy transfer in the cooling mechanism. This in turn raises the component or system temperature [14,20] and causes a change in thermal behaviour [21].
The time granularity of the data (usually 10 min) of SCADA is well suited to temperature analysis as the coarse time granulation of SCADA data naturally "denoises" [22] and, as a result, does not require complex signal processing [21,23,24]. However, the simple temperature trending approach alone is rarely successful in highlighting potential failures [25] as operating conditions influence system temperature. Reference [26] argued that, although the temperature signal has good anti-interference performance, for a bearing fault, its linear change trend cannot fully demonstrate nonlinear degradation and the resulting information can be limited.
A comprehensive review of fault detection using temperature measurements was carried out by [21]. They compared research using data from contact sensors and thermography for bearing and gear defects, which is highly relevant for this research. Contact sensors are most commonly used to collect temperature data, and historically, the best diagnostic method was to use threshold values on average temperatures [21], but temperature anomalies can be affected by various factors, such as oil temperature, ambient temperature and other operating conditions. Establishing threshold values can be difficult, and inaccurate thresholds can result in false alarms or missed failure warnings. Reference [27] acknowledged these issues and used a method based on probability estimation to generate confidence intervals to determine whether the oil temperature is abnormal.
Recent studies applied sophisticated data analysis techniques to temperature SCADA data for fault detection. Reference [28] found that using the gearbox temperature collected from SCADA data gave an early warning and alarm about the future fault. Reference [22] proposed a method that highlighted fault onsets in advance for 3 out of the 7 turbines using temperature from the main bearing. Whether the advance warning is sufficient to prevent failure is unique to WT locations and operators. Reference [29] reported a modelled average failure rate over all WT plants of 0.42, which was close to the actual failure rate of 0.52 found in WTs of the same type. Reference [20] found that their proposed method detected all abnormal behaviours associated with temperature, and Reference [13] found that anomalous mechanical functioning of gearboxes had consequences in terms of thermal behaviour, which was found to be the case for their test wind farm.
Thermography is useful in visualising the temperature distribution of a system; however, it is not generally used for online measurements and generates a lot of data. In order to let a system autonomously detect conditions and faults, infrared thermal imaging is mostly used in combination with image processing and machine learning. Reference [30] applied this to fault diagnosis of electric impact drills and used a nearest neighbour classifier and the backpropagation neural network. An interesting study used both infrared thermal imaging and vibration data via feature fusion, wherein model-driven features were extracted from the vibration measurements and data-driven features were extracted from the infrared thermal imaging data [31].
This research aims to use temperature data with a damage modelling approach to understand the physical operation of a gearbox; currently, literature directly related to this is limited.

Methodology
This research is based on analysing SCADA data from multi-megawatt (MW) utility-scale WTs that have a multi-stage gearbox with a combination of planetary and parallel axis stages. A typical configuration is shown in Figure 1 with examples of where the sensors are located. The data analysed in this research have this configuration. In this study, the datasets from two test cases are used, as detailed in Table 1. The test cases are from commercial operational WTs; each case features the same sensor configuration and failure mode. The number and location of temperature sensors dictate the number and location of nodes to be used in the model; based on this, heat transfer between nodes can be deduced. For each dataset, there are SCADA data for when the turbine is considered "healthy" (one year from failure) and when the turbine is one month before failure (1M2F). The SCADA data were preprocessed prior to analysis by separating the relevant temperature data from the rest of the SCADA data and by removing data points where the power was recorded below zero. To illustrate the difficulty of selecting a threshold temperature to detect faults, the difference in mean temperature data for a healthy turbine and 1M2F was calculated and is shown in Figure 2. It would be expected that the 1M2F temperatures would be higher, but as Figure 2 shows, this is not always the case and is certainly not consistent across turbines that are of similar configuration. This research explores whether temperature data can be applied to different methods for successful fault detection. The two data analysis techniques are physical modelling and a machine learning "black box" approach, as shown in Figure 3.
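The preprocessing and the mean-temperature comparison described above can be sketched as follows. This is an illustrative sketch only: the record layout, the channel names (`oil`, `hs_bearing`, `wind`) and the values are hypothetical and are not the SCADA schema used in the study.

```python
from statistics import mean

def preprocess(records, temperature_keys):
    """Keep only the temperature channels and drop rows with negative power."""
    return [
        {k: row[k] for k in temperature_keys}
        for row in records
        if row["power"] >= 0
    ]

def mean_temperature_difference(healthy, near_failure, temperature_keys):
    """Mean 1M2F temperature minus mean healthy temperature, per channel."""
    return {
        k: mean(r[k] for r in near_failure) - mean(r[k] for r in healthy)
        for k in temperature_keys
    }

# Hypothetical channel names and values, for illustration only
keys = ["oil", "hs_bearing"]
healthy_raw = [
    {"power": 500.0, "oil": 55.0, "hs_bearing": 60.0, "wind": 8.0},
    {"power": -5.0, "oil": 30.0, "hs_bearing": 31.0, "wind": 2.0},  # removed
    {"power": 900.0, "oil": 57.0, "hs_bearing": 64.0, "wind": 10.0},
]
near_failure_raw = [
    {"power": 520.0, "oil": 54.0, "hs_bearing": 63.0, "wind": 8.0},
    {"power": 880.0, "oil": 56.0, "hs_bearing": 67.0, "wind": 10.0},
]

healthy = preprocess(healthy_raw, keys)
near_failure = preprocess(near_failure_raw, keys)
diff = mean_temperature_difference(healthy, near_failure, keys)
```

With these made-up values one channel is warmer at 1M2F and the other is slightly cooler, which mirrors the kind of inconsistency that makes a single temperature threshold hard to choose.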
The results of the physical modelling (thermal network modelling) and machine learning (weighting analysis) are compared to see if physical and data driven approaches yield a better understanding of how a fault can affect the thermal behaviour of a gearbox, the temperature data collected and the optimal use for fault detection.

Thermal Network Modelling
Thermal network modelling can be equated to electrical circuit theory by analogy, where resistance to heat transfer is equivalent to electrical resistance, heat flow equates to current and temperature difference is equivalent to potential difference [32]. To create the thermal model, steady-state conditions were assumed and the gearbox components were split into a number of lumped mass isothermal nodes. An energy balance was applied to each node, as shown in (1). Losses (Q) can be calculated at each node using the change in temperature and the thermal resistances linking the nodes, applied to (2) and (3) or (4) depending on the heat transfer mode: conduction or convection. Further details on thermal modelling methods can be found in [33]. Figure 4 shows the process of thermal modelling, general to all gearbox applications. As a comparison, a machine learning approach was applied to the same data to determine if the data-driven approach identified the same key areas as the thermal modelling approach. The machine learning technique selected was weighting analysis in the form of permutation feature importance (PFI).
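The numbered equations (1) to (4) are not reproduced in this text. For orientation only, the standard lumped-parameter relations that follow from the electrical analogy are sketched below; the symbols (node indices i and j, resistance R, length L, conductivity k, coefficient h, area A) are assumed textbook notation and may differ from the forms used in the paper and in [33].

```latex
% Steady-state energy balance at node i: heat generated equals heat leaving
Q_i = \sum_{j} \frac{T_i - T_j}{R_{ij}}

% Heat flow through a single thermal resistance (analogue of Ohm's law)
Q_{ij} = \frac{T_i - T_j}{R_{ij}}

% Conduction resistance over path length L, conductivity k, area A
R_{\mathrm{cond}} = \frac{L}{k A}

% Convection resistance with heat transfer coefficient h over area A
R_{\mathrm{conv}} = \frac{1}{h A}
```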

Permutation Feature Importance
To provide an insight into how the input variables affect classification of the turbines as healthy or 1M2F, PFI was performed to identify which temperature sensors had the greatest effect on the classification. The PFI implementation steps followed are shown in Figure 5. A random forest classifier was used, and Bayesian optimisation was used to select the hyperparameters. The hyperparameters optimised were the number of trees and the complexity (depth) of the trees in the forest [34], to limit overfitting or underfitting.
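As a rough illustration of permutation feature importance, the sketch below shuffles one feature at a time and records the drop in classification accuracy. To stay self-contained it uses a nearest-centroid classifier as a stand-in for the random forest used in this work, omits hyperparameter optimisation and scores on the training data rather than a held-out set; the feature values and function names are hypothetical.

```python
import random

def fit_centroids(X, y):
    """Stand-in classifier: mean feature vector for each class label."""
    centroids = {}
    for label in set(y):
        rows = [x for x, yi in zip(X, y) if yi == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict(centroids, x):
    """Assign the label of the closest centroid (squared Euclidean distance)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist(centroids[label]))

def accuracy(centroids, X, y):
    return sum(predict(centroids, x) == yi for x, yi in zip(X, y)) / len(y)

def permutation_importance(centroids, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(centroids, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [x[j] for x in X]
            rng.shuffle(column)
            X_perm = [x[:j] + [v] + x[j + 1:] for x, v in zip(X, column)]
            drops.append(baseline - accuracy(centroids, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical data: feature 0 separates the classes, feature 1 does not
X = [[60.0, 50.0], [61.0, 51.0], [62.0, 50.0], [63.0, 51.0],
     [70.0, 50.0], [71.0, 51.0], [72.0, 50.0], [73.0, 51.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]  # 0 = healthy, 1 = 1M2F

clf = fit_centroids(X, y)
importances = permutation_importance(clf, X, y)
```

Shuffling the uninformative feature leaves the accuracy unchanged, whereas shuffling the informative one degrades it, so the informative feature receives the larger importance.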
Extracting feature importance through permutation estimates how influential each predictor variable in the model is at predicting the response. The influence of a predictor increases with the value of this measure. If a predictor is influential, then permuting its values should affect the model error. If a predictor is not influential, then permuting its values should have little effect on the model error. In this application, the temperature node with the highest value has the greatest effect on the classification outcome and could be used to indicate if a gearbox is healthy or close to failure.
Figure 6 shows the difference in mean calculated losses for healthy and 1M2F data for each turbine in dataset 1. Based on thermal modelling, an increase in losses from the healthy data to the 1M2F data can be seen in 3 out of 5 cases. A negative difference in losses suggests that the healthy turbine generates higher losses than it does one month before failure, which is not what would be expected. Of the turbines with a positive difference in losses, HSC has the greatest difference for 2 out of 3 turbines.
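The comparison of mean modelled losses between healthy and 1M2F data can be sketched as below. The network topology, the thermal resistance values and the temperatures are hypothetical placeholders, and the nodal loss is computed with the generic steady-state relation (temperature difference divided by resistance, summed over connected nodes), which is assumed here rather than taken from the paper's own equations.

```python
from statistics import mean

def node_loss(node, temps, resistances):
    """Steady-state heat leaving `node`: sum of (T_node - T_other) / R."""
    total = 0.0
    for (a, b), r in resistances.items():
        if node == a:
            total += (temps[a] - temps[b]) / r
        elif node == b:
            total += (temps[b] - temps[a]) / r
    return total

def mean_loss(node, records, resistances):
    """Average modelled loss at `node` over a set of temperature records."""
    return mean(node_loss(node, t, resistances) for t in records)

# Hypothetical two-link network with resistances in K/W
resistances = {("HSC", "oil"): 0.02, ("oil", "ambient"): 0.05}

# Hypothetical node temperatures in degC
healthy = [
    {"HSC": 60.0, "oil": 55.0, "ambient": 10.0},
    {"HSC": 62.0, "oil": 56.0, "ambient": 10.0},
]
near_failure = [
    {"HSC": 66.0, "oil": 58.0, "ambient": 10.0},
    {"HSC": 68.0, "oil": 59.0, "ambient": 10.0},
]

loss_difference = (
    mean_loss("HSC", near_failure, resistances)
    - mean_loss("HSC", healthy, resistances)
)
```

A positive `loss_difference` corresponds to the expected case of higher losses one month before failure; a negative value corresponds to the unexpected case noted above.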

PFI
The PFI algorithm was run with the same data, as shown in Figure 7. The PFI of the dataset yields the HSC node as the most important in most turbines, suggesting it is the location most affected by an HS fault.
Figure 8 shows the difference in mean calculated losses for healthy data and 1M2F data for each turbine in dataset 2. The modelling was initially carried out on the whole dataset. The results for dataset 2 show that the range of loss differential across the turbines is much larger than for dataset 1, indicating that there may be factors in the data affecting the results. The most obvious factor is power, as both power and gearbox losses are functions of torque and rotational speed. To explore this further, modelling was carried out on data split by power output to observe whether it has an effect on the calculated losses, as shown in Figure 9a,b for above and below rated power, respectively. Figure 9a shows a much-reduced range in terms of difference in losses, giving more uniform results, and in most cases, an increase in losses is shown for 1M2F. It can be seen that HSrtr has the highest change in losses for 3 out of 6 turbines and that HSmid also has a consistently high change in losses, suggesting that these locations (HSrtr and HSmid) could be used to indicate a fault.
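Splitting the data by power output before comparing losses can be sketched as follows. The rated power value, the rule that "at rated" means power greater than or equal to the rated value, and the record fields are all assumptions made for illustration; the study does not state its exact splitting rule in this text.

```python
from statistics import mean

def split_by_rated_power(records, rated_power_kw):
    """Separate records at or above rated power from those below it."""
    at_rated = [r for r in records if r["power"] >= rated_power_kw]
    below_rated = [r for r in records if r["power"] < rated_power_kw]
    return at_rated, below_rated

def mean_loss_difference(healthy, near_failure):
    """Mean 1M2F loss minus mean healthy loss for one power band."""
    return mean(r["loss"] for r in near_failure) - mean(r["loss"] for r in healthy)

RATED_POWER_KW = 2000.0  # hypothetical rated power

# Hypothetical records: power in kW, modelled loss in W
healthy = [
    {"power": 2000.0, "loss": 300.0}, {"power": 900.0, "loss": 120.0},
    {"power": 2000.0, "loss": 310.0}, {"power": 1000.0, "loss": 130.0},
]
near_failure = [
    {"power": 2000.0, "loss": 340.0}, {"power": 950.0, "loss": 100.0},
    {"power": 2000.0, "loss": 350.0}, {"power": 1100.0, "loss": 180.0},
]

h_rated, h_below = split_by_rated_power(healthy, RATED_POWER_KW)
f_rated, f_below = split_by_rated_power(near_failure, RATED_POWER_KW)

diff_at_rated = mean_loss_difference(h_rated, f_rated)
diff_below_rated = mean_loss_difference(h_below, f_below)
```

Comparing like with like in each power band removes the part of the loss variation that is explained simply by the turbine operating at a different load.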

PFI
The PFI model was applied to the turbine data to identify which variables had the greatest effect on health classification. As with the thermal modelling, the data were also split by power output, as shown in Figures 10 and 11a,b. In contrast to the dataset 1 PFI, which showed the same node being the most important feature for all but one turbine, dataset 2 has weaker trends. It can be seen that HSmid and HSrtr appear as important features in some cases but not significantly or consistently for different power levels.

Discussion and Combining Approaches
The results for dataset 1 suggest that thermal modelling can detect a fault one month before failure in the majority of turbines studied. The highest losses can be seen at a node in the same stage as the fault, HSC; this result is validated by the PFI, which shows that this node most frequently has the highest feature importance for classification. The PFI analysis on dataset 2 does not give a clear indication of which variable has the greatest effect on classification, and there is only a weak correlation between the PFI and thermal modelling results. This could be explained by the difference between the failure location at the planetary stage and the sensor locations in the HS stage.

Combining Thermal Modelling and Machine Learning Approach
To assess the effectiveness of thermal modelling for fault detection, the thermal loss data from the node with the greatest increase in losses at 1M2F for each turbine were selected and added as a feature to the classification dataset, and the PFI algorithm was run. The results are shown in Figure 12a,b for datasets 1 and 2, respectively. Figure 12a shows that the losses are the most important feature when determining classification for dataset 1; for turbine 3, the importance is significant. This suggests that differences between healthy and unhealthy gearboxes are clearer in the heat and power domains than they are in the temperature domain. For dataset 2, the loss importance is not the same. The loss data are the most important feature for only 1 turbine but score highly for 3 of the 5 other turbines. The fact that the engineered feature based on losses becomes the most important or one of the most important features for a correct classification demonstrates the value of combining the thermal model approach with the machine learning approach through feature engineering.
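The feature engineering step described above can be sketched as follows: pick the node whose mean modelled loss increases most at 1M2F, then append that node's loss as an extra column alongside the temperature features. The node labels, values and helper names are hypothetical, and the resulting feature matrix would then be passed to whichever classifier and PFI routine is in use.

```python
from statistics import mean

def node_with_greatest_increase(healthy_losses, near_failure_losses):
    """Both arguments map node name -> list of modelled losses."""
    return max(
        healthy_losses,
        key=lambda n: mean(near_failure_losses[n]) - mean(healthy_losses[n]),
    )

def add_loss_feature(temperature_rows, loss_series):
    """Append the modelled loss as one extra feature on each row."""
    return [row + [loss] for row, loss in zip(temperature_rows, loss_series)]

# Hypothetical modelled losses (W) per node
healthy_losses = {"node_a": [250.0, 300.0], "node_b": [200.0, 210.0]}
near_failure_losses = {"node_a": [400.0, 450.0], "node_b": [205.0, 215.0]}

# Hypothetical temperature features (degC), one row per SCADA record
healthy_temps = [[60.0, 55.0], [62.0, 56.0]]
near_failure_temps = [[66.0, 58.0], [68.0, 59.0]]

node = node_with_greatest_increase(healthy_losses, near_failure_losses)

# Feature matrix with the engineered loss column, and class labels
X = (add_loss_feature(healthy_temps, healthy_losses[node])
     + add_loss_feature(near_failure_temps, near_failure_losses[node]))
y = [0, 0, 1, 1]  # 0 = healthy, 1 = 1M2F
```

Keeping the temperature columns alongside the loss column is what allows the importance of the engineered feature to be compared directly with that of the raw temperatures.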

Future Work
As this method has only been applied to two test cases, future work should apply the method to additional test cases and, more specifically, to other failure modes. The failures in this research were both bearing failures, so it would be interesting to explore this method on gear tooth failures, for example. This method could be developed by combining it with other measurement data, such as vibration data. There is growing research on this idea; Reference [26] used a model for the remaining useful life (RUL) prediction of bearings that used both vibration and temperature characteristics, taking full advantage of the high sensitivity of vibration characteristics and the powerful anti-interference capability of temperature characteristics. Reference [35] found that supplementing vibration data with temperature measurements gave a better representation of the state of the machine when compared with vibration data alone.
Future work could also involve applying this method to improve condition monitoring of other components, for example, the generator. There is existing research that applied thermal network modelling to induction machines [36] but not in the context of WT condition monitoring.

Conclusions
This research explored existing condition monitoring and fault detection literature to compare and contrast methods and results, and highlighted some of the obstacles that existing techniques face. It then used three different methods to analyse SCADA temperature data: one based on physical modelling, one using machine learning techniques, and one combining the physical model and machine learning approach through feature engineering. The results of this research suggest that, overall, when a fault is not obvious via a difference in temperature data, thermal modelling can detect a fault. For dataset 1, the results from thermal network modelling were significantly more influential than the temperature data in turbine health classification. The results show an agreement between thermal modelling and PFI to an extent, but this agreement seems to be impacted by the failure mode examined and the distance of sensors from the failure. The HS failure with sensors at the HS stage gave closer agreement than the planetary failure with sensors located only at the HS stage. The results of the thermal modelling and PFI were reconciled with wind speed data to explore if environmental conditions impacted the results. Further work could be undertaken to explore other environmental or operational conditions to determine if they have an effect on the results of the thermal network modelling.
Application of this theory has the potential to set condition monitoring threshold values for operational WT gearboxes by applying thermal modelling and gearbox power loss calculations [37] for a healthy gearbox. This paper shows the potential advantage of physics-informed machine learning, especially in complex systems such as WTs. Domain knowledge is necessary for black box models. Additionally, the approach outlined in this paper allows for new high-weight features to be engineered through thermal modelling for data-driven failure prediction using classification. By improving the accuracy of condition monitoring, the cost of WT O&M can be reduced.

Funding: This work has been funded by the EPSRC, project reference number EP/L016680/1.

Conflicts of Interest:
The authors declare no conflict of interest.