An Unsupervised Learning Approach to Condition Assessment on a Wound-Rotor Induction Generator

Abstract: Accurate online diagnosis of incipient faults and condition assessment on generators is especially challenging to automate through supervised learning techniques because of data imbalance. Fault-condition training and test data are either not available or are experimentally emulated, and therefore do not precisely account for all the eventualities and nuances of practical operating conditions. Thus, it would be more convenient to harness the ability of unsupervised learning in these applications. An investigation into the use of unsupervised learning as a means of recognizing incipient fault patterns and assessing the condition of a wound-rotor induction generator (WRIG) is presented. High-dimension clustering is performed using stator and rotor current and voltage signatures measured under healthy and varying fault conditions on an experimental wound-rotor induction generator. The clustering results are analyzed and validated to determine the performance and suitability of the technique. Results indicate that the presented technique can accurately distinguish the different incipient faults investigated in an unsupervised manner. This research will contribute to the ongoing development of unsupervised learning frameworks in data-driven diagnostic systems for WRIGs and similar electrical machines.


Introduction
Research into condition monitoring methods for the wound-rotor induction generator (WRIG) has recently received increasing attention. This is due to the growing interest in the use of the WRIG for wind-turbine applications because of its desirable traits, such as dynamic control and relatively robust performance. Despite the WRIG's robust performance, abnormal machine behavior due to faults can damage the turbine system and its subsystems, resulting in further losses caused by unplanned maintenance and downtime [1,2]. Accurately diagnosing different types of faults at an incipient stage has thus been an ongoing research challenge.
Predictive maintenance generally consists of two key aspects: detecting and diagnosing faults through available methods, and thereafter removing the anomaly that is reducing the performance of the machine, in order to prevent unplanned downtime and/or failure [3]. The most commonly occurring problems with these machines are inter-turn short-circuited windings on the stator and rotor, broken rotor bars and end rings, bearing faults, and air-gap eccentricities in static, dynamic or mixed forms [4].
Although many advanced signal processing techniques have been presented for fault detection and diagnosis, these require expert knowledge and experience to implement adequately [5]. The field of condition monitoring on electrical machines is evolving into the digital age, and there is a growing need to improve condition monitoring through the development of efficient and reliable predictive analytics systems. Thus, intelligent or expert systems must extend these fault detection and diagnosis methods to reduce the need for expert knowledge and experience. The use of machine learning (ML) techniques in modern industrial informatics now offers the potential to automatically diagnose the aforementioned problems while the machine is in operation. Supervised learning techniques are by far the most commonly proposed techniques for achieving these data-driven diagnostic goals. These supervised learning approaches vary, have been extended to various types of machines, and continue to be widely researched and proposed [6][7][8]. However, the practical problem of data imbalance is often overlooked. The problem arises from the lack of availability of faulty-condition data. The research presented in [9] is an example in which supervised learning is only made successful through the use of an extensive database of vibration measurement data under healthy and faulty motor conditions. As pointed out in [10], traditional intelligent methods, i.e., those employing supervised techniques, fall short of obtaining adequate diagnostic accuracy in practice due to the limited availability of labeled data, which results in fault-type data imbalance. 
In fact, the challenges that imbalanced data sets pose to fault diagnosis on rotating electrical machines have become an important topic for researchers in the field of intelligent condition monitoring, as such data constrain the accuracy of conventional techniques based on supervised learning [11]. Imbalanced data are often overlooked when applying supervised learning techniques, because typical real-world applications mostly operate under healthy conditions and faults occur sparsely over the life of the machine [12]. Although it is possible to experimentally emulate fault-condition data for the purpose of training, validating and testing diagnostic models, these data do not account for the various levels and types of faults that occur in practice. Thus, the successful deployment of the ML model in practice depends on how closely the experimentally emulated training and test data match the practical scenario. Besides these drawbacks of data-driven methods, model-based methods operate on the basis of detecting discrepancies between actual system behavior and a mathematical model [13]. The drawback of these approaches is that it is not practicable to establish fault indicators that can be accurately modelled and measured for all potential fault occurrences on a WRIG. This research is aimed at addressing the aforementioned practical challenges of implementing machine learning for automated incipient fault diagnosis. An investigation into the use of an unsupervised learning approach for condition diagnosis on a WRIG is presented. The feasibility of the unsupervised learning approach is assessed here using electrical measurement modalities of an experimental WRIG as model attributes, together with frequency-domain signal processing thereof. 
Results indicate that the presented unsupervised learning approach accurately extracts and clusters patterns in the selected attributes corresponding to the different incipient fault conditions tested on the experimental WRIG. The potential for employing unsupervised learning in detecting anomalous behavior that deviates from healthy operation of the WRIG is demonstrated. To the best of the authors' knowledge, this study has not yet been presented and will contribute to the research and development of unsupervised learning frameworks in data-driven diagnostic systems for WRIGs.

Background
Machine learning (ML) enables the acquisition of knowledge for the main purpose of making decisions and predictions [14]. The different types of learning techniques used in ML can be broadly categorized into supervised learning, semi-supervised learning, unsupervised learning and reinforcement learning, as presented in Figure 1. In supervised learning, the classifier is trained with known data so that it can predict, or classify, unknown instances. Unsupervised learning, on the other hand, learns from the input data without any specific outcome variables. Semi-supervised learning uses the labeled data from a smaller subset of the data to identify and label other data in order to subsequently retrain the model. Reinforcement learning interacts with a dynamic environment to achieve objectives based on rewards and penalties [15]. Unsupervised learning essentially determines hidden patterns based on input data without corresponding output labels [16]. Because unsupervised learning uncovers distinct classes without a teacher, the actual labels must be manually identified [17]. Simply put, unsupervised learning results generally need manual intervention to confirm the target classes. Although unsupervised learning is largely suited to more exploratory applications, being more subjective and lacking the straightforward objective of response prediction, its usage is ever increasing [16]. Common applications of unsupervised learning include, inter alia, DNA/gene classification in computational biology [18,19], physics [20], wireless communications [21] and building systems [22].
The lack of fault-condition training and test data that precisely account for the eventualities and nuances of practical operating conditions on an electrical machine renders continuous online monitoring of the machine, as part of a predictive maintenance strategy, a difficult proposition. Unsupervised learning does offer the potential to "group" machine responses over time in a manner that can be used to identify significant changes in the health of the machine. However, this potential can only be realized if a clear distinction between healthy and different fault conditions can be made without the use of training data. This work investigates this potential condition monitoring approach using the unsupervised learning technique of clustering.
Clustering is simply a method of uncovering distinct groups or classes in a set of observations. In its simplest form, the similarity between observations is measured by the distance between them within the feature space. Euclidean distance is commonly used to quantify the similarity between an instance and a centroid [23]. In general, the most commonly used clustering techniques are k-means, hierarchical, density-based spatial clustering of applications with noise (DBSCAN), and grid- and model-based methods [23]. Ultimately, clustering is a powerful means of uncovering and visualizing hidden trends in a dataset, and of grouping instances for modelling purposes [24]. Hierarchical clustering is a method in which the clusters are shaped as a tree structure, with each node representing a different cluster. The method can be divisive or agglomerative, based on the splitting or merging of clusters [25], and provides more detail regarding the relations between data at different levels. Hierarchical clustering has been found to be well suited to smaller datasets, whereas partition clustering has been found to work better with larger datasets [26]. K-means is one of the most commonly applied clustering techniques across a multitude of different fields. This type of clustering essentially works on the principle of grouping instances according to their distances in the feature space [25]. The key aspects of this method are determining the number of groupings in the dataset and how well the instances are grouped. The commonly used validity methods for k-means clustering are silhouette analysis and the elbow method. The elbow method assists with determining the optimal number of clusters by evaluating the error for different numbers of clusters and the corresponding assignments of instances. Silhouette analysis is also used to validate and interpret the results of clustering by evaluating the distance measures between each point in a cluster and its neighboring cluster. 
The validation is based on comparing the tightness and separation of each cluster [27]. Silhouette values provide a means of evaluating clustering validity and also indicate whether the clusters have been appropriately partitioned. Table 1 presents an overview of typical average silhouette widths and corresponding overall strengths of separation [28]. Silhouette values range from −1 to +1. A value of +1 indicates that the instance is far away from its neighboring cluster and well matched to its assigned cluster. A value of −1 indicates that the instance is very close to its neighboring cluster and is potentially not well suited to its assigned cluster. A value of zero indicates that the instance lies on the boundary between clusters. Generally, clustering using unsupervised learning relies on the clear identification of feature patterns or similarity among instances. In k-means, this identification is done by minimizing the sum of squared distances, within the feature space, between each instance and its corresponding cluster centroid. Because k-means clustering is unsupervised, suitable performance measures are key in evaluating the results. Because different clustering techniques can produce different sets of clusters, the selected clustering technique should come with a means of verification. The presented methodology employs the k-means clustering technique because of its simple, yet powerful, ability to cluster a high number of features. Furthermore, an essential reason for selecting this technique for this research is that structured forms of validation are available, i.e., the aforementioned silhouette and elbow methods. In the presented methodology, the cluster centroids arising from several different signals of a WRIG under healthy and varying fault conditions are first investigated using an experimental setup. 
Thereafter, these patterns in the data set, found without pre-existing labels, are verified against the ground truth to confirm the accuracy of the technique.
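As a minimal sketch of the silhouette analysis described above, the following Python code computes the silhouette value s(i) = (b − a)/max(a, b) of each instance, where a is the mean distance from the instance to the other members of its own cluster and b is the mean distance to the members of the nearest other cluster. The function name, the toy two-dimensional points and the labels are illustrative assumptions and are not taken from the experimental data.

```python
import math


def silhouette_values(points, labels):
    """Return the silhouette value of every point (needs at least two clusters)."""
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)

    values = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # A cluster with a single member is conventionally given s(i) = 0.
            values.append(0.0)
            continue
        # a: mean distance to the other members of the assigned cluster.
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denominator = max(a, b)
        values.append((b - a) / denominator if denominator > 0 else 0.0)
    return values


# Toy data (assumed): two tight, well-separated groups of two instances each.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labels = [0, 0, 1, 1]
values = silhouette_values(points, labels)
average = sum(values) / len(values)  # average silhouette width of the clustering
print(values, average)
```

The average of these values over all instances is the average silhouette width that is compared against the ranges of Table 1. Deliberately swapping labels in the toy data (for example `[0, 1, 0, 1]`) drives the values negative, which mirrors the interpretation of negative silhouette values as poorly assigned instances.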
Harmonic order tracking analysis is a well-known method for condition monitoring on machines and essentially consists of tracking stator current signature harmonics as fault indicators [29,30]. Additionally, the use of multiple signatures, or a combination of signatures beyond only the stator current signature, has been shown to extend diagnostic accuracy [31]. The different fault conditions considered for the investigation are stator- and rotor-winding inter-turn short-circuits, and brush faults. Winding faults are considered because these faults constitute the major proportion of faults that occur with WRIGs in practice, typically occurring as incipient faults and ultimately cascading into other faults such as opening or shorting of the phase windings. Previous studies have shown that stator faults contribute 30% to 40% of the faults in the electrical failure category. In addition, these faults cause the windings to become asymmetrical [32,33]. Therefore, in this work, the harmonic orders of the stator voltages and currents, and rotor currents, are employed as the feature candidates for the high-dimension cluster modelling. Multiple tests of this unsupervised approach are conducted on all the signatures, both separately and combined.

Experimental Configuration
The laboratory configuration used in the investigation is presented in Figures 2 and 3. It entails the use of a wound-rotor induction machine (3-phase, 1 kW, 380 V nominal, 4-pole), together with a relatively larger induction machine and variable speed drive combination used as a prime mover. The setup also uses a capacitor bank for voltage self-excitation, variable resistors for loading, and several transducers for measuring voltage, current, shaft speed, etc., all connected to a data acquisition system interfaced with a computer. This hardware-in-the-loop configuration enables real-time measurement and online monitoring. The modes of operation were recorded under different conditions as presented in Table 2. Specific details of the induction machine used for WRIG construction are given in Table 3.
[Table 2 fragment: Rotor-winding short (6 turns)]
For the presented experimental investigation, the machine is monitored under healthy and different fault conditions, both when the generator is not loaded and when it is loaded. The three faults considered are stator and rotor inter-turn short-circuited windings, and a contact fault on the brushes. These faults were implemented by modifying the experimental configuration. The stator voltages, stator currents, and rotor currents are measured under the different test conditions. A data instance in this case represents the sampling and recording of each of the signals over a period of time during the different test conditions. During the generator test conditions, the procedure of recording the different signals is repeated until a set of at least 20 instances of each of the feature signatures is recorded. It should be highlighted that the presented investigation aims to demonstrate the potential of the unsupervised approach in providing sufficiently separable clusters under normal and different faulty conditions. In practice, suitable sample sizes may vary, and thus no single sample size is stipulated for all cases. The performance of the clustering should be carefully analyzed via the elbow plots and the silhouette plots in order to ensure that the results arising from the data used are suitable, as is carried out in this work.

Signal Processing and Feature Extraction
The experimental data are based on four machine conditions, namely healthy, inter-turn short-circuits on the stator and rotor windings, and brush contact faults. The recorded signal data for the 3 phases of the stator voltage and current, and rotor current, are decomposed into frequency spectra using the Fast Fourier Transform (FFT), i.e., single-sided amplitude spectra or harmonic components of each of the measured signals, whereby each phase of the voltages and currents is treated as a separate signal. The harmonic components of each signal are then normalized with respect to the magnitude of the largest harmonic, which is the fundamental. All harmonic orders are therefore expressed relative to the fundamental harmonic order, and the fundamental of each signal is equal to 1 after normalization because it has been normalized with respect to itself. These sets of normalized signal harmonics are used as features or attributes of the cluster models. The spectra for the stator voltages and currents are obtained up to and including the 10th order, and the spectra for the rotor currents are obtained up to and including the 3rd order. Examples of the experimental measurements of the WRIG induced stator voltages and rotor currents are given for healthy no-load and stator-winding fault no-load conditions in Figure 4. As can be seen, there are no immediately overt differences, particularly with the measured voltages. This demonstrates the necessity of extracting the signal frequency harmonics, which are more sensitive to incipient fault occurrences. Examples of the normalized frequency spectra for one of the three phases of each of the stator voltage, stator current and rotor current are given in Figure 5, under each of the conditions analyzed. 
Here, the differences between conditions become more apparent, but still exemplify the need for an intelligent data-driven system to track these patterned variations in the various harmonic features. The harmonic orders used for the stator voltages and currents, and rotor currents, are related to the relevant observable frequencies of interest associated with the machine design and operation. The frequency range of the stator voltages and currents arises mainly from the electromagnetic design of the machine, as a function of excitation and the stator and rotor designs. Thus, the orders recorded are based not only on observable frequencies, but also on the orders of prominent frequencies that are key characteristics of the machine's design and operation. Thus, there are 30 dimensions (3 phases and DC components) for the stator voltage and for the stator current feature spaces, and 9 dimensions (3 phases and DC components) for the rotor current feature space.
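The feature extraction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the processing chain used in the experiments: the function names are hypothetical, the signal is synthetic, each record is assumed to span an integer number of fundamental cycles so that harmonic order h falls exactly on a DFT bin, and the orders are taken as 1 to 10 (whether the DC component is counted among the orders is not clear from the text). A direct DFT evaluated at the harmonic bins stands in for a full FFT.

```python
import cmath
import math


def harmonic_magnitudes(samples, cycles, max_order):
    """Single-sided amplitudes at harmonic orders 1..max_order.

    `samples` must cover exactly `cycles` fundamental periods, so that
    harmonic order h corresponds to DFT bin h * cycles.
    """
    n = len(samples)
    magnitudes = []
    for order in range(1, max_order + 1):
        k = order * cycles
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples)
        )
        magnitudes.append(2.0 * abs(coeff) / n)
    return magnitudes


def normalized_features(phases, cycles, max_order):
    """Concatenate per-phase harmonics, each normalized to its fundamental."""
    features = []
    for samples in phases:
        magnitudes = harmonic_magnitudes(samples, cycles, max_order)
        fundamental = magnitudes[0]
        features.extend(m / fundamental for m in magnitudes)
    return features


def synthetic_phase(shift, n=400, cycles=4):
    """Assumed test signal: unit fundamental plus a 5th harmonic of amplitude 0.2."""
    out = []
    for i in range(n):
        theta = 2.0 * math.pi * cycles * i / n + shift
        out.append(math.sin(theta) + 0.2 * math.sin(5.0 * theta))
    return out


phase_shifts = (0.0, -2.0 * math.pi / 3.0, 2.0 * math.pi / 3.0)
phases = [synthetic_phase(shift) for shift in phase_shifts]
features = normalized_features(phases, cycles=4, max_order=10)
print(len(features))  # 3 phases x 10 orders = 30 feature dimensions
```

With three phases and ten orders per phase, the resulting vector has the 30 dimensions quoted for the stator voltage and stator current feature spaces; using `max_order=3` gives the 9 dimensions of the rotor current feature space. Measured records that do not span an integer number of cycles would additionally need resampling or windowing, which this sketch leaves out.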
The k-means clustering method is applied in this investigation (typical methodology shown in Figure 6). In general, the number of clusters, k, is first selected. Initial cluster centers are then chosen, for example at random, and each instance is assigned to one of the k clusters based on similarity. The Euclidean distance between each instance and each cluster centroid is then computed. If an instance is not closest to the centroid of its assigned cluster, it is reassigned to the nearest cluster. A new centroid is then computed for each cluster as the mean of its assigned instances. The process iterates until the criterion function converges, whereby each instance is assigned to the most suitable cluster. The selection of the value of k, or number of clusters, requires careful consideration and depends on analysis of the clustering results obtained under different values of k. As mentioned, these analyses are carried out with the use of the silhouette and elbow methods. In the context of the presented application, the coordinates of the healthy cluster centroid are of particular interest in practice. The healthy cluster centroid coordinates would be specific to the machine and its application, i.e., the machine's design and specifications, and its operating conditions. It is expected that, in practice, the coordinates of new healthy instances will undergo some shifting during continuous monitoring but will remain in and around the range of the healthy cluster. The movement of newly measured instances towards another cluster, and away from the healthy cluster, will indicate the occurrence of anomalies and facilitate fault detection. However, the potential to sufficiently separate clusters arising from healthy and different fault conditions, which enables detection of deviations, needs to be proven. 
Thus, the presented methodology is developed to determine if clusters arising from healthy and different fault conditions are sufficiently separable to enable fault detection on the WRIG via unsupervised learning.
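The k-means iteration and the idea of monitoring deviation from the healthy cluster can be sketched as below. This is a hedged, minimal illustration rather than the implementation used in the study: the function names and the toy two-dimensional data are assumptions, a deterministic farthest-first initialization is swapped in for random initialization so that the example is reproducible, and the first instance is simply assumed to be a known healthy reference.

```python
import math


def nearest_cluster(point, centroids):
    """Index of the centroid with the smallest Euclidean distance to `point`."""
    return min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))


def farthest_first_init(points, k):
    """Deterministic initial centroids (an assumption, not random initialization)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        candidate = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(candidate))
    return centroids


def kmeans(points, k, max_iter=100):
    """Alternate assignment and centroid update until the labels stop changing."""
    centroids = farthest_first_init(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest_cluster(p, centroids) for p in points]
        if new_labels == labels:
            break  # converged: no instance changed cluster
        labels = new_labels
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the previous centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Toy data (assumed): three "healthy-like" and three "fault-like" instances.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centroids, labels = kmeans(data, k=2)

healthy_cluster = labels[0]  # assumption: instance 0 is a known healthy reference
new_instance = (4.8, 4.9)    # hypothetical newly measured feature vector
drifted = nearest_cluster(new_instance, centroids) != healthy_cluster
print(labels, drifted)
```

In a monitoring setting, the same nearest-centroid check (or the distance to the healthy centroid compared against a threshold) would be applied to each newly measured instance; the threshold itself would have to be chosen per machine and is not specified here.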

Results and Analysis
The proposed k-means methodology is applied to the unlabeled data for stator current, stator voltage and rotor current. The k-means results obtained for the rotor current, using experimental data, are shown in Figures 7-9. A scatter plot of clustered instances is shown in Figure 7, visualized in terms of three feature harmonics when k = 4, with the corresponding silhouette plot given in Figure 8. The silhouette values for the case of rotor current features only are also given for k = 5 in Figure 9.
Figure 10 shows the k-means clustering for the stator current harmonics as feature data, where k = 4. The corresponding silhouette plots for k = 4 and k = 5 are shown in Figures 11 and 12, respectively. It is observed that most points have large silhouette values for their clusters. Additionally, it is found that when k = 5, cluster 1 at k = 4 moves to cluster 4. The transition from k = 4 to k = 5 also results in cluster 3 separating into 2 narrower clusters. Cluster 3 contains a misclassified instance with a silhouette value of approximately −0.1685.
Table 4 presents a summary of the average silhouette values obtained. These values indicate that the k-means clustering yields reasonable to excellent structures. The use of stator voltage harmonics as feature data achieved a reasonable split, with an average silhouette value greater than 0.6. Rotor current features provided good performance, while stator current achieved the best clustering performance when single-signature harmonics were employed as the feature set. This means that the stator current, out of each of the signatures used for the investigation, carries the most information regarding the incipient faults considered. This is somewhat intuitive, given that the stator current of an electrical machine is relatively more sensitive to intrinsic design and operational factors, which are affected by the fault conditions, in comparison to other machine signatures. 
This is in line with previous studies and the recognition of stator current as a good fault indicator. The prevalence of motor current signature analysis over the years is a good example of this. The combined signature feature set does, however, yield the best clustering performance. Although the stator current signature does hold more characteristic information (and patterned variance) related to the different conditions, the combined signature set holds this information plus additional correlation strength arising from the other signatures, i.e., stator voltage and rotor current. In order to directly compare clustering for each of the different feature sets used, the sum of squared errors for each of the clustering models is determined for different numbers of clusters. It should be highlighted that the sum of squared errors is normalized here to compare the elbow plots as given in Figure 19. This is because the feature sets have varying sizes and numbers of dimensions, and thus require normalization for suitable comparison. Figure 19 confirms that the optimal number of clusters obtained for each of the different models closely matches the ground truth, particularly when the combined signatures' harmonics are used as feature data.
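A normalized elbow curve of the kind compared in Figure 19 can be sketched as below. The text does not state how the sum of squared errors is normalized, so this sketch assumes division by the value obtained for k = 1, which maps every curve to start at 1 regardless of the number of instances or feature dimensions. The function names, the deterministic farthest-first initialization and the toy data with three groups are all illustrative assumptions.

```python
import math


def kmeans_sse(points, k, max_iter=100):
    """Run a basic k-means and return the within-cluster sum of squared errors."""
    # Deterministic farthest-first initialization (assumed for reproducibility).
    centroids = [tuple(points[0])]
    while len(centroids) < k:
        candidate = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(tuple(candidate))

    for _ in range(max_iter):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            groups[nearest].append(p)
        updated = [
            tuple(sum(col) / len(group) for col in zip(*group)) if group else centroids[c]
            for c, group in enumerate(groups)
        ]
        if updated == centroids:
            break  # centroids no longer move
        centroids = updated

    return sum(min(math.dist(p, c) for c in centroids) ** 2 for p in points)


def normalized_elbow_curve(points, max_k):
    """SSE for k = 1..max_k, divided by the SSE at k = 1 (assumed normalization)."""
    sse = [kmeans_sse(points, k) for k in range(1, max_k + 1)]
    return [value / sse[0] for value in sse]


# Toy data (assumed): three well-separated groups of three instances each.
data = [
    (0.0, 0.0), (0.2, 0.0), (0.0, 0.2),
    (5.0, 5.0), (5.2, 5.0), (5.0, 5.2),
    (10.0, 0.0), (10.2, 0.0), (10.0, 0.2),
]
curve = normalized_elbow_curve(data, max_k=5)
print(curve)
```

For this toy data the curve drops sharply up to k = 3 and flattens afterwards, so the elbow coincides with the three groups that were built into the data. The same reading applies to the comparison against the ground-truth number of conditions.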

Conclusions
The need for improved accuracy and flexibility in condition monitoring approaches on electrical machines has prompted the transition from model-based methods to data-driven approaches. Although there have been tremendous steps in the development and application of data-driven approaches in the field, largely based on supervised learning techniques, there are still the problems of data imbalance and a lack of training data that precisely accounts for all the nuances of faults encountered in practice. In an effort towards overcoming these problems, this study investigates the suitability of an unsupervised learning approach for accurately diagnosing incipient faults on a WRIG using online signatures. A high-dimensional k-means modelling approach was used to develop diagnostic models for the machine under different incipient fault conditions, i.e., stator- and rotor-winding faults, and brush faults. Each of the models was developed and tested using single-signature harmonics and combined signature harmonics as features. The clustering models were analyzed and validated using silhouette values and elbow plots. The clustering results indicate that stator current harmonics provide the best results among the single-signature harmonic feature sets, while the combined set of harmonics from all signatures yielded excellent model structures when measured against the ground truth. Overall, the results of the investigation indicate that the presented unsupervised approach is well suited to online condition diagnostics on WRIGs, even under incipient fault conditions. This research thus serves as a meaningful step in enhancing data-driven diagnostic systems, through unsupervised learning, for predictive maintenance on WRIGs and similar electrical machines.