Review of Recent Advances in the Application of the Wavelet Transform to Diagnose Cracked Rotors

Wavelet transform (WT) has been used in the diagnosis of cracked rotors since the 1990s. At present, WT is one of the most commonly used tools to treat signals in several fields. Understandably, this has been an area of extensive scientific research, which is why this paper aims to summarize briefly the major advances in the field since 2008. The present review considers advances in the use and application of WT, the selection of the parameters used, and the key achievements in using WT for crack diagnosis.


Introduction
The detection of cracks in rotors is a vitally important area of maintenance, as failure to recognize cracks can lead to catastrophic mechanical failure. The dynamic behaviour of cracked rotors has been studied since the 1960s, and advances since then have allowed cracks to be detected and recognized before breakage occurs. However, there remain some cases where crack detection occurs late, when the situation has become critical and catastrophic failure is imminent. Therefore, it is important to focus research on improving existing crack detection techniques by using knowledge of the dynamic behaviour of these systems. Studies of cracked rotors have been carried out using a variety of methods in order to select patterns able to detect fault conditions.
This has become such a pressing issue that in 2008 the journal "Mechanical Systems and Signal Processing" published a special issue focusing solely on cracked rotors. Bachschmid and Pennacchi [1] highlighted that, despite the high number of papers published about the dynamics of cracked rotors, these papers do not present experimental results, making it difficult to translate the results obtained to real systems. It is often the case that these studies do not focus on the inverse process of crack identification. Sekhar [2] stated that most studies dealt with a single crack; with more than one crack, however, the problem of crack sizing and location becomes more complex. A review of crack identification in rotating shafts by Papadopoulos [3] concluded that most of the work in crack detection was based on vibration signatures. Papadopoulos classified diagnosis techniques as either model-based or vibration methods.
Model-based methods aim to reproduce the effects seen in real systems by applying equivalent loads at the location of the crack. Usually these models are based on information extracted from directly measured signals, signal models, and process models, as demonstrated in [4][5][6]. Vibration methods are based on detecting variations in the response, which are then attributed to a crack. Signals analyzed via vibration methods can come from real systems or from models, as in [7,8], respectively. Models are mainly based on the crack breathing mechanism; further detail can be found in [9], and new breathing functions were proposed in [10]. The most commonly used model in this field is the simple Jeffcott rotor model, detailed in [11,12]. The Finite Element Approach (FEA) has attracted researchers as an investigative tool for the study of crack breathing phenomena. However, traditional FEA has some limitations, such as low efficiency and accuracy, slow convergence, and strong nonlinearity. Wavelet spaces have also been employed, and from this wavelet finite element methods (WFEM) have been developed [13]. When numerical results obtained using this model are compared with experimental data, it is possible not only to detect the crack but also to determine its location and depth. These models have been applied to experimental results, as in [14], where WFEM was combined with Genetic Algorithms (GA). Other examples of the WFEM model have been presented in [15,16].
Signals obtained from rotating machinery have complex structures and contain a large amount of information. The aim of condition monitoring is to select the proper processing to extract the optimal features of a signal for fault detection. The features must meet two main requirements: they must be easy to handle at a low computational cost, and they must contain valuable information about the fault to be predicted. Within vibration methods, a wide range of signal processing methods can be applied for this purpose. The Short Time Fourier Transform (STFT) and the Hilbert Transform (HT) have traditionally been used to observe changes in the response or in the eigenfrequencies when a crack forms and propagates [17,18]. In 2004, Sabnavis et al. [19] affirmed that the 1x and 2x frequency components obtained using the Fast Fourier Transform (FFT) and HT undergo very significant changes at steady state when a crack forms. However, in practice, these changes are very difficult to detect due to the presence of noise and other interfering factors. In recent years, new techniques, which Sabnavis et al. [19] termed 'non-traditional methods', have appeared. These non-traditional techniques work in both the time and frequency domains. Specifically, the Hilbert-Huang Transform (HHT), based on Empirical Mode Decomposition (EMD) to obtain the Intrinsic Mode Functions (IMFs), has been used effectively in the diagnosis of cracked rotors. This approach was detailed in [20,21]. However, due to its effectiveness in treating non-stationary signals, the most widely used technique in crack diagnosis in rotors is the Wavelet Transform (WT). Nowadays, WT is the most common technique used to treat signals [22].
WT theory has also been applied to acoustic emission (AE) signals to detect crack growth, as demonstrated in [23]. Recently, AE signals have been widely used to monitor the condition of rotating machinery. When the sensor is close to the source, AE is capable of detecting crack growth [24]. Typical frequencies associated with AE activity range from 20 kHz to 1 MHz.
A review of the use of WT in the diagnosis of faults in machinery was presented by Peng and Chu [25] in 2004. They stated that the use of WT had not yet become well established. The main reason for this is that applying wavelets is not straightforward: the wavelet function and the scale range must be selected, and there are no standardized or generalized methods for making these choices. Another drawback of using WT is the difficulty in interpreting the results obtained.
The aim of the present review is to summarize the advances that have been achieved since 2008 in the application of WT to diagnose cracked rotors.

The Wavelet Transform
WT is a modern mathematical development for the treatment of signals in both the time and frequency domains. Therefore, it is especially useful for observing changes in patterns over time. The Fourier Transform, and most techniques derived from it, are inappropriate for treating non-stationary signals due to the absence of any temporal information. A technique that is suitable for treating non-stationary signals is the STFT. However, the major disadvantage of this technique is the frequency resolution obtained, which remains constant for the whole signal because the same window is applied throughout. WT, on the other hand, provides high frequency resolution at low frequencies and high time resolution at high frequencies, as desired.
The first concept of wavelets was set up by Morlet in 1984. Later, Meyer, Mallat, and Daubechies contributed important developments regarding orthogonal wavelet functions. However, the use of WT only became popular in engineering applications at the end of the 1980s, when Mallat [26] found that there was a relationship between Quadrature Mirror Filters and orthonormal wavelet functions. From this, new techniques allowed WT to be applied to a signal using recursive filter banks. Recently, WT has become the most widely applied tool in signal processing in many different fields, such as voice recognition [27,28], noise reduction [29,30], electrocardiography [31], and radio-frequency interference mitigation [32], amongst others.

Continuous Wavelet Transform
In the same way that the Fourier Transform (FT) obtains correlation coefficients between the analyzed signal and a sinusoid, WT obtains correlation coefficients between the signal and an orthonormal function, called the wavelet function, which depend on the scale and position of the wavelet function. There are many different families of wavelet functions, such as Daubechies, Coiflet, Symlet, Morlet, and Meyer.
The Continuous Wavelet Transform (CWT) allows a signal to be analyzed through its correlation coefficients instead of the whole signal information. The mathematical formula for determining CWT is shown in Equation (1):
$$CWT(c, b; \psi) = w(c) \int_{-\infty}^{+\infty} x(t)\, \psi^{*}\!\left(\frac{t - b}{c}\right) dt \qquad (1)$$
where x(t) is the signal in the time domain, ψ is the wavelet function (ψ* being its complex conjugate), w is a weighting function, c is the scale parameter, and b is the shift parameter. CWT is one of the best tools available to detect singularities of a signal. Singularity detection is carried out with the local maxima lines [25].
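As an illustration of the definition above, the following minimal sketch approximates the CWT with a Riemann sum. It is not taken from any of the reviewed works: the Mexican hat wavelet, the weighting w(c) = 1/sqrt(c), and the names `mexican_hat`, `cwt`, and `scalogram` are all assumptions made for the example. The scalogram is computed as the squared modulus of the coefficients.

```python
import math

def mexican_hat(t):
    """Real-valued Mexican hat wavelet, chosen here only as an example of psi."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)

def cwt(x, scales, dt=1.0, psi=mexican_hat):
    """Riemann-sum approximation of w(c) * integral x(t) psi((t - b) / c) dt.

    Assumes w(c) = 1 / sqrt(c) and a real wavelet (so no conjugate is needed).
    Returns one row per scale c and one column per shift b (in samples).
    """
    n = len(x)
    rows = []
    for c in scales:
        weight = 1.0 / math.sqrt(c)
        row = []
        for ib in range(n):
            acc = 0.0
            for it in range(n):
                acc += x[it] * psi((it - ib) * dt / c)
            row.append(weight * acc * dt)
        rows.append(row)
    return rows

def scalogram(coeffs):
    """Squared modulus of the CWT coefficients."""
    return [[abs(v) ** 2 for v in row] for row in coeffs]

# A slow sine with a single impulsive singularity at sample 40.
signal = [math.sin(2 * math.pi * 0.05 * i) for i in range(64)]
signal[40] += 2.0

coeffs = cwt(signal, scales=[1.0, 2.0, 4.0])
power = scalogram(coeffs)
# At the smallest scale, the largest coefficient sits on the singularity.
peak_shift = max(range(len(signal)), key=lambda b: abs(coeffs[0][b]))
```

In this toy case the small-scale coefficients localize the impulse in time, which is the behaviour that singularity detection relies on.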
Regarding crack detection, Nagaraju et al. [41] affirmed that the presence of sub-critical components (at 1/3 and 1/2 of the critical speed) in the CWT may be indicative of cracking, although this may not always be reliable due to the high sensitivity to noise. They proposed an alternative technique involving the calculation of the phase angle between two transverse vibration signals (cracked and un-cracked). The phase angle can be calculated with what they termed the Cross Wavelet Transform (XWT).
The sub-critical 1/3 and 1/2 components calculated with CWT have also been used as indicators of the presence of a crack [42]. The amplitude values of sub-critical and critical peaks in the CWT have been used to feed an Artificial Neural Network (ANN) [43], allowing crack diagnosis and determination of the position and depth of the crack.
CWT has also been applied to diagnose notched rotors (i.e., rotors with a transverse open crack) [44], where it has been demonstrated that both CWT and changes in the 2× harmonic can be used as robust indicators.
Srinivas et al. [45] used the CWT coefficients as the input for an ANN. They studied the ability of the system to detect combined faults of unbalance and shaft crack. This method was tested successfully with a hit rate of 99.9%.
CWT has been used in engineering applications for fault detection of rotating machinery in the form of a scalogram. The scalogram is defined as the square of the modulus of the CWT. However, the use of CWT in current applications to diagnose machinery faults is still relatively rare. This is due to the fact that the visual interpretation of wavelet results is often difficult. Efforts have been made to extract the best features by analyzing the residual wavelet scalogram [46].

Discrete Wavelet Transform
The Discrete Wavelet Transform (DWT) was initially obtained by using discrete values for the scale parameter, c, in Equation (1). Commonly, a dyadic grid was used, resulting in $c = 2^k$. After Mallat's developments [26], many derivations of the DWT appeared. These were based on the use of digital filters to optimize the computational process, generating information in frequency bands. This tool consists of the processing of a discrete signal, x(i), at different positions and scales (i.e., different frequencies and resolution levels), decomposing the signal into approximation (A) and detail (D). The approximation information is obtained by means of a low pass filter, and the detail information by means of a high pass filter, as can be observed in Figure 2. After applying the filters to a signal, S, with a frequency band [0, π] and a number of samples, N, the frequency band is halved, obtaining both approximation information (A) in [0, π/2] and detail information (D) in [π/2, π]. Therefore, applying the Nyquist rule [47], downsampling by two can be justified without losing any relevant information; this gives a resulting number of N/2 samples [48].
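The filter-and-downsample step described above can be sketched with the Haar filters, the simplest orthonormal pair. This is an illustrative choice made for the example, not the wavelet used in the works reviewed here, and the function names `haar_dwt` and `haar_idwt` are hypothetical.

```python
import math

def haar_dwt(signal):
    """One DWT level with Haar filters.

    Low-pass filtering gives the approximation (A), high-pass filtering gives
    the detail (D); both are downsampled by two, so each has N/2 samples.
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    r = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / r for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / r for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuilds the original samples from A and D."""
    r = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / r, (a - d) / r])
    return out

S = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
A, D = haar_dwt(S)
rebuilt = haar_idwt(A, D)
```

That `rebuilt` matches `S` up to rounding illustrates why downsampling by two loses no information in this setting.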
The recursive application of the filtering algorithm resulted in Multiresolution Analysis (MRA) and the Wavelet Packet Transform (WPT).
The main application of these tools in crack detection is feature extraction. The DWT can represent a signal with a limited number of coefficients that can be used directly as features.
Statistical parameters and the energy of the coefficients can also be used as features. The features selected can all come from the same decomposition level in the tree; this is called single-level basis selection. Alternatively, the features can be selected from different levels; this is termed multiple-level basis selection. The main problem when using these kinds of transforms is finding the best features for crack detection. In addition, deciding whether or not a defect exists requires a classification system. Thresholding methods and intelligent classification systems, such as ANN, GA, or Support Vector Machines (SVM), are commonly used.
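A minimal sketch of this kind of feature extraction and of a simple thresholding decision is given below. The particular statistics, the three-sigma rule, and the names `coefficient_features` and `exceeds_threshold` are assumptions chosen for illustration; the reviewed works use a variety of features and classifiers.

```python
import math

def coefficient_features(coeffs):
    """Simple statistical features of one set of wavelet coefficients."""
    n = len(coeffs)
    mean = sum(coeffs) / n
    var = sum((c - mean) ** 2 for c in coeffs) / n
    fourth = sum((c - mean) ** 4 for c in coeffs) / n
    return {
        "mean": mean,
        "std": math.sqrt(var),
        "energy": sum(c * c for c in coeffs),
        "kurtosis": fourth / var ** 2 if var > 0 else 0.0,
    }

def exceeds_threshold(value, baseline_values, n_sigma=3.0):
    """Flag a feature value that departs from the healthy baseline.

    The baseline mean and (population) standard deviation are estimated from
    feature values measured on the un-cracked system.
    """
    n = len(baseline_values)
    mean = sum(baseline_values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in baseline_values) / n)
    return abs(value - mean) > n_sigma * std

features = coefficient_features([1.0, -1.0, 1.0, -1.0])
healthy_energy = [1.0, 1.1, 0.9, 1.0]          # hypothetical baseline values
alarm = exceeds_threshold(2.0, healthy_energy)
no_alarm = exceeds_threshold(1.05, healthy_energy)
```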

Multiresolution Analysis
The concept of MRA was developed by Meyer and Mallat in 1986. MRA consists of applying the DWT recursively until the desired decomposition level is reached. The main disadvantage of MRA is that only the approximation information is decomposed further into frequency bands, as shown in Figure 3. Sawicki et al. [49] applied MRA to transverse crack detection using signals from an experimental set-up, using the RMS of the coefficients of the lower frequency bands.
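The recursion described above can be sketched as follows, again with Haar filters as an assumed stand-in for whichever wavelet is used in practice. The helper names `haar_step`, `mra`, and `rms` are hypothetical, and the RMS per band is only meant to echo the kind of feature mentioned for [49], not to reproduce that work.

```python
import math

def haar_step(x):
    """One Haar DWT level: returns (approximation, detail)."""
    r = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / r for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / r for i in range(0, len(x), 2)]
    return approx, detail

def mra(signal, level):
    """Return [D1, D2, ..., D_level, A_level].

    Only the approximation is decomposed again at each level; the details are
    kept as they are, which is the limitation noted for MRA.
    """
    if len(signal) % (2 ** level):
        raise ValueError("signal length must be divisible by 2**level")
    bands = []
    approx = list(signal)
    for _ in range(level):
        approx, detail = haar_step(approx)
        bands.append(detail)
    bands.append(approx)
    return bands

def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))

x = [math.sin(2 * math.pi * i / 16) for i in range(64)]
bands = mra(x, 3)
band_rms = [rms(b) for b in bands]   # one RMS value per frequency band
```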
MRA has been used for crack detection with very good results for signals obtained from a Jeffcott rotor model combined with neural networks [50].
MRA has also been applied to detect cracks growing in shafts using AE [23]. Specifically, envelope analysis combined with MRA has been applied with very good results. In [51], a comparison between vibration and AE signatures was carried out. Using vibration signals, diagnostic characteristics are observed after MRA decomposition. In the case of AE, envelope analysis was performed after wavelet decomposition, which allowed diagnostic information to become available.

Wavelet Packets Transform
Because MRA cannot decompose the high frequency bands, Coifman, Meyer and Wickerhauser developed the Wavelet Packets Transform (WPT) in 1992. Using WPT, all the approximation and detail information can be decomposed down to the desired level. This process can be observed in Figure 4.
Figure 4. WPT decomposition process up to decomposition level 3, where each decomposition is based on the DWT (see Figure 2).
The term W(k, j) represents the coefficients obtained for a packet, where k is the decomposition level (the number of packets obtained at that level is $2^k$) and j is the position of the packet within the decomposition level. Each correlation vector W(k, j) then has the structure given in Equation (2).
In recent years, WPT has been used for fault detection in rotating machinery and for condition monitoring [52,53]. WPT coefficients can be applied directly to fault detection, as they contain valuable information about the faults, according to [54]. Features related to the statistical parameters of WPT coefficients have been used in combination with SVM in [55]. It was highlighted in [25] that wavelet energy-based features are often not suitable for detecting incipient defects, since slight changes in a signal will be masked. Later, normalized wavelet packet quantifiers, such as relative energy, were proposed for condition monitoring [56]. WPT energy has been successfully used to detect incipient induced cracks in shafts at steady state, using vibration signals from a test rig [33], solving the problem of how to select the best feature. WPT has also been used to detect the appearance of cracks in railway axles during fatigue testing at steady state, using thresholding methods [57]. Signals from Jeffcott rotor models during start-up, under different crack conditions simulated using crack breathing functions, have also been successfully analyzed via WPT energy [58].
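To make the packet structure and the energy-based features concrete, the sketch below builds the $2^k$ packets of a level-k decomposition and computes the relative energy of each. Haar filters, the natural (filter-bank) packet ordering, and the names `haar_step`, `wpt`, and `relative_energy` are assumptions for illustration; they are not the specific configuration of the cited works.

```python
import math

def haar_step(x):
    """One Haar DWT level: returns (approximation, detail)."""
    r = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / r for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / r for i in range(0, len(x), 2)]
    return approx, detail

def wpt(signal, level):
    """Return the 2**level packets W(level, j), j = 0 .. 2**level - 1.

    Unlike MRA, both the approximation and the detail of every packet are
    decomposed again. Packets are returned in natural (filter-bank) order,
    which is not the same as increasing-frequency order.
    """
    if len(signal) % (2 ** level):
        raise ValueError("signal length must be divisible by 2**level")
    packets = [list(signal)]
    for _ in range(level):
        next_packets = []
        for p in packets:
            a, d = haar_step(p)
            next_packets.extend([a, d])
        packets = next_packets
    return packets

def relative_energy(packets):
    """Energy of each packet divided by the total energy of the level."""
    energies = [sum(v * v for v in p) for p in packets]
    total = sum(energies)
    return [e / total for e in energies]

x = [math.sin(2 * math.pi * i / 8) + 0.5 * math.sin(2 * math.pi * i / 3)
     for i in range(64)]
packets = wpt(x, 3)                    # 8 packets of 64 / 8 = 8 coefficients
rel_energy = relative_energy(packets)  # normalized feature vector
```

Because the relative energies sum to one, they can be compared between signals of different overall amplitude, which is one motivation for normalized quantifiers.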
Studies have also been carried out to examine the application of WPT to detect faults in rotating machinery using signals obtained experimentally from a custom-built rotor kit [59]. The experimental set-up simulated, at laboratory scale, the main operating conditions of rotating machinery used in a wide range of industrial applications involving several different pieces of equipment. It has been affirmed that the WPT is a powerful tool for detailed feature extraction. For impulsive faults such as rubbing, a combined WPT and CWT method has been proposed for fault detection.
WPT has also been applied to AE signals [60], showing advantages in the early diagnosis of crack faults. The diagnosis results obtained were clear, reliable and accurate.

Wavelet Transform Parameters Selection
One of the main parameters to take into consideration when using wavelet theory is the selection of the frequency resolution in each zone of the signal.This is determined by the scale range in CWT, and by the decomposition level in DWT.
When using CWT, the resolution in frequency and in time differs between the zones of the signal; therefore, the scale range must be adjusted to obtain the desired frequency range in a local zone of the signal. The scale range determines the frequency range. The relationship between the scale, s, and the frequency, f, when applying CWT is shown in Equation (3):
$$f = \frac{f_\psi \, f_s}{s} \qquad (3)$$
where $f_\psi$ is the central frequency of the wavelet function in Hz, and $f_s$ is the sampling frequency. The analysis of the standard deviation of the wavelet scale energy is one of the most important features for distinguishing impulsive from oscillatory transient disturbances, as in [61], where a power system transient analysis was performed.
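As a small worked example of the scale-to-frequency conversion, the sketch below assumes the commonly used relation f = f_ψ · f_s / s, with f_ψ taken as the centre frequency of the wavelet at unit scale and unit sampling period. The function name `scale_to_frequency` and the numerical values (a centre frequency of 0.8125, often quoted for the Morlet wavelet, and a 1 kHz sampling rate) are illustrative assumptions.

```python
def scale_to_frequency(scale, f_psi, f_s):
    """Pseudo-frequency (Hz) associated with a CWT scale.

    Assumes f = f_psi * f_s / scale, where f_psi is the wavelet centre
    frequency at unit scale and f_s is the sampling frequency in Hz.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    return f_psi * f_s / scale

F_PSI = 0.8125   # assumed centre frequency of the wavelet
F_S = 1000.0     # assumed sampling frequency in Hz

frequencies = {s: scale_to_frequency(s, F_PSI, F_S) for s in (4, 8, 16, 32)}
# Doubling the scale halves the frequency: small scales map to high frequencies.
```

Under these assumptions, scale 8 corresponds to 101.5625 Hz, so a scale range can be chosen by inverting the same relation for the frequency band of interest.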
On the other hand, when using DWT, every time the signal is decomposed the frequency range is divided by two, as shown in Figure 2. When MRA is applied, not all of the packets are decomposed; as a result, the frequency ranges of the different areas of the signal differ. In contrast to MRA, when using WPT, all the packets obtained at each level have the same frequency resolution, until the desired one is reached. Considering the application of WPT with a decomposition level, k, and the global frequency resolution, $f_s/2$ (half of the sampling frequency $f_s$, in accordance with the Nyquist rule), the frequency resolution of the whole signal is divided into equal parts among the $2^k$ packets obtained. Thus, the frequency resolution, $f_r$, of each packet is given by Equation (4) [36]:
$$f_r = \frac{f_s/2}{2^k} = \frac{f_s}{2^{k+1}} \qquad (4)$$
The value of $f_r$ must achieve a compromise between covering the desired range and being small enough to avoid being influenced by other frequencies that may adulterate the results.
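The division of the band up to f_s/2 into 2^k equal parts can be checked numerically. In the sketch below, the function names and the example values (f_s = 1024 Hz, k = 3) are assumptions for illustration, and the bands are listed for packets arranged in increasing-frequency order.

```python
def packet_resolution(f_s, k):
    """Frequency resolution of each WPT packet: (f_s / 2) / 2**k."""
    return f_s / 2 ** (k + 1)

def packet_bands(f_s, k):
    """Frequency band (low, high) in Hz covered by each of the 2**k packets,
    assuming the packets are arranged in increasing-frequency order."""
    f_r = packet_resolution(f_s, k)
    return [(j * f_r, (j + 1) * f_r) for j in range(2 ** k)]

f_r = packet_resolution(1024.0, 3)   # 64 Hz per packet
bands = packet_bands(1024.0, 3)      # 8 bands covering 0 Hz to 512 Hz
```

Raising k by one halves f_r, which narrows each band around a frequency of interest at the cost of doubling the number of packets to examine.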
In most cases, the decomposition level selected is optimal for one purpose and has been reached after trying several other options [62]. Since all WPT packets have the same frequency resolution, the automatic selection of the optimal scale range or decomposition level becomes easier. An automatic method has been detailed in [63] for selecting feature sets that produce accurate classification results. The method utilizes wavelet decomposition and texture statistics. Intelligent classification systems have also been used for feature selection (either single-level basis selection or multiple-level basis selection). The classification results of SVM at different Signal to Noise Ratios have also been used for feature selection [64]. Based on this work, a method to automatically select the optimal decomposition level was proposed in [58]. The method is based on the diagnosis results provided by an ANN, and automatically evaluates the diagnosis of cracks in shafts for decomposition levels from two to nine.
Another critical parameter to be selected when applying wavelets is the wavelet function. As with the selection of the decomposition level or the scale range, it is common to experiment with several possibilities to try to find the optimum wavelet function for the desired purpose. For data compression, several criteria are used, such as the Compression Ratio and the Peak Signal to Noise Ratio, as demonstrated in [65].
In most cases, the selection of the decomposition level and the wavelet function are closely related. For example, when the DWT is applied to remove noise, several methods have been proposed, such as the Signal to Noise Ratio, MAX, and EXP methods [66]. In [67], the decomposition level and the wavelet function are selected in order to minimize the percentage root mean square difference criterion.
For fault detection, there is no optimal way to select the best wavelet function for a specific application. In practice, this can be carried out by comparing the shape of the fault to be detected with the wavelet function to be used. Symmetric wavelets were found to be more effective in singularity analysis, although a very narrow, pulse-like anti-symmetric wavelet, such as db10 or any higher-order member of the Daubechies family, was also found to perform well [59]. Specifically, in crack detection, the maximum cross-correlation coefficient computed between the fault signal and the different wavelet functions in the library indicates the optimal wavelet function, as shown in [68]. The Daubechies family, and more specifically the Daubechies wavelet of order six, has been widely used for crack detection due to its proven effectiveness in this field [33,57,69].
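The cross-correlation criterion can be sketched as follows. This is a simplified illustration rather than the procedure of [68]: the two-entry library (a Haar-like step and a sampled Mexican hat), the normalization, the synthetic fault signal, and all function names are assumptions made for the example.

```python
import math

def unit_shape(values):
    """Remove the mean and scale to unit norm; return None for a flat window."""
    mean = sum(values) / len(values)
    centered = [v - mean for v in values]
    norm = math.sqrt(sum(c * c for c in centered))
    if norm == 0.0:
        return None
    return [c / norm for c in centered]

def max_cross_correlation(signal, template):
    """Largest absolute normalized correlation over all positions (lags)."""
    shape = unit_shape(template)
    best = 0.0
    for lag in range(len(signal) - len(template) + 1):
        window = unit_shape(signal[lag:lag + len(template)])
        if window is None:
            continue
        corr = abs(sum(a * b for a, b in zip(window, shape)))
        best = max(best, corr)
    return best

def select_wavelet(signal, library):
    """Pick the library entry with the highest maximum cross-correlation."""
    scores = {name: max_cross_correlation(signal, t) for name, t in library.items()}
    return max(scores, key=scores.get), scores

library = {
    "haar": [1.0] * 8 + [-1.0] * 8,
    "mexican_hat": [(1 - t * t) * math.exp(-0.5 * t * t)
                    for t in (-4.0 + 0.5 * i for i in range(16))],
}

# Synthetic fault signature: a Mexican-hat-shaped transient on a flat baseline.
fault = [0.0] * 64
for i, v in enumerate(library["mexican_hat"]):
    fault[20 + i] = 3.0 * v

best_name, scores = select_wavelet(fault, library)
```

Here the transient was built from the Mexican hat shape, so that entry scores close to one and is selected; with measured signals the scores would typically be lower and closer together.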

Results Presentation
Studies about the dynamics of cracked rotors usually do not involve the inverse process of crack detection. Therefore, they do not include quantitative results of detection methods. On the other hand, when using intelligent classification systems, it is common to present the results in the form of a success rate percentage.
Recently, Probability of Detection (POD) curves have been imported from Non Destructive Evaluation (NDE) methods for presenting the results of crack detection in rotors. In NDE methods, POD curves are a normalized measure used to evaluate the capability of a specified inspection method under specified conditions [70]. Usually, POD is plotted against crack size "a" (length, depth, or, more recently, the reflecting area [71]).
The confidence level of these POD curves is a matter of great importance. Usually, the lower 95% confidence bound is shown, which means that 95% of the POD curves constructed would be better than the one represented. An NDE crack size, $a_{NDE}$, may be defined as the crack size that reaches a certain probability of detection with a given confidence level. Figure 5 shows the mean POD(a) curve together with the 95% confidence bound, and the 90/95 crack size (the probability of detection is 90% and the confidence bound is 95%) [72]. Recently, several works regarding crack detection in rotors have used POD curves to present diagnosis results; examples can be found in [33,73-75].
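One simple way to see what a 90/95 statement demands is the zero-miss binomial argument used in NDE practice: if the true POD at a given crack size were below 90%, the chance of detecting every one of n cracks of that size would be below 0.9^n. The sketch below, which is an illustration and not a method taken from the works cited above, finds the smallest n for which this chance drops to 5% or less; the function name is hypothetical.

```python
def min_consecutive_hits(pod=0.90, confidence=0.95):
    """Smallest n such that n detections in n trials demonstrate
    POD >= pod with the given confidence, i.e. pod**n <= 1 - confidence."""
    n = 1
    while pod ** n > 1.0 - confidence:
        n += 1
    return n

n_90_95 = min_consecutive_hits(0.90, 0.95)
```

Full POD curves with confidence bounds, such as the one in Figure 5, are fitted from hit/miss or signal-response data over a range of crack sizes, which this single-size argument does not replace.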

Prospects
In recent years, an extensive volume of research related to cracked rotor diagnosis has been generated. Multiple works applying very different methods have obtained promising results, some of which have been mentioned in the present work. A high number of studies applied novel methods to experimental signals and also provided quantitative information on the inverse process of crack detection, in response to the concerns highlighted in [1]. Currently, despite the high number of methods applied and the different features used, none of them stands out above the rest. New methodologies have appeared to address the lack of a standard methodology for selecting the best wavelet function, the decomposition level, and the features. One of the methods proposed was based on the use of intelligent classification systems. However, this remedy can be worse than the disease, since there are again many parameters that must be selected when applying intelligent classification systems. There is no agreement on which intelligent classification system is best, nor a methodology to select optimal functions or values for its parameters. The most widespread approach is to optimize the success rate percentage.
Several works have published quantitative data about success in experimentally diagnosing a specific fault, in a specific element, belonging to a specific machine, and working under specific conditions. It would be appropriate to study how many of the techniques applied are robust and still work when these very specific conditions are altered, or under what range of conditions these techniques still work; for example, with a different mounting, speed, or load, when multiple faults are present, or when faults are in a different location. Another critical issue is deciding what criteria should be used to verify whether a technique works or not. There is a need to benchmark the different techniques that have proven effective in this area. It would therefore be very useful to establish standard, statistically representative comparison parameters, such as POD curves and the computational cost. Benchmarking the different methods under many different conditions would make it possible to check whether one of them stands out. These kinds of studies are required in order to achieve a method of condition monitoring that is general while also being robust and reliable.

Conclusions
Here, the literature relating to the application of wavelet theory to cracked rotor diagnosis has been briefly reviewed. This review has covered the different ways in which the Wavelet Transform has been applied, its combination with intelligent classification systems, the selection of wavelet parameters, and the presentation of results. The future prospects for the techniques currently used have been discussed. It is concluded that many different techniques have worked properly for crack diagnosis, but only under certain conditions. Extrapolation of these methods to other conditions, and benchmarking of the different techniques proposed for crack detection, are needed.
The parameter c is related to the scale, and b is related to the position of the wavelet function. CWT(c, b; ψ) represents the resulting coefficients, as a function of c, b and the wavelet function ψ. Figure 1 shows a signal in the time domain and the corresponding CWT.

Figure 1. Signal in the time domain and its CWT.

Figure 2. DWT decomposition of a signal (S) into approximation information (A) and detail information (D) using filters.

Figure 3. MRA decomposition procedure up to decomposition level 3. Each decomposition is based on the DWT (see Figure 2).
DWT: Discrete Wavelet Transform
MRA: Multiresolution Analysis
WPT: Wavelet Packets Transform
SVM: Support Vector Machines
NDE: Non Destructive Evaluation
POD: Probability of Detection
t: Time
N: Number of samples for a time domain signal
c: Scale parameter in CWT
b: Shift parameter in CWT
s: Scale range evaluated in CWT
ψ: Wavelet function
w: Weighting function
x: Time domain signal
A: Approximation information
D: Detail information
k: Decomposition level
j: Position of a packet within decomposition level
W(k, j): WPT coefficients
f: Frequency
$f_\psi$: Central frequency of the wavelet function
$f_s$: Sampling frequency
$f_r$: Frequency resolution of a packet in WPT
a: Crack size
$a_{NDE}$: Crack size that reaches a certain POD at a given confidence level