Rotating Machinery Fault Diagnosis Based on Improved Multiscale Amplitude-Aware Permutation Entropy and Multiclass Relevance Vector Machine

The health state of rotating machinery directly affects the overall performance of the mechanical system, and monitoring its operating condition is important for reducing downtime and improving production efficiency. This paper presents a novel rotating machinery fault diagnosis method based on the improved multiscale amplitude-aware permutation entropy (IMAAPE) and the multiclass relevance vector machine (mRVM) to provide the necessary information for maintenance decisions. Once a fault occurs, the vibration amplitude and frequency of rotating machinery change markedly, so the vibration signal contains a considerable amount of fault information. To extract the fault features from the vibration signals effectively, the intrinsic time-scale decomposition (ITD) is used to highlight the fault characteristics of the vibration signal by extracting the optimum proper rotation (PR) component. Subsequently, the IMAAPE is used to extract fault features from the PR component. In the IMAAPE algorithm, the coarse-graining procedure of the multiscale analysis is improved, which makes the fault feature extraction more stable. The coarse-grained time series of the vibration signal at different time scales are obtained first, and the sensitivity of the amplitude-aware permutation entropy (AAPE) to signal amplitude and frequency is then exploited to extract fault features from the coarse-grained time series. A multi-classifier based on the mRVM is trained on the fault feature set to identify the fault type and analyze the fault severity of rotating machinery. To demonstrate the effectiveness and feasibility of the proposed method, experimental datasets of a rolling bearing and a gearbox are used to verify it. The experimental results show that the proposed method can be applied to fault type identification and fault severity analysis of rotating machinery with high accuracy.


Introduction
Rotating machinery is one of the most common types of mechanical equipment and plays an important role in industrial applications. It generally operates in harsh working environments, which can eventually result in mechanical breakdowns that lead to high maintenance costs, severe financial losses, and safety concerns [1,2]. As rotating machinery is the part of a mechanical system most prone to malfunction, its fault diagnosis has been a popular research topic in the industry. At present, there are many different approaches to the fault diagnosis of rotating machinery, including vibration signal analysis, acoustic emission, thermal imaging and multi-sensor fusion [3]. Among the condition monitoring and fault diagnosis technologies for rotating machinery, those based on vibration signals are widely used because of the close correlation between the vibration signal and the mechanical structure [4,5].
The vibration signal is widely used in fault diagnosis of rotating machinery because it is easy to collect and monitor online. For example, when a rolling bearing has a local fault, each contact with the defect during operation causes an instantaneous shock that excites a damped high-frequency free vibration of the bearing at its natural frequency. The impacts caused by the fault are clearly periodic: the impact frequency depends on the bearing speed, and the impact amplitude depends on the size of the fault. Therefore, the impact characteristics caused by local damage should be extracted by signal analysis techniques, and fault identification should then be carried out by a classifier.
As rotating machinery works in an industrial environment, its vibration signal often contains the inherent vibration of the machine, the fault impact signal and background noise. The vibration signals collected by accelerometers are nonlinear, non-stationary and impulsive [6]. Therefore, how to effectively extract fault characteristics from complex vibration signals and how to accurately identify these fault features are the key problems in the fault diagnosis of rotating machinery. Pattern recognition is one of the important approaches to the vibration signal analysis of rotating machinery, and many scholars have made achievements in this field. Jiang et al. [7] decomposed the vibration signal by ensemble local characteristic-scale decomposition (ELCD) and obtained a series of intrinsic scale components (ISCs). The principal ISCs were selected and their permutation entropy (PE) values were calculated to construct the feature vector. Finally, the fault type of the rolling bearing was identified by a relevance vector machine (RVM) trained on the feature vector set. Li et al. [8] proposed a rolling bearing fault diagnosis method based on multiscale permutation entropy (MPE) and an improved support vector machine based on a binary tree (ISVM-BT). Local mean decomposition (LMD) was used to decompose the vibration signal of the rolling bearing into a set of product functions (PFs), and the MPE extracted the fault features of the rolling bearing from the PFs. The ISVM-BT trained on the feature set then identified the fault type automatically. Chen et al. [9] presented an integrated fault diagnosis method for a gearbox using complementary ensemble empirical mode decomposition (CEEMD), sample entropy (SampEn) and a probabilistic neural network (PNN). CEEMD decomposed the vibration signal of the gearbox into a set of intrinsic mode functions (IMFs), and the fault features were extracted by SampEn from each IMF. The PNN was then used as the classifier to identify the fault type of the gearbox.
Due to the nonlinearity and non-stationarity of the vibration signals of rotating machinery, time-frequency analysis methods are often used for feature extraction. The fast Fourier transform (FFT) is a classical frequency-domain analysis tool, but it is only suitable for stationary signals. The wavelet transform (WT) is a classical time-frequency analysis method that can preset the time and frequency window of interest. However, the WT is not an adaptive signal decomposition method, and it requires the basis (kernel) function and its parameters to be set in advance. The wavelet packet transform (WPT) can select the frequency resolution and is more flexible than the WT, but it is still not an adaptive time-frequency analysis method. The empirical mode decomposition (EMD) is a self-adaptive time-frequency method that decomposes the vibration signal into a set of intrinsic mode functions (IMFs) that contain the amplitude and frequency characteristics. However, the EMD suffers from the end effect and mode mixing, so the stability of the IMFs is poor, which affects the subsequent feature extraction process. To address these problems, the ensemble empirical mode decomposition (EEMD) and the complementary ensemble empirical mode decomposition (CEEMD) were proposed [6].
In recent years, because the fault information contained in vibration signals can be extracted more effectively at different time scales, a large number of scholars have applied the multiscale entropy (MSE) algorithm and its variants to the fault feature extraction of rotating machinery [10,11]. In addition to the MSE algorithm [12], many fault feature extraction methods based on the multiscale permutation entropy (MPE) [13-15] and the multiscale fuzzy entropy (MFE) [16,17] have been proposed. However, the commonly used entropy methods still have some limitations. The approximate entropy (ApEn) depends heavily on the length of the time series, which makes its results unstable. The sample entropy (SampEn) is computationally inefficient and therefore not suitable for analyzing long time series. The PE algorithm does not consider the mean amplitude of the time series, so signals with significantly different amplitudes may be mapped to the same ordinal pattern. Moreover, if the time series contains elements with equal amplitude, the result calculated by the PE is ambiguous. The performance of entropy-based fault feature extraction methods therefore still needs to be improved. To solve the above problems, the amplitude-aware permutation entropy (AAPE) was proposed by Azami and Escudero [18] to improve the classical PE. The AAPE is sensitive to changes in amplitude as well as frequency, so it can highlight the fault information contained in the vibration signal more effectively than the PE. In the coarse-graining procedures of the MSE and MPE, the length of the coarse-grained time series decreases as the scale factor increases. Therefore, when the scale factor is large, the entropy value of the coarse-grained time series is unstable. Reference [19] improved the coarse-graining procedure of the MSE through a sliding averaging process, which makes the computation stable and reliable at large time scales and thus overcomes this shortcoming of the traditional MSE.
After the fault features are extracted from the vibration signals of rotating machinery, a high-performance classifier is needed to identify the fault types and fault severity. Many artificial intelligence techniques have been adopted for the fault diagnosis of rotating machinery, such as the artificial neural network (ANN) [20], the support vector machine (SVM) [21] and the random forest (RF) [22]. The structure of the ANN is usually set by experience and its recognition rate is related to the number of training samples. Although the SVM can achieve high accuracy in binary classification with small training sets, it requires multiple binary classifiers to handle multiclass problems, and the selection of the kernel function directly affects the classification accuracy. For the random forest classifier, it is difficult to find the ideal parameters, and the selection of parameters has a great impact on the recognition results. The relevance vector machine (RVM) [7] is sparser than the SVM, more suitable for online monitoring, and generalizes better than the SVM. However, like the SVM, the RVM is a binary classifier and cannot directly perform multiclass classification. The multiclass relevance vector machine (mRVM) [23] is an extension that directly classifies the input samples into multiple categories and outputs the probability of belonging to each category. These properties make the mRVM suitable for the multi-fault identification of rotating machinery [6,24].
In view of the above problems in fault diagnosis of rotating machinery based on pattern recognition, this paper presents a novel fault diagnosis method for rotating machinery based on the improved multiscale amplitude-aware permutation entropy (IMAAPE) and the mRVM. The main contributions of this paper are summarized as follows:
(1) As the AAPE is very sensitive to amplitude changes of the vibration signal, the vibration signal of rotating machinery needs to be pre-processed before the feature extraction to minimize the interference of external noise. The intrinsic time-scale decomposition (ITD) is used to stably decompose the vibration signal of rotating machinery into a group of proper rotation (PR) components, among which the optimum PR component highlights the main time-frequency characteristics of the vibration signal and so facilitates the subsequent fault feature extraction.
(2) The performance of the AAPE is improved. A fault feature extraction method for rotating machinery based on the IMAAPE is proposed for the first time. The IMAAPE improves the coarse-graining procedure of the multiscale analysis and exploits the sensitivity of the AAPE to amplitude and frequency changes of the vibration signal. The IMAAPE can calculate the AAPE values in different
(3) The mRVM multi-classifier is trained to realize fault identification and fault severity analysis of rotating machinery. In this paper, two different realization methods of the mRVM and the effect of parameter selection on the identification accuracy of rotating machinery fault types are discussed through comparative experiments.
The organization of the rest of this paper is as follows. Section 2 introduces the theoretical basis of the methodologies adopted in this paper. The proposed fault diagnosis method is described in Section 3. Section 4 verifies the feasibility and effectiveness of the proposed fault diagnosis method by rolling bearing experiments and gearbox experiments, respectively. Finally, conclusions are drawn in Section 5.

Intrinsic Time-Scale Decomposition
The ITD is an algorithm for the efficient and precise time-frequency-energy (TFE) analysis of signals. The ITD can decompose a complex time series into a series of proper rotation (PR) components and accurately extract the intrinsic instantaneous amplitude, frequency information and other morphological characteristics of the complex time series, which is suitable for the analysis of non-stationary and nonlinear signals [25].
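Let $X_t$ be the complex time series to be analyzed and define $\mathcal{L}$ as the baseline extraction operator. $\mathcal{L}$ extracts the baseline signal $L_t = \mathcal{L}X_t$ from $X_t$, so that $X_t$ can be decomposed into:

$$X_t = \mathcal{L}X_t + (1 - \mathcal{L})X_t = L_t + H_t \tag{1}$$

where $L_t$ is the baseline signal and $H_t$ is the PR component. The main steps of the ITD algorithm are as follows. Assume that $\{\tau_k, k = 1, 2, \ldots\}$ represents the local extrema of the signal $X_t$, with $\tau_0 = 0$ by default, and write $X_k = X(\tau_k)$ and $L_k = L(\tau_k)$. $L_t$ and $H_t$ are defined in the interval $[0, \tau_k]$, and $X_t$ is valid in the interval $t \in [0, \tau_{k+2}]$. In the interval between successive extrema $(\tau_k, \tau_{k+1}]$, the extracted baseline signal $L_t$ is expressed as:

$$\mathcal{L}X_t = L_t = L_k + \frac{L_{k+1} - L_k}{X_{k+1} - X_k}\left(X_t - X_k\right), \quad t \in (\tau_k, \tau_{k+1}] \tag{2}$$

in which:

$$L_{k+1} = \alpha\left[X_k + \frac{\tau_{k+1} - \tau_k}{\tau_{k+2} - \tau_k}\left(X_{k+2} - X_k\right)\right] + (1 - \alpha)X_{k+1} \tag{3}$$

where $\alpha$ is a linear scaling factor used to adjust the amplitude of the extracted PR component, $\alpha \in [0, 1]$ [25]. According to Equation (2) and Equation (3), the PR component $H_t$ can be expressed as:

$$\mathcal{H}X_t = (1 - \mathcal{L})X_t = H_t = X_t - L_t \tag{4}$$

where $\mathcal{H}$ is the proper rotation extraction operator. The baseline signal $L_t$ can be taken as the input signal of the next decomposition, and the above steps can be repeated to obtain a series of PR components. The decomposition terminates when the baseline signal $L_t$ becomes monotonic or smaller than a certain preset value.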
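The baseline extraction described above can be illustrated with a minimal Python sketch of a single ITD step. It is an illustration under stated assumptions rather than the implementation used in this paper: the function name `itd_step`, the default `alpha=0.5`, the sign-change test used to locate extrema, and the treatment of the first and last samples as extrema (where the baseline is set equal to the signal) are all choices made here for the example.

```python
import numpy as np


def itd_step(x, alpha=0.5):
    """One ITD step: return (baseline L_t, proper rotation H_t).

    Assumptions made for this sketch: the first and last samples are
    treated as extrema, and the baseline equals the signal there.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    dx = np.diff(x)
    # Interior local extrema: the slope changes sign around the sample.
    interior = np.where(dx[:-1] * dx[1:] < 0)[0] + 1
    if len(interior) < 2:
        raise ValueError("signal is (nearly) monotonic; nothing to extract")
    tau = np.concatenate(([0], interior, [n - 1]))

    # Baseline values at the extrema (the alpha-weighted knot formula).
    knots = np.empty(len(tau))
    knots[0], knots[-1] = x[0], x[-1]
    for k in range(len(tau) - 2):
        t0, t1, t2 = tau[k], tau[k + 1], tau[k + 2]
        interp = x[t0] + (t1 - t0) / (t2 - t0) * (x[t2] - x[t0])
        knots[k + 1] = alpha * interp + (1 - alpha) * x[t1]

    # Baseline between successive extrema: a linear map of the signal itself.
    baseline = np.empty(n)
    for k in range(len(tau) - 1):
        a, b = tau[k], tau[k + 1]
        denom = x[b] - x[a]
        if denom == 0:
            baseline[a:b + 1] = knots[k]  # guard for a flat segment
        else:
            slope = (knots[k + 1] - knots[k]) / denom
            baseline[a:b + 1] = knots[k] + slope * (x[a:b + 1] - x[a])
    return baseline, x - baseline
```

Calling `itd_step` repeatedly on the returned baseline would yield the successive PR components until the baseline becomes monotonic, at which point this sketch raises `ValueError`.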
After the ITD, the time series $X_t$ is decomposed into a series of PR components and a monotonic trend component. The kurtosis value of a signal effectively describes its impulsive character: the higher the kurtosis value, the richer the impact features contained in the signal. Therefore, the PR component with the maximum kurtosis value is defined as the optimum PR component. In the calculation, $K_i$ represents the kurtosis value of the ith PR component and n represents the length of the time series; $U_i$ is the normalized kurtosis value of the ith PR component and m is the number of PR components. The optimum PR component is the PR component corresponding to the maximum value of $U_i$.
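The selection rule can be sketched in Python as follows. The kurtosis is computed with the standard fourth-moment definition; normalizing each $K_i$ by the sum over all m components is an assumption made here for illustration, since the exact normalization is not reproduced in the text above, and the function names are hypothetical.

```python
import numpy as np


def kurtosis(x):
    """Fourth standardized moment of a 1-D signal (equals 3 for a Gaussian)."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    return np.mean(centered ** 4) / np.mean(centered ** 2) ** 2


def select_optimum_pr(components):
    """Return the index of the PR component with the largest kurtosis.

    The normalization U_i = K_i / sum(K) is an assumption of this sketch;
    any positive rescaling leaves the selected index unchanged.
    """
    K = np.array([kurtosis(c) for c in components])
    U = K / K.sum()
    return int(np.argmax(U)), U
```

Because the optimum component is chosen by an arg-max, the choice of normalization affects only the reported $U_i$ values, not which component is selected.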

Multiscale Entropy
The entropy analysis of a time series at a single scale may lose important information of the original signal. The multiscale entropy (MSE) was proposed by Costa et al. to represent the complexity of a signal. The MSE relies on the computation of the sample entropy over a range of scales to extract the characteristic information of the complex signal at different time scales [26]. The MSE algorithm is composed of two steps:
(1) The coarse-graining procedure derives a set of time series representing the system dynamics on different time scales. The coarse-grained time series for scale factor $\tau$ is obtained by averaging the samples of the time series inside consecutive, non-overlapping windows of length $\tau$. For a univariate discrete signal $\{x_1, x_2, \ldots, x_N\}$, the coarse-grained time series can be computed as:

$$y_j^{(\tau)} = \frac{1}{\tau}\sum_{i=(j-1)\tau+1}^{j\tau} x_i, \quad 1 \le j \le \frac{N}{\tau}$$

where $y_j^{(\tau)}$ represents the new time series obtained after the coarse-graining procedure when the scale factor is $\tau$. The length of the coarse-grained time series $\{y^{(\tau)}\}$ is $N/\tau$.
(2) Then, the sample entropy of each coarse-grained time series is calculated, and n sample entropy values, one for each of the n time scales, are obtained to describe the signal characteristics of the original time series.
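The coarse-graining step (1) can be written in a few lines of Python. This is a minimal sketch; the function name `coarse_grain` is chosen here for illustration, and samples that do not fill a complete window at the end of the series are simply discarded.

```python
import numpy as np


def coarse_grain(x, tau):
    """Average consecutive, non-overlapping windows of length tau."""
    x = np.asarray(x, dtype=float)
    n = len(x) // tau  # number of complete windows
    return x[:n * tau].reshape(n, tau).mean(axis=1)
```

For a series of length N the result has floor(N / tau) points, which makes explicit why the entropy estimate becomes less reliable as the scale factor grows.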

Amplitude-Aware Permutation Entropy
Bandt put forward the concept of basic permutation entropy in 2002 [27]. At present, the PE is widely used in the analysis of complex time series signals to measure the complexity of a nonlinear and non-stationary signal. The calculation process of the PE is as follows. Assume the given time series $x = \{x_1, x_2, \ldots, x_N\}$ with length N, and for each time point t, embed the signal x in a d-dimensional space to obtain the reconstruction vectors

$$X_t^{d,l} = \{x_t, x_{t+l}, \ldots, x_{t+(d-2)l}, x_{t+(d-1)l}\}, \quad t = 1, 2, \ldots, N - (d-1)l$$

where d and l denote the embedding dimension and the time delay, respectively. Each vector $X_t^{d,l}$ is arranged in increasing order as $\{x_{t+(j_1-1)l}, x_{t+(j_2-1)l}, \ldots, x_{t+(j_{d-1}-1)l}, x_{t+(j_d-1)l}\}$, where $j_*$ is the index of the element in the reconstruction vector. Therefore, when the embedding dimension is d, there are d! potential ordinal patterns, and the ith permutation is denoted $\pi_i$. For each $\pi_i$, $p(\pi_i)$ represents the occurrence probability as follows:

$$p(\pi_i) = \frac{f(\pi_i)}{N - (d-1)l}$$

where $f(\pi_i)$ is the function that counts the number of occurrences of $\pi_i$. Whenever the inner elements of $X_t^{d,l}$ are arranged in the order of $\pi_i$, $f(\pi_i)$ increases by 1. The definition of the PE is as follows:

$$\mathrm{PE}(d, l) = -\sum_{i=1}^{d!} p(\pi_i)\ln p(\pi_i)$$

However, there are two main problems in describing a complex time series by the PE. First, the traditional PE only considers the ordinal structure of a time series and ignores the amplitude information of the corresponding elements. Second, the effect of elements with equal amplitude on the PE value is not clearly defined. In view of these problems, Azami and Escudero proposed the amplitude-aware permutation entropy (AAPE) to improve the sensitivity of the PE to the amplitude and frequency of the time series. The flow chart of the AAPE algorithm is shown in Figure 1 [18].

Assuming that the initial value of $p(\pi_i^{d,l})$ is 0, for the time series $X_t^{d,l}$, when t gradually increases from 1 to $N - (d-1)l$ (that is, $N - d + 1$ for $l = 1$), $p(\pi_i^{d,l})$ should be updated whenever $\pi_i^{d,l}$ appears:

$$p(\pi_i^{d,l}) = p(\pi_i^{d,l}) + \frac{A}{d}\sum_{k=1}^{d}\left|x_{t+(k-1)l}\right| + \frac{1-A}{d-1}\sum_{k=2}^{d}\left|x_{t+(k-1)l} - x_{t+(k-2)l}\right|$$

where $A \in [0, 1]$ is the adjustment coefficient that balances the weight of the mean signal amplitude against the deviation between consecutive amplitudes. Therefore, the probability of $\pi_i^{d,l}$ appearing in the whole time series is:

$$p(\pi_i^{d,l}) = \frac{p(\pi_i^{d,l})}{\sum_{t=1}^{N-(d-1)l}\left(\frac{A}{d}\sum_{k=1}^{d}\left|x_{t+(k-1)l}\right| + \frac{1-A}{d-1}\sum_{k=2}^{d}\left|x_{t+(k-1)l} - x_{t+(k-2)l}\right|\right)}$$

The AAPE calculation of the time series can be expressed as follows:

$$\mathrm{AAPE}(d, l) = -\sum_{i=1}^{d!} p(\pi_i^{d,l})\ln p(\pi_i^{d,l})$$
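The AAPE procedure can be sketched in Python as below. The sketch follows the amplitude-weighted update described above; the function name `aape`, the default parameters, the optional normalization by ln(d!), and the tie-breaking rule (equal values ordered by time of appearance, via a stable sort) are assumptions made for this example rather than details confirmed by the paper.

```python
import math

import numpy as np


def aape(x, d=3, l=1, A=0.5, normalize=False):
    """Amplitude-aware permutation entropy of a 1-D series (natural log)."""
    if d < 2:
        raise ValueError("embedding dimension d must be at least 2")
    x = np.asarray(x, dtype=float)
    n_vec = len(x) - (d - 1) * l  # number of reconstruction vectors
    if n_vec < 1:
        raise ValueError("time series too short for the chosen d and l")

    weights = {}  # accumulated amplitude weight per ordinal pattern
    total = 0.0
    for t in range(n_vec):
        v = x[t:t + (d - 1) * l + 1:l]  # x_t, x_{t+l}, ..., x_{t+(d-1)l}
        # Ties are ordered by time of appearance (assumption of this sketch).
        pattern = tuple(np.argsort(v, kind="stable").tolist())
        w = A / d * np.abs(v).sum() + (1 - A) / (d - 1) * np.abs(np.diff(v)).sum()
        weights[pattern] = weights.get(pattern, 0.0) + w
        total += w

    if total == 0.0:
        return 0.0  # all-zero signal: no amplitude information
    p = np.array([w for w in weights.values() if w > 0.0]) / total
    h = float(-np.sum(p * np.log(p)))
    return h / math.log(math.factorial(d)) if normalize else h
```

Setting `A=1` makes the weight depend only on the mean absolute amplitude of each vector, while `A=0` uses only the differences between consecutive samples. Multiplying the whole signal by a constant leaves the result unchanged, because all weights scale by the same factor.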

Multiclass Relevance Vector Machine
The traditional RVM is a binary classifier which cannot directly solve the multi-classification problem. The multiclass relevance vector machine (mRVM) effectively solves the multi-classification application problem of the traditional RVM. The basic principle of the mRVM is described below.
The input training data sample set is denoted as $\{x_n, t_n\}_{n=1}^{N}$, where $x_n$ is the input feature vector and $t_n \in \{1, \ldots, C\}$ is the corresponding category tag. The kernel function is set as $K \in \mathbb{R}^{N \times N}$, whose nth column is denoted $k_n$, and the auxiliary variable $Y \in \mathbb{R}^{C \times N}$ is introduced as the regression target of $W^{T}K$, where $W \in \mathbb{R}^{N \times C}$ is the weight parameter matrix with columns $w_c$. This gives:

$$y_{cn} \mid w_c, k_n \sim \mathcal{N}_{y_{cn}}\left(w_c^{T}k_n, 1\right) \tag{13}$$

The continuous nature of Y allows not only multiple class discrimination by the multinomial probit link $t_n = i$ if $y_{in} > y_{jn}\ \forall j \neq i$, but also a probabilistic output for class membership via the multinomial probit likelihood function:

$$P(t_n = i \mid W, k_n) = \varepsilon_{p(u)}\left\{\prod_{j \neq i}\Phi\left(u + (w_i - w_j)^{T}k_n\right)\right\} \tag{14}$$

where $\varepsilon$ is the expectation with respect to the standard normal distribution $p(u) \sim \mathcal{N}(0, 1)$ and $\Phi$ is the Gaussian cumulative distribution function.

In order to ensure the sparsity of the mRVM, similar to the RVM, a normal prior distribution with a mean value of 0 and variance of $\alpha_{nc}^{-1}$ is introduced for each weight parameter $w_{nc}$. $\alpha_{nc}$ belongs to the prior parameter matrix $A \in \mathbb{R}^{N \times C}$ and obeys the Gamma distribution with parameters $\tau$, $\upsilon$. Sufficiently small values of $\tau$ and $\upsilon$ ($< 10^{-5}$) guarantee the sparsity of the mRVM. Following [23], the closed-form posterior of the regressors W can be derived based on Figure 2:

$$P(W \mid Y) \propto P(Y \mid W)P(W \mid A) \propto \prod_{c=1}^{C}\mathcal{N}\left((KK^{T} + A_c)^{-1}Ky_c^{T},\ (KK^{T} + A_c)^{-1}\right) \tag{15}$$

In Equation (15), $A_c$ is a diagonal matrix derived from the cth column of A, which expresses the scales $\alpha_{ic}$ across samples, and $y_c$ is the cth row of Y. Then, through the maximum a posteriori estimation, the equation can be obtained as:

$$\hat{W} = \arg\max_{W} P(W \mid Y, A, K) \tag{16}$$

When category i is given, the update method based on weight parameters is as follows:

$$\hat{w}_c = (KK^{T} + A_c)^{-1}Ky_c^{T}$$

For a certain category, the posterior expectation of the auxiliary variables is, for $\forall c \neq i$:

$$\tilde{y}_{cn} \leftarrow \hat{w}_c^{T}k_n - \frac{\varepsilon_{p(u)}\left\{\mathcal{N}_u\left(\hat{w}_c^{T}k_n - \hat{w}_i^{T}k_n, 1\right)\Phi_u^{n,i,c}\right\}}{\varepsilon_{p(u)}\left\{\Phi\left(u + \hat{w}_i^{T}k_n - \hat{w}_c^{T}k_n\right)\Phi_u^{n,i,c}\right\}}$$

and for the ith class:

$$\tilde{y}_{in} \leftarrow \hat{w}_i^{T}k_n - \sum_{j \neq i}\left(\tilde{y}_{jn} - \hat{w}_j^{T}k_n\right)$$

where the "tilde" symbol above y denotes the expected value, $\Phi$ is the standard normal cumulative distribution function, and $\Phi_u^{n,i,c} = \prod_{j \neq i, c}\Phi\left(u + \hat{w}_i^{T}k_n - \hat{w}_j^{T}k_n\right)$. The posterior probability distribution of the prior parameters of the weight vector is:

$$P(A \mid W) \propto P(W \mid A)P(A \mid \tau, \upsilon) \propto \prod_{c=1}^{C}\prod_{n=1}^{N}\mathcal{G}\left(\tau + \frac{1}{2},\ \frac{w_{nc}^{2} + 2\upsilon}{2}\right)$$

Psorakis et al. proposed two training methods of the mRVM in [23]; the difference between them lies in how the kernel is handled at the training stage. The mRVM$_1$ follows a constructive process: it starts with an empty sample set and gradually adds samples according to their contribution to the model, or deletes samples with a low contribution. The mRVM$_1$ has two convergence criteria: conv$_1$ and conv$_2$. The mRVM$_1$_conv$_1$ follows the principle described in [28]. The mRVM$_1$_conv$_2$ adds a minimum number of iterations to the mRVM$_1$_conv$_1$ criterion. The mRVM$_2$ follows a top-down process: it first loads the entire training sample set and then removes the unnecessary samples during training. The mRVM$_2$ has two convergence criteria: conv$_A$ and conv$_N$. For the mRVM$_2$_conv$_A$ criterion, $\log A^{(k)} - \log A^{(k-1)} < \varsigma$ indicates iterative convergence. For the mRVM$_2$_conv$_N$ criterion, the number of iterations is limited to $\lambda N_{train}$.
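To make the multinomial probit likelihood concrete, the following Python sketch evaluates the class membership probabilities for a single sample, given a weight matrix and a kernel vector. It is an illustration, not the mRVM training code of [23]: the expectation over $u$ is approximated by Gauss-Hermite quadrature (a numerical choice made here), and the function names and the number of quadrature nodes are assumptions.

```python
import math

import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def std_normal_cdf(x):
    """Standard normal cumulative distribution function, element-wise."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(np.asarray(x) / math.sqrt(2.0)))


def probit_class_probabilities(W, k_n, n_nodes=64):
    """Class probabilities P(t_n = i | W, k_n) for one sample.

    W has shape (N, C) and k_n has shape (N,). The expectation over
    u ~ N(0, 1) is approximated with Gauss-Hermite quadrature.
    """
    scores = W.T @ k_n  # w_c^T k_n for every class c
    u, wts = hermegauss(n_nodes)  # nodes/weights for weight exp(-u^2 / 2)
    wts = wts / math.sqrt(2.0 * math.pi)  # turn weights into N(0, 1) weights
    n_classes = len(scores)
    probs = np.empty(n_classes)
    for i in range(n_classes):
        prod = np.ones_like(u)
        for j in range(n_classes):
            if j != i:
                prod *= std_normal_cdf(u + scores[i] - scores[j])
        probs[i] = np.sum(wts * prod)
    return probs
```

The predicted class is the index of the largest probability, and the probabilities sum to one up to quadrature error, which is what allows the mRVM to report a confidence for each fault category.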

The Proposed Fault Diagnosis Method
The main process of the proposed fault diagnosis method of rotating machinery includes signal preprocessing, fault feature extraction and fault identification. The principle of the fault diagnosis method proposed in this paper is introduced below.

Signal Preprocessing
Due to the fault of rotating machinery, the vibration signal has impact characteristics and the impact amplitudes are obviously different with different fault severity. In order to reduce the influence of external interference on the vibration signals and highlight the fault features of the vibration signals, it is necessary to preprocess the vibration signals before the feature extraction.
Unlike time-frequency analysis methods such as the EMD, EEMD and LMD, the ITD is used here to highlight the major amplitude variations in the vibration signals. The ITD algorithm decomposes the vibration signal into a sum of proper rotation components, for which the instantaneous frequency and amplitude are well defined, and a monotonic trend. The ITD can effectively suppress mode mixing and the end effect. The optimum proper rotation component is selected for further fault feature extraction because it contains the most obvious fault features. The signal preprocessing based on the ITD is described in Section 2.1.

Feature Extraction
In order to effectively extract fault features of the vibration signals, an improved multi-scale amplitude-aware permutation entropy (IMAAPE) algorithm is proposed in this paper. This method improves the coarse-graining procedures in a multi-scale analysis and improves the stability of the fault feature extraction. In the classical MSE algorithm, when the scale factor τ is high, the number of elements in the coarse-grained time series decreases, which leads to instability of the entropy measure. In order to solve the problem of shortening the length of the time series after the MSE coarse-graining procedure, relevant scholars have improved the coarse-graining procedure [19].
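Supposing that the time series to be analyzed is $\{x_1, x_2, \ldots, x_N\}$, a set of coarse-grained time series $z_i^{(\tau)} = \{z_{i,1}^{(\tau)}, z_{i,2}^{(\tau)}, \cdots\}$ $(i = 1, 2, \cdots, \tau)$ is generated by the improved coarse-graining procedure, where

$$z_{i,j}^{(\tau)} = \frac{1}{\tau}\sum_{f=0}^{\tau-1} x_{f+i+\tau(j-1)}$$

The improved coarse-graining procedures for scale factors $\tau = 2$ and $\tau = 3$ are shown in Figure 3.
For each scale factor $\tau$ and embedding dimension d, the AAPE value of each time series $z_i^{(\tau)}$ $(i = 1, 2, \cdots, \tau)$ is calculated, and the average value is defined as the IMAAPE:

$$\mathrm{IMAAPE}(x, \tau, d) = \frac{1}{\tau}\sum_{i=1}^{\tau}\mathrm{AAPE}\left(z_i^{(\tau)}\right)$$

Figure 3. The improved coarse-graining procedures for scale factors τ = 2 and τ = 3.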
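The improved coarse-graining and the averaging over the $\tau$ shifted series can be sketched in Python as follows. The sketch is self-contained, so it repeats a compact AAPE routine; the function names, the default parameters, the stable-sort tie-breaking and the decision to keep only complete windows for each starting offset are assumptions made for this illustration.

```python
import numpy as np


def aape(x, d=3, l=1, A=0.5):
    """Compact amplitude-aware permutation entropy (natural log)."""
    x = np.asarray(x, dtype=float)
    n_vec = len(x) - (d - 1) * l
    if d < 2 or n_vec < 1:
        raise ValueError("invalid d or time series too short")
    weights, total = {}, 0.0
    for t in range(n_vec):
        v = x[t:t + (d - 1) * l + 1:l]
        pattern = tuple(np.argsort(v, kind="stable").tolist())
        w = A / d * np.abs(v).sum() + (1 - A) / (d - 1) * np.abs(np.diff(v)).sum()
        weights[pattern] = weights.get(pattern, 0.0) + w
        total += w
    if total == 0.0:
        return 0.0
    p = np.array([w for w in weights.values() if w > 0.0]) / total
    return float(-np.sum(p * np.log(p)))


def improved_coarse_grain(x, tau):
    """Return the tau coarse-grained series, one per starting offset."""
    x = np.asarray(x, dtype=float)
    series = []
    for i in range(tau):  # offset i corresponds to index i + 1 in the text
        n = (len(x) - i) // tau  # complete windows only (assumption)
        series.append(x[i:i + n * tau].reshape(n, tau).mean(axis=1))
    return series


def imaape(x, tau, d=3, l=1, A=0.5):
    """Average the AAPE over the tau shifted coarse-grained series."""
    return float(np.mean([aape(z, d, l, A) for z in improved_coarse_grain(x, tau)]))


def imaape_features(x, max_tau=10, d=3, l=1, A=0.5):
    """Feature vector of IMAAPE values for scale factors 1..max_tau."""
    return np.array([imaape(x, tau, d, l, A) for tau in range(1, max_tau + 1)])
```

Averaging over all $\tau$ starting offsets uses every sample of the original series at each scale, which is the source of the improved stability compared with the single coarse-grained series of the classical procedure. The vector returned by `imaape_features` plays the role of the fault feature vector; `max_tau=10` is an arbitrary value for the example.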

Fault Identification
The high-performance multi-classifier can realize the fault type identification and further, a fault severity analysis of rotating machinery. The mRVM is adopted to analyze and identify the fault features of rotating machinery in this paper. After the feature extraction of the vibration signal samples with different fault types and fault severity by IMAAPE, a fault feature set is formed to model the mRVM classifier. The established mRVM classifier can identify the fault type and analyze the fault severity of rotating machinery by extracting the IMAAPE fault feature from the vibration signals.

Fault Diagnosis Procedure
The fault diagnosis procedure of rotating machinery proposed in this paper is shown in Figure 4. The whole procedure of the fault diagnosis method consists of two parts: the training part and testing part. In the training process, the vibration signals of rotating machinery under different fault states are collected according to a fixed sampling frequency to form the vibration signal sample set. The ITD is used to decompose the vibration signal into a set of PR components and the optimum PR component is selected to highlight the fault characteristics of rotating machinery. Then, the IMAAPE algorithm proposed in this paper is used to extract the features of the optimum PR component and construct the feature vector to accurately describe the fault type and fault severity. The fault feature vector set is constructed by all IMAAPE feature vectors that are extracted from the vibration signals in the vibration signal sample set. The mRVM multiple classifier is established by the fault feature vector set. In the testing process, the vibration signals of rotating machinery are collected by vibration accelerometers in real time. The fault features contained in the vibration signals are effectively extracted by the feature extraction method based on the IMAAPE proposed in this paper. Finally, the fault type and fault severity of rotating machinery are estimated by the mRVM classifier.

Experiment and Analysis
In order to verify the feasibility and effectiveness of the fault diagnosis method of rotating machinery proposed in this paper, a rolling bearing and a gearbox are taken as examples for experiments and analysis. The rolling bearing experiment uses the well-known public data set provided by the Case Western Reserve University Bearing Data Center [29]. The gearbox experiment is carried out on the QPZZ-II vibration analysis and fault diagnosis test platform for rotating machinery manufactured by Jiangsu Qianpeng Diagnosis Engineering Co., Ltd. (Zhenjiang, China) [30].

Experimental Platform and Data Set
The experimental platform designed by the Case Western Reserve University Bearing Data Center is shown in Figure 5. The vibration data were collected using accelerometers, which were attached to the housing with magnetic bases. The accelerometers were placed at the 12 o'clock position at both the drive end and the fan end of the motor housing. During some experiments, an accelerometer was attached to the motor supporting base plate as well. The vibration signals were collected using a 16-channel DAT recorder. In this paper, the normal (Norm) state, the inner race (IR) fault, the outer race (OR) fault and the ball element (BE) fault are used for the experiments. The experimental sample description and the experimental sample distribution under different load conditions of the rolling bearing are shown in Tables 1 and 2, respectively. The vibration signal waveforms of the rolling bearing for different fault types and fault severities at a load of 0 hp (1 hp = 746 W) are shown in Figure 6. As shown in Figure 6, the amplitude and frequency of the vibration signals of the rolling bearing differ across fault states and fault severities.

Fault Feature Extraction
A vibration signal waveform of the rolling bearing with a ball elements fault, at a fault diameter of 7 mils and a load of 0 hp, is shown in Figure 7. The vibration signal is decomposed by the ITD, and the decomposition results (PR components) are shown in Figure 8. The PR component with the largest amplitude in Figure 8 is taken as the optimum PR component. The optimum PR component highlights the frequency and amplitude characteristics of the ball elements fault.
It can be seen from Figure 11 that the IMAAPE fault feature extraction method yields feature vectors that cluster well by fault type under different fault diameter conditions. Figure 12 shows two dimensions of the IMAAPE feature vectors for different fault types under different fault diameters at a load of 0 hp. It can be seen from Figure 12 that the features extracted by the IMAAPE remain well separated even when the fault severity differs. The IMAAPE fault feature extraction method can therefore provide an effective means for the fault severity analysis of rolling bearings.
The between-class distance and the within-class distance of the different rolling bearing fault feature extraction methods are shown in Tables 3 and 4, respectively. The ratio of the between-class distance to the within-class distance is shown in Table 5. To some extent, the between-class distance and the within-class distance represent the clustering effect of a feature extraction method. Compared with the other feature extraction methods, the fault features extracted by the IMAAPE have a relatively larger between-class distance and a smaller within-class distance. Although the ratio of the between-class distance to the within-class distance of the IMAAPE is not the best among these methods, its comprehensive performance is the best. Therefore, the fault features extracted by the IMAAPE have good clustering characteristics. The time required by the different feature extraction methods is shown in Table 6. It can be seen from Table 6 that the IMAAPE fault feature extraction algorithm proposed in this paper has higher computational efficiency than the other methods compared.
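The clustering measures discussed above can be illustrated with a minimal sketch. The exact definitions used for Tables 3-5 are not given in this section, so the following assumes one common choice: the within-class distance is the mean Euclidean distance from each sample to its class centroid, and the between-class distance is the mean Euclidean distance between pairs of class centroids. The function name `class_distances` and the toy data are hypothetical.

```python
import numpy as np

def class_distances(features, labels):
    """Return (between, within, ratio) for a labelled feature set.

    Assumed definitions (not taken from the paper):
      within  = mean Euclidean distance from each sample to its class centroid
      between = mean Euclidean distance between all pairs of class centroids
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)  # sorted unique class labels
    centroids = np.array([features[labels == c].mean(axis=0) for c in classes])

    # Map every sample to the centroid of its own class.
    own_centroid = centroids[np.searchsorted(classes, labels)]
    within = np.linalg.norm(features - own_centroid, axis=1).mean()

    # Average the distance over all distinct pairs of centroids.
    i, j = np.triu_indices(len(classes), k=1)
    between = np.linalg.norm(centroids[i] - centroids[j], axis=1).mean()

    return between, within, between / within

# Toy example with two well-separated classes of two samples each.
X = np.array([[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]])
y = np.array([0, 0, 1, 1])
between, within, ratio = class_distances(X, y)
print(between, within, ratio)
```

Under these assumed definitions, a larger ratio indicates tighter and better separated clusters, which is the sense in which the ratio is compared across feature extraction methods.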

Fault Identification
To illustrate the fault identification accuracy of the proposed fault diagnosis method based on the IMAAPE and the mRVM, samples under different loads were used to verify its effectiveness. The selection of the experimental samples is shown in Tables 1 and 2. The fault identification accuracy of the different classifiers under different loads is shown in Tables 7-10. The fault feature extraction method based on the IMAAPE achieves high fault identification accuracy in combination with each of the classifiers. Moreover, the experimental results show that the identification accuracy of the mRVM1_conv1 is higher than that of the other classification methods, while its operation efficiency remains reasonable. Therefore, the mRVM1_conv1 was used as the multi-classifier of the rolling bearing fault diagnosis method in this paper.
The fault identification accuracy of the mRVM1_conv1 under different fault severities at a load of 0 hp is shown in Table 11. It can be seen that the proposed fault diagnosis method can effectively identify the different fault severities of the rolling bearing, with an identification accuracy of 99.25%. As the selection of the kernel parameter has a great impact on the identification accuracy of the mRVM, different kernel parameter values were compared, and the experimental results are shown in Table 12. It can be seen that when the kernel parameter is 8.5, the mRVM multi-classifier has the highest fault identification accuracy for rolling bearing faults with different fault severities.
The effectiveness of the different fault feature extraction methods combined with the mRVM classifier was also compared for the rolling bearing. The experimental results are shown in Table 13. The fault diagnosis method proposed in this paper has the highest fault identification accuracy, up to 99.925%.
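As a reading aid for the accuracy tables, the following minimal sketch shows how an overall and a per-class identification accuracy (in %) can be computed from true and predicted labels. It is not the authors' evaluation code; the function name `identification_accuracy` and the toy labels are hypothetical, and the classifier that produces the predictions (the mRVM in this paper) is not shown.

```python
import numpy as np

def identification_accuracy(y_true, y_pred):
    """Return (overall accuracy in %, {class label: accuracy in %})."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    correct = y_true == y_pred  # boolean mask of correctly identified samples

    # Accuracy within each true class, keyed by the class label as a string.
    per_class = {
        str(c): 100.0 * correct[y_true == c].mean() for c in np.unique(y_true)
    }
    overall = 100.0 * correct.mean()
    return overall, per_class

# Toy labels using the four bearing states named in this paper.
y_true = ["Norm", "Norm", "IR", "IR", "OR", "OR", "BE", "BE"]
y_pred = ["Norm", "Norm", "IR", "IR", "OR", "OR", "BE", "OR"]
overall, per_class = identification_accuracy(y_true, y_pred)
print(overall, per_class)
```

Reporting the per-class values alongside the overall value makes it visible when a high overall accuracy hides one poorly identified fault state.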

Experimental Platform and Data
The experimental platform QPZZ-II was manufactured by Jiangsu Qianpeng Diagnosis Engineering Co., Ltd. A picture of the QPZZ-II is shown in Figure 13. The vibration signals were collected by accelerometers at a sampling frequency of 5.12 kHz. Five working states were considered in the experiments: the normal condition, gear pitting fault (pitting), gear tooth breaking (tooth breaking), pinion wear fault (wearing), and gear pitting fault coupled with pinion wear fault (pitting and wearing). The experimental sample description of the gearbox is shown in Table 14. There are 53,248 data points for each health condition. The collected data were divided into non-overlapping samples of 1024 points each.
The vibration signal waveforms of the gearbox in different fault conditions are shown in Figure 14. As can be seen from the figure, the vibration signals of the gearbox under different fault states differ in amplitude and frequency, which helps the subsequent pattern recognition.
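The segmentation described above can be sketched as follows. With 53,248 points per health condition and 1024 points per sample, each condition yields 53,248 / 1024 = 52 non-overlapping samples, and at 5.12 kHz each sample spans 1024 / 5120 = 0.2 s. The function name `split_into_samples` is hypothetical, and the zero-valued record is only a placeholder for a measured signal.

```python
import numpy as np

def split_into_samples(signal, sample_length=1024):
    """Split a 1-D record into non-overlapping rows of sample_length points.

    Any trailing points that do not fill a complete sample are discarded.
    """
    signal = np.asarray(signal)
    n_samples = signal.size // sample_length
    return signal[: n_samples * sample_length].reshape(n_samples, sample_length)

sampling_frequency_hz = 5120        # 5.12 kHz, as stated in the text
record = np.zeros(53_248)           # placeholder for one health condition
samples = split_into_samples(record)
sample_duration_s = samples.shape[1] / sampling_frequency_hz
print(samples.shape, sample_duration_s)
```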

Fault Feature Extraction
The vibration signal is decomposed by the ITD and the ITD decomposition results (PR components) are shown in Figure 15. The optimum PR component highlights the frequency and amplitude characteristics of the gear pitting fault.
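The selection of the optimum PR component can be sketched under stated assumptions. The rolling bearing section takes the PR component with the largest amplitude as the optimum one; the sketch below interprets "largest amplitude" as the largest peak absolute value, which is an assumption. The ITD itself is not implemented here: `pr_components` is assumed to be a 2-D array with one PR component per row, produced by an ITD routine that is not shown, and the function name `select_optimum_pr` is hypothetical.

```python
import numpy as np

def select_optimum_pr(pr_components):
    """Return (index, component) of the PR component with the largest peak.

    pr_components: array of shape (n_components, n_points), one PR per row.
    'Largest amplitude' is taken to mean the largest absolute peak value.
    """
    pr = np.asarray(pr_components, dtype=float)
    peak = np.abs(pr).max(axis=1)   # peak absolute value of each component
    index = int(np.argmax(peak))    # first component attaining the maximum
    return index, pr[index]

# Toy stand-in for ITD output: three short components.
toy_pr = np.array([
    [0.1, -0.2, 0.1],
    [1.0, -3.0, 2.0],
    [0.5, 0.5, -0.5],
])
index, optimum = select_optimum_pr(toy_pr)
print(index, optimum)
```

Other amplitude measures, such as the root mean square or the peak-to-peak value, could be substituted for the peak absolute value if they match the intended criterion better.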

Fault Identification
As shown in Tables 15-18, with a suitable selection of the kernel parameter, the different mRVM implementations can reach a gearbox fault identification accuracy of up to 100%. To evaluate the effectiveness of the proposed gearbox fault diagnosis method, fault diagnosis accuracy comparison experiments were carried out. As shown in Table 19, the fault identification accuracy of the fault diagnosis method based on the IMAAPE and the mRVM reaches 100%. Compared with the other fault feature extraction methods, the IMAAPE has an obvious advantage in fault identification accuracy. Therefore, the fault diagnosis method proposed in this paper is suitable for gearbox fault diagnosis and has high identification accuracy.

Table 18. The influence of kernel parameter selection of the mRVM2_convN on fault identification accuracy of the rolling bearing (%).

Conclusions
This paper presents a novel fault diagnosis method for rotating machinery, which can further analyze the fault severity of rotating machinery on the basis of accurately identifying the fault types. Experiments were conducted to illustrate the validity and feasibility of the method. The following conclusions can be drawn:
(1) The improved multiscale amplitude-aware permutation entropy (IMAAPE) proposed in this paper improves the coarse-graining process of the MSE, addresses the problems existing in the PE, and can effectively extract the fault information contained in the vibration signals. Moreover, compared with the other fault feature extraction methods, the IMAAPE has higher execution efficiency.
(2) The multiclass relevance vector machine (mRVM) is suitable for the multi-classification of rotating machinery faults and achieves high identification accuracy when the kernel parameter is reasonably selected.
(3) The rolling bearing and gearbox experiments show the effectiveness of the proposed method. The experimental results show that the proposed fault diagnosis method for rotating machinery has a high fault identification accuracy of over 99%. In particular, the rolling bearing experiments show the potential application of the proposed method in fault severity analysis.
Author Contributions: Y.C. clarified the research content and ideas, and completed the writing of this paper. T.Z. and W.Z. studied the method and carried out experiments. Z.L. and H.L. analyzed the data and proofread the manuscript.