A Hydraulic Pump Fault Diagnosis Method Based on the Modified Ensemble Empirical Mode Decomposition and Wavelet Kernel Extreme Learning Machine Methods

To address the problem that the faults in axial piston pumps are complex and difficult to effectively diagnose, an integrated hydraulic pump fault diagnosis method based on the modified ensemble empirical mode decomposition (MEEMD), autoregressive (AR) spectrum energy, and wavelet kernel extreme learning machine (WKELM) methods is presented in this paper. First, the non-linear and non-stationary hydraulic pump vibration signals are decomposed into several intrinsic mode function (IMF) components by the MEEMD method. Next, AR spectrum analysis is performed for each IMF component, in order to extract the AR spectrum energy of each component as fault characteristics. Then, a hydraulic pump fault diagnosis model based on WKELM is built, in order to extract the features and diagnose faults of hydraulic pump vibration signals, for which the recognition accuracy reached 100%. Finally, the fault diagnosis effect of the hydraulic pump fault diagnosis method proposed in this paper is compared with BP neural network, support vector machine (SVM), and extreme learning machine (ELM) methods. The hydraulic pump fault diagnosis method presented in this paper can diagnose faults of single slipper wear, single slipper loosing and center spring wear type with 100% accuracy, and the fault diagnosis time is only 0.002 s. The results demonstrate that the integrated hydraulic pump fault diagnosis method based on MEEMD, AR spectrum, and WKELM methods has higher fault recognition accuracy and faster speed than existing alternatives.


Introduction
In recent years, modern industrial equipment has developed in the direction of complexity, high speed, and intelligence. This not only ensures the quality of the products, but also greatly reduces the associated production costs and significantly improves production efficiency. However, in the actual production process, due to the continuous expansion of system scale and the increasing complexity of the structures involved, the probability of equipment failure has been gradually increasing. Therefore, it is very important to improve the safety and reliability of such equipment. In this regard, it is necessary not only to improve the reliability of the equipment in the process of equipment design and manufacturing, but also to carry out real-time status monitoring and fault diagnosis during the operation of the equipment. In this way, we can detect the fault initially, then predict the location and cause of the fault, in order to provide accurate information for predictive maintenance. Therefore, more and more attention has been paid to fault diagnosis and prediction technology, for which a lot of research work has been carried out [1][2][3][4].
Hydraulic systems have the advantages of small size, light weight, high power, smooth transmission, and fast response. Hydraulic systems have been widely used in the aerospace, shipbuilding, metallurgical, machine tool, agricultural machinery, and engineering machinery industries, among other fields. The wide application of hydraulic systems has greatly promoted the development of industrial production [5]. Since entering the 21st century, hydraulic systems and hydraulic equipment have become more complicated and large-scale. Once a failure occurs, it not only can damage the equipment, but may even cause casualties [6]. Hydraulic pumps have the characteristics of complex structure, high speed, high pressure, and continuous operations, and are the most prone to failure. Therefore, fault diagnosis technology for hydraulic systems has become a research hotspot in recent years.
In fault diagnosis, the first step is to extract fault features from the acquired data [7,8]. Feature extraction is the process of obtaining fault features through signal processing [9,10]. Considering that the vibration signal of a hydraulic pump has non-linear and non-stationary characteristics, it is difficult to extract the fault features accurately by using traditional timeand frequency-domain analyses. The key information of the hydraulic pump vibration signal can be extracted more accurately by using time-frequency characteristic analysis methods. Time-frequency analysis methods include non-adaptive analysis and adaptive analysis. Adaptive analysis refers to the modal decomposition and reconstruction of the signal, according to the characteristics of the signal, in the process of signal processing, in order to achieve good analysis results.
Considering the mode mixing phenomenon of the empirical mode decomposition (EMD) method [11,12], Wu et al. proposed the ensemble empirical mode decomposition (EEMD) method, which effectively suppresses modal mixing by adding Gaussian white noise to the original signal [13]. In view of the shortcomings of the EEMD method, scholars have carried out a lot of research. Finally, Zheng et al. proposed the modified ensemble empirical mode decomposition (MEEMD) method [14], which adds a pair of Gaussian white noises with the same amplitude and opposite signs, such that they cancel each other and the influence of the noise on the signal is reduced. Compared with the EEMD method, this method has a better decomposition effect. Jiang et al. applied the MEEMD method to the fault diagnosis of the main shaft bearing of a wind turbine [15]. Their method showed good results, in terms of training accuracy and training speed, which verifies the feasibility of this method. Zheng proposed a method combining MEEMD and generalized adaptive S transform (AGST), which was successfully applied to the vibration signal analysis of an internal combustion engine [16]. Shi et al. applied the MEEMD method to fault feature extraction of rolling bearings, which not only effectively suppressed the phenomenon of modal mixing but also greatly reduced the number of pseudo-components and accurately extracted the weak fault information of rolling bearings [17]. Although the MEEMD method has been widely used in the field of fault diagnosis, it has seldom been reported in hydraulic system fault diagnosis. Therefore, in this paper, we applied it to the vibration signal feature extraction of an axial piston pump, in order to improve the accuracy of fault identification.
In 2004, Professor Huang from Nanyang Technological University in Singapore proposed a new single-hidden layer feedforward neural network, called the extreme learning machine (ELM) [18][19][20]. In 2018, Zhang et al. proposed a method combining evidence theory and ELM, which effectively solved the classification problem of incomplete data [21]. However, it should be noted that the optimal number of neurons in the hidden layer obtained by the incremental extreme learning machine and its improved method may be very large, thus wasting a lot of unnecessary training time.
As the ELM method needs to set the number of hidden layer neurons and randomly select input weights and hidden layer biases, ELM instability is inevitable. Therefore, Huang et al. introduced a kernel function into the ELM to propose the kernel-ELM (KELM) method. This method avoids the problems of artificially setting the number of hidden layer neurons and randomly selecting the input weights and hidden layer biases in ELM by means of kernel function mapping. KELM can achieve very high training accuracy by using the powerful mapping ability of kernel function [22]. To date, KELM has been widely used in various fields. One study [23] has proposed a parallel KELM, which effectively improved the computing speed of KELM. Cho et al. applied a K-ELM to correctly classify the stress levels to reflect five different levels of stress situations [24]. Zhang et al. proposed a soft sensing algorithm that can detect alumina concentration by the electrical signals such as voltages and currents of the anode rods [25]. Yang et al. combined the multiscale feature energy with the KELM to effectively diagnose the airflow jamming fault of a fluidized bed [26]. Zhang et al. applied the KELM to the fault diagnosis of a coal mill and optimized the kernel parameters with particle swarm optimization (PSO) [27]. Yang et al. proposed a deep kernel extreme learning machine (DK-ELM), which has been successfully applied to aero-engine fault diagnosis [28]. Considering that wavelet functions have the advantages of time-frequency local analysis and multi-resolution analysis, a wavelet kernel function has been introduced into KELM to construct the wavelet kernel extreme learning machine (WKELM).
In this paper, a fault diagnosis method for a hydraulic pump based on the integration of the MEEMD, autoregressive (AR) spectrum energy, and WKELM methods is proposed and studied. First, the vibration signals of an axial piston pump under different working states were acquisited from a hydraulic pump fault simulation test bed. Secondly, signal decomposition and feature extraction were carried out for the acquired original vibration signals, using the fault feature extraction method based on MEEMD and AR spectrum. Finally, the fault diagnosis method based on WKELM was used to diagnose the working states of the hydraulic pump. The results show that the diagnosis accuracy rate can reach 100%. Compared with BP, SVM, and ELM, the feasibility and superiority of the method proposed in this paper are demonstrated.

Basic Principles of MEEMD Method
MEEMD is an improvement on EEMD [29]. First, considering that the set of Gaussian white noise added in EEMD cannot be completely canceled out, MEEMD adds a pair of Gaussian white noise signals instead, where the amplitude of the white noise is the same but the sign is opposite. The purpose of this is to allow the added Gaussian white noise signals to cancel each other out, thereby reducing the influence of noise on the signal [30]. Secondly, considering that EEMD will decompose more pseudo-components, the concept of PE is introduced to MEEMD, in order to detect whether the signal is abnormal. The abnormal signals are decomposed again and eliminated, such that the pseudo-components can be effectively reduced. At the same time, the MEEMD method can also suppress the phenomenon of mode mixing in the EMD and EEMD methods. Therefore, the MEEMD method has a better decomposition effect than EMD or EEMD.
The Specific steps of MEEMD are as follows [14]: (1) Two groups of Gaussian white noises, noise l (i) and -noise l (i), with equal absolute values, are added to the original signal {x(i), i = 1, 2, . . . , n}: where noise l (i) is the added Gaussian white noise signal, a l is the amplitude of the added noise signal, and l = 1, 2, . . . , Ne, where Ne is the number of white noise additions.
The signals x + (i) and x − (i), after adding Gaussian white noise, are decomposed by EMD. The first-order Intrinsic Mode Function (IMF) component sequences are respectively expressed as c + l,1 (i) and c − l,1 (i) , where l = 1, 2, . . . , Ne. The overall average of the above components can eliminate the residual white noise to the greatest extent, that is: The PE of the IMF components c 1 (i) is calculated after the overall average. If the PE is greater than θ, it is considered to be an abnormal signal; otherwise, it is approximately considered to be a stationary signal. The general range of θ is 0.55-0.6; θ is taken in this paper.
(2) Determine whether c 1 (i) is an abnormal signal. If c 1 (i) is an abnormal signal, the first IMF component c 1 (i) will be separated from the original signal x(i), where the new signal r 1 (i) is: Taking the signal r 1 (i) as the original signal, Step (1) is repeated until the IMF component c q (i) obtained after the overall average is not an abnormal signal.
(3) The first q − 1 decomposed abnormal signal components whose PE is greater than θ are separated from the original signal, namely: where j = 1, 2, . . . , q − 1, q − 1 is the number of abnormal components, and r q−1 (i) is the residual signal after removing the abnormal components.
(4) After the remaining signal, r q−1 (i), is decomposed by EMD, the decomposed IMF components are arranged, in order from high frequency to low frequency: Finally, MEEMD method can be expressed as: where c k (i) are the IMF components finally obtained, k = 1, 2, . . . , z, z is the number of IMF components finally obtained, and r (i) is the residual component finally obtained.

Permutation Entropy
Permutation entropy (PE) [31] is a method for detecting the sudden change of dynamic behavior and signal regularity, which has the advantages of fast calculation speed and strong anti-interference ability. It is especially suitable for non-linear and non-stationary signal analysis and has good robustness. The steps of the permutation entropy method are as follows: Given a period of time-series: {x(i), i = 1, 2, . . . , n}, the reconstruction matrix B can be obtained by phase space reconstruction: where λ is the time delay, m is the embedding dimension, and k = n − (m − 1)λ. Each row b(i) in the reconstruction matrix B can be regarded as a reconstruction component. The elements of the i th reconstruction component b(i) in the reconstruction matrix B are arranged in ascending order, according to the size of the value: where j 1 , j 2 , . . . , j m are the indices of the column of the reconstructed component element. If , they are sorted by the sizes of j 1 , j 2 ; that is, when j 1 <j 2 , Therefore, for any discrete sequence {x(i), i = 1, 2, . . . , n}, a set of symbolic sequences can be obtained for each row of its reconstruction matrix B: where l = 1, 2, . . . , k, k ≤ m!, as there are m! permutations of m symbols {j 1 , j 2 , . . . , j m } and, so, the m-dimensional phase space can map m! kinds of symbol sequences, where S(l) is just one of them. The probability {P 1 , P 2 , . . . , P k } of each symbol sequence is calculated, following which the PE of the discrete sequence {x(i), i = 1, 2, . . . , n} can be defined as [32]: When P j = 1/m!, then H P (m) reaches its maximum ln(m!). For convenience, ln(m!) is often used to standardize H P (m). Thus, the PE obtained in Equation (4) can be standardized as follows: where PE norm ∈ [0, 1]. The smaller the value of PE norm , the more regular the time-series; thus, PE norm can reflect the randomness of the time-series [33].

AR Spectrum Method
The autoregressive (AR) model is a time-series analysis method. It not only can fully reveal the characteristics of a signal, but also can improve the resolution of the signal. By analyzing the signal using an AR model, the characteristic frequency contained in the signal can be effectively extracted, such that feature extraction can be carried out accurately. The AR spectrum method can overcome the limitations of the Hilbert method, as there is neither estimation error nor a windowing effect. Therefore, the AR spectrum has great advantages in fault feature extraction. The general expression of the AR model is [34]: where a k (k = 1, 2, . . . , p) are the parameters of the AR model, p is the order of the model, and u(n) is a stationary white noise sequence with a mean value of zero and a variance of σ 2 . The AR(p) process {x(n)} defined by Equation (12) can be regarded as a white noise sequence u(n) generated by an all-pole filter, whose transfer function is: Therefore, the power spectral density, p x (e jω ), of {x(n)} can be expressed as: The AR spectrum has the advantages of high resolution, smoothness, ease of locating, and so on. By analyzing a signal through the AR model, the characteristic frequency contained in the signal can be effectively extracted and the feature extraction can be carried out accurately.

Wavelet Kernel Extreme Learning Machine
WKELM has been proposed as an improvement on the basis of ELM. Considering that a wavelet kernel function has the characteristics of multi-scale approximation by wavelet analysis, it has a better effect for non-linear classification. Therefore, a wavelet kernel function was used to replace the matrix multiplication of the output matrix, H, of the ELM hidden layer neurons. This not only can improve the fitting ability of the method for non-linear samples but can also improve the stability of the method and shorten the training time.

Extreme Learning Machine Theory
The key characteristic of ELM is that the input weights and hidden layer biases of the network are randomly selected [35], and there is no need for iterative adjustment in the process of operating the method. Therefore, the calculation speed of ELM is extremely fast, which can greatly save time and effectively overcome the shortcomings of traditional single-hidden layer feed-forward neural networks, which have slow convergence speed and can easily become stuck in local minima. ELM can minimize the training error in the process of training and ensure that the norm of the output weights is very small, such that it has higher function approximation ability and generalization performance than traditional neural networks.
ELM [18] is a single-hidden layer feed-forward neural network, which is composed of an input layer, a hidden layer, and an output layer. The ELM network structure is similar to that of a single-hidden layer feed-forward neural network, as shown in Figure 1. Suppose we have an ELM with u neurons in the input layer, v neurons in the hidden layer, and w neurons in the output layer. Given a training sample set (X, where w i = [w i1 , w i2 , · · · , w iu ] T are the weights between the input neurons and the i th hidden layer node; β i are the weights between the i th hidden layer node and the output layer neurons, β i = [β i1 , β i2 , · · · , β iw ] T ; b i are the biases of the i th hidden layer node; and g(·) is the activation function of the ELM hidden layer. When the number of hidden layer neurons, v, and the number of training set samples, M, of single-hidden layer feedforward neural network are equal, the neural network can approach the training samples with zero error, that is: Define: Therefore, Equation (17) can be written as: where H is the hidden layer output matrix of ELM. When the activation function, g(·), of the hidden layer is infinitely differentiable, the weight matrix W of the input and hidden layers and the bias b of the hidden layer can be randomly determined before training and remain unchanged during training. In this case, the hidden layer output matrix H is a constant matrix. In this way, the solution of the weight matrix β between the hidden layer and the output layer can be transformed into the solution of the least-squares solutionβ of the linear equations Hβ = T, namely: where H + is the Moore-Penrose generalized inverse of the hidden layer output matrix H.
According to Karush-Kuhn-Tucker (KKT) theory, the solution of Equation (22) can be transformed into the solution of the Lagrangian function [36]: where β is the output weight vector, C is a penalty factor, ξ i is the training error, α i are Lagrange operators, h(x i ) is the feature mapping of the hidden layer, t i are the target output values, and M is the number of samples.
Calculating the partial derivative of Equation (23), the output weights,β, of the ELM can be solved as:β where I is the identity matrix of order v. Finally, the output function of the ELM can be expressed as [37]:

Theory of Kernel Extreme Learning Machine
As the weights between the input layer and the hidden layer and hidden layer biases of the ELM are randomly selected during training, a good training accuracy cannot be guaranteed every time; even the same training sample can have different results in two training processes. In order to solve this problem, Huang et al. proposed an improved ELM method in 2012; namely, kernel-ELM (KELM). The KELM method introduces a kernel function into the ELM method, which replaces the method of randomly selecting the input weights and hidden layer biases in original ELM method.
In Equation (21), the hidden layer output matrix, H, of ELM can be written as: where h(x i ) is the feature mapping of the hidden layer. HH T in Equation (24) can be replaced by constructing a kernel function without knowing the specific form of h(x i ): where K x i , x j is the kernel function and i, j ∈ (1, 2, . . . , M). The ELM kernel matrix is defined as: Then, Equation (25) for the ELM can be changed into the form of KELM: The KELM method was obtained by introducing a kernel method into the ELM method. Compared with basic ELM, the KELM method has better non-linear function approximation ability and superior data classification ability. In addition, KELM does not need to randomly select input weights and hidden layer biases during training, only requiring the selection of appropriate kernel functions to overcome the shortcomings of extreme learning machines. In addition, KELM not only can overcome the dimensional disaster problem of ELM, but also has very high recognition accuracy.

Wavelet Kernel Extreme Learning Machine Theory
The basic use of a wavelet function is to fit any function with wavelets of different scales. As wavelet functions have strong mapping ability, a wavelet kernel can be used as the kernel function in the KELM method [38]. Therefore, a wavelet kernel function is constructed from a wavelet function in this paper, which is used as the kernel function of KELM. Thus, a WKELM is constructed.
Given a mother wavelet function, ψ(x), whose scale factor and translation factor are a and b respectively, the wavelet basis function can be expressed as: According to tensor product theory, a multidimensional wavelet function can be expressed by tensor products of several one-dimensional wavelet functions: where s is the number of one-dimensional wavelet functions, when the multidimensional wavelet function is written in tensor product form; and x i is the independent variable of the i th one-dimensional wavelet function. According to Equation (31), a translation-invariant kernel function can be constructed: In this paper, the Morlet wavelet is used to construct the wavelet kernel function. The Morlet wavelet function is given as: The wavelet kernel function constructed from the Morlet wavelet function is: where x i are the i th components of x.
As the wavelet kernel function is constructed using the translation-invariant kernel of the wavelet function, the wavelet kernel function inherits the multi-scale approximation characteristics of the wavelet function. Therefore, taking a wavelet kernel function as the kernel function in KELM will have a better effect on classification. The method flow is as follows: (1) The number of input neurons and output neurons are determined, according to the number of features and categories, and the WKELM classifier model is established; (2) Set the value of the penalty factor C and the parameter of the wavelet kernel a; (3) The input characteristics of the training samples and their corresponding output categories are used to train the WKELM; (4) The features of the test sample are used as the input for the WKELM and, then, the trained WKELM classifier is used for the category decision.

Test System and Data Acquisition
A hydraulic pump fault simulation test bed was used for the test. The vibration sensor used in this paper is piezoelectric and charge output type vibration sensor. The charge amplifier is used to amplify and convert the output charge of the vibration sensor, which is used to sample the vibration signal of the hydraulic pump shell in x, y and z directions. The pressure sensor is a piezoresistive pressure sensor, which is used to sample the pressure pulsation signal of the hydraulic oil at the outlet of the hydraulic pump. The main components used in the test are described in Table 1. The installation of an experimental installation is shown in Figure 2. The vibration sensors are used to collect the vibration signals of the hydraulic pump in different states. The schematic diagram of the test system is shown in Figure 3. During the test, the different fault states of the hydraulic pump were simulated by replacing the plunger and slipper components of different fault types, as well as through use of a worn center spring. The outlet pressure of the hydraulic pump could be controlled by adjusting the pilot-operated proportional overflow valve 18. Vane pump 3 was used as a supplementary pump, in order to make up for the poor self-priming capacity of the piston pump.  During the test, a data acquisition system was required to display and store the vibration signal collected by the vibration sensor in real-time [39]. The data acquisition system used in this test was based on the LabVIEW software (National Instruments (NI), Austin, Texas, USA). As shown in Figure 4, the sampling frequency, sampling time, and controls for starting and stopping the data collection could be set using the front panel. At the same time, the collected pump speed, pressure, vibration, and other information could be displayed in real-time. The fault setting methods of the axial piston pump during the test are shown in Table 2. The process of the test was as follows: First, the three vibration sensors were fixed to the end cap and shell of the hydraulic pump, respectively, as shown in Figure 3. When the hydraulic system was unloaded, the test equipment was started. When the system reached normal speed, the system pressure was set to 10 MPa through the pilot-operated proportional overflow valve 18. The speed of the motor was set to 1480 r/min. Then, after the sampling frequency was set to 20 kHz in the data acquisition system, the data acquisition system was started, in order to collect and store the speed and other signals of the hydraulic pump in the normal state. Finally, the normal components were replaced by the fault components of the axial piston pump, and the vibration signal of the hydraulic pump in the fault state was acquisited. After the data acquisition stage, the original signals of the hydraulic pump in normal state and three fault states were obtained. Some pictures of the fault components are shown in Figure 5.

Simulation Signal Analysis of MEEMD Method
A simulation signal, x(t), with a sampling frequency of 1000 Hz and a sampling time of 2s was constructed: where x 1 (t) is a sine signal, x 2 (t) is an amplitude modulation signal, and n(t) are two intermittent random signals: x 2 (t) = (t + 1)2 sin 2π · 8t + π 3 .
(37) Figure 6 shows the time-domain diagram of the synthesized signal and its components. The synthesized signal was decomposed by EEMD, and the decomposition results obtained are shown in Figure 7.  The amplitude of the added Gaussian white noise was a l = 0.2, and the total number of noises added was N e = 100. The results obtained by EEMD decomposition of the signal are shown in Figure 7. It can be seen, from Figure 7, that EEMD decomposed many pseudo-components. After EEMD decomposition of the simulation signal x(t), seven IMF components and one residual component were decomposed, among which IMF3 and IMF5, respectively, represent the sine signal x 1 (t) and the amplitude modulation signal x 2 (t) in the synthetic simulation signal x(t). The simulation signal x(t) was also decomposed by MEEMD, as shown in Figure 8. It can be seen, from Figure 3, that three IMF components and one residual component were decomposed, among which IMF2 and IMF3, respectively, represent x 1 (t) and x 2 (t) in the synthetic simulation signal x(t). Compared with EEMD, MEEMD can effectively eliminate the pseudo-components.

Fault Feature Extraction Process
In this paper, a fault feature extraction method based on MEEMD and AR spectrum is proposed. First, the non-stationary vibration signals are decomposed by MEEMD, in order to obtain some IMF components. Secondly, the effective IMF components are screened by the Pearson correlation coefficient method, then analyzed by the AR spectrum method. Finally, the calculated AR spectrum energy is taken as the fault feature of the signal. Compared with the Hilbert-Huang transform (HHT) method, the feature extraction method based on MEEMD and AR spectrum proposed in this paper not only can solve the mode mixing phenomenon of EMD decomposition by the HHT method but can also overcome the limitation of Hilbert analysis in HHT method. The specific steps of this method are as follows: (1) The signal is decomposed by the MEEMD method, from which several IMF components are obtained. (2) As there may still be pseudo-components in the IMF components obtained through MEEMD decomposition, it is necessary to eliminate such components. In this paper, the pseudo-components are eliminated by the Pearson correlation coefficient method. First, the Pearson correlation coefficient r between all IMF components and the original signal is calculated. Then, t IMF components with large correlation coefficients are taken as the effective components. The Pearson correlation coefficient r is calculated as follows: where x(i) is the original signal, y(i) is each order of IMF components, i = 1, 2, · · · , n, n is the signal length, x is the average of x(i), and y is the average of y(i). (3) AR spectrum estimation of the t IMF components is carried out and the AR spectrum energies of the t IMF components are calculated. After normalization, the characteristic matrix, P, of the signal is composed as follows: where p i = [p 1i , p 2i , · · · , p ti ] T , i = 1, 2, · · · , M, and p ki is the AR spectrum energy of the t th IMF component of the i th sample.

Analysis of Acquired Hydraulic Pump Signals
The x-direction vibration signal of a single slipper loosening fault was selected for analysis. Figure 9 shows the time-domain diagram of the vibration signal in the state of the single slipper loosening fault of the axial piston pump. In order to show that the MEEMD method had a better effect than the EEMD method in the signal processing of the hydraulic pump, the number of IMF components and the index of orthogonality (IO) decomposed by the two methods were compared. Figure 10 shows the vibration signal decompositions of hydraulic pump in the single slipper loosing fault state by the two methods. Table 3 gives the IO of EEMD and MEEMD methods. Orthogonality refers to the correlation between different components. The orthogonality of the signal can be measured by calculating the IO, after which the degree of modal mixing between the components can be judged. The larger the IO value, the worse the orthogonality; that is, the more serious the degree of modal mixing. The Equation for IO of IMF components is [11]: where j = k, c j (i) is the j th IMF component, c k (i) is the k th IMF component, x(i) is the original sequence, n is the length of the time-series, I is the number of IMF components, j = 1, 2, · · · , I, and k = 1, 2, . . . , I.  It can be seen, from Figure 10, that nine IMF components were decomposed by the EEMD method, while six IMF components were decomposed by the MEEMD method. It is obvious that the MEEMD method addressed the problem of excessive false modal components, as was obvious in the EEMD method. It can be seen, from Table 3, that the IO value for the MEEMD method was smaller than that of the EEMD method. This indicates that the MEEMD method is more effective than the EEMD method in suppressing the phenomenon of modal mixing.

Fault Feature Extraction of Hydraulic Pump Based on MEEMD-AR Spectrum
First, the x-direction vibration signals of the hydraulic pump in four states were decomposed by the MEEMD method, and several IMF components were obtained. The vibration signals of the hydraulic pump in normal working state and single slipper wear fault state were decomposed by MEEMD, respectively. The decomposition results are shown in Figure 11. The Pearson correlation coefficient method was used to calculate the correlation coefficient between all IMF components decomposed by MEEMD and the original signal. Figure 12 shows the mean value of correlation coefficients between each IMF component decomposed by the MEEMD method and the original signal of 50 samples in each of the four hydraulic pump states. It can be seen, from Figure 12, that the correlation coefficient between the first IMF component and the original signal was more than 0.75, the correlation coefficient between the second IMF component and the original signal was basically more than 0.4, the correlation coefficient between the third IMF component and the original signal was more than 0.34, and the correlation coefficient between the fourth IMF component and the original signal is more than 0.3. The correlation coefficients between the other IMF components and the original signal were all less than 0.3. Considering the complexity and accuracy of the method, the components whose correlation coefficient between the IMF component and original signal was greater than 0.3 were selected as the effective components in this paper. It can be seen, from the analysis of Figure 12, that the correlation coefficients between the first four IMF components and the original signal in the four states of the hydraulic pump were greater than 0.3. Therefore, AR spectrum analysis was only performed for the first four IMF components decomposed by MEEMD in various states, as shown in Figure 13. It can be seen from Figure 13 that the AR spectra of the same-order IMF components of hydraulic pumps in the four states were different, where there were different characteristics among the different IMF components. In addition, the maps obtained by AR spectrum analysis were not only smooth, but also easy to distinguish. The AR spectrum energies of different states (i.e., corresponding to each IMF component in Figure 12) were calculated, and the results are shown in Table 4. It can be intuitively seen, from Table 4, that the feature values extracted by different states had different distributions, demonstrating the effectiveness of the feature extraction method, to a certain extent.

Fault Diagnosis Method of Hydraulic Pump Based on WKELM
Equipment will inevitably fail in the process of operation. In order to identify the equipment state information from the vibration signal, a fault diagnosis method based on the integration of MEEMD, AR spectrum, and WKELM methods is proposed in this paper, in order to identify the equipment state. Figure 14 shows the fault diagnosis flow chart, based on the MEEMD-AR-WKELM integration method.

Hydraulic Pump Fault Diagnosis Based on WKELM
The z-direction vibration signal of the axial piston pump was selected for analysis. A sample was intercepted every 2000 data points of the hydraulic pump vibration signal in four states, where 50 samples were taken in each state. The training sample set was composed of 30 samples selected from 50 samples in each state, while the test sample set was composed of the remaining 20 samples. Two groups of AR spectrum energies in four different states of the hydraulic pump were taken to draw the AR spectrum energy distribution diagram of IMF components in different states, as shown in Figure 15. It can be seen from Figure 15 that the features extracted from the different hydraulic pump working states had different distributions, while the features taken from the same working state had a similar distribution, thus indicating that the feature extraction method had a good effect.
The features of 200 samples in the four states of the axial piston pump were extracted, and the 4-dimensional fault features of each sample were extracted. A total of 30 samples were selected from 50 samples in each state, in order to form the training sample matrix, P train 120×4 , while the remaining 20 samples were used to form the test sample matrix, P test 80×4 . After the extraction of feature vectors, the WKELM method was used to identify the state of the hydraulic pump. The number of input neurons was u = 4 and the number of output neurons was w = 4 for the WKELM method. In consideration of the influence of the WKELM parameters on diagnosis accuracy, the penalty factor C and wavelet kernel coefficient a were analyzed. The influences of C and a on the recognition accuracy of WKELM are shown in Figure 16. It can be seen from Figure 16, when C=10, the training accuracy reaches 100%. When a ≤ 1, and it gradually decreases and then reaches a plateau. When a > 1, the test accuracy first increases and then decreases, reaching the maximum 98% at a = 1. When C = 100, the training accuracy reaches 100%. When a ≤ 2, and it gradually decreases. When a > 2, the test accuracy first increases and then decreases, reaching the maximum 99% at a = 1.5. When C = 1000, the training accuracy is 100%; when a ≤ 4.5, the test accuracy first increases and then decreases, reaching the maximum 98% at a = 1.5. When C = 10000, although the training accuracy is generally high, the test accuracy is very low. In general, when the value range of a is between 1 and 2.5, the training accuracy and the test accuracy are relatively high. After taking values within this range many times, we finally found that when a = 1.9 and C = 500, both the training accuracy and the test accuracy reached 100%. Therefore, the parameters of WKELM are determined as a = 1.9 and C = 500.
After determining the parameters of WKELM, the WKELM model was trained. The characteristic matrix, P train 120×4 , of the training samples was taken as the input of WKELM, while the labels corresponding to each state were taken as the output (the label for normal working was 1, the label for single slipper wear fault was 2, the label for single slipper loosing fault was 3, and the label for center spring worn fault was 4). Finally, the characteristic matrix, P test 80×4 , of the test samples was tested. The obtained recognition results are shown in Figure 17. It can be seen, from Figure 17, that all the predicted values identified by WKELM were the same as the true values; that is, the accuracy of the recognition results was 100%. This demonstrates that the proposed fault diagnosis method, based on the integration of MEEMD, AR spectrum, and WKELM methods, is effective.

Comparison and Analysis of Different Hydraulic Pump Fault Diagnosis Methods
In order to prove that WKELM has more advantages than the BP neural network, SVM, and ELM methods, hydraulic pump fault diagnosis was carried out using these methods under the same conditions. Among them, the number of neurons in the input, hidden, and output layers of BP neural network were four, 15, and four, respectively, and the activation function was the Sigmoid function. The LibSVM toolkit, provided by the Taiwanese scholar Lin Chih-Jen, was selected for SVM. The number of hidden layer neurons for the ELM was 14 and the activation function was the sigmoid function. The diagnosis accuracies of the four fault diagnosis methods are shown in Table 5. The training and testing times of the four methods were also compared, as shown in Table 6.  It can be seen from Tables 5 and 6 that in terms of fault recognition accuracy, WKELM had the highest fault recognition accuracy, followed by the BP neural network and ELM, while the SVM had the lowest fault recognition accuracy. In terms of time, WKELM had the shortest time, followed by ELM, while the BP neural network took the longest time. It can be seen that WKELM performed better than the other three methods, in terms of fault recognition accuracy and time consumption.
The Case Western Reserve University dataset was used to demonstrate the fault diagnosis method proposed in this paper has higher fault recognition accuracy. The sampling frequency of the data is 12 kHz, the model of the bearing is SKF6205, the damage diameter is 0.36 mm, the motor loads is 2 horsepower. Take the normal working state, rolling ball fault state, inner race fault state, and outer race fault data, 0.2 s for each state data as a sample, take 50 samples under each state, of which 30 as the training sample, 20 as test samples, the results as shown in Table 7. It can be seen that the fault diagnosis method proposed in this paper has a high recognition accuracy. In order to prove that the advantages of the fault diagnosis method proposed in this paper, hydraulic pump fault diagnosis was carried out using these methods under the same conditions. The diagnosis accuracies of the fault diagnosis methods [40,41] are shown in Table 8. Therefore, the advantages of the fault diagnosis method proposed in this paper, based on the integration of MEEMD, AR spectrum, and WKELM, were proved.

Conclusions
In this paper, an axial piston pump fault diagnosis method based on MEEMD, AR spectrum, and WKELM integration was proposed and studied. Our main conclusions are as follows: (1) Compared with the EEMD method, the MEEMD method can better suppress the phenomenon of mode mixing in hydraulic pump vibration signal decomposition. At the same time, the MEEMD method has better orthogonality and fewer pseudocomponents; (2) The hydraulic pump fault feature extraction method based on the MEEMD and AR spectrum methods proposed in this paper can effectively solve the problem of low efficiency inherent to traditional feature extraction methods. This method can effectively extract the fault features from the vibration signals of a hydraulic pump; (3) The WKELM method was improved and introduced for the fault diagnosis of a hydraulic pump, and a full hydraulic pump fault diagnosis method was proposed based on MEEMD-AR-WKELM integration. The diagnosis accuracy of different hydraulic pump fault states using this method was 100%. Compared with the BP, SVM, and ELM methods, the fault diagnosis method proposed in this paper was shown to have higher fault identification accuracy and faster identification speed. Institutional Review Board Statement: Not applicable, this study not involving humans or animals.
Informed Consent Statement: Not applicable, this study not involving humans.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.

Conflicts of Interest:
The authors declare no conflict of interest.