Intelligent Fault Identification for Rolling Bearings Fusing Average Refined Composite Multiscale Dispersion Entropy-Assisted Feature Extraction and SVM with Multi-Strategy Enhanced Swarm Optimization

Rolling bearings act as key parts in many items of mechanical equipment and any abnormality will affect the normal operation of the entire apparatus. To diagnose the faults of rolling bearings effectively, a novel fault identification method is proposed by merging variational mode decomposition (VMD), average refined composite multiscale dispersion entropy (ARCMDE) and support vector machine (SVM) optimized by multistrategy enhanced swarm optimization in this paper. Firstly, the vibration signals are decomposed into different series of intrinsic mode functions (IMFs) based on VMD with the center frequency observation method. Subsequently, the proposed ARCMDE, fusing the superiorities of DE and average refined composite multiscale procedure, is employed to enhance the ability of the multiscale fault-feature extraction from the IMFs. Afterwards, grey wolf optimization (GWO), enhanced by multistrategy including levy flight, cosine factor and polynomial mutation strategies (LCPGWO), is proposed to optimize the penalty factor C and kernel parameter g of SVM. Then, the optimized SVM model is trained to identify the fault type of samples based on features extracted by ARCMDE. Finally, the application experiment and contrastive analysis verify the effectiveness of the proposed VMD-ARCMDE-LCPGWO-SVM method.


Introduction
The operating conditions of industrial equipment are complicated, and rolling bearings are widely employed in the types of machinery that play important roles in industrial systems, such as coal, petrochemical, electric power and other industries [1,2]. Rolling bearings will inevitably cause damage to different degrees when running for a long time. What is worse, a fault in the rolling bearings may result in mechanical failure, causing economic loss and personal injury, and even inducing catastrophic accidents. However, monitoring the health condition of the rolling bearings through appropriate indicators and providing information can greatly reduce the occurrence of failures, and avoid major accidents [3,4].
Generally, once the rolling bearings fail, it is accompanied by vibration and sound. Therefore, using appropriate technology to process the collected vibration or acoustic signals could well detect potential failures [5][6][7]. Feature extraction is a committed step in identifying rolling bearing faults, but the vibration signals present nonlinear and nonstationary characteristics resulting in a limitation in the ability of feature extraction. Since signals are generally affected by noise information, an excellent signal processing method is necessary to eliminate the negative effects of these interferences [8][9][10]. Therefore, various time-frequency signal analysis approaches have been widely employed to extract the features in the fault identification of rolling bearings, including empirical mode decomposition (EMD) [11,12], local mean decomposition (LMD) [13], ensemble empirical mode decomposition (EEMD) [14] and variational mode decomposition (VMD) [15,16]. Many scholars have conducted much research into these methods: EMD is efficient for dealing with nonstationary signals, decomposing complex signals into a series of intrinsic mode functions (IMFs) adaptively, while endpoint effect defects remain. Compared to EMD, LMD has advantages in reducing iteration times and suppressing endpoint effect, which can adaptively decompose signal into a sum of subcomponents [17]. Nevertheless, LMD is complex in calculation and susceptible to sampling frequency, which affects the decomposition errors. To overcome these limitations, EEMD adds a set of white noise to help analyze the original signal [18]. However, the result impairs the purity of the original signal in the feature extraction process. In contrast to the above methods, VMD has excellent performance in signal processing, which avoids the mode mixing problem in EMD, the influence of sampling frequency in LMD and noise effect in EEMD. Furthermore, the capability and advancement of VMD has already been confirmed by preceding studies in engineering applications [19]. Thus, VMD was employed to decompose the nonstationary fault signals here, which laid the foundation for fault pattern recognition in rolling bearings.
It is key to extract fault features from the vibration signals in order to realize machinery equipment fault identification [20,21]. On account of nonstationary signals being decomposed by VMD, fault feature extraction would be successful and effective in this study. Entropy is a physical quantity, representing the regularity and complexity of a system which can reflect the nonlinear characteristics of a vibration signal. For example, permutation entropy (PE) [22], sample entropy (SampEn) [23] and fuzzy entropy (FE) [24,25] are all familiar entropies in the feature extraction of rolling bearing fault identification. The PE concept computes simply and quickly, but the disparity between the signal amplitude values is not adequately taken into account [26]. However, dispersion entropy (DE) [27,28] has the advantages of less influence by mutation signals, which can solve the shortages of slowing the calculation in SampEn and FE. Nevertheless, DE does not take into account sufficiently the relationship information between neighboring amplitudes. Meanwhile, DE is adept in analyzing time series at a single scale, but it may ignore the hidden valuable fault information at other scales. To overcome the drawbacks, previous researchers have made improvements, for example, GRCMMFDE was proposed by Zheng et al. [20] to extract fault features. In this paper, a modified DE, namely average refined composite multiscale dispersion entropy (ARCMDE) is put forward, which can not only preserve the original data effectively, but also enhance the ability of multiscale fault feature extraction of the IMFs by fusing average refined composite multiscale procedures.
It is virtually a pattern recognition issue for identifying rolling bearing faults. Therefore, many pattern recognition methods have been employed in various engineering application problems. For instance, artificial neural network (ANN) [29,30], Bayesian decision [31] and support vector machine (SVM) [32] have been employed in identification issues. Among the above methods, ANN has a strong capacity to deal with pattern recognition problems, while it requires abundant samples and is time-consuming to adjust the network structure parameters. Bayesian decision performs with notable capacity by considering prior probability, yet good accuracy is premised on a prior model with appropriate assumptions. Compared with above methods, SVM requires a small number of samples for training, and has good generalization ability. What is more, it has particular advantages in dealing with nonlinear and multidimensional pattern recognition problems [33]. It can satisfy the classification performance by means of finding an optimal hyperplane. Meanwhile, SVM has been applied for pattern recognition combining with Entropy 2021, 23, 527 3 of 27 feature extraction in rolling bearing fault identification. Therefore, SVM is explored to implement fault identification here.
The SVM model is easily affected by penalty factor C and kernel parameter g when performing pattern recognition. To address this issue, many optimization algorithms have been used to optimize the SVM model, for instance, Harris hawks optimization (HHO) [34,35], whale optimization algorithm (WOA) [36], particle swarm optimization (PSO) [37], moth−flame optimization (MFO) [38], differential evolution (DE) [39], sine cosine algorithm (SCA) [40] and grey wolf optimization (GWO) [41]. Although these intelligent optimization algorithms have achieved some favorable results, there are still problems of premature convergence of different degrees. In order to improve convergence precision, an enhanced GWO algorithm (LCPGWO) coupled with levy flight [42], cosine factor and polynomial mutation [43] is proposed in this paper. Compared with PSO, GWO, SCA, WOA, MFO and DE algorithms on 12 well-known benchmark functions, the results show that the LCPGWO has greater advantages in finding the optimal solution, which is employed to optimize the penalty factor C and kernel parameter g of SVM in this study.
In conclusion, firstly, the nonstationary original vibration signals were decomposed into several IMFs by means of VMD. Afterwards, ARCMDE was proposed to construct the feature vectors of different fault samples. Subsequently, the LCPGWO was explored to optimize the SVM model, which was employed to carry out the classification of different fault samples. Lastly, VMD-ARCMDE-LCPGWO-SVM method was applied to compare with other methods in terms of different locations and the motor speeds of rolling bearing faults. The performance of the proposed method was proved to be perfect for the engineering application problem. This study has the following contributions: (1) Average refined composite multiscale dispersion entropy (ARCMDE) was proposed to enhance the ability of fault feature extraction. (2) A novel multistrategy enhanced swarm optimizer (LCPGWO) was proposed to calibrate the parameters of SVM, which made it an excellent fault identification model. (3) The effectiveness of LCPGWO was verified by performance analysis with 12 wellknown benchmark functions. (4) The superiority of the proposed fault identification method was ascertained by engineering experiment and comparative analysis.
The rest of this paper is arranged like this. Section 2 contributes to the fundamental theories about VMD and SVM. The proposed fault identification method according to ARCMDE and LCPGWO optimization approach is presented in Section 3. Section 4 is devoted to demonstrate the superiority of the proposed method in engineering application. The conclusions are in Section 5.

Variational Mode Decomposition
VMD is a nonrecursive signal preprocessing, which can adaptively decompose the nonstationary signal into K band-limited intrinsic mode functions (IMFs) by setting the mode number K previously. The core of the VMD method is employed to construct and solve variational problems, which is established below: where m k = {m 1 , m 2 , . . . , m k } and ω k = {ω 1 , ω 2 , . . . , ω k } represent the set of K mode functions and central frequencies, respectively. The ∂ t is the partial derivative of time t, δ(t) is the unit pulse function and f (t) is the input signal of the given real value. The above variational problem can transform into an unconstrained problem, which can be expressed as: where α represents the penalty factor and β(t) is the Lagrange multiplier [44]. Then m k and ω k can be optimized by Equations (3) and (4), respectively.
The iterative equations in frequency domain are derived as follows: The Lagrange multipliers are expressed in Equation (7).

Support Vector Machine
SVM is designed for the two-classification issue, which can solve learning problems with limited samples. For a given sample set {(x i , y i )|i = 1, 2, . . . , n}, it maps the sample space to higher dimensions, and then a hyperplane is constructed, which can transform the nonlinear problem of the sample space into the linear problem of the feature space to solve. The hyperplane function is defined as follow: where and b are weight vector and bias parameter, respectively. Equation ·x is the inner product. For example, in order to correctly identify samples of a binary classification issue, all samples demand the following conditions: The maximizing classification interval is 2/ 2 , which is obtained by minimizing 2 . Then the slack term ξ and penalty factor C are brought into Equation (9) to solve the linear indivisibility problem of SVM model.
Lagrange function is introduced in Equation (10), which can be described as: where µ i means the Lagrange multiplier, K(x i , x j ) is the kernel function of SVM. In this paper, the radial basis function (RBF) is selected as the kernel function of SVM. Solving the dual problem of Equation (11), the optimal classification discriminant function with RBF is defined as: Among them, RBF function is represented as: where g represents the kernel parameter, φ(x) is the nonlinear vector function. For a given time series r = {r i , i = 1, 2, · · · N}, which length is N. r i is normalized by mean of employing a mapping function [27].

Intelligent Fault Identification for
where σ represents the variance of the normal distribution and µ represents the expectation value. The time series r is normalized to y = {y 1 , y 2 , · · · y n }, y i ∈ (0, 1). Subsequently, the phase space is reconstructed into a matrix for y: where j = 1, 2, · · · , N − (m − 1)t d , m is embedding dimension, t d is time delay, y m j is mapped to the scope [1, c]: where z c j represents the j-th member of class sequence z m,c j , and round is rounding. Each z m,c j corresponds to a dispersion pattern The frequency of π v 0 v 1 ...v m−1 can be deduced as: where that corresponding to z m,c j : The dispersion entropy is defined as: There is a linear negative correlation between DE value and time series. The larger the DE value, the more irregular the time series.

Average Refined Composite Multiscale Dispersion Entropy
As a single scale method, DE may result, in that much useful and significant information hides in multiple scales, which would limit the representation precision of nonstationary fault signals. To solve the disadvantage, average refined composite multiscale dispersion entropy (ARCMDE) is proposed, which is utilized to extract multiscale fault features from IMFs. Given time series r = {r i , i = 1, 2, · · · N} with length N, the k-th composite multiscale coarse-grained sequence u k,2 , · · · is defined as: where τ is scale factor. For each scale factor, refined composite multiscale dispersion entropy (RCMDE) is expressed in Equation (21).
where p(π v 0 v 1 ···v m−1 ) is mean probability of the dispersion pattern π of coarse-grained sequence r Finally, ARCMDE is expressed as average value of all RCMDE in the scale τ, such that:

Grey Wolf Optimization
Grey wolves are in a dominant position in the competitive natural environment which have a strict social hierarchy and ingenious cooperative predation. According to the behavior of grey wolves when they hunt, GWO is proposed to solve optimization problems, where the grey wolves are graded into four levels [45]. The first level, called α wolf, may not be the strongest wolf, but it is the best manager in the system and responsible for overall planning. The β wolf on the second level is the best substitute for the α wolf. The δ Entropy 2021, 23, 527 7 of 27 wolf on the third level acts as the suboptimal solution. The ω wolf is the candidate solution at the bottom and is responsible for balancing the internal relations of the wolf population.
The mathematical expression of grey wolves' predation can be expressed as follows: where D represents the distance between the wolf and the prey, X and X p denote the position of the grey wolf and prey, respectively, t is current iterations time.
The coefficient vectors A in Equation (23) and C in Equation (24) are expressed as follows: where a is convergence factor, which decreases linearly from 2 to 0, h 1 and h 2 are random vectors in [0, 1], max is the maximum number of iterations.
In GWO algorithm, α, β and δ wolves to approach and surround the prey when the prey is identified by the grey wolves. Therefore, the position of the prey can be determined by the position of the grey wolves. The mathematical model for updating the position of each wolf is as follows: where D α , D β and D δ represent distances of α, β and δ wolves from other individuals respectively, X 1 , X 2 and X 3 denote the current position of α, β and δ wolves respectively. The positional relationship between the grey wolf individual ω and the prey can be determined as follows: If |A| < 1, the wolves will attack the prey; otherwise, the wolves will search for the prey. To sum up, the pseudocode of GWO algorithm is shown in Algorithm 1. Algorithm 1. The algorithm pseudocode of GWO.
Initialize the parameters a, A and C 3.
Evaluate the fitness of each wolf 4.
Assign the best three grey wolves to X α , X β , X δ 5.
for each search agent 7.
Update the position of current grey wolves by Equation (

Grey Wolf Optimization Coupled with Multiple Enhancement Strategies
GWO is slow in convergence speed, resulting in an easy fall into local optimum in the later iteration. In this section, the improved GWO coupled with multiple enhancement strategies (LCPGWO) is explored to solve the shortcomings of GWO, which can improve the ability of global search, accelerate convergence and enhance the capacity for preventing local optimum during the later iteration. The realization processes of LCPGWO are described below in detail.
As shown in Equation (25), parameter a influences the change of coefficient vectors A, which coordinates the local and global explorations. The larger a is, the stronger the global exploration ability; the smaller a is, the stronger the local exploration ability. To promote the adaptation during both local and global explorations, the linearly decreasing a in Equation (27) is substituted with cosine factor as shown in Equation (31). Thus, a is large and reduces slowly for global exploration in the early iteration stage, while it will reduce rapidly for the local search in the later iteration stage.
Additionally, inertial weight based on cosine factor is introduced in this paper to enhance the global exploration, which can be seen in Equation (32).
With inertial weight, the positions of α, β and δ wolves are reformulated in Equation (33).
In iterative process, it falls into local optimum easily when the ω wolf approaches the other three wolves. By introducing inertial weight [46], the positional relationship between the grey wolf individual ω and the prey can be redefined as follows: where W 1 , W 2 and W 3 represent the learning rate of ω to α, β and δ wolves, respectively.
In this paper, α wolf is searched globally by levy flight strategy for preventing local optimum, where the flight step is stable expansion distribution. The next generation of α wolf is calculated as follows: where X(t) is position of α wolf at t-th iteration, operator ⊕ is entry-wise multiplications, d and Levy(θ) are random numbers and step of α wolf respectively, which are determined by Equations (37) and (38). where d 0 is a constant, θ is levy index, a random number between 0 and 2, whose value is set at 1.5 here, the flight step of α wolf is a power-law formula.
A more detailed description about levy flight can be summarized as Equation (39).
where s and v are both normal distributions: where parameter γ is the standard gamma function.
In swarm intelligence optimization algorithm, it traps in local optimum easily. For this purpose, a polynomial mutation operator for GWO is introduced in this section to promote the exploring ability within the whole situation space, thus to avoid from trapping in local optimum as well as maintain the diversity of solution in the later iteration stage. The mathematical formula of the polynomial mutation is written in Equation (42).
where X(t) is the original optimal individual position, X(t + 1) is the mutated optimal individual position, u k represents the upper limit of the position and l k is the lower limit of the position. The parameter ξ is calculated as follows: where parameter s is a random number in [0, 1], η is also [0, 1]. The parameters ξ 1 and ξ 2 are deduced in Equation (44).
To sum up, the pseudocode of LCPGWO algorithm is displayed in Algorithm 2.
Initialize the parameters a by Equation (31), and initialize A and C 3.
Evaluate the fitness of each wolf 4.
Assign the best three grey wolves to X α , X β , X δ 5.
for each search agent 7.
Update the position of current grey wolves by Equation (34) 8.
Calculate the new positions of grew wolves employing the levy flight and polynomial mutation by Equations (36) and (42)  9. end for 10. Update a, A and C 11. Evaluate the fitness of each wolf 12. Update X α , X β , X δ 13. t = t + 1 14. end while 15. return X α

Experimental Study and Results Analysis Benchmark Functions
To prove the effectiveness of the proposed LCPGWO algorithm, six well-known nature-inspired optimization algorithms, including PSO, GWO, SCA, WOA, MFO and DE were applied for comparison. Meanwhile, 12 benchmark functions were selected for optimization experiments as listed in Table 1 and divided into two categories, where F1-F7 were unimodal functions and F8-F12 were multimodal functions [47][48][49]. In Table 1, Fmin was the minimum value of each benchmark function. The unimodal functions were mainly employed to test the convergence rate of the algorithms, while the multimodal functions were carried out to test the global exploration ability of the algorithms.
u(x i , 10, 100, 4) Comparison and Analysis with Different Algorithms The experiment was on a personal computer, which was equipped with Windows 10 system and an Intel(R) Core (TM) CPU at 2.89 GHz and 4 GB memory. The simulation software was MATLAB R2016a.
Each benchmark function was run 10 times independently to obtain an objective result. The iteration number and searching agents in the experiment were set at 200 and 40, respectively. The detailed parameter settings are shown in Table 2. During iterations of all algorithms, the optimal fitness values were recorded every time. The average fitness values were obtained to draw a curve reflecting the convergence trend of the algorithms. The convergence curves of PSO, GWO, SCA, WOA, MFO, DE and LCPGWO algorithms on the 12 well-known benchmark functions are listed in Figure 1. At the same time, the maximum value, minimum value, mean values and standard deviations of the optimal solution obtained by all algorithms are displayed in Table 3, where a lower value means better search ability and stability.      From Figure 1, it was observed that the proposed LCPGWO algorithm converged better than PSO, GWO, SCA, WOA, MFO, DE algorithms for all F1−F12, indicating that the proposed LCPGWO algorithm was able to prevent local optimum and converge to the optimal value at a faster speed. From Table 3, it can be concluded that the proposed LCPGWO algorithm achieved the lowest value in the maximum value, minimum value, mean value and standard deviation for both unimodal and multimodal functions. In particular, the results of the LCPGWO algorithm were superior to PSO, GWO, SCA, WOA, MFO, DE algorithms, especially on functions F6, F8, F9 and F10. On the whole, the proposed LCPGWO algorithm is more effective and feasible than contrastive methods.

SVM Optimized by LCPGWO
In order to obtain a good generalization performance in dealing with fault identification issues, it is necessary to assign appropriate parameters C and g of SVM. Thus, the proposed LCPGWO was used to optimize SVM model. The main procedures of classification machine learning with SVM optimized by the proposed LCPGWO algorithm are in below: Step 1: Initialize the population and set relevant parameters; Step 2: Update individual's status according to Equations (36) and (42); Step 3: Calculate the fitness value, which is the cross-validation accuracy of SVM; Step 4: Update individual's new position; Step 5: Repeat Steps 2-4 until the maximum time of iterations is reached or the convergence condition is met; Step 6: Choose the maximal cross-validation accuracy as the optimal parameters C and g of SVM; Step 7: Train the optimal SVM model according to the training set; Step 8: Recognize the testing set and finish the identification.

Intelligent Fault Identification for Rolling Bearings Fusing the Proposed Method
A novel fault identification method is proposed according to VMD, ARCMDE as the feature extraction and SVM optimized by LCPGWO as classification model in this paper. The flowchart of the fault identification with the proposed method is illustrated in Figure 2. To be specific, firstly, vibration signals were decomposed into four IMFs of different series by VMD. Afterwards, the feature vectors were constructed by means of the proposed ARCMDE, which extracted fault features from IMFs. Finally, the parameters of SVM model were optimized by multistrategy enhanced swarm optimization algorithm LCPGWO, thus achieving the fault pattern recognition.

Intelligent Fault Identification for Rolling Bearings Fusing the Proposed Method
A novel fault identification method is proposed according to VMD, ARCMDE as the feature extraction and SVM optimized by LCPGWO as classification model in this paper. The flowchart of the fault identification with the proposed method is illustrated in Figure  2. To be specific, firstly, vibration signals were decomposed into four IMFs of different series by VMD. Afterwards, the feature vectors were constructed by means of the proposed ARCMDE, which extracted fault features from IMFs. Finally, the parameters of SVM model were optimized by multistrategy enhanced swarm optimization algorithm LCPGWO, thus achieving the fault pattern recognition.

Data Collection
To attest the effectiveness of the proposed method in this paper, the machinery fault simulator (MFS) manufactured by SQI company was used to measure the relevant experimental data for bearings. The detailed information of machinery fault simulator is shown in Figure 3. The type of rolling bearings selected in the experiment was ER12KCL. Meanwhile, the motor speeds of the bearings were 1800 rpm and 2200 rpm when collecting experimental data. The time of sample data collection was set as 10 s. The bearings' state types were also divided into four, which were inner race fault, ball fault, outer race fault and combination fault, which are displayed in Figure 4. The diameter of all the experimental bearings was 3/4 inches. The vibration signals of the rolling bearings were collected by employing the acceleration sensor mounted on the bearing seat of the motor drive end, and the sampling frequency was 12.8 kHz. Further, there were 61 samples of the vibration signals for each type, and one sample possessed 2048 sample points. The detailed data about the experiments are shown in Table 4. and combination fault, which are displayed in Figure 4. The diameter of all the experimental bearings was 3/4 inches. The vibration signals of the rolling bearings were collected by employing the acceleration sensor mounted on the bearing seat of the motor drive end, and the sampling frequency was 12.8 kHz. Further, there were 61 samples of the vibration signals for each type, and one sample possessed 2048 sample points. The detailed data about the experiments are shown in Table 4.    types were also divided into four, which were inner race fault, ball fault, outer race fault and combination fault, which are displayed in Figure 4. The diameter of all the experimental bearings was 3/4 inches. The vibration signals of the rolling bearings were collected by employing the acceleration sensor mounted on the bearing seat of the motor drive end, and the sampling frequency was 12.8 kHz. Further, there were 61 samples of the vibration signals for each type, and one sample possessed 2048 sample points. The detailed data about the experiments are shown in Table 4.

Application to Fault Identification of Rolling Bearings
To fully prove that the proposed fault identification method was effective, other relevant methods were applied to compare with the proposed VMD-ARCMDE-LCPGWO-SVM method. More specifically speaking, FE, DE and RCMDE were employed to compare with ARCMDE at the feature extraction stage; GWO was applied for comparison at the parameter optimization stage. The settings of the same parameters were uniform in all the comparison experiments.
Feature extraction is the main problem in the process of identifying rolling bearing faults. VMD was selected to decompose the fault signal into a set of IMFs. The parameter of decomposing mode number K was decided in advance, where it was determined by the center frequency observation method according to a previous study [50]. In this paper, the K value was obtained by experiment using sample data under a motor speed of 1800 rpm. As shown in Figure 5, if K is too large, the center frequencies of adjacent IMFs are too close, resulting in mode mixing, which means excessive decomposition. However, if K is too small, the fault signal cannot be effectively decomposed, which leads to more valuable information being ignored. Therefore, the K value in the paper was set as 4.
SVM method. More specifically speaking, FE, DE and RCMDE were employed to com with ARCMDE at the feature extraction stage; GWO was applied for comparison a parameter optimization stage. The settings of the same parameters were uniform in a comparison experiments.
Feature extraction is the main problem in the process of identifying rolling bea faults. VMD was selected to decompose the fault signal into a set of IMFs. The param of decomposing mode number K was decided in advance, where it was determine the center frequency observation method according to a previous study [50]. In this pa the K value was obtained by experiment using sample data under a motor speed of rpm. As shown in Figure 5, if K is too large, the center frequencies of adjacent IMF too close, resulting in mode mixing, which means excessive decomposition. Howev K is too small, the fault signal cannot be effectively decomposed, which leads to m valuable information being ignored. Therefore, the K value in the paper was set as 4.    After the IMFs of all samples were obtained through signal decomposition, fault feature vectors were constructed by calculating ARCMDE values, where parameters should be chosen properly beforehand [51,52]. Here, four parameters were set in advance, which were embedding dimension m, number of class c, maximum scale factor max τ , and time delay d t . By referring to previous papers [53], the parameter settings of ARCMDE are displayed in Table 5.  To verify the performance of the proposed method, 61 feature vectors belonging to different fault types were selected for the contrast experiment. They were randomly divided into two parts, where 40 vectors were selected for training and the remaining 21 vectors were for testing. After that, the proposed LCPGWO method was utilized to enhance the classification identification performance of SVM by searching the optimal values of parameters C and g of SVM model. The searching range of C and g were set in [2 −10 , 2 10 ], meanwhile, the optimization experiments were accomplished by 100 iterations and 20 searching agents. The five-fold cross-validation was applied for calculating the fitness values of training samples in this experiment. Hence, according to the obtained optimal parameters C and g by the proposed LCPGWO method, the SVM model was trained and employed to achieve fault identification. For a dependable verification of the proposed method about the effectiveness and superiority, each of these comparative fault identifi- After the IMFs of all samples were obtained through signal decomposition, fault feature vectors were constructed by calculating ARCMDE values, where parameters should be chosen properly beforehand [51,52]. Here, four parameters were set in advance, which were embedding dimension m, number of class c, maximum scale factor τ max , and time delay t d . By referring to previous papers [53], the parameter settings of ARCMDE are displayed in Table 5. To verify the performance of the proposed method, 61 feature vectors belonging to different fault types were selected for the contrast experiment. They were randomly divided into two parts, where 40 vectors were selected for training and the remaining 21 vectors were for testing. After that, the proposed LCPGWO method was utilized to enhance the classification identification performance of SVM by searching the optimal values of parameters C and g of SVM model. The searching range of C and g were set in [2 −10 , 2 10 ], meanwhile, the optimization experiments were accomplished by 100 iterations and 20 searching agents. The five-fold cross-validation was applied for calculating the fitness values of training samples in this experiment. Hence, according to the obtained optimal parameters C and g by the proposed LCPGWO method, the SVM model was trained and employed to achieve fault identification. For a dependable verification of the proposed method about the effectiveness and superiority, each of these comparative fault identification methods was run 10 times, on average, independently, and training samples were randomly selected. Moreover, accuracy (ACC), adjusted rand index (ARI), F-measure (F) and normalized mutual information (NMI) [54,55] were applied to evaluate the capability of these different approaches. The higher values of the four metrics mean that the matching degree between fault identification result and real samples distribution information is better. The calculation methods of the four metrics are shown in the Table 6, where the range of ARI is set in [−1, 1], and the rest of the metrics are in [0, 1].  The following notations in Table 6 are adopted: TP, TN, FP, FN represent true positive, true negative, false positive, false negative on the basis of the result of fault identification and actual label; Φ and Ω are the sets of given actual label classified result, respectively; n 11 is the number of sample pairs for the same label in both Φ and Ω, while n 00 is for different label, C 2 n is all possible sample pairs combinations; P(ϕ) and P(ω) represent the probability functions of Φ and Ω respectively; P(ϕ, ω) is the joint probability functions of Φ with Ω.
Eight relevant methods were employed to compare for illustration of the advantages of the proposed approach in this study. The four evaluation values of fault identification results are shown in Table 7. By comparing with the results of different methods, it proves that the proposed VMD-ARCMDE-LCPGWO-SVM method is the best of four evaluation metrics. It has the highest values at 0.9597, 0.9627, 0.9838, 0.9838 under a motor speed of 1800 rpm and 0.9303, 0.9381, 0.9712, 0.9714 under a motor speed of 2200 rpm. Although the performance of NMI under 2200 rpm was not optimal, it was still very good, which could be considered a desirable result. The evaluation value deviations were also very low. In order to better analyze the results, the results when the motor speed was 1800 rpm were taken as the analysis. For the parameter optimization of the SVM model, it can be observed that the ACC of the proposed VMD-ARCMDE-LCPGWO-SVM method was far better than VMD-ARCMDE-GWO-SVM method. Additionally, the VMD-FE-LCPGWO-SVM and VMD-DE-LCPGWO-SVM methods also performed better than the VMD-FE-GWO-SVM and VMD-DE-GWO-SVM methods, respectively, which proved the effectiveness of LCPGWO method for parameter optimization. Based on the above experimental analyses, the conclusion can be drawn that the proposed VMD-ARCMDE-LCPGWO-SVM method achieves stable competitiveness when compared with other methods. In order to show the fault identification evaluation results comparison of different methods more intuitively, the comparison of evaluation values of different methods under 1800 rpm are shown in Figure 8, illustrating that the proposed method has obvious advantage in fault identification. As shown in Figure 9, the proposed VMD-ARCMDE-LCPGWO-SVM method achieved more outstanding results than the other methods on the whole, where the evaluation values of the data had a strong performance at 2200 rpm. Furthermore, the boxplots of the four evaluation values are displayed in Figure 10, it demonstrates the performances of the different methods. The proposed method possesses better stability and overall performance. Therefore, with the experiments on various fault locations, motor speeds and the detailed comparative analysis given above, the superiority of the proposed identification model is effectively demonstrated.

Conclusions
Increasingly complex rotating machinery equipment must have excellent mechanical fault identification technology to ensure its safe and effective operation. In this paper, a novel fault identification approach is proposed by fusing VMD, ARCMDE and SVM with LCPGWO optimization. Firstly, VMD was employed to decompose the nonstationary fault signals into several IMFs by the center frequency observation method. Afterwards, ARCMDE fusing the superiorities of DE and average refined composite multiscale procedure, was proposed to construct the feature vectors of different fault samples, which performed excellently in multiscale fault feature extraction from the IMFs. Subsequently, LCPGWO, which was GWO enhanced by multistrategy, including levy flight, cosine factor and polynomial mutation strategies, was compared with the other algorithms on different benchmark functions. The results demonstrated that the use of LCPGWO was explored to improve the ability of global search, accelerate convergence and enhance the capacity for jumping out of local optimum in the later iteration. Thus, LCPGWO was applied to optimize penalty factor C and kernel parameter g of SVM model, which was employed to realize the fault classification for different fault samples. Lastly, the proposed VMD-ARCMDE-LCPGWO-SVM method was applied to compare with other methods for rolling bearing fault identification. Meanwhile, the experiment results were measured by four evaluation metrics named ACC, ARI, F and NMI. The proposed fault identification

Conclusions
Increasingly complex rotating machinery equipment must have excellent mechanical fault identification technology to ensure its safe and effective operation. In this paper, a novel fault identification approach is proposed by fusing VMD, ARCMDE and SVM with LCPGWO optimization. Firstly, VMD was employed to decompose the nonstationary fault signals into several IMFs by the center frequency observation method. Afterwards, ARCMDE fusing the superiorities of DE and average refined composite multiscale procedure, was proposed to construct the feature vectors of different fault samples, which performed excellently in multiscale fault feature extraction from the IMFs. Subsequently, LCPGWO, which was GWO enhanced by multistrategy, including levy flight, cosine factor and polynomial mutation strategies, was compared with the other algorithms on different benchmark functions. The results demonstrated that the use of LCPGWO was explored to improve the ability of global search, accelerate convergence and enhance the capacity for jumping out of local optimum in the later iteration. Thus, LCPGWO was applied to optimize penalty factor C and kernel parameter g of SVM model, which was employed to realize the fault classification for different fault samples. Lastly, the proposed VMD-ARCMDE-LCPGWO-SVM method was applied to compare with other methods for rolling bearing fault identification. Meanwhile, the experiment results were measured by four evaluation metrics named ACC, ARI, F and NMI. The proposed fault identification method has smaller error, better stability and higher reliability than the other contrastive methods. Particularly, under motor speed 1800 rpm, the identification accuracy of the proposed method was 9.33, 3.62, 1.71 and 0.57% higher than the VMD-FE-GWO-SVM, VMD-DE-GWO-SVM, VMD-RCMDE-GWO-SVM and VMD-ARCMDE-GWO-SVM methods; and also 8.09, 3.43 and 0.67 higher than the VMD-FE-LCPGWO-SVM, VMD-DE-LCPGWO-SVM and VMD-RCMDE-LCPGWO-SVM methods. Meanwhile, the evaluation metrics were also outstanding under 2200 rpm. Therefore, it can be expected to provide a new way for rolling bearing fault identification.

Discussion
The generation and development of rolling bearing faults are caused by the coupling of many factors, which contain a large number of uncertain factors. Conventional diagnostic methods have difficulty in obtaining satisfactory results. The SVM method is a relatively novel method in the field of rolling bearing fault identification. Although some research has been done in this paper, the author believes that there are still some issues worthy of further research: (1) in practical engineering applications, different components in the unit influence each other, complex rolling bearing combinations may have multifactor failures in the future. Therefore, it is still necessary to conduct in depth failure mechanism research to make the identification work more targeted, more accurate and reliable. (2) The research of a fault identification classifier is only one aspect of the problem of fault identification. The premise of fault identification is to apply an advanced signal analysis method to extract more effective and more capable features from the rolling bearings' operating state. Therefore, it is necessary to extract fault feature information from multiple angles according to the bearing fault signal characteristics and combined with new signal processing technology, so as to lay a foundation for the SVM to provide more effective fault features. (3) The occurrence of rolling bearing faults is a gradual process, and minor failures have little impact, but parts must be replaced after reaching a certain degree of severity. Therefore, real-time monitoring of rolling bearings is very important. This paper only analyzes the bearing vibration signals collected on the experimental platform, and does not realize real-time monitoring, which is also the direction of author's next research.

Data Availability Statement:
The data included in this study are all owned by the research group and will not be transmitted.

Conflicts of Interest:
The authors declare no conflict of interest.