A Novel Rolling Bearing Fault Diagnosis Method Based on MFO-Optimized VMD and DE-OSELM

Abstract: Rolling bearings are critical in maintaining the smooth operation of rotating machinery and considerably influence its reliability. The signals collected from rolling bearings in field conditions are often contaminated by noise, which makes it challenging to extract weak fault features. This paper proposes a rolling bearing fault diagnosis method that addresses this problem through moth-flame optimization algorithm optimized variational mode decomposition (MFO-optimized VMD) and an ensemble differential evolution online sequential extreme learning machine (DE-OSELM). By using a dynamic adaptive weight factor and the genetic algorithm crossover operator, the optimization accuracy and global optimization ability of the moth-flame optimization (MFO) algorithm are improved, and the two basic VMD parameters, the decomposition level and the quadratic penalty factor, are adaptively selected. Since the vibration characteristics of the signal cannot be fully interpreted by a single index, the effective weighted correlation sparsity index (EWCS) is utilized to select the relevant intrinsic mode functions (IMFs) of the VMD decomposition, and their energies are extracted as features. To improve the classification accuracy, the energy feature set is subsequently input into DE-OSELM for training and classification, and the proposed method is assessed on a sample set covering four different health states of actual rolling bearings. The results of the proposed method are compared with those of other diagnosis methods, demonstrating its feasibility for diagnosing rolling bearing faults with higher classification accuracy.


Introduction
Rolling bearings are among the most widely used mechanical components in rotary machinery and have diverse applications in the medical, aerospace, and railway fields [1][2][3]. Bearing failures account for 30% of rotating machinery failures, according to published research [4]. Ensuring the normal and safe operation of rolling bearings thus demands condition monitoring and fault diagnosis [5]. When a rolling bearing fails, its vibration signal contains regular pulses, which analysts often scrutinize to identify the specific type of failure [6][7][8]. However, the noisy nature of collected vibration signals makes it difficult to extract fault features from these pulses, which remains an area of active research in the field of fault diagnosis.
Several approaches have been proposed for processing vibration signals, including empirical mode decomposition (EMD) [9][10][11], ensemble empirical mode decomposition (EEMD) [12][13][14], and local mean decomposition (LMD) [15][16][17]. Nonetheless, approaches such as EMD, EEMD, and LMD are limited in that they cannot fully address the issues of mode mixing and the endpoint effect. Dragomiretskiy et al. [18] introduced an adaptive signal analysis approach named VMD that decomposes a signal by formulating and solving a variational problem, providing strong resistance to noise. Owing to its ability to tackle mode mixing and the endpoint effect effectively, VMD finds applications in multiple fields, including generator anomaly detection, structural health monitoring, and bearing fault diagnosis [19][20][21]. Li and colleagues [22] devised a fault diagnosis method for rolling bearings using VMD and an improved kernel extreme learning machine and demonstrated that VMD successfully addressed the mode mixing problem and offered superior computational efficiency compared to the EMD and LMD methods.
However, VMD requires pre-set parameters, including the decomposition number (K) and the penalty factor (α), which significantly affect the final outcome. To identify the optimal parameter combination, several studies have proposed different methods. For instance, Jiang et al. [19] established a central frequency mode decomposition (CFMD) based on VMD, which relieved the difficulty of selecting initial parameters in the traditional VMD, provided that the range of bandwidth parameters was preset. Wang et al. [23] compared the center frequencies of modal components decomposed with different parameter combinations. However, this method had limited adaptability. Tang et al. [24] implemented the particle swarm optimization (PSO) algorithm to optimize VMD and demonstrated its ability to extract early fault features from bearing vibrations. Zhang et al. [25] suggested a parameter-adaptive VMD strategy based on the grasshopper optimization algorithm (GOA) and validated its efficiency in analyzing the vibration signals of real rotating machinery. Moreover, Gu et al. [26] adopted the grey wolf optimizer (GWO) algorithm, which was superior to the fixed-parameter VMD and the maximum weighted kurtosis optimization VMD in terms of selecting the optimal parameter combination. These studies show that optimization algorithms with strong search abilities can achieve better adaptive selection of VMD parameters. Recently, the MFO algorithm [27] has achieved excellent performance in solving engineering optimization problems [28,29]. Sivalingam et al. [30] compared MFO with other optimization algorithms, finding that MFO performed the best. However, the original MFO algorithm searches only in the region around each moth's unique flame, which increases the risk of falling into a local optimum and slows convergence. Therefore, the original MFO algorithm requires improvement. Although VMD can effectively solve the mode mixing problem, false intrinsic mode functions (IMFs) may still appear among the decomposed IMFs, which affects the accuracy of the fault classification of rolling bearings [31]. Therefore, it is necessary to screen the decomposed IMFs.
In addition to feature extraction, the swift and accurate identification of fault types is pivotal for fault diagnosis. The extreme learning machine (ELM) replaces the gradient descent algorithm with a random assignment method and enhances the generalization ability of traditional classification networks, making it effective for fault diagnosis [32]. Jiang et al. [33] proposed a fault diagnosis model based on multiscale weighted permutation entropy (MWPE) and ELM, which demonstrated superior recognition accuracy and speed compared to using various multiscale feature extraction methods with BPNNs and SVMs. Similarly, Lan et al. [34] achieved the diagnosis of slipper abrasion faults using ELM and demonstrated that the classification performance of ELM is superior to that of BP and SVM. Liang et al. [35] proposed OSELM to resolve the long training time of ELM when the training data set is large. This algorithm obviates the need to retrain on historical data, which facilitates rapid diagnosis. Sahani et al.'s use of VMD and OSELM for real-time detection and classification of power quality events provided high classification accuracy and robustness [36]. However, the randomly generated input weights and hidden layer biases in the OSELM algorithm have been found to limit its prediction accuracy and robustness. Hence, this paper introduces the ensemble DE-OSELM method, which was proposed by Zhou et al. [37], to realize fault classification and diagnosis.
This paper presents a novel approach for detecting rolling bearing faults, utilizing optimized VMD and DE-OSELM. Firstly, the method adopts a dynamic adaptive weight factor and a crossover operator. Compared with the original MFO algorithm and other intelligent optimization algorithms, the improved MFO algorithm has higher global optimization ability and search accuracy, and it alleviates the problems of sub-optimal performance and limited convergence accuracy of multi-objective optimization algorithms. Secondly, the improved MFO is used to optimize VMD and overcome the susceptibility of VMD parameters to manual settings, thus facilitating the adaptive selection of these parameters. Thirdly, since the vibration characteristics of the signal cannot be fully interpreted by a single index, this method introduces a new evaluation index, the effective weighted correlation sparsity index, to strip false modal components and filter the IMFs obtained through VMD decomposition. The energy features of the effective IMFs are subsequently extracted as feature vectors. Finally, to improve the classification accuracy, the energy eigenmatrix is normalized and subjected to DE-OSELM training to identify the fault type.

Basic Principle
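VMD is an adaptive signal processing technique that relies on the formulation and resolution of variational problems [38]. Essentially, the VMD algorithm partitions the input signal into K IMFs while simultaneously minimizing the sum of their estimated bandwidths. To achieve this, the algorithm operates under the constraint that the sum of the IMF components equals the original signal. The corresponding constrained variational problem can be expressed as:

$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k = f \tag{1}$$

where $u_k$ are the decomposed IMF components, and $\omega_k$ is the central frequency of each IMF component.
The problem is solved by introducing a penalty factor α and a Lagrange multiplier λ into the model, thereby transforming it into an unconstrained variational problem:

$$L(\{u_k\},\{\omega_k\},\lambda) = \alpha \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 + \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2 + \left\langle \lambda(t),\, f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle \tag{2}$$

where $f$ is the original signal. The steps for solving Equation (2) are as follows:
(1) Initialize the parameters $\hat{u}_k^1$, $\omega_k^1$, $\hat{\lambda}^1$, and set n = 0;
(2) Let n → n + 1, and update $u_k$, $\omega_k$, and λ iteratively according to Equations (3)-(5):

$$\hat{u}_k^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \hat{\lambda}(\omega)/2}{1 + 2\alpha(\omega - \omega_k)^2} \tag{3}$$

$$\omega_k^{n+1} = \frac{\int_0^{\infty} \omega \left| \hat{u}_k(\omega) \right|^2 d\omega}{\int_0^{\infty} \left| \hat{u}_k(\omega) \right|^2 d\omega} \tag{4}$$

$$\hat{\lambda}^{n+1}(\omega) = \hat{\lambda}^{n}(\omega) + \tau \left( \hat{f}(\omega) - \sum_{k=1}^{K} \hat{u}_k^{n+1}(\omega) \right) \tag{5}$$

where n is the number of iterations, ∧ denotes the Fourier transform, and τ is the noise tolerance.
(3) Repeat Step 2 until the components satisfy Equation (6), at which point the cycle ends and K IMFs are obtained:

$$\sum_{k=1}^{K} \frac{\left\| \hat{u}_k^{n+1} - \hat{u}_k^{n} \right\|_2^2}{\left\| \hat{u}_k^{n} \right\|_2^2} < \varepsilon \tag{6}$$

where ε is the discriminant accuracy.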
Appl. Sci. 2023, 13, 7500
Proper parameter selection is a prerequisite for VMD signal decomposition: the parameters K and α exert a major influence on the decomposition effect, whereas the parameters ε and τ play a less prominent role [39]. Hence, achieving the optimal VMD decomposition effect requires identifying suitable K and α values.
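As a rough illustration of the update rules referred to as Equations (3)-(6), the following NumPy sketch implements one mode update, one centre-frequency update, the multiplier update, and the stopping test on a discretised non-negative frequency axis. It assumes the standard VMD update forms; the function names and the array layout (one row per mode) are choices made for this example only. It is not a complete VMD implementation, which would also need signal mirroring and the outer loop over modes and iterations.

```python
import numpy as np

def update_mode(f_hat, others_sum, lam_hat, omega, omega_k, alpha):
    """Equation (3): Wiener-filter-like update of one mode spectrum."""
    return (f_hat - others_sum + lam_hat / 2.0) / (1.0 + 2.0 * alpha * (omega - omega_k) ** 2)

def update_center_frequency(u_hat, omega):
    """Equation (4): power-weighted mean of the non-negative frequencies."""
    power = np.abs(u_hat) ** 2
    return float(np.sum(omega * power) / np.sum(power))

def update_multiplier(lam_hat, f_hat, modes_sum, tau):
    """Equation (5): dual ascent on the reconstruction constraint."""
    return lam_hat + tau * (f_hat - modes_sum)

def has_converged(u_new, u_old, eps):
    """Equation (6): summed relative change of all modes (rows) is below eps."""
    num = np.sum(np.abs(u_new - u_old) ** 2, axis=1)
    den = np.sum(np.abs(u_old) ** 2, axis=1)
    return bool(np.sum(num / den) < eps)

# Toy usage: a flat spectrum filtered around an assumed centre frequency of 0.2.
omega = np.linspace(0.0, 0.5, 6)      # normalised frequency grid (hypothetical)
f_hat = np.ones(6, dtype=complex)     # toy input spectrum
u_hat = update_mode(f_hat, 0.0, 0.0, omega, omega_k=0.2, alpha=2000.0)
omega_new = update_center_frequency(u_hat, omega)
```

A larger α narrows the filter in `update_mode` around `omega_k`, which is why α controls the bandwidth of each extracted IMF.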

MFO Algorithm
Inspired by the natural phenomenon of moths being drawn to flames, the MFO algorithm assumes that the moth population flies around the flame population along a logarithmic spiral curve. If a new position is found to be better than the original flame position, the position is updated. The matrix M represents the initial moth positions, and the matrix OM represents the initial moth fitness values, as given below.

$$M = \begin{bmatrix} M_1 \\ M_2 \\ \vdots \\ M_n \end{bmatrix} = \begin{bmatrix} m_{1,1} & m_{1,2} & \cdots & m_{1,d} \\ m_{2,1} & m_{2,2} & \cdots & m_{2,d} \\ \vdots & \vdots & \ddots & \vdots \\ m_{n,1} & m_{n,2} & \cdots & m_{n,d} \end{bmatrix} \tag{7}$$

where $M_i$ is the position of the i-th moth in the solution space of the moth population M; n is the population size; and d is the dimension of the solution space.

$$OM = \begin{bmatrix} OM_1 & OM_2 & \cdots & OM_n \end{bmatrix}^{T} \tag{8}$$

where $OM_i$ is the fitness value of the i-th moth. Each moth has a unique flame corresponding to it. The flame represents the local optimal solution found by each moth in the search process. The flame positions are represented by the matrix F, and the matrix OF represents the flame fitness values.

$$F = \begin{bmatrix} F_1 \\ F_2 \\ \vdots \\ F_n \end{bmatrix} = \begin{bmatrix} f_{1,1} & f_{1,2} & \cdots & f_{1,d} \\ f_{2,1} & f_{2,2} & \cdots & f_{2,d} \\ \vdots & \vdots & \ddots & \vdots \\ f_{n,1} & f_{n,2} & \cdots & f_{n,d} \end{bmatrix} \tag{9}$$

where $F_i$ is the position of the i-th flame in the solution space of the flame population F; n is the size of the flame population; and d is the dimension of the solution space.

$$OF = \begin{bmatrix} OF_1 & OF_2 & \cdots & OF_n \end{bmatrix}^{T} \tag{10}$$

where $OF_i$ is the fitness value of the i-th flame. Due to phototaxis, moth $M_i$ moves towards the corresponding flame $F_j$ along a logarithmic spiral curve. The movement formula is defined as Equation (11), and $S(M_i, F_j)$ represents the updated position of the moth.

$$S(M_i, F_j) = D_i \cdot e^{bt} \cdot \cos(2\pi t) + F_j, \qquad D_i = \left| F_j - M_i \right| \tag{11}$$

where $M_i$ is the i-th moth; $F_j$ is the j-th flame; $D_i$ is the distance between the i-th moth and the j-th flame; b is a constant related to the shape of the spiral function; and t is a random number in the interval [−1, 1].
The algorithm adaptively reduces the number of flames based on Equation (12), thereby enhancing its efficiency and ensuring that the moth population converges towards the optimal flame.

$$flameno = \mathrm{round}\left( N - l \cdot \frac{N - 1}{T} \right) \tag{12}$$

where flameno is the current number of flames; N is the size of the original flame population; l is the current iteration number; and T is the maximum number of iterations.
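The spiral move and the flame-count schedule can be sketched in a few lines of plain Python. The sketch assumes the standard MFO forms S(M_i, F_j) = D_i · e^{bt} · cos(2πt) + F_j with D_i = |F_j − M_i| for Equation (11) and flameno = round(N − l · (N − 1)/T) for Equation (12); the function names and default arguments are illustrative only.

```python
import math

def spiral_update(moth, flame, b=1.0, t=0.0):
    """Equation (11): move a moth around its flame on a logarithmic spiral.

    moth and flame are equal-length lists of coordinates; t is normally drawn
    at random from [-1, 1] for every update.
    """
    return [abs(f - m) * math.exp(b * t) * math.cos(2.0 * math.pi * t) + f
            for m, f in zip(moth, flame)]

def flame_count(l, N, T):
    """Equation (12): number of flames kept at iteration l out of T."""
    return round(N - l * (N - 1) / T)

# Usage: the flame count shrinks from N at the start to 1 at the last iteration.
schedule = [flame_count(l, N=30, T=1000) for l in (0, 500, 1000)]
new_position = spiral_update([1.0, -2.0], [3.0, 0.5], b=1.0, t=0.3)
```

Shrinking the number of flames forces all moths to circle fewer and fewer candidate solutions, which trades exploration for exploitation as the iterations proceed.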
In the MFO algorithm, after initialization each moth searches mainly in the region around its corresponding flame, so if a flame lies in a local optimum, the moths that follow it are drawn into that local optimum as well. Thus, the original MFO struggles to jump out of local optima, which leads to relatively low convergence accuracy and slow convergence speed.

The Improved MFO Algorithm
To address the tendency toward local optima, the low convergence accuracy, and the slow convergence speed of the original algorithm, the MFO is modified by using both a dynamic adaptive weight factor and a crossover operator to improve its global optimization ability, accuracy, and efficiency.
In the original MFO algorithm, the moth population flies around the flame population along a logarithmic spiral curve, and the flame information is not fully utilized, which makes the algorithm fall into local optima easily. To improve its global optimization ability and convergence speed, the dynamic adaptive weight factor µ is introduced into the position updating strategy of the moths, as formulated below.
where l represents the current iteration number, and T represents the maximum iteration number. The updated moth position combined with the adaptive weight factor µ is given in Equation (14).
As the iterations proceed, µ gradually decreases from 1 to 0, which results in an enlarged search scope during the initial stages. Consequently, the search and global optimization abilities, accuracy, and efficiency of the algorithm improve significantly.
To help the algorithm escape local optima, the positions of the first m flames are perturbed by applying the crossover operator from the genetic algorithm. The main idea is to perform p rounds of cross recombination on each of the first m flames: the data in each dimension of the flame are recombined with the corresponding dimension data of other flames to form a new flame. If the fitness value of the new flame is better than that of the original flame, the original flame is replaced. By perturbing the first m optimal flames, the flame population maintains a relatively high diversity and can jump out of a local optimum with a certain probability.
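One possible reading of the crossover step is sketched below. The text does not specify how the partner flame is chosen or how dimensions are mixed, so the uniform crossover with a randomly chosen partner used here is an assumption, as are the function names and the sphere test function. Only the greedy rule (replace a flame when the new one has better fitness) is taken directly from the description.

```python
import random

def crossover_flames(flames, fitness, m, p, rng):
    """Greedy crossover on the first m flames (flames are sorted best-first).

    Each of the first m flames is recombined p times with a randomly chosen
    partner; a child replaces its parent only if its fitness is lower.
    """
    flames = [list(f) for f in flames]          # work on a copy
    scores = [fitness(f) for f in flames]
    for i in range(min(m, len(flames))):
        for _ in range(p):
            j = rng.randrange(len(flames))      # partner flame (assumed: uniform choice)
            child = [a if rng.random() < 0.5 else b
                     for a, b in zip(flames[i], flames[j])]
            child_score = fitness(child)
            if child_score < scores[i]:         # minimisation: keep only improvements
                flames[i], scores[i] = child, child_score
    return flames, scores

def sphere(x):
    """Hypothetical fitness function used only for this demonstration."""
    return sum(v * v for v in x)

rng = random.Random(0)
flames = sorted(([rng.uniform(-5, 5) for _ in range(4)] for _ in range(6)), key=sphere)
new_flames, new_scores = crossover_flames(flames, sphere, m=3, p=5, rng=rng)
```

Because replacement is greedy, the fitness of each flame can never get worse, so the perturbation adds diversity without discarding the best solutions found so far.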
The process for improving the MFO is shown in Figure 1.

Verification of the Algorithm
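To verify the effectiveness and superiority of the improved MFO algorithm, four commonly used benchmark functions were selected for testing, each with a dimension of n = 30. The test functions are given as follows.

(1) Schwefel's Problem 1.2 function
$$f_1(x) = \sum_{i=1}^{n} \left( \sum_{j=1}^{i} x_j \right)^2$$
The optimal value of this function is 0.
(2) Schwefel's Problem 2.22 function
$$f_2(x) = \sum_{i=1}^{n} \left| x_i \right| + \prod_{i=1}^{n} \left| x_i \right|$$
The optimal value of this function is 0.
(3) Sum Squares function
$$f_3(x) = \sum_{i=1}^{n} i\, x_i^2$$
The optimal value of this function is 0.
(4) Ackley function
$$f_4(x) = -20 \exp\left( -0.2 \sqrt{\frac{1}{n} \sum_{i=1}^{n} x_i^2} \right) - \exp\left( \frac{1}{n} \sum_{i=1}^{n} \cos(2\pi x_i) \right) + 20 + e$$
The optimal value of this function is 0.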
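For readers who want to reproduce the benchmark, the four test functions can be written in plain Python as follows. The sketch assumes the standard textbook definitions associated with these function names; the search ranges used in the experiments are not stated here and are therefore left out.

```python
import math

def schwefel_1_2(x):
    """Schwefel's Problem 1.2: sum of squared prefix sums."""
    total, prefix = 0.0, 0.0
    for v in x:
        prefix += v
        total += prefix * prefix
    return total

def schwefel_2_22(x):
    """Schwefel's Problem 2.22: sum plus product of absolute values."""
    return sum(abs(v) for v in x) + math.prod(abs(v) for v in x)

def sum_squares(x):
    """Sum Squares: coordinates weighted by their 1-based index."""
    return sum((i + 1) * v * v for i, v in enumerate(x))

def ackley(x):
    """Ackley function with the usual constants 20, 0.2 and 2*pi."""
    n = len(x)
    mean_sq = sum(v * v for v in x) / n
    mean_cos = sum(math.cos(2.0 * math.pi * v) for v in x) / n
    return -20.0 * math.exp(-0.2 * math.sqrt(mean_sq)) - math.exp(mean_cos) + 20.0 + math.e

# All four functions attain their optimum value 0 at the origin.
origin = [0.0] * 30
values_at_origin = [f(origin) for f in (schwefel_1_2, schwefel_2_22, sum_squares, ackley)]
```

The first three functions are unimodal and mainly test convergence accuracy, whereas Ackley is multimodal and tests the ability to escape local optima.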

In the verification, the moth population consists of 30 individuals, with a maximum of 1000 iterations; the parameters m and p in the crossover operator are set to 15 and 5, respectively, which means that the first 15 flame positions after sorting are perturbed five times. Each test function was run 20 times, and the results are compared with those obtained from the original MFO, GWO, and PSO optimization algorithms. A summary of the comparison results is shown in Table 1.
The iterative optimization convergence curves of MFO, GWO, and PSO are shown in Figure 2. Additionally, Table 1 illustrates that the improved MFO algorithm exhibits superior optimization capability compared to the three other algorithms. It achieves the highest optimization accuracy across the four test functions and is able to identify the global optimal values. These findings indicate that the algorithm improvement was effective.
Modal aliasing is likely to occur if the VMD decomposition is performed with improperly chosen parameters [40]. To prevent the influence of manually set parameters, the improved MFO algorithm is applied to the parameter selection process of VMD to achieve the adaptive selection of its critical parameters.
This paper employs envelope entropy to describe the sparsity characteristics of signals. A smaller envelope entropy indicates more regular fault pulses [27]. Thus, the minimum average envelope entropy (MAEE) is utilized as the fitness function, which is defined as:

$$\mathrm{MAEE} = \min_{(K,\alpha)} \left\{ \frac{1}{K} \sum_{i=1}^{K} H_{en}(i) \right\}, \qquad H_{en}(i) = -\sum_{n=1}^{N} p_i(n) \log p_i(n), \qquad p_i(n) = \frac{a_i(n)}{\sum_{n=1}^{N} a_i(n)}$$

where $a_i(n)$ is the envelope signal of the i-th modal component, $p_i(n)$ is the normalized form of $a_i(n)$, $H_{en}(i)$ is the envelope entropy value of the i-th modal component, and N is the number of samples. After the original signal has been subjected to VMD decomposition, it gives rise to several modal components. The efficacy of the decomposition can be evaluated by determining the average envelope entropy of the modal components. A decrease in the average envelope entropy indicates a reduction in noise in the decomposed modal components, yielding more consistent fault pulses.
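The fitness evaluation can be sketched with NumPy as below. The envelope is taken as the magnitude of the analytic signal computed through an FFT-based Hilbert transform, and the natural logarithm is used; both the logarithm base and the function names are assumptions of this sketch. The MAEE is then the minimum of `average_envelope_entropy` over the candidate (K, α) pairs.

```python
import numpy as np

def envelope(x):
    """Magnitude of the analytic signal (FFT-based Hilbert transform)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    gain = np.zeros(n)
    gain[0] = 1.0                       # keep the DC term once
    if n % 2 == 0:
        gain[n // 2] = 1.0              # keep the Nyquist term once
        gain[1:n // 2] = 2.0            # double the positive frequencies
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * gain))

def envelope_entropy(x):
    """Shannon entropy of the normalised envelope of one modal component."""
    a = envelope(x)
    p = a / np.sum(a)
    p = p[p > 0]                        # skip zero entries to avoid log(0)
    return float(-np.sum(p * np.log(p)))

def average_envelope_entropy(imfs):
    """Mean envelope entropy over the modal components of one decomposition."""
    return float(np.mean([envelope_entropy(u) for u in imfs]))

# Toy comparison: a steady tone has a flat envelope, a decaying burst does not.
n = 1024
t = np.arange(n)
tone = np.cos(2.0 * np.pi * 8 * t / n)
burst = np.exp(-0.05 * t) * np.cos(2.0 * np.pi * 0.2 * t)
fitness = average_envelope_entropy([tone, burst])
```

A flat envelope gives the maximum entropy log(N), while an impulsive component concentrates the envelope in a few samples and lowers the entropy, which is why a smaller value points to clearer fault pulses.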
The specific process of the MFO-optimized VMD proposed in this paper is as follows:
(1) The moth population and flame matrix are initialized, and the MAEE is used as the fitness function.
(2) VMD decomposition is performed at the position of each moth to obtain the fitness value of each moth. The moth population is sorted according to the fitness value, and the first flameno flames are selected to construct the flame matrix.
(3) Each of the first m flames is cross-recombined p times with an arbitrary flame; if the fitness value of the new flame is better than that of the original flame, the original flame is replaced.
(4) Equation (14) is used to update the moth positions, and VMD decomposition is performed on the signal at the new position of each moth to calculate its fitness value and update the moth and flame populations.
(5) The algorithm checks whether the predetermined number of iterations has been reached. If not, steps 2 to 4 are repeated; otherwise, the iteration stops, and the optimal parameter combination is output.

IMF Screening Based on the EWCS
When the value of K is greater than the number of components in the signal, over-decomposition occurs. False IMFs may therefore exist among the K IMF components obtained when the original signal is decomposed by the MFO-optimized VMD method. These false components can adversely affect subsequent fault classification, so it is necessary to remove them. While existing methods frequently employ correlation coefficients or kurtosis criteria to separate false IMFs [41], these indicators may not fully capture the intricate vibration characteristics of signals [42]. To address this limitation, we propose a novel evaluation index, the EWCS, to screen IMFs. A modal component with a calculated EWCS greater than 0 is identified as an effective one. The EWCS calculation involves both the correlation coefficient and the sparsity of the IMF, providing a more comprehensive assessment of its suitability. The EWCS is calculated using the formula:
where Spa is the sparsity of signal x(t), and Cor is the correlation coefficient between signals x and y.

Simulated Analysis
A mathematical model was utilized to simulate the signal of a rolling bearing inner ring fault in order to evaluate the effectiveness of the MFO-optimized VMD method and the EWCS index. The simulation signal $x(t)$ was synthesized by adding Gaussian white noise $n(t)$ to a periodic impact component $s(t)$. The sampling frequency ($f_s$) was set to 12,000 Hz, while the rotation frequency ($f_r$), the structural resonance frequency ($f_n$), the fault frequency ($f_i$), and the damping ratio ($C$) were set to 33 Hz, 3000 Hz, 79 Hz, and 500, respectively. Additionally, the signal-to-noise ratio (SNR) was set to −5 dB. The resulting simulation signal is plotted in both the time domain and the envelope spectrum in Figure 3.
Figure 3 shows that the periodic impact characteristics resulting from the structural failure are buried in noise, which makes it difficult to extract them precisely and directly. As such, signal processing techniques are necessary to extract the impact characteristics.
The optimal parameter combination for VMD was determined using the improved MFO. We set the population size to 20 and the maximum number of iterations to 10, with m and p both set to 10. The outcome of the parameter search is illustrated in Figure 4.
Figure 4 demonstrates that the MAEE value of 9.2493800, achieved with the parameter combination <3, 773>, was obtained on the third iteration. To demonstrate the efficiency of the proposed improved MFO algorithm, we compared its performance to that of three other popular optimization algorithms: the original MFO, GWO, and PSO. Each algorithm had an initial population size of 20 and was restricted to a maximum of 10 iterations. Each algorithm was run 10 times, and Table 2 presents the results obtained. Table 2 reveals that GWO exhibits the highest MAEE value amongst the four algorithms across the ten runs. Although the MAEE values obtained by the PSO, MFO, and improved MFO algorithms are equivalent, the improved MFO algorithm has the smallest mean square error and standard deviation across all results, demonstrating its efficacy.
The optimal parameter combination of <3, 773> was used to decompose the simulation signal by VMD. To validate the proposed EWCS index, we computed the Peak Signal-to-Noise Ratio (PSNR), Mean Squared Error (MSE), and EWCS value of the three IMFs obtained from the decomposition. The computed results are presented in Table 3, where it can be observed that IMF2 exhibits the highest PSNR and the smallest MSE, implying that this IMF contains abundant fault information. This observation is consistent with the outcome of the proposed method, which screens effective IMFs based on the EWCS value, and it substantiates the validity of the proposed method. The effective component IMF2, after being screened, underwent envelope demodulation analysis, which yielded the envelope spectrum displayed in Figure 5. The extracted frequency components included the rotation frequency ($f_r$), the fault characteristic frequency ($f_i$), and its second harmonic ($2f_i$). These results demonstrate that the VMD optimization method proposed in this paper, in combination with EWCS index screening, effectively decomposes the simulation signal $x(t)$.
To verify the superiority of the proposed method, we employed VMD decomposition with fixed parameters of <5, 2000> and identified IMF3, IMF4, and IMF5 as effective components based on the EWCS (Method 1). Additionally, we utilized the screening method based on the correlation coefficient and kurtosis value presented in reference [43], along with a <3, 773> VMD decomposition, and identified IMF1 and IMF2 as effective components (Method 2). Subsequently, we reconstructed the selected effective components and analyzed them using envelope demodulation; the outcomes are illustrated in Figure 6.
Figure 6 indicates that Method 1 and Method 2 were unable to extract $f_r$ and $f_i$. By contrast, the VMD optimization method and EWCS index utilized in the proposed method were successful in extracting signal components that carried the information on system faults. This finding reinforces the effectiveness of the VMD optimization method and the EWCS index in fault feature extraction.
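A signal of the kind used in this simulation can be generated with the NumPy sketch below. The exact impact model is not reproduced here, so the decaying-cosine impulse response, the amplitude modulation at the rotation frequency, the one-second duration, and all function and parameter names are assumptions of this example; only the numerical settings (12,000 Hz, 33 Hz, 3000 Hz, 79 Hz, 500, and −5 dB) come from the text.

```python
import numpy as np

def simulate_inner_race_fault(n_samples=12000, fs=12000.0, f_r=33.0, f_n=3000.0,
                              f_i=79.0, c=500.0, snr_db=-5.0, seed=0):
    """Return (t, s, noise, x) with x = s + noise at the requested SNR."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_samples) / fs
    tau = np.mod(t, 1.0 / f_i)                              # time since the latest impact
    amplitude = 1.0 + 0.5 * np.cos(2.0 * np.pi * f_r * t)   # assumed modulation at f_r
    s = amplitude * np.exp(-c * tau) * np.cos(2.0 * np.pi * f_n * tau)
    noise = rng.standard_normal(n_samples)
    # Rescale the noise so that 10*log10(P_signal / P_noise) equals snr_db.
    target_noise_power = np.mean(s ** 2) / 10.0 ** (snr_db / 10.0)
    noise *= np.sqrt(target_noise_power / np.mean(noise ** 2))
    return t, s, noise, s + noise

t, s, noise, x = simulate_inner_race_fault()
measured_snr_db = 10.0 * np.log10(np.mean(s ** 2) / np.mean(noise ** 2))
```

At −5 dB the noise power is roughly three times the signal power, so the impacts are not visible in the raw waveform and only become apparent after decomposition and envelope analysis.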

The Proposed Method
To begin with, the improved MFO is utilized to enable adaptive selection of the VMD parameters, thereby avoiding the influence of manually set parameters on the VMD decomposition effect. The decomposed IMFs are then screened using the EWCS index. Given that the energy of the individual IMF components of the vibration signal differs with the operating state of the rolling bearing, the energy attributes of these components are extracted to build a feature matrix. Lastly, the energy eigenmatrix is input into DE-OSELM to perform fault classification using the principles outlined in reference [37]. Figure 7 shows the flow chart of the proposed method. The specific process is as follows:
(1) The improved MFO is used to optimize K and α in the VMD parameters, and the optimal parameter combination $(k_0, \alpha_0)$ is obtained.
(2) The VMD with optimized parameters is used to process the collected signals, which are decomposed into K IMFs.
(3) The EWCS index is used to screen the K IMFs, eliminate the false IMFs, extract the energy features of the effective IMFs, and form the feature vector matrix.
(4) The energy feature matrix is normalized and input into DE-OSELM for training and fault classification.
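Step (3) of the process above reduces each effective IMF to a single energy value. A minimal sketch in plain Python is shown below; the energy is taken as the sum of squared samples, and the normalization by total energy is an assumption, since the normalization scheme is not specified. The function names are illustrative.

```python
def imf_energy(imf):
    """Energy of one IMF: the sum of its squared samples."""
    return sum(v * v for v in imf)

def energy_feature_vector(effective_imfs):
    """Energies of the effective IMFs, scaled so that they sum to 1."""
    energies = [imf_energy(u) for u in effective_imfs]
    total = sum(energies)
    return [e / total for e in energies]

# Usage: two toy "IMFs" give the feature vector [1/6, 5/6].
features = energy_feature_vector([[1.0, 2.0], [3.0, 4.0]])
```

Stacking one such vector per signal segment, together with its class label, gives the feature matrix that is passed to the classifier.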

Experimental Verification
To assess the effectiveness and superiority of MFO-optimized VMD and DE-OSELM for detecting faults in rolling bearings, we conducted an experiment using actual operation data of the 6205 deep groove ball bearing type obtained from Western Reserve University's laboratory [44].As illustrated in Figure 8, the simulation test bench for rolling bearing faults features a pitch circle diameter of 39.04 mm, as well as nine rolling bodies with a diameter of 7.94 mm each and 0°contact angle.The vibration signals were collected under four operating states: normal, inner ring failure, outer ring failure, and rolling body failure.We shifted 100 groups of vibration signals for each state, with each group consisting of 1024 signal lengths, resulting in a total of 400 groups.The specific process is listed as follows: (1) The improved MFO was used to optimize K and α in VMD parameters, and the optimal parameter combination (k 0 , α 0 ) was obtained.(2) The VMD with optimized parameters was used to process the collected signals, and K IMFs were decomposed.(3) The EWCS index was used to screen K IMFs, eliminate the false IMFs, extract the energy features of effective IMFs, and form the feature vector matrix.(4) The energy feature matrix is normalized and input into DE-OSELM for training and fault classification.

Ten groups of normal signals were randomly selected for the experiment. The improved MFO algorithm was initialized with a population size of 30. The maximum number of iterations was set to 10, with m = 10 and p = 5. The search ranges for K and α were [4, 12] and [800, 3000], respectively. Figure 9 shows the search results; the optimized values of K and α were 10 and 924, respectively.
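For orientation, a parameter search of this kind can be sketched with the standard MFO update (logarithmic spiral around sorted flames, with a decreasing number of flames). This is not the improved MFO of this paper: the adaptive weight factor and the crossover operator are omitted, and `toy_cost` is a purely hypothetical stand-in for the VMD-based fitness, with an arbitrarily placed minimum.

```python
import numpy as np

def mfo(fitness, lower, upper, n_moths=30, max_iter=10, b=1.0, seed=0):
    """Standard moth-flame optimization (minimization), as a sketch."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    moths = rng.uniform(lower, upper, size=(n_moths, dim))
    flames, flame_fit = None, None
    for it in range(1, max_iter + 1):
        moths = np.clip(moths, lower, upper)          # keep moths in bounds
        fit = np.array([fitness(m) for m in moths])
        if flames is None:
            pool, pool_fit = moths, fit
        else:                                         # merge old flames and moths
            pool = np.vstack([flames, moths])
            pool_fit = np.concatenate([flame_fit, fit])
        order = np.argsort(pool_fit)[:n_moths]        # best positions become flames
        flames, flame_fit = pool[order].copy(), pool_fit[order].copy()
        n_flames = int(round(n_moths - it * (n_moths - 1) / max_iter))
        a = -1.0 - it / max_iter                      # decreases from -1 to -2
        for i in range(n_moths):
            j = min(i, n_flames - 1)                  # surplus moths share the last flame
            dist = np.abs(flames[j] - moths[i])
            t = (a - 1.0) * rng.random(dim) + 1.0     # t drawn from [a, 1]
            moths[i] = dist * np.exp(b * t) * np.cos(2 * np.pi * t) + flames[j]
    return flames[0], flame_fit[0]

def toy_cost(pos):
    """Hypothetical fitness; a real one would run VMD with (K, alpha)."""
    k = int(round(pos[0]))                            # K must be an integer
    alpha = pos[1]
    return (k - 6) ** 2 + ((alpha - 2000.0) / 1000.0) ** 2

best_pos, best_fit = mfo(toy_cost, lower=[4, 800], upper=[12, 3000])
best_k, best_alpha = int(round(best_pos[0])), float(best_pos[1])
print(best_k, best_alpha, best_fit)
```

The population size of 30, the 10 iterations, and the search ranges follow the settings given in the text; everything else in the sketch is illustrative.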
The EWCS index was used to screen the decomposed IMFs, and the EWCS values of each modal component for the ten groups of signals are presented in Table 4. Based on the screening results for the ten groups of signals, IMF1, IMF2, and IMF3 were identified as the effective components. VMD decomposition was then performed with K = 10 and α = 924 on the 400 sets of vibration signals in the four states. The energy features of IMF1, IMF2, and IMF3 were extracted, and the resulting feature matrix was normalized. Adding labels produced a 400 × 4 feature matrix. This matrix was randomly partitioned into training and test sets at a ratio of 3:1, giving 300 training samples and 100 test samples. After VMD decomposition, the components of the reconstructed signal were simpler, and the fault frequency was more obvious. Figure 10 shows the spectra of the original signal and the reconstructed signal for the rolling element fault.
The parameter settings of the DE algorithm were NP = 20, F = 0.5, and CR = 0.75 [37]. The performance of the OSELM algorithm is affected by the choice of activation function. Because training OSELM with RBF activation takes about twice as long [32], the Sigmoid function was selected as the activation function to shorten the training time. The number of hidden-layer nodes and the bias range were set to 25 and [0, 1], respectively. The input weights were constrained to [−1, 1]. Initially, 50 training samples were used, and each subsequent learning block contained 40 samples. OSELM and DE-OSELM were both implemented to investigate the robustness of the latter, with 30 experiments conducted for each. The results in Figure 11 show that DE-OSELM achieves higher classification accuracy than OSELM, indicating better robustness.
To substantiate the efficacy of this approach, the features obtained from the fixed-parameter <8, 2000> VMD and the MFO-optimized VMD were fed into several classifiers, namely ELM, OSELM, KNN, and DE-OSELM. We repeated the experiment 30 times to ensure reliability. The classification results of the proposed approach and the other three methods are shown in Table 5. The accuracy of the proposed method is notably higher than that of the other three methods, affirming its efficacy and superiority.
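The sequential learning behind OSELM can be sketched with the usual recursive least-squares update of the output weights, using a Sigmoid hidden layer of 25 nodes, an initial block of 50 samples, and subsequent blocks of 40 samples. The class name, the small ridge term (added here only for numerical stability), and the synthetic data are assumptions; the DE-based optimization used in DE-OSELM is not included.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class OSELM:
    """Online sequential ELM with a Sigmoid hidden layer (sketch)."""

    def __init__(self, n_inputs, n_hidden=25, ridge=1e-2, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.uniform(-1.0, 1.0, size=(n_inputs, n_hidden))  # input weights
        self.b = rng.uniform(0.0, 1.0, size=n_hidden)               # hidden biases
        self.ridge = ridge   # assumed regularization, not taken from the paper
        self.P = None
        self.beta = None

    def hidden(self, X):
        return sigmoid(X @ self.W + self.b)

    def fit_initial(self, X0, T0):
        H = self.hidden(X0)
        self.P = np.linalg.inv(H.T @ H + self.ridge * np.eye(H.shape[1]))
        self.beta = self.P @ H.T @ T0

    def partial_fit(self, X, T):
        H = self.hidden(X)
        S = np.eye(H.shape[0]) + H @ self.P @ H.T
        self.P = self.P - self.P @ H.T @ np.linalg.solve(S, H @ self.P)
        self.beta = self.beta + self.P @ H.T @ (T - H @ self.beta)

    def predict(self, X):
        return np.argmax(self.hidden(X) @ self.beta, axis=1)

# Synthetic placeholder data: 300 samples, 3 energy features, 4 classes
rng = np.random.default_rng(1)
X = rng.random((300, 3))
y = rng.integers(0, 4, size=300)
T = np.eye(4)[y]                        # one-hot targets

model = OSELM(n_inputs=3)
model.fit_initial(X[:50], T[:50])       # initial block of 50 samples
for start in range(50, 300, 40):        # later blocks of up to 40 samples
    model.partial_fit(X[start:start + 40], T[start:start + 40])

# Reference: batch ridge solution over all 300 samples
H_all = model.hidden(X)
beta_batch = np.linalg.solve(H_all.T @ H_all + model.ridge * np.eye(25),
                             H_all.T @ T)
pred = model.predict(X)
```

With this initialization, the sequentially updated output weights should match the batch solution over the same data up to rounding error, which is why the comparison with `beta_batch` is included.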

Diagnostic Case Analysis
The viability and effectiveness of the proposed approach were further evaluated using the drivetrain dynamics simulator (DDS) manufactured by SpectraQuest [45], shown in Figure 12. An ER-16K rolling bearing was used to obtain the vibration signals. The bearing had a pitch circle diameter of 15.16 mm, nine rolling elements with a diameter of 3.125 mm, and a contact angle of 0°. The experiment was conducted with a motor frequency of 20 Hz, zero load, a sampling frequency of 12,800 Hz, and a sampling length of 200 KiB. Vibrations were captured under four operating conditions: normal operation, inner ring failure, outer ring failure, and rolling body failure. A total of 400 signals were recorded, consisting of 100 groups of vibration signals for each of the four operating conditions, with a signal length of 1024 samples per group.
Ten groups of normal signals were randomly selected for this study, and the number of decomposition layers and the quadratic penalty factor were optimized with the improved MFO algorithm, giving an optimal combination of <10, 1535>. The VMD algorithm was then applied to decompose each signal using this parameter combination, after which the EWCS value of each IMF was calculated. The findings are presented in Table 6. After VMD decomposition, the components of the reconstructed signal were simpler, and the fault frequency was more obvious. Figure 13 shows the spectra of the original signal and the reconstructed signal for the rolling element fault.
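To indicate where fault frequencies would be expected in such spectra, the standard kinematic formulas for bearing characteristic frequencies can be evaluated from the geometry quoted above. The sketch assumes that the shaft rotates at the motor frequency of 20 Hz with a rotating inner ring and a fixed outer ring; the function name and dictionary keys are illustrative.

```python
import math

def bearing_fault_frequencies(fr, n_balls, d_ball, d_pitch, contact_deg=0.0):
    """Characteristic frequencies (Hz) for a fixed outer ring.

    fr : shaft rotation frequency in Hz (assumed equal to the motor frequency)
    """
    ratio = d_ball / d_pitch * math.cos(math.radians(contact_deg))
    return {
        "BPFO": 0.5 * n_balls * fr * (1.0 - ratio),            # outer ring fault
        "BPFI": 0.5 * n_balls * fr * (1.0 + ratio),            # inner ring fault
        "BSF": 0.5 * d_pitch / d_ball * fr * (1.0 - ratio ** 2),  # ball spin
        "FTF": 0.5 * fr * (1.0 - ratio),                       # cage frequency
    }

freqs = bearing_fault_frequencies(fr=20.0, n_balls=9,
                                  d_ball=3.125, d_pitch=15.16)
resolution = 12800 / 1024   # spectral resolution of one 1024-point group, in Hz

for name, value in freqs.items():
    print(f"{name}: {value:.2f} Hz")
print(f"frequency resolution: {resolution} Hz")
```

One practical consequence is that a single 1024-point group sampled at 12,800 Hz resolves the spectrum in steps of 12.5 Hz, so closely spaced characteristic frequencies may fall into neighbouring bins.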
The feature matrix was input into both the OSELM and DE-OSELM classification models, and their classification results are presented in Figure 14. DE-OSELM achieves higher classification accuracy than OSELM, which further supports the effectiveness and generality of DE-OSELM.
To validate the efficacy of the proposed technique, the features from both the fixed-parameter <7, 2000> VMD and the MFO-optimized VMD were input into four classification algorithms (ELM, OSELM, KNN, and DE-OSELM) for classification and recognition. Table 7 lists the classification results, which show that the proposed MFO-optimized VMD and DE-OSELM diagnostic method achieves greater accuracy and reliability than the other three methods.
The proposed method was also compared with two existing fault diagnosis methods, one based on EMD and OSELM [46] and one based on VMD and KNN [47]; the results are shown in Table 8. Their classification accuracies are 92.7% and 97.13%, respectively, and,

Figure 1. The flowchart of the improved MFO.

Figure 3. (a) Time-domain diagram of the simulation signal. (b) Frequency-domain diagram of the simulation signal.

Figure 4. The improved MFO convergence curve of the simulation signal.

Figure 5. The effective component envelope spectrum.

Figure 7. The fault diagnosis process based on the MFO-optimized VMD and DE-OSELM.

Figure 8. The rolling bearing fault simulation test bench.

Figure 9. The improved MFO convergence curve of the bearing signal.
Figure 10. (a) Original signal; (b) reconstructed signal.

Figure 11. Comparison of the DE-OSELM and OSELM classification results.

Figure 14. A comparison of the DE-OSELM and OSELM classification results.

Table 1. Test results of the four algorithms.

Table 2. Results of four optimization algorithms.

Table 3. The selection of effective IMF components.

Table 4. The value of EWCS of each IMF for 10 groups of normal signals.

Table 5. The comparison of classification accuracy of the four methods.

Table 7. Comparison of classification accuracy of the four methods.