A New Method Based on Time-Varying Filtering Intrinsic Time-Scale Decomposition and General Refined Composite Multiscale Sample Entropy for Rolling-Bearing Feature Extraction

The early fault diagnosis of rolling bearings has always been a difficult problem due to the interference of strong noise. This paper proposes a new method of early fault diagnosis for rolling bearings with entropy participation. First, a new signal decomposition method is proposed in this paper: intrinsic time-scale decomposition based on time-varying filtering. It is introduced into the framework of complete ensemble intrinsic time-scale decomposition with adaptive noise (CEITDAN). Compared with traditional intrinsic time-scale decomposition, intrinsic time-scale decomposition based on time-varying filtering can improve frequency-separation performance. It has strong robustness in the presence of noise interference. However, decomposition parameters (the bandwidth threshold and B-spline order) have significant impacts on the decomposition results of this method, and they need to be artificially set. Aiming to address this problem, this paper proposes rolling-bearing fault diagnosis optimization based on an improved coyote optimization algorithm (COA). First, the minimal generalized refined composite multiscale sample entropy parameter was used as the objective function. Through the improved COA algorithm, optimal intrinsic time-scale decomposition parameters based on time-varying filtering that match the input signal are obtained. By analyzing generalized refined composite multiscale sample entropy (GRCMSE), whether the mode component is dominated by the fault signal is determined. The signal is reconstructed and decomposed again. Finally, the mode component with the highest energy in the central frequency band is selected for envelope spectrum variation for fault diagnosis. Lastly, simulated and experimental signals were used to verify the effectiveness of the proposed method.


Introduction
Rolling bearings are a key component in mechanical equipment. Their working status affects the operation of the entire equipment. The existence of bearing defects inevitably affects the dynamic performance of the equipment, and even damages the whole machine. Therefore, rolling-bearing fault monitoring and identification can allow for understanding the development of bearing faults. Maintenance strategies need to be developed in advance for minor faults that appear early, so as to avoid economic losses and accidents caused by equipment damage or sudden shutdowns [1][2][3]. Current research on bearing failures is mainly based on the analysis of vibration signals. Information that can characterize the running state of bearings is extracted from signals and through pattern-recognition algorithms. Bearing fault monitoring and diagnosis can then be realized. Currently, there are several signal-processing methods that are widely used: wavelet transform [4,5], stochastic resonance [6,7], and empirical mode decomposition (EMD) [8,9]. However, in wavelet transform, the choice of basic function and potential function parameters in stochastic resonance has great influence on signal analysis. Therefore, it cannot be defined as an adaptive signal-processing method. Empirical mode decomposition can be based on the signal itself. However, its modal-aliasing phenomenon seriously affects its application in actual signals.
Intrinsic time-scale decomposition (ITD) is an adaptive nonstationary nonlinear signalprocessing method. Time-frequency analysis based on the intrinsic time-scale decomposition can quantitatively describe the relationship between frequency and time, accurately analyzing time-varying signals [10]. On the basis of these advantages, scholars introduced this method from the medical field to the fault diagnosis of mechanical signals [11][12][13][14][15][16][17][18][19][20][21][22]. For example, Lin and Chang published a rolling-bearing fault diagnosis method based on an enhanced kurtosis spectrum and intrinsic time-scale decomposition [11]. Duan and Yao et al. proposed a comprehensive eigentime decomposition method for the fault diagnosis of a gearbox under variable operating conditions [12]. Xiang and Qu et al. proposed intrinsic time-scale decomposition and singular-value decomposition for variable-condition gear fault diagnosis [13]. Zhang and Liu et al. proposed a complete diesel-engine fault-diagnosis method integrated with intrinsic time-scale decomposition [14]. Hu and Xiang et al. proposed ensemble intrinsic time-scale decomposition to the fault diagnosis of fan gear [15]. Tong, Cao, et al. proposed improved intrinsic time-scale decomposition combined with complex tree wavelet packet transform and singular-value decomposition and used it to diagnose rolling-bearing faults [16]. Liu and Zhang et al. proposed the use of intrinsic time-scale decomposition for diesel-engine fault diagnosis [17]. Bi and Ma et al. proposed to use complete ensemble intrinsic time-scale decomposition and detect gasoline-engine knock [18]. Yu and Liu proposed using sparse coding on the basis of intrinsic time-scale decomposition to diagnose weak bearing faults [19]. Yuan and Peng proposed the use of smooth intrinsic time-scale decomposition for the fault diagnosis of rolling bearings [20]. Lei and Zhou et al. used intrinsic time-scale decomposition to monitor tool wear during milling [21]. Ma et al. proposed complete ensemble intrinsic time-scale decomposition with adaptive noise (CEITDAN) that was applied to the feature extraction of rolling bearings [22].
However, there are currently still many problems in the study of intrinsic time-scale decomposition. Existing research results improved the decomposition of the intrinsic time scale. However, problems such as curve distortion and modal aliasing still need to be further improved. Therefore, in order to further allow for the better application of eigentime-scale decomposition algorithms to the early fault diagnosis of bearings, this paper proposes an intrinsic time-scale decomposition method based on time-varying filtering (TVF-ITD). It is applied to the framework of CEITDAN, improving the noise-reduction effect of intrinsic time-scale decomposition on strong background noise signals in order to improve the effectiveness of early fault diagnosis. And named it as complete ensemble intrinsic timescale with adaptive white noise based on time-varying filtering (TVF-CEITDAN).
In the process of engineering practice, when the bearing has an early failure, the collected signal noise is relatively strong. The characteristic frequency of each type of fault is also different [23][24][25]. Therefore, adaptive analysis is also needed to extract different types of faults. In this paper, an TVF-CEITDAN method was used to decompose a signal. The use of adaptive white noise can offset noise in the collected signal. The collected signal contained both white and colored noise. Therefore, this method still retains some colored-noise components that are difficult to remove.
In order to further reduce the interference of noise on feature extraction, this paper proposes to introduce entropy calculation into the process of signal decomposition and reconstruction. Entropy is used as an analytical method to estimate the complexity of time series. It is widely used in the fault diagnosis of bearings [26][27][28][29][30]. At present, research on entropy is ongoing. The emergence of multiscale entropy in recent years provides a direction for its more accurate signal analysis [31][32][33][34][35]. Multiscale entropy can accurately measure the complexity of different mode components. It has high noise immunity and high consistency. At the same time, it is not restricted by large amounts of data. In the case of fewer data, a stabler entropy value can be obtained. Therefore, generalized refined composite multiscale sample entropy was chosen as the method of noise elimination after decomposition (GRCMSE) [36]. The entropy value of the generalized refined composite multiscale sample was calculated under 10,000 sets of white noise as the threshold, used in the decomposition of subsequent signals. This can further filter out residual noise in a decomposed signal through the verification of analog and measured vibration signals. The proposed method in this paper can accurately extract early fault features under a strongly noisy background. Compared with other methods, it shows better practicability.
The rest of this article is organized as follows. Section 2 introduces the principles of the ITD algorithm, GRCMSE, and the group-optimization algorithm. Section 3 introduces the proposed method. In Section 4, the effectiveness of the proposed method is verified by analog and measured signals. Lastly, the conclusions are drawn in Section 5.

Intrinsic Time Scale Decomposition
ITD is adaptive to nonstationary signals. The method decomposes the vibration signal into a series of proper rotation components (PRCs) with physical significance and the sum of monotonous trend residuals. After the decomposition result is obtained, spectrum analysis can be performed on any PRC in order to obtain amplitude and frequency-modulation characteristics that are difficult for the original signal to show [37].
Suppose the noise signal to be processed is X t , which is a set of discrete data composed of real numbers. All extreme points in X t are found, corresponding signal moments are τ k , (k = 1, 2, . . . , M), and M are the total number of signal extremes. First, L is defined as the baseline extraction operator; then, let τ 0 be the first intrinsic scale of signal X t and be decomposed as: where L t = LX t and H t = (1 − L)X t are the baseline extraction signal and the PRC, respectively. In the first decomposition, a baseline extraction signal is removed from original detection signal X t to obtain the PRC. Using X k and L k is equivalent to X(τ k ) and L(τ k ). To make X t meaningful in t ∈ [0, τ k+2 ], both L t and H t are defined to be on [0, τ k ]. On interval [τ k , τ k+1 ] of continuous extreme points, we define the piecewise linear baseline extraction operator as follows: Among them: where σ is the linear scaling used to control the amplitude of the PRC; σ ∈ [0, 1] usually takes the value 0.5. The decomposed PRC represents the local relatively high frequency component in the original detection signal, that is, the PRC. The baseline signal is used as the next original signal to continue the decomposition. Obtain a series of PRCs arranged in different frequency ranges from high to low. When iterating to produce a monotonous residual trend signal, decomposition ends. The decomposition process of the entire intrinsic time scale of signal X t can be as follows: where H k t is the k-th PRC, L p t is the residual component, and the number of PRCs after the decomposition is p.

Generalized Refined Composite Multiscale Sample Entropy
Multiscale sample entropy (MSE) can measure the complexity of time series and effectively detect small changes. When a rolling bearing fails, the complexity of nonlinear dynamics also changes [38]. Therefore, MSE is very suitable for the feature extraction of rolling-bearing faults. Experiment results showed that MSE can overcome the shortcomings of single-scale analysis of sample entropy, and analytical results are more accurate. However, applying MSE to the feature-extraction process of rolling bearings still has the two following defects: (1) Through the coarse-graining process of homogenized data, the dynamic mutation behavior of the original signal is "neutralized" to a certain extent, making the estimated entropy value biased. (2) The stability of MSE increases with an increase in coarse-grained scale factor. In order to overcome this shortcoming, the second moment (variance) is used in the coarse-grained step. Instead of the first moment (average value), the generalized MSE (GMSE) algorithm is proposed [39], which still has some shortcomings. GMSE depends heavily on the length of the time series, and the GMSE value may be uncertain or unreliable. The probability of invalid entropy may increase. In order to resolve these shortcomings, this paper applied the GRCMSE algorithm [36].
The process of the multiscale sample entropy algorithm is as follows: (1) For time series {x(i), i = 1, 2, · · · , N}, following formula is used to define coarsegrained series y (s) : where s is the scale factor. (2) The sample entropy of coarse-grained sequence y (s) is calculated under different scale factors s, that is, multiscale sample entropy: where m is the embedding dimension; r is the similarity tolerance; E SE (·) is the sample entropy value; n (m) s and n (m+1) s are the numbers of mand m + 1-dimensional space vectors of the coarse-grained sequence, respectively.
Generalized multiscale sample entropy (GMSE) is the calculation of the mean value of the coarse-grained process of multiscale sample entropy, which is extended to the second moment in order to overcome the shortcomings of "neutralizing" the dynamic mutation behavior of the original signal caused by the coarse-grained method of the homogenized data. The specific calculation steps are as follows: (1) For time series {x(i), i = 1, 2, · · · , N}, the following formula is used to calculate generalized coarse-grained series y (2) The sample entropy of generalized coarse-grained sequence y This aims at the second deficiency of the multiscale sample entropy algorithm. On the basis of GMSE, considering multiple coarse-grained sequences under a unified scale G,h . Then, entropy is calculated to reduce the probability of invalid entropy. The above analysis is GRCMSE, and the specific steps are as follows: (1) For time series {x(i), i = 1, 2, · · · , N}, the following formula is used to calculate generalized composite coarse-grained y (s) (2) In the range of 1 ≤ h ≤ s, the average values n

Coyote Optimization Algorithm
The coyote optimization algorithm (COA) is a new group optimization algorithm proposed in 2018 by Pierezan et al., inspired by the behavior of coyotes in North America [40]. The algorithm simulates existing coyote populations and their evolution, including heuristic random coyote-population grouping, growth, birth, and death, original-group driving-away, and new-group acceptance behavior. The COA focuses on the social structure and cultural exchange of coyotes and does not search according to social hierarchy and ruling rules. After the coyotes are equally divided into several subgroups, the al pha animal and cultural trends of the subgroups are independently determined. Two different random wolves in the group form cultural differences with the al pha coyote and the cultural trend, respectively. As a disturbance catalyzes the growth of a coyote, two different coyotes in the group are randomly selected to give birth to cubs under the influence of a specific environment, and the birth and death of coyotes are synchronized. Whether the cub survives depends on whether its adaptability is better than that of any coyote in the group. Either the cub dies, or the oldest coyote in the group dies. Individual coyotes are driven out of the group to enter other groups, which plays a role in experience exchange. Through the cyclic evolution of growth, birth, and death, the exclusion of the original group and the acceptance of the new group obtains the coyote with the best fitness as a solution to the optimization problem. In COA, decision variables are represented by coyote social-state factors in each dimension of the solution vector. Each coyote represents a candidate solution to the problem. The advantages and disadvantages of the candidate solutions depend on the coyotes' social adaptability.
COA groups the initialized coyote populations according to the principle of random equal distribution. So, after setting the number of coyotes in group N p ∈ N * and the number of coyotes in a single group N c ∈ N * , we can obtain N p × N c individual coyotes. The initial social conditions of these individual coyotes are randomly set. This is achieved by assigning random values in the search space to various social-state factors of the coyotes. Equation (11) expresses the assignment method of the j-th dimension of the c coyote in the p package: where ub j and lb j represent the upper and lower bounds, respectively, of the j-th dimension of the decision variable, and r is uniformly distributed in [0, 1]. On this basis, the social adaptability of coyotes is evaluated according to Equation (12): The growth of the coyotes in the group is the result of cultural interaction. It is affected by the al pha wolf, the cultural trend cult p,t of this group, and two different coyotes (cr 1 and cr 2 ) randomly in the group. The cultural difference between al pha wolf and random wolf cr 1 forms impact factor δ 1 , and the cultural difference between cult p,t and random wolf cr 2 forms impact factor δ 2 , namely: The al pha wolf is the coyote with the best environmental adaptation in the group. When solving the minimization problem, it is defined as: Cultural trends provide conditions for coyotes in the group to share information, composed of the median values of the social conditions of all coyotes in the group, and it is the performance of algorithmic swarm intelligence. The specific calculation formula is: where O p,t represents the social condition under which [1, D] in the p-th package in the t-th iteration is sorted by dimension. Therefore, the social state of a coyote after growing up is shown in Equation (17): where r 1 and r 2 are the weights of the influence of al pha wolf and the cultural trend within the group, respectively; they are random numbers that obey [0, 1], uniformly distributed. COA still uses a greedy algorithm to determine whether the growth of a coyote is allowed. Equation (18) evaluates the growth state of a coyote. In Equation (19), coyotes with better environmental adaptability are retained to participate in the subsequent processes of growth, birth, and death, exclusion from the original group, and acceptance into the new group: Following the laws of nature, coyotes in the group give birth to cubs when they grow up, and they also face death. Two parent coyotes selected at random in the group give birth to cubs in a specific environment. The specific birth method of the cubs is: where r 1 and r 2 are two random coyotes in the current group; j 1 and j 2 represent two random dimensions; rand j is a random number uniformly distributed in [0, 1]; R j is a random number within the bounds of the j-th dimensional decision variable, representing the impact of the reproductive environment on the cubs; and P s and P a are scattering probability and correlation probability, respectively, which determine the degree of cultural diversity of coyotes in the group: COA synchronizes the birth and death of coyotes to maintain a stable population size. According to the standard design of Algorithm 1, older coyotes have a higher mortality rate. Among them, ω and ϕ represent the set of coyotes whose environmental adaptability is not as good as that of the cubs and the number of coyotes in the set, respectively. If there are two or more coyotes of similar age in ω, the coyote with the worst adaptability is still set to die according to the greedy algorithm.

8: end if
The entire population is unstable in which individual coyotes are driven out of the group and accepted into the new group. The more coyotes there are in the group, the higher the probability P e of this happening is: Different from sharing cultural information within groups, this mechanism can promote the global cultural exchange of coyote population. In order to ensure that A is between 0 and 1, the number of coyotes in each group is required to be up to 14 [41]. The COA process uses pseudocode as shown in Algorithm 2: Algorithm 2. COA Pseudo Code.

1:
Define control parameters N p , N c , and maximal iterations N_Max 2: Initialize N p packs with N c coyotes each and verify coyote adaptation 3: While t < N_Max, do 4: for p = 1:N p 5: determine alpha and culture tendency 6: for c = 1:N c 7: Update the social condition 8: Evaluate the new social condition 9: Adaptation 10: end for 11: Generate the pup considering intrinsic and extrinsic influence 12: The pup dies or the oldest coyote dies 13: end for 14: Transition between packs 15: Update coyote ages 16: end while 17: Select the best adapted coyote To sum up, the COA structure is novel and can better balance global exploration and local development. The coyote growth model considers more factors and strengthens its ability to jump out of the local optimum. When giving birth to cubs, the influence of the external environment makes the algorithm capable of exploring well. An individual coyote is expelled and reaccepted by the new group, playing a role in overall cultural exchange, avoiding information convergence within the group and the optimization of the mantis oriole style. The local development of individual coyotes then has global awareness. Therefore, the algorithm has good value.

Proposed Method
This paper proposes an improved time-varying, filter-based self-adaptive whitenoise, fully integrated eigentime-scale decomposition algorithm, and an improved coyote optimization algorithm. The two methods are organically combined with the generalized fine composite multiscale sample entropy. The filtering effect was improved to extract early fault characteristic frequencies of rolling bearings under strong noise. This section includes three parts. The first part is an improved, time-varying filter-based self-adaptive white-noise fully integrated intrinsic time-scale decomposition algorithm; the second part is a coyote optimization algorithm based on gradient-based optimizer (GBO)-sine-cosine optimization (SCA) optimization; the third part proposes the specific implementation of feature extraction.

Fully Integrated Intrinsic Time-Scale Algorithm with Adaptive White Noise Based on Time-Varying Filtering
Because the ITD method is the same as the EMD method, there are problems of mode components lacking physical meaning and mode aliasing. At the same time, ITD uses linear interpolation for curve fitting. As a result, the decomposed signal has serious curve distortion in low-frequency components. Therefore, this article was inspired by the TVF-EMD method [41]. TVF was applied to the ITD method in order to further improve the modal aliasing. This article applied the TVF-ITD method to the CEITDAN technical framework. The specific methods proposed in this article are as follows.
The intrinsic time-scale decomposition of time-varying filtering is essentially performed by constructing a low-pass filter of which the cut-off frequency changes with time to complete the iterative removal of linear interpolation in the ITD process and replace the inherent rotation component with the local narrowband signal as the iteration stop condition. For a given arbitrary multicomponent signal A, it can be expressed as a two-component signal: Therefore, only the decomposition process of the two-component signal needs to be considered. The basic steps of time-varying filtering eigentime-scale decomposition for two-component signals are as follows: Step 1: Perform Hilbert transform on x(t) to obtain amplitude A(t) and phase ϕ(t) of the complex analytical signal. The derivative of the instantaneous phase obtains instantaneous frequency ϕ (t).
Step 2: Determine time {t min }, {t max } and amplitude A({t min }), A({t max }) of the minimal and maximal values of amplitude curve A(t).
Step 3: Respectively interpolate the extreme points of A(t), and the resulting curves are β 1 (t) and β 2 (t), respectively.
Step 5: Available local cutoff frequency ϕ bis (t): The above steps are the construction of the time-varying filter.
Step 6: In order to eliminate mode aliasing caused by noise and other components, cutoff frequency ϕ bis (t) needs to be adjusted.
(1) Define time series where , then e j is the rising edge of ϕ bis (t); otherwise, e j is the falling edge of ϕ bis (t).
(2) If e j is the rising edge of ϕ bis (t), then φ bis e j−1 : e j is regarded as the lowest value; otherwise, e j is the falling edge of ϕ bis (t); then, φ bis e j : e j+1 is regarded as the lowest value. The rest of ϕ bis (t) is seen as a peak.
(3) Adjusted cut-off frequency ϕ bis (t) is obtained by interpolating between the peaks.
Step 7: reconstruct the signal according to adjusted cut-off frequency: Taking the extreme point of h(t) as the node, h(t) is divided into n segments, and the step size of each segment is m; n is called the order of B-spline function. Using Equations (32) and (33) for B-spline interpolation approximation, the approximation result is recorded as m(t), which represents the local mean curve: where [·] ↓m is downsampling, sampling every m point; [·] ↑m means oversampling, that is, m sampling points are inserted between every two sampling points.
Step 8: judge whether stop criterion θ(t) < ξ is satisfied, where ξ is the given bandwidth threshold. If satisfied, x(t) is PRC; if not, x(t) = x(t) − m(t) repeats Steps 1-7 until the stop criteria are met; x(t) satisfying the stopping criterion is PRC: As mentioned earlier, the extraction and calculation of L t in the eigentime-scale algorithm were replaced by the TVF method proposed in this section. This can compensate for curve distortion and modal aliasing caused by monotonic linear interpolation. In order to further improve the modal aliasing effect, TVF-ITD is introduced into the CEITDAN ideological framework (TVF-CEITDAN). The specific flowchart is shown in Figure 1.
a t a t a t a t (38) As mentioned earlier, the extraction and calculation of t L in the eigentime-scale algorithm were replaced by the TVF method proposed in this section. This can compensate for curve distortion and modal aliasing caused by monotonic linear interpolation. In order to further improve the modal aliasing effect, TVF-ITD is introduced into the CEITDAN ideological framework (TVF-CEITDAN). The specific flowchart is shown in Figure 1.

Coyote Optimization Algorithm Based on GBO-SCA Optimization
The proposed method in this paper is inspired by the GBO method proposed in [42] and introduced into the coyote optimization algorithm. The GBO method is an optimization algorithm that combines the gradient method and the population method and consists of two main parts. One of them is GSR based on the gradient search (GB) method. in the gradient search rule (GSR), the motion of the vector is controlled to better search and obtain a better position in the FEASIBLE region to improve the trend of exploration and accelerate the convergence of GBO. However, the rule was extracted from Newton's gradient-based approach [43]. Based on the fact that many optimization problems are non-differentiable, the numerical gradient method is used instead of the direct function derivation method. However, the numerical gradient method uses one sample at a time for gradient descent. Therefore, to ensure accuracy, the stochastic random gradient descent method uses only one sample per training to determine the direction of the gradient. Local minima may be obtained, and the accuracy rate is not high. For convergence rate, the stochastic gradient descent method only iterates one sample at a time. Therefore, the iteration direction varies greatly and does not converge quickly to the local optimal solution. Therefore, in this paper, the GSR algorithm in the GBO method is proposed to be replaced by the positive cosine algorithm [44] to improve the overall convergence rate.
The sine-cosine optimization algorithm is a random optimization algorithm with a high degree of flexibility. The principle is simple, easy to implement, and can be easily applied to optimization problems in different fields. The optimization process of the sinecosine optimization algorithm can be divided into two stages. In the exploration stage, the optimization algorithm quickly finds feasible regions in the search space by combining a random solution in all random solutions. During the development phase, the random solution gradually changes. The rate of change of the random solution is lower than the rate of the exploration phase. In the sine-cosine algorithm, the candidate solutions are first randomly initialized. Then, the value of the current solution in each dimension is updated according to the sine or cosine function, combined with random factors. The specific update equation is: where X t i is the position of the current individual in the i-th dimension and t-th generation; r 2 is a random number from 0 to 2π; r 3 is a random number from 0 to 2; r 4 is a random number from 0 to 1; and P t i means the position of the i-th dimension of the optimal individual position variable in the second iteration: where a is a constant; t is the current iteration number; T is the maximal iteration number; and parameter r 1 indicates that the location area of the next solution is within or outside the current solution and the optimal solution. A smaller value of r 1 helps to enhance the local development capability of the algorithm. A larger r 1 value helps to improve the global exploration ability of the algorithm. At the same time, the r 1 value gradually decreases with the number of iterations, which balances the local development and global search capabilities of the algorithm. r 2 , r 3 and r 4 are random factors, and r 2 defines how far the current solution is toward or away from the optimal solution. Parameter r 3 gives a random weight for the optimal solution to immediately emphasize the effect of (r 3 > 1) or ignore the influence of the optimal solution of (r 3 < 1) in defining the moving distance of the candidate solution. Parameter r 4 is an equal switch between sine and cosine functions. For a given problem, the sine-cosine optimization algorithm randomly creates a series of candidate solutions. According to the sine and cosine functions, the value of each candidate solution in all dimensions is updated. The cyclic mode of sine and cosine functions allows for one solution to be repositioned around other solutions. This can ensure that the search is performed in the space between the two solutions, and it can converge to the global optimum faster. The algorithm flow is as follows: Step 1: Population initialization. Suppose that the group size of the population is m, and m solutions are randomly generated in the range of [0, 255]. Randomly set the initial position of each solution.
Step 2: Calculate the fitness of all solutions.
Step 3: The location of the solution is updated. Select the corresponding location update formula according to the r 4 value. Update the position of the candidate solution in each dimension. Recalculate the fitness values of all candidate solutions. In this way, the fitness of each solution and of the global optimal position is obtained.
Step 4: Compare and update the location of the global optimal solution. The fitness value of each updated solution is compared with that of the global optimal solution. If the fitness value of the current solution is greater than the global optimal fitness value, the position of the global optimal solution is updated.
Step 5: If the algorithm termination condition is satisfied, output the optimal solution; otherwise, repeat Steps 2-4.
Compared with the GSR algorithm in the GBO method, this method both improves convergence speed and reduces the parameters of the GBO method, improving the simplicity of calculation. We named this method GBO-SCA.
In summary, this article aims to improve the shortcomings of the COA method. By applying the GBO-SCA algorithm to the COA algorithm, the specific process is shown in Figure 2.
Step 2: Calculate the fitness of all solutions.
Step 3: The location of the solution is updated. Select the corresponding location update formula according to the 4 r value. Update the position of the candidate solution in each dimension. Recalculate the fitness values of all candidate solutions. In this way, the fitness of each solution and of the global optimal position is obtained.
Step 4: Compare and update the location of the global optimal solution. The fitness value of each updated solution is compared with that of the global optimal solution. If the fitness value of the current solution is greater than the global optimal fitness value, the position of the global optimal solution is updated.
Step 5: If the algorithm termination condition is satisfied, output the optimal solution; otherwise, repeat Steps 2-4.
Compared with the GSR algorithm in the GBO method, this method both improves convergence speed and reduces the parameters of the GBO method, improving the simplicity of calculation. We named this method GBO-SCA.
In summary, this article aims to improve the shortcomings of the COA method. By applying the GBO-SCA algorithm to the COA algorithm, the specific process is shown in Figure 2.

Specific Implementation Plan for Proposed Method
This paper proposes a new and improved method of eigentime-scale decomposition, and a new idea of applying generalized fine compound multiscale sample entropy to early fault diagnosis. Because the signal-to-noise ratio of each mode component in the CEITDAN ideological framework is fixed. Therefore, according to the threshold of the generalized fine composite multiscale sample entropy, it can be a fixed value when processing the agreement signal. There is no need to recalculate the threshold for each decomposition. According to the literature [36], scale factor s = 25 is determined, and similarity tolerance is r = 0.15 × SD, where SD is the standard deviation. The method of early fault diagnosis of the overall rolling bearing in this article is as follows: (1) Calculate the entropy value of the generalized fine composite multiscale sample of 10,000 groups of white noise, and the generalized fine composite multiscale sample entropy of the original signal. The average of the two is chosen as the threshold value.

Case A: Numerical Simulation
In order to prove the superiority of the proposed algorithm in practical application, a bearing failure simulation signal was first selected, and complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN), ITD, and TVF-CEITDAN decomposition, respectively, was used. The structure of the used bearing fault simulation signal is shown in Equation (43). In order to improve the objectivity of the three comparisons, orthogonality index (OI) and energy conservation index (ECI) were selected for the quantitative evaluation of the performance of different methods [45,46]. Theoretically speaking, after the algorithm decomposes the signal, the different obtained mode components should belong to a completely orthogonal relationship. That is, the orthogonality index is equal to zero. However, due to computer error and environmental interference, the orthogonality index cannot be zero [45]. Therefore, in practical application, the closer the orthogonal index is to 0, the more accurate the decomposition effect is; this shows that decomposition accuracy is not high. Therefore, from an energy point of view, the energy-preservation index can also evaluate the decomposition performance of different methods. Different from the orthogonal index, the value of the energy-conservation index is closer to 1. This shows that the less energy is lost in the decomposition process of this method, the more complete the decomposition result is. However, more than 1 indicates that there is a false component [46]. In summary, we used OI, ECI, and root mean square error (RMSE) to compare these methods. The OI and ECI are defined as: where x(t) represents the original signal, mod n (t) represents the i-th mode component decomposed by decomposition algorithm, and r m (t) indicates residual component: where A i is the amplitude modulation with a period of 1/ f r , the rotation frequency of the shaft is f r ; n(t) is random white noise; r is the damping coefficient of the system; T is the interval between consecutive shocks; t i is the rolling during the i-th period delay caused by body slip; A 0 and C 0 are arbitrary constants; and f n is the natural frequency of the system. Among them, f n = 3000 Hz, f n = 3000 Hz, t i = 1, A 0 = 2, C 0 = 1, r = 0.005 fault characteristic frequency. The sampling rate was 8192 Hz, and the number of sample points was 2048, as shown in Figure 3. Figure 4 shows the GRCMSE values of 10,000 groups of white noise signals, where the minimum value is averaged with the GRCMSE value of the numerical signal as the threshold in this section. The threshold value used in this section is 0.43. Figure 5 shows the decomposition result of CEEMDAN, ITD, and TVF-CEITDAN.
the shaft is r f ; ( ) n t is random white noise; r is the damping coefficient of T is the interval between consecutive shocks; i t is the rolling during the delay caused by body slip; 0 A and 0 C are arbitrary constants; and n f is the quency of the system. Among them, fn = 3000 Hz, = 3000 fault characteristic frequency. The sampling rate was 8192 Hz, and of sample points was 2048, as shown in Figure 3. Figure 4 shows the GRCMS 10,000 groups of white noise signals, where the minimum value is average GRCMSE value of the numerical signal as the threshold in this section. Th value used in this section is 0.43. Figure 5 shows the decomposition result of C ITD, and TVF-CEITDAN.     Figure 5 shows that the curve decomposed by ITD caused low-frequency component distortion due to linear interpolation. CEEMDAN decomposed a total of 11 mode components, and the curve was smooth. However, IMF5 has modal aliasing, and the first three mode components did not highlight the impact signal. There was still strong noise interference. In the TVF-CEITDAN method, the first three components contained less noise, and the characteristics of the impact signal could be observed. According to the fault-diagnosis method proposed in this paper, the GRCMSE value of each rotation component of TVF-CEITDAN was calculated. Figure 6 shows that the GRCMSE values of PR2 and PR3 both exceeded the threshold. Therefore, the remaining rotation components were selected for signal reconstruction. TVF-CEITDAN was used to decompose again. The decomposition result is shown in Figure 7. In the application of the signal-decomposition algorithm, a key issue is how to select appropriate mode components to transform the envelope spectrum to extract the fault characteristic frequency and perform fault diagnosis. Therefore, this article used analysis of the central frequency band energy content contained in the different mode components to select the appropriate mode components for reconstruction, as shown in Figure 8. The central frequency band contained in PR1 accounted for the largest proportion of energy. Therefore, PR1 was selected to obtain the  Figure 5 shows that the curve decomposed by ITD caused low-frequency component distortion due to linear interpolation. CEEMDAN decomposed a total of 11 mode components, and the curve was smooth. However, IMF5 has modal aliasing, and the first three mode components did not highlight the impact signal. There was still strong noise interference. In the TVF-CEITDAN method, the first three components contained less noise, and the characteristics of the impact signal could be observed. According to the fault-diagnosis method proposed in this paper, the GRCMSE value of each rotation component of TVF-CEITDAN was calculated. Figure 6 shows that the GRCMSE values of PR2 and PR3 both exceeded the threshold. Therefore, the remaining rotation components were selected for signal reconstruction. TVF-CEITDAN was used to decompose again. The decomposition result is shown in Figure 7. In the application of the signal-decomposition algorithm, a key issue is how to select appropriate mode components to transform the envelope spectrum to extract the fault characteristic frequency and perform fault diag-nosis. Therefore, this article used analysis of the central frequency band energy content contained in the different mode components to select the appropriate mode components for reconstruction, as shown in Figure 8. The central frequency band contained in PR1 accounted for the largest proportion of energy. Therefore, PR1 was selected to obtain the fault characteristic frequency for fault diagnosis. Figures 9-11 are the envelope-spectrum results of CEEMDAN, ITD, and the proposed method. Figures 9 and 10 show that, due to the strong noise of the analog signal, the result of the envelope spectrum obtained after decomposition could hardly show the characteristic frequency of the fault. There was no way to analyze and judge the type of failure. However, Figure 11 clearly shows the fault characteristic frequency and its multiplier. This also shows that the proposed method could maintain good effectiveness under strong noise interference, reducing it to ideal conditions. Figure 12 is a comparison diagram of the fitness value and the number of iterations of the GBO-SCA-COA in this paper. The proposed method could converge faster than traditional optimization algorithms can.               In order to objectively compare the practicability and effectiveness of the three ods, OI and RMI machine RMSE were used in this study, and results are shown in 1. As can be seen from Table 1, the ECI index of the TVF-CEITDAN method propo this paper reaches 0.9471, while the other two compared methods are only 0.55 0.5915, indicating that the method proposed in this paper has the least energy loss signal decomposition process. For the OI index, the proposed method is much s than the remaining two methods and is closest to 1, which can be for the comput error of the computer, indicating good orthogonality and relative independence the components. The RMSE of the proposed method in this paper is the smallest, indicates that the decomposition error is the smallest. In the comparison of the signal-to-noise ratio, the results of the proposed method in this paper are also the l which indicates that the noise reduction effect of the proposed method in this p better than the other two methods. The proposed method performed best in multi dicators, and it can be further applied to actual engineering use.  In order to objectively compare the practicability and effectiveness of the three m ods, OI and RMI machine RMSE were used in this study, and results are shown in 1. As can be seen from Table 1, the ECI index of the TVF-CEITDAN method propos this paper reaches 0.9471, while the other two compared methods are only 0.5512 0.5915, indicating that the method proposed in this paper has the least energy loss i signal decomposition process. For the OI index, the proposed method is much sm than the remaining two methods and is closest to 1, which can be for the computat error of the computer, indicating good orthogonality and relative independence am the components. The RMSE of the proposed method in this paper is the smallest, w indicates that the decomposition error is the smallest. In the comparison of the o signal-to-noise ratio, the results of the proposed method in this paper are also the la which indicates that the noise reduction effect of the proposed method in this pap better than the other two methods. The proposed method performed best in multip dicators, and it can be further applied to actual engineering use. In order to objectively compare the practicability and effectiveness of the three methods, OI and RMI machine RMSE were used in this study, and results are shown in Table 1. As can be seen from Table 1, the ECI index of the TVF-CEITDAN method proposed in this paper reaches 0.9471, while the other two compared methods are only 0.5512 and 0.5915, indicating that the method proposed in this paper has the least energy loss in the signal decomposition process. For the OI index, the proposed method is much smaller than the remaining two methods and is closest to 1, which can be for the computational error of the computer, indicating good orthogonality and relative independence among the components. The RMSE of the proposed method in this paper is the smallest, which indicates that the decomposition error is the smallest. In the comparison of the output signal-to-noise ratio, the results of the proposed method in this paper are also the largest, which indicates that the noise reduction effect of the proposed method in this paper is better than the other two methods. The proposed method performed best in multiple indicators, and it can be further applied to actual engineering use.

Case B: Experimental Analysis
In this section, we use an engineering test equipment for signal acquisition and fault diagnosis using a method consistent with the analog signal idea to demonstrate the effectiveness of the proposed method in this paper ( Figure 13). The test platform consisted of a test bench and control system, which was mainly composed of the tested bearings, accompanying test bearings, test spindle, bearing outer ring fixture, drive unit, and loading system. During the test, there were four sets of bearings, divided into two sets of tested bearings and two sets of accompanying bearings. Two sets of loading systems respectively applied loads to the test bearings. The drive unit provided power for the whole test bench, and the loading system provided radial force for the test bearings, which rotated the spindle drive of the bearings and obtained the vibration signal of the bearings during the rotation process through vibration sensors. The drive unit provided bearing speed in the range of 1000-20,000 rpm, which was continuous. The loading system had a loading range of 0-6 KN, loading accuracy of ±2%, and it was continuously adjustable. The operating limit temperature of the test bench was 250 • C [47].

Case B: Experimental Analysis
In this section, we use an engineering test equipment for signal acquisition and fault diagnosis using a method consistent with the analog signal idea to demonstrate the effectiveness of the proposed method in this paper ( Figure 13). The test platform consisted of a test bench and control system, which was mainly composed of the tested bearings, accompanying test bearings, test spindle, bearing outer ring fixture, drive unit, and loading system. During the test, there were four sets of bearings, divided into two sets of tested bearings and two sets of accompanying bearings. Two sets of loading systems respectively applied loads to the test bearings. The drive unit provided power for the whole test bench, and the loading system provided radial force for the test bearings, which rotated the spindle drive of the bearings and obtained the vibration signal of the bearings during the rotation process through vibration sensors. The drive unit provided bearing speed in the range of 1000-20,000 rpm, which was continuous. The loading system had a loading range of 0-6 KN, loading accuracy of ±2%, and it was continuously adjustable. The operating limit temperature of the test bench was 250 °C [47].  The vibration signal collected from the Fault bearing marked in Figure 13 is the signal used for the analysis in this paper, while the Healthy bearing is the accompanying bearing required for the test. Tables 2 and 3 show the test-bearing parameters and characteristic frequency calculation formulas of rolling bearing, respectively.  The vibration signal collected from the Fault bearing marked in Figure 13 is the signal used for the analysis in this paper, while the Healthy bearing is the accompanying bearing required for the test. Tables 2 and 3 show the test-bearing parameters and characteristic frequency calculation formulas of rolling bearing, respectively.  Table 3. Bearing-failure characteristic-frequency calculation formulas.

Bearing Fault Equation
Bearing cage (FTF) Sampling frequency was 18,400 Hz, data length was 8192 points, and bearing rotation speed was 3000 rpm. According to the parameters in Table 2 and the formula in Table 3, the bearing inner ring failure frequency can be calculated as 407 Hz. Due to computer error, the eigenfrequency in the envelope spectrum is actually 408Hz. Figures 14 and 15 show the time domain and frequency domain plots of the acquired signals. From the above figure, it can be seen that the fault information is completely masked by the noise, and it is impossible to get the fault characteristic frequency visually and determine the fault type. Figure 16 shows the GRCMSE values of the 10,000 sets of white noise. The minimal value of the 10,000 sets of white noise and the GRCMSE value of the measured signal were selected, and their average value was used as the threshold value of 0.54.  Table 3. Bearing-failure characteristic-frequency calculation formulas.

Bearing Fault Equation
Bearing cage (FTF) Sampling frequency was 18,400 Hz, data length was 8192 points, and bearing rotation speed was 3000 rpm. According to the parameters in Table 2 and the formula in Table 3, the bearing inner ring failure frequency can be calculated as 407 Hz. Due to computer error, the eigenfrequency in the envelope spectrum is actually 408Hz. Figure 14 and Figure 15 show the time domain and frequency domain plots of the acquired signals. From the above figure, it can be seen that the fault information is completely masked by the noise, and it is impossible to get the fault characteristic frequency visually and determine the fault type. Figure 16 shows the GRCMSE values of the 10,000 sets of white noise. The minimal value of the 10,000 sets of white noise and the GRCMSE value of the measured signal were selected, and their average value was used as the threshold value of 0.54.     Figure 17 shows that the decomposed curve by ITD caused low-frequency component distortion due to linear interpolation. CEEMDAN decomposed a total of 13 mode components. However, IMF4, IMF5, and IMF6 had modal aliasing. The first three mode components also had strong noise interference. In the TVF-CEITDAN method, the first three components contained less noise, and some of the characteristics of the impact signal could be observed. According to the proposed fault-diagnosis method, the GRCMSE value of each rotation component of TVF-CEITDAN was calculated as shown in Figure  18. The GRCMSE values of PR1 all exceeded the threshold. The remaining rotation components were then selected for signal reconstruction, and TVF-CEITDAN was used to decompose again. The decomposition result is shown in Figure 19. This part is the same as that in the previous section. The method of analyzing the energy ratio of the center  Figure 17 shows that the decomposed curve by ITD caused low-frequency component distortion due to linear interpolation. CEEMDAN decomposed a total of 13 mode components. However, IMF4, IMF5, and IMF6 had modal aliasing. The first three mode components also had strong noise interference. In the TVF-CEITDAN method, the first three components contained less noise, and some of the characteristics of the impact signal could be observed. According to the proposed fault-diagnosis method, the GRCMSE value of each rotation component of TVF-CEITDAN was calculated as shown in Figure 18. The GRCMSE values of PR1 all exceeded the threshold. The remaining rotation components were then selected for signal reconstruction, and TVF-CEITDAN was used to decompose again. The decomposition result is shown in Figure 19. This part is the same as that in the previous section. The method of analyzing the energy ratio of the center frequency band contained in the different mode components was used to select the appropriate mode components for reconstruction, as shown in Figure 20. The central frequency band contained in PR1 accounted for the largest proportion of energy. Therefore, PR1 was selected for fault diagnosis. Figures 21-23 are the envelope-spectrum results of CEEMDAN, ITD, and the proposed method. Figures 21 and 22 show that, due to the stronger noise of the measured signal, the obtained envelope spectrum after decomposition could hardly show the fault characteristic frequency, and it was impossible to analyze and judge the fault type. CEEMDAN and ITD could only extract one characteristic frequency of the inner ring, and the gap between peak value and noise was small. However, Figure 23 clearly shows the fault characteristic frequency and its double frequency. This also shows that the proposed method maintained good effectiveness under strong noise interference, reducing noise interference to ideal conditions. Figure 24 is a comparison diagram of the fitness value and the number of iterations of the proposed improved coyote optimization algorithm. Our method could converge faster than traditional optimization algorithms can. Figure 23 clearly shows the fault characteristic frequency and its double frequency. This also shows that the proposed method maintained good effectiveness under strong noise interference, reducing noise interference to ideal conditions. Figure 24 is a comparison diagram of the fitness value and the number of iterations of the proposed improved coyote optimization algorithm. Our method could converge faster than traditional optimization algorithms can.                In order to objectively compare the practicability and effectiveness of the three m ods, OI, ECI, and machine RMSE were used, and the results are shown in Table 4. A be seen from Table 4, the proposed method in this paper performs consistently with analog signal comparison results, and all the indexes perform best. The ECI index o proposed method reaches 0.9389, while the other two comparison methods are only 0. and 0.5625, indicating that the proposed method has the least energy loss in the pro of signal decomposition, and because the measured signal is noisier and the signa noise ratio is lower, the three methods are slightly lower in this index, which is also in with the objective rule. For the OI index, the proposed method is much smaller than remaining two methods and is closest to 1, which can be the calculation error of the puter, indicating good orthogonality and relative independence among the compon The RMSE of the method proposed in this paper is the smallest, which indicates tha decomposition error is the smallest. Since the input and output signal-to-noise ratios not be calculated for the measured signals, the output signal-to-noise ratio is not use a comparison index to compare the three methods in this section. The proposed me performed best in multiple indicators and it can be further applied to actual enginee use. In order to objectively compare the practicability and effectiveness of the three methods, OI, ECI, and machine RMSE were used, and the results are shown in Table 4. As can be seen from Table 4, the proposed method in this paper performs consistently with the analog signal comparison results, and all the indexes perform best. The ECI index of the proposed method reaches 0.9389, while the other two comparison methods are only 0.5234 and 0.5625, indicating that the proposed method has the least energy loss in the process of signal decomposition, and because the measured signal is noisier and the signal-to-noise ratio is lower, the three methods are slightly lower in this index, which is also in line with the objective rule. For the OI index, the proposed method is much smaller than the remaining two methods and is closest to 1, which can be the calculation error of the computer, indicating good orthogonality and relative independence among the components. The RMSE of the method proposed in this paper is the smallest, which indicates that the decomposition error is the smallest. Since the input and output signal-to-noise ratios cannot be calculated for the measured signals, the output signal-to-noise ratio is not used as a comparison index to compare the three methods in this section. The proposed method performed best in multiple indicators and it can be further applied to actual engineering use.

Conclusions
This study proposed a new method for the early fault diagnosis of rolling bearings with the three following aspects: an improved eigentime-scale decomposition algorithm, an improved coyote optimization algorithm, and generalized fine composite multiscale sample entropy to analyze the noise-reduction effect. Through simulation signals and an engineering test platform, the effectiveness of the proposed method was fully proven. The proposed method takes generalized fine composite multiscale sample entropy as the objective function and removes the rotation component dominated by noise after decomposition. It carries out a second decomposition, fully retaining the weak signal of early bearing failure, and reduces the influence of noise. Through a comparison of different physical indicators, the purpose of the quantitative analysis of the method is obtained. The proposed method can be effectively applied to engineering practice. In future research, we will focus on the applicability of this method in the case of mixed failures. A set of feasible early failure and hybrid failure diagnosis methods for rolling bearings are proposed.