A Dual-Optimization Fault Diagnosis Method for Rolling Bearings Based on Hierarchical Slope Entropy and SVM Synergized with Shark Optimization Algorithm

Slope entropy (SlopEn) has been widely applied in fault diagnosis and has exhibited excellent performance, but it suffers from the problem of threshold selection. To further enhance the discriminating capability of SlopEn in fault diagnosis, the concept of hierarchy is introduced on the basis of SlopEn, and a new complexity feature, namely hierarchical slope entropy (HSlopEn), is proposed. Meanwhile, to address the threshold selection of HSlopEn and the parameter selection of a support vector machine (SVM), the white shark optimizer (WSO) is applied to optimize both HSlopEn and the SVM, and WSO-HSlopEn and WSO-SVM are proposed, respectively. Then, a dual-optimization fault diagnosis method for rolling bearings based on WSO-HSlopEn and WSO-SVM is put forward. Experiments on measured bearing signals were conducted in single- and multi-feature scenarios. The results demonstrate that, in both scenarios, the WSO-HSlopEn and WSO-SVM fault diagnosis method achieves the highest recognition rate among the compared hierarchical entropies; moreover, with multiple features, the recognition rates are all higher than 97.5%, and the recognition rate improves as more features are selected. When five nodes are selected, the highest recognition rate reaches 100%.


Introduction
Rolling bearings, as a key component in rotating machinery, serve a very significant role in modern industry. However, because of the increasingly sophisticated and complex structure of bearings and their common use in harsh working environments, rolling bearings are very prone to failures, which can lead to economic losses and even endanger personal safety [1][2][3]. Therefore, aiming to ensure the normal work of rotating machinery and reduce maintenance costs, it is of great importance to carry out fault diagnoses of rolling bearings [4][5][6].
Since bearing vibration signals contain rich state information about the bearing during operation, a vibration analysis method is broadly applied to rolling bearing faults [7,8]. In general, the method mainly consists of two steps: feature extraction and fault classification, in which valid feature extraction is crucial for accurate fault diagnosis. As the bearing vibration signal has nonlinear dynamic characteristics, traditional feature extraction methods based on Fourier transform and statistical analysis only characterize features from the time domain or frequency domain, and they cannot detect potential faults through changes in the complexity of the system to achieve effective and accurate extractions of fault features [9,10].
The main contributions of this paper are as follows:
(1) To extract the fault information of bearing signals more comprehensively, this paper adds the concept of hierarchy to SlopEn and is the first to propose hierarchical slope entropy (HSlopEn).
(2) Since the thresholds of HSlopEn have a relatively large impact on the entropy value, and the selection of suitable SVM parameters is particularly important for classification, this paper applies the WSO to optimize the parameters of HSlopEn and the SVM and proposes WSO-HSlopEn and WSO-SVM, respectively.
(3) Targeting bearing fault diagnosis under different operating conditions, this paper proposes a dual-optimization fault diagnosis method for rolling bearings based on HSlopEn and an SVM synergized with the WSO.
The remaining parts of this paper are structured as follows. Section 2 presents the basic concepts of the algorithms. Section 3 introduces the steps of the proposed fault diagnosis method. Section 4 carries out the single-feature and multi-feature extraction experiments for bearing signals, and Section 5 summarizes the conclusions of this study.

Slope Entropy
Slope entropy (SlopEn) is an algorithm proposed in 2019 to calculate the complexity of time series. It is based on symbolic patterns and magnitude information. The main calculation steps are listed below:
(1) For a given time series $X = \{x_1, x_2, \cdots, x_N\}$, according to the embedding dimension $m$, extract the subsequences $X_1 = \{x_1, x_2, \cdots, x_m\}, \ldots, X_i = \{x_i, x_{i+1}, \ldots, x_{i+m-1}\}, \ldots, X_k = \{x_k, x_{k+1}, \ldots, x_N\}$, of which $k = N - m + 1$.
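Only the first step of the calculation survives in the text above, so the following is a minimal sketch of the complete computation rather than the authors' implementation. It assumes the commonly cited five-symbol definition of SlopEn: each consecutive difference within a subsequence is mapped to +2, +1, 0, -1, or -2 using the thresholds γ and δ (with γ > δ > 0), and the Shannon entropy of the resulting pattern frequencies is returned. The function names, the use of the natural logarithm, and the default threshold values are assumptions made for illustration.

```python
import math
from collections import Counter


def slope_symbols(window, gamma, delta):
    """Map each consecutive difference in a window to one of five symbols."""
    symbols = []
    for a, b in zip(window, window[1:]):
        d = b - a
        if d > gamma:
            symbols.append(2)       # steep rise
        elif d > delta:
            symbols.append(1)       # gentle rise
        elif d >= -delta:
            symbols.append(0)       # approximately flat
        elif d >= -gamma:
            symbols.append(-1)      # gentle fall
        else:
            symbols.append(-2)      # steep fall
    return tuple(symbols)


def slope_entropy(x, m=3, gamma=0.1, delta=0.001):
    """Shannon entropy (natural log) of the slope patterns of all length-m windows."""
    k = len(x) - m + 1  # number of subsequences, k = N - m + 1
    if k < 1:
        raise ValueError("series is shorter than the embedding dimension")
    counts = Counter(slope_symbols(x[i:i + m], gamma, delta) for i in range(k))
    return -sum((c / k) * math.log(c / k) for c in counts.values())
```

A constant series produces a single pattern and therefore zero entropy, while a series that alternates between two levels produces two equally frequent patterns.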

Hierarchical Slope Entropy
Since SlopEn only considers the low-frequency components of the time series, this paper combines SlopEn with the concept of hierarchy to describe the time series more comprehensively and proposes a new complexity feature, namely hierarchical slope entropy (HSlopEn). The specific process of HSlopEn is as follows:
(1) First, given a time series $X = \{x(i), i = 0, 1, \cdots, N-1\}$ of length $N = 2^n$, define an average operator $Q_0(x)$ and a difference operator $Q_1(x)$, which can be expressed as
$$Q_0(x) = \frac{x(2j) + x(2j+1)}{2}, \qquad Q_1(x) = \frac{x(2j) - x(2j+1)}{2}, \qquad j = 0, 1, \cdots, 2^{n-1} - 1,$$
where the $Q_0(x)$ and $Q_1(x)$ operators give the low-frequency part and high-frequency part, respectively, of the original given time series after hierarchical decomposition and $n$ is a positive integer.
(2) The operator $Q_j$ ($j$ = 0 or 1) in matrix form is defined as
$$Q_j = \begin{bmatrix} \frac{1}{2} & \frac{(-1)^j}{2} & 0 & 0 & \cdots & 0 & 0 \\ 0 & 0 & \frac{1}{2} & \frac{(-1)^j}{2} & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & \frac{1}{2} & \frac{(-1)^j}{2} \end{bmatrix}_{2^{n-1} \times 2^n}.$$
(3) The $l$-dimension vectors $[u_1, u_2, \ldots, u_l] \in \{0, 1\}$ ($l \in \mathbb{N}$) are constructed, and an integer $e$ can be expressed as
$$e = \sum_{i=1}^{l} 2^{l-i} u_i,$$
where, for a given $e$, there is a unique set of $l$-dimension vectors $[u_1, u_2, \ldots, u_l] \in \{0, 1\}$, and the integer $e$ represents the sequence number of the node at each layer, where $0 \le e \le 2^{n-1}$.
(4) The hierarchical decomposition of a given time series $X$ yields a hierarchical component corresponding to the node $e$ at the $K$th level, defined as
$$X_{K,e} = Q_{u_K} \cdot Q_{u_{K-1}} \cdots Q_{u_1}(X).$$
(5) By calculating the SlopEn of nodes on different layers, the HSlopEn can be expressed as
$$\mathrm{HSlopEn}(X, m, K, \gamma, \delta) = \mathrm{SlopEn}(X_{K,e}, m, \gamma, \delta),$$
where $K$ is the number of layers of decomposition, $\gamma$ and $\delta$ are the two thresholds of SlopEn, and $m$ is the embedding dimension. Figure 2 shows the hierarchical decomposition structure for $K = 3$. SlopEn is calculated on each node after the hierarchical decomposition.
In Figure 2, $X$ indicates the original time series, $x_{1,1}$ is the first node of the first layer, $x_{2,1}$ is the first node of the second layer, and so on.
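The hierarchical decomposition described in this section can be sketched as follows. This is an illustrative sketch, not the authors' code: it assumes the averaging and differencing operators act on non-overlapping adjacent pairs, that the signal length is divisible by 2^K, and that nodes are indexed from 0 within each layer (Figure 2 appears to number them from 1). The function name `hierarchical_nodes` is hypothetical. Applying an entropy function such as SlopEn to each returned component would then give the node-wise HSlopEn values.

```python
def hierarchical_nodes(x, K=3):
    """Return {(layer, e): component} for layers 0..K of the hierarchical decomposition.

    Each parent node is split into a low-frequency child (pairwise average, Q0)
    and a high-frequency child (pairwise half-difference, Q1).
    """
    if K < 1:
        raise ValueError("K must be a positive integer")
    if len(x) % (2 ** K) != 0:
        raise ValueError("signal length must be divisible by 2**K")
    nodes = {(0, 0): list(x)}  # layer 0 holds the original series
    for layer in range(1, K + 1):
        for e in range(2 ** (layer - 1)):
            parent = nodes[(layer - 1, e)]
            pairs = list(zip(parent[0::2], parent[1::2]))
            # Appending u = 0 or u = 1 to the path vector maps node e to 2e + u.
            nodes[(layer, 2 * e)] = [(a + b) / 2 for a, b in pairs]      # Q0
            nodes[(layer, 2 * e + 1)] = [(a - b) / 2 for a, b in pairs]  # Q1
    return nodes


# Small example: an 8-point series decomposed into two layers.
example = hierarchical_nodes([1, 3, 5, 7, 2, 4, 6, 8], K=2)
```

Each layer halves the component length, so a 1024-point sample decomposed with K = 3 yields eight third-layer components of 128 points each.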

Analysis of the Parameters for HSlopEn
The main parameters of HSlopEn include the number of decomposition layers K, the embedding dimension m, the two threshold parameters γ and δ, and the time delay d. First, the number of decomposition layers K determines the number of nodes in the hierarchical decomposition. If K is too large, too many nodes are produced, and calculating the SlopEn values of all nodes becomes computationally expensive; if K is too small, too few nodes are produced, and the frequency bands of the given time series are not sufficiently resolved. Following other references, K defaults to 3 in this paper. Second, the embedding dimension m is used to extract the subsequences of a given time series. If m is too small, it is difficult to determine the dynamic changes of the time series; if it is too large, it is difficult to capture the subtle changes in the time series. Third, the two threshold parameters γ and δ are used to divide the symbol patterns of a given subsequence, which affects the entropy value. Lastly, the time delay d defaults to 1, as important frequency information may be lost if d > 1.
The effect of the embedding dimension and the thresholds on the performance of HSlopEn is investigated below by analyzing noise signals.
To investigate the effect of the embedding dimension on the HSlopEn value, 50 sets of white Gaussian noise (WGN) of signal length 2048 are used, with the embedding dimension m varying from 2 to 5 and the two threshold parameters γ and δ defaulting to 0.1 and 0.001, respectively. Figure 3 shows the mean and standard deviation (SD) of the HSlopEn values for different embedding dimensions in every node.

As shown in Figure 3, the entropy value of HSlopEn increases with the embedding dimension m, but the entropy values of the nodes remain close to one another at each embedding dimension, and the means and SDs differ only slightly. This indicates that the embedding dimension affects the magnitude of the entropy value but hardly affects the stability of HSlopEn. The embedding dimension m is set to 3 in this paper.
In addition, to further study the effect of the thresholds γ and δ on the entropy of HSlopEn, 50 independent pink noise (PN) and WGN signals are selected, where each noise is sampled at 2048 Hz and the embedding dimension m is 3. Three sets of thresholds (γ, δ) for HSlopEn are manually set, namely (0.1, 0.001), (0.3, 0.1), and (0.8, 0.3), and the mean and standard deviation (SD) of the HSlopEn values for the three sets of thresholds in every node are displayed in Figure 4.
It can be seen in Figure 4 that, as the thresholds change, the entropy values of the two types of noise signals change, and so does the ability to discriminate between them; the thresholds therefore have a significant effect on the entropy of HSlopEn. The WSO is used in this paper to optimize the thresholds, which avoids choosing values based on manual experience and further improves the fault diagnosis.
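The embedding-dimension experiment can be approximated with a short script. The sketch below is a simplification under stated assumptions: it evaluates SlopEn on the raw WGN signals rather than on every hierarchical node, it assumes unit-variance noise, and it uses the five-symbol SlopEn definition with the natural logarithm. The function names and the random seed are illustrative only, and the numbers it produces are not the values plotted in Figure 3.

```python
import math
import random
import statistics
from collections import Counter


def slope_entropy(x, m, gamma=0.1, delta=0.001):
    """Compact SlopEn: Shannon entropy of five-level slope patterns."""
    def symbol(d):
        if d > gamma:
            return 2
        if d > delta:
            return 1
        if d >= -delta:
            return 0
        return -1 if d >= -gamma else -2

    k = len(x) - m + 1  # number of length-m subsequences
    patterns = Counter(
        tuple(symbol(x[i + j + 1] - x[i + j]) for j in range(m - 1))
        for i in range(k)
    )
    return -sum(c / k * math.log(c / k) for c in patterns.values())


def entropy_vs_embedding(n_signals=50, length=2048, dims=(2, 3, 4, 5), seed=0):
    """Return {m: (mean, SD)} of SlopEn over independent WGN realizations."""
    rng = random.Random(seed)
    signals = [[rng.gauss(0.0, 1.0) for _ in range(length)]
               for _ in range(n_signals)]
    result = {}
    for m in dims:
        values = [slope_entropy(s, m) for s in signals]
        result[m] = (statistics.mean(values), statistics.stdev(values))
    return result
```

Calling `entropy_vs_embedding()` with its defaults mirrors the setting of 50 signals of length 2048 with m from 2 to 5; repeating the call on each node of a hierarchical decomposition would be needed to reproduce the per-node analysis.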

WSO-HSlopEn and WSO-SVM
Following the principle of the HSlopEn algorithm, the two threshold parameters γ and δ of HSlopEn are used to divide the symbol patterns of the subsequences of a given time series. Thus, the two threshold parameters have a great influence on the HSlopEn value. At the same time, the classification performance of the support vector machine (SVM) mainly depends on the selection of the penalty factor (C) and the kernel function parameter (g), and it is generally difficult to choose these values based on manual experience. Hence, the selection of an appropriate penalty factor and kernel function parameter is also particularly important for the classification and recognition accuracy of the SVM.
To enhance the fault diagnosis performance, this paper takes the recognition rate as the fitness function and uses the white shark optimizer (WSO) to optimize the parameters of HSlopEn and the SVM, and WSO-HSlopEn and WSO-SVM are proposed, respectively. The WSO is a meta-heuristic optimization algorithm inspired by the deep-sea foraging of great white sharks, proposed in 2022 for solving optimization problems on continuous search spaces. The main process of optimizing the parameters of HSlopEn and the SVM is shown in Figure 5, and the specific process is as follows:
(1) Set the initial parameter ranges of HSlopEn (γ, δ) and the SVM (C, g).
(2) Initialize the WSO parameters, such as the population size, the number of iterations I, and the positions and speeds of the white sharks.
(3) Calculate the fitness function, and update the white sharks' positions and speeds.
(4) Evaluate the fitness function, and update the optimal white shark position.
(5) Update the positions and speeds of the white sharks.
(6) Judge whether the current iteration number has reached the maximum iteration number. If so, output the best-optimized parameters (γ, δ) and (C, g); otherwise, return to update the speeds and positions of the white sharks and repeat the above steps.
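The overall shape of this parameter search can be sketched as below. This is explicitly not the WSO: the position update is a plain perturbation around the best agent, swapped in as a stand-in because the WSO update equations are not given in this section. The parameter ranges, the function names, and `toy_fitness` are hypothetical. In the method described above, the fitness would instead extract HSlopEn features with (γ, δ), train an SVM with (C, g), and return the recognition rate.

```python
import random


def optimize_parameters(fitness, bounds, n_agents=20, n_iter=30, seed=0):
    """Maximize `fitness` inside the box `bounds`.

    NOTE: the position update is a simple perturbation around the best agent,
    used as a stand-in. It is NOT the WSO update rule.
    """
    rng = random.Random(seed)
    # Steps (1)-(2): draw the initial population inside the parameter ranges.
    agents = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_agents)]
    best_pos, best_fit = None, float("-inf")
    for _ in range(n_iter):  # step (6): stop at the maximum iteration number
        # Steps (3)-(4): evaluate the fitness and keep the best position so far.
        for pos in agents:
            fit = fitness(pos)
            if fit > best_fit:
                best_pos, best_fit = list(pos), fit
        # Step (5): move the agents (stand-in update, clipped to the bounds).
        agents = [
            [min(hi, max(lo, b + rng.gauss(0.0, 0.1 * (hi - lo))))
             for b, (lo, hi) in zip(best_pos, bounds)]
            for _ in range(n_agents)
        ]
    return best_pos, best_fit


def toy_fitness(params):
    """Hypothetical smooth surrogate for the recognition rate, peaking at 1.0."""
    gamma, delta, c, g = params
    if delta >= gamma:  # SlopEn is assumed to require delta < gamma
        return 0.0
    target = (0.5, 0.05, 10.0, 1.0)
    scale = (1.0, 0.2, 100.0, 10.0)
    return 1.0 / (1.0 + sum(((p - t) / s) ** 2
                            for p, t, s in zip(params, target, scale)))


# Illustrative ranges for (gamma, delta, C, g); the paper's ranges are not stated here.
bounds = [(0.0, 1.0), (0.0, 0.2), (0.01, 100.0), (0.01, 10.0)]
best_params, best_rate = optimize_parameters(toy_fitness, bounds)
```

Searching all four parameters in one vector is a design choice of this sketch; the two pairs could equally be optimized in two separate stages, as the names WSO-HSlopEn and WSO-SVM suggest.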

The Proposed Method for Fault Diagnosis of Rolling Bearing
By combining SlopEn with the concept of hierarchical structure, the new complexity feature HSlopEn is proposed, and the parameters of both HSlopEn and the SVM are optimized using the WSO algorithm, yielding WSO-HSlopEn and WSO-SVM, respectively. Then, a dual-optimization fault diagnosis method for rolling bearings based on WSO-HSlopEn and WSO-SVM is proposed. Figure 6 presents the flowchart of the proposed fault diagnosis method, which mainly includes the following steps:
(1) The different bearing signals are input. In this paper, each type of bearing signal has 100 samples with 1024 data points.
(2), (3) [...] are obtained. In this paper, bearing signals are decomposed into three layers.
(4) The nodes of WSO-HSlopEn are calculated, and then single-feature and multi-feature extraction experiments for bearing signals are carried out. Meanwhile, comparisons with some classical entropies, such as HFE, HPE, HSE, and HRDE, are conducted.
(5) WSO-SVM is applied to classify bearing signals, and the recognition results are output. In this paper, for each type, 25 sample signals are selected as training samples and 75 sample signals as test samples.

Experiments and Results
In this section, two comparative experiments are carried out to examine the effectiveness of the proposed method in fault diagnosis: (1) For optimizing both HSlopEn and SVM parameters, the WSO is compared with other optimization algorithms, including SSA, MPA, and SO. (2) For extracting node features, WSO-HSlopEn is compared with classical hierarchical entropy metrics, including HPE, HSE, HFE, and HRDE.

Fault Diagnosis of Rolling Bearing Signal
The dataset used in this section was derived from the Bearing Data Center of Case Western Reserve University [36], which is an internationally recognized standard dataset for fault diagnosis of rolling bearings. The schematic of the test rig (Cleveland, USA) is shown in Figure 7. As shown in Figure 7, the test rig consisted of an induction motor, drive-end bearing, self-aligning coupling, and accelerometer dynamometer. An accelerometer was installed on the base of the motor and was used to detect the vibration acceleration of the faulty bearing at a sampling frequency of 12 kHz. The dataset divided the fault data into four categories: normal data (NOR), ball faults (BFs), outer race faults (ORFs), and inner race faults (IRFs). Among them, BFs, ORFs, and IRFs were simulated faults with single-point damage introduced by electric spark machining. The damage diameters were 0.007, 0.014, and 0.021 inches. The faulted bearings were then reinstalled in the test motor, and the vibration acceleration signal data were recorded under load conditions of 0, 1, 2, and 3 horsepower.
In this section, bearing signals with ten conditions were collected from the drive-end bearings, including rolling bearings in normal condition and those with damage to the inner race, the outer race, and the ball element. Bearings with various damage diameters were considered at a speed of 1730 rpm with a load of 3 horsepower. Table 1 lists the fault diagnosis sample collection of bearing signals. Each fault signal was divided into three types according to the fault diameter. We sampled from point 1001, and each condition had 100 samples with 1024 sampling points. Time-domain waveforms of the bearing signals in each state are displayed in Figure 8.
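The sample construction described above can be sketched as follows. Two points are assumptions, since the text does not state them: the 100 samples are taken as consecutive, non-overlapping windows, and the first 25 samples of each condition are used for training with the remaining 75 used for testing. The function names are hypothetical, and a synthetic list stands in for a recorded vibration signal.

```python
def segment_signal(signal, start=1000, n_samples=100, length=1024):
    """Cut `n_samples` consecutive, non-overlapping windows of `length` points.

    `start` is a 0-based index, so start=1000 corresponds to point 1001.
    """
    end = start + n_samples * length
    if end > len(signal):
        raise ValueError("signal is too short for the requested segmentation")
    return [signal[start + i * length: start + (i + 1) * length]
            for i in range(n_samples)]


def split_samples(samples, n_train=25):
    """Use the first `n_train` samples for training and the rest for testing."""
    return samples[:n_train], samples[n_train:]


# Synthetic stand-in for one recorded bearing signal (not real data).
signal = list(range(110000))
samples = segment_signal(signal)
train, test = split_samples(samples)
```

With ten conditions, this split gives 250 training samples and 750 test samples in total.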

Comparison of Different Optimization Algorithms
To verify the performance advantages of the WSO in optimizing HSlopEn and the SVM, this section uses different optimization algorithms to optimize the parameters of HSlopEn and the SVM and compares the resulting recognition rates of single-feature and multi-feature extractions [37][38][39]. In this experiment, each of the 10 bearing signal conditions was sampled starting from point 1001, and 100 samples of 1024 data points each were selected. First, the parameters of HSlopEn were set as follows: hierarchical layer K = 3 and embedding dimension m = 3, with the threshold parameters γ and δ adaptively determined using the different optimization algorithms. HSlopEn with optimized parameters was then extracted from the bearing signals. Next, the sample set was divided into a training set and a test set, and the selected single feature or multiple features were used as input for optimizing the SVM. The penalty factor and kernel function parameter of the SVM were also adaptively determined using the WSO algorithm. Figure 9 presents the fitness iteration curves of the different optimization algorithms when optimizing HSlopEn and the SVM in the case of extracting five nodes.
It can be found in Figure 9 that, when five nodes are extracted, the highest recognition rate of the ten types of bearing signals reached 100% using the WSO to optimize HSlopEn. The WSO also converged faster than the other optimization algorithms, converged quickly in the early iterations, and its fitness curve eventually converged to a higher value.
To further demonstrate the advantages of the WSO, we calculated the recognition rates of bearing signals obtained with the different optimization algorithms for both single-feature and multi-feature extraction. The recognition rates of HSlopEn for all single-feature nodes are shown in Tables 2 and 3, and the highest recognition rates of HSlopEn with multiple features for the four optimization algorithms are shown in Table 4. According to the recognition rates of the different types of bearing signals, the WSO algorithm has a clear advantage no matter how many features are extracted. In the case of extracting a single feature, the recognition rate of the WSO reaches 79.33% on node 6, which is much higher than that of the other optimization algorithms. When multiple features are extracted, the recognition rate improves as the number of selected nodes increases. When five features are selected, all samples are identified correctly. The recognition rates of the other optimization algorithms, namely SO, MPA, and SSA, are 3.8%, 10.53%, and 17.73% lower than that of the WSO, respectively. Overall, these results indicate that using the WSO to optimize HSlopEn and the SVM is feasible. Therefore, in this paper, the WSO is used to optimize the HSlopEn and SVM parameters.

Comparison of Different Hierarchical Entropies
Aiming to demonstrate the superiority of WSO-HSlopEn in fault diagnosis, we compared it to other classical hierarchical entropies, including HSE, HFE, HPE, and HRDE.
The single-feature approach was first used to extract the fault feature, and the result was compared with HFE, HSE, HPE, and HRDE. The parameters of HSlopEn were as follows: hierarchical layer K = 3, embedding dimension m = 3, and time delay d = 1, with the threshold parameters γ and δ adaptively determined using the WSO algorithm. For a fair comparison, the shared parameter settings of the other hierarchical entropies were the same as those of HSlopEn. In addition, the similarity tolerances of HSE and HFE were set as r = 0.2, and the category number of HRDE was set as c = 3. The entropy distributions of an optimal node for the single-feature extraction of bearing signals are shown in Figure 10.
After using WSO-HSlopEn as the fault feature of the bearing signal, the bearing fault diagnosis sample set was divided into a training set and a test set; the training set was input into the WSO-SVM to train the model, and then the test set was input into the model to complete the fault diagnosis of bearings. The Gaussian kernel function was selected as the kernel function of the SVM. The penalty factor and kernel function parameter of the SVM were also adaptively determined by the WSO algorithm. Recognition rates of single features for the five types of hierarchical entropies are displayed in Tables 5 and 6. Tables 5 and 6 illustrate that, when using WSO-HSlopEn, the recognition rate of node 6 was the highest, at 79.33%. Compared with the other hierarchical entropies, the recognition rate based on WSO-HSlopEn was the highest at every node, which shows the effectiveness of WSO-HSlopEn as a fault diagnosis feature of bearing signals.
Through observation, when single-feature extraction is used to extract the fault feature, there is still overlap between the features of different conditions of the bearing signals. Furthermore, the recognition rate of the best node was low, and there were many misclassified samples based on single-feature extraction. Aiming to further improve the recognition rate of different conditions of the bearing signals, double features were used to extract the bearing signals. All parameters used in the experiments were the same as those listed in the single-feature extraction. The entropy distribution on the optimal node for double-feature extractions of bearing signals is shown in Figure 11, where the abscissa and ordinate are the entropy values of the two nodes, respectively. For example, in Figure 11a, the abscissa is the SlopEn of node 1, and the ordinate is the SlopEn of node 5. recognition rate of different conditions of the bearing signals, double features were used to extract the bearing signals. All parameters used in the experiments were the same as those listed in the single-feature extraction. The entropy distribution on the optimal node for double-feature extractions of bearing signals is shown in Figure 11, where the abscissa and ordinate are the entropy values of the two nodes, respectively. For example, in Figure  11a, the abscissa is the SlopEn of node 1, and the ordinate is the SlopEn of node 5. As can be observed from Figure 11, in the case of double-feature extraction, the WSO-HSlopEn distribution of sample signals belonging to the same type is relatively concentrated compared to other hierarchical entropies; for the other four types of hierarchical entropies, the bearing signals between different types are more divergent, and the entropy values of different types of bearing signals are very close.
To further improve the recognition performance, triple features were extracted from the bearing signals for each hierarchical entropy. The parameters for calculating the hierarchical entropies were the same as those used for double features. Figure 12 presents the triple-feature distributions of the ten types of bearing signals for the different hierarchical entropies.
It can be seen from Figure 12 that there is almost no overlap for WSO-HSlopEn, although the BF2 and IRF2 samples cluster relatively loosely; for the other hierarchical entropies, the samples cluster very poorly because their entropy distributions are similar across types. By contrast, the entropy distribution of WSO-HSlopEn is more spread out across fault types, and the WSO-HSlopEn values of different fault types differ markedly, which supports the validity of WSO-HSlopEn as a feature extraction method for the ten types of bearing signals.
Next, WSO-SVM is used to construct a fault diagnosis model. The highest recognition rate of each of the five types of hierarchical entropies under multi-feature extraction is shown in Table 7, where (1,5) indicates that the two-feature combination with the highest recognition rate consists of node 1 and node 5, (1,5,6) indicates that the three-feature combination with the highest recognition rate consists of node 1, node 5, and node 6, and so on.
Table 7. Highest recognition rate for the five types of hierarchical entropies for multi-features.
Table 7 shows that, regardless of how many features are extracted, the recognition rate of the ten types of bearing signals using WSO-HSlopEn is higher than that of the other hierarchical entropies. Additionally, the more features are selected, the better the recognition effect. With multi-features, the recognition rates of WSO-HSlopEn are all higher than 97.5%, whereas the highest recognition rates of the other hierarchical entropies are all below 97.5%. For WSO-HSlopEn, when five nodes are selected, namely nodes (1,5,6,7,11), the recognition rate of the ten types of bearing signals reaches 100%; the highest recognition rates of the other entropies are, respectively, 3.80%, 10.53%, 16.73%, and 4.13% lower than that of WSO-HSlopEn. This comparison shows the clear advantage of the proposed method based on WSO-HSlopEn: its recognition rates for rolling bearing faults are higher than those of the classical methods.
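The node combinations in Table 7 are those with the highest recognition rate for each number of features. One simple way to find such a combination is an exhaustive search over node subsets, sketched below in Python. The scoring function is a leave-one-out nearest-neighbour accuracy used purely as a lightweight stand-in for the WSO-SVM recognition rate; the function names and the zero-based node indices are hypothetical, and the text does not state that the reported combinations were found this way.

```python
from itertools import combinations


def loo_nearest_neighbour_accuracy(samples, labels):
    """Leave-one-out 1-nearest-neighbour accuracy (stand-in for WSO-SVM)."""
    correct = 0
    for i, s in enumerate(samples):
        # Closest other sample by squared Euclidean distance.
        nearest = min(
            (j for j in range(len(samples)) if j != i),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(s, samples[j])),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(samples)


def best_node_combination(features, labels, n_nodes, score_fn):
    """Try every subset of n_nodes columns and keep the best-scoring one.

    features: one row per sample, one entropy value per node.
    Returns (tuple of zero-based node indices, score).
    """
    n_total = len(features[0])
    best_combo, best_score = None, float("-inf")
    for combo in combinations(range(n_total), n_nodes):
        subset = [[row[k] for k in combo] for row in features]
        score = score_fn(subset, labels)
        if score > best_score:
            best_combo, best_score = combo, score
    return best_combo, best_score


# Toy data (made up): only the middle column separates the two classes.
toy_features = [
    [0.5, 0.00, 0.1], [0.1, 0.10, 0.9], [0.9, 0.05, 0.5],  # class "A"
    [0.5, 1.00, 0.1], [0.1, 1.10, 0.9], [0.9, 1.05, 0.5],  # class "B"
]
toy_labels = ["A", "A", "A", "B", "B", "B"]
combo, score = best_node_combination(
    toy_features, toy_labels, 1, loo_nearest_neighbour_accuracy
)
```

The number of subsets grows combinatorially with the number of nodes and features, so this brute-force search is only practical for small node counts such as those of a three-layer decomposition.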

Conclusions
This paper puts forward a dual-optimization fault diagnosis method for rolling bearings based on WSO-HSlopEn and WSO-SVM. The effectiveness of the proposed method is verified by comparing it with classical methods. The main innovations and conclusions are as follows:
(1) On the basis of SlopEn, combined with the idea of hierarchical decomposition, HSlopEn is proposed and introduced into the feature extraction of bearing signals for the first time; at the same time, WSO is used to optimize both HSlopEn and the SVM, and WSO-HSlopEn and WSO-SVM are proposed.
(2) In the case of single-feature extraction, the proposed method based on WSO-HSlopEn has the highest recognition rate of 79.33% on node 6, which is, respectively, 19.47%, 26.67%, 31.60%, and 13.73% higher than those of HFE, HPE, HSE, and HRDE.
(3) In the case of multi-feature extraction, the recognition rates are higher than 97.5%, which is a significant improvement over single-feature extraction; moreover, for every number of features, the recognition rate based on WSO-HSlopEn is always higher than those of the other hierarchical entropies.
(4) For the proposed dual-optimization fault diagnosis method for rolling bearings based on WSO-HSlopEn and WSO-SVM, the more features are selected, the better the recognition effect. When five nodes are selected, the highest recognition rate reaches 100%.
The proposed WSO-HSlopEn and WSO-SVM address the dependence of SlopEn and the SVM, respectively, on their parameter settings, and their superiority has been confirmed in fault diagnosis. Therefore, WSO-HSlopEn and WSO-SVM are expected to be applied to other fields in future work, such as underwater acoustic signal processing and medical signal classification.

Data Availability Statement:
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest:
The authors declare no conflict of interest.