Multi-Domain Entropy-Random Forest Method for the Fusion Diagnosis of Inter-Shaft Bearing Faults with Acoustic Emission Signals

Inter-shaft bearings, as key components of turbomachinery, are a major source of catastrophic accidents. Owing to their high sampling frequency and high sensitivity to impact signals, acoustic emission (AE) signals are widely applied to monitor and diagnose inter-shaft bearing faults. To address the nonstationarity and nonlinearity of inter-shaft bearing AE signals, this paper presents a novel fault diagnosis method for inter-shaft bearings, called the multi-domain entropy-random forest (MDERF) method, which fuses multi-domain entropy and random forest. Firstly, a simulation test of inter-shaft bearing faults is conducted to reproduce the typical fault modes of inter-shaft bearings and to collect AE signal data. Secondly, multi-domain entropy is proposed as a feature extraction approach to extract four entropies of the AE signal. Finally, the samples in the built set are divided into two subsets, which are used to train and to test the random forest model of bearing fault diagnosis, respectively. The effectiveness and generalization ability of the developed model are verified on the other experimental data. The proposed fault diagnosis method is shown to have good generalization ability and high diagnostic accuracy (~0.9375) without over-fitting in the fault diagnosis of inter-shaft bearings.


Introduction
The inter-shaft bearing, which operates between the high- and low-pressure rotors, is a key component of aeroengines. The failure of an inter-shaft bearing can have a catastrophic effect on an aeroengine. Identifying and diagnosing inter-shaft bearing faults of aeroengines early and accurately helps to avoid major accidents, and thus has significant economic benefit and engineering significance [1].
Nowadays, there are many ways to monitor fault signals of rolling bearings, such as noise signals, vibration signals, and AE (Acoustic Emission) signals. Noise signals often contain considerable environmental noise, which makes the fault signal difficult to identify; hence, they are often used in combination with vibration signals in practical applications. The vibration signal has become the most widely used monitoring signal because it is easy to detect and intuitive to interpret. However, the inter-shaft bearing is located inside the engine rotor, and its vibration signal is easily affected by the connection and transmission parts, so that it is drowned out by other noise signals. The AE signal is released in the form of an instantaneous elastic wave when the energy accumulated in a deforming object reaches a certain level because of the unstable stress distribution inside the object. Common faults of rolling bearings, such as wear, deformation, and cracks, produce a large number of AE signals. As such, the AE signal is widely used in inter-shaft bearing fault diagnosis, owing to its high frequency and its sensitivity to impact signals [2]. Although AE signals avoid noise interference better than vibration signals, the AE signal of a fault is still relatively weak and contains a lot of mechanical noise because of the complicated signal transmission path. Therefore, developing effective signal analysis techniques that improve the signal-to-noise ratio and identify faults accurately remains an active research topic. In recent years, a variety of effective fault diagnosis techniques have been developed based on information fusion theory and pattern recognition methods. Hsieh et al.
used the combination of empirical mode decomposition (EMD) and multi-scale information entropy to accurately identify many high-speed rotor faults, such as imbalance, misalignment, and poor lubrication [3]. For imbalance and cracks of motor rotors and for single and coupled faults of bearings, Romero et al. revealed that the fuzzy logic reasoning method could precisely classify and identify the information entropy of different faults, and developed an on-line monitoring system based on this theory [4]. Yu et al. proposed a motor rolling bearing fault diagnosis method based on pattern spectrum entropy and proximal support vector machine (PSVM) [5]. Ai et al. introduced the fusion information entropy distance method for the fault diagnosis of rolling bearings based on wavelet spectral entropy, singular spectral entropy, power spectral entropy, and wavelet spectral entropy of the AE and vibration signals [6]. Based on the combination of singular value decomposition and information entropy, Hernandez et al. extracted the fault features of faulty rotors and bearings to accurately identify faults [7]. Information entropy is a measure of information uncertainty: the larger its value, the higher the system complexity. As some faults have similar signal characteristics, information entropy alone can only measure their complexity and support preliminary noise reduction and classification; it cannot classify the faults accurately. Recently, information entropy has been widely applied as the feature extraction method of fault diagnosis [8][9][10][11][12][13][14]. The purpose of fault diagnosis is to predict the discrete values at different fault states of a diagnostic object. It is a classification task, and the core issue is classifier design. Random forest is a combined classification algorithm of decision-trees based on stochastic statistical theory and belongs to the supervised learning methods.
This algorithm is a nonlinear modeling tool with the advantages of fast calculation speed, high classification accuracy, and strong generalization ability. When the features of the learning samples are distinct, it obtains better classification accuracy and robustness. It has been applied in many fields, such as finance and biology, and has achieved good classification results [15]. Gómez-Peñate et al. presented the design of an H∞ sliding mode and unknown input observer for Takagi-Sugeno (TS) systems. Contrary to the common approaches that consider exact premise variables, this work deals with the problem of inexact measurements of the premise variables. The method is robust to disturbances, sensor noise, and uncertainty in the premise variables [16]. Kobayashi et al. proposed a new automatic fault detection method in which the signals measured by an accelerometer located far from the diagnosed bearing can be used to detect the bearing faults automatically [17]. Santos-Ruiz et al. described a data-driven system based on PCA (Principal Component Analysis) to detect and quantify fluid leaks in an experimental pipeline, and used a dynamic PCA implementation (DPCA) to capture the process dynamics [18].
To make up for the shortcomings of the traditional information entropy method in extracting strongly non-stationary inter-shaft bearing fault signals, this paper establishes a fusion of multiple information entropies with the characteristics of several analysis domains, namely multi-domain entropy. Based on the theory of information entropy fault diagnosis, a multi-domain entropy-random forest fault diagnosis method is presented by integrating the advantages of multi-domain entropy and random forest. A simulation experiment of four typical faults is conducted on an inter-shaft bearing fault simulation rig. The multi-domain entropy of the AE signal of each fault is extracted to build the inter-shaft bearing fault feature vector samples. A random forest is generated from the fault sample data, and these data are used to test the accuracy and generalization ability of the random forest diagnosis and to verify the effectiveness of the multi-domain entropy-random forest fault diagnosis method.
The structure of the paper is as follows. Four information entropies that reflect different domains, namely singular spectrum entropy (SSE), power spectrum entropy (PSE), wavelet energy spectrum entropy (WESE), and wavelet space feature spectrum entropy (WSFSE), together with multi-domain entropy, are introduced in Section 2. The process of building and evaluating the RF (Random Forest) is introduced in Section 3. In Section 4, the multi-domain entropy-random forest method is proposed. In Section 5, the rolling bearing fault simulation experiments are carried out to evaluate the present method. Finally, conclusions are given in Section 6.

Information Entropy Theory
Information entropy is a concept used to measure information content in information theory. The more orderly a system is, the lower its information entropy; conversely, the more disordered it is, the higher its information entropy. Therefore, information entropy can also be regarded as a measure of the degree of order of a system [8]. The information entropy of a normal bearing is lower than that of a faulty bearing; therefore, it can be used to evaluate the working status of the bearing.
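Assume that $M$ is a Lebesgue space with measure $\mu$ ($\mu(M) = 1$), generated by a measurable set $H$, and that $M$ can be written as a finite partition into mutually incompatible sets $A = \{A_i\}$, i.e., $M = \bigcup_{i=1}^{n} A_i$ and $A_i \cap A_j = \emptyset$ for $i \neq j$. The information entropy of the partition $A$ is then
$$H(A) = -\sum_{i=1}^{n} \mu(A_i) \log \mu(A_i),$$
where $\mu(A_i)$ is the measure of the $i$th sample $A_i$, $i = 1, 2, \ldots, n$.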
In conclusion, when the state of a rolling bearing is evaluated by information entropy, an appropriate partition and the corresponding measure must be chosen according to the characteristics of the AE signal.
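As a minimal sketch of the definition above, the entropy of a finite partition can be computed directly from the cell measures. The function name `partition_entropy` and the example measures are illustrative assumptions, not taken from the paper; the natural logarithm is used.

```python
import math

def partition_entropy(measures):
    """Information entropy H = -sum(mu_i * log(mu_i)) of a finite partition.

    `measures` holds the measures mu(A_i) of the partition cells. They must be
    non-negative and sum to 1; cells of measure 0 contribute nothing.
    """
    if any(m < 0 for m in measures):
        raise ValueError("measures must be non-negative")
    if not math.isclose(sum(measures), 1.0, rel_tol=1e-9, abs_tol=1e-12):
        raise ValueError("measures must sum to 1")
    return -sum(m * math.log(m) for m in measures if m > 0)

# Hypothetical partitions: one fully ordered, one evenly spread over 4 cells.
ordered = partition_entropy([1.0, 0.0, 0.0, 0.0])
uniform = partition_entropy([0.25, 0.25, 0.25, 0.25])
```

The ordered partition gives zero entropy and the uniform one gives log 4, the largest value possible for four cells, which matches the statement that a more disordered system has higher information entropy.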

Time Domain Information Entropy Features
The AE signal of any measuring point is a discrete time series. By the delay embedding technique, an arbitrary AE signal $\{x_i\}$ ($i = 1, 2, \ldots, N$), where $N$ is the number of samples, is mapped to an embedded space. When the length of the modal window is $M$ and the delay constant is 1, the signal $\{x_i\}$ can be divided into $N - M$ segments of modal data to obtain a pattern matrix $A$, i.e.,
$$A = \begin{bmatrix} x_1 & x_2 & \cdots & x_M \\ x_2 & x_3 & \cdots & x_{M+1} \\ \vdots & \vdots & & \vdots \\ x_{N-M} & x_{N-M+1} & \cdots & x_{N-1} \end{bmatrix}.$$
By singular value decomposition (SVD), the singular values $\sigma_i$ ($1 \le i \le M$) of the matrix $A$ are obtained. The number of non-zero singular values reflects the number of patterns contained in each column of the matrix, and their size reflects the proportion of each mode in the total. In the sense of information entropy, the singular values therefore form a time-domain partition of the AE signal [10]. The singular spectrum entropy (SSE) of the AE signal is
$$H_t = -\sum_{i=1}^{M} p_i \log p_i,$$
where $p_i = \sigma_i / \sum_{j=1}^{M} \sigma_j$ is the ratio of the $i$th singular spectrum to the whole spectrum. The singular spectrum entropy reaches its maximum, $H_{t,\max} = \log M$, for white noise. According to this feature, the entropy may be normalized by that of white noise, and the SSE formula is rewritten as
$$H_t = -\frac{1}{\log M} \sum_{i=1}^{M} p_i \log p_i.$$
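The SSE computation can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions: the function name, the window length of 20, and the two synthetic test signals are hypothetical, and the pattern matrix uses N − M rows to follow the segment count given in the text.

```python
import numpy as np

def singular_spectrum_entropy(x, window):
    """Normalized singular spectrum entropy of a 1-D signal.

    Rows of the pattern matrix are length-`window` segments taken with delay 1.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    if not 1 < window < n:
        raise ValueError("window must satisfy 1 < window < len(x)")
    rows = n - window  # N - M segments, as stated in the text
    A = np.array([x[k:k + window] for k in range(rows)])
    sigma = np.linalg.svd(A, compute_uv=False)  # singular values only
    total = sigma.sum()
    if total == 0:
        return 0.0  # an all-zero signal carries no pattern
    p = sigma / total
    p = p[p > 0]  # treat 0 * log(0) as 0
    return float(-(p * np.log(p)).sum() / np.log(window))

# Hypothetical signals: a pure sine (highly ordered) and Gaussian white noise.
rng = np.random.default_rng(0)
t = np.arange(2000)
sse_sine = singular_spectrum_entropy(np.sin(2 * np.pi * t / 50), 20)
sse_noise = singular_spectrum_entropy(rng.standard_normal(2000), 20)
```

A pure sine occupies only two singular directions, so its normalized SSE stays low, while the white-noise value is close to 1, in line with the normalization by log M.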

Frequency Domain Information Entropy Feature
When the frequency-domain signal $X(\omega)$ is the discrete Fourier transform of an AE time signal $\{x_t\}$, its power spectrum is $S(\omega) = \frac{1}{2\pi N} |X(\omega)|^2$. The transformation of the AE signal from the time domain to the frequency domain conserves energy:
$$\sum x^2(t)\,\Delta t = \sum |X(\omega)|^2\,\Delta\omega.$$
Therefore, $S = \{S_1, S_2, \ldots, S_N\}$ may be regarded as a partition of the original signal. The power spectrum entropy (PSE) $H_f$ of the AE signal is then defined by
$$H_f = -\sum_{i=1}^{N} q_i \log q_i,$$
where $q_i = S_i / \sum_{j=1}^{N} S_j$ is the ratio of the $i$th power spectrum to the whole spectrum. As for the SSE, the PSE is normalized by that of the white noise signal [19][20][21]. The PSE of white noise is $H_{f,\max} = \log N$, so the PSE formula is rewritten as
$$H_f = -\frac{1}{\log N} \sum_{i=1}^{N} q_i \log q_i.$$
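A minimal NumPy sketch of the normalized PSE is given below. The function name and the synthetic signals are illustrative assumptions; the constant factor 1/(2πN) cancels once the spectrum is turned into ratios, but it is kept so that the code mirrors the definition.

```python
import numpy as np

def power_spectrum_entropy(x):
    """Normalized power spectrum entropy of a 1-D signal."""
    x = np.asarray(x, dtype=float)
    n = x.size
    X = np.fft.fft(x)                      # discrete Fourier transform
    S = np.abs(X) ** 2 / (2 * np.pi * n)   # S(w) = |X(w)|^2 / (2*pi*N)
    total = S.sum()
    if total == 0:
        return 0.0  # an all-zero signal has no spectral content
    q = S / total
    q = q[q > 0]  # treat 0 * log(0) as 0
    return float(-(q * np.log(q)).sum() / np.log(n))

# Hypothetical signals: a sine with a whole number of periods, and white noise.
rng = np.random.default_rng(0)
t = np.arange(1000)
pse_sine = power_spectrum_entropy(np.sin(2 * np.pi * t / 50))
pse_noise = power_spectrum_entropy(rng.standard_normal(1000))
```

The sine concentrates its power in two frequency bins and gives a low value, whereas a finite white-noise record spreads its power over all bins and gives a value close to, but in practice slightly below, 1.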

Time-Frequency Domain Information Entropy Features
Wavelet analysis is a time-frequency analysis method developed to overcome the shortcomings of the Fourier transform [22]. Let the AE signal be the function $f(t)$ with finite energy. Its energy is conserved under the wavelet transform:
$$\int_{-\infty}^{\infty} |f(t)|^2\,dt = \frac{1}{C_\psi} \int_{0}^{\infty} a^{-2} E(a)\,da, \qquad E(a) = \int_{-\infty}^{\infty} |W_f(a, b)|^2\,db,$$
where $C_\psi$ is the admissibility constant of the wavelet function and $E(a)$ is the energy of $f(t)$ at scale $a$. When $E = \{E_1, E_2, \ldots, E_n\}$ denotes the wavelet energy spectrum of the signal $f(t)$ on $n$ scales, $E$ is regarded as a partition of the signal energy according to the definition of information entropy. Thus, the time-frequency domain wavelet energy spectrum entropy (WESE) $H_{we}$ [23] is defined by
$$H_{we} = -\sum_{i=1}^{n} p_i \log p_i,$$
where $p_i = E_i / \sum_{j=1}^{n} E_j$ is the ratio of the $i$th wavelet energy spectrum to the whole spectrum.
The wavelet transform isometrically maps the one-dimensional signal into a two-dimensional space. $W = |W_f(a, b)|^2 / (C_\psi a^2)$ is the energy distribution matrix of the signal in the two-dimensional wavelet space. Applying the SVD to the matrix $W$, as for the SSE, the time-frequency domain wavelet space feature spectrum entropy (WSFSE) [24] $H_{ws}$ is expressed as
$$H_{ws} = -\sum_{i=1}^{n} p_i \log p_i,$$
where $p_i = \sigma_i / \sum_{j=1}^{n} \sigma_j$ is the ratio of the $i$th feature spectrum (singular value of $W$) to the whole spectrum.
The basis function formed by wavelet is a division of signal energy in scale space, which reflects the energy distribution characteristic of signal in time and frequency domain and measures the information ordering of the rolling bearing AE signal accurately.
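The wavelet energy spectrum entropy can be illustrated with a discrete stand-in for the continuous transform. The sketch below assumes an orthonormal Haar discrete wavelet transform, which is not stated in the paper and is chosen only because it is short to write out; the function names, the number of levels, and the test signals are likewise hypothetical. The band energies (detail bands plus the final approximation) play the role of the partition E.

```python
import numpy as np

def haar_scale_energies(x, levels):
    """Energies of the Haar DWT detail bands (fine to coarse) and the final approximation."""
    a = np.asarray(x, dtype=float)
    if a.size % (2 ** levels) != 0:
        raise ValueError("len(x) must be divisible by 2**levels")
    energies = []
    for _ in range(levels):
        even, odd = a[0::2], a[1::2]
        d = (even - odd) / np.sqrt(2.0)  # detail coefficients at this scale
        a = (even + odd) / np.sqrt(2.0)  # approximation passed to the next scale
        energies.append(float(np.sum(d ** 2)))
    energies.append(float(np.sum(a ** 2)))
    return np.array(energies)

def wavelet_energy_spectrum_entropy(x, levels):
    """Entropy of the distribution of signal energy over the wavelet scales."""
    E = haar_scale_energies(x, levels)
    total = E.sum()
    if total == 0:
        return 0.0
    p = E / total
    p = p[p > 0]  # treat 0 * log(0) as 0
    return float(-(p * np.log(p)).sum())

rng = np.random.default_rng(0)
noise = rng.standard_normal(1024)
band_energy = haar_scale_energies(noise, 5).sum()   # equals the signal energy
wese_noise = wavelet_energy_spectrum_entropy(noise, 5)
wese_const = wavelet_energy_spectrum_entropy(np.ones(64), 3)
```

Because the Haar transform is orthonormal, the band energies sum to the signal energy, which is the discrete counterpart of the energy conservation used in the definition. A constant signal puts all of its energy in one band and has zero entropy. The WSFSE is not covered by this sketch.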

Multi-Domain Entropy
From the analysis above, the four information entropies (SSE, PSE, WESE, and WSFSE) each reflect the complexity of the AE signal during the acceleration or deceleration of the bearing, but from different domains. Fusing the four information entropies allows the fault information in the AE signal to be analyzed comprehensively. This improves the utilization of information and makes it possible to diagnose a fault from an early, weak signal. In this paper, the four information entropies span a four-dimensional space. For each rolling bearing fault, the four information entropies are obtained, and each entropy varies within a small range of values, the entropy band. The mean value of each entropy band is taken as its center, called the information entropy point. Combining the four information entropy points $(H_t, H_f, H_{we}, H_{ws})$ determines one multi-domain entropy point in the four-dimensional space.
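The fusion step described above amounts to averaging each entropy band and stacking the four means. A minimal sketch follows; the function name and all numerical values are hypothetical and serve only to show the data layout.

```python
from statistics import mean

def multi_domain_entropy_point(sse_band, pse_band, wese_band, wsfse_band):
    """Return (H_t, H_f, H_we, H_ws): the mean of each entropy band.

    Each band holds the entropy values of one fault over the tested conditions.
    """
    return (mean(sse_band), mean(pse_band), mean(wese_band), mean(wsfse_band))

# Hypothetical entropy bands of one fault at three rotational speeds.
point = multi_domain_entropy_point(
    sse_band=[0.60, 0.62, 0.64],
    pse_band=[0.40, 0.45, 0.50],
    wese_band=[0.30, 0.30, 0.33],
    wsfse_band=[0.70, 0.75, 0.80],
)
```

Each fault state then corresponds to one four-component vector, which is the form in which the features are later passed to the classifier.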

Random Forest Method for Fault Diagnosis
The random forest method is a statistical learning method proposed by Breiman that combines the "Bootstrap aggregating" and "random subspace" methods. It is a nonlinear modeling tool that overcomes two shortcomings of a single decision-tree: low accuracy and overfitting. The random forest method is well suited to fault problems in which prior knowledge is unclear or the data are incomplete.

Random Forest Algorithm Building
Random forest is a classifier consisting of a collection of decision-tree classifiers. The establishment of the algorithm is divided into three steps as follows. (1) T training sets are drawn from the original data set with replacement by the Bootstrap sampling method. Each training set contains the same number of samples as the original data set [22].
Assume that $X$ is a data set containing $n$ samples $\{x_1, x_2, \cdots, x_n\}$. A sample $x_i$ ($i = 1, 2, \cdots, n$) is drawn from the original data set $X$ with replacement, and this is repeated $n$ times to form a new set $X^*$. The probability that $X^*$ does not contain a given sample $x_j$ is
$$P = \left(1 - \frac{1}{n}\right)^n, \qquad \lim_{n \to \infty} \left(1 - \frac{1}{n}\right)^n = e^{-1} \approx 0.368.$$
When $n$ is large enough, about 36.8% of the samples in the original data set are therefore not drawn. Because each decision-tree of the random forest is trained on a different subset of the data, the trees do not all settle on the same locally optimal solution, which limits the influence of abnormal data in the sample set and yields a better classifier. Meanwhile, the samples that are not drawn, called the Out-Of-Bag (OOB) data, are used to estimate the generalization error, the correlation coefficient, and the strength of the decision-trees, so that the classification accuracy of the algorithm can be quantified. (2) The decision-tree model mentioned in this paper is shown in Equation (13).
The segmentation criterion consists of a segmentation variable and a segmentation predicate and is measured by an impurity function. The Gini coefficient [20] is the probability that two samples randomly selected from the data set carry different category labels, so it grows with the impurity level. The optimal segmentation is the one that gives the largest decrease of the Gini coefficient, which is defined as
$$G(t) = 1 - \sum_{j=1}^{J} p^2(j \mid t),$$
where $p(j \mid t)$ is the probability of the $j$th category in the node $t$, namely the proportion of samples of the $j$th category in the node, and $J$ is the total number of category labels.
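The impurity measure can be sketched in a few lines. The code assumes the usual CART convention that a candidate split is scored by the decrease in size-weighted Gini impurity; the function names and the toy labels are illustrative assumptions.

```python
from collections import Counter

def gini(labels):
    """Gini impurity 1 - sum_j p(j|t)^2 of the labels in one node."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def gini_decrease(parent, left, right):
    """Impurity decrease obtained by splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
    return gini(parent) - weighted

# Hypothetical node holding two samples of each of two fault classes.
node = ["ball", "ball", "outer", "outer"]
pure_split = gini_decrease(node, ["ball", "ball"], ["outer", "outer"])
useless_split = gini_decrease(node, ["ball", "outer"], ["ball", "outer"])
```

A split that separates the two classes completely removes all impurity (decrease 0.5 for this node), while a split that leaves both children as mixed as the parent gives a decrease of 0.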
Before an attribute is selected at each non-leaf node, m attributes are randomly selected from the M attributes as the classification attribute set of the current node. This is done according to the empirical formula [25] given by Liaw, usually taken as m = int(...), where int is the rounding function. The node is split by the best division of the m attributes, by which a complete decision-tree is built. Each decision-tree is grown without pruning until its leaf nodes are fully grown. (3) The random forest, generated by T decision-trees, is used to classify the test sample. Each tree has a vote in deciding the classification result. Over the output categories of all decision-trees, the category with the most votes is the final classification result. The classification decision model H(x) is shown as Equation (16):
$$H(x) = \arg\max_{Y} \sum_{i=1}^{T} I\big(h_i(X^*, \Theta) = Y\big),$$
where $h_i(X^*, \Theta)$ is a single decision-tree, $Y$ is the output label variable, and $I(\cdot)$ is the indicator function.
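The voting rule of the classification decision model can be sketched as follows. Trees are represented here as plain callables that map a feature vector to a label; this representation, the function name, and the stub trees are assumptions made for illustration only.

```python
from collections import Counter

def forest_predict(trees, x):
    """Majority vote: return the label predicted by the largest number of trees."""
    votes = Counter(tree(x) for tree in trees)
    # most_common(1) returns the (label, count) pair with the highest count.
    return votes.most_common(1)[0][0]

# Hypothetical stub "trees" that ignore their input and always answer one label.
stub_trees = [
    lambda x: "ball fault",
    lambda x: "outer race fault",
    lambda x: "ball fault",
]
prediction = forest_predict(stub_trees, [0.62, 0.45, 0.31, 0.75])
```

With two votes against one, the forest outputs "ball fault". Ties are resolved here in favor of the label seen first, which is a property of this sketch rather than something specified in the text.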
The establishment and testing of the random forest is shown in Figure 1.
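The figure of about 36.8% quoted for the samples left out of a bootstrap draw can be checked numerically. The sketch below compares the closed-form probability with one simulated bootstrap sample; the function names, the sample size, and the random seed are illustrative choices.

```python
import math
import random

def oob_probability(n):
    """Probability that a given sample is never picked in n draws with replacement."""
    return (1.0 - 1.0 / n) ** n

def bootstrap_oob_fraction(n, seed=0):
    """Fraction of n samples left out of one simulated bootstrap draw of size n."""
    rng = random.Random(seed)
    drawn = {rng.randrange(n) for _ in range(n)}  # indices picked at least once
    return 1.0 - len(drawn) / n

exact = oob_probability(10_000)
simulated = bootstrap_oob_fraction(10_000)
limit = math.exp(-1)  # about 0.368
```

Both the exact expression and the simulated fraction lie close to 1/e for a data set of this size, so roughly one third of the samples remain available as out-of-bag data for each tree.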

Random Forest Performance Evaluation
Generalization ability is the ability of a learning model, obtained from the training samples, to predict data outside the training set. The generalization error is an indicator of the generalization ability, and its size is closely related to the learning performance of the machine: the smaller the generalization error, the better the learning performance, and conversely, the larger the error, the worse the performance.
Random forest (RF) uses the OOB (out-of-bag) mode [26] to estimate the generalization error PE* of the classification algorithm, the strength s, and the correlation coefficient ρ. The error rate of each decision-tree is computed on its OOB data, taking each decision-tree as a unit. Then, the average of all decision-tree error rates is taken as an estimate of the generalization error. Breiman shows through experiment that the OOB error is an unbiased estimate. As the number of decision-trees increases, for almost all sequences $\Theta_1, \ldots, \Theta_n$, PE* converges to
$$P_{X,Y}\Big(P_\Theta\big(h(X, \Theta) = Y\big) - \max_{j \neq Y} P_\Theta\big(h(X, \Theta) = j\big) < 0\Big).$$

Multi-Domain Entropy-Random Forest Method for Fault Diagnosis
This paper proposes a fault diagnosis method for inter-shaft bearings, the multi-domain entropy-random forest method, based on the theory of information entropy and random forest. Firstly, the extraction algorithms of SSE, PSE, WESE, and WSFSE are established based on information entropy theory and non-stationary signal processing methods. At the same time, the spatial de-noising method is used to filter and reduce the noise of the collected AE signals. The comparison between an AE signal before and after the preprocessing is shown in Figure 2. Secondly, the four information entropies of the fault signals are extracted and fused to form a fault feature vector set of the inter-shaft bearing, from which the training samples and test samples of the bearing are established. Then, the training samples are used to generate the random forest, with the attributes of the random forest selected during training, which establishes the random forest fault diagnosis model. Finally, the test samples are used to verify the trained random forest model. The multi-domain entropy-random forest model proposed in this paper is shown in Figure 3.

Rolling Bearing Faults Simulation Experiment
The state of the inter-shaft bearing differs at different speeds, and different information is contained in the speed-up and slow-down processes. SSE, PSE, WESE, and WSFSE of the inter-shaft bearing AE signal reflect the fault state from the time, frequency, and time-frequency domains. Integrating these four information entropies allows the state of the inter-shaft bearing to be assessed more comprehensively and accurately. To verify the effectiveness and practicability of the multi-domain entropy-random forest fault diagnosis method, a fault simulation experiment on the cylindrical roller bearing model NU202 is carried out. It simulates four typical states (ball fault, inner race fault, outer race fault, and normal) under multiple rotational speeds and at multiple measuring points, and acquires the AE signals.
The test system is shown in Figure 4; the double-rotor test stand with an inter-shaft bearing can simulate the failure states of the fulcrum bearing and the inter-shaft bearing of an aeroengine. The acoustic emission system of PAC is adopted to collect and analyze the AE signals. Four sensors are installed on the casing and bearing, as shown in Figure 5. For each fault, the rotational speed is varied from 800 rpm to 2000 rpm in steps of 100 rpm. The sampling frequency is set at 1000 kHz, and 52 groups of AE signals are thereby obtained. SSE, PSE, WESE, and WSFSE of the AE signals at all rotational speeds and measurement points are fused, and the fused information entropy points are calculated according to Equation (10).

Extraction of Many Information Entropy Features of AE Signals
From the inter-shaft bearing fault simulation experiment, AE signal samples of the ball fault, inner race fault, outer race fault, and normal states are collected. SSE, PSE, WESE, and WSFSE of the AE signals of the four states are computed in terms of Equations (1) and (8). Each of the curves in Figure 6 is the SSE of one state at multiple rotational speeds. Comparing the SSE curves of the four states shows that the curves cross each other heavily, so the fault data are poorly separable. Similarly, Figures 7-9 show that the PSE, WESE, and WSFSE curves also cross, so none of these entropies is suitable as a fault feature on its own.

Extraction of Multi-Domain Entropy Features of AE Signals
The multi-domain entropy points (MDEP) of the AE signals are computed in terms of Equation (10). Each of the curves in Figure 10 is the MDEP of one state at multiple rotational speeds. Comparing the entropy point curves of the four states shows that the curves cross less and are largely separated. The fault data are well separable and are suitable as fault features.

As illustrated in Figure 11 and Table 2, the MDERF model classifies all the training samples correctly (100% accuracy) and does not show an over-fitting phenomenon.

Generalization Ability Verification
To verify the generalization ability of the MDERF model, 32 groups of testing samples were classified with the trained model. The results are shown in Figure 12 and Table 3. As revealed in Table 3 and Figure 12, the recognition accuracy of the built MDERF model is 93.75%. The model correctly recognizes all testing samples of the inner race fault, outer race fault, and normal states. However, of the eight ball fault samples, two were mistakenly recognized as outer race fault samples. Information entropy measures the degree of disorder of the bearing fault information. From Table 1, the information entropy point vectors of the ball fault and the outer race fault are very similar; that is, the two faults have a similar degree of disorder. This is the reason for the misclassification.
Therefore, the MDERF model is validated to have good generalization ability. The proposed MDERF model provides a new way for inter-shaft bearing fault diagnosis.
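The reported accuracy can be checked with simple arithmetic from the counts given in the text. The sketch below rebuilds a confusion matrix under one assumption that is not stated in this section: that the 32 testing samples are split evenly, eight per state (only the eight ball fault samples are stated explicitly). The overall accuracy does not depend on that assumption; the per-class values for the other three states do.

```python
# Rows are true states, columns are predicted states.
# ASSUMPTION: 8 test samples per state (32 / 4); the text only states
# this explicitly for the ball fault.
LABELS = ["normal", "inner race", "outer race", "ball"]

confusion = [
    [8, 0, 0, 0],  # normal: all recognized
    [0, 8, 0, 0],  # inner race fault: all recognized
    [0, 0, 8, 0],  # outer race fault: all recognized
    [0, 0, 2, 6],  # ball fault: 2 of 8 recognized as outer race fault
]

total = sum(sum(row) for row in confusion)
correct = sum(confusion[i][i] for i in range(len(LABELS)))
accuracy = correct / total

# Fraction of each true state that was recognized correctly.
per_class_recall = {
    LABELS[i]: confusion[i][i] / sum(confusion[i]) for i in range(len(LABELS))
}

print(f"accuracy = {correct}/{total} = {accuracy:.4f}")
print(per_class_recall)
```

With 30 of 32 samples recognized correctly, the accuracy is 30/32 = 0.9375, matching the reported 93.75%. Under the even-split assumption, all errors fall in the ball fault row, where 6 of 8 samples are recognized.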

Method Validation
To verify the effectiveness of the developed MDERF model in inter-shaft bearing fault diagnosis, five fault diagnosis algorithms, namely support vector machine (SVM) [8,27], k-nearest neighbor (KNN) [28], classification and regression tree (CART) [29], gradient boosting decision-tree (GBDT) [30], and MDERF, are trained and tested on the same sample sets. The performance comparison of the five algorithms on the test set data is shown in Table 4. As seen from Table 4, the decision-tree-based diagnosis algorithms, represented by CART, RF, and GBDT, achieve significantly higher diagnostic accuracy than the distance-based discrimination algorithms such as SVM and KNN. Moreover, the RF and GBDT algorithms, which combine many decision trees through ensemble training, are more accurate than the single decision tree represented by CART. The comparison demonstrates that the developed MDERF method is accurate (93.75%) in inter-shaft bearing fault diagnosis.
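One common intuition for why an ensemble of trees can outperform a single tree is majority voting. The sketch below is an idealized illustration only: it assumes a two-class problem and trees whose errors are statistically independent, neither of which holds for the four-state experiment in this paper, and it describes bagging-style voting as in RF rather than the additive boosting used by GBDT. The function name and the per-tree accuracy of 0.7 are hypothetical.

```python
from math import comb


def majority_vote_accuracy(p, n_trees):
    """Probability that a majority of n_trees independent binary voters,
    each correct with probability p, gives the right answer (n_trees odd)."""
    k_min = n_trees // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n_trees, k) * p**k * (1 - p) ** (n_trees - k)
        for k in range(k_min, n_trees + 1)
    )


# Hypothetical per-tree accuracy; not a value measured in the paper.
single_tree = 0.7
for n_trees in (1, 3, 11, 51):
    print(n_trees, round(majority_vote_accuracy(single_tree, n_trees), 4))
```

Under these assumptions the voted accuracy rises with the number of trees whenever each tree is better than chance. Real trees trained on the same data are correlated, so the gain in practice is smaller, which is why random forests inject randomness through bootstrap sampling and feature subsampling.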

Conclusions
The objective of this paper is to propose a novel fault diagnosis method for inter-shaft bearings, i.e., the multi-domain entropy-random forest (MDERF) method, by fusing the multi-domain entropy and random forest methods, in order to improve the precision of fault diagnosis. We discuss the theory and method of MDERF with an emphasis on four information entropies (singular spectrum entropy (SSE), power spectrum entropy (PSE), wavelet energy spectrum entropy (WESE), and wavelet space feature spectrum entropy (WSFSE)) and the random forest method. The developed method is then applied to the fault diagnosis of an inter-shaft bearing. Through the comparison of methods, the developed MDERF method is validated to be effective and accurate. The results from this study demonstrate that: (1) the fault samples comprising four information entropies have good separability and are suitable for expressing fault features; (2) the MDERF model is effective for inter-shaft bearing fault diagnosis based on the AE signal; (3) the MDERF model is validated to have good learning ability and generalization ability, with a diagnostic precision of 93.75% and no over-fitting phenomenon. This study provides a useful new insight for inter-shaft bearing fault diagnosis. In future work, the proposed method will be extended to multiple faults, and an experimental study on multiple faults of inter-shaft bearings will be carried out to verify the effectiveness of the method.