Article

Methodology for Feature Selection of Time Domain Vibration Signals for Assessing the Failure Severity Levels in Gearboxes

by Antonio Pérez-Torres 1,2,†, René-Vinicio Sánchez 2,† and Susana Barceló-Cerdá 1,*,†

1 Department of Applied Statistics and Operational Research, and Quality, Universitat Politècnica de València, 46022 Valencia, Spain
2 Grupo de Investigación y Desarrollo en Tecnologías Industriales (GIDTEC), Universidad Politécnica Salesiana, Cuenca 010102, Ecuador
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Appl. Sci. 2025, 15(11), 5813; https://doi.org/10.3390/app15115813
Submission received: 23 April 2025 / Revised: 18 May 2025 / Accepted: 19 May 2025 / Published: 22 May 2025

Abstract

Early failure detection in gear systems reduces unplanned downtime and associated maintenance costs in rotating machinery. Although numerous indicators can be extracted from vibration signals, selecting the most relevant ones remains challenging. This study proposes a methodology for selecting time-domain features to classify fault severity levels in spur gearboxes. Vibration signals are acquired using six accelerometers and processed to extract 64 statistical condition indicators (CIs). The most informative subset of CIs is identified and selected through a wrapper-based selection approach and artificial intelligence tools. The selected features are then evaluated based on the classification accuracy and the area under the curve (AUC) in receiver operating characteristic (ROC) achieved using Random Forest (RF) and K-nearest neighbours (K-NN) models, with performance exceeding 98%. Additionally, the effect of sensor position and inclination on signal quality and classification performance is analysed using factorial analysis of variance (ANOVA) and multiple comparison tests. The results confirm the robustness of the selected CIs and the minimal influence of sensor placement variability, supporting the practical applicability of the proposed approach in industrial settings. The methodology offers a structured framework for selecting condition indicators in vibration signals, experimentally validated using multiple sensors and fault severity levels, and it is both automated and straightforward to implement.

1. Introduction

Transmission systems, particularly gearboxes, constitute critical subsystems in rotating machinery owing to their capability to deliver high-efficiency power transfer under confined spatial constraints and their ability to sustain elevated mechanical loads. For this reason, condition-based monitoring (CBM) is essential for the early detection of failures, helping to prevent machine operation downtime and unplanned maintenance activities [1,2].
Among the signals used in failure diagnosis, vibration stands out for its ease of acquisition and sensitivity to changes in mechanical conditions. However, its analysis is complex because the signal contains stationary, non-stationary, and resonant components [3,4]. This complexity necessitates advanced signal processing techniques to extract useful information for system diagnostics.
The monitoring process is typically divided into three stages: data acquisition, feature extraction, and failure identification. Data are captured via sensors placed at various locations on the system, providing accurate and reliable signals [5,6]. Feature extraction is then performed by calculating condition indicators (CIs), which are statistical parameters that reflect the system’s health condition [7,8,9].
CIs can be obtained in different domains: in the time domain, as shown in [10]; in the frequency domain, by transforming the time-domain signal with the fast Fourier transform (FFT) [11]; or in the time-frequency domain, via the wavelet transform [4,12]. In the time domain, CIs can be classified as conventional (such as root mean square, mean, variance, kurtosis, and skewness) or non-conventional (including absolute mean value, waveform length, zero crossings, Wilson amplitude, and slope sign changes, among others) [13].
Several studies have shown that an appropriate combination of CIs enhances the sensitivity of diagnostic systems. However, a poor combination may introduce redundancy or contradictions that degrade model performance [14]. Consequently, feature selection is a crucial step in optimising classification models.
Feature selection is critical in developing machine learning models, particularly when dealing with high-dimensional data. The wrapper method is a feature selection technique that evaluates different subsets of variables based on the performance of a machine learning model, often resulting in improved predictive accuracy. This contrasts with filter methods, which assess variables independently of the model [15]. A significant advantage of the wrapper approach lies in its ability to identify combinations of features that may be uninformative in isolation but, when combined, enhance the model’s performance [5,16,17]. In this study, the wrapper method uses the Random Forest (RF) algorithm to rank the importance of the condition indicators (CIs).
To validate the proposed methodology, resampling techniques such as bootstrap and repeated hold-out are used to assess the model’s stability and the variability of the results [18,19]. For classification, machine learning models such as RF and K-NN are used, both of which are widely applied in the literature for failure detection in gearboxes [14,20,21,22].
Although there are studies that use CIs and statistical models for failure diagnosis [10,16], as well as investigations into optimal sensor placement [23,24], no methodology has been reported that combines CIs ranking through a wrapper approach with the evaluation of the effect of sensor position and inclination on diagnostic accuracy.
In this context, the objectives of this study are as follows: (a) to propose a feature selection methodology that establishes a ranking of CIs based on time-domain vibration signals from a spur gearbox; (b) to evaluate the performance of the selected CIs using Random Forest and K-nearest neighbours classification models; and (c) to determine whether the sensor’s position and inclination significantly influence feature extraction and the performance of the classification model.
In summary, the main contributions of this work are:
1. The design of a structured methodology for selecting relevant time-domain condition indicators from vibration signals.
2. The validation of this selection using two widely adopted classifiers in the literature (Random Forest and K-nearest neighbours).
3. The experimental evaluation of the effect of sensor position and inclination on fault classification performance in gear systems.
These contributions aim to support condition-based monitoring strategies that are robust, interpretable, and applicable in real industrial environments.

2. Materials and Methods

2.1. Experimental Bench

This work was developed with the vibration signals obtained from the experimental bench shown in Figure 1. The bench has a 1.5 kW, 1200 rpm, three-phase 220 V motor and a 1.5 kW frequency inverter. The motor is coupled to a single-stage spur gearbox whose gears have Z1 = 32 and Z2 = 48 teeth. The load is simulated by an 8.83 kW electromagnetic brake on the output shaft. The vibration signal was measured, in m/s², by six accelerometers (A1–A6). Accelerometers A1, A4, A5, and A6 were installed in a vertical position defined by the z-axis. A1 and A4 were mounted on the gearbox’s input shaft, whereas A5 and A6 were installed on the output shaft to capture vibration signals associated with both transmission stages. Accelerometer A2 was mounted inclined 45° with respect to the x and z axes, while A3 was mounted inclined 45° with respect to the x, y, and z axes. The vibration signal is fed to a computer, which collects the data using LabVIEW 2024 Q1 and Matlab R2024a software.
Four failure types were simulated on gear Z1: breaking (Figure 2a), cracking (Figure 2b), pitting (Figure 2c), and scuffing (Figure 2d). Each failure mode was evaluated under baseline operating conditions (P1) and across nine progressive severity levels (P2–P10) to characterise the system’s response under varying failure intensities. Motor speed was adjusted to F1 = 8 Hz, F2 = 14 Hz, and F3 = 20 Hz using a frequency inverter, while load conditions were varied to L1 = 0 V, L2 = 10 V, and L3 = 20 V via an electromagnetic braking system. Considering ten severity levels, three motor speeds, three load conditions, and ten experimental repetitions, a database comprising 10 × 3 × 3 × 10 = 900 observations per accelerometer was generated.
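The database size follows directly from the full factorial design. The snippet below is a minimal sketch in Python (the study itself used R) that enumerates the runs for one accelerometer and one failure type. The labels P1–P10, F1–F3, and the load voltages come from the text; the label L2 for the 10 V load, the variable names, and the tuple layout are illustrative assumptions.

```python
from itertools import product

# Factor levels taken from the text; names and layout are illustrative assumptions.
severity_levels = [f"P{i}" for i in range(1, 11)]   # P1 (baseline) to P10
speeds_hz = {"F1": 8, "F2": 14, "F3": 20}           # motor speed settings
loads_v = {"L1": 0, "L2": 10, "L3": 20}             # brake load settings (L2 label assumed)
repetitions = range(1, 11)                          # ten repetitions per condition

# One tuple per acquired signal: (severity, speed label, load label, repetition).
runs = list(product(severity_levels, speeds_hz, loads_v, repetitions))
print(len(runs))  # 10 * 3 * 3 * 10 = 900
```

Each tuple corresponds to one recorded signal, and therefore to one row of the database once the condition indicators have been computed.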
The experimental bench is also equipped with an encoder (E1), a laser encoder (LE1), two acoustic emission sensors (EA), and two microphones (M) for the gearbox, as well as three voltage meters (V) and three electric current clamps (CC) on the power supply lines to the motor.

2.2. Methodology

The methodology proposed is outlined in Figure 3. It comprises the following stages: acquisition of vibration data, extraction and selection of features, classification of failure severity levels, and statistical evaluation of the effects of sensor positioning and inclination.

2.2.1. Data Acquisition

Using six accelerometers (A1–A6) installed at different positions and inclinations on the gearbox, the vibration signal (Figure 4) was obtained. Each accelerometer sampled data at a frequency of 50 kHz, and a 10 s acquisition period produced 500,000 acceleration measurements per sensor.

2.2.2. Feature Extraction

Feature extraction from the time-domain vibration signals was done using 64 condition indicators (CIs) for each accelerometer and failure type. These indicators effectively capture different aspects of vibration dynamics in mechanical systems [13]. The detailed formulas and descriptions of the studied CIs can be found in Appendix A.
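To make the extraction step concrete, the following sketch computes a handful of time-domain CIs from one signal using only the Python standard library. It is an illustration under stated assumptions, not the authors' R implementation: the function name `extract_indicators` and the default `ssc_threshold` are hypothetical, skewness uses the common population definition, and the remaining indicators follow the usual textbook forms. Appendix A remains the reference for the exact formulas.

```python
import math

def extract_indicators(x, ssc_threshold=0.0):
    """Compute a few time-domain condition indicators for one signal.

    `x` is a list of acceleration samples. `ssc_threshold` is an assumed
    parameter; the text does not state the value used for slope sign change.
    """
    n = len(x)
    mean = sum(x) / n
    dev2 = sum((v - mean) ** 2 for v in x)          # sum of squared deviations
    std = math.sqrt(dev2 / n)                       # population standard deviation
    return {
        "mean": mean,
        # Kurtosis as N * sum(dev^4) / (sum(dev^2))^2.
        "kurtosis": n * sum((v - mean) ** 4 for v in x) / dev2 ** 2,
        # Skewness in its common population form (an assumption here).
        "skewness": sum((v - mean) ** 3 for v in x) / (n * std ** 3),
        # Zero crossing: consecutive samples with opposite signs.
        "zero_crossing": sum(1 for a, b in zip(x, x[1:]) if a * b < 0),
        # Slope sign change: local extrema whose slope product reaches the threshold.
        "slope_sign_change": sum(
            1 for p, c, q in zip(x, x[1:], x[2:])
            if (c - p) * (c - q) >= ssc_threshold
        ),
        # Temporal moment of higher order with m = 3.
        "tmho": abs(sum(v ** 3 for v in x) / n),
    }

signal = [1.0, -1.0, 2.0, -2.0, 1.0]
features = extract_indicators(signal)
print(features["zero_crossing"], features["slope_sign_change"])
```

In the study, one such feature vector (with all 64 CIs) would be computed for each of the 900 signals per accelerometer and failure type.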

2.2.3. Feature Selection

Feature selection focused on the data acquired by accelerometers A1, A2, and A3, while the remaining accelerometers (A4, A5, and A6) were used to validate the robustness of the selected subset. The selection process is illustrated in Figure 5.
A wrapper-based approach was employed using the random forest (RF) classifier, as this type of method evaluates the performance of feature subsets within the learning model itself, improving selection quality. Additionally, RF allows the relative importance of each CI to be estimated by calculating its mean influence (MI) on the model’s performance.
The selection process consisted of the following phases:
  • Phase 1:
Preparation of the RF classification model (mathematically detailed in Section 2.2.4) was carried out by optimising its hyperparameters, using the 64 computed condition indicators (CIs). These CIs were iteratively introduced into the classification model, with the training dataset (from accelerometers A1–A3) being resampled using the bootstrap method until maximum accuracy was achieved.
The machine learning wrapper method was applied once the RF classifier’s hyperparameters were optimised. This consisted of testing and evaluating different combinations of CIs to determine which yielded the best performance within the RF model, based on the average importance of each CI. The top 10 most relevant CIs were selected for each accelerometer, and a descending weight from 10 to 1 was assigned according to their ranking.
  • Phase 2:
The weights of each CI were summed by failure type across the three accelerometers to determine the relative influence of each CI on each type of failure.
  • Phase 3:
Finally, the weights accumulated for each failure type were aggregated. The CIs with the highest total weights that appeared across all failure types were selected. This resulted in a final subset of the 7 most relevant CIs used as input for classification models. The monitoring method is ready for implementation with the selected condition indicators and the optimised classification model.
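The weighting and aggregation in the three phases can be sketched as follows. This is a simplified illustration in Python (the study used R): it assumes the per-accelerometer rankings produced by the RF wrapper are already available as ordered lists, and the function names, the nested-dictionary layout, and the alphabetical tie-breaking rule are hypothetical choices that are not described in the text.

```python
from collections import defaultdict

def rank_weights(ranked_cis, top_n=10):
    """Phase 1 weights: best CI gets top_n, the next top_n - 1, down to 1."""
    return {ci: top_n - pos for pos, ci in enumerate(ranked_cis[:top_n])}

def select_cis(rankings, n_select=7, top_n=10):
    """rankings[failure][accelerometer] is a list of CI names, best first."""
    per_failure = {}
    for failure, by_sensor in rankings.items():
        totals = defaultdict(int)
        for ranked in by_sensor.values():
            for ci, weight in rank_weights(ranked, top_n).items():
                totals[ci] += weight                 # Phase 2: sum across accelerometers
        per_failure[failure] = dict(totals)

    overall = defaultdict(int)
    for totals in per_failure.values():
        for ci, weight in totals.items():
            overall[ci] += weight                    # Phase 3: sum across failure types

    # Keep only CIs present in every failure type, highest total weight first.
    common = [ci for ci in overall if all(ci in t for t in per_failure.values())]
    common.sort(key=lambda ci: (-overall[ci], ci))   # ties broken alphabetically (assumption)
    return common[:n_select], dict(overall)

# Toy rankings with made-up orderings, shortened to top_n = 3 for readability.
toy_rankings = {
    "breaking": {"A1": ["mean", "kurtosis", "rms"], "A2": ["mean", "rms", "kurtosis"]},
    "pitting": {"A1": ["kurtosis", "mean", "energy"], "A2": ["mean", "kurtosis", "energy"]},
}
selected, totals = select_cis(toy_rankings, n_select=2, top_n=3)
print(selected, totals)
```

In this toy case "rms" and "energy" are discarded because each appears in only one failure type, mirroring the requirement that selected CIs be present across all failure types.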

2.2.4. Classification Models

In this study, the performance of the selected condition indicators (CIs) was evaluated using two classification models: Random Forest (RF) and K-nearest Neighbours (K-NN), both widely recognised and proven to be effective in fault diagnosis tasks [4,14,20,21,22]. RF was used as the primary model, and K-NN was applied as a comparative technique. These algorithms were selected due to their robustness, interpretability, simplicity, low computational demand, reduced sensitivity to noise, and ease of implementation—qualities that make them particularly suitable for real-time industrial applications [25,26,27]. Their use is therefore justified in scenarios where timely and reliable fault detection is prioritised over theoretical optimality [28,29].
Although more complex models, such as deep neural networks or advanced ensemble architectures, are available, they typically require larger datasets, greater computational capacity, and often lack interpretability. In contrast, RF and K-NN offer competitive performance with minimal parameter tuning and allow better traceability of each CI’s contribution to the classification outcome, which is essential for practical industrial monitoring applications.
Moreover, these classifiers enable a direct validation of the selected subset of CIs, supporting the analysis of their relationship with varying fault severity levels.
  • Random forest (RF): RF is a classification model represented by Equation (1), composed of multiple tree-based classifiers. For each $i$th tree, an independent random vector $V_i$ is generated. Each tree is trained on a subset of the data and votes for the most popular category in the input vector $x$. The classification error, described by Equation (2), depends on the margin $mg$, which measures the average number of votes received for the correct class, and on the probability distribution $P_{X,Y}$ in the feature-label space [30].
    $$RF = \{h(x, V_i)\}_{i=1}^{N}, \quad i = 1, 2, 3, \ldots, N \qquad (1)$$
    $$E = P_{X,Y}\big(mg(X, Y) < 0\big) \qquad (2)$$
  • k-nearest neighbours (K-NN): K-NN is a non-parametric algorithm used for classification tasks in which new instances are categorised based on their proximity to existing samples within the feature space. The method assigns weights according to distance and infers the class of an unknown observation through a majority voting mechanism [31]. K-NN requires only the selection of the parameter k to define the number of neighbours and the appropriate distance metric [32].
    The K-NN classification algorithm works as follows:
    Given a training set $D = \{(x_i, y_i)\}_{i=1}^{N}$, where $x_i$ is a training vector and $y_i$ its class label, and a test instance $(x, y)$, the predicted class $y$ is determined using Equation (3):
    $$y = \arg\max_{\psi} \sum_{(x_i, y_i) \in D} w_i \, \delta(\psi, y_i) \qquad (3)$$
    Here, $\psi$ is a candidate class label, $y_i$ is the label of the $i$th nearest neighbour, $\delta(\cdot)$ is the indicator function returning 1 if the labels match and 0 otherwise, and $w_i$, as defined in Equation (4), represents a weighting coefficient derived from the distance $d(x, x_i)$ between the query instance and its $i$th nearest neighbour.
    $$w_i = \frac{1}{d(x, x_i)^2} \qquad (4)$$
    The default distance metric is Euclidean, although alternatives such as Mahalanobis, Manhattan, and Minkowski distances can also be used [33].
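The distance-weighted voting rule of Equations (3) and (4) can be illustrated with a short Python sketch. It assumes the Euclidean metric and restricts the sum to the k nearest neighbours; the function name `knn_predict` and the small constant `eps`, added to avoid division by zero when a query coincides with a training point, are illustrative assumptions rather than part of the original formulation.

```python
import math
from collections import defaultdict

def knn_predict(train, query, k=5, eps=1e-12):
    """Distance-weighted k-NN vote.

    `train` is a list of (feature_vector, label) pairs. Each of the k nearest
    neighbours votes for its label with weight 1 / d^2 (eps is an assumed guard).
    """
    neighbours = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    votes = defaultdict(float)
    for x_i, y_i in neighbours:
        d = math.dist(x_i, query)
        votes[y_i] += 1.0 / (d * d + eps)
    return max(votes, key=votes.get)

# One close neighbour of class P1 outweighs two distant neighbours of class P2.
train = [((0.0, 0.0), "P1"), ((3.0, 0.0), "P2"), ((0.0, 3.0), "P2")]
print(knn_predict(train, (0.5, 0.0), k=3))
```

The toy example shows why the weighting matters: an unweighted majority vote over the same three neighbours would return P2, whereas the inverse-squared-distance weights favour the much closer P1 sample.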
A repeated hold-out resampling process was applied to evaluate model performance. During each iteration, the databases were partitioned into 70% for model training and 30% for model testing. The method enables the estimation of model variability across multiple subsets and yields robust performance metrics such as the accuracy rate and the area under the curve (AUC) associated with the receiver operating characteristic (ROC) analysis [19,34]. Two vectors of 1000 observations were obtained for each accelerometer, failure type, and classification model: one for accuracy and one for AUC.
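A minimal Python sketch of the repeated hold-out loop is given below. It only collects accuracy (AUC is omitted), uses a plain random 70/30 split because the text does not say whether the split was stratified, and relies on a hypothetical 1-nearest-neighbour stand-in classifier (`one_nn`) so that the example stays self-contained. All names are illustrative.

```python
import math
import random

def one_nn(train, queries):
    """Hypothetical stand-in classifier: label of the nearest training sample."""
    return [min(train, key=lambda s: math.dist(s[0], q))[1] for q in queries]

def repeated_holdout(samples, fit_predict, n_repeats=1000, train_frac=0.7, seed=0):
    """Return one accuracy value per random train/test split."""
    rng = random.Random(seed)
    n_train = round(train_frac * len(samples))
    accuracies = []
    for _ in range(n_repeats):
        shuffled = list(samples)
        rng.shuffle(shuffled)                        # new random partition each repeat
        train, test = shuffled[:n_train], shuffled[n_train:]
        predicted = fit_predict(train, [x for x, _ in test])
        hits = sum(p == y for p, (_, y) in zip(predicted, test))
        accuracies.append(hits / len(test))
    return accuracies

# Toy data: two well-separated classes, 10 samples each.
toy = [((0.1 * i, 0.0), "P1") for i in range(10)] + \
      [((10 + 0.1 * i, 0.0), "P2") for i in range(10)]
accuracy_vector = repeated_holdout(toy, one_nn, n_repeats=50)
print(len(accuracy_vector), min(accuracy_vector))
```

With `n_repeats=1000` the function returns an accuracy vector of the same length as the vectors described above, which can then be passed to the variance analysis.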

2.2.5. Analysis of the Effect of Sensor Position and Inclination on the Vibration Signal

The accuracy vectors obtained in Section 2.2.4 were analysed to assess the effect of sensor placement on model performance. Factorial analysis of variance (ANOVA) and post-hoc Tukey testing were applied to determine if sensor position and inclination, failure type, and classification model statistically impacted the results.
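As a hedged sketch, a factorial model with the three factors and their two-way interactions can be written as shown below. The symbols are illustrative, and the exact model specification (for example, whether a three-way interaction term was included) is an assumption, since it is not stated in the text.

```latex
y_{ijkl} = \mu + \alpha_i + \beta_j + \gamma_k
         + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk}
         + \varepsilon_{ijkl}
```

Here $y_{ijkl}$ is the $l$th accuracy value obtained for accelerometer $i$, failure type $j$, and classifier $k$; $\mu$ is the overall mean; $\alpha_i$, $\beta_j$, and $\gamma_k$ are the main effects; the bracketed terms are the two-way interactions; and $\varepsilon_{ijkl}$ is the random error.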

2.2.6. Computational Tools

All calculations, modelling, and statistical analyses were conducted using the R programming language within the RStudio 2024.04.0 integrated development environment [35]. Dedicated libraries were used for signal processing, statistical modelling, feature selection, and variance analysis.

3. Results and Discussion

The first objective of this work was to rank the CIs of the vibration signal in the time domain using the proposed methodology. In order to fulfil this objective, the process detailed in Section 2.2.3 was carried out.
By following the procedure detailed in phase 1, the λ-value mtry = 11 was determined for the RF classifier. The top 10 CIs were selected because the variability in classification accuracy decreased for the four types of failure, as detailed in Figure 6. Subsequently, after assigning the weighting, Table 1 was obtained for breaking (B), cracking (C), pitting (P), and scuffing (S) failures.
In order to determine the CI weighting by failure type, phase 2 was carried out, and the results are presented in Table 2.
The total weighting and count per failure type for the CIs, as described in phase 3, are detailed in Table 3. The top 7 CIs were selected because they have the highest weighting and are present in all failure types. This procedure reduces the dimensionality of the DBs.
The CIs selected in the ranking were Temporal moments higher order (TMHO), Mean, Skewness, Zero crossing, Slope sign change (SSC), Energy operator and Kurtosis. The calculation equations of the CIs are detailed in Table 4. With the selected CIs, new DBs with 900 observations and these 7 CIs were constructed. These DBs were used to compute the accuracy and AUC metrics for the RF and K-NN algorithms.
A complementary statistical analysis was incorporated based on the Bhattacharyya distance [36] to strengthen the validity of the final set of selected features, quantifying the separation of condition indicators (CIs) across class distributions. This metric was calculated for all 64 CIs, considering comparisons between the baseline class (P1) and the various fault severity levels.
The results showed that the CIs selected through the proposed wrapper-based approach exhibited higher Bhattacharyya distance, confirming their discriminative capability. In contrast, multiple CIs with consistently low distances were identified and not selected during the feature selection phase, reinforcing the consistency between the statistical analysis and the multivariable classification performance. This comparison is detailed in Table 5, Table 6, Table 7 and Table 8 and visually represented in Figure 7.
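For a single CI, the Bhattacharyya distance between two classes has a closed form when each class is approximated by a univariate Gaussian. The Python sketch below uses that closed form; the Gaussian approximation, the function name, and the toy values are assumptions made for illustration, since the text does not state how the distance was estimated.

```python
import math
import statistics

def bhattacharyya_gaussian(a, b):
    """Bhattacharyya distance between two samples, each modelled as a 1-D Gaussian.

    D_B = 1/4 * (mu_a - mu_b)^2 / (var_a + var_b)
        + 1/2 * ln((var_a + var_b) / (2 * sigma_a * sigma_b))
    """
    mu_a, mu_b = statistics.fmean(a), statistics.fmean(b)
    var_a, var_b = statistics.pvariance(a), statistics.pvariance(b)
    term_mean = 0.25 * (mu_a - mu_b) ** 2 / (var_a + var_b)
    term_var = 0.5 * math.log((var_a + var_b) / (2.0 * math.sqrt(var_a * var_b)))
    return term_mean + term_var

# Toy values of one CI for the baseline class (P1) and one fault severity level.
baseline = [0.0, 1.0, 2.0]
faulty = [10.0, 11.0, 12.0]
print(bhattacharyya_gaussian(baseline, faulty))
```

A CI whose class distributions overlap heavily yields a distance close to zero, whereas well-separated distributions yield a large distance, which is the property used to cross-check the wrapper-based selection.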
Additionally, the methodological risk, widely discussed by Rencher [37], that some features with low individual informativeness could still provide value when combined with others, was taken into account. This risk was mitigated by employing a multivariable wrapper approach using Random Forest, which evaluates the performance of entire subsets of features and captures non-linear interactions.
In the work by Nayana et al. [13], the vibration signal was analysed using 6 conventional and 6 non-conventional CIs to diagnose bearing failures. In Sánchez et al. [10], feature extraction started from 30 CIs, and a ranking of 10 CIs was then obtained using different filtering methods for different gear DBs. In Patel et al. [38], 15 CIs were used to detect bearing failures. In contrast to these works, we propose extracting features from the vibration signal using 64 CIs in the time domain. This expands the range of CIs that can be included in the feature selection process for later use in the classification models.
The second objective was to measure the performance of the 7 selected CIs in the RF and K-NN classification models. First, the λ-values of the classification models were determined and adjusted, giving mtry = 3 for RF and k = 5 for K-NN. Subsequently, the accuracy and AUC in classifying the fault severity level were calculated. The results of the accuracy and AUC by accelerometer, failure and classifier are detailed in Table 9 and Table 10, respectively.
In Sánchez et al. [10], a ranking of 10 CIs was obtained using the filtering method for different gear DBs in the time domain. When this ranking was used in the RF and K-NN classification models, the accuracy in some DBs was less than 85%. In the work by Patel et al. [38], 15 CIs were used to detect bearing failures, resulting in accuracy values higher than 95% for the RF classification model. In this research, using the wrapper method for the ranking of CIs together with the RF and K-NN classification models, the classification accuracy and AUC values exceed 98%, as detailed in Table 9 and Table 10, respectively, with only 7 CIs for all failure types, accelerometers, and both classification models. This reduces the dimensionality of the DB, increases the efficiency of the classification process, and reduces the computing time involved in processing the information.
The third objective of this work was to determine whether sensor position and inclination influence the extraction of vibration signal features for a classification model. To meet this objective, factorial ANOVA and Tukey’s post-hoc comparisons were conducted to assess differences in classification accuracy. The factors considered in the analysis were accelerometer position, failure, and the classification model employed.
The accelerometers A1, A4, A5 and A6 results were analysed to determine the influence of the sensor position. The ANOVA results indicated statistically significant differences (p-value < 0.001) in the average classification accuracy across the four accelerometers and for the four types of failure. In contrast, no significant difference was observed between the classification models (p-value = 1), while a significant interaction (p-value < 0.001) was found between the accelerometer:classifier (Figure 8a), failure:classifier (Figure 8b), and accelerometer:failure (Figure 8c) factors. A summary of the ANOVA results is presented in Table 11.
Regarding the influence of sensor inclination, the results obtained for accelerometers A1, A2 and A3 were analysed. The ANOVA test revealed statistically significant differences (p-value < 0.001) in the average classification accuracy across the three accelerometers, the four types of failure, and the two classification models. In addition, significant interaction effects (p-value < 0.001) were observed between the accelerometer:classifier (Figure 9a), failure:classifier (Figure 9b), and accelerometer:failure (Figure 9c) factors. A summary of the ANOVA results is presented in Table 12.
Figure 10 and Figure 11 illustrate the results of Tukey’s post-hoc analyses for sensor position and inclination, respectively, highlighting the significant differences and interaction effects among the evaluated factors. As can be seen in these graphs, the differences between pairs of factor levels are significant in nearly all cases. However, these differences are of little practical relevance as they are less than 1%, as detailed in Table 9 for the four failure types. This consideration is particularly relevant in real-world applications, where surrounding components and system attachments often limit the physical space available for sensor placement on gearboxes.
Pichika et al. [23] and Vanrak et al. [24] studied and determined a sensor’s optimum position on a gearbox, moving it along the x, y and z axes, to extract the best information from the vibration signal. This position, although optimal, is often not accessible due to the physical layout or mounting of the gearbox. In this work, it has been determined that sensor position and inclination produce statistically significant differences in classification accuracy. However, the results indicate that these differences are of no practical importance.

4. Conclusions

In this research, a methodology was proposed for selecting time-domain vibration signal features for a spur gearbox, based on the wrapper method with RF as the classifier. The study employed a complex and realistic dataset, generated under varying motor speeds and load conditions, to better simulate real-world operating scenarios. With the proposed methodology, 7 CIs (Temporal moments higher order, Mean, Skewness, Zero crossing, Slope sign change, Energy operator and Kurtosis) were obtained, which are predominant in the four failure types (breaking, cracking, pitting and scuffing) in both the feature extraction process and the performance of the classification models.
By using the 7 CIs obtained in the feature selection stage in the RF and K-NN classification models, the average values of the classification accuracy of the failure severity level as well as the AUC values exceeded 98% across all six accelerometers and four failure types, indicating that the selected CIs are highly effective for analysing vibration signals and accurately determining the severity level of various failure types.
The comparison of classification accuracy values using factorial ANOVA and Tukey’s post-hoc tests revealed statistically significant differences associated with variations in sensor position and inclination. These differences, being less than 1%, are of no practical importance, so when a sensor is installed in a gearbox by moving it either along the drive shaft or the driven shaft and with some degree of inclination, the information obtained from the vibration signal will be minimally affected. The developed methodology is ideal for early detection and assessment of failure severity levels in gearboxes.

5. Future Works

Future work will develop a multivariate statistical process control system for monitoring gearbox condition using vibration signals. The proposed methodology will also be used to analyse the signals acquired through the gearbox’s acoustic emission, voltage, noise, and electric current sensors.

Author Contributions

Conceptualization, A.P.-T., R.-V.S. and S.B.-C.; methodology, A.P.-T., R.-V.S. and S.B.-C.; software, A.P.-T.; validation, R.-V.S. and S.B.-C.; formal analysis, A.P.-T.; investigation, A.P.-T., R.-V.S. and S.B.-C.; resources, R.-V.S. and S.B.-C.; data curation, A.P.-T.; writing—original draft preparation, A.P.-T.; writing—review and editing, R.-V.S. and S.B.-C.; visualization, A.P.-T.; supervision, S.B.-C.; project administration, R.-V.S.; funding acquisition, R.-V.S. and S.B.-C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Universidad Politécnica Salesiana and Universitat Politècnica de València.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Grupo de Investigación y Desarrollo en Tecnologías Industriales (GIDTEC), Universidad Politécnica Salesiana, Cuenca, Ecuador; jperezt@ups.edu.ec.

Acknowledgments

The authors thank Universitat Politècnica de València and Universidad Politécnica Salesiana for funding the research project “Evaluación de la severidad de fallos en engranajes rectos y helicoidales mediante señales de vibración, corriente y emisión acústica” of the “Grupo de Investigación y Desarrollo en Tecnologías Industriales (GIDTEC)”.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:
CIs | Condition indicators
RF | Random forest
K-NN | K-nearest neighbours
AUC | Area under the curve
ROC | Receiver operating characteristic
ANOVA | Analysis of variance
CBM | Condition-based monitoring
FFT | Fast Fourier transform
DB | Database
TMHO | Temporal moment higher order
SSC | Slope sign change

Appendix A

The formulas for the 64 time domain condition indicators are detailed below.
Table A1. Formulas for condition indicators in the time domain.
N.Condition IndicatorFormula
1Mean T 1 = 1 N i = 1 N x i
2Variance T 2 = 1 N i = 1 N ( x i T 1 ) 2
3Standar desviation T 3 = 1 N i = 1 N ( x i T 1 ) 2
4Root mean square (RMS) T 4 = 1 N i = 1 N ( x i ) 2
5Max value T 5 = m a x ( x )
6Kurtosis T 6 = N i = 1 N ( x i T 1 ) 4 [ i = 1 N ( x i T 1 ) 2 ] 2
7Skewness T 7 = N i = 1 N ( x i T 1 ) 3 T 3 3
8Energy operator T 8 = N 2 i = 1 N ( Δ y i Δ y ¯ ) 4 [ i = 1 N ( Δ y i Δ y ¯ ) 2 ] 2
9Absolute mean T 9 = 1 N i = 1 N | x i |
10CPT1 T 10 = i = 1 N l o g ( | x i | + 1 ) N l o g ( T 3 + 1 )
11CPT2 T 11 = i = 1 N e x p ( x i ) N e x p ( T 3 )
12CPT3 T 12 = i = 1 N | x i | N T 2
13Fifth statistic moment T 13 = ( x i T 1 ) 5
14Shape factor T 14 = R M S 1 N i = 1 N | x i |
15Impulse factor T 15 = m a x ( x i ) 1 N i = 1 N | x i |
16Clearance factor T 16 = m a x ( | x i | ) 1 N i = 1 N ( x i ) 2
17Delta RMS T 17 = R M S i + 1 R M S i
18Root sum of squares T 18 = l = 1 n | x i | 2
19Energy T 19 = l = 1 n | x i | 2
20Latitude factor T 20 = m a x ( | x i | ) ( 1 N i = 1 N | x i | ) 2
21Weighted SSR absolute T 21 = 1 N ( i = 1 N | x i | ) 2
22Mean square error T 22 = 1 N i = 1 N ( x i T 1 ) 2
23Normalized normal negative likelihoog T 23 = l n T 3 R M S
24Mean deviation T 24 = 1 N i = 1 N x i 1 N i = 1 N ( x i T 1 ) 2
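| 25 | Standard deviation impulse factor | $T_{25} = \dfrac{\mathrm{std}(x)}{\mathrm{mean}(\lvert x \rvert)}$ |
| 26 | Log-Log ratio | $T_{26} = \dfrac{1}{\log(\mathrm{std}(x))} \sum_{i=1}^{N} \log(\lvert x_i \rvert + 1)$ |
| 27 | Kth central moment | $T_{27} = E\left[(x - E[x])^{k}\right]$, where $E[x]$ is the expected value of $x$ and $k$ is set to 3 |
| 28 | Histogram lower bound | $T_{28} = \min(x) - \dfrac{1}{2}\,\dfrac{\max(x) - \min(x)}{N - 1}$ |
| 29 | Histogram upper bound | $T_{29} = \max(x) + \dfrac{1}{2}\,\dfrac{\max(x) - \min(x)}{N - 1}$ |
| 30 | Normalized moment | $T_{30} = \dfrac{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mathrm{mean}(x))^{5}}{\left(\sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mathrm{mean}(x))^{2}}\right)^{5}}$ |
| 31 | Shannon entropy | $T_{31} = \sum_{i=1}^{N} \log(x_i^{2})$ |
| 32 | Log energy entropy | $T_{32} = \sum_{i=1}^{N} \log(x_i^{2})$, where $\log(0) = 0$ |
| 33 | Threshold entropy | $T_{33} = \mathrm{Threshold} = \begin{cases} 1, & \text{if } \lvert x_i \rvert > p \\ 0, & \text{elsewhere} \end{cases}$, where $p$ is set to 0.2 |
| 34 | Sure entropy | $T_{34} = n - \#\{i \text{ such that } \lvert x_i \rvert \le p\} + \sum_{i} \min(x_i^{2}, p^{2})$, where $p$ is set to 0.2 |
| 35 | Norm entropy | $T_{35} = \sum_{i=1}^{N} \lvert x_i \rvert^{p}$, where $p$ is set to 0.2 |
| 36 | Peak to peak | $T_{36} = \mathrm{Max} - \mathrm{Min}$ |
| 37 | Minimum value | $T_{37} = \mathrm{min} = \min(x_i)$ |
| 38 | Peak value | $T_{38} = \dfrac{1}{2}\left[\mathrm{Max}(x_i) - \mathrm{Min}(x_i)\right]$ |
| 39 | 6th statistical moment | $T_{39} = \sum (x_i - T_1)^{6}$ |
| 40 | Crest factor | $T_{40} = \dfrac{\mathrm{max}}{\mathrm{RMS}}$ |
| 41 | Integrated signal | $T_{41} = \sum_{i=1}^{N} \lvert x_i \rvert$ |
| 42 | Square root amplitude value | $T_{42} = \left(\dfrac{\sum_{i=1}^{N} \sqrt{\lvert x_i \rvert}}{N}\right)^{2}$ |
| 43 | Zero crossing | $T_{43} = \sum_{i=1}^{N} \mathrm{step}\left[\mathrm{sign}(-x_i x_{i+1})\right]$, with $\mathrm{sign}(x) = \begin{cases} 1, & \text{if } x > 0 \\ 0, & \text{if } x = 0 \\ -1, & \text{if } x < 0 \end{cases}$ and $\mathrm{step}(x) = \begin{cases} 1, & \text{if } x > 0 \\ \tfrac{1}{2}, & \text{if } x = 0 \\ 0, & \text{if } x < 0 \end{cases}$ |
| 44 | Wavelength | $T_{44} = \sum_{i=1}^{N} \lvert x_{i+1} - x_i \rvert$ |
| 45 | Wilson amplitude | $T_{45} = \sum_{i=1}^{N} f(\lvert x_i - x_{i+1} \rvert - T)$, where the threshold $T$ is set to 0.2 and $f(x) = \begin{cases} 1, & \text{if } x \ge 0 \\ 0, & \text{if } x < 0 \end{cases}$ |
| 46 | Slope sign change | $T_{46} = \sum_{i=2}^{N} f\left[(x_i - x_{i-1})(x_i - x_{i+1})\right]$, with $f(x) = \begin{cases} 1, & \text{if } x \ge \text{threshold} \\ 0, & \text{otherwise} \end{cases}$ |
| 47 | Log detector | $T_{47} = e^{\frac{1}{N}\sum_{i=1}^{N} \log \lvert x_i \rvert}$ |
| 48 | Modified mean absolute value 1 | $T_{48} = \dfrac{1}{N}\sum_{i=1}^{N} W_i \lvert x_i \rvert$, with $W_i = \begin{cases} 1, & \text{if } 0.25N \le n \le 0.75N \\ 0.5, & \text{otherwise} \end{cases}$ |
| 49 | Modified mean absolute value 2 | $T_{49} = \dfrac{1}{N}\sum_{i=1}^{N} W_i \lvert x_i \rvert$, with $W_i = \begin{cases} 1, & \text{if } 0.25N \le n \le 0.75N \\ \dfrac{4n}{N}, & \text{if } n < 0.25N \\ \dfrac{4(n - N)}{N}, & \text{if } n > 0.75N \end{cases}$ |
| 50 | Mean absolute value slope | $T_{50} = \mathrm{MAV}_{i+1} - \mathrm{MAV}_{i}$ |
| 51 | Mean of amplitude | $T_{51} = \sum_{i=1}^{N} (x_{i-1} - x_i)$ |
| 52 | Log RMS | $T_{52} = \log(X_{rms})$ |
| 53 | Conduction velocity of signal | $T_{53} = \left(\dfrac{1}{N - 1}\sum_{i=1}^{N} x_i^{2}\right)$ |
| 54 | Average amplitude change (AAC) | $T_{54} = \dfrac{1}{N}\sum_{i=1}^{N-1} x_i^{2}$ |
| 55 | V-Order 3 | $T_{55} = \sqrt[3]{\dfrac{1}{N}\sum_{i=1}^{N} x_i^{3}}$ |
| 56 | Maximum fractal length | $T_{56} = \log_{10}\sqrt{\sum_{i=1}^{N-1} (x_i - x_{i+1})^{2}}$ |
| 57 | Difference absolute standard deviation | $T_{57} = \sqrt{\dfrac{1}{N - 1}\sum_{i=1}^{N-1} (x_{i+1} - x_i)^{2}}$ |
| 58 | Myopulse percentage rate | $T_{58} = \dfrac{1}{N}\sum_{i=1}^{N} f(x_i)$, with $f(x) = \begin{cases} 1, & \text{if } x \ge \text{threshold} \\ 0, & \text{otherwise} \end{cases}$, where the threshold is set to 0.2 |
| 59 | Temporal moments higher order | $T_{59} = \left\lvert \dfrac{1}{N}\sum_{i=1}^{N} x_i^{m} \right\rvert$, where $m$ is set to 3 by default |
| 60 | Difference absolute variance value | $T_{60} = \dfrac{1}{N - 2}\sum_{i=1}^{N-1} (x_{i+1} - x_i)^{2}$ |
| 61 | Margin index | $T_{61} = \left(\dfrac{\max(x)}{\frac{1}{N}\sum_{i=1}^{N} x_i}\right)^{2}$ |
| 62 | Waveform indicators | $T_{62} = \dfrac{\mathrm{VO2}}{\dfrac{\sum_{i=1}^{N} x_i}{N}}$ |
| 63 | Weibull negative log-likelihood | $T_{63} = -\sum_{i=1}^{N} \log\left[\left(\dfrac{SF}{\eta}\right)^{SF} \lvert x_i \rvert^{SF - 1} \exp\left(-\left(\dfrac{\lvert x_i \rvert}{\eta}\right)^{SF}\right)\right]$, where $\eta$ is the scale factor and $SF$ the shape factor |
| 64 | Pulse indicators | $T_{64} = \dfrac{\mathrm{Max}(x_i)}{\frac{1}{N}\sum_{i=1}^{N} \lvert x_i \rvert}$ |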
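
As a rough illustration of how some of the difference-based indicators in this table can be evaluated, the following Python sketch computes zero crossing (T43), wavelength (T44), Wilson amplitude (T45), slope sign change (T46) and the difference absolute standard deviation (T57) for a short list of samples. It is a sketch under stated assumptions, not the authors' implementation: all function names are hypothetical, sums over consecutive samples stop at the last available pair, the sign function is taken as the usual signum (-1, 0, 1) combined with a step that equals 1/2 at zero, and the slope sign change threshold, which the table does not quantify, is passed explicitly.

```python
import math

def signum(v):
    """Usual sign function: -1, 0 or 1."""
    return (v > 0) - (v < 0)

def half_step(v):
    """Step function that equals 1/2 at zero."""
    if v > 0:
        return 1.0
    return 0.5 if v == 0 else 0.0

def zero_crossing(x):
    """T43: count sign changes between consecutive samples."""
    return sum(half_step(signum(-a * b)) for a, b in zip(x, x[1:]))

def wavelength(x):
    """T44: cumulative absolute difference between consecutive samples."""
    return sum(abs(b - a) for a, b in zip(x, x[1:]))

def wilson_amplitude(x, threshold=0.2):
    """T45: number of consecutive differences that reach the threshold."""
    return sum(1 for a, b in zip(x, x[1:]) if abs(a - b) - threshold >= 0)

def slope_sign_change(x, threshold):
    """T46: count interior samples whose slope product reaches the threshold."""
    return sum(
        1
        for prev, cur, nxt in zip(x, x[1:], x[2:])
        if (cur - prev) * (cur - nxt) >= threshold
    )

def dasdv(x):
    """T57: root of the summed squared differences divided by N - 1."""
    n = len(x)
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(x, x[1:])) / (n - 1))

# Toy alternating signal (illustrative values, not measured data).
signal = [0.5, -0.5, 0.5, -0.5, 0.5]
print(zero_crossing(signal), wavelength(signal), wilson_amplitude(signal),
      slope_sign_change(signal, threshold=0.2), dasdv(signal))
```

Because every function takes a plain list of samples, the same helpers could be applied window by window to each accelerometer channel before feature ranking.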

References

  1. Dong, E.; Zhang, E.; Zhan, X.; Cheng, Z. A novel dynamic predictive maintenance framework for gearboxes utilizing nonlinear Wiener process. Meas. Sci. Technol. 2024, 35, 126210.
  2. Goswami, P.; Rai, R.N. A systematic review on failure modes and proposed methodology to artificially seed faults for promoting PHM studies in laboratory environment for an industrial gearbox. Eng. Fail. Anal. 2023, 146, 107076.
  3. Cerrada, M.; Zurita, G.; Cabrera, D.; Sánchez, R.V.; Artés, M.; Li, C. Fault diagnosis in spur gears based on genetic algorithm and random forest. Mech. Syst. Signal Process. 2016, 70, 87–103.
  4. Pérez-Torres, A.; Sánchez, R.V.; Barceló-Cerdá, S. Selection of the level of vibration signal decomposition and mother wavelets to determine the level of failure severity in spur gearboxes. Qual. Reliab. Eng. Int. 2024, 40, 3439–3451.
  5. Sendlbeck, S.; Fimpel, A.; Siewerin, B.; Otto, M.; Stahl, K. Condition monitoring of slow-speed gear wear using a transmission error-based approach with automated feature selection. Int. J. Progn. Health Manag. 2021, 12.
  6. Seo, M.K.; Yun, W.Y. Gearbox Condition Monitoring and Diagnosis of Unlabeled Vibration Signals Using a Supervised Learning Classifier. Machines 2024, 12, 127.
  7. Sharma, V.; Parey, A. A review of gear fault diagnosis using various condition indicators. Procedia Eng. 2016, 144, 253–263.
  8. Hızarcı, B.; Ümütlü, R.C.; Kıral, Z.; Öztürk, H. Fault severity detection of a worm gearbox based on several feature extraction methods through a developed condition monitoring system. SN Appl. Sci. 2021, 3, 129.
  9. Salameh, J.P.; Cauet, S.; Etien, E.; Sakout, A.; Rambault, L. Gearbox condition monitoring in wind turbines: A review. Mech. Syst. Signal Process. 2018, 111, 251–264.
  10. Sanchez, R.V.; Lucero, P.; Vásquez, R.E.; Cerrada, M.; Macancela, J.C.; Cabrera, D. Feature ranking for multi-fault diagnosis of rotating machinery by using random forest and KNN. J. Intell. Fuzzy Syst. 2018, 34, 3463–3473.
  11. Wang, J.; Li, S.; Xin, Y.; An, Z. Gear fault intelligent diagnosis based on frequency-domain feature extraction. J. Vib. Eng. Technol. 2019, 7, 159–166.
  12. Vakharia, V.; Gupta, V.K.; Kankar, P.K. A comparison of feature ranking techniques for fault diagnosis of ball bearing. Soft Comput. 2016, 20, 1601–1619.
  13. Nayana, B.; Geethanjali, P. Analysis of statistical time-domain features effectiveness in identification of bearing faults from vibration signal. IEEE Sens. J. 2017, 17, 5618–5625.
  14. Lei, Y.; Zuo, M.J. Gear crack level identification based on weighted K nearest neighbor classification algorithm. Mech. Syst. Signal Process. 2009, 23, 1535–1547.
  15. Patel, D.; Saxena, A.; Wang, J. A Machine Learning-Based Wrapper Method for Feature Selection. Int. J. Data Warehous. Min. (IJDWM) 2024, 20, 1–33.
  16. Liu, Z.; Zhao, X.; Zuo, M.J.; Xu, H. Feature selection for fault level diagnosis of planetary gearboxes. Adv. Data Anal. Classif. 2014, 8, 377–401.
  17. Maseno, E.M.; Wang, Z. Hybrid wrapper feature selection method based on genetic algorithm and extreme learning machine for intrusion detection. J. Big Data 2024, 11, 24.
  18. Efron, B.; Tibshirani, R.J. An Introduction to the Bootstrap; CRC Press: Boca Raton, FL, USA, 1994.
  19. Raschka, S. Model evaluation, model selection, and algorithm selection in machine learning. arXiv 2018, arXiv:1811.12808.
  20. Caesarendra, W.; Widodo, A.; Yang, B.S. Combination of probability approach and support vector machine towards machine health prognostics. Probabilistic Eng. Mech. 2011, 26, 165–173.
  21. Shandhoosh, V.; Venkatesh S, N.; Chakrapani, G.; Sugumaran, V.; Ramteke, S.M.; Marian, M. Intelligent fault diagnosis for tribo-mechanical systems by machine learning: Multi-feature extraction and ensemble voting methods. Knowl.-Based Syst. 2024, 305, 112694.
  22. Guo, K.; Wan, X.; Liu, L.; Gao, Z.; Yang, M. Fault diagnosis of intelligent production line based on digital twin and improved random forest. Appl. Sci. 2021, 11, 7733.
  23. Pichika, S.N.; Yadav, R.; Rajasekharan, S.G.; Praveen, H.M.; Inturi, V. Optimal sensor placement for identifying multi-component failures in a wind turbine gearbox using integrated condition monitoring scheme. Appl. Acoust. 2022, 187, 108505.
  24. Vanraj; Dhami, S.; Pabla, B. Optimization of sound sensor placement for condition monitoring of fixed-axis gearbox. Cogent Eng. 2017, 4, 1345673.
  25. Islam, M.S.; Kim, K.; Kim, H.Y. Data-Driven Approach for Fault Diagnosis of Harmonic Drives Using Wireless Acceleration Sensors and Machine Learning. Int. J. Precis. Eng. Manuf.-Green Technol. 2025, 12, 951–968.
  26. Asutkar, S.; Tallur, S. An explainable unsupervised learning framework for scalable machine fault detection in Industry 4.0. Meas. Sci. Technol. 2023, 34, 105123.
  27. Rigas, S.; Papachristou, M.; Sotiropoulos, I.; Alexandridis, G. Explainable Fault Classification and Severity Diagnosis in Rotating Machinery Using Kolmogorov–Arnold Networks. Entropy 2025, 27, 403.
  28. Palaniappan, R. Comparative analysis of support vector machine, random forest and k-nearest neighbor classifiers for predicting remaining usage life of roller bearings. Informatica 2024, 48, 39–52.
  29. Du, P.; Abdel Jabbar, N.M.; Wilhite, B.A.; Kravaris, C. Fault Diagnosis in Chemical Reactors with Data-Driven Methods. Ind. Eng. Chem. Res. 2025, 64, 6060–6076.
  30. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
  31. Dudani, S.A. The Distance-Weighted k-Nearest-Neighbor Rule. IEEE Trans. Syst. Man Cybern. 1976, SMC-6, 325–327.
  32. Yu, Z.; Chen, H.; Liu, J.; You, J.; Leung, H.; Han, G. Hybrid k-Nearest Neighbor Classifier. IEEE Trans. Cybern. 2016, 46, 1263–1275.
  33. Prasath, V.; Alfeilat, H.A.A.; Hassanat, A.; Lasassmeh, O.; Tarawneh, A.S.; Alhasanat, M.B.; Salman, H.S.E. Distance and Similarity Measures Effect on the Performance of K-Nearest Neighbor Classifier - A Review. arXiv 2017, arXiv:1708.04321.
  34. Bradley, A.P. The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognit. 1997, 30, 1145–1159.
  35. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2024.
  36. Choi, E.; Lee, C. Feature extraction based on the Bhattacharyya distance. Pattern Recognit. 2003, 36, 1703–1709.
  37. Rencher, A.C. Multivariate Statistical Inference and Applications; Wiley: New York, NY, USA, 1998; Volume 635.
  38. Patel, R.K.; Giri, V. Feature selection and classification of mechanical fault of an induction motor using random forest classifier. Perspect. Sci. 2016, 8, 334–337.
Figure 1. Testbench layout.
Figure 2. Failure types. (a) Breaking. (b) Cracking. (c) Pitting. (d) Scuffing.
Figure 3. Methodology for feature selection and classification analysis.
Figure 4. Vibration signal obtained from the accelerometers.
Figure 5. Feature selection process.
Figure 6. Classification accuracy and standard deviation (SD) by number of CIs and failure. (a) Breaking (B). (b) Cracking (C). (c) Pitting (P). (d) Scuffing (S).
Figure 7. Comparison Bhattacharyya distance by CIs and severity. (a) Breaking. (b) Cracking. (c) Pitting. (d) Scuffing.
Figure 8. Interaction factors for position. (a) Accelerometer:Classifier. (b) Failure:Classifier. (c) Accelerometer:Failure.
Figure 9. Interaction factors for inclination. (a) Accelerometer:Classifier. (b) Failure:Classifier. (c) Accelerometer:Failure.
Figure 10. Tukey’s Honestly Significant Difference (HSD) post-hoc for position. (a) Accelerometer. (b) Failure. (c) Classifier. (d) Accelerometer:Classifier. (e) Accelerometer:Failure. (f) Failure:Classifier.
Figure 11. Tukey’s Honestly Significant Difference (HSD) post-hoc for inclination. (a) Accelerometer. (b) Failure. (c) Classifier. (d) Accelerometer:Classifier. (e) Accelerometer:Failure. (f) Failure:Classifier.
Table 1. Main CIs by failure and accelerometer.

| Failure | A1: Variable (CI) | A1: MI | A2: Variable (CI) | A2: MI | A3: Variable (CI) | A3: MI | Weighing value |
|---|---|---|---|---|---|---|---|
| Breaking | TMHO | 11.88 | Mean | 10.75 | Zero crossing | 11.74 | 10 |
| | Mean | 11.83 | TMHO | 10.70 | Energy operator | 11.18 | 9 |
| | Zero crossing | 11.17 | Zero crossing | 10.53 | Mean | 11.06 | 8 |
| | Shape factor | 9.78 | Energy operator | 10.48 | TMHO | 11.01 | 7 |
| | SDIF | 9.67 | Kurtosis | 9.31 | SSC | 10.94 | 6 |
| | SSC | 8.80 | Latitude factor | 9.11 | Crest factor | 9.14 | 5 |
| | Skewness | 8.73 | Waveform | 8.63 | Impulse factor | 8.93 | 4 |
| | Energy operator | 8.51 | SSC | 8.61 | Latitude factor | 8.89 | 3 |
| | Margin index | 8.43 | Log-Log ratio | 8.61 | Kurtosis | 8.32 | 2 |
| | Log-Log ratio | 8.28 | Crest factor | 8.59 | Skewness | 8.17 | 1 |
| Cracking | Skewness | 11.29 | SSC | 10.14 | Skewness | 12.55 | 10 |
| | Mean | 10.06 | Clearance factor | 10.10 | SSC | 11.62 | 9 |
| | TMHO | 10.05 | Skewness | 9.88 | Zero crossing | 11.25 | 8 |
| | Energy operator | 9.98 | TMHO | 8.98 | Energy operator | 9.20 | 7 |
| | SSC | 9.38 | Mean | 8.95 | FSM | 9.14 | 6 |
| | SDIF | 9.03 | FSM | 8.93 | Mean | 8.82 | 5 |
| | Kurtosis | 8.95 | Zero crossing | 8.71 | TMHO | 8.81 | 4 |
| | Shape factor | 8.93 | Kurtosis | 8.32 | Latitude factor | 7.76 | 3 |
| | FSM | 8.89 | NNNL | 8.30 | Clearance factor | 7.32 | 2 |
| | Zero crossing | 8.55 | Energy operator | 7.84 | Kurtosis | 7.24 | 1 |
| Pitting | Skewness | 12.58 | Energy operator | 10.06 | SSC | 12.49 | 10 |
| | TMHO | 11.37 | SSC | 9.99 | Mean | 10.73 | 9 |
| | Mean | 11.25 | TMHO | 9.60 | TMHO | 10.70 | 8 |
| | SDIF | 10.09 | Mean | 9.55 | Zero crossing | 10.09 | 7 |
| | Shape factor | 10.06 | Clearance factor | 9.51 | Energy operator | 9.79 | 6 |
| | FSM | 9.03 | Kurtosis | 9.48 | Skewness | 9.39 | 5 |
| | Kurtosis | 8.80 | Waveform | 8.35 | Kurtosis | 8.37 | 4 |
| | Energy operator | 8.23 | Shape factor | 8.25 | Latitude factor | 8.17 | 3 |
| | Latitude factor | 8.23 | Impulse factor | 8.20 | Shape factor | 7.79 | 2 |
| | Log-Log ratio | 8.20 | SDIF | 8.15 | SDIF | 7.76 | 1 |
| Scuffing | TMHO | 11.82 | TMHO | 11.60 | TMHO | 11.69 | 10 |
| | Mean | 11.72 | Mean | 11.55 | Mean | 11.69 | 9 |
| | Zero crossing | 11.30 | FSM | 10.46 | Skewness | 10.92 | 8 |
| | Skewness | 10.47 | Zero crossing | 9.95 | FSM | 10.67 | 7 |
| | SDIF | 10.11 | Skewness | 8.69 | Energy operator | 8.54 | 6 |
| | Shape factor | 9.80 | Waveform | 8.65 | Impulse factor | 8.18 | 5 |
| | SSC | 9.60 | Clearance factor | 8.51 | Zero crossing | 8.02 | 4 |
| | Kurtosis | 9.04 | Pulse | 8.42 | Kurtosis | 8.01 | 3 |
| | FSM | 8.38 | Kurtosis | 8.26 | Clearance factor | 7.99 | 2 |
| | Wavelength | 8.34 | Impulse factor | 8.23 | Crest factor | 7.94 | 1 |

TMHO = Temporal moments higher order; SDIF = Standard deviation impulse factor; SSC = Slope sign change; FSM = Fifth statistic moment; NNNL = Normalized normal negative likelihood.
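Table 2. Main CIs for failure.

| CI Breaking | Weighing | CI Cracking | Weighing | CI Pitting | Weighing | CI Scuffing | Weighing |
|---|---|---|---|---|---|---|---|
| Mean | 27 | Skewness | 28 | TMHO | 25 | TMHO | 30 |
| Zero crossing | 26 | SSC | 25 | Mean | 24 | Mean | 27 |
| TMHO | 26 | Mean | 20 | Energy operator | 19 | Skewness | 21 |
| Energy operator | 19 | TMHO | 19 | SSC | 19 | Zero crossing | 19 |
| SSC | 14 | Energy operator | 15 | Skewness | 15 | FSM | 17 |
| Kurtosis | 8 | FSM | 13 | Kurtosis | 13 | Kurtosis | 8 |
| Latitude factor | 8 | Zero crossing | 13 | Shape factor | 11 | Energy operator | 6 |
| Shape factor | 7 | Clearance factor | 11 | SDIF | 9 | Impulse factor | 6 |
| SDIF | 6 | Kurtosis | 8 | Zero crossing | 7 | Clearance factor | 6 |
| Crest factor | 6 | SDIF | 5 | Clearance factor | 6 | SDIF | 6 |
| Skewness | 5 | Shape factor | 3 | FSM | 5 | Shape factor | 5 |
| Waveform | 4 | Latitude factor | 3 | Latitude factor | 5 | Waveform | 5 |
| Impulse factor | 4 | NNNL | 2 | Waveform | 4 | SSC | 4 |
| Log-Log ratio | 3 | | | Impulse factor | 2 | Pulse index | 3 |
| Margin index | 2 | | | Log-Log ratio | 1 | Wavelength | 1 |
| | | | | | | Crest factor | 1 |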
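
The per-failure weights in Table 2 appear to follow from the rank positions in Table 1: each accelerometer's top-ranked CI receives 10 points, the next 9, and so on down to 1, and the points are summed over A1 to A3. The Python sketch below reproduces that aggregation for the breaking failure. The function name `rank_weights` is hypothetical and the scoring rule is inferred from the two tables rather than quoted from the authors.

```python
from collections import defaultdict

def rank_weights(ranked_lists, top=10):
    """Sum rank-based points (top, top-1, ..., 1) for each CI over all rankings."""
    totals = defaultdict(int)
    for ranking in ranked_lists:
        for position, ci in enumerate(ranking[:top]):
            totals[ci] += top - position
    # Highest total first, as in Table 2.
    return dict(sorted(totals.items(), key=lambda item: -item[1]))

# Breaking failure: CI rankings for A1, A2 and A3, copied from Table 1.
breaking = [
    ["TMHO", "Mean", "Zero crossing", "Shape factor", "SDIF", "SSC",
     "Skewness", "Energy operator", "Margin index", "Log-Log ratio"],
    ["Mean", "TMHO", "Zero crossing", "Energy operator", "Kurtosis",
     "Latitude factor", "Waveform", "SSC", "Log-Log ratio", "Crest factor"],
    ["Zero crossing", "Energy operator", "Mean", "TMHO", "SSC",
     "Crest factor", "Impulse factor", "Latitude factor", "Kurtosis", "Skewness"],
]

weights = rank_weights(breaking)
for ci, weight in weights.items():
    print(f"{ci}: {weight}")
```

For the breaking failure this gives 27 for Mean and 26 for both Zero crossing and TMHO, in line with the first column pair of Table 2.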
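Table 3. Ranking CI.

| Ranking CI | # Failures | Weighing | Ranking CI | # Failures | Weighing |
|---|---|---|---|---|---|
| TMHO | 4 | 100 | Clearance factor | 3 | 23 |
| Mean | 4 | 98 | Latitude factor | 3 | 16 |
| Skewness | 4 | 70 | Waveform | 3 | 13 |
| Zero crossing | 4 | 65 | Impulse factor | 3 | 12 |
| SSC | 4 | 61 | Crest factor | 2 | 7 |
| Energy operator | 4 | 59 | Log-Log ratio | 2 | 5 |
| Kurtosis | 4 | 37 | Pulse | 1 | 3 |
| FSM | 3 | 35 | NNNL | 1 | 2 |
| SDIF | 4 | 27 | Margin index | 1 | 1 |
| Shape factor | 4 | 25 | Wavelength | 1 | 1 |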
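
The overall ranking in Table 3 can be read as a second aggregation step: for each CI, the per-failure weights of Table 2 are summed and the number of failure types in which the CI appears is counted. The short Python sketch below illustrates this for five CIs whose Table 2 weights are copied in by hand. The function name `aggregate` and the dictionary layout are assumptions made for illustration.

```python
from collections import Counter

def aggregate(per_failure):
    """Return {CI: (number of failures, total weight)} over all failure types."""
    totals = Counter()
    n_failures = Counter()
    for weights in per_failure.values():
        for ci, weight in weights.items():
            totals[ci] += weight
            n_failures[ci] += 1
    return {ci: (n_failures[ci], totals[ci]) for ci in totals}

# Subset of the Table 2 weights (a CI is omitted where it is not listed).
per_failure = {
    "Breaking": {"TMHO": 26, "Mean": 27, "Zero crossing": 26, "Crest factor": 6},
    "Cracking": {"TMHO": 19, "Mean": 20, "Zero crossing": 13, "Clearance factor": 11},
    "Pitting": {"TMHO": 25, "Mean": 24, "Zero crossing": 7, "Clearance factor": 6},
    "Scuffing": {"TMHO": 30, "Mean": 27, "Zero crossing": 19,
                 "Clearance factor": 6, "Crest factor": 1},
}

ranking = aggregate(per_failure)
for ci, (count, total) in sorted(ranking.items(), key=lambda item: -item[1][1]):
    print(f"{ci}: failures={count}, weighing={total}")
```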
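Table 4. Main CIs formulas.

| CI | Formula |
|---|---|
| Temporal moments higher order | $\left\lvert \dfrac{1}{N}\sum_{i=1}^{N} x_i^{m} \right\rvert$, with $m = 3$ by default |
| Mean | $\dfrac{1}{N}\sum_{i=1}^{N} x_i$ |
| Skewness | $\dfrac{N\sum_{i=1}^{N}(x_i - T_1)^{3}}{T_3^{3}}$ |
| Zero crossing | $\sum_{i=1}^{N} \mathrm{step}\left[\mathrm{sign}(-x_i x_{i+1})\right]$, with $\mathrm{sign}(x) = \begin{cases} 1, & \text{if } x > 0 \\ 0, & \text{if } x = 0 \\ -1, & \text{if } x < 0 \end{cases}$ and $\mathrm{step}(x) = \begin{cases} 1, & \text{if } x > 0 \\ \tfrac{1}{2}, & \text{if } x = 0 \\ 0, & \text{if } x < 0 \end{cases}$ |
| Slope sign change | $\sum_{i=2}^{N} f\left[(x_i - x_{i-1})(x_i - x_{i+1})\right]$, with $f(x) = \begin{cases} 1, & \text{if } x \ge \text{threshold} \\ 0, & \text{otherwise} \end{cases}$ |
| Energy operator | $\dfrac{N^{2}\sum_{i=1}^{N}(\Delta y_i - \Delta\bar{y})^{4}}{\left[\sum_{i=1}^{N}(\Delta y_i - \Delta\bar{y})^{2}\right]^{2}}$ |
| Kurtosis | $\dfrac{N\sum_{i=1}^{N}(x_i - T_1)^{4}}{\left[\sum_{i=1}^{N}(x_i - T_1)^{2}\right]^{2}}$ |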
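
To make the moment-based entries of Table 4 concrete, the Python sketch below evaluates the mean, the kurtosis and the temporal moment of higher order (TMHO, with m = 3) on a toy sequence. It assumes that T1 in the kurtosis formula denotes the sample mean. The function names are hypothetical, and the code is an illustration rather than the authors' implementation.

```python
def mean(x):
    """Arithmetic mean: (1/N) * sum of x_i."""
    return sum(x) / len(x)

def kurtosis(x):
    """N * sum((x_i - mean)^4) / (sum((x_i - mean)^2))^2."""
    mu = mean(x)
    second = sum((v - mu) ** 2 for v in x)
    fourth = sum((v - mu) ** 4 for v in x)
    return len(x) * fourth / second ** 2

def tmho(x, m=3):
    """Temporal moment of order m: |(1/N) * sum of x_i^m|."""
    return abs(sum(v ** m for v in x) / len(x))

# Toy sequence (illustrative values, not measured data).
samples = [1.0, 2.0, 3.0, 4.0, 5.0]
print(mean(samples), kurtosis(samples), tmho(samples))
```

For this sequence the deviations from the mean are -2, -1, 0, 1 and 2, so the kurtosis is 5 × 34 / 10² = 1.7 and the third-order temporal moment is 225 / 5 = 45.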
Table 5. Bhattacharyya distance by CIs in breaking failure.

| CI | P10_P1 | P2_P1 | P3_P1 | P4_P1 | P5_P1 | P6_P1 | P7_P1 | P8_P1 | P9_P1 | Selected |
|---|---|---|---|---|---|---|---|---|---|---|
| Mean | 0.0064 | 0.0201 | 0.0358 | 0.0120 | 0.0078 | 0.0022 | 0.0814 | 0.0765 | 0.0093 | |
| Kurtosis | 1.1276 | 0.0884 | 0.7727 | 0.3634 | 0.7616 | 0.3588 | 0.3181 | 0.5865 | 0.0418 | |
| Skewness | 0.5866 | 0.0673 | 0.5990 | 0.3561 | 0.6712 | 0.3422 | 0.1406 | 0.4332 | 0.4136 | |
| Energy operator | 0.4896 | 0.6353 | 0.4698 | 0.0545 | 0.0502 | 0.0424 | 0.0176 | 0.2303 | 0.0057 | |
| Zero crossing | 0.0179 | 0.1221 | 0.2126 | 0.4674 | 0.2867 | 0.0666 | 0.0040 | 0.2359 | 0.1662 | |
| Slope sign change | 0.0194 | 0.0261 | 0.1891 | 0.0148 | 0.1490 | 0.1738 | 0.0322 | 0.0399 | 0.0064 | |
| TMHO | 0.0230 | 0.0533 | 0.1255 | 0.0410 | 0.0162 | 0.0074 | 0.2399 | 0.2359 | 0.0340 | |
| Log detector | 0.0006 | 0.0404 | 0.0724 | 0.0021 | 0.0840 | 0.1366 | 0.0213 | 0.0087 | 0.0112 | X |
| Norm entropy | 0.0007 | 0.0453 | 0.0661 | 0.0012 | 0.0823 | 0.1366 | 0.0281 | 0.0086 | 0.0149 | X |
| Log energy entropy | 0.0008 | 0.0476 | 0.0577 | 0.0019 | 0.0742 | 0.1435 | 0.0386 | 0.0073 | 0.0185 | X |
| Wilson amplitude | 0.0091 | 0.1034 | 0.0797 | 0.0033 | 0.1043 | 0.0973 | 0.0146 | 0.1169 | 0.0027 | X |
| Mean square error | 0.0950 | 0.0739 | 0.1665 | 0.0016 | 0.0375 | 0.0415 | 0.0025 | 0.0768 | 0.0597 | X |
Table 6. Bhattacharyya distance by CIs in cracking failure.

| CI | P10_P1 | P2_P1 | P3_P1 | P4_P1 | P5_P1 | P6_P1 | P7_P1 | P8_P1 | P9_P1 | Selected |
|---|---|---|---|---|---|---|---|---|---|---|
| Mean | 0.0467 | 0.2361 | 0.1220 | 0.1081 | 0.1406 | 0.0883 | 0.0663 | 0.0858 | 0.0475 | |
| Kurtosis | 0.6123 | 0.4206 | 0.6007 | 0.4965 | 0.1207 | 0.1125 | 0.1914 | 0.1097 | 0.3974 | |
| Skewness | 0.1861 | 0.1702 | 0.0594 | 0.2048 | 0.3946 | 0.0068 | 0.2588 | 0.0923 | 0.0442 | |
| Energy operator | 0.8258 | 1.3385 | 0.9205 | 0.7314 | 1.2859 | 0.8833 | 0.7326 | 0.6106 | 0.5430 | |
| Zero crossing | 0.0289 | 0.1653 | 0.0218 | 0.1630 | 0.1393 | 0.1320 | 0.0774 | 0.4657 | 0.0853 | |
| Slope sign change | 0.0016 | 0.0301 | 0.0016 | 0.1294 | 0.3304 | 0.5692 | 0.2089 | 0.3196 | 0.0809 | |
| TMHO | 0.1989 | 0.7253 | 0.4340 | 0.3886 | 0.4875 | 0.3356 | 0.2727 | 0.3328 | 0.2164 | |
| Log energy entropy | 0.0069 | 0.0121 | 0.0523 | 0.0367 | 0.1946 | 0.2538 | 0.0917 | 0.1494 | 0.0118 | X |
| Norm entropy | 0.0074 | 0.0137 | 0.0577 | 0.0370 | 0.2174 | 0.2710 | 0.0837 | 0.1746 | 0.0093 | X |
| Waveform | 0.0370 | 0.0103 | 0.1494 | 0.0105 | 0.2426 | 0.2064 | 0.0313 | 0.1865 | 0.0043 | X |
| Log detector | 0.0093 | 0.0144 | 0.0666 | 0.0468 | 0.2375 | 0.2878 | 0.0763 | 0.2004 | 0.0075 | X |
| Wilson amplitude | 0.0018 | 0.0075 | 0.0526 | 0.0345 | 0.2109 | 0.3241 | 0.0926 | 0.2138 | 0.0181 | X |
Table 7. Bhattacharyya distance by CIs in pitting failure.

| CI | P10_P1 | P2_P1 | P3_P1 | P4_P1 | P5_P1 | P6_P1 | P7_P1 | P8_P1 | P9_P1 | Selected |
|---|---|---|---|---|---|---|---|---|---|---|
| Mean | 0.0344 | 0.0079 | 0.0576 | 0.0756 | 0.0103 | 0.0551 | 0.0165 | 0.0969 | 0.0273 | |
| Kurtosis | 1.8664 | 0.1378 | 0.2153 | 0.0763 | 0.2812 | 0.0229 | 0.0822 | 0.0239 | 0.0544 | |
| Skewness | 0.4265 | 0.0650 | 0.6016 | 0.1370 | 0.9159 | 0.4084 | 0.0473 | 0.0161 | 0.1699 | |
| Energy operator | 0.5188 | 0.8403 | 0.0285 | 0.0011 | 0.0130 | 0.0513 | 0.5376 | 0.2571 | 0.2024 | |
| Zero crossing | 0.0545 | 0.0504 | 0.0194 | 0.2574 | 0.0243 | 0.1362 | 0.0741 | 0.0494 | 0.0109 | |
| Slope sign change | 0.0179 | 0.1600 | 0.2648 | 0.0039 | 0.2379 | 0.0339 | 0.3576 | 0.3789 | 0.2742 | |
| TMHO | 0.1061 | 0.0216 | 0.1777 | 0.2243 | 0.0270 | 0.1476 | 0.0393 | 0.2778 | 0.0729 | |
| Clearance factor | 0.3758 | 0.0045 | 0.0405 | 0.0121 | 0.1001 | 0.0329 | 0.2984 | 0.1857 | 0.1385 | X |
| Pulse | 0.4189 | 0.0105 | 0.0527 | 0.0051 | 0.0752 | 0.0239 | 0.2541 | 0.1810 | 0.1995 | X |
| Wilson amplitude | 0.0223 | 0.0961 | 0.0833 | 0.0059 | 0.1526 | 0.0015 | 0.3008 | 0.3050 | 0.2722 | X |
| Log detector | 0.0143 | 0.1348 | 0.0958 | 0.0278 | 0.1441 | 0.0088 | 0.2943 | 0.2784 | 0.2781 | X |
| Norm entropy | 0.0102 | 0.1506 | 0.1200 | 0.0335 | 0.1722 | 0.0168 | 0.2747 | 0.2784 | 0.2680 | X |
Table 8. Bhattacharyya distance by CIs in scuffing failure.

| CI | P10_P1 | P2_P1 | P3_P1 | P4_P1 | P5_P1 | P6_P1 | P7_P1 | P8_P1 | P9_P1 | Selected |
|---|---|---|---|---|---|---|---|---|---|---|
| Mean | 0.0916 | 0.0829 | 0.0618 | 0.0545 | 0.0442 | 0.0441 | 0.0333 | 0.0979 | 0.0664 | |
| Kurtosis | 0.6395 | 0.3533 | 0.2871 | 0.2039 | 0.5874 | 0.5694 | 0.4844 | 0.3728 | 0.8484 | |
| Skewness | 0.2082 | 0.0850 | 0.0304 | 0.0070 | 0.2501 | 0.0966 | 0.0539 | 0.0202 | 0.0445 | |
| Energy operator | 0.1757 | 0.7935 | 0.0116 | 0.0435 | 0.2258 | 0.0384 | 0.2111 | 0.0480 | 0.0040 | |
| Zero crossing | 0.2809 | 0.1266 | 0.0771 | 0.0356 | 0.0772 | 0.0958 | 0.3544 | 0.0552 | 0.0606 | |
| Slope sign change | 0.0997 | 0.6204 | 0.3210 | 0.0049 | 0.0124 | 0.0011 | 0.0246 | 0.0010 | 0.0000 | |
| TMHO | 0.3601 | 0.3251 | 0.2599 | 0.2294 | 0.1819 | 0.1822 | 0.1395 | 0.3670 | 0.2683 | |
| Waveform | 0.0176 | 0.2126 | 0.1325 | 0.0012 | 0.0259 | 0.0645 | 0.0202 | 0.0260 | 0.0968 | X |
| Log entropy | 0.0391 | 0.2530 | 0.1784 | 0.0045 | 0.0060 | 0.0688 | 0.0162 | 0.0248 | 0.0505 | X |
| Norm entropy | 0.0336 | 0.2667 | 0.1801 | 0.0043 | 0.0066 | 0.0609 | 0.0128 | 0.0221 | 0.0559 | X |
| Log detector | 0.0245 | 0.2847 | 0.1892 | 0.0042 | 0.0058 | 0.0564 | 0.0075 | 0.0193 | 0.0595 | X |
| Wilson amplitude | 0.0043 | 0.3042 | 0.1484 | 0.0051 | 0.0039 | 0.0429 | 0.0055 | 0.0799 | 0.1350 | X |
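
The column labels in Tables 5 to 8 (P2_P1, P3_P1 and so on) suggest that each severity level is compared with level P1 for a given CI. As a hedged illustration of how one such entry could be computed, the Python sketch below uses the closed-form Bhattacharyya distance between two univariate normal distributions. Treating the CI values of each severity level as Gaussian is an assumption made here, as are the function names. The estimator actually used to produce the tables is not stated in this section.

```python
import math
import statistics

def bhattacharyya_gaussian(mu1, var1, mu2, var2):
    """Bhattacharyya distance between N(mu1, var1) and N(mu2, var2)."""
    mean_term = 0.25 * (mu1 - mu2) ** 2 / (var1 + var2)
    var_term = 0.5 * math.log((var1 + var2) / (2.0 * math.sqrt(var1 * var2)))
    return mean_term + var_term

def bhattacharyya_from_samples(a, b):
    """Estimate the distance from two samples of CI values (Gaussian assumption)."""
    return bhattacharyya_gaussian(
        statistics.mean(a), statistics.variance(a),
        statistics.mean(b), statistics.variance(b),
    )

# Toy CI values for two severity levels (illustrative numbers only).
level_p1 = [1.0, 2.0, 3.0]
level_p2 = [2.0, 3.0, 4.0]
print(bhattacharyya_from_samples(level_p2, level_p1))
```

A distance of zero means the two distributions coincide, so under this reading larger entries indicate CIs that separate a severity level from P1 more clearly.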
Table 9. Accuracy by accelerometer, failure and classifier.

| Accelerometer | Breaking RF | Breaking K-NN | Cracking RF | Cracking K-NN | Pitting RF | Pitting K-NN | Scuffing RF | Scuffing K-NN |
|---|---|---|---|---|---|---|---|---|
| A1 | 0.9841 | 0.9807 | 0.9960 | 0.9983 | 0.9849 | 0.9850 | 0.9805 | 0.9850 |
| A2 | 0.9884 | 0.9886 | 0.9948 | 0.9967 | 0.9921 | 0.9904 | 0.9827 | 0.9928 |
| A3 | 0.9872 | 0.9826 | 0.9949 | 0.9909 | 0.9906 | 0.9928 | 0.9908 | 0.9959 |
| A4 | 0.9908 | 0.9960 | 0.9939 | 0.9974 | 0.9923 | 0.9897 | 0.9894 | 0.9920 |
| A5 | 0.9929 | 0.9871 | 0.9948 | 0.9967 | 0.9904 | 0.9858 | 0.9899 | 0.9941 |
| A6 | 0.9974 | 0.9875 | 0.9911 | 0.9947 | 0.9947 | 0.9918 | 0.9943 | 0.9962 |
Table 10. AUC multiclass by accelerometer, failure and classifier.

| Accelerometer | Breaking RF | Breaking K-NN | Cracking RF | Cracking K-NN | Pitting RF | Pitting K-NN | Scuffing RF | Scuffing K-NN |
|---|---|---|---|---|---|---|---|---|
| A1 | 0.9922 | 0.9871 | 0.9965 | 0.9988 | 0.9885 | 0.9894 | 0.9905 | 0.9929 |
| A2 | 0.9920 | 0.9930 | 0.9970 | 0.9982 | 0.9969 | 0.9962 | 0.9943 | 0.9968 |
| A3 | 0.9956 | 0.9913 | 0.9950 | 0.9921 | 0.9935 | 0.9949 | 0.9952 | 0.9982 |
| A4 | 0.9959 | 0.9980 | 0.9950 | 0.9987 | 0.9981 | 0.9967 | 0.9971 | 0.9970 |
| A5 | 0.9964 | 0.9959 | 0.9983 | 0.9969 | 0.9926 | 0.9926 | 0.9961 | 0.9976 |
| A6 | 0.9994 | 0.9960 | 0.9937 | 0.9961 | 0.9973 | 0.9978 | 0.9992 | 0.9990 |
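
For readers who want to see how the two metrics reported in Tables 9 and 10 are defined, the Python sketch below computes classification accuracy and a multiclass AUC on toy data. The AUC here is a macro average of one-vs-rest binary AUCs, each obtained by pairwise comparison of scores. This averaging scheme is an assumption chosen for simplicity and may differ from the multiclass AUC definition used for Table 10. All names and values are illustrative.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def binary_auc(pos_scores, neg_scores):
    """Probability that a positive outscores a negative; ties count one half."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos_scores
        for n in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))

def macro_ovr_auc(y_true, scores):
    """Average of one-vs-rest AUCs; scores[i][c] is the score of class c for sample i."""
    aucs = []
    for c in sorted(set(y_true)):
        pos = [s[c] for y, s in zip(y_true, scores) if y == c]
        neg = [s[c] for y, s in zip(y_true, scores) if y != c]
        aucs.append(binary_auc(pos, neg))
    return sum(aucs) / len(aucs)

# Toy severity labels and class scores (illustrative values only).
y_true = ["P1", "P1", "P2", "P2"]
scores = [
    {"P1": 0.9, "P2": 0.1},
    {"P1": 0.6, "P2": 0.4},
    {"P1": 0.65, "P2": 0.35},
    {"P1": 0.2, "P2": 0.8},
]
y_pred = [max(s, key=s.get) for s in scores]
print(accuracy(y_true, y_pred), macro_ovr_auc(y_true, scores))
```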
Table 11. ANOVA factorial test for position.

| Factor | Df | Sum Square | Mean Square | F-Value | p-Value |
|---|---|---|---|---|---|
| A | 3 | 0.2150 | 0.07168 | 1295.10 | <0.001 |
| Failure | 3 | 0.1954 | 0.06512 | 1176.70 | <0.001 |
| Classifier | 1 | 0.0000 | 0.00000 | 0.00 | <0.001 |
| A:Failure | 9 | 0.2012 | 0.02236 | 404.00 | <0.001 |
| Failure:Classifier | 3 | 0.0762 | 0.02539 | 458.70 | <0.001 |
| A:Classifier | 3 | 0.0184 | 0.00614 | 111.00 | <0.001 |

A = accelerometer; Df = degrees of freedom.
Table 12. ANOVA factorial test for inclination.

| Factor | Df | Sum Square | Mean Square | F-Value | p-Value |
|---|---|---|---|---|---|
| A | 2 | 0.0829 | 0.04145 | 680.78 | <0.001 |
| Failure | 3 | 0.3211 | 0.10704 | 1757.89 | <0.001 |
| Classifier | 1 | 0.0065 | 0.00655 | 107.49 | <0.001 |
| A:Failure | 6 | 0.1427 | 0.02378 | 390.56 | <0.001 |
| Failure:Classifier | 3 | 0.0689 | 0.02296 | 377.06 | <0.001 |
| A:Classifier | 2 | 0.0089 | 0.00446 | 73.31 | <0.001 |

A = accelerometer; Df = degrees of freedom.
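
The columns of an ANOVA table are linked by two standard identities: the mean square equals the sum of squares divided by its degrees of freedom, and the F-value equals the mean square divided by the residual mean square. The Python sketch below applies both to the rows of Table 12. Since the residual row is not printed, the residual mean square is only implied here by dividing each mean square by its F-value. The variable and function names are hypothetical.

```python
# (factor, Df, sum of squares, mean square, F-value) as printed in Table 12
rows = [
    ("A", 2, 0.0829, 0.04145, 680.78),
    ("Failure", 3, 0.3211, 0.10704, 1757.89),
    ("Classifier", 1, 0.0065, 0.00655, 107.49),
    ("A:Failure", 6, 0.1427, 0.02378, 390.56),
    ("Failure:Classifier", 3, 0.0689, 0.02296, 377.06),
    ("A:Classifier", 2, 0.0089, 0.00446, 73.31),
]

def mean_square(sum_sq, df):
    """Mean square = sum of squares / degrees of freedom."""
    return sum_sq / df

def implied_residual_ms(mean_sq, f_value):
    """Residual mean square implied by F = mean square / residual mean square."""
    return mean_sq / f_value

for factor, df, sum_sq, mean_sq, f_value in rows:
    recomputed = mean_square(sum_sq, df)
    residual = implied_residual_ms(mean_sq, f_value)
    print(f"{factor}: MS from SS/Df = {recomputed:.5f}, "
          f"printed MS = {mean_sq:.5f}, implied residual MS = {residual:.3e}")
```

Small differences between the recomputed and printed mean squares are expected, because the sums of squares are shown to only four decimal places.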
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Pérez-Torres, A.; Sánchez, R.-V.; Barceló-Cerdá, S. Methodology for Feature Selection of Time Domain Vibration Signals for Assessing the Failure Severity Levels in Gearboxes. Appl. Sci. 2025, 15, 5813. https://doi.org/10.3390/app15115813
