# Machine Vibration Monitoring for Diagnostics through Hypothesis Testing


## Abstract


## 1. Introduction

- (a) Operational evaluation,
- (b) Data acquisition and cleansing,
- (c) Signal processing: features selection, extraction, and metrics,
- (d) Pattern processing: statistical model development and validation,
- (e) Situation assessment,
- (f) Decision making.

- Level 1: Detection—indication of the presence of damage, possibly at a given confidence
- Level 2: Localization—knowledge about the damage location
- Level 3: Classification—knowledge about the damage type
- Level 4: Assessment—damage size
- Level 5: Consequence—actual degree of safety and remaining useful life

#### 1.1. Features

- Damage consistency,
- Damage sensitivity and noise-rejection ability,
- Low sensitivity to unmonitored confounding factors.

#### 1.2. Pattern Recognition

#### 1.3. Methodology

#### 1.4. The Experimental Setup and the Dataset

## 2. The Methods

#### 2.1. Statistics and Probability: An Introduction to Hypothesis Testing

The probability is the limiting value of the relative frequency of a given attribute within a considered collective. The probabilities of all the attributes within the collective form its distribution.
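The limiting behaviour of the relative frequency can be illustrated with a small simulation. This is only a sketch: the attribute probability `p_attribute = 0.3`, the seed and the function name `relative_frequency` are assumptions chosen for illustration and do not come from the paper.

```python
import random


def relative_frequency(n_trials, p_attribute=0.3, seed=0):
    """Relative frequency of an attribute over n_trials simulated observations."""
    rng = random.Random(seed)
    # Each trial shows the attribute with (assumed) probability p_attribute.
    hits = sum(rng.random() < p_attribute for _ in range(n_trials))
    return hits / n_trials


# The relative frequency drifts towards p_attribute as the collective grows.
for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency(n))
```

With few trials the relative frequency can be far from 0.3; it settles near that value only as the number of observations increases.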

#### 2.1.1. Hypothesis Testing of the Difference between Two Population Means

#### 2.1.2. Diagnostics, Hypothesis Testing and Errors

- (a) In the training phase, the labelled samples are used to build a classifier, namely a function which divides the feature (variable) space into groups. This separation is then found in terms of distributions. When a single feature is used to investigate the machine, the classifier function reduces to the selection of a threshold. Note that this feature-space partitioning can also be obtained in an unsupervised way (i.e., without exploiting the labels), which is known as clustering.
- (b) In a second phase, the new observations are assigned to the corresponding class (i.e., classified) according to the classifier function. Each new unlabelled data point is then treated individually.
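The two phases can be sketched for the single-feature case, where the classifier is just a threshold. The rule used here to pick the threshold (the empirical 95th percentile of the healthy training feature, i.e., a 5% false-alarm rate) is an assumption made for illustration, as are the simulated Gaussian data and the names `train_threshold` and `classify`.

```python
import random


def train_threshold(healthy_values, alpha=0.05):
    """Training phase: take the (1 - alpha) empirical quantile of the healthy feature."""
    ordered = sorted(healthy_values)
    index = min(len(ordered) - 1, int((1.0 - alpha) * len(ordered)))
    return ordered[index]


def classify(value, threshold):
    """Second phase: each new observation is classified individually."""
    return "damaged" if value > threshold else "healthy"


rng = random.Random(1)
# Hypothetical labelled healthy samples of a single feature (standard normal).
healthy_train = [rng.gauss(0.0, 1.0) for _ in range(1000)]
threshold = train_threshold(healthy_train)

# Hypothetical new unlabelled observations: five healthy-like, five shifted.
new_observations = [rng.gauss(0.0, 1.0) for _ in range(5)]
new_observations += [rng.gauss(4.0, 1.0) for _ in range(5)]
labels = [classify(value, threshold) for value in new_observations]
print(threshold, labels)
```

Choosing the threshold from the healthy class alone fixes the false-alarm rate; the missed-alarm rate then depends on how far the damaged distribution sits from the healthy one.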

#### 2.2. Principal Component Analysis (PCA)

#### 2.3. Linear Discriminant Analysis (LDA)

#### 2.4. Mahalanobis Distance Novelty Detection

#### 2.4.1. Hypothesis Testing of Outliers

- Draw a sample of $n$ observations randomly generated from a $d$-dimensional standard normal distribution,
- Compute the deviation of each observation in terms of distance from the centroid, i.e., the novelty index (NI),
- Save the maximum deviation and repeat the draw $m$ times.
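The steps above can be sketched as follows. Two simplifying assumptions are made and are not taken from the paper: the covariance is treated as the known identity matrix (true for a standard normal), so the squared NI reduces to the squared Euclidean distance from the sample centroid, and the threshold is read as the 95% empirical quantile of the stored maxima. The function names are hypothetical.

```python
import random


def max_squared_deviation(n, d, rng):
    """Draw n observations from a d-dimensional standard normal and
    return the largest squared distance from the sample centroid."""
    sample = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    centroid = [sum(row[j] for row in sample) / n for j in range(d)]
    return max(
        sum((row[j] - centroid[j]) ** 2 for j in range(d)) for row in sample
    )


def monte_carlo_threshold(n, d, m, quantile=0.95, seed=0):
    """Repeat the draw m times and take a quantile of the stored maxima
    (the quantile rule is an assumption of this sketch)."""
    rng = random.Random(seed)
    maxima = sorted(max_squared_deviation(n, d, rng) for _ in range(m))
    index = min(m - 1, int(quantile * m))
    return maxima[index], maxima


ni_threshold, ni_maxima = monte_carlo_threshold(n=100, d=2, m=200)
print(ni_threshold)
```

In practice the mean and covariance would be estimated from each drawn sample, which makes the estimated distances differ from the true ones when $n$ is small relative to $d$.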

#### 2.4.2. The Curse of Dimensionality

#### 2.4.3. Mahalanobis Distance and Confounding Influences

## 3. The Results

## 4. Discussion

## 5. Conclusions

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## References

1. Farrar, C.R.; Doebling, S.W. Damage Detection and Evaluation II. In Modal Analysis and Testing; Silva, J.M.M., Maia, N.M.M., Eds.; NATO Science Series (Series E: Applied Sciences); Springer: Dordrecht, The Netherlands, 1999; ISBN 978-0-7923-5894-7.
2. Rytter, A. Vibration Based Inspection of Civil Engineering Structures. Ph.D. Thesis, University of Aalborg, Aalborg, Denmark, May 1993.
3. Worden, K.; Dulieu-Barton, J.M. An overview of intelligent fault detection in systems and structures. Struct. Health Monit. **2004**, 3, 85–98.
4. Deraemaeker, A.; Worden, K. A comparison of linear approaches to filter out environmental effects in structural health monitoring. Mech. Syst. Signal Process. **2018**, 105, 1–15.
5. Jardine, A.K.S.; Lin, D.; Banjevic, D. A review of machinery diagnostics and prognostics implementing condition-based maintenance. Mech. Syst. Signal Process. **2006**, 20, 1483–1510.
6. Zhang, W.; Zhou, J. Fault Diagnosis for Rolling Element Bearings Based on Feature Space Reconstruction and Multiscale Permutation Entropy. Entropy **2019**, 21, 519.
7. You, L.; Fan, W.; Li, Z.; Liang, Y.; Fang, M.; Wang, J. A Fault Diagnosis Model for Rotating Machinery Using VWC and MSFLA-SVM Based on Vibration Signal Analysis. Shock Vib. **2019**, 2019, 1908485.
8. Randall, R.B.; Antoni, J. Rolling Element Bearing Diagnostics—A Tutorial. Mech. Syst. Signal Process. **2011**, 25, 485–520.
9. Antoni, J.; Griffaton, J.; André, H.; Avendaño-Valencia, L.D.; Bonnardot, F.; Cardona-Morales, O.; Castellanos-Dominguez, G.; Paolo Daga, A.; Leclère, Q.; Vicuña, C.M.; et al. Feedback on the Surveillance 8 challenge: Vibration-based diagnosis of a Safran aircraft engine. Mech. Syst. Signal Process. **2017**, 97, 112–144.
10. Antoni, J.; Randall, R.B. Unsupervised noise cancellation for vibration signals: Part I and II—Evaluation of adaptive algorithms. Mech. Syst. Signal Process. **2004**, 18, 89–117.
11. Caesarendra, W.; Tjahjowidodo, T. A Review of Feature Extraction Methods in Vibration-Based Condition Monitoring and Its Application for Degradation Trend Estimation of Low-Speed Slew Bearing. Machines **2017**, 5, 21.
12. Daga, A.P.; Fasana, A.; Marchesiello, S.; Garibaldi, L. The Politecnico di Torino rolling bearing test rig: Description and analysis of open access data. Mech. Syst. Signal Process. **2019**, 120, 252–273.
13. Sikora, M.; Szczyrba, K.; Wróbel, Ł.; Michalak, M. Monitoring and maintenance of a gantry based on a wireless system for measurement and analysis of the vibration level. Eksploat. Niezawodn. **2019**, 21, 341.
14. Jolliffe, I.T. Principal Component Analysis; Springer: New York, NY, USA, 2002.
15. Bishop, C. Pattern Recognition and Machine Learning; Springer: New York, NY, USA, 2006; ISBN 978-0-387-31073-2.
16. Worden, K.; Manson, G.; Fieller, N.R.J. Damage detection using outlier analysis. J. Sound Vib. **2000**, 229, 647–667.
17. Von Mises, R. Probability, Statistics, and Truth, 2nd rev. English ed.; Dover Publications: New York, NY, USA, 1981; ISBN 0-486-24214-5.
18. Holman, J.P.; Gajda, W.J. Experimental Methods for Engineers; McGraw-Hill: New York, NY, USA, 2011; ISBN 0073529303.
19. Daniel, W.W.; Cross, C.L. Biostatistics: A Foundation for Analysis in the Health Sciences; Wiley: Hoboken, NJ, USA, 2012; ISBN 978-1118302798.
20. Howell, D.C. Fundamental Statistics for the Behavioral Sciences, 8th ed.; Cengage Learning: Boston, MA, USA, 2013; ISBN 978-1285076911.
21. Yan, A.M.; Kerschen, G.; De Boe, P.; Golinval, J.C. Structural damage diagnosis under varying environmental conditions—Part I: A linear analysis. Mech. Syst. Signal Process. **2005**, 19, 847–864.
22. Penny, K.I. Appropriate critical values when testing for a single multivariate outlier by using the Mahalanobis distance. J. Royal Stat. Soc. Series C (Appl. Stat.) **1996**, 45, 73–81.
23. Worden, K.; Allen, D.W.; Sohn, H.; Farrar, C.R. Damage detection in mechanical structures using extreme value statistics. SPIE **2002**.
24. Toshkova, D.; Lieven, N.; Morrish, P.; Hutchinson, P. Applying Extreme Value Theory for Alarm and Warning Levels Setting under Variable Operating Conditions. Available online: https://www.ndt.net/events/EWSHM2016/app/content/Paper/293_Filcheva_Rev4.pdf (accessed on 5 June 2019).
25. Takahashi, R. Normalizing constants of a distribution which belongs to the domain of attraction of the Gumbel distribution. Stat. Probab. Lett. **1987**, 5, 197–200.
26. Gupta, P.L.; Gupta, R.D. Sample size determination in estimating a covariance matrix. Comput. Stat. Data Anal. **1987**, 5, 185–192.
27. Yan, A.M.; Kerschen, G.; De Boe, P.; Golinval, J.C. Structural damage diagnosis under varying environmental conditions—Part II: Local PCA for non-linear cases. Mech. Syst. Signal Process. **2005**, 19, 865–880.
28. Deraemaeker, A.; Worden, K. New Trends in Vibration Based Structural Health Monitoring; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2012; ISBN 978-3-7091-0399-9.
29. Arnaiz-González, Á.; Fernández-Valdivielso, A.; Bustillo, A.; López de Lacalle, L.N. Using artificial neural networks for the prediction of dimensional error on inclined surfaces manufactured by ball-end milling. Int. J. Adv. Manufact. Technol. **2016**, 83, 847–859.

**Figure 1.** The experimental setup, with the location (A1 and A2) and orientation of the triaxial accelerometers: (**a**) constructive parts; (**b**) bearings and accelerometers location.

**Figure 2.** The considered dataset after feature extraction. The black dotted lines divide the different damage conditions (0A to 6A). For each, the 100 observations for the 17 speed and load conditions are plotted sequentially.

**Figure 3.** The damage (a conical indentation) on the rolling element as obtained through a Rockwell tool (**a**) and its evolution after 19 h at various load and speed conditions (**b**). Dimensions in mm.

**Figure 4.** One- or two-tailed hypothesis testing principle, with the significance $\alpha$ (limit for the p-value) and the critical value(s) of the confidence interval highlighted: (**a**) single-sided; (**b**) double-sided.

**Figure 5.** Visual representation of the hypothesis test of the difference between two population means. The critical values are highlighted together with the corresponding significance in terms of tail areas (in cyan).

**Figure 6.** The power of a test on two population means: (**a**) a visualization of the significance $\alpha$ and of the power $1-\beta$ for a particular case; (**b**) the power (under the assumption of normality) as a function of ${t}^{*}$, which depends on the effect size $d=\frac{{\mu}_{1}-{\mu}_{2}}{\sigma}$ and the sample size $n$.

**Figure 7.** Receiver Operating Characteristic (ROC) as a function of the threshold (Gaussian distributions). (**a**) Graphical summary of the table of type I and type II errors, in yellow (Table 9). (**b**) ROC for binary classification with different effect sizes ${d}^{*}$ and the position of the 95% critical value (black dotted). For ${d}^{*}=0.2$ the performance is very poor, as the ROC lies close to the bisector of the first and third quadrants (random classifier).
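A point of the ROC in panel (b) can be computed in closed form under a simple Gaussian model. The sketch below assumes, for illustration only, a healthy class distributed as $N(0,1)$ and a damaged class as $N({d}^{*},1)$ with a one-sided threshold; the name `roc_point` is hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def roc_point(threshold, effect_size):
    """False-alarm rate and detection rate for a one-sided threshold,
    assuming healthy ~ N(0, 1) and damaged ~ N(effect_size, 1)."""
    false_alarm_rate = 1.0 - STANDARD_NORMAL.cdf(threshold)
    detection_rate = 1.0 - STANDARD_NORMAL.cdf(threshold - effect_size)
    return false_alarm_rate, detection_rate


# One-sided 95% critical value of the healthy distribution.
critical_value = STANDARD_NORMAL.inv_cdf(0.95)
for d_star in (0.2, 0.5, 0.8):
    fpr, tpr = roc_point(critical_value, d_star)
    print(f"d* = {d_star}: false alarms = {fpr:.3f}, detections = {tpr:.3f}")
```

For ${d}^{*}=0.2$ the detection rate at the 95% critical value is only slightly above the false-alarm rate, which is what places the ROC close to the random-classifier bisector.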

**Figure 8.** Visualization of the principal component analysis (PCA) principle for a 2D case: geometric interpretation.

**Figure 10.** Mahalanobis equivalent procedure on a simplified 2D plane, for 2 simulated normal classes (blue: reference, red: novel). Notice that the centring (**a**) and standardization (**c**) of the space are unique and based on the reference set alone. All the novel acquisitions are later mapped to the same space. (**a**) Data centred on reference condition; (**b**) rotated according to PCs; (**c**) standardized; (**d**) squared components: non-linear space transform; (**e**) distance from centre (origin).
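The five steps in the caption can be traced numerically in 2D, where the principal directions have a closed form. This is a sketch on simulated data; the reference cloud, the seed and the names `fit_reference`, `novelty_index` and `direct_mahalanobis` are assumptions for illustration. The direct formula is included only as a cross-check.

```python
import math
import random


def fit_reference(reference):
    """Mean, principal-axis angle and eigenvalues of the 2x2 sample covariance,
    estimated on the reference set alone."""
    n = len(reference)
    mean_x = sum(p[0] for p in reference) / n
    mean_y = sum(p[1] for p in reference) / n
    a = sum((p[0] - mean_x) ** 2 for p in reference) / (n - 1)
    c = sum((p[1] - mean_y) ** 2 for p in reference) / (n - 1)
    b = sum((p[0] - mean_x) * (p[1] - mean_y) for p in reference) / (n - 1)
    theta = 0.5 * math.atan2(2.0 * b, a - c)      # angle of the first PC
    radius = math.hypot((a - c) / 2.0, b)
    eigenvalues = ((a + c) / 2.0 + radius, (a + c) / 2.0 - radius)
    return (mean_x, mean_y), theta, eigenvalues, (a, b, c)


def novelty_index(point, model):
    (mean_x, mean_y), theta, (lam_1, lam_2), _ = model
    dx, dy = point[0] - mean_x, point[1] - mean_y            # (a) centring
    s_1 = dx * math.cos(theta) + dy * math.sin(theta)        # (b) rotation to PCs
    s_2 = -dx * math.sin(theta) + dy * math.cos(theta)
    z_1, z_2 = s_1 / math.sqrt(lam_1), s_2 / math.sqrt(lam_2)  # (c) standardization
    return math.sqrt(z_1 ** 2 + z_2 ** 2)                    # (d) squares, (e) distance


def direct_mahalanobis(point, model):
    """Cross-check using the explicit inverse of the 2x2 covariance."""
    (mean_x, mean_y), _, _, (a, b, c) = model
    dx, dy = point[0] - mean_x, point[1] - mean_y
    return math.sqrt((c * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / (a * c - b * b))


rng = random.Random(0)
reference = []
for _ in range(500):
    g_1, g_2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    reference.append((1.0 + g_1, 2.0 + 0.8 * g_1 + 0.6 * g_2))  # correlated cloud

model = fit_reference(reference)
novel_point = (3.0, -1.0)
print(novelty_index(novel_point, model), direct_mahalanobis(novel_point, model))
```

Both routes give the same distance, which is the sense in which the centre, rotate, standardize sequence is equivalent to the Mahalanobis distance.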

**Figure 11.** (**a**) Several induced Gumbel distributions for the maxima arising from a bivariate standard normal, as a function of $n$, for $m=100$ Monte Carlo repetitions. (**b**) Monte Carlo sampling from a 5-dimensional multivariate normal with random mean and covariance matrix. The ${\widehat{\mu}}_{g}^{2}$ values obtained by fitting a Gumbel (green dots) are compared to the theoretical ${\mu}_{g}^{2}\left(n\right)$ critical values from the ${\chi}^{2}$ and Wilks' criteria.

**Figure 12.** Three ${\chi}^{2}$ distributions for an increasing number of degrees of freedom (dofs). The ${\chi}_{100}^{2}$ is compared to its asymptotic distribution ${N}_{\left(100,\sqrt{200}\right)}$. The asymptotic means are highlighted as black dotted lines. Notice that, as the dofs increase, the distribution concentrates around the asymptotic mean.

**Figure 13.** Averages of the estimated Mahalanobis distances (red) and true Euclidean distances (blue) for the maxima of a $30$-dimensional multivariate Gaussian, as a function of the sample size $n$, considering 1000 Monte Carlo repetitions. The ±σ confidence intervals of the estimates are also given in cyan (Euclidean) and red (Mahalanobis).

**Figure 14.** Results for condition 12 ($280\,\mathrm{Hz}$, $1800\,\mathrm{N}$) alone: (**a**,**c**,**e**) the 3 different dimensionality reductions; (**b**,**d**,**f**) the corresponding empirical pdf for the healthy condition (blue) and for the 6 damage conditions (red). The x axis represents the 100 observations for each of the different damage conditions (ordered from 0A to 6A), which are separated by the black dotted lines. (**g**) ROC curves for the three 1-D reductions. The 5% significance threshold values are highlighted with the black circles; the black x denotes the position of the novelty index (NI) threshold computed according to the MC repetition introduced in Section 2.4.1, shown in magenta. (**h**) The 2-D PCA visualization of the dataset (condition 12) and the LDA direction $w$.

**Figure 15.** Results for condition 3 ($90\,\mathrm{Hz}$, $1400\,\mathrm{N}$) alone: (**a**,**c**,**e**) the 3 different dimensionality reductions; (**b**,**d**,**f**) the corresponding empirical pdf for the healthy condition (blue) and for the 6 damage conditions (red). The x axis represents the 100 observations for each of the different damage conditions (ordered from 0A to 6A), which are separated by the black dotted lines. (**g**) ROC curves for the three 1-D reductions. The 5% significance threshold values are highlighted with the black circles; the black x denotes the position of the NI threshold computed according to the MC repetition introduced in Section 2.4.1, shown in magenta. (**h**) The 2-D PCA visualization of the dataset (condition 12) and the LDA direction $w$.

**Figure 16.** (**a**) ROC curves for the three classifications considering constant speed ($180\,\mathrm{Hz}$) but variable load ($0$, $1000$, $1400$, $1800\,\mathrm{N}$): conditions 5 to 8. (**b**) ROC curves for the three classifications considering constant load ($1400\,\mathrm{N}$) but variable speed ($90$, $180$, $280$, $370$, $470\,\mathrm{Hz}$): conditions 2, 6, 10, 14, 17. The 5% significance threshold values are highlighted with the black circles; the black x denotes the position of the NI threshold computed according to the MC repetition introduced in Section 2.4.1.

**Figure 17.** ROC curves for the three classifications considering the whole dataset involving both speed and load variability (all the 17 conditions in Table 3). The 5% significance threshold values are highlighted with the black circles; the black x denotes the position of the NI threshold computed according to the MC repetition introduced in Section 2.4.1.

Moments | Name | Formulation |
---|---|---|
Order 1, raw moment: Location | Mean Value | ${\mu}_{1}={\mu}_{y}=E\left[y\left(k\right)\right]=\int_{-\infty}^{+\infty}y\,p\left(y\right)\,dy$ |
Order 2, central moment: Dispersion | Variance | ${\mu}_{2}={\sigma}_{y}^{2}=E\left[{\left(y\left(k\right)-{\mu}_{y}\right)}^{2}\right]=\int_{-\infty}^{+\infty}{\left(y-{\mu}_{y}\right)}^{2}\,p\left(y\right)\,dy$ |
Order 3, standardized moment: Symmetry | Skewness | $\frac{{\mu}_{3}}{{\sigma}_{y}^{3}}=E\left[{\left(\frac{y\left(k\right)-{\mu}_{y}}{{\sigma}_{y}}\right)}^{3}\right]$ |
Order 4, standardized moment: “Tailedness” | Kurtosis | $\frac{{\mu}_{4}}{{\sigma}_{y}^{4}}=E\left[{\left(\frac{y\left(k\right)-{\mu}_{y}}{{\sigma}_{y}}\right)}^{4}\right]$ |

Level Indicators | Name | Formulation |
---|---|---|
Root Mean Square | RMS | $RMS=\sqrt{E\left[{\left(y\left(k\right)\right)}^{2}\right]}$ |
Peak value | Peak | $peak=\frac{\mathrm{max}\left(y\left(k\right)\right)-\mathrm{min}\left(y\left(k\right)\right)}{2}$ |
Crest factor | Crest | $crest=\frac{peak}{RMS}$ |
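The moments and level indicators tabulated above can be computed on a sampled signal as in the following sketch. The expectation $E[\cdot]$ is replaced by the plain sample average (an assumption of this sketch, giving the biased estimators), and the sine test signal and function names are illustrative only.

```python
import math


def expectation(values):
    """Sample average, used here as the estimate of E[.]."""
    return sum(values) / len(values)


def vibration_features(y):
    """Mean, variance, skewness, kurtosis, RMS, peak and crest factor of a signal.
    A constant signal (zero variance) is not handled by this sketch."""
    mean = expectation(y)
    variance = expectation([(v - mean) ** 2 for v in y])
    sigma = math.sqrt(variance)
    skewness = expectation([((v - mean) / sigma) ** 3 for v in y])
    kurtosis = expectation([((v - mean) / sigma) ** 4 for v in y])
    rms = math.sqrt(expectation([v ** 2 for v in y]))
    peak = (max(y) - min(y)) / 2.0
    return {
        "mean": mean,
        "variance": variance,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "rms": rms,
        "peak": peak,
        "crest": peak / rms,
    }


# Ten full periods of a unit-amplitude sine, 100 samples per period.
signal = [math.sin(2.0 * math.pi * k / 100.0) for k in range(1000)]
features = vibration_features(signal)
for name, value in features.items():
    print(f"{name}: {value:.4f}")
```

For a pure sine the crest factor is $\sqrt{2}$ and the kurtosis is 1.5, which gives a quick sanity check of the implementation.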

**Table 3.** Bearing B1 codification according to damage type (inner ring or rolling element) and size. The damage is obtained through a Rockwell tool producing a conical indentation, whose maximum diameter is reported as the characteristic size.

Code | 0A | 1A | 2A | 3A | 4A | 5A | 6A |
---|---|---|---|---|---|---|---|
Damage type | none | Inner Ring | Inner Ring | Inner Ring | Rolling Element | Rolling Element | Rolling Element |
Damage size [µm] | - | 450 | 250 | 150 | 450 | 250 | 150 |

Label | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Speed $f$ [Hz] | 90 | 90 | 90 | 90 | 180 | 180 | 180 | 180 | 280 | 280 | 280 | 280 | 370 | 370 | 370 | 470 | 470 |
Load $F$ [kN] | 0 | 1 | 1.4 | 1.8 | 0 | 1 | 1.4 | 1.8 | 0 | 1 | 1.4 | 1.8 | 0 | 1 | 1.4 | 0 | 1 |

**Table 5.** Common confidence intervals for a Gaussian variable [18].

Standard Interval | Inside to Outside Ratio | Confidence $1-\alpha$ |
---|---|---|
$\pm 0.6745$ | $1$ to $1$ | $50\%$ |
$\pm 1$ | $2.15$ to $1$ | $68.3\%$ |
$\pm 2$ | $21$ to $1$ | $95.5\%$ |
$\pm 3$ | $369$ to $1$ | $99.7\%$ |
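The tabulated values follow from the Gaussian cumulative distribution: the probability of falling within $\pm k$ standard deviations is $\mathrm{erf}(k/\sqrt{2})$. The short sketch below reproduces them; the function names are illustrative.

```python
import math


def gaussian_confidence(k):
    """Probability that a Gaussian variable falls within +/- k standard deviations."""
    return math.erf(k / math.sqrt(2.0))


def inside_to_outside_ratio(k):
    """Ratio between the probability inside and outside the +/- k interval."""
    inside = gaussian_confidence(k)
    return inside / (1.0 - inside)


for k in (0.6745, 1.0, 2.0, 3.0):
    print(
        f"+/-{k}: confidence = {100.0 * gaussian_confidence(k):.1f}%, "
        f"ratio = {inside_to_outside_ratio(k):.2f} to 1"
    )
```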

Tails | p-value |
---|---|
Right-tail event | $\mathrm{Pr}\left(K\ge k\mid {H}_{0}\right)$ |
Left-tail event | $\mathrm{Pr}\left(K\le k\mid {H}_{0}\right)$ |
Double-tail event (on a symmetric distribution) | $2\,\mathrm{min}\left(\mathrm{Pr}\left(K\ge k\mid {H}_{0}\right),\mathrm{Pr}\left(K\le k\mid {H}_{0}\right)\right)$ |

**Table 7.** Statistical summary of the sample as a function of the population distribution and of the sample size $n$.

Distribution of the Population: | Statistical Summary of the Sample: |
---|---|
Normal distributions with given variance, or generic distributions (also non-normal) assuming $n>30$, thanks to the CLT | $z=\frac{E\left[{x}_{1}\right]-E\left[{x}_{2}\right]}{\sqrt{{\sigma}_{1}^{2}/{n}_{1}+{\sigma}_{2}^{2}/{n}_{2}}}\sim {N}_{\left(0,1\right)}$ |
Normal distributions with unknown variance | $t=\frac{E\left[{x}_{1}\right]-E\left[{x}_{2}\right]}{\sqrt{{s}_{p}^{2}/{n}_{1}+{s}_{p}^{2}/{n}_{2}}}\sim {t}_{\left({n}_{1}+{n}_{2}-2\right)}$ |
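The $z$ statistic of the first row, together with the tail probabilities $\mathrm{Pr}(K\ge k\mid {H}_{0})$ and $\mathrm{Pr}(K\le k\mid {H}_{0})$, can be evaluated as in this sketch. It covers only the known-variance case; the sample means stand in for $E[{x}_{1}]$ and $E[{x}_{2}]$, and the numbers in the example are made up for illustration.

```python
import math
from statistics import NormalDist, fmean


def two_sample_z(sample_1, sample_2, sigma_1, sigma_2):
    """z statistic for the difference between two means with known standard deviations."""
    n_1, n_2 = len(sample_1), len(sample_2)
    standard_error = math.sqrt(sigma_1 ** 2 / n_1 + sigma_2 ** 2 / n_2)
    return (fmean(sample_1) - fmean(sample_2)) / standard_error


def p_value(z, tail="both"):
    """p-value of z under H0 (standard normal) for a right, left or double tail."""
    right = 1.0 - NormalDist().cdf(z)   # Pr(K >= k | H0)
    left = NormalDist().cdf(z)          # Pr(K <= k | H0)
    if tail == "right":
        return right
    if tail == "left":
        return left
    return 2.0 * min(right, left)


# Hypothetical example: two samples of 50 values with means 10.5 and 10.0, sigma = 1.
z_example = two_sample_z([10.5] * 50, [10.0] * 50, sigma_1=1.0, sigma_2=1.0)
print(z_example, p_value(z_example))
```

The null hypothesis of equal means is rejected at significance $\alpha$ when the p-value falls below $\alpha$.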

**Table 8.** General rule for a rough quantification of the effect size [20].

Effect Size | ${d}^{*}$ |
---|---|
Small | $0.2$ |
Medium | $0.5$ |
Large | $0.8$ |

CBM Actions | True Health Condition: Healthy (${\mathrm{H}}_{0}$) | True Health Condition: Damaged |
---|---|---|
Accept ${\mathrm{H}}_{0}$: Healthy | No Alarm: true healthy | Missed Alarm: type II error |
Reject ${\mathrm{H}}_{0}$: Damaged | False Alarm: type I error | Alarm: true damaged |

Scatter Matrices | Optimization of the Separation Index $J\left(w\right)$ |
---|---|
Between class scatter matrix: ${S}_{b}={\left({\mu}_{2}-{\mu}_{1}\right)}^{\prime}\left({\mu}_{2}-{\mu}_{1}\right)$ | $J\left(w\right)=\frac{{w}^{\prime}{S}_{b}w}{{w}^{\prime}{S}_{w}w}$ |
Within class scatter matrix: ${S}_{w}=\sum_{h\in {C}_{1}}{\left({x}^{h}-{\mu}_{1}\right)}^{\prime}\left({x}^{h}-{\mu}_{1}\right)+\sum_{k\in {C}_{2}}{\left({x}^{k}-{\mu}_{2}\right)}^{\prime}\left({x}^{k}-{\mu}_{2}\right)$ | $\mathrm{arg}\,\underset{w}{\mathrm{max}}\,J\left(w\right)$: $w\propto {S}_{w}^{-1}{\left({\mu}_{2}-{\mu}_{1}\right)}^{\prime}$ |
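For two classes in 2D, the closed-form direction $w\propto {S}_{w}^{-1}{({\mu}_{2}-{\mu}_{1})}^{\prime}$ can be computed with an explicit 2x2 inverse, as in this sketch. The two small point sets and all function names are made up for illustration.

```python
import math


def class_mean(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def within_scatter(class_1, class_2):
    """Entries (sxx, sxy, syy) of the symmetric within-class scatter matrix S_w."""
    sxx = sxy = syy = 0.0
    for points in (class_1, class_2):
        mu = class_mean(points)
        for x, y in points:
            dx, dy = x - mu[0], y - mu[1]
            sxx += dx * dx
            sxy += dx * dy
            syy += dy * dy
    return sxx, sxy, syy


def lda_direction(class_1, class_2):
    """Unit vector proportional to S_w^{-1} (mu_2 - mu_1)'."""
    mu_1, mu_2 = class_mean(class_1), class_mean(class_2)
    sxx, sxy, syy = within_scatter(class_1, class_2)
    det = sxx * syy - sxy * sxy
    dx, dy = mu_2[0] - mu_1[0], mu_2[1] - mu_1[1]
    w = ((syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det)
    norm = math.hypot(w[0], w[1])
    return (w[0] / norm, w[1] / norm)


def separation_index(w, class_1, class_2):
    """J(w) = (w' S_b w) / (w' S_w w) for a 2D direction w."""
    mu_1, mu_2 = class_mean(class_1), class_mean(class_2)
    between = (w[0] * (mu_2[0] - mu_1[0]) + w[1] * (mu_2[1] - mu_1[1])) ** 2
    sxx, sxy, syy = within_scatter(class_1, class_2)
    within = sxx * w[0] ** 2 + 2.0 * sxy * w[0] * w[1] + syy * w[1] ** 2
    return between / within


# Hypothetical, strongly correlated classes shifted with respect to each other.
class_a = [(0.0, 0.0), (1.0, 0.8), (2.0, 2.1), (3.0, 2.9)]
class_b = [(1.0, -1.0), (2.0, 0.1), (3.0, 0.9), (4.0, 2.2)]
w_lda = lda_direction(class_a, class_b)
print(w_lda, separation_index(w_lda, class_a, class_b))
```

Because the classes are elongated along a common direction, the LDA direction is not simply the line joining the two means; it is tilted so as to reduce the within-class spread of the projection.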

**Table 11.** MD confusion matrix for the whole dataset involving both speed and load variability. The 5% significance threshold highlighted in Figure 17 is used, as confirmed by the 0A false-alarm rate of $5\%$.

Classified \ True Class | 0A | 1A | 2A | 3A | 4A | 5A | 6A |
---|---|---|---|---|---|---|---|
Healthy | 95 | 9 | 52 | 64 | 0 | 32 | 26 |
Damaged | 5 | 91 | 48 | 36 | 100 | 68 | 74 |
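The error rates implied by the confusion matrix follow from simple ratios, as in this sketch. The counts are copied from the table; the variable names are illustrative.

```python
true_classes = ["0A", "1A", "2A", "3A", "4A", "5A", "6A"]
classified_healthy = [95, 9, 52, 64, 0, 32, 26]
classified_damaged = [5, 91, 48, 36, 100, 68, 74]

# Total number of observations per true class (each column of the table).
totals = [h + d for h, d in zip(classified_healthy, classified_damaged)]

# Type I error: healthy (0A) observations classified as damaged.
false_alarm_rate = classified_damaged[0] / totals[0]

# Type II error: damaged observations classified as healthy, per damage code.
missed_alarm_rates = {
    code: healthy / total
    for code, healthy, total in zip(
        true_classes[1:], classified_healthy[1:], totals[1:]
    )
}
overall_missed_alarm_rate = sum(classified_healthy[1:]) / sum(totals[1:])

print(false_alarm_rate, missed_alarm_rates, overall_missed_alarm_rate)
```

The false-alarm rate matches the chosen 5% significance, while the missed-alarm rate varies strongly with the damage code.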

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Daga, A.P.; Garibaldi, L.
Machine Vibration Monitoring for Diagnostics through Hypothesis Testing. *Information* **2019**, *10*, 204.
https://doi.org/10.3390/info10060204
