# Incipient Fault Diagnosis of Rolling Bearings Based on Impulse-Step Impact Dictionary and Re-Weighted Minimizing Nonconvex Penalty Lq Regular Technique

Correction published on 23 April 2020, see
*Entropy* **2020**, *22*(4), 483.

College of Mechanical Engineering, Donghua University, Shanghai 201620, China

George W. Woodruff School of Mechanical Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0405, USA

Author to whom correspondence should be addressed.

Received: 31 July 2017 / Revised: 15 August 2017 / Accepted: 16 August 2017 / Published: 18 August 2017

The periodical transient impulses caused by localized faults are sensitive and important characteristic information for rotating machinery fault diagnosis. However, it is very difficult to accurately extract transient impulses at the incipient fault stage because the fault impulse features are rather weak and always corrupted by heavy background noise. In this paper, a new transient impulse extraction methodology is proposed based on impulse-step dictionary and re-weighted minimizing nonconvex penalty Lq regular (R-WMNPLq, q = 0.5) for the incipient fault diagnosis of rolling bearings. Prior to the sparse representation, the original vibration signal is preprocessed by the variational mode decomposition (VMD) technique. Due to the physical mechanism of periodic double impacts, including step-like and impulse-like impacts, an impulse-step impact dictionary atom could be designed to match the natural waveform structure of vibration signals. On the other hand, the traditional sparse reconstruction approaches such as orthogonal matching pursuit (OMP), L1-norm regularization treat all vibration signal values equally and thus ignore the fact that the vibration peak value may have more useful information about periodical transient impulses and should be preserved at a larger weight value. Therefore, penalty and smoothing parameters are introduced on the reconstructed model to guarantee the reasonable distribution consistence of peak vibration values. Lastly, the proposed technique is applied to accelerated lifetime testing of rolling bearings, where it achieves a more noticeable and higher diagnostic accuracy compared with OMP, L1-norm regularization and traditional spectral Kurtogram (SK) method.

Rolling bearings are extensively used as critical elements in the transmission systems of rotating machinery, and unexpected faults may cause severe mechanical failures and great economic losses or even personal casualties. Therefore, the incipient fault diagnosis (IFD) and health management of rolling bearings are becoming more and more crucial in engineering applications [1,2,3].

Over the past decades, feature extraction techniques have been applied to extract the transient characteristics from the original vibration signal, which can indicate the fault location and fault type of rolling bearings. However, fault feature extraction usually suffers from two challenges when the defects are at an early stage: (1) at the early stage of fault degradation, fault-related components are not yet fully developed and incipient faults differ considerably from obvious failure states, so in practical applications the bearing is generally regarded as still being in its normal operating stage; (2) the useful weak fault impulse transient characteristics are often submerged in heavy background noise and other irrelevant components, which makes the fault diagnosis more difficult. Therefore, the major concern in bearing fault feature extraction is to determine which signal processing tools and algorithms to use to distinguish and diagnose early-stage fault characteristics. Up to now, various fault diagnosis techniques have been proposed to address the above challenges, such as wavelet/wavelet-packet transform [4], local mean decomposition (LMD) and its extension [5], minimum entropy deconvolution (MED) and its extension [6,7], artificial intelligence (AI) algorithms such as artificial neural network (ANN) and fuzzy algorithm [8,9,10], Hilbert envelope spectrum [11], energy and entropy methods [12,13,14], and higher order statistical techniques [15,16,17,18], to mention just a few. Unfortunately, some drawbacks of these common techniques remain unresolved. For example, an adaptive decomposition problem exists in the wavelet/wavelet-packet and LMD methods, sample training and quantitative fault severity analysis issues exist in the ANN and fuzzy algorithms, and energy and entropy methods are complex and time-consuming. As a consequence, if a weak fault exists, the acquired vibration signals are rather complex, and these shortcomings may hinder the effectiveness of traditional methods.

Compared to traditional fault feature extraction approaches, sparse representation (SR) is a new signal processing method whereby a given vibration signal can be sparsely represented as a linear combination of sparse basis or dictionary atoms. SR has been successfully applied to fault detection of rotating machinery. For example, Zhu and Fan [19,20] developed an optimal Laplace wavelet, tunable Q-factor wavelet transform and single-side Morlet wavelet basis combined with the split variable augmented Lagrangian shrinkage algorithm (SALSA) to extract impulse components and transient features. Du [21] proposed a nuclear norm minimization that uses a weighted low-rank sparse model for bearing fault detection. Cui [22,23] introduced composite dictionary multi-atom matching and a matching pursuit algorithm based on an adaptive impulse dictionary for gear-box and bearing fault diagnosis. A redundant dictionary of defect-induced impulses combined with the matching pursuit (MP) approach was proposed by He in [24]. Zhang [25] proposed a kurtosis-based weighted sparse model built on a convex optimization technique; this technique formulated the prior information as a sparse regularization problem and performed well in bearing fault diagnosis. He [26] employed a local time-frequency (TF) domain sparse representation to reconstruct the native pulse waveform structure of fault transients, and showed that the proposed method was superior to the traditional MP and K-singular value decomposition (K-SVD) methods.

Although sparse representation has been applied successfully to fault diagnosis of rotating machinery, the following two situations still need further research:

- (1) When a spalling defect or pitting corrosion is induced, a series of successive impulses will be generated during subsequent operation. However, most dictionaries and optimal wavelet bases constructed in previous methods use only a single pulse or a single impact frequency, e.g., the optimal Laplace wavelet, single-side Morlet wavelet basis, transient impulse atoms, etc. Therefore there is no guarantee that the sparse-basis construction can match the natural waveform structure of the vibration signal well.
- (2) In practice, due to the fluctuation of the load and speed, and the interference of the harsh working environment, some random variations will be generated between an impulse and its neighboring impulses. The traditional sparse reconstruction methods such as greedy pursuit, orthogonal matching pursuit (OMP), L1-norm regularization and iterative shrinkage algorithms ignore those time-varying physical characteristics, which leads to a lower success rate of the transient impulse reconstruction. In addition, the traditional sparse reconstruction approaches treat all vibration signal values equally and thus ignore the fact that the vibration peak values may carry more useful information about periodical transient impulses and should be preserved with a larger weight.

In an attempt to overcome the above shortcomings and extend the universality of SR in fault diagnosis, this paper proposes a new sparse representation algorithm called re-weighted minimizing nonconvex penalty Lq regular (R-WMNPLq) combined with an impulse-step impact dictionary for incipient bearing fault diagnosis, using a bearing accelerated life test diagnosis as a case study. Prior to the sparse representation, the original vibration signal is preprocessed by the variational mode decomposition (VMD) technique for incipient fault signal filtering. Furthermore, the impulse-step impact dictionary atoms are constructed to match the natural waveform structure of the vibration signals. Considering the time-varying physical characteristics of transient impulses and keeping a reasonable distribution consistence of peak vibration values, the R-WMNPLq (q = 0.5) technique is employed and the fault frequencies and failure location can be diagnosed accordingly. Incipient transient feature extraction results indicate that the impulse time and the period of transients can be detected more accurately and effectively in cases where previous approaches failed, which can significantly improve the performance of sparse reconstruction for extracting transient impulses from heavy noisy vibration signal.

The remainder of this paper is organized as follows: Section 2 describes the theory of the impulse-step impact dictionary. Section 3 introduces re-weighted minimizing nonconvex penalty Lq regular (q = 0.5) algorithm and the implementation details of this method. In Section 4, the diagnosis results and discussion of the proposed algorithm with other previous approaches are presented. Conclusions are made in Section 5.

The periodic transient impulses of rolling bearings are mainly generated by the impact between the bearing elements, inner race and outer race. For example, when there is a pitting failure with a certain size in the outer race, the bearing element initially contacts the anterior fault edge and then exits from the lagging fault edge, as shown in Figure 1a. Thus two impacts are generated due to entry into and exit from the fault region. The first impact could be treated as step-like response (i.e., with low frequency components) and the second impact could be treated as impulse-like response (i.e., with high frequency components) [27]. This phenomenon has been proven by the practical fault data from the bearing data center of Case Western Reserve University [28], as shown in Figure 1b. Based on the above analysis, in this paper, a double transient impulse dictionary atom that includes step-like and impulse-like impacts is proposed.

Firstly, the time taken by a rolling element to enter and then exit the fault region can be calculated by:

$$\Delta {t}_{o}=\frac{{l}_{o}}{\pi {D}_{0}{f}_{c}}=\frac{{l}_{o}}{\pi ({D}_{p}+d)}\times \frac{1}{\frac{1}{2}{f}_{r}(1-\frac{d}{{D}_{p}})}=\frac{2{l}_{o}{D}_{p}}{\pi {f}_{r}({D}_{p}^{2}-{d}^{2})} \tag{1}$$

where l_{o} is the defect size on the outer race, d is the rolling element diameter, D_{0} is the diameter of the outer race, i.e., D_{0} = D_{p} + d, D_{p} is the pitch diameter, f_{r} is the shaft rotation frequency, f_{c} is the bearing cage frequency, i.e., ${f}_{c}=\frac{{f}_{r}}{2}(1-\frac{d}{{D}_{p}}\mathrm{cos}\alpha )$, and α is the contact angle. In practice, when the defect size l_{o} (mm) is smaller than the rolling element diameter d, the rolling element cannot come into contact with the bottom of the pit, so the distance it travels between entering and exiting the fault region is half of the defect size l_{o} (mm). Thus the period time $\Delta t$ becomes:

$$\Delta t=\frac{1}{2}\times \Delta {t}_{o}=\frac{{l}_{o}{D}_{p}}{\pi {f}_{r}({D}_{p}^{2}-{d}^{2})} \tag{2}$$
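As a quick numerical check of Equations (1) and (2), the following minimal sketch evaluates the double-impact interval for an outer-race defect. The function name and the sample values (a 1 mm defect, bearing dimensions borrowed from the test bearing described later, and a 2000 rpm shaft speed) are illustrative assumptions, and, like the closed form of Equation (1), the code takes cos α = 1.

```python
import math


def outer_race_impact_interval(l_o, D_p, d, f_r):
    """Return (dt_o, dt) in seconds for an outer-race defect.

    l_o, D_p and d are lengths in mm; f_r is the shaft rotation frequency in Hz.
    The contact angle is neglected (cos(alpha) = 1), as in the closed form.
    """
    dt_o = 2.0 * l_o * D_p / (math.pi * f_r * (D_p**2 - d**2))  # Equation (1)
    return dt_o, 0.5 * dt_o                                      # Equation (2)


# Hypothetical example values, chosen only for illustration.
dt_o, dt = outer_race_impact_interval(l_o=1.0, D_p=71.501, d=8.4074, f_r=2000 / 60)
```

The interval scales linearly with the defect size, which is why the spacing between the step-like and impulse-like responses can in principle be used to estimate how large the defect is.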

Similarly, when there is a pitting failure with a certain size l_{i} (mm) on the inner race, the corresponding period time $\Delta {t}_{i}$ can be expressed as follows:

$$\Delta {t}_{i}=\frac{{l}_{i}}{\pi {D}_{i}({f}_{r}-{f}_{c})}=\frac{{l}_{i}}{\pi ({D}_{p}-d)}\times \frac{1}{{f}_{r}-\frac{1}{2}{f}_{r}(1-\frac{d}{{D}_{p}})}=\frac{2{l}_{i}{D}_{p}}{\pi {f}_{r}({D}_{p}^{2}-{d}^{2})} \tag{3}$$

which is the same as the period time derived for the same defect size on the outer race, as shown in Equation (1). Suppose the impulse-like impact response occurs at time u, so that the step-like impact response occurs at $u-\Delta t$. The single-degree-of-freedom impulse-like impact (in the form of a decaying sine wave) can then be defined as:

$${x}_{imp}=\mathrm{exp}(\frac{-(t-u)}{\tau})\mathrm{sin}(2\pi {f}_{n}t) \tag{4}$$

The single step-like impact can be defined as:

$${x}_{\mathrm{step}}=\mathrm{exp}(\frac{-(t-u-\Delta t)}{3\tau})\times (-\mathrm{cos}(2\pi \frac{{f}_{n}}{6}t))+\mathrm{exp}(\frac{-(t-u)}{5\tau}) \tag{5}$$

Therefore, the impulse-step impact dictionary atom can be defined as:

$$x=a\cdot {x}_{imp}+{x}_{step} \tag{6}$$

where f_{n} is the system natural frequency, u is the time when the impulse-like impact occurs, τ is the system damping and a is the peak value ratio of the impulse-like response to the step-like response [27].

In order to generate an impulse-step impact signal representative of that obtained from the double impact of the rolling element with the anterior and lagging fault edges, the time-domain waveforms of the impulse-like impact atom, the step-like impact atom, the impulse-step impact atom, and the impulse-step impact signal without/with noise generated using Equation (6) are shown in Figure 2, where the simulated bearing type was NACHI 2206GK, whose detailed parameters are listed in Table 1. The parameters of the impulse-step impact equation were set as follows: the system damping constant τ is 0.001, the peak value ratio a is 0.3, the system natural frequency f_{n} is 10,000 Hz, the time u at which the impulse-like response occurs is 0.005, and the rotor speed is 800 rpm. The time-domain waveform of the impulse-step impact signal with noise is shown in Figure 2e. The signal-to-noise ratio (SNR) is 20 dB. The similarity between the measured signal presented in Figure 1b and the simulated signal with noise is quite apparent.
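The atom of Equation (6) can be sketched numerically as follows. This evaluates Equations (4) to (6) as printed, using the τ, a, f_{n} and u values quoted above; the sampling rate, the value of Δt, the zeroing of the atom before t = u and the unit-norm scaling are assumptions made for this illustration and are not taken from the paper.

```python
import numpy as np


def impulse_step_atom(t, u, dt, tau=0.001, f_n=10_000.0, a=0.3):
    """Evaluate x = a * x_imp + x_step on the time grid t (seconds)."""
    # Equation (4): decaying sine wave (impulse-like response).
    x_imp = np.exp(-(t - u) / tau) * np.sin(2 * np.pi * f_n * t)
    # Equation (5): low-frequency step-like response.
    x_step = (np.exp(-(t - u - dt) / (3 * tau)) * (-np.cos(2 * np.pi * f_n / 6 * t))
              + np.exp(-(t - u) / (5 * tau)))
    x = a * x_imp + x_step  # Equation (6)
    # Assumption: the atom is switched on at t = u and is zero before it.
    return np.where(t >= u, x, 0.0)


fs = 20_000.0                                   # assumed sampling rate in Hz
t = np.arange(1024) / fs
atom = impulse_step_atom(t, u=0.005, dt=2e-4)   # dt is a hypothetical value
atom = atom / np.linalg.norm(atom)              # unit norm, a common dictionary convention
```

A redundant dictionary could then be assembled by stacking such atoms for a grid of onset times u (and, if desired, other parameters) as columns of D.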

The basic idea of sparse representation is that a vibration signal can be represented as a linear superposition of a few sparse atoms plus a residual component [29,30,31,32,33]. Denoting a vibration signal by $y\in {R}^{p}$, the approximating process can be represented as:

$$y=x+n=D\alpha +n={\displaystyle \sum _{k\in {\mathrm{\Omega}}_{m}}{\alpha}_{k}{d}_{k}}+n \tag{7}$$

where x is the approximating signal, n represents the residual component, $D\in {R}^{p\times n},p\ll n$ is called the redundant dictionary, which consists of n sparse atoms ${d}_{i}\in {R}^{p}(i=1,2,\cdots ,n)$, and ${\alpha}_{k}$ are the sparse coefficients of the vibration signal y. As Equation (7) shows, there are two issues to be solved:

- (1) Designing a redundant dictionary D. The first important issue is how to construct a redundant dictionary D that is suitable for the transient behavior of fault impulse components.
- (2) Recovering the sparse coefficients $\alpha $. Another important issue is how to design an optimization algorithm to calculate the sparse coefficients of the vibration fault signal.

It should be noted that $y=D\alpha +\mathrm{n}$ is a highly underdetermined equation [34], for which there is an infinite set of solutions. In the present method, signal reconstruction by sparse representation under a residual error constraint is formulated as the optimization problem:

$$\stackrel{~}{\alpha}=\mathrm{arg}\mathrm{min}{\Vert \alpha \Vert}_{0},\mathrm{subject}\mathrm{to},{\Vert D\alpha -y\Vert}_{2}^{2}\le c \tag{8}$$

where c is a threshold on the residual error. Moreover, prior knowledge of the original signal is usually utilized to regularize the solution, which is expressed as:

$$\stackrel{~}{\alpha}=\mathrm{arg}\mathrm{min}{\Vert D\alpha -y\Vert}_{2}^{2}+\lambda \cdot \zeta (x) \tag{9}$$

where $\lambda $ is the regularization weight and $\zeta (x)$ is the regularization term. From the perspective of Bayesian estimation, ${\Vert D\alpha -y\Vert}_{2}^{2}$ and $\zeta (x)$ can be viewed as the likelihood part and the prior knowledge part, respectively. Therefore, the prior knowledge part $\zeta (x)$ plays a significant role in signal reconstruction based on sparse representation.
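For reference, the greedy orthogonal matching pursuit (OMP) baseline mentioned in this paper can be sketched in a few lines. This is a generic textbook OMP, not the authors' implementation; it approximates the L0 problem of Equation (8) by fixing the number of selected atoms (the `sparsity` argument, an assumption of this sketch) instead of the residual threshold c.

```python
import numpy as np


def omp(D, y, sparsity, tol=1e-12):
    """Greedy OMP: select at most `sparsity` atoms (columns of D) to approximate y."""
    residual = y.astype(float).copy()
    support = []
    coef = np.zeros(0)
    for _ in range(sparsity):
        if np.linalg.norm(residual) <= tol:
            break
        # Pick the atom most correlated with the current residual.
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx in support:
            break
        support.append(idx)
        # Re-fit all selected coefficients by least squares, then update the residual.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    alpha = np.zeros(D.shape[1])
    alpha[support] = coef
    return alpha
```

Because every selected atom receives an unweighted least-squares coefficient, this baseline treats all signal values equally, which is the behavior the re-weighted method is designed to improve on.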

Usually, a fault vibration signal can also be divided into two types: the first is the periodic transient impulse containing step-like impact and impulse-like impact, and the second is the smoothing regions between the impulse and its neighbor impulse, as shown in Figure 2e. For the first one, the physical structure and fault type determine the similarity between two impulses, and the influence of external noise is relatively weak. However, in smoothing regions, due to the fluctuation of the varying load and speed, and the interference of the harsh working environment, the influence of external noise will play a critical role in signal reconstruction. If the noise level is strong, the information of noise in smoothing regions is regarded as structural information in sparse coefficients. Meanwhile, the classical optimization and regularization approaches also treat all vibration signal values equally and thus ignore a fact that the vibration peak value may have more useful information of periodical transient impulses, and cannot remove the false structural information contained in the sparse coefficients, and the traditional methods may cause instability and obvious artifacts in the reconstructed signal.

To overcome the above issue, inspired by the unconstrained low-rank matrix recovery ideas in Refs. [35,36,37], which have been applied successfully in the compressed sensing field [29,30,31,32,33], a new re-weighted minimizing nonconvex penalty Lq (0 < q ≤ 1) regular (R-WMNPLq) method is introduced. It differs from the methods studied in [35,36,37], where a uniform random matrix (URM, i.e., a matrix whose entries are random variables with uniform distribution) was used: in this work, the impulse-step impact dictionary is utilized for extracting the fault information from the observed noisy data. The objective function is as follows:

$${L}_{q}(\alpha ,\epsilon ,\lambda )={\displaystyle \sum _{j=1}^{N}{({\alpha}_{j}^{2}+{\epsilon}^{2})}^{q/2}}+\frac{1}{2\lambda}{\Vert D\alpha -b\Vert}_{2}^{2} \tag{10}$$

where q is the regular operator, $\epsilon (\epsilon >0)$ is the smoothing parameter, $\lambda (\lambda >0)$ is the penalty parameter and b is the measurement vector. It should be mentioned that the smoothing parameter plays a critical role in signal reconstruction in the smoothing regions. The detailed update procedure is shown in Algorithm 1.
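To see where the linear systems (11) and (12) of Algorithm 1 come from, it may help to write out the stationarity condition of the objective function. The following is a short worked derivation under the usual re-weighting assumption (the denominator is frozen at the previous iterate), not a restatement of the proofs in [35,36]. Differentiating the objective with respect to ${\alpha}_{j}$ gives:

$$\frac{\partial {L}_{q}}{\partial {\alpha}_{j}}=\frac{q{\alpha}_{j}}{{({\epsilon}^{2}+{\alpha}_{j}^{2})}^{1-q/2}}+\frac{1}{\lambda}{[{D}^{T}(D\alpha -b)]}_{j}=0$$

This equation is nonlinear in $\alpha $. Evaluating the denominator at the current iterate ${\alpha}^{(k)}$ and ${\epsilon}_{k}$, while keeping the unknown ${\alpha}^{(k+1)}$ elsewhere, makes it linear and yields Equation (11). Multiplying by $\lambda $ and collecting the terms in ${\alpha}^{(k+1)}$ gives:

$$\left({D}^{T}D+\mathrm{diag}({w}^{(k)})\right){\alpha}^{(k+1)}={D}^{T}b,\quad {w}_{i}^{(k)}=\frac{q\lambda}{{({\epsilon}_{k}^{2}+{({\alpha}_{i}^{(k)})}^{2})}^{1-q/2}}$$

which is Equation (12). Small coefficients receive large weights ${w}_{i}^{(k)}$ and are pushed further towards zero, whereas large (peak) coefficients receive small weights and are preserved.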

**Algorithm 1** Re-weighted minimizing nonconvex penalty Lq regular (R-WMNPLq)

| Step | Operation |
| --- | --- |
| 1 | Input: matrix D, measurement vector b and estimated sparsity level s; |
| 2 | Choose appropriate parameters $\lambda (\lambda >0)$, q (0 < q ≤ 1); |
| 3 | Initialize ${\alpha}^{(0)}$ such that $D{\alpha}^{(0)}=b$, and ${\epsilon}_{0}=1$; |
| 4 | Set k = 0; |
| 5 | Solve the following linear system for ${\alpha}^{(k+1)}$, |
| 6 | $(\frac{q{\alpha}^{(k+1)}[i]}{{({\epsilon}_{k}^{2}+{\Vert {\alpha}^{(k)}[i]\Vert}_{2}^{2})}^{1-q/2}}{)}_{1\le i\le M}+\frac{1}{\lambda}{D}^{T}(D{\alpha}^{(k+1)}-b)=0$ (11) |
| 7 | or, equivalently, |
| 8 | $({D}^{T}D+diag{(\frac{q\lambda}{{({\epsilon}_{k}^{2}+{\Vert {\alpha}^{(k)}[i]\Vert}_{2}^{2})}^{1-q/2}})}_{1\le i\le M}){\alpha}^{(k+1)}={D}^{T}b$ (12) |
| 9 | If the required reconstruction precision is reached, take the coefficients ${\alpha}^{(k+1)}$ as the output value of $\alpha $ and terminate the algorithm; otherwise execute the next steps. |
| 10 | Let β be a constant, where 0 < β < 1. Update $\epsilon $ by the formula ${\epsilon}_{k+1}=\mathrm{min}\{{\epsilon}_{k},\beta \cdot r{({\alpha}^{(k+1)})}_{s+1}\}$, where $r(\alpha )$ represents the rearrangement of the absolute values of ${\alpha}^{(k+1)}$ in decreasing order, and $r{(\alpha )}_{s+1}$ is the (s + 1)th component of $r(\alpha )$. Note that, if ${\epsilon}_{k+1}=0$, choose ${\alpha}^{(k+1)}$ as the approximate sparse solution and stop the iteration. |
| 11 | Let k = k + 1, and return to step 5 to continue. |
| 12 | Output: sparse coefficients α; |
| 13 | End |
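The steps of Algorithm 1 can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions rather than the authors' code: the initial guess is the minimum-norm solution from the pseudo-inverse, the value β = 0.5, the relative-change stopping rule and the `eps_floor` guard (which treats a vanishing ε as ε = 0) are choices made here, and the dense solve is only practical for small dictionaries.

```python
import numpy as np


def rwmnp_lq(D, b, s, q=0.5, lam=1e-6, beta=0.5,
             max_iter=1000, tol=1e-10, eps_floor=1e-10):
    """Iteratively re-weighted least-squares sketch of Algorithm 1."""
    DtD, Dtb = D.T @ D, D.T @ b
    # Step 3: minimum-norm alpha with D @ alpha = b (exact when b lies in the range of D).
    alpha = np.linalg.pinv(D) @ b
    eps = 1.0
    for _ in range(max_iter):
        # Diagonal weights of Equation (12), frozen at the current iterate.
        weights = q * lam / (eps**2 + alpha**2) ** (1.0 - q / 2.0)
        alpha_next = np.linalg.solve(DtD + np.diag(weights), Dtb)
        # Step 9: stop once successive iterates agree to the requested precision.
        step = np.linalg.norm(alpha_next - alpha)
        alpha = alpha_next
        if step <= tol * max(1.0, np.linalg.norm(alpha)):
            break
        # Step 10: shrink eps using the (s+1)-th largest magnitude (index s).
        r = np.sort(np.abs(alpha))[::-1]
        eps = min(eps, beta * r[s])
        if eps <= eps_floor:  # assumption: treat a vanishing eps as eps = 0 and stop
            break
    return alpha
```

The sparsity level `s` must be smaller than the number of atoms. For large dictionaries, the matrix inversion lemma or an iterative solver would usually replace the dense `np.linalg.solve` call.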

For the R-WMNPLq method, the following theorem summarizes the convergence and error estimation results for 0 < q ≤ 1, which justify the algorithm proposed above:

**Theorem 1** (Error estimation theorem) [35,36]**.** Suppose that x^{0} is a sparse signal with sparsity level s which satisfies Dx^{0} = b. Without loss of generality, the sparse coefficient vector $\alpha $ is denoted here by x. Let the smoothing parameter satisfy ${\epsilon}_{k}\to {\epsilon}_{*}$ as $k\to \infty $, and let the matrix D satisfy the restricted isometry property (RIP) [30,31,33] of order 2s with ${\delta}_{2s}<1$. When ${\epsilon}_{*}>0$, the sequence {x^{(k)}} has at least one convergent subsequence. Suppose that its limit ${x}^{{\epsilon}_{*}}$ is a local optimal solution of Equation (10); then:

$${\Vert {x}^{{\epsilon}_{*}}-{x}^{0}\Vert}_{2}\le {C}_{1}\sqrt{\lambda}+{C}_{2}{\delta}_{s}{({x}^{{\epsilon}_{*}})}_{2} \tag{13}$$

where ${\delta}_{s}{({x}^{{\epsilon}_{*}})}_{2}$ is the approximation error of ${x}^{{\epsilon}_{*}}$, which satisfies ${\delta}_{s}{({x}^{{\epsilon}_{*}})}_{2}=\underset{{\Vert y\Vert}_{2,0}\le s}{\mathrm{inf}}{\Vert {x}^{{\epsilon}_{*}}-y\Vert}_{2}$. In the special case ${\epsilon}_{*}=0$, there must exist a convergent subsequence converging to a point ${x}^{*}$, which satisfies:

$${\Vert {x}^{0}-{x}^{*}\Vert}_{2}\le {C}_{3}\sqrt{\lambda} \tag{14}$$

where C_{1}, C_{2} and C_{3} are independent positive constants. To prove Theorem 1, the following two lemmas (i.e., Lemmas 1 and 2) [35,36] are required.

**Lemma 1** [35,36]**.** For all $x,y\in {R}^{N}$ and 0 < q ≤ 1, if ${\epsilon}_{k}\ge {\epsilon}_{k+1}\ge 0$, it satisfies:

$${({\epsilon}_{k}^{2}+{\Vert x\Vert}_{2}^{2})}^{\frac{q}{2}}-{({\epsilon}_{k+1}^{2}+{\Vert y\Vert}_{2}^{2})}^{\frac{q}{2}}-\frac{q{y}^{T}(x-y)}{{({\epsilon}_{k}^{2}+{\Vert x\Vert}_{2}^{2})}^{1-\frac{q}{2}}}\ge \frac{q{\Vert x-y\Vert}_{2}^{2}}{2{({\epsilon}_{k}^{2}+{\Vert x\Vert}_{2}^{2})}^{1-\frac{q}{2}}} \tag{15}$$

According to arithmetic-geometric mean inequality [38], i.e.:

$${({\epsilon}_{k}^{2}+{\Vert x\Vert}_{2}^{2})}^{1-\frac{q}{2}}{({\epsilon}_{k+1}^{2}+{\Vert y\Vert}_{2}^{2})}^{\frac{q}{2}}\le (1-\frac{q}{2})({\epsilon}_{k}^{2}+{\Vert x\Vert}_{2}^{2})+\frac{q}{2}({\epsilon}_{k+1}^{2}+{\Vert y\Vert}_{2}^{2})$$

Then we compute:

$$\begin{array}{l}{({\epsilon}_{k}^{2}+{\Vert x\Vert}_{2}^{2})}^{\frac{q}{2}}-{({\epsilon}_{k+1}^{2}+{\Vert y\Vert}_{2}^{2})}^{\frac{q}{2}}-\frac{q{y}^{T}(x-y)}{{({\epsilon}_{k}^{2}+{\Vert x\Vert}_{2}^{2})}^{1-\frac{q}{2}}}\\ =\frac{({\epsilon}_{k}^{2}+{\Vert x\Vert}_{2}^{2})-{({\epsilon}_{k}^{2}+{\Vert x\Vert}_{2}^{2})}^{1-\frac{q}{2}}{({\epsilon}_{k+1}^{2}+{\Vert y\Vert}_{2}^{2})}^{\frac{q}{2}}-q{y}^{T}(x-y)}{{({\epsilon}_{k}^{2}+{\Vert x\Vert}_{2}^{2})}^{1-\frac{q}{2}}}\\ \ge \frac{({\epsilon}_{k}^{2}+{\Vert x\Vert}_{2}^{2})-(1-\frac{q}{2})({\epsilon}_{k}^{2}+{\Vert x\Vert}_{2}^{2})-\frac{q}{2}({\epsilon}_{k+1}^{2}+{\Vert y\Vert}_{2}^{2})-q{y}^{T}(x-y)}{{({\epsilon}_{k}^{2}+{\Vert x\Vert}_{2}^{2})}^{1-\frac{q}{2}}}\\ =\frac{q}{2}\frac{{\epsilon}_{k}^{2}-{\epsilon}_{k+1}^{2}+{\Vert x-y\Vert}_{2}^{2}}{{({\epsilon}_{k}^{2}+{\Vert x\Vert}_{2}^{2})}^{1-\frac{q}{2}}}\ge \frac{q}{2}\frac{{\Vert x-y\Vert}_{2}^{2}}{{({\epsilon}_{k}^{2}+{\Vert x\Vert}_{2}^{2})}^{1-\frac{q}{2}}}\end{array}$$

This completes the proof of Lemma 1. ☐

**Lemma 2** [35,36]**.** Let ${L}_{q}(x,\epsilon ,\lambda )={\displaystyle \sum _{j=1}^{N}{[{x}_{j}^{2}+{\epsilon}^{2}]}^{q/2}}+\frac{1}{2\lambda}{\Vert Dx-b\Vert}_{2}^{2}$. If ${x}^{(k)}$ is the solution of ${L}_{q}(x,\epsilon ,\lambda )$ for k = 0, 1, 2, …, N, then:

$${\Vert D{x}^{(k)}-D{x}^{(k+1)}\Vert}_{2}^{2}\le 2\lambda ({L}_{q}({x}^{(k)},{\epsilon}_{k},\lambda )-{L}_{q}({x}^{(k+1)},{\epsilon}_{k+1},\lambda )) \tag{17}$$

Furthermore,

$${\Vert {x}^{(k+1)}-{x}^{(k)}\Vert}_{2}^{2}\le {C}_{4}({L}_{q}({x}^{(k)},{\epsilon}_{k},\lambda )-{L}_{q}({x}^{(k+1)},{\epsilon}_{k+1},\lambda )) \tag{18}$$

where C_{4} is an independent positive constant.

We first compute the following formula:

$$\begin{array}{l}{L}_{q}({x}^{(k)},{\epsilon}_{k},\lambda )-{L}_{q}({x}^{(k+1)},{\epsilon}_{k+1},\lambda )\\ ={\displaystyle \sum _{j=1}^{N}{({\epsilon}_{k}^{2}+{\left|{x}_{j}^{(k)}\right|}^{2})}^{\frac{q}{2}}}-{\displaystyle \sum _{j=1}^{N}{({\epsilon}_{k+1}^{2}+{\left|{x}_{j}^{(k+1)}\right|}^{2})}^{\frac{q}{2}}}+\frac{1}{2\lambda}({\Vert D{x}^{(k)}-b\Vert}_{2}^{2}-{\Vert D{x}^{(k+1)}-b\Vert}_{2}^{2})\\ ={\displaystyle \sum _{j=1}^{N}{({\epsilon}_{k}^{2}+{\left|{x}_{j}^{(k)}\right|}^{2})}^{\frac{q}{2}}-{({\epsilon}_{k+1}^{2}+{\left|{x}_{j}^{(k+1)}\right|}^{2})}^{\frac{q}{2}}}+\frac{1}{2\lambda}{\Vert D{x}^{(k)}-D{x}^{(k+1)}\Vert}_{2}^{2}\\ +\frac{1}{\lambda}{(D{x}^{(k+1)}-b)}^{T}(D{x}^{(k)}-D{x}^{(k+1)})\end{array}$$

According to Equation (11), we have:

$$\frac{1}{\lambda}{(D{x}^{(k+1)}-b)}^{T}(D{x}^{(k)}-D{x}^{(k+1)})=-{\displaystyle \sum _{j=1}^{N}\frac{q{x}_{j}^{(k+1)}({x}_{j}^{(k)}-{x}_{j}^{(k+1)})}{{({\epsilon}_{k}^{2}+{\left|{x}_{j}^{(k)}\right|}^{2})}^{1-\frac{q}{2}}}} \tag{20}$$

Using Lemma 1, i.e., Equation (15), and substituting Equation (20) into Equation (19), we have:

$$\begin{array}{l}{L}_{q}({x}^{(k)},{\epsilon}_{k},\lambda )-{L}_{q}({x}^{(k+1)},{\epsilon}_{k+1},\lambda )\\ ={\displaystyle \sum _{j=1}^{N}\{{({\epsilon}_{k}^{2}+{\left|{x}_{j}^{(k)}\right|}^{2})}^{\frac{q}{2}}-{({\epsilon}_{k+1}^{2}+{\left|{x}_{j}^{(k+1)}\right|}^{2})}^{\frac{q}{2}}-}\frac{q{x}_{j}^{(k+1)}({x}_{j}^{(k)}-{x}_{j}^{(k+1)})}{{({\epsilon}_{k}^{2}+{\left|{x}_{j}^{(k)}\right|}^{2})}^{1-\frac{q}{2}}}\}+\frac{1}{2\lambda}{\Vert D{x}^{(k)}-D{x}^{(k+1)}\Vert}_{2}^{2}\\ \ge \frac{1}{2\lambda}{\Vert D{x}^{(k)}-D{x}^{(k+1)}\Vert}_{2}^{2}+{\displaystyle \sum _{j=1}^{N}{({x}_{j}^{(k)}-{x}_{j}^{(k+1)})}^{2}\frac{q}{2{({\epsilon}_{k}^{2}+{\left|{x}_{j}^{(k)}\right|}^{2})}^{1-\frac{q}{2}}}}\end{array} \tag{21}$$

From Equation (21), Equation (17) follows immediately. Equation (17) also shows that ${L}_{q}({x}^{(k)},{\epsilon}_{k},\lambda )$ is a monotonically decreasing sequence, hence:

$${\Vert {x}^{(k)}\Vert}_{q}^{q}\le {\Vert {x}^{(k)}\Vert}_{q,{\epsilon}_{k}}^{q}\le {L}_{q}({x}^{(k)},{\epsilon}_{k},\lambda )\le {L}_{q}({x}^{(0)},{\epsilon}_{0},\lambda )={\Vert {x}^{(0)}\Vert}_{q,{\epsilon}_{0}}^{q} \tag{22}$$

Consequently, for all k ≥ 1 and 1 ≤ i ≤ n, there exists a positive number $\beta $ which satisfies ${\Vert {x}^{(k)}\Vert}_{\infty}\le \beta $, hence:

$$\frac{q}{2{({\epsilon}_{k}^{2}+{\left|{x}_{j}^{(k)}\right|}^{2})}^{1-\frac{q}{2}}}\ge \frac{q}{2{({\epsilon}_{0}^{2}+{\beta}^{2})}^{1-\frac{q}{2}}} \tag{23}$$

Let $\frac{1}{{C}_{4}}=\frac{q}{2{({\epsilon}_{0}^{2}+{\beta}^{2})}^{1-\frac{q}{2}}}$, and thus Lemma 2 is proved conclusively. ☐

Herein, combining the above inequalities in Lemmas 1 and 2, Theorem 1 can be proved ultimately. For simplicity, the detailed proof process was derived and presented in the Appendix section. In the next section, the choice of regular operator q will be discussed in detail via a simulation experiment.

For the choice of q (0 < q ≤ 1), q is varied over the set {0.1, 0.5, 0.7, 1}. Firstly, the dictionary D was generated by the random function rand(64, 256), and the test signal has t non-zero narrow pulses whose amplitudes follow the standard Gaussian distribution (SGD); the locations of the non-zeros were generated randomly, and the number t varies over {8, 10, 12, …, 32}. The penalty parameter $\lambda ={10}^{-6}$ is small enough that $Dx=D{x}^{0}$ is satisfied. The R-WMNPLq algorithm is run for up to 1000 iterations and is stopped once the recovery error (RE) satisfies $RE={\Vert {x}^{r}-{x}^{0}\Vert}_{2}/{\Vert {x}^{0}\Vert}_{2}\le {10}^{-3}$, where ${x}^{r}$ stands for the recovered narrow-pulse signal. Figure 3a shows the random signal with 32 non-zero pulses and Figure 3b shows the recovery success rate (RSR) curves for the different values of the regular operator q. From Figure 3b, it can be seen that q = 0.1 and q = 0.5 performed better than q = 0.7 and much better than q = 1. Moreover, the RSR curve with q = 0.5 is slightly higher than that with q = 0.1. Therefore, in this paper, q = 0.5 was chosen as the optimal regular operator.
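The recovery criterion used in this experiment is simple to express in code. The sketch below implements the RE formula and a success-rate helper; the function names and the idea of averaging a success indicator over repeated random trials are assumptions for illustration, since the text does not state how many trials were run per value of t.

```python
import numpy as np


def recovery_error(x_rec, x_true):
    """Relative recovery error RE = ||x_rec - x_true||_2 / ||x_true||_2."""
    return np.linalg.norm(x_rec - x_true) / np.linalg.norm(x_true)


def recovery_success_rate(errors, threshold=1e-3):
    """Fraction of trials whose recovery error does not exceed the threshold."""
    errors = np.asarray(errors, dtype=float)
    return float(np.mean(errors <= threshold))
```

An RSR curve such as the one in Figure 3b would then be obtained by computing this rate for each sparsity level t and each value of q.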

The experimental setup of the roller bearing accelerated life test is shown in Figure 4. The bearing accelerated vibration signals were generated by an Intelligent Maintenance System (IMS) [39,40]. The sampling rate is 20 kHz. The vibration acceleration data analyzed here come from bearing 1, for which the accelerated life test was carried out continuously for 8 days (from 12 February 2004 10:32:39 to 19 February 2004 06:22:39, from normal to severe fault, i.e., 9840 min). Four Rexnord ZA-2115 bearings (pitch diameter 71.501 mm, roller diameter 8.4074 mm, 16 rollers and contact angle 15.17°) were monitored using acceleration sensors and thermocouples. From these parameters, the calculated outer-race fault characteristic frequency of bearing 1 is 236.4 Hz.
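The quoted outer-race frequency can be reproduced with the standard ball-pass frequency (outer race) formula. Note that the shaft speed is not stated in the text; the 2000 rpm used below is an assumption, chosen because it reproduces the 236.4 Hz figure with the bearing geometry given above.

```python
import math


def bpfo(n_rollers, f_r, d, D_p, alpha_deg):
    """Ball-pass frequency of the outer race in Hz.

    f_r is the shaft rotation frequency in Hz; d and D_p share the same length unit.
    """
    return 0.5 * n_rollers * f_r * (1.0 - d / D_p * math.cos(math.radians(alpha_deg)))


shaft_speed_rpm = 2000.0  # assumed value, not given in the text
f_o = bpfo(n_rollers=16, f_r=shaft_speed_rpm / 60.0, d=8.4074, D_p=71.501, alpha_deg=15.17)
```

With these inputs `f_o` evaluates to approximately 236.4 Hz, close to the 240 Hz peak reported in the envelope spectra.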

To avoid propagation of damage to the whole experimental platform and for safety reasons, the accelerated life test was stopped when the amplitude of the raw vibration signal surpassed 5 m/s^{2}. Figure 5a shows the raw vibration signal over the whole life-cycle of bearing 1. Figure 5b shows the kurtosis curve over the whole life-cycle of bearing 1 and indicates that normal operation occupies a long time in the whole life-cycle, whereas the period from the incipient fault stage to the severe stage is relatively short. As shown in Figure 5, there is an obvious transient feature at point 647 in the incipient fault stage. However, due to the interference of the harsh working environment and background noise, the engineer cannot be sure whether the fault occurred before point 647. Hence, to verify the effectiveness of the proposed method for bearing incipient fault diagnosis, the experimental data at point 535, which shows no obvious fluctuation in the whole life-cycle, was chosen.

Figure 6a–d show the original signal at point 535 (1024 sampling points were chosen, i.e., about 0.05 s), its time-frequency distribution obtained with the short-time Fourier transform (STFT), and the amplitude spectrum and Hilbert envelope spectrum of the original vibration signal, respectively. In Figure 6a, the periodical impulses are submerged in heavy noise and the fault type cannot be determined. In Figure 6d, although the spectrum peak at 240 Hz and its harmonic frequencies, which are consistent with the outer-race fault frequency, can be detected without de-noising, the peaks are masked by heavy background noise and the features are not evident enough to detect the fault.
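The Hilbert envelope spectrum used here can be sketched with NumPy alone: the analytic signal is built in the frequency domain, its magnitude gives the envelope, and the spectrum of the mean-removed envelope exposes the modulation (fault) frequency. The synthetic test signal below, a 3000 Hz carrier amplitude-modulated at 240 Hz, is a made-up example and not the measured bearing data.

```python
import numpy as np


def analytic_signal(x):
    """Analytic signal via the FFT (keep DC, double positive frequencies, drop negatives)."""
    n = len(x)
    spectrum = np.fft.fft(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * gain)


def envelope_spectrum(x, fs):
    """Return (frequencies in Hz, single-sided amplitude spectrum of the envelope)."""
    envelope = np.abs(analytic_signal(x))
    envelope = envelope - envelope.mean()      # remove DC so the fault line stands out
    amplitude = 2.0 * np.abs(np.fft.rfft(envelope)) / len(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return freqs, amplitude


fs = 20_000.0
t = np.arange(2000) / fs
x = (1.0 + 0.5 * np.cos(2 * np.pi * 240 * t)) * np.sin(2 * np.pi * 3000 * t)
freqs, amp = envelope_spectrum(x, fs)
peak_hz = freqs[np.argmax(amp)]
```

For a real bearing signal the same function is applied to the raw or reconstructed signal, and the peaks are compared with the theoretical fault frequency and its harmonics.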

Considering the complexity of bearing vibration signals with different frequencies and the repetition behavior of fault patterns, the variational mode decomposition (VMD) [41,42,43] method was used to preprocess the original signal.

Figure 7a,b show the amplitude spectrum of each IMF mode with K = 20 and K = 21, respectively. The amplitude spectrum structures are displayed clearly when the modal number is 20, whereas modal duplication starts to appear when the modal number reaches 21, which indicates that the modal number K = 20 is better than K = 21. Figure 8 shows the 20 intrinsic mode functions (IMFs) of the original vibration signal decomposed by the VMD method, which help to distinguish the periodic impulses from the mixed noisy signal. Here, the kurtosis of the 19th IMF is the maximum, which means that the transient impulse features may be contained in the 19th IMF according to the maximum kurtosis criterion [43].
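The maximum kurtosis criterion for choosing a mode can be sketched as below. The code assumes the modes are already available as rows of an array (the VMD step itself is not shown), uses the non-excess (Pearson) definition of kurtosis, and the two synthetic "modes" are invented purely to illustrate that an impulsive signal scores higher than a smooth one.

```python
import numpy as np


def kurtosis(x):
    """Pearson (non-excess) kurtosis: fourth central moment over squared variance."""
    centered = x - x.mean()
    return np.mean(centered**4) / np.mean(centered**2) ** 2


def select_mode(modes):
    """Return the index of the mode with the largest kurtosis, plus all scores."""
    scores = np.array([kurtosis(mode) for mode in modes])
    return int(np.argmax(scores)), scores


t = np.arange(1000) / 20_000.0
smooth = np.sin(2 * np.pi * 500 * t)   # kurtosis of a pure sine is 1.5
impulsive = np.zeros(1000)
impulsive[::100] = 1.0                 # sparse periodic spikes give a large kurtosis
best, scores = select_mode(np.vstack([smooth, impulsive]))
```

Applied to the 20 VMD modes, the returned index would identify the mode that is passed on to the sparse reconstruction step.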

Furthermore, the proposed R-WMNPLq (q = 0.5) algorithm was applied to the 19th IMF, and its related parameters are listed in Table 2. The reconstructed periodic impulse signal, its time-frequency distribution obtained with the short-time Fourier transform (STFT) and its Hilbert envelope spectrum are depicted in Figure 9a–c, respectively. It can be observed that the proposed R-WMNPLq (q = 0.5) algorithm combined with the impulse-step impact dictionary not only extracts the transient impulse components clearly but also removes the noise components from the reconstructed vibration signal; the signal-to-noise ratio (SNR) in Figure 9a is −27.7460 dB. Compared with the original vibration signal shown in Figure 6, and because the time-frequency distribution combines the information in the time and frequency domains, it can easily be seen from Figure 9a,b that there is no interfering noise among the extracted transient impulses. The time-frequency distribution of the transient impulses is clearer, which effectively reveals the fault feature in the incipient vibration signal. The Hilbert envelope spectrum is shown in Figure 9c. As can be seen, the characteristic frequency 240 Hz (close to the theoretical outer-race fault frequency of 236.4 Hz) and its harmonic frequencies (3f_{o}, 4f_{o} and 5f_{o}) are clearly detected; therefore, the proposed method is well suited to incipient bearing fault signals.

A considerable amount of literature has been published on the application of orthogonal matching pursuit (OMP) and L1-norm regularization algorithms in mechanical fault diagnosis [44,45,46]. To further validate the superiority of the proposed method, the OMP and L1-norm regularization techniques were each applied to the 19th IMF of the vibration signal. The number of running iterations was set to 50. Figure 10a,c,e and Figure 10b,d,f show the results of the OMP and L1-norm regularization methods, respectively. The signal-to-noise ratios (SNR) in Figure 10a,b are −29.1315 dB and −35.4638 dB, respectively. As shown in Figure 10a, strong noise remains in the reconstructed impulsive signal. Moreover, a comparison of Figure 6a with Figure 10a shows that the conventional L1-norm technique removes too much energy from the original vibration signal: it reduces the noise but also shrinks the fault feature frequency components. The Hilbert envelope spectra in Figure 10e,f confirm that neither the OMP nor the L1-norm regularization technique performs satisfactorily in incipient fault diagnosis of rolling bearings.
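As a reminder of how the OMP baseline operates, a minimal textbook-style sketch is given below. It is not the implementation used for Figure 10: the random Gaussian dictionary, the problem sizes, the sparse test vector and the function name `omp` are assumptions chosen only to make the example self-contained.

```python
import numpy as np


def omp(D, b, n_nonzero, tol=1e-10):
    """Greedy OMP: pick the atom most correlated with the residual,
    then refit all selected atoms by least squares."""
    n_atoms = D.shape[1]
    selected = np.zeros(n_atoms, dtype=bool)
    support = []
    coeffs = np.zeros(0)
    residual = b.copy()
    for _ in range(n_nonzero):
        correlations = np.abs(D.T @ residual)
        correlations[selected] = 0.0       # never reselect an atom
        k = int(np.argmax(correlations))
        selected[k] = True
        support.append(k)
        coeffs, *_ = np.linalg.lstsq(D[:, support], b, rcond=None)
        residual = b - D[:, support] @ coeffs
        if np.linalg.norm(residual) <= tol * np.linalg.norm(b):
            break
    x = np.zeros(n_atoms)
    x[np.array(support, dtype=int)] = coeffs
    return x


# Noiseless toy problem with a random dictionary (assumed, for illustration).
rng = np.random.default_rng(0)
D = rng.standard_normal((80, 120))
D = D / np.linalg.norm(D, axis=0)          # unit-norm atoms
x_true = np.zeros(120)
x_true[[7, 45, 101]] = [2.0, -2.5, 3.0]
b = D @ x_true

x_hat = omp(D, b, n_nonzero=3)
print(np.flatnonzero(x_hat))
```

Because each step commits to a single atom and treats all samples alike, the greedy selection is sensitive to heavy noise, which is consistent with the residual noise reported for Figure 10a.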

In addition to the sparse representation methods, the superiority and effectiveness of the proposed method require further validation against a traditional approach. Therefore, the same signal, namely the 19th IMF, was also processed by the spectral Kurtogram (SK) method [47,48]. The Kurtogram of the 19th IMF is displayed in Figure 11a, from which the optimal demodulation frequency band, namely 5333–10,000 Hz, can be identified. A band-pass filter was therefore designed to extract the potential features from the 19th IMF, and envelope analysis was applied to the filtered signal; the corresponding envelope spectrum is shown in Figure 11b. No explicit fault characteristic frequencies can be observed, and it is hard to identify the fault location from the incipient fault signal.
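The band selection step can be sketched in a strongly simplified form. The code below is not the fast Kurtogram of [48]: it assumes a single level of equal-width bands, ideal FFT masks instead of a filter bank, and a synthetic signal whose parameters (sampling rate, 100 Hz impact rate, 4375 Hz resonance, noise level) are invented for illustration. It only shows the idea of scoring each band by the spectral kurtosis of its complex envelope and demodulating the best one.

```python
import numpy as np


def band_analytic(x, fs, f_lo, f_hi):
    """Complex band-limited signal: keep only positive frequencies in
    [f_lo, f_hi) with an ideal FFT mask, doubled in amplitude."""
    spectrum = np.fft.fft(x)
    freqs = np.fft.fftfreq(len(x), d=1.0 / fs)
    mask = (freqs >= f_lo) & (freqs < f_hi)
    return np.fft.ifft(2.0 * spectrum * mask)


def spectral_kurtosis(c):
    """<|c|^4> / <|c|^2>^2 - 2, close to 0 for stationary Gaussian noise."""
    power = np.abs(c) ** 2
    return np.mean(power ** 2) / np.mean(power) ** 2 - 2.0


def select_band(x, fs, n_bands):
    """Score n_bands equal-width bands on [0, fs/2] and return the best one."""
    edges = np.linspace(0.0, fs / 2.0, n_bands + 1)
    scores = [spectral_kurtosis(band_analytic(x, fs, lo, hi))
              for lo, hi in zip(edges[:-1], edges[1:])]
    best = int(np.argmax(scores))
    return edges[best], edges[best + 1], scores


# Synthetic signal: impacts exciting a resonance, buried in white noise.
fs = 20000.0
t = np.arange(int(fs)) / fs
tau = np.mod(t, 1.0 / 100.0)               # impacts repeat at 100 Hz
rng = np.random.default_rng(0)
x = np.exp(-1500.0 * tau) * np.sin(2 * np.pi * 4375.0 * tau)
x = x + 0.2 * rng.standard_normal(t.size)

f_lo, f_hi, scores = select_band(x, fs, n_bands=8)
envelope = np.abs(band_analytic(x, fs, f_lo, f_hi))  # ready for a spectrum
print(f_lo, f_hi)
```

When the impulses are strong enough relative to the in-band noise, the selected band brackets the resonance. For a very weak incipient fault no band may stand out, which is one way to read the inconclusive result in Figure 11b.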

This work originated from a study on sparse representation approaches and incipient fault diagnosis of rolling bearings. Although many works have successfully applied sparse representation methods such as greedy pursuit, orthogonal matching pursuit (OMP) and L1-norm regularization to fault diagnosis of rotating machinery, these approaches are not satisfactory for reconstructing periodic transient impulses and identifying their physical structure information, especially when the fault is at an incipient stage.

This paper proposes a novel feature extraction method for incipient bearing fault diagnosis that combines the re-weighted minimizing nonconvex penalty Lq (R-WMNPLq, q = 0.5) regular technique with an impulse-step impact dictionary. The proposed method offers a new perspective on the reconstruction of periodic transient impulses by introducing a penalty parameter, a smoothing parameter and a regular operator into the sparse representation model to guarantee a reasonable distribution consistency of the peak vibration values. In addition, the original physical structure information is captured by the impulse-step impact dictionary atoms. The effectiveness of the transient impulse extraction is verified on an accelerated life test. The experimental analysis shows that the proposed method performs well in reducing noise and extracting fault characteristics from raw vibration signals in comparison with orthogonal matching pursuit (OMP), L1-norm regularization and the spectral Kurtogram (SK) method, especially for vibration signals with heavy background noise, and it is well suited to on-line practical applications.

However, the proposed methodology is only applicable to accelerated life tests of rolling bearings under constant operating conditions. Variable conditions such as variable speed, variable torque and harsh working environments should be considered in the future, which may help to generalize the proposed method. Moreover, the proposed sparse representation method could also be extended to the detection of multiple concurrent faults in our future work.

This research was funded by the Fundamental Research Funds for the Central Universities (Grant No. CUSF-DH-D-2017059) and the National Natural Science Foundation of China (Grant No. 51675096). The authors wish to express their sincere gratitude for this support.

Conceptualization, formal analysis, investigation and writing of the original draft were done by Qing Li. Validation, review and editing were done by Steven Y. Liang. All authors have read and approved the final manuscript.

The authors declare no conflict of interest.

[35,36]**.** It should be noted from Equation (18) that $\left\{{x}^{(k)}\right\}$ is a monotonically decreasing sequence because ${\Vert {x}^{(k+1)}-{x}^{(k)}\Vert}_{2}^{2}\to 0$. Thus we may assume that $\left\{{x}^{(k)}\right\}$ converges to ${x}^{{\epsilon}_{*},\lambda}$. Letting $i\to \infty $, Equation (11) can be rewritten as:

$${\left(\frac{q{x}^{{\epsilon}_{*},\lambda}[i]}{{\left({\epsilon}_{*}^{2}+{\Vert {x}^{{\epsilon}_{*},\lambda}[i]\Vert}_{2}^{2}\right)}^{1-q/2}}\right)}_{1\le i\le M}+\frac{1}{\lambda}{D}^{T}(D{x}^{{\epsilon}_{*},\lambda}-b)=0$$

Namely, ${x}^{{\epsilon}_{*},\lambda}$ is the critical point of ${L}_{q}(x,\epsilon ,\lambda )$ with $\epsilon ={\epsilon}_{*}>0$. According to Lemma 2, we have the following:

$$\begin{array}{l}{L}_{q}({x}^{{\epsilon}_{*},\lambda},{\epsilon}_{*},\lambda )\le {L}_{q}({x}^{({k}_{j})},{\epsilon}_{{k}_{j}},\lambda )\le {L}_{q}({x}^{(0)},{\epsilon}_{0},\lambda )\\ \le {\displaystyle \sum _{i=1}^{m}{\Vert {x}^{(0)}[i]\Vert}_{2}^{2}}+m{\epsilon}_{0}\le {\Vert {x}^{(0)}\Vert}_{2}^{2}+m\end{array}$$

which means:

$${\Vert D{x}^{{\epsilon}_{*},\lambda}-b\Vert}_{2}^{2}\le \sqrt{2\lambda {L}_{q}({x}^{{\epsilon}_{*},\lambda},{\epsilon}_{*},\lambda )}\le \sqrt{2\lambda ({\Vert {x}^{(0)}\Vert}_{2}^{2}+m)}$$

Let T be the index set of the nonzero entries of ${x}^{0}$ and T* be the index set of the s largest entries in L2-norm of ${x}^{{\epsilon}_{*},\lambda}$. Since ${\Vert {x}^{0}\Vert}_{0}\le s$, we get:

$$\begin{array}{l}{\Vert {x}^{{\epsilon}_{*},\lambda}-{x}^{0}\Vert}_{2}^{2}\le {m}^{\frac{1}{2}}{\Vert {x}^{{\epsilon}_{*},\lambda}-{x}^{0}\Vert}_{2}^{2}\\ \le {m}^{\frac{1}{2}}{\Vert {({x}^{{\epsilon}_{*},\lambda}-{x}^{0})}_{T\cup {T}^{*}}\Vert}_{2}^{2}+{m}^{\frac{1}{2}}{\Vert {({x}^{{\epsilon}_{*},\lambda}-{x}^{0})}_{{(T\cup {T}^{*})}^{c}}\Vert}_{2}^{2}\\ \le {m}^{\frac{1}{2}}\frac{1}{\sqrt{1-{\delta}_{2,s}}}{\Vert {(D{x}^{{\epsilon}_{*},\lambda}-{x}^{(0)})}_{T\cup {T}^{*}}\Vert}_{2}^{2}+{m}^{\frac{1}{2}}{\Vert {({x}^{{\epsilon}_{*},\lambda}-{x}^{0})}_{{(T\cup {T}^{*})}^{c}}\Vert}_{2}^{2}\\ \le {m}^{\frac{1}{2}}\frac{1}{\sqrt{1-{\delta}_{2,s}}}{\Vert (D{x}^{{\epsilon}_{*},\lambda}-D{x}^{(0)})\Vert}_{2}^{2}+{m}^{\frac{1}{2}}(\frac{1}{\sqrt{1-{\delta}_{2,s}}}{\Vert D\Vert}_{2}+1){\Vert {({x}^{{\epsilon}_{*},\lambda})}_{{(T\cup {T}^{*})}^{c}}\Vert}_{2}^{2}\\ \le {m}^{\frac{1}{2}}\frac{1}{\sqrt{1-{\delta}_{2,s}}}\sqrt{2\lambda ({\Vert {x}^{(0)}\Vert}_{2}^{2}+m)}+{m}^{\frac{1}{2}}(\frac{1}{\sqrt{1-{\delta}_{2,s}}}{\Vert D\Vert}_{2}+1){\Vert {({x}^{{\epsilon}_{*},\lambda})}_{{({T}^{*})}^{c}}\Vert}_{2}^{2}\\ \le {m}^{\frac{1}{2}}\frac{1}{\sqrt{1-{\delta}_{2,s}}}\sqrt{2\lambda ({\Vert {x}^{(0)}\Vert}_{2}^{2}+m)}+{m}^{\frac{1}{2}}(\frac{1}{\sqrt{1-{\delta}_{2,s}}}{\Vert D\Vert}_{2}+1){\delta}_{s}{({x}^{{\epsilon}_{*},\lambda})}_{2}\end{array}$$

Let C_{1} and C_{2} be defined as follows:

$$\begin{array}{l}{C}_{1}={m}^{\frac{1}{2}}\frac{1}{\sqrt{1-{\delta}_{2,s}}}\sqrt{2\lambda ({\Vert {x}^{(0)}\Vert}_{2}^{2}+m)}\\ {C}_{2}={m}^{\frac{1}{2}}(\frac{1}{\sqrt{1-{\delta}_{2,s}}}{\Vert D\Vert}_{2}+1)\end{array}$$

This completes the proof of Equation (13) in Theorem 1.

Further, if ${\epsilon}_{*}=0$, then either ${\epsilon}_{{k}_{0}}=0$ for some index k_{0}, so that ${x}^{({k}_{0})}$ is an s-sparse signal, or there is a subsequence $\left\{{x}^{({n}_{k})}\right\}$ satisfying ${\epsilon}_{{n}_{k}}=\alpha \cdot r{\left({x}^{({n}_{k})}\right)}_{s+1}>0$. In the former case, ${x}^{({k}_{0})}$ is an s-sparse signal, and we get ${x}^{0,\lambda}={x}^{(k)}$. In the latter case, since $\left\{{x}^{({n}_{k})}\right\}$ is bounded with limit point ${x}^{0,\lambda}$, we may assume without loss of generality that the convergent subsequence of $\left\{{x}^{({n}_{k})}\right\}$ is $\left\{{x}^{({n}_{k})}\right\}$ itself, so that ${x}^{0,\lambda}=\underset{k\to \infty}{\mathrm{lim}}{x}^{({n}_{k})}$ and then $\underset{k\to \infty}{\mathrm{lim}}r{({x}^{({n}_{k})})}_{s+1}=\underset{k\to \infty}{\mathrm{lim}}\frac{{\epsilon}_{{n}_{k}}}{\alpha}=0$; that is, the limit of the subsequence ${x}^{({n}_{k})}$ is an s-sparse signal. Therefore, in both cases, and without loss of generality, we take the s-sparse signal to be ${x}^{(k)}$ with $\underset{k\to \infty}{\mathrm{lim}}{x}^{(k)}={x}^{*}$. Using the RIP of D, we have:

$$\begin{array}{l}{\Vert {x}^{*}-{x}^{0}\Vert}_{2}^{2}\le {m}^{\frac{1}{2}}{\Vert {x}^{*}-{x}^{0}\Vert}_{2}^{2}\\ \le {m}^{\frac{1}{2}}\frac{1}{\sqrt{1-{\delta}_{2s}}}{\Vert D{x}^{*}-D{x}^{0}\Vert}_{2}^{2}\\ ={m}^{\frac{1}{2}}\frac{1}{\sqrt{1-{\delta}_{2s}}}{\Vert D{x}^{*}-b\Vert}_{2}^{2}\\ \le {m}^{\frac{1}{2}}\frac{1}{\sqrt{1-{\delta}_{2s}}}\underset{k\to \infty}{\mathrm{lim}}{(2\lambda {L}_{q}({x}^{(k)},{\epsilon}_{k},\lambda ))}^{\frac{1}{2}}\\ \le {m}^{\frac{1}{2}}\frac{1}{\sqrt{1-{\delta}_{2s}}}{(2\lambda {L}_{q}({x}^{(k)},{\epsilon}_{k},\lambda ))}^{\frac{1}{2}}\\ \le {m}^{\frac{1}{2}}\frac{1}{\sqrt{1-{\delta}_{2s}}}\sqrt{2\lambda {\Vert {x}^{(0)}\Vert}_{2}^{2}}\end{array}$$

Let C_{3} be defined as follows:

$${C}_{3}={m}^{\frac{1}{2}}\frac{1}{\sqrt{1-{\delta}_{2s}}}\sqrt{2{\Vert {x}^{(0)}\Vert}_{2}^{2}}$$

This completes the proof of Equation (14), and hence of Theorem 1. ☐
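The critical-point condition used in the proof, $q{w}_{i}{x}_{i}+\frac{1}{\lambda}{[{D}^{T}(Dx-b)]}_{i}=0$ with ${w}_{i}={({\epsilon}^{2}+{x}_{i}^{2})}^{q/2-1}$, suggests a simple fixed-point scheme: freeze the weights at the current iterate, solve the linear system $(\lambda qW+{D}^{T}D)x={D}^{T}b$, and then shrink $\epsilon$ using the (s+1)-th largest magnitude. The sketch below illustrates this idea under several assumptions that are not taken from the paper: scalar entries (block size 1), ${\epsilon}_{0}=1$, $\alpha =0.9$, a small numerical floor on $\epsilon$, a stopping rule based on successive iterates, and a random Gaussian dictionary in place of the impulse-step impact dictionary. The function name `irls_lq` is hypothetical.

```python
import numpy as np


def irls_lq(D, b, s, q=0.5, lam=1e-6, eps0=1.0, alpha=0.9,
            max_iter=1000, tol=1e-6, eps_floor=1e-12):
    """Iteratively re-weighted least squares for the smoothed functional
        sum_i (eps^2 + x_i^2)^(q/2) + ||D x - b||_2^2 / (2 * lam).
    Each pass freezes the weights at the current iterate, solves the
    resulting linear system, then shrinks the smoothing parameter eps."""
    n = D.shape[1]
    gram = D.T @ D
    rhs = D.T @ b
    x = np.zeros(n)
    eps = eps0
    for _ in range(max_iter):
        weights = (eps ** 2 + x ** 2) ** (q / 2.0 - 1.0)
        x_new = np.linalg.solve(lam * q * np.diag(weights) + gram, rhs)
        # The (s+1)-th largest magnitude drives eps down; the floor is a
        # numerical safeguard added for this sketch.
        r_next = np.sort(np.abs(x_new))[::-1][s]
        eps = max(min(eps, alpha * r_next), eps_floor)
        change = np.linalg.norm(x_new - x) / max(np.linalg.norm(x_new), 1e-30)
        x = x_new
        if change <= tol:
            break
    return x


# Noiseless toy problem with a random dictionary (assumed, for illustration).
rng = np.random.default_rng(1)
D = rng.standard_normal((80, 120)) / np.sqrt(80.0)
x_true = np.zeros(120)
x_true[[7, 45, 101]] = [2.0, -2.5, 3.0]
b = D @ x_true

x_hat = irls_lq(D, b, s=3)
rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
print(rel_err)
```

Entries with large magnitude receive small weights and are therefore penalized less in the next pass, which reflects the re-weighting idea of preserving peak values described in the main text.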

- Chen, J.L.; Li, Z.P.; Pan, J.; Chen, G.; Zi, Y.Y.; Yuan, J.; Chen, B.Q.; He, Z.J. Wavelet transform based on inner product in fault diagnosis of rotating machinery: A review. Mech. Syst. Signal Process. **2016**, 70–71, 1–35.
- Lee, J.; Wu, F.J.; Zhao, W.Y.; Ghaffari, M.; Liao, L.X.; Siegel, D. Prognostics and health management design for rotary machinery systems-reviews, methodology and applications. Mech. Syst. Signal Process. **2014**, 42, 314–334.
- Li, Q.; Liang, S.Y.; Yang, J.G.; Li, B.Z. Long range dependence prognostics for bearing vibration intensity chaotic time series. Entropy **2016**, 18, 23.
- Mishra, C.; Samantaray, A.K.; Chakraborty, G. Rolling element bearing fault diagnosis under slow speed operation using wavelet de-noising. Measurement **2017**, 103, 77–86.
- Li, Q.; Liang, S.Y.; Song, W.Q. Revision of bearing fault characteristic spectrum using LMD and interpolation correction algorithm. Procedia CIRP **2016**, 56, 182–187.
- McDonald, G.L.; Zhao, Q. Multipoint optimal minimum entropy deconvolution and convolution fix: Application to vibration fault detection. Mech. Syst. Signal Process. **2017**, 82, 461–477.
- Sawalhi, N.; Randall, R.B.; Endo, H. The enhancement of fault detection and diagnosis in rolling element bearings using minimum entropy deconvolution combined with spectral kurtosis. Mech. Syst. Signal Process. **2007**, 21, 2616–2633.
- Jaouher, B.A.; Nader, F.; Lotfi, S.; Brigitte, C.M.; Farhat, F. Application of empirical mode decomposition and artificial neural network for automatic bearing fault diagnosis based on vibration signals. Appl. Acoust. **2015**, 89, 16–27.
- Mendonça, L.F.; Sousa, J.M.C.; Sá da Costa, J.M.G. An architecture for fault detection and isolation based on fuzzy methods. Expert Syst. Appl. **2009**, 36, 1092–1104.
- Sena, P.; Attianese, P.; Pappalardo, M.; Villecco, F. FIDELITY: Fuzzy inferential diagnostic engine for on-line support to physicians. In 4th International Conference on Biomedical Engineering in Vietnam; Springer: Berlin/Heidelberg, Germany, 2013; Volume 49, pp. 396–400.
- Sun, W.; Yang, G.A.; Chen, Q.; Palazoglu, A.; Feng, K. Fault diagnosis of rolling bearing based on wavelet transform and envelope spectrum correlation. J. Vib. Control **2012**, 19, 924–941.
- Han, M.H.; Pan, J.L. A fault diagnosis method combined with LMD, sample entropy and energy ratio for roller bearings. Measurement **2015**, 76, 7–19.
- Gao, Y.D.; Villecco, F.; Li, M.; Song, W.Q. Multi-scale permutation entropy based on improved LMD and HMM for rolling bearing diagnosis. Entropy **2017**, 19, 176.
- Yi, C.; Lv, Y.; Ge, M.; Xiao, H.; Yu, X. Tensor singular spectrum decomposition algorithm based on permutation entropy for rolling bearing fault diagnosis. Entropy **2017**, 19, 139.
- Yang, D.M. The envelop-based bispectral analysis for motor bearing defect detection. In Proceedings of the IEEE International Conference on Applied System Innovation (ICASI), Sapporo, Japan, 13–17 May 2017; pp. 1646–1649.
- Yunusa-Kaltungo, A.; Sinha, J.K.; Elbhbah, K. An improved data fusion technique for faults diagnosis in rotating machines. Measurement **2014**, 58, 27–32.
- Tian, X.; Gu, J.X.; Rehab, I.; Abdalla, G.M.; Gu, F.; Ball, A.D. A robust detector for rolling element bearing condition monitoring based on the modulation signal bispectrum and its performance evaluation against the Kurtogram. Mech. Syst. Signal Process. **2018**, 100, 167–187.
- Li, Y.; Liang, X.; Zuo, M.J. Diagonal slice spectrum assisted optimal scale morphological filter for rolling element bearing fault diagnosis. Mech. Syst. Signal Process. **2017**, 85, 146–161.
- Wang, S.; Huang, W.; Zhu, Z.K. Transient modeling and parameter identification based on wavelet and correlation filtering for rotating machine fault diagnosis. Mech. Syst. Signal Process. **2011**, 25, 1299–1320.
- Fan, W.; Cai, G.G.; Zhu, Z.K.; Shen, C.Q.; Huang, W.G.; Shang, L. Sparse representation of transients in wavelet basis and its application in gearbox fault feature extraction. Mech. Syst. Signal Process. **2015**, 56–57, 230–245.
- Du, Z.H.; Chen, X.F.; Zhang, H.; Yang, B.Y.; Zhai, Z.; Yan, R.Q. Weighted low-rank sparse model via nuclear norm minimization for bearing fault detection. J. Sound Vib. **2017**, 400, 270–287.
- Cui, L.L.; Kang, C.H.; Wang, H.Q.; Chen, P. Application of composite dictionary multi-atom matching in gear fault diagnosis. Sensors **2011**, 11, 5981–6002.
- Cui, L.L.; Wang, J.; Lee, S. Matching pursuit of an adaptive impulse dictionary for bearing fault diagnosis. J. Sound Vib. **2014**, 333, 2840–2862.
- He, G.L.; Ding, K.; Lin, H.B. Fault feature extraction of rolling element bearings using sparse representation. J. Sound Vib. **2016**, 366, 514–527.
- Zhang, H.; Chen, X.F.; Du, Z.H.; Yan, R.Q. Kurtosis based weighted sparse model with convex optimization technique for bearing fault diagnosis. Mech. Syst. Signal Process. **2016**, 80, 349–376.
- He, Q.B.; Ding, X.X. Sparse representation based on local time–frequency template matching for bearing transient fault feature extraction. J. Sound Vib. **2016**, 370, 424–443.
- Sawalhi, N.; Randall, R.B. Vibration response of spalled rolling element bearing: Observations, simulations and signal processing techniques to track the spall size. Mech. Syst. Signal Process. **2011**, 25, 846–870.
- Bearing Data Center. Available online: http://csegroups.case.edu/bearingdatacenter/pages/welcome-case-western-reserve-university-bearing-data-center-website (accessed on 20 July 2017).
- Donoho, D.L. Compressed Sensing. IEEE Trans. Inf. Theory **2006**, 52, 1289–1306.
- Candes, E.J.; Romberg, J.; Tao, T. Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inf. Theory **2006**, 52, 489–509.
- Candes, E.J.; Tao, T. Near-optimal signal recovery from random projections: Universal encoding strategies? IEEE Trans. Inf. Theory **2006**, 52, 5406–5425.
- Candes, E.J.; Romberg, J. Sparsity and incoherence in compressive sampling. Inverse Probl. **2007**, 23, 969–985.
- Candes, E.J. The restricted isometry property and its implications for compressed sensing. C. R. Math. **2008**, 346, 589–592.
- Natarajan, B.K. Sparse approximate solutions to linear systems. SIAM J. Comput. **1995**, 24, 227–234.
- Lai, M.J.; Wang, J.Y. An unconstrained l_{q} minimization with 0 < q ≤ 1 for sparse solution of underdetermined linear systems. SIAM J. Optim. **2011**, 21, 82–101.
- Lai, M.J.; Xu, Y.Y.; Yin, W.T. Improved iteratively reweighted least squares for unconstrained smoothed l_{q} minimization. SIAM J. Numer. Anal. **2013**, 51, 927–957.
- Wang, Y.; Wang, J.; Xu, Z. On recovery of block-sparse signals via mixed l_{2}/l_{q} (0 < q ≤ 1) norm minimization. EURASIP J. Adv. Signal Process. **2013**, 2013, 76.
- Hardy, G.; Littlewood, J.; Polya, G. Inequalities; Cambridge University Press: Cambridge, UK, 1952.
- Center for Intelligent Maintenance Systems. Available online: http://www.imscenter.net/ (accessed on 3 April 2017).
- Qiu, H.; Lee, J.; Lin, J.; Yu, G. Wavelet filter-based weak signature detection method and its application on roller bearing prognostics. J. Sound Vib. **2006**, 289, 1066–1090.
- Konstantin, D.; Zosso, D. Variational mode decomposition. IEEE Trans. Signal Process. **2014**, 62, 531–544.
- Zhang, M.; Jiang, Z.N.; Feng, K. Research on variational mode decomposition in rolling bearings fault diagnosis of the multistage centrifugal pump. Mech. Syst. Signal Process. **2017**, 93, 460–493.
- Li, Q.; Ji, X.; Liang, S.Y. Incipient fault feature extraction for rotating machinery based on improved AR-minimum entropy deconvolution combined with variational mode decomposition approach. Entropy **2017**, 19, 317.
- Singh, V.K.; Rai, A.K.; Kumar, M. Sparse data recovery using optimized orthogonal matching pursuit for WSNs. Procedia Comput. Sci. **2017**, 109, 210–216.
- Zhao, J.X.; Song, R.F.; Zhao, J.; Zhu, W.P. New conditions for uniformly recovering sparse signals via orthogonal matching pursuit. Signal Process. **2015**, 106, 106–113.
- Lin, J.H.; Li, S. Nonuniform support recovery from noisy random measurements by orthogonal matching pursuit. J. Approx. Theory **2013**, 165, 20–40.
- Antoni, J. The spectral kurtosis: A useful tool for characterising non-stationary signals. Mech. Syst. Signal Process. **2006**, 20, 282–307.
- Antoni, J. Fast computation of the kurtogram for the detection of transient faults. Mech. Syst. Signal Process. **2007**, 21, 108–124.

Parameter | Outside Diameter (mm) | Pitch Diameter (mm) | Contact Angle | Element Number | Pitting Defect Size |
---|---|---|---|---|---|
Value | d_{0} = 7.95 | D_{p} = 45.14 | α = 0° | 14 | l_{0} = 1.28 |

Regular Operator q | Smoothing Parameter | Penalty Parameter | Maximum Number of Iterations | Stopping Threshold |
---|---|---|---|---|
0.5 | ε_{0} = 0 | $\lambda ={10}^{-6}$ | 1000 | ${\Vert {x}^{k}-{x}^{0}\Vert}_{2}/{\Vert {x}^{0}\Vert}_{2}\le {10}^{-3}$ |

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).