Weak Fault Detection for Gearboxes Using Majorization–Minimization and Asymmetric Convex Penalty Regularization

Extracting weak fault features under heavy background noise and nonstationary conditions is a primary challenge in gearbox fault diagnosis. For this purpose, a novel weak fault detection approach based on majorization–minimization (MM) and asymmetric convex penalty regularization (ACPR) is proposed in this paper. The proposed objective cost function (OCF) consists of a signal-fidelity term and two parameterized penalty terms (i.e., an asymmetric nonconvex penalty regularization term and a symmetric nonconvex penalty regularization term). To begin with, the asymmetric and symmetric penalty functions are established on the basis of an L1-norm model; then, according to the splitting idea, the majorizer of the symmetric function and the majorizer of the asymmetric function are calculated separately via the MM algorithm. Finally, MM is applied again to solve the proposed OCF. The effectiveness and reliability of the proposed method are verified using simulated data and experimental gearbox data. A comparison with state-of-the-art methods, including nonconvex penalty regularization (NCPR) and L1-norm fused lasso optimization (LFLO), is also presented. The results indicate that the gear chipping characteristic frequency of 13.22 Hz and its harmonics (2f, 3f, 4f, and 5f) can be identified clearly, which highlights the superiority of the proposed approach.


Introduction
Gearboxes are widely used in the transmission systems of rotating machinery because of advantages such as a high transmission ratio and large load capacity, and they play an important role in modern industrial applications. Unexpected bearing failure caused by a harsh working environment and intricate running conditions may result in catastrophic incidents, significant economic losses, and even human casualties. Thus, timely and precise condition-based monitoring (CBM) and health assessment of rotating machines, performed online or offline, are of great practical significance for avoiding mechanical breakdown, especially in the early stages of failure [1][2][3].
The vibration signal generated by localized faults is a typical non-linear and non-stationary signal, and fault feature extraction based on vibration signals has become the mainstream technique in the fault diagnosis field. However, the observed vibration signals are usually corrupted by severe background noise and interference components. Over the past decades, numerous diagnostic techniques have been developed to detect faults in rotating machines.
Symmetry 2018, 10, 243 2 of 23
For example, some methods are based on signal transforms, such as the wavelet/wavelet packet transform [4][5][6], the short-time Fourier transform [7,8], or the tunable Q-factor wavelet transform (TQWT) [9,10]. Some methods are based on adaptive signal decomposition, such as empirical mode decomposition (EMD) [11,12], local mean decomposition (LMD) [13,14], and variational mode decomposition (VMD) [15,16]. In addition, some methods rely on intelligent supervised learning techniques, such as artificial neural networks (ANN) [17,18] and deep learning (DL) [19,20], to mention just a few. These methods have been applied successfully, to some extent, in the fault diagnosis of rotating machinery.
In contrast to the above conventional fault feature extraction approaches, sparse coding techniques (SCT), or sparse representation (SR), were recently introduced as a new approach to fault feature extraction and condition monitoring [21,22]. With the SR approach, the observed vibration signals can be represented as a linear combination of a few sparse bases (e.g., a wavelet basis) or sparse atoms generated by an over-complete dictionary (OCD), so the instantaneous fault impulses can be represented by a few sparse coefficients. Zhou et al. [23] proposed shift-invariant dictionary learning (SIDL) and a hidden Markov model (HMM) to address bearing fault diagnosis and identify fault types. Further, Yang et al. [24] and Feng et al. [25] proposed a shift-invariant K-singular value decomposition (SI-K-SVD) method combined with a sparse representation algorithm to extract the fault impulses of a wind turbine generator bearing and a planetary gearbox, respectively. Qin [26] developed a model-based impulsive wavelet basis and Fourier basis for the weak repetitive transient feature extraction of a rolling bearing. The adaptive impulse dictionary, double-dictionary matching pursuit (DDMP), and step-impulse dictionary were introduced for detecting the physical defect size of the inner and outer race of a rolling bearing in [27][28][29]. Ding et al. [30] employed the time-frequency (TF) dictionary and orthogonal matching pursuit (OMP) to extract the fault feature information of single-row cylindrical roller bearings based on acoustic signals. Li et al. [31] introduced a novel dictionary learning method called the impulse-step impact dictionary, based on a nonconvex optimization approach; the results of an accelerated lifetime testing experiment demonstrate the method's superiority in weak fault feature extraction compared with OMP, L1-norm convex penalty regularization, and the spectral kurtogram (SK) method.
Although the above SR and its extension, combined with the redundant dictionary, have achieved satisfactory applications in fault diagnosis, those algorithms have some common drawbacks: (1) Unique dictionaries' atoms and optimal wavelet basis cannot simultaneously match the natural structure of every real vibration signal well; (2) A large number of observed signals should be collected to form a training dictionary before diagnosis, which is always infeasible in practical applications; (3) Computational complexity and time-consuming problems occur simultaneously in dictionary training, such as with the K-SVD training and SI-K-SVD dictionaries training [31].
In an attempt to overcome the above shortcomings of an over-complete dictionary (OCD) and to estimate the fault characteristics more accurately, the family of low-rank matrix approximation (LRMA) methods was introduced in recent years; the reconstruction problem of dictionary atoms is thus translated into an optimization or inverse regularization problem. In [32], Ding et al. propose an algorithm that combines the alternating direction method of multipliers (ADMM) and the majorization-minimization (MM) method to compute a low-rank matrix optimization model. In [33] and [34], He et al. develop nonconvex penalty regularization and a sparsity-based group-sparse signal denoising (OGS) approach to extract the single and compound features of outer and inner race defects. In [35], Zhang et al. proposed a novel convex penalty regularization method called kurtosis-based weighted sparse decomposition (Kur-WSD), on the basis of the L1-norm method, to detect the inner fault of an alternating current motor. In [36] Unfortunately, the above LRMA approaches still suffer from the following challenges, which have restricted their application: (1) Generally, the penalty functions established in an LRMA model are symmetric functions, e.g., the absolute value function (AVF); the common drawback is that this penalty function is non-differentiable at zero, which can lead to numerical issues such as convergence to a local optimum and early termination of the algorithm. (2) In a conventional LRMA method, a convex regularizer such as the L1-norm usually underestimates the sparse signal when the AVF is used as the sparsity regularizer, whereas a nonconvex regularizer suffers from several issues, such as the strict convexity problem of the objective cost function (OCF) and non-convergence. Additionally, both the convex and nonconvex regularizers shrink all the coefficients equally and remove too much energy from the useful signal, which makes estimating the fault signal more challenging. (3) When the useful fault characteristic signals are very weak but the additive noise is extremely strong, the conventional LRMA method cannot estimate the low-dimensional feature distribution accurately.
To address the problem of estimating a weak fault signal from its noisy observations, and inspired by the LRMA method, a novel detection approach based on asymmetric convex penalty regularization (ACPR) and majorization-minimization (MM) is proposed in this paper, using the weak chipping fault of a gearbox as the research object. The proposed ACPR method neither relies on the reconstruction of dictionary atoms nor requires a large number of samples to form a training dictionary; meanwhile, numerical issues such as non-differentiability at zero are addressed by introducing the asymmetric convex penalty function. In the solution process, MM is used to solve a relatively complicated sparsity-based optimization problem. Finally, the proposed method is validated on a simulated weak fault signal and on a gearbox dataset collected in a lab experiment; the extracted fault frequency and its harmonics indicate that the ACPR method is superior to other LRMA methods, such as nonconvex penalty regularization (NCPR) and L1-norm fused lasso optimization (LFLO).
The rest of this paper is organized as follows. Section 2 introduces the MM algorithm. Section 3 introduces the algorithm and theoretical derivation of asymmetric convex penalty regularization (ACPR). The simulation evaluation of the proposed method is presented in Section 4. In Section 5, the practical diagnosis results for the gearbox, obtained with the proposed approach and with other LRMA methods, are presented and discussed. Conclusions are drawn in Section 6.

Majorization-Minimization Algorithm
Before the derivation of the proposed algorithm, a review of the majorization-minimization (MM) algorithm is given. Usually, the MM algorithm is used to reduce computational complexity; in other words, for a minimization problem that is too complex to solve directly, the MM method replaces it with a series of simpler problems by constructing majorization functions [37][38][39].
Suppose that F(u) is a cost function that is too complex to minimize directly. The MM algorithm constructs a simpler function $G(u, u^{(k)})$, which is called the majorization function (majorizer) of F(u) at the k-th iteration point $u^{(k)}$. More specifically, consider the minimization problem

$$u_{\mathrm{opt}} = \arg\min_{u} F(u). \tag{1}$$

By using the MM method, the above problem can be solved by the following iteration:

$$u^{(k+1)} = \arg\min_{u} G(u, u^{(k)}), \tag{2}$$

where k is the iteration number and $G(u, u^{(k)})$ is a continuously differentiable function satisfying

$$G(u, u^{(k)}) \ge F(u) \ \text{for all } u, \qquad G(u^{(k)}, u^{(k)}) = F(u^{(k)}). \tag{3}$$

Note that the majorizer G(u, v) touches F(u) at u = v. According to Equations (2) and (3), we have

$$F(u^{(k+1)}) \le G(u^{(k+1)}, u^{(k)}) \le G(u^{(k)}, u^{(k)}) = F(u^{(k)}). \tag{4}$$

This chain of inequalities shows that each MM iteration does not increase the cost function. The MM algorithm consists of two processes: the construction of the majorization function and the optimization of the majorization function. The main steps of the MM algorithm can be summarized as follows:
(1) Initialize $u^{(0)}$ and set k = 0;
(2) Construct a majorization function $G(u, u^{(k)})$;
(3) Compute $u^{(k+1)} = \arg\min_{u} G(u, u^{(k)})$;
(4) If the stopping criterion is satisfied, go to step (5); otherwise, set k = k + 1 and go to step (2);
(5) Output $u_{\mathrm{opt}} = u^{(k+1)}$.
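The five steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the paper's implementation: the toy cost $F(u) = \tfrac{1}{2}(y-u)^2 + \lambda|u|$, the values y = 3 and λ = 1, and the helper name `mm_minimize` are assumptions chosen for the example. The term |u| is majorized at $u_k$ by the quadratic $u^2/(2|u_k|) + |u_k|/2$, so the majorizer has the closed-form minimizer $u = y|u_k|/(|u_k| + \lambda)$.

```python
def mm_minimize(F, argmin_G, u0, tol=1e-12, max_iter=200):
    """Generic MM loop: minimize the majorizer built at the current iterate."""
    u = u0                                # step (1): initialization
    history = [F(u)]                      # cost value at every iterate
    for _ in range(max_iter):
        u_next = argmin_G(u)              # steps (2)-(3): u_{k+1} = argmin_u G(u, u_k)
        history.append(F(u_next))
        converged = abs(u_next - u) < tol # step (4): stopping criterion
        u = u_next
        if converged:
            break
    return u, history                     # step (5): output


# Toy problem (an assumption for illustration): F(u) = 0.5*(y - u)^2 + lam*|u|
y, lam = 3.0, 1.0
F = lambda u: 0.5 * (y - u) ** 2 + lam * abs(u)
# Minimizer of G(u, u_k) = 0.5*(y - u)^2 + lam*(u^2 / (2|u_k|) + |u_k| / 2)
argmin_G = lambda uk: y * abs(uk) / (abs(uk) + lam)

u_opt, history = mm_minimize(F, argmin_G, u0=y)
```

For this toy cost the exact minimizer is the soft-thresholded value y − λ = 2, and the recorded cost values are non-increasing, as Equation (4) predicts.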

Sparse Representation and Filter Banks
Generally, the noisy vibration signal y of rotating machinery, such as a rolling bearing, contains three parts: the fault transient impulses x, the systematic natural vibration signal f, and the additive noise w, i.e.,

$$y = x + f + w. \tag{5}$$

The core task of fault diagnosis is to extract the fault transient impulses x from the noisy observation y. Assuming that estimates $\hat{x}$ and $\hat{f}$ of the fault transient impulses and of the systematic natural vibration signal are available, we have

$$y - \hat{x} - \hat{f} \approx w. \tag{6}$$

Assuming that the estimate $\hat{x}$ of the fault transient impulses has been obtained, we can estimate the signal f as follows:

$$\hat{f} = \mathrm{LPF}(y - \hat{x}), \tag{7}$$

where LPF is a specified low-pass filter. Substituting Equation (7) into Equation (6) gives

$$(y - \hat{x}) - \mathrm{LPF}(y - \hat{x}) \approx w.$$

Defining HPF = I − LPF = H, we get

$$H(y - \hat{x}) \approx w. \tag{11}$$

On the other hand, it should be noted that Equation (5) is highly underdetermined, i.e., it is an ill-posed or NP-hard problem [40][41][42] with infinitely many solutions, because the number of unknowns is greater than the number of equations. Convex optimization approaches are commonly used to estimate a transient component from the noisy signal; based on the aforementioned work, the estimation of x can be formulated as the optimization problem

$$\hat{x} = \arg\min_{x} \left\{ \frac{1}{2}\|H(y - x)\|_2^2 + \lambda \|Dx\|_1 \right\}, \tag{12}$$

where H is a specified high-pass filter (i.e., the HPF in Equation (11)), λ is a regularization parameter, and D is the first-order difference matrix of size (N − 1) × N, defined as

$$D = \begin{bmatrix} -1 & 1 & & & \\ & -1 & 1 & & \\ & & \ddots & \ddots & \\ & & & -1 & 1 \end{bmatrix},$$

which controls the sparsity of the first-order difference of the estimate of x. If x is itself a sparse signal, i.e., most of the amplitude values in x tend to zero, then the problem can also be solved by the L1-norm fused lasso optimization (LFLO) algorithm, i.e.,

$$\hat{x} = \arg\min_{x} \left\{ \frac{1}{2}\|y - x\|_2^2 + \lambda_0 \|x\|_1 + \lambda_1 \|Dx\|_1 \right\}, \tag{13}$$

where $\lambda_0$ and $\lambda_1$ are regularization parameters. The solution of the LFLO problem can be given by a soft-threshold function [43]. In that case, we have

$$\hat{x} = \mathrm{soft}(\mathrm{tvd}(y, \lambda_1), \lambda_0), \tag{14}$$

where the function tvd(·, ·) is the total variation de-noising (TVD) algorithm [44][45][46], and the soft-threshold function is defined as

$$\mathrm{soft}(x, T) = \begin{cases} x - T\,\dfrac{x}{|x|}, & |x| > T \\ 0, & |x| \le T. \end{cases} \tag{15}$$

In addition, the high-pass filter H described above can be formulated as follows [45][46][47]:

$$H = BA^{-1}, \tag{16}$$

where A and B are Toeplitz matrices.
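As a small illustration of two building blocks used in this section, the sketch below implements the soft-threshold function and a first-order difference matrix with NumPy. The function names are hypothetical, the dense matrix is only practical for short signals, and the TVD step tvd(·, ·) is deliberately not implemented here because it requires a dedicated solver.

```python
import numpy as np


def soft(x, T):
    """Soft-threshold: shrink every entry toward zero by T, zeroing |x| <= T."""
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.maximum(np.abs(x) - T, 0.0)


def first_difference_matrix(N):
    """(N-1) x N first-order difference matrix with rows [-1, 1]."""
    D = np.zeros((N - 1, N))
    idx = np.arange(N - 1)
    D[idx, idx] = -1.0        # coefficient of x[n]
    D[idx, idx + 1] = 1.0     # coefficient of x[n+1]
    return D


thresholded = soft([-3.0, -0.5, 0.0, 0.5, 3.0], 1.0)
differences = first_difference_matrix(4) @ np.array([1.0, 4.0, 9.0, 16.0])
```

Entries with magnitude below the threshold are set to zero, which is what makes the L1 penalty sparsity-promoting; applying the difference matrix to a vector returns its consecutive differences.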

Asymmetric Convex Penalty Regularization Model
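In contrast to the optimization problems in Equations (12) and (13), and in order to estimate the fault transient impulses x more precisely, this work introduces a novel penalty regularization method, i.e., an asymmetric and symmetric nonconvex penalty regularization model:

$$\hat{x} = \arg\min_{x} \left\{ F(x) = \frac{1}{2}\|H(y - x)\|_2^2 + \lambda_0 \sum_{n} \theta_\varepsilon(x_n; r) + \sum_{i=1}^{M} \lambda_i \sum_{n} \varphi([D_i x]_n) \right\}, \tag{17}$$

where $\theta_\varepsilon(\cdot\,; r)$ is the asymmetric penalty function, φ(·) is the symmetric penalty function, $D_1$ is the first-order difference matrix, and the matrix $D_2$ is the second-order difference matrix, defined as

$$D_2 = \begin{bmatrix} -1 & 2 & -1 & & \\ & -1 & 2 & -1 & \\ & & \ddots & \ddots & \ddots \end{bmatrix}.$$

The innovations of the novel compound regularizer model are as follows: (I) The M-term compound regularizers estimate the fault transient impulses; (II) The compound regularizer model consists of symmetric and asymmetric penalty functions, wherein the symmetric penalty function is differentiable, in contrast to the function |x|, which is non-differentiable at x = 0; (III) The MM algorithm is introduced to solve the proposed compound regularization problem.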
Based on this, the core tasks of the proposed algorithm are (1) how to construct a symmetric and differentiable penalty function; (2) how to construct an asymmetric and differentiable penalty function; and (3) how to solve the proposed model with the MM algorithm to obtain the estimate of x, so that the diagnosis results are more accurate than those of the traditional LFLO and nonconvex penalty regularization approaches.
For the first task, the traditional LFLO regularization approach uses the absolute value function (AVF) $\varphi_A(x) = |x|$ as the penalty function; however, the common drawback of $\varphi_A(x) = |x|$ is that it is non-differentiable at zero, which can cause numerical problems. To address this issue, smooth non-linear approximations of $\varphi_A(x) = |x|$ are used, i.e.,

$$\varphi_B(x) = \sqrt{|x|^2 + \varepsilon}, \tag{18}$$

or

$$\varphi_C(x) = |x| - \varepsilon \log(|x| + \varepsilon). \tag{19}$$

Note that when ε = 0, $\varphi_B(x)$ and $\varphi_C(x)$ reduce to the AVF $\varphi_A(x)$, whereas for ε > 0, $\varphi_B(x)$ and $\varphi_C(x)$ are differentiable at zero. The functions $\varphi_A(x)$, $\varphi_B(x)$, $\varphi_C(x)$, and their first-order derivatives are listed in Table 1.
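Table 1. Symmetric penalty functions and their derivatives.

Function       φ(x)                         φ'(x)
$\varphi_A$    $|x|$                        $\mathrm{sign}(x)$
$\varphi_B$    $\sqrt{|x|^2 + \varepsilon}$          $x / \sqrt{|x|^2 + \varepsilon}$
$\varphi_C$    $|x| - \varepsilon \log(|x| + \varepsilon)$     $x / (|x| + \varepsilon)$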
In order for the non-linear approximation functions to retain the reliable sparsity-inducing behavior of the original LFLO algorithm, the parameter ε should be set to a sufficiently small positive value. For example, $\varepsilon = 10^{-5}$ and $\varepsilon = 10^{-6}$ are small enough that numerical issues are avoided while their impact on sparsity promotion is negligible.
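The smoothed penalties and their derivatives are simple enough to check numerically. The following sketch, with hypothetical function names, evaluates $\varphi_B$ and $\varphi_C$ and their closed-form derivatives for ε = 10⁻⁵, one of the values suggested above.

```python
import math


def phi_A(x):
    """Absolute value function (non-differentiable at 0)."""
    return abs(x)


def phi_B(x, eps):
    """Smoothed penalty sqrt(|x|^2 + eps)."""
    return math.sqrt(x * x + eps)


def phi_C(x, eps):
    """Smoothed penalty |x| - eps * log(|x| + eps)."""
    return abs(x) - eps * math.log(abs(x) + eps)


def dphi_B(x, eps):
    """Derivative of phi_B: x / sqrt(|x|^2 + eps)."""
    return x / math.sqrt(x * x + eps)


def dphi_C(x, eps):
    """Derivative of phi_C: x / (|x| + eps)."""
    return x / (abs(x) + eps)


eps = 1e-5
slope_at_zero = (dphi_B(0.0, eps), dphi_C(0.0, eps))      # both derivatives exist at 0
gap_B = abs(phi_B(0.3, eps) - phi_A(0.3))                  # deviation from |x| at x = 0.3
```

Both derivatives are well defined (and equal to zero) at x = 0, while away from zero the smoothed penalties stay very close to |x|, which is why a small ε leaves the sparsity-promoting behavior essentially unchanged.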
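For the second task, inspired by the AVF $\varphi_A(x) = |x|$, and in contrast to the symmetric and differentiable penalty functions $\varphi_B(x)$ and $\varphi_C(x)$, a segmented function is proposed as follows:

$$\theta(x; r) = \begin{cases} x, & x > \varepsilon \\ f(x), & |x| \le \varepsilon \\ -rx, & x < -\varepsilon, \end{cases} \tag{20}$$

where r > 0 is a positive constant. The main problem of Equation (20) therefore becomes how to construct the intermediate function f(x), −ε ≤ x ≤ ε. To address this issue, we seek a majorizer as the approximation of f(x), −ε ≤ x ≤ ε; in order to eliminate the non-differentiability of the penalty function at zero, a simple quadratic equation (QE) is introduced accordingly:

$$g(x, v) = ax^2 + bx + c.$$

According to the theory of majorization-minimization (MM), the quadratic must touch the two linear branches of the penalty at x = v and at a second point x = s, i.e.,

$$g(v, v) = \theta(v; r), \quad g'(v, v) = \theta'(v; r), \quad g(s, v) = \theta(s; r), \quad g'(s, v) = \theta'(s; r).$$

The parameters a, b, c, and s are all functions of v, and solving for them gives

$$a = \frac{1 + r}{4|v|}, \quad b = \frac{1 - r}{2}, \quad c = \frac{(1 + r)|v|}{4}, \quad s = -v,$$

so that

$$g(x, v) = \frac{1 + r}{4|v|}x^2 + \frac{1 - r}{2}x + \frac{(1 + r)|v|}{4}. \tag{24}$$

Similarly, a numerical issue arises in Equation (24) if the parameter v approaches zero. To address this issue, the sufficiently small positive value ε is used instead of |v|; thus, the segmented function in Equation (20) can be rewritten as

$$\theta_\varepsilon(x; r) = \begin{cases} x, & x > \varepsilon \\ \dfrac{1 + r}{4\varepsilon}x^2 + \dfrac{1 - r}{2}x + \dfrac{(1 + r)\varepsilon}{4}, & |x| \le \varepsilon \\ -rx, & x < -\varepsilon. \end{cases} \tag{25}$$

Hence, the new function $\theta_\varepsilon(x; r)$ is continuously differentiable.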
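The asymmetric penalty and its quadratic majorizer can be checked numerically. The sketch below is an illustration under assumed values r = 6 and ε = 0.01 (the text does not fix r, and ε is enlarged here so that the quadratic segment is visible on a coarse grid); `theta_eps` and `g0` are hypothetical names. It verifies the two defining MM properties on a grid: the quadratic lies above the penalty everywhere and touches it at x = v.

```python
def theta_eps(x, r, eps):
    """Smoothed asymmetric penalty: x for x > eps, -r*x for x < -eps, quadratic between."""
    if x > eps:
        return x
    if x < -eps:
        return -r * x
    return (1 + r) / (4 * eps) * x * x + (1 - r) / 2 * x + eps * (1 + r) / 4


def g0(x, v, r, eps):
    """Quadratic majorizer of theta_eps at v, with |v| replaced by eps when |v| <= eps."""
    w = max(abs(v), eps)
    return (1 + r) / (4 * w) * x * x + (1 - r) / 2 * x + w * (1 + r) / 4


r, eps = 6.0, 0.01                      # illustrative values, not taken from the paper
grid = [i / 100 for i in range(-100, 101)]

# Smallest gap between majorizer and penalty over the grid (should not be negative)
min_gap = min(g0(x, v, r, eps) - theta_eps(x, r, eps) for x in grid for v in grid)
# Largest mismatch at the touching point x = v (should be zero up to rounding)
max_touch_error = max(abs(g0(v, v, r, eps) - theta_eps(v, r, eps)) for v in grid)
```

With r > 1, negative values of x are penalized more heavily than positive ones, which is the asymmetry the model exploits.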
The third task will be solved and derived in Section 3.3 using the MM algorithm.

The Solution of the Proposed Model Based on the Majorization-Minimization Algorithm
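In this paper, the majorization-minimization (MM) algorithm is implemented to derive an iterative solution procedure for the proposed approach [47]. The function G(x, v) is chosen as the majorizer of F(x). Specifically, the iterative solution procedure can be divided into three phases: (a) constructing a majorizer of the symmetric penalty function φ(x); (b) constructing a majorizer of the asymmetric penalty function $\theta_\varepsilon(x; r)$; and (c) combining both into a majorizer of F(x) and minimizing it.

For problem (a), we first seek the majorizer g(x, v) for φ(x), i.e.,

$$g(x, v) \ge \varphi(x) \ \text{for all } x, v, \qquad g(v, v) = \varphi(v). \tag{26}$$

Since φ(x) is a symmetric function, we set g(x, v) to be an even second-order polynomial, i.e.,

$$g(x, v) = mx^2 + b. \tag{27}$$

Thus, according to Equation (26), as well as g(v, v) = φ(v) and g'(v, v) = φ'(v), we have

$$mv^2 + b = \varphi(v), \qquad 2mv = \varphi'(v).$$

The parameters m and b can be computed as

$$m = \frac{\varphi'(v)}{2v}, \qquad b = \varphi(v) - \frac{v}{2}\varphi'(v). \tag{29}$$

Substituting Equation (29) into Equation (27), we have

$$g(x, v) = \frac{\varphi'(v)}{2v}x^2 + \varphi(v) - \frac{v}{2}\varphi'(v).$$

In summation, we obtain

$$\sum_{n} g(x_n, v_n) = \frac{1}{2}x^T \Lambda(v)\, x + c(v) \ge \sum_{n} \varphi(x_n). \tag{31}$$

Therefore, based on Equation (31), we obtain

$$\sum_{i=1}^{M} \lambda_i \sum_{n} g([D_i x]_n, [D_i v]_n) = \sum_{i=1}^{M} \left[ \frac{\lambda_i}{2}(D_i x)^T \Lambda(D_i v)(D_i x) + c_i(v) \right] \ge \sum_{i=1}^{M} \lambda_i \sum_{n} \varphi([D_i x]_n), \tag{32}$$

where Λ(v) is the diagonal matrix with entries $[\Lambda(v)]_{n,n} = \varphi'(v_n)/v_n$, and c(v) and $c_i(v)$ do not depend on x.

For problem (b), we assume that $g_0(x, v)$ is the majorizer of the asymmetric and differentiable function $\theta_\varepsilon(x; r)$, i.e., $g_0(x, v) \ge \theta_\varepsilon(x; r)$ for all x, v, and $g_0(v, v) = \theta_\varepsilon(v; r)$. Therefore, the majorizer of the asymmetric and differentiable function $\theta_\varepsilon(x_n; r)$ is obtained:

$$g_0(x_n, v_n) = \begin{cases} \dfrac{1 + r}{4|v_n|}x_n^2 + \dfrac{1 - r}{2}x_n + \dfrac{(1 + r)|v_n|}{4}, & |v_n| > \varepsilon \\ \dfrac{1 + r}{4\varepsilon}x_n^2 + \dfrac{1 - r}{2}x_n + \dfrac{(1 + r)\varepsilon}{4}, & |v_n| \le \varepsilon. \end{cases}$$

In summation, we obtain

$$\sum_{n} g_0(x_n, v_n) = x^T \Gamma(v)\, x + b^T x + c(v) \ge \sum_{n} \theta_\varepsilon(x_n; r), \tag{37}$$

where Γ(v) is the diagonal matrix with entries $[\Gamma(v)]_{n,n} = (1 + r)/(4|v_n|)$ for $|v_n| > \varepsilon$ and $(1 + r)/(4\varepsilon)$ for $|v_n| \le \varepsilon$, and b is the vector with entries $[b]_n = (1 - r)/2$.

For problem (c), based on Equations (32) and (37), the majorizer of F(x) based on the MM algorithm is given by

$$G(x, v) = \frac{1}{2}\|H(y - x)\|_2^2 + \lambda_0 x^T \Gamma(v)\, x + \lambda_0 b^T x + \sum_{i=1}^{M} \frac{\lambda_i}{2}(D_i x)^T \Lambda(D_i v)(D_i x) + c(v).$$

Minimizing G(x, v) with respect to x yields

$$x = \left[ H^T H + 2\lambda_0 \Gamma(v) + \sum_{i=1}^{M} \lambda_i D_i^T \Lambda(D_i v) D_i \right]^{-1} \left( H^T H y - \lambda_0 b \right). \tag{39}$$

Here, substituting $H = BA^{-1}$ into Equation (39), we have

$$x = AQ^{-1}\left( B^T B A^{-1} y - \lambda_0 A^T b \right),$$

where matrix $M = 2\lambda_0 \Gamma(v) + \sum_{i=1}^{M} \lambda_i D_i^T [\Lambda(D_i v)] D_i$ and matrix $Q = B^T B + A^T M A$. Finally, by using the above formulas, the fault transient impulses x can be obtained by the following iterations:

$$M^{(k)} = 2\lambda_0 \Gamma(x^{(k)}) + \sum_{i=1}^{M} \lambda_i D_i^T [\Lambda(D_i x^{(k)})] D_i,$$

$$Q^{(k)} = B^T B + A^T M^{(k)} A,$$

$$x^{(k+1)} = A\,[Q^{(k)}]^{-1}\left( B^T B A^{-1} y - \lambda_0 A^T b \right).$$

In conclusion, the proposed algorithm takes the observation y, the matrices A and B, and the parameters $\lambda_i$ (i = 0, ..., M), r, and ε as inputs, initializes x, and then repeats the three updates above until convergence.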
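A compact NumPy sketch of the resulting MM iteration is given below. It is an illustration under several stated assumptions, not the authors' implementation: it uses the dense form of Equation (39) directly instead of the banded factorization $H = BA^{-1}$; the high-pass filter is a hypothetical "identity minus moving average" matrix; $\varphi_B$ is used as the symmetric penalty, so that $\varphi'(v)/v = 1/\sqrt{v^2 + \varepsilon}$; and r = 6, the λ values, and the test signal are arbitrary choices for the demonstration.

```python
import numpy as np


def highpass_matrix(N, half_width):
    """Hypothetical high-pass filter H = I - LPF, with LPF a moving average."""
    L = np.zeros((N, N))
    for n in range(N):
        lo, hi = max(0, n - half_width), min(N, n + half_width + 1)
        L[n, lo:hi] = 1.0 / (hi - lo)
    return np.eye(N) - L


def theta_eps(x, r, eps):
    """Vectorized smoothed asymmetric penalty."""
    quad = (1 + r) / (4 * eps) * x**2 + (1 - r) / 2 * x + eps * (1 + r) / 4
    return np.where(x > eps, x, np.where(x < -eps, -r * x, quad))


def cost(x, y, H, D, lams, r, eps):
    """Objective F(x): fidelity + asymmetric penalty + symmetric penalties (phi_B)."""
    fidelity = 0.5 * np.sum((H @ (y - x)) ** 2)
    asym = lams[0] * np.sum(theta_eps(x, r, eps))
    sym = sum(l * np.sum(np.sqrt((Di @ x) ** 2 + eps)) for l, Di in zip(lams[1:], D))
    return fidelity + asym + sym


def acpr_mm(y, H, lams, r=6.0, eps=1e-6, n_iter=30):
    """MM iterations for the compound model with M = 2 (first/second differences)."""
    N = y.size
    I = np.eye(N)
    D = [np.diff(I, axis=0), -np.diff(I, n=2, axis=0)]   # rows [-1, 1] and [-1, 2, -1]
    HtH = H.T @ H
    b = np.full(N, (1.0 - r) / 2.0)
    rhs = HtH @ y - lams[0] * b                          # right-hand side of Eq. (39)
    x = y.copy()                                         # assumed initialization
    history = [cost(x, y, H, D, lams, r, eps)]
    for _ in range(n_iter):
        gamma = (1.0 + r) / (4.0 * np.maximum(np.abs(x), eps))   # diagonal of Gamma(x)
        M = 2.0 * lams[0] * np.diag(gamma)
        for lam_i, D_i in zip(lams[1:], D):
            Lam = 1.0 / np.sqrt((D_i @ x) ** 2 + eps)            # phi_B'(v) / v
            M = M + lam_i * (D_i.T @ (Lam[:, None] * D_i))
        x = np.linalg.solve(HtH + M, rhs)                        # minimizer of G(x, v)
        history.append(cost(x, y, H, D, lams, r, eps))
    return x, history


# Toy data (an assumption): sparse positive impulses + slow drift + noise
rng = np.random.default_rng(0)
N = 200
impulses = np.zeros(N)
impulses[20::40] = 2.0
y = impulses + 0.5 * np.sin(2 * np.pi * np.arange(N) / N) + 0.1 * rng.standard_normal(N)
x_hat, history = acpr_mm(y, highpass_matrix(N, 10), lams=(0.05, 0.1, 0.05))
```

Because each update minimizes a valid majorizer, the recorded cost values should never increase, which is a convenient sanity check when experimenting with other filters or penalty choices. For long signals, the banded form with A and B would be preferable to the dense solve used here.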

Parameter Selection
In this section, when M = 2, the regularization parameters $\lambda_i$ are set as

$$\lambda_0 = \beta_0 \sigma, \qquad \lambda_1 = \gamma (1 - \beta_0)\sigma, \qquad \lambda_2 = \beta_0 \sigma, \tag{44}$$

where σ is the standard deviation (SD) of the additive noise, and $\beta_0$ and γ are constants chosen so as to maximize the signal-to-noise ratio (SNR); they are typically set within $\beta_0 \in [0.5, 1]$ and $\gamma \in [7.5, 8]$. In practice, the SD of the background noise in Equation (44) can be computed using both the fault signal and healthy data recorded under the same operating environment. Moreover, when healthy data are not available, the standard deviation of the background noise can still be estimated by

$$\hat{\sigma} = \frac{\mathrm{MAD}(y)}{0.6745}, \tag{45}$$

which is a traditional estimator of the noise level used for wavelet denoising [48], where MAD(y) represents the median absolute deviation (MAD) of the signal y, i.e.,

$$\mathrm{MAD}(y) = \mathrm{median}\left( |y_n - \mathrm{median}(y)| \right). \tag{46}$$
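The parameter rule can be illustrated with a short Python sketch. The formula used for the three λ values follows the worked numbers given in the Numerical Simulation section (β₀ = 0.7, γ = 7.5, σ = 0.5 giving 0.35, 1.125, and 0.35); the function names are hypothetical, and the MAD estimator is applied directly to zero-mean Gaussian samples rather than to wavelet detail coefficients, which is a simplification.

```python
import random
import statistics


def regularization_parameters(sigma, beta0=0.7, gamma=7.5):
    """Return (lambda0, lambda1, lambda2) for M = 2 from the noise level sigma."""
    lam0 = beta0 * sigma
    lam1 = gamma * (1.0 - beta0) * sigma
    lam2 = beta0 * sigma
    return lam0, lam1, lam2


def mad_sigma(y):
    """Noise SD estimate: median absolute deviation divided by 0.6745."""
    med = statistics.median(y)
    mad = statistics.median([abs(v - med) for v in y])
    return mad / 0.6745


lam0, lam1, lam2 = regularization_parameters(sigma=0.5)

random.seed(1)
noise = [random.gauss(0.0, 0.5) for _ in range(20000)]   # synthetic noise, true SD 0.5
sigma_hat = mad_sigma(noise)
```

The constant 0.6745 is the 75th percentile of the standard normal distribution, so the estimate is consistent for Gaussian noise and is much less sensitive to sparse large impulses than the sample standard deviation.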

Numerical Simulation
A simulation experiment was used to investigate the effectiveness of the proposed approach for extracting fault characteristic frequencies. Generally, a localized gear fault signal consists of three typical parts: periodic impulses caused by the localized fault, natural modulated signals due to systematic components, and additive background noise. The gearbox structure is treated as an under-damped second-order nonlinear mass-spring-damper (UD-SO-NMSD) system, and its synthetic response is described by Equation (47), where $A_0 = 1$ m/s² is the intensity of the fault impulse impact, $A_1 = 0.1$ m/s² is the intensity of the systematic vibration signal, and the damping ratio is a = 0.1; in addition, $f_n = 2000$ Hz is the natural frequency of the excited structure, the length of the vibration signal is N = 8192, the rotating frequencies are $f_1 = 280$ Hz and $f_2 = 400$ Hz, and the sampling frequency is $f_s = 20$ kHz. Additionally, in this simulated case, additive white noise $x_3(t)$ with SDs sigma = 0.5, sigma = 0.7, and sigma = 0.9 is added to the simulated signal in turn, in order to test the noise tolerance of the proposed ACPR method. The repetition interval of the impulses is T = 0.01 s, so a single period contains NT = round(fs*T) = 200 sampling points with time series t0 = (0:NT-1)/fs, and the gear fault frequency is therefore 1/T = 100 Hz. Figure 1a,b depict the obtained periodic impulses of the gear fault and the natural modulated signal, respectively.
As a benchmark approach for gearbox fault detection, the Hilbert envelope spectrum has been effectively applied to the fault diagnosis of rotating machinery, such as gears and rolling bearings. In this section, the raw envelope spectrum is compared with the envelope spectra obtained after applying the proposed ACPR, nonconvex penalty regularization (NCPR), and the LFLO method to simulated signals. The parameters used to compare the diagnosis performance of the ACPR, NCPR, and LFLO methods are listed in Table 2. Here, according to Equation (44), we take the constant parameters to be $\beta_0 = 0.7$ and γ = 7.5. The parameters of the proposed ACPR are then obtained as follows: the standard deviation σ = 0.5, and the regularization parameters $\lambda_0 = 0.7 \times 0.5 = 0.35$, $\lambda_1 = 7.5 \times (1 - 0.7) \times 0.5 = 1.125$, and $\lambda_2 = 0.7 \times 0.5 = 0.35$.
The simulated synthetic signal with a noise standard deviation sigma of 0.5, obtained from Equation (47), is shown in Figure 2a. From Figure 2a, it can be seen that the periodic impulses are completely buried in background noise. The Hilbert envelope spectrum of the raw synthetic signal is illustrated in Figure 2b, in which the fault frequencies 100 Hz and 200 Hz can be recognized; however, those frequencies are obscured by interference components. The fault signal extracted through ACPR and its envelope spectrum are depicted in Figure 2c,d, revealing the frequency information related to the fault. Meanwhile, the impulse components extracted through NCPR and LFLO and their envelope spectra are shown in Figure 2e-h, respectively. As observed, in the case of SD = 0.5, all three approaches perform well in extracting the fault information, in contrast to direct envelope spectrum analysis.
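The Hilbert envelope spectrum used as the benchmark can be sketched with NumPy alone. The example below is only an illustration: the impulse train is a hypothetical stand-in for the fault component (a decaying 2000 Hz sinusoid repeated every T = 0.01 s, without noise or the modulated systematic component), not the synthetic signal of Equation (47), and the analytic signal is computed with a standard FFT-based construction.

```python
import numpy as np


def analytic_signal(x):
    """FFT-based analytic signal: zero negative frequencies, double positive ones."""
    N = x.size
    X = np.fft.fft(x)
    h = np.zeros(N)
    h[0] = 1.0
    if N % 2 == 0:
        h[N // 2] = 1.0
        h[1:N // 2] = 2.0
    else:
        h[1:(N + 1) // 2] = 2.0
    return np.fft.ifft(X * h)


fs, N, T = 20_000, 8192, 0.01          # sampling rate, length, repetition interval
fn, a = 2000.0, 0.1                    # natural frequency and damping ratio from the text
NT = round(fs * T)                     # samples per impulse period
t0 = np.arange(NT) / fs
pulse = np.exp(-a * 2 * np.pi * fn * t0) * np.sin(2 * np.pi * fn * t0)  # assumed shape
x = np.tile(pulse, N // NT + 1)[:N]    # periodic impulse train

envelope = np.abs(analytic_signal(x))                    # Hilbert envelope
spectrum = np.abs(np.fft.rfft(envelope - envelope.mean())) / N
freqs = np.fft.rfftfreq(N, d=1.0 / fs)
peak_freq = freqs[np.argmax(spectrum)]                   # dominant envelope frequency
```

For this noise-free train the dominant envelope line falls in the frequency bin nearest to 1/T = 100 Hz (the bin spacing is fs/N, about 2.44 Hz), with weaker lines near its multiples; the denoising methods compared in this section aim to recover the same pattern from the noisy signal.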
Figure 2. A simulated synthetic signal, its detected impulses, and the corresponding envelope spectra, using the proposed ACPR, nonconvex penalty regularization (NCPR), and L1-norm fused lasso optimization (LFLO) methods, when the noise standard deviation is 0.5. (a) Simulated synthetic signal; (b) envelope spectrum of the simulated synthetic signal; (c) detected impulses using the proposed ACPR method; (d) envelope spectrum of detected impulses using the proposed ACPR method; (e) detected impulses using the NCPR method; (f) envelope spectrum of detected impulses using the NCPR method; (g) detected impulses using the LFLO method; (h) envelope spectrum of detected impulses using the LFLO method.
Next, to test the noise tolerance of the proposed ACPR, vibration signals with larger noise standard deviations (i.e., 0.7 and 0.9) are simulated and compared. The NCPR and LFLO methods are also used to process the heavily noisy simulated vibration signals, for comparison. The extraction results of the different methods are illustrated in Figures 3 and 4, respectively. As shown in Figure 3e,g and Figure 4e,g, most of the noise and harmonic interference is removed; meanwhile, the energy of the useful periodic fault impulses is also reduced dramatically. It can be seen that, as the standard deviation (SD) sigma increases, the characteristic frequencies and their harmonic components can no longer be extracted accurately and clearly by the NCPR and LFLO methods. Hence, the NCPR and LFLO methods are no longer effective for such signals, whereas the proposed ACPR remains an appealing solution. From the extraction results shown in Figures 3b-d and 4b-d, it can be concluded that the ACPR method is superior to the NCPR and LFLO methods as the noise increases. Table 3 shows the running time of the proposed ACPR algorithm and the other benchmark methods. In this work, the simulation and experimental data were processed in MATLAB on a computer running Windows 10, with a quad-core 2.9 GHz CPU and 16 GB RAM.
In conclusion, when applied to noisy signals, the proposed ACPR is more stable with regard to noise perturbation than the other two methods, at least when the noise standard deviation equals 0.7 and 0.9.

Figure 3. A simulated synthetic signal, its detected impulses, and the corresponding envelope spectra, using the proposed ACPR, NCPR, and LFLO methods, when the noise standard deviation sigma is 0.7. (a) Simulated synthetic signal; (b) envelope spectrum of the simulated synthetic signal; (c) detected impulses using the proposed ACPR method; (d) envelope spectrum of detected impulses using the proposed ACPR method; (e) detected impulses using the NCPR method; (f) envelope spectrum of detected impulses using the NCPR method; (g) detected impulses using the LFLO method; and (h) envelope spectrum of detected impulses using the LFLO method.

Table 3. The running time of the proposed algorithm and benchmarking methods.

| Noise standard deviation | ACPR algorithm | NCPR algorithm | LFLO algorithm |
| --- | --- | --- | --- |
| sigma = 0.5 | 0.288113 s | 0.05391 s | 0.000486 s |
| sigma = 0.7 | 0.283184 s | 0.073986 s | 0.000743 s |
| sigma = 0.9 | 0.308288 s | 0.052860 s | 0.000524 s |

Experimental Validation
The experimental vibration data were collected from a two-stage standard cylindrical spur gear reducer [49,50]. The gearbox consists of an input shaft, an idler shaft, and an output shaft. The physical experimental rig and the schematic diagram of the gearbox are shown in Figures 5 and 6, respectively. The helical gears on the input and output shafts have 16 and 40 teeth, respectively, and the two gears on the idler shaft have 48 and 24 teeth. The first- and second-stage reduction gear ratios are 1.5 and 1.667, respectively. In this experiment, two triaxial accelerometers with three channels were mounted on the housing with magnetic bases, near the idler shaft (see Figure 5d,e), and the acquisition system was Endevco, with a sensitivity of 10 mV/g. An artificial fault (a chipping failure with a small area) was produced on a tooth of a helical gear (i.e., the 24-tooth gear on the idler shaft), and the vibration signal was picked up from the bearing pedestal with a sampling frequency of 66.667 kHz. The shaft rotational speed was 2400 rpm (i.e., 40 Hz) with a low load, the sampling time was set to 4 s, and the characteristic defect frequency was 13.33 Hz (40 Hz × 16/48 = 13.33 Hz). It should be pointed out that the collected signals were not pre-filtered, and hence contain many interference components and heavy background noise, which makes weak fault detection more challenging.
More details about the experimental description and experiment parameters can be found at the website [49].
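The characteristic frequency quoted above follows from simple shaft kinematics. The short sketch below reproduces that arithmetic from the values stated in the text (2400 rpm, 16 and 48 teeth, 66.667 kHz, 4 s); the variable names are illustrative only, and the assumption that a chipped tooth on the idler shaft produces one impulse per idler revolution is ours.

```python
# Illustrative arithmetic only; speed, tooth counts, and sampling values come from the text.
input_speed_rpm = 2400.0
input_teeth = 16           # gear on the input shaft
idler_driven_teeth = 48    # idler-shaft gear that meshes with the input gear

input_freq_hz = input_speed_rpm / 60.0  # 40 Hz
# Assumption: one impulse per idler-shaft revolution for a single chipped tooth.
fault_freq_hz = input_freq_hz * input_teeth / idler_driven_teeth

# Fundamental and harmonics (f, 2f, ..., 5f) expected in the envelope spectrum.
harmonics_hz = [k * fault_freq_hz for k in range(1, 6)]

# Record length and the frequency spacing of an unpadded spectrum of that record.
sample_rate_hz = 66_667.0
duration_s = 4.0
n_samples = int(round(sample_rate_hz * duration_s))
freq_resolution_hz = 1.0 / duration_s

print(f"fault frequency: {fault_freq_hz:.2f} Hz")
print("harmonics:", [round(h, 2) for h in harmonics_hz])
print(f"samples: {n_samples}, resolution: {freq_resolution_hz} Hz")
```

With a 4 s record, an unpadded spectrum has a line spacing of 0.25 Hz, so a measured peak can differ from the theoretical 13.33 Hz by a fraction of that spacing.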

The raw vibration signal, with a length of 4 s, is shown in Figure 7a (to obtain a more complex vibration signal, Gaussian white noise with amplitude sigma = 0.04 was added to the original signal). The periodic impulsive symptoms cannot be discerned in it because of the high level of noise and harmonic interference. Furthermore, no sign of the gear fault is revealed by its Hilbert envelope spectrum, shown in Figure 7b; that is, the weak gear chipping characteristic frequency is completely submerged in the strong noise interference. Thus, an advanced signal processing method for weak fault detection is needed.

The proposed ACPR method was adopted to process the raw vibration signal and extract the fault feature information. Since the standard deviation (SD) of the background noise was unknown, it was estimated as $\hat{\sigma} = \mathrm{MAD}(y)/0.6745 = 0.0409$. The related parameter specification for the ACPR approach is summarized in Table 4. Figure 8 depicts the extracted fault signal and its envelope spectrum. Compared with the original envelope spectrum shown in Figure 7b, the gear chipping characteristic frequency of 13.22 Hz (close to the theoretical gear chipping fault frequency of 13.33 Hz) and its higher orders (2f, 3f, 4f, and 5f) can be clearly seen, with a regular distribution pattern. Therefore, it can be concluded that a localized fault exists on the 24-tooth helical gear on the idler shaft.

For comparison with the NCPR and LFLO methods, their parameters, such as the regularization coefficients and penalty parameters, were also calculated and are given in Table 4. The fault components detected by the NCPR and LFLO methods, and the related spectra, are shown in Figures 9 and 10. In Figures 9 and 10, however, only one higher-order line (39.42 Hz) associated with the gear chipping fault can be observed, so the fundamental fault frequency of 13.33 Hz cannot be reliably identified. This may be because the gear chipping fault, in this case, was quite weak and was submerged by strong noise. Therefore, these results indicate that the proposed ACPR method performs better at recognizing the fault information of weak gearbox failures than the NCPR and LFLO methods.
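Two generic tools used in this section, the MAD-based noise estimate and the Hilbert envelope spectrum, can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' code: it assumes that MAD denotes the median absolute deviation about the median, and the function names `mad_sigma` and `envelope_spectrum`, the 2 kHz sampling rate, the 300 Hz carrier, and the 13 Hz modulation are all hypothetical choices made for the demonstration.

```python
import numpy as np
from scipy.signal import hilbert


def mad_sigma(y):
    """Robust noise SD estimate, MAD(y) / 0.6745 (MAD assumed to be the median absolute deviation)."""
    y = np.asarray(y, dtype=float)
    mad = np.median(np.abs(y - np.median(y)))
    return mad / 0.6745


def envelope_spectrum(y, fs):
    """Return (frequencies, amplitude spectrum) of the Hilbert envelope of y."""
    y = np.asarray(y, dtype=float)
    envelope = np.abs(hilbert(y))          # magnitude of the analytic signal
    envelope = envelope - envelope.mean()  # remove the DC offset so it does not dominate
    spectrum = np.abs(np.fft.rfft(envelope)) / envelope.size
    freqs = np.fft.rfftfreq(envelope.size, d=1.0 / fs)
    return freqs, spectrum


# Synthetic demonstration (all values below are illustrative assumptions).
rng = np.random.default_rng(0)
fs = 2000.0
t = np.arange(0.0, 4.0, 1.0 / fs)
modulation_hz = 13.0
carrier = np.sin(2.0 * np.pi * 300.0 * t)
modulated = (1.0 + 0.8 * np.cos(2.0 * np.pi * modulation_hz * t)) * carrier

# The estimator recovers the SD of pure Gaussian noise.
noise = 0.04 * rng.standard_normal(t.size)
sigma_hat = mad_sigma(noise)

# The envelope spectrum peaks at the modulation frequency, not at the carrier.
freqs, spec = envelope_spectrum(modulated, fs)
peak_hz = freqs[np.argmax(spec)]

print(f"estimated sigma: {sigma_hat:.4f}")
print(f"envelope spectrum peak: {peak_hz:.2f} Hz")
```

In the experiment, the estimator is applied to the measured signal y itself, which is reasonable when y is dominated by noise, as is the case for the raw signal in Figure 7a.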

Conclusions
Aiming at the issue of estimating the sparse low-rank matrix (i.e., the weak fault signal) from its noisy observation, this paper proposes a novel weak fault feature extraction approach for a gearbox, combining majorization-minimization (MM) and asymmetric convex penalty regularization (ACPR) algorithms. In this work, the observed noisy signal was modeled as the sum of the fault's transient impulses, the systematic natural vibration signal, and the additive noise; thus, the ill-posed problem of sparse representation was translated into a regularized inverse problem. The proposed objective function, which is the sum of a signal-fidelity term and two parameterized penalty terms, estimates sparse low-rank matrices (SLRM) better than the nonconvex and convex methods.
In terms of the algorithmic solution, the asymmetric penalty function was established on the basis of the L1-norm model and simple quadratic forms, and the symmetric penalty function was established on the basis of the L1-norm model. Then, the majorizers of the symmetric and asymmetric functions were calculated via the majorization-minimization (MM) algorithm. Afterward, an efficient iterative algorithm was derived by the MM algorithm. Meanwhile, we show how to set the model parameters of the proposed objective cost function (OCF) in order to ensure that the objective function is strictly convex. Finally, the effectiveness of the proposed ACPR method is demonstrated on the simulated signal and on practical gearbox fault data, in comparison with two state-of-the-art methods, especially for the measured signal with heavy background noise. Specifically, the diagnosis results show that the gear chipping characteristic frequency of 13.22 Hz and its harmonics (2f, 3f, 4f, and 5f) can be detected clearly by the proposed method.
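The majorization step summarized above can be illustrated on a much simpler problem than the proposed OCF. The sketch below is not the ACPR algorithm: it applies MM only to a plain symmetric L1 penalty, minimizing 0.5(y - x)^2 + lam·|x| elementwise, using the standard quadratic majorizer |x| <= x^2/(2|v|) + |v|/2 (valid for v != 0). The function names and test values are hypothetical, and the asymmetric penalty and the difference operators D_i of the paper are deliberately left out.

```python
import numpy as np


def mm_l1_denoise(y, lam, n_iter=200):
    """MM iterations for min_x 0.5*(y - x)**2 + lam*|x|, applied elementwise.

    Each step minimizes the quadratic majorizer built at the current iterate v:
        0.5*(y - x)**2 + lam*(x**2 / (2*|v|) + |v| / 2),
    whose minimizer is x = y*|v| / (|v| + lam). Written this way, the update
    needs no division by |v| and stays well defined when an entry reaches zero.
    """
    y = np.asarray(y, dtype=float)
    x = y.copy()  # initialize at the observation
    for _ in range(n_iter):
        abs_v = np.abs(x)
        x = y * abs_v / (abs_v + lam)
    return x


def soft_threshold(y, lam):
    """Closed-form minimizer of the same cost, used here only as a reference."""
    y = np.asarray(y, dtype=float)
    return np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)


y_demo = np.array([3.0, -0.5, 1.2, -2.0])
lam_demo = 1.0
x_mm = mm_l1_denoise(y_demo, lam_demo)
x_ref = soft_threshold(y_demo, lam_demo)
print("MM estimate:   ", np.round(x_mm, 6))
print("soft threshold:", x_ref)
```

For this toy cost the closed-form solution is available, so MM is not needed in practice; the comparison only shows that repeatedly minimizing a majorizer drives the iterates to the minimizer of the original cost, which is the mechanism the proposed method relies on for its more complicated penalties.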

Du et al. propose a rigorously weighted low-rank sparse detection framework to explore the physical/internal mechanisms of bearing faults and implement fault diagnosis of a wind turbine.
(a) The majorizer of the symmetric and differentiable function $\phi([D_i x]_n)$ based on the MM algorithm. (b) The majorizer of the asymmetric and differentiable function $\theta_\varepsilon(x_n; r)$ based on the MM algorithm. (c) The majorizer of the objective cost function $F(x)$ based on the MM algorithm.

Figure 1. Simulated synthetic signal. (a) Faulty periodical transient impulses of a gearbox; and (b) the systematic natural modulated signal.


Figure 4. The simulated synthetic signal, its detected impulses, and its envelope spectrum, using the proposed ACPR, non-convex penalty regularization (NCPR), and LFLO methods, when the noise standard deviation sigma is 0.9. (a) Simulated synthetic signal; (b) envelope spectrum of the simulated synthetic signal; (c) detected impulses using the proposed ACPR method; (d) envelope spectrum of detected impulses using the proposed ACPR method; (e) detected impulses using the NCPR method; (f) envelope spectrum of detected impulses using the NCPR method; (g) detected impulses using the LFLO method; and (h) envelope spectrum of detected impulses using the LFLO method.

Figure 5. Experimental setup for the gearbox with multiple faults. (a) Overview of the experimental apparatus; (b) the internal structure of the gearbox; (c) the internal structure of the gear meshing; (d) the installation location of the input shaft accelerometer; (e) the installation location of the output shaft accelerometer; and (f) the gears with failure [49,50].

Figure 7. The raw vibration signal and its Hilbert envelope spectrum. (a) The raw vibration signal and (b) the Hilbert envelope spectrum of that raw vibration signal.

Figure 8. The fault information extracted through the proposed approach. (a) The time-domain waveform of the extracted fault signal and (b) the envelope spectrum of the extracted fault signal.

Figure 9. The fault information extracted through the NCPR approach. (a) The time-domain waveform of the extracted fault signal and (b) the envelope spectrum of the extracted fault signal.

Figure 10. The fault information extracted through the LFLO approach. (a) The time-domain waveform of the extracted fault signal and (b) the envelope spectrum of the extracted fault signal.
$F(x)$ is the proposed objective cost function (OCF), the penalty function $\theta_\varepsilon(x_n; r)$ is an asymmetric and differentiable function, and $\phi([D_i x]_n)$ is a symmetric and differentiable function, if the term

Table 2. Parameter settings of the proposed asymmetric convex penalty regularization (ACPR) method for gear fault detection.

Table 3. The running time of the proposed algorithm and benchmarking methods.

Table 4. Parameter settings of the proposed ACPR method for gear fault detection.