Retraction published on 20 February 2020, see Appl. Sci. 2020, 10(4), 1413.
Article

Laplacian Eigenmaps Feature Conversion and Particle Swarm Optimization-Based Deep Neural Network for Machine Condition Monitoring

1 Discipline of ICT, School of Technology, Environments and Design, University of Tasmania, Hobart, TAS 7005, Australia
2 School of Engineering, Australian Maritime College, University of Tasmania, Hobart, TAS 7005, Australia
* Author to whom correspondence should be addressed.
Appl. Sci. 2018, 8(12), 2611; https://doi.org/10.3390/app8122611
Submission received: 11 November 2018 / Revised: 6 December 2018 / Accepted: 11 December 2018 / Published: 13 December 2018
(This article belongs to the Special Issue Fault Detection and Diagnosis in Mechatronics Systems)

Abstract

This work reports a novel method that fuses Laplacian Eigenmaps feature conversion and a deep neural network (DNN) for machine condition assessment. Laplacian Eigenmaps is adopted to transform data features from the original high dimensional space to a projected lower dimensional space, the DNN is optimized by the particle swarm optimization algorithm, and a machine run-to-failure experiment was investigated for validation. Through a series of comparative experiments with the original features, with two other effective space transformation techniques, Principal Component Analysis (PCA) and Isometric mapping (Isomap), and with two other artificial intelligence methods, the hidden Markov model (HMM) and the back-propagation neural network (BPNN), the proposed method proved to be more effective for machine operating condition assessment.

1. Introduction

As most countries attach increasing importance to the advanced manufacturing industry (AMI), effective machine health assessment theory is undergoing an unprecedented revolution. Evaluating and monitoring the performance of pivotal components, such as gears or bearings, can detect degradation or faults and allow them to be corrected before a machine breakdown occurs [1]. According to References [2,3,4,5], signal processing methods involving time–frequency entropy, wavelet transform, etc., are the most popular among existing works for health assessment.
As modern mechanical structures become increasingly complex, the vibration signals that characterize the running state of a machine need to be analyzed more accurately. However, the incipient fault features of target machines are usually so weak that they are easily submerged in strong background noise and are difficult to extract. Therefore, more effective methods that can extract representative features from volatile machine running conditions and provide precise evaluation results are urgently needed. A number of artificial intelligence techniques have already been applied to distinguish machinery health conditions. For example, Cui et al. [6] proposed a novel approach to analog circuit fault diagnosis using a support vector machine (SVM) classifier. A hidden Markov model (HMM)-driven robust probabilistic principal component analyzer was created by Zhu et al. [7] for dynamic process fault classification. In addition, Yan and Guo [8] adopted a back-propagation neural network (BPNN) to assess on-line bearing performance degradation. More recently, diverse machine learning algorithms such as deep neural networks (DNNs) [9], deep belief networks (DBNs) [10], and convolutional neural networks (CNNs) [11] have been applied effectively in this field. Additionally, a hierarchical network that stacks DBNs layer by layer was proposed by Meng et al. [12] to assess mechanical systems. However, there is currently no proven, mature method that can be effectively employed to establish deep neural network models. In practice, researchers have to change parameters continually and experiment repeatedly to build a model structure; this instability and uncertainty have limited, and even cast doubt on, the development and application of machine learning models. This paper proposes to adopt the particle swarm optimization (PSO) algorithm [13] to optimize the model parameters of the DNN for machine condition assessment, which reduces the complexity of DNN modeling.
Nowadays, in addition to time and frequency analysis and other traditional methods, some novel analysis techniques such as the short time Fourier transform (STFT) [14], the Wigner–Ville Distribution (WVD) [15], and the wavelet packet transform (WPT) [16] have also been applied effectively to extract vibration signal features. Feature fusion reduction is the process of producing new and more sensitive features through a series of transformations or combinations of the original input feature sets. Compared with feature selection, the physical meaning of the new features acquired after fusion reduction differs markedly from that of the original ones and is more difficult to interpret. However, since all of the features are involved in the transformation, feature fusion reduction is capable of retaining most of the useful signal classification information by projecting the feature set from the high dimensional feature space to the low dimensional feature space directly. Since traditional linear data space conversion methods, including principal component analysis (PCA) [17], Fisher discriminant analysis (FDA) [18], and other algorithms, may ignore the convex and concave characteristics of nonlinear feature data under some conditions, nonlinear feature space conversion technologies based on manifold learning have become a hotspot of current research; these mainly include the Isometric mapping algorithm (Isomap) [19], the Laplacian Eigenmaps algorithm (LE) [20], the local linear embedding algorithm (LLE) [21], and so on. Laplacian Eigenmaps constructs the relationship between the data from a local point of view. If two data instances i and j are similar to each other, then i and j will be as close as possible in the subspace produced by LE; LE is therefore able to reflect the intrinsic manifold structure of the data.
After comparing the existing shortcomings of these techniques, this paper reports a novel method that integrates LE into a DNN to assess machine running conditions, which mainly includes four steps: (1) original feature extraction; (2) LE feature space conversion; (3) DNN optimization and training; and (4) obtaining the evaluation results with the well-trained DNN. Comprehensive contrast experiments with two other important feature space conversion algorithms, PCA and Isomap, and two other intelligent evaluation models, HMM and BPNN, highlight the advantages of the method proposed in this paper.
The remainder of the article is arranged as follows. Section 2 introduces the executive steps of the proposed method in detail and illustrates all the related algorithms and theories. In Section 3, the experiments on extracting the original features of the vibration signal and on feature space transformation are carried out, together with comparison experiments against other popular feature space transformation methods and intelligent assessment algorithms; the assessment results are also presented in this section. Finally, the conclusions are drawn in Section 4.

2. The Proposed Method

2.1. Original Features Extraction

Data preprocessing, such as removing abnormal data from the original signal dataset, is an essential procedure before feature extraction, and more precise outcomes require more meticulous preprocessing. Since the original signal is doped with a large amount of interference information and is unsuitable for direct analysis, it is a good choice to extract features of the signal data for further analysis. To retain as much of the original useful information as possible, both classical and contemporary methods for feature extraction were employed in this paper.

2.1.1. Time and Frequency Analysis

Time and frequency domain feature analysis is one of the dominant approaches for state evaluation and fault diagnosis of mechanical equipment. Among these, the time domain signal, which contains abundant information and is intuitive and easy to understand, is the original basis for machine health evaluation and diagnosis. The frequency domain characteristic parameters describe the signal through the change of frequency bands in the signal spectrum and the dispersion of the spectral energy. As faults of rotating machinery occur and develop, the frequency components of the vibration signal change as well, hence the running status of the equipment can be evaluated according to the composition and size of these frequency components.
In this paper, 11 characteristic parameters in the time domain and 13 characteristic parameters in the frequency domain are adopted, which are displayed in Table 1 and Table 2, respectively. In our research, these features are among the most effective and most widely used signal features.
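As an illustration, the Python sketch below computes a few of the time-domain and frequency-domain parameters listed in Table 1 and Table 2; the function names, the synthetic test signal, and the variable names are hypothetical and this is not the authors' implementation.

```python
import numpy as np

def time_domain_features(x):
    """Compute a subset of the time-domain parameters in Table 1 for one signal segment."""
    p1 = np.mean(x)                                # p1: mean
    p2 = np.sum((x - p1) ** 2) / (len(x) - 1)      # p2: variance
    p4 = np.sqrt(np.mean(x ** 2))                  # p4: valid (RMS) value
    p5 = np.max(np.abs(x))                         # p5: peak
    return {"mean": p1, "variance": p2, "rms": p4, "peak": p5}

def frequency_domain_features(x, fs):
    """Compute a subset of the frequency-domain parameters in Table 2 from the magnitude spectrum."""
    s = np.abs(np.fft.rfft(x))                     # signal spectrum s(a)
    f = np.fft.rfftfreq(len(x), d=1.0 / fs)        # frequency f_a of each spectral line
    p12 = np.mean(s)                               # p12: mean frequency (spectrum mean)
    p13 = np.sum((s - p12) ** 2) / (len(s) - 1)    # p13: standard deviation frequency term
    p16 = np.sum(f * s) / np.sum(s)                # p16: first-order center of gravity
    return {"mean_freq": p12, "std_freq": p13, "freq_centroid": p16}

# Example on a synthetic vibration-like segment (hypothetical values)
fs = 20_000                                        # 20 kHz sampling rate, as in the test rig
t = np.arange(0, 1.0, 1.0 / fs)
x = np.sin(2 * np.pi * 160 * t) + 0.1 * np.random.randn(t.size)
print(time_domain_features(x))
print(frequency_domain_features(x, fs))
```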

2.1.2. WPT

The wavelet packet transform (WPT) decomposes not only the low frequency part of the signal but also the high frequency part in finer detail, and there is neither redundancy nor omission in the decomposition. Therefore, WPT can provide better time–frequency analysis than the wavelet transform for mechanical vibration signals containing both medium and high frequency information. The steps to extract wavelet packet energy features mainly include [22]:
(1)  Extract signals in each sub-band
Denote the wavelet function $W(k)$ and the scaling function $\varphi(k)$ as $\mu_1 = W(k)$ and $\mu_0 = \varphi(k)$, respectively; then

$$\begin{cases} \mu_{2n}(t) = \sqrt{2}\,\sum_{k \in Z} h(k)\,\mu_{n}(2t - k) \\ \mu_{2n+1}(t) = \sqrt{2}\,\sum_{k \in Z} g(k)\,\mu_{n}(2t - k) \end{cases}$$

where $g_k = (-1)^k h_{1-k}$ is a biorthogonal filter, $n = 2l$ or $n = 2l + 1$, $l = 0, 1, 2, \ldots$ The recursively defined function $\mu_n$ is called the wavelet packet determined by the orthonormal scaling function $\mu_0 = \varphi(k)$.
(2)  Calculate the energy of each sub-band
Let the signal energy corresponding to the reconstructed signal $c_{jk}$ of the $k$th frequency band at decomposition level $j$ be $E_{jk}$; then

$$E_{jk} = \int \left| c_{jk}(t) \right|^{2} dt = \sum_{m=1}^{N} \left| x_{jm} \right|^{2}$$

where $m$ indexes the discrete points of the reconstructed signal $c_{jk}$, and $x_{jm}$ is the amplitude of the $m$th discrete point of $c_{jk}$.
(3)  Construct the wavelet packet feature vector
The feature vector of the wavelet packet is obtained by normalizing the characteristic parameters calculated with the following formula:

$$e = \left\{ E_{j0}, E_{j1}, \ldots, E_{jl} \right\} / E, \quad l = 2^{j} - 1$$

where $E = \sum_{k=0}^{l} E_{jk}$ is the total energy of the signal, equal to the sum of the energies of all sub-bands.
After selection, this paper extracted 14 WPT original features for further research.
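For reference, a possible implementation of wavelet packet energy features using the PyWavelets library is sketched below; the wavelet basis ('db4') and decomposition level (3, giving 8 sub-bands rather than the 14 features retained after selection in this paper) are illustrative assumptions.

```python
import numpy as np
import pywt

def wpt_energy_features(x, wavelet="db4", level=3):
    """Normalized wavelet packet sub-band energies (illustrative wavelet and level)."""
    wp = pywt.WaveletPacket(data=x, wavelet=wavelet, mode="symmetric", maxlevel=level)
    nodes = wp.get_level(level, order="freq")                 # sub-bands ordered by frequency
    energies = np.array([np.sum(np.square(n.data)) for n in nodes])  # E_jk = sum of squared coefficients
    return energies / energies.sum()                          # e = {E_j0, ..., E_jl} / E

# Example on a synthetic segment
x = np.random.randn(20_480)                                   # one record of 20,480 points, as in the experiment
e = wpt_energy_features(x)
print(e.shape, e.sum())                                       # 8 sub-band energies summing to 1
```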

2.2. LE Feature Space Conversion

The sample data of a high dimensional space actually lie on a low dimensional manifold whose structure contains the geometric characteristics and the intrinsic dimensionality information of the original data [23]. The sample data in the high dimensional space (D dimensions) can therefore be projected into a low dimensional manifold (L dimensions, L ≤ D) that accurately reflects the geometric characteristics of the original data. As a nonlinear space dimensionality transformation technique, LE builds a graph from the neighborhood information of the data set; each data point serves as a node of the graph, and the connectivity between nodes is governed by the proximity of neighboring points. The conversion can be represented as:

$$M_{D} \xrightarrow{\ \mathrm{LE}\ } M_{L}, \quad (L \le D)$$

where $M_D$ and $M_L$ stand for the original features in the D-dimensional space and the projected features in the L-dimensional space, respectively. The steps can be summarized as follows [24]:
(A)  Constructing the Graphs
Given k points $x_1, \ldots, x_k$ in $M_D$, construct a weighted graph with k nodes, one for each point, and a set of edges connecting neighboring points. For this purpose, put an edge between nodes i and j if they are close. In this work, the n-nearest neighbors algorithm is adopted to find the nodes that are close to each other: nodes i and j are connected by an edge if i is among the n nearest neighbors of node j.
(B)  Choosing weights
The heat kernel is used to calculate the weights of the edges in the constructed graph. If nodes i and j are connected, set

$$W_{i,j} = e^{-\frac{\left\| x_i - x_j \right\|^{2}}{4t}}$$
(C)  Eigenmaps
For the constructed graph G (assumed to be connected), compute the eigenvalues and eigenvectors of the generalized eigenvector problem:

$$A y = \delta B y$$

where B is the diagonal weight matrix whose entries are the column sums of W, $B_{ii} = \sum_{j} W_{ji}$, and $A = B - W$ is the Laplacian matrix.
The main processes are presented in Figure 1.
As introduced in Section 2.1, an n × 38 feature array composed of the 38 original features extracted from the vibration signal is acquired in the high dimensional feature space. Before this high dimensional feature array is projected into the lower dimensional space by LE, maximum likelihood estimation (MLE) is adopted to calculate the intrinsic dimension of the array; an n × m (m < 38) lower dimensional feature array is then obtained.
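As a sketch of this step, the snippet below first estimates the intrinsic dimension with the Levina–Bickel maximum likelihood estimator and then projects the feature array with scikit-learn's SpectralEmbedding, which implements Laplacian Eigenmaps with a nearest-neighbour affinity rather than the heat-kernel weights of the previous step, so it approximates the procedure described here; the neighbourhood size k and the random stand-in data are assumptions.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.manifold import SpectralEmbedding

def mle_intrinsic_dimension(X, k=10):
    """Levina-Bickel MLE of intrinsic dimension, averaged over samples (k is an assumed neighbourhood size)."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    dist, _ = nn.kneighbors(X)                    # dist[:, 0] is the point itself (distance 0)
    dist = dist[:, 1:]                            # distances T_1 .. T_k to the k nearest neighbours
    # m(x_i) = [ (1/(k-1)) * sum_{j<k} log(T_k / T_j) ]^{-1}
    log_ratios = np.log(dist[:, -1][:, None] / dist[:, :-1])
    m = (k - 1) / np.sum(log_ratios, axis=1)
    return float(np.mean(m))

# X stands in for the n x 38 original feature array (random data used here for illustration)
X = np.random.rand(1000, 38)
d_hat = mle_intrinsic_dimension(X)
print("estimated intrinsic dimension:", round(d_hat, 2))

# Laplacian Eigenmaps projection to the lower dimensional space (6 dimensions, as found in Section 3.2)
le = SpectralEmbedding(n_components=6, affinity="nearest_neighbors", n_neighbors=10)
X_low = le.fit_transform(X)                       # n x 6 projected feature array
print(X_low.shape)
```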

2.3. DNN Training and Optimization

2.3.1. Construction of Deep Neural Network

Hinton et al. proposed a feasible scheme for constructing deep-structure neural networks. The key point of this method is to use several Restricted Boltzmann Machines (RBMs) to perform unsupervised pre-training, and to stack these RBMs layer by layer to construct a DBN.
An RBM is a probabilistic model that can be represented by an undirected graph model. The undirected graph model has two layers: a visible layer used to describe the characteristics of the input data and a hidden layer, and each layer is composed of a number of stochastic units. All the visible layer elements are connected with the random binary hidden layer elements by undirected weights; however, there are no connections between elements within the same visible or hidden layer.
A DBN is built by stacking a number of RBMs from bottom to top, layer by layer, following the rules available in Reference [25]. Since the input features of this paper are continuous variables, the first two layers are built as a Gaussian–Bernoulli RBM model, while the other hidden layers are built as Bernoulli–Bernoulli RBM models. Between two binary RBM layers, the output values of the lower layer are used as inputs to the higher one; by repeating this, a network structure with the desired number of hidden layers is finally obtained.
In this paper, a linear output layer is added on top of the DBN to form a DNN that is used to learn the mapping relationship between the vibration signal features and the equipment state information; the architecture of the DNN is shown in Figure 2.
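Purely as an architectural sketch, the fine-tuning stage can be pictured as a fully connected regression network with a linear output layer, for example DNN[6; 100, 50, 20, 10; 1]; PyTorch is an assumed framework, and the layer-wise RBM pre-training described above is not reproduced here.

```python
import torch
import torch.nn as nn

class DNNRegressor(nn.Module):
    """Fully connected network with a linear output layer, e.g. DNN[6; 100, 50, 20, 10; 1].

    Note: this sketch covers only the supervised fine-tuning stage; the layer-wise
    RBM pre-training described in the paper is omitted.
    """
    def __init__(self, n_in=6, hidden=(100, 50, 20, 10), n_out=1):
        super().__init__()
        layers, prev = [], n_in
        for h in hidden:
            layers += [nn.Linear(prev, h), nn.Sigmoid()]
            prev = h
        layers.append(nn.Linear(prev, n_out))      # linear output layer on top of the stack
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

model = DNNRegressor()
x = torch.randn(32, 6)                             # a batch of 6-dimensional projected features
print(model(x).shape)                              # torch.Size([32, 1])
```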

2.3.2. DNN Optimization Based on PSO

For DNN models, the numbers of hidden nodes and hidden layers are the most significant parameters, since they decide the ability of the DNN to capture useful information from massive input data. The architecture of a DNN model can be defined as follows:

$$\mathrm{DNN}\left[ \mathrm{param}_1;\ \mathrm{param}_2^{1}, \ldots, \mathrm{param}_2^{j};\ \mathrm{param}_3 \right]$$

where $\mathrm{param}_1$ represents the number of input nodes, $\mathrm{param}_2^{i}$ denotes the number of hidden nodes in the ith hidden layer, and $\mathrm{param}_3$ stands for the number of output nodes.
Much research reveals that too few hidden nodes usually make the network not competent enough to model the data, while too many hidden nodes may trigger problems such as over-fitting and even lead to unreliable results [12]. However, until now, no mature theory has been reported for computing the exact number of hidden nodes or layers, so the construction of a DNN remains an intractable task.
In this paper, we propose to use the particle swarm optimization (PSO) algorithm to optimize the model parameters of the DNN. The particle swarm optimization algorithm can be regarded as a global optimization process over a population, which can be applied effectively to many optimization problems. The PSO algorithm adopted in this study establishes the optimal DNN model by iterating over the model parameters.
The target parameters of the optimization include the number of nodes in each hidden layer, the training order of the DNN model, and the number of training iterations per layer. Usually, the number of input nodes is no more than half the number of nodes in the first hidden layer, and the number of nodes in each hidden layer is generally not less than twice that of the next layer. For example, suppose the DNN model has two hidden layers, L input nodes, and one output node, where L denotes the feature dimension after transformation and is generally not greater than 6; then the numbers of nodes in the two hidden layers can be set to 12 to 17 and 6 to 8, respectively. Verification shows that when the training order of the model exceeds 8, the error generated by the model increases exponentially, so the training order is restricted to 1 to 7. Additionally, the number of training iterations for each order is required to be divisible by the number of elements included in the input node; taking into account the performance of the computer and the time the PSO algorithm needs, the numbers of training iterations are tentatively set as {500, 1000, 1250, 2000, 2500, 4000, 5000, 6250, 10,000, 12,500}.
After optimization, the parameters $\mathrm{param}_2^{1}, \ldots, \mathrm{param}_2^{j}$ in Equation (6) are determined, and thus the optimal DNN model can be obtained.
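A highly simplified sketch of how PSO can search the discrete hyper-parameter space described above is given below; the particle encoding (two hidden-layer sizes, training order, iterations per layer), the placeholder fitness function, and the PSO coefficients are illustrative assumptions rather than the authors' exact settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Candidate values for each hyper-parameter, following the ranges given above
H1 = list(range(12, 18))        # nodes in the first hidden layer (12-17)
H2 = list(range(6, 9))          # nodes in the second hidden layer (6-8)
ORDER = list(range(1, 8))       # training order (1-7)
EPOCHS = [500, 1000, 1250, 2000, 2500, 4000, 5000, 6250, 10000, 12500]
GRID = [H1, H2, ORDER, EPOCHS]

def decode(position):
    """Map a continuous particle position in [0, 1) onto the discrete candidate grid."""
    return [GRID[i][int(p * len(GRID[i])) % len(GRID[i])] for i, p in enumerate(position)]

def fitness(params):
    """Placeholder objective: in practice this would train a DNN with the decoded
    parameters and return its validation error (hypothetical stand-in here)."""
    h1, h2, order, epochs = params
    return abs(h1 - 2 * h2) + 0.1 * order + epochs / 12500.0

n_particles, n_dims, n_iter = 20, 4, 50
w, c1, c2 = 0.7, 1.5, 1.5                          # inertia and acceleration coefficients (assumed)
pos = rng.random((n_particles, n_dims))
vel = np.zeros_like(pos)
pbest = pos.copy()
pbest_val = np.array([fitness(decode(p)) for p in pos])
gbest = pbest[np.argmin(pbest_val)].copy()

for _ in range(n_iter):
    r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, 0.0, 0.999)
    vals = np.array([fitness(decode(p)) for p in pos])
    improved = vals < pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    gbest = pbest[np.argmin(pbest_val)].copy()

print("best hyper-parameters:", decode(gbest))
```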

2.4. Condition Assessment

The DNN model optimized by PSO can express numerous function sets in a compact and concise way, which makes it very suitable for obtaining the essential characteristics of massive data. To analyze the whole-life running condition of the machine, the entire dataset $M_L$, composed of feature data of the vibration signals collected under both normal and abnormal conditions, is used as the testing data and input into the DNN model, that is

$$\text{Testing data:}\quad M_{L} = \begin{bmatrix} p_{1} \\ p_{2} \\ \vdots \\ p_{M} \end{bmatrix} = \begin{bmatrix} p_{1}^{1} & \cdots & p_{1}^{L} \\ \vdots & \ddots & \vdots \\ p_{M}^{1} & \cdots & p_{M}^{L} \end{bmatrix}$$

where $N_L \subseteq M_L$, i.e., the training data form a subset of the full testing dataset. The assessment result then indicates when the machine was healthy, when incipient slight faults occurred, and when serious faults occurred.
We consider this task as a regression task. In a regression task, each training instance may have a value such as 0.8, 0.9, 1.0, 1.1, 1.2, and so on. In mechanical part health monitoring, we can set "1.0" as the target for the healthy training dataset. When the validation dataset outputs "1.0" or values close to "1.0", the signal can be considered healthy; when it outputs values such as "0.5" or "1.5", the signal may be considered unhealthy. The output fluctuates when faults occur, which makes it easier to detect and monitor the health of the mechanical part with this method.
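The labelling scheme described above can be illustrated as follows; the tolerance band of ±0.2 around the healthy target of 1.0 is a hypothetical threshold chosen for the example only.

```python
import numpy as np

HEALTHY_TARGET = 1.0     # regression label assigned to the healthy training data
TOLERANCE = 0.2          # hypothetical band: outputs within 1.0 +/- 0.2 are treated as healthy

def health_flags(dnn_outputs, target=HEALTHY_TARGET, tol=TOLERANCE):
    """Return True where the DNN output stays close to the healthy target."""
    return np.abs(np.asarray(dnn_outputs) - target) <= tol

outputs = [1.02, 0.98, 1.05, 0.51, 1.48]   # example DNN outputs over time
print(health_flags(outputs))               # [ True  True  True False False]
```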
Figure 3 exhibits the main procedures of the proposed method of integrating LE into a deep neural network for evaluating the machine health state.

3. Experiments and Analysis

3.1. Test Rig and Data

Bearings, among the most important components of mechanical transmission systems, are also the most vulnerable parts due to their complicated internal constitution, and most machine failures are caused by damage to critical components such as bearings. The bearing run-to-failure experiment was implemented on the test rig shown in Figure 4, in which there are four bearings in the transmission system; the rotating speed of the main shaft under constant load was kept invariable by the alternating current motor. The parameters of the bearings and the operating conditions are listed in Table 3. To obtain accurate vibration data, the bearing housing was fitted with two high-sensitivity quartz ICP accelerometers (PCB 353B33), one fixed in the horizontal direction and the other in the vertical direction, and an NI DAQ Card 6062E was used in the data acquisition system. When the experiment was completed, 984 individual ASCII-format data files had been obtained, each consisting of 20,480 data points recorded at an interval of 10 min. The outer ring of the selected bearing was found to be faulty after the test.

3.2. Feature Space Conversion

The test data collected from the above experiment were adopted for further analysis. After data pre-processing, time and frequency domain analysis and WPT were utilized to extract the thirty-eight features in the original feature space; a 9810 × 38 array composed of the original features was thus obtained. Eight representative original features that have been widely used in related research are displayed in Figure 5. All eight feature waveforms show an obvious mutation at time point 7000; however, only four of them (mean frequency (MF), WPT1, WPT5, and WPT6) show a slight abnormality at time point 5300, while the other four (skewness, kurtosis, crest, and standard deviation frequency (SDF)) seem unable to detect the early slight abnormality.
Next, the intrinsic dimension of the thirty-eight original features was computed with the MLE algorithm and found to be six. The local nonlinear space conversion technique LE was then applied to project the features from the high dimensional space to the lower dimensional space according to the steps in Section 2.2; hence, the 9810 × 38 original feature dataset was transformed into a 9810 × 6 dataset composed of mapping features. From Figure 6, it can be seen that four of the projected features (features 1, 2, 3, and 5) started to become abnormal around time point 5300, while an obvious mutation occurred around time point 7000 for all six features.
It can also be easily seen that the mapping features in the projected feature space performed much better than the features in the original feature space for abnormality detection. The abnormal phenomena described above indicate that the bearing used in the test might have developed slight faults around time point 5300, while serious degradation might have occurred around time point 7000. Additionally, since the curves of both the original and the projected features are smooth and stable before time point 5300, it can be inferred that the machine was running under normal conditions before this point.

3.3. DNN Condition Assessment

3.3.1. DNN Construction and Training

In this study, to correspond with the dimension of the input projected feature array, the number of input nodes was set to six. Since the ultimate purpose is a single equipment-condition evaluation result, a single output node is preferred. The numbers of hidden nodes and layers of the DNN were optimized with the PSO algorithm proposed in Section 2.3. The dataset used for training and fine-tuning the DNN was segmented into two parts: the first 80% was used for training and fine-tuning, while the rest was used for validation. The algorithm parameters of the DNN model, such as numepochs, batchsize, momentum, and so on, were adjusted repeatedly to achieve better results in the experiments. According to Equation (6), after a series of comparative experiments, the model producing a smooth, clear, and reasonably trending curve was constructed as

DNN1 [6; 100, 50, 20, 10; 1]

and the critical DNN parameters numepochs, batchsize, and momentum were set to 3, 50, and 0, respectively. According to the analysis in Section 3.2, the feature data before the 5300th min were all collected under normal conditions. Therefore, as described in Section 2.4, the first 2500 × 6 subpart of the 9810 × 6 mapping feature array obtained by LE was used as training data, and the weights and biases were fine-tuned through the CD and BP algorithms.
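Following the split described here, the fine-tuning stage could be organised roughly as in the sketch below; the optimiser, learning rate, loss function, and random stand-in data are assumptions, and the CD-based pre-training is again not reproduced.

```python
import torch
import torch.nn as nn

# X_low stands in for the n x 6 mapping feature array from the LE step (random data here)
X_low = torch.randn(9810, 6)
train = X_low[:2500]                            # data before the 5300th min, taken as healthy
split = int(0.8 * len(train))                   # 80% for training, 20% for validation
x_tr, x_va = train[:split], train[split:]
y_tr = torch.ones(len(x_tr), 1)                 # healthy target value 1.0
y_va = torch.ones(len(x_va), 1)

# DNN1 [6; 100, 50, 20, 10; 1] with a linear output layer
model = nn.Sequential(
    nn.Linear(6, 100), nn.Sigmoid(),
    nn.Linear(100, 50), nn.Sigmoid(),
    nn.Linear(50, 20), nn.Sigmoid(),
    nn.Linear(20, 10), nn.Sigmoid(),
    nn.Linear(10, 1),
)
opt = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.0)   # momentum 0, as in the text
loss_fn = nn.MSELoss()

for epoch in range(3):                          # numepochs = 3
    for i in range(0, len(x_tr), 50):           # batchsize = 50
        xb, yb = x_tr[i:i + 50], y_tr[i:i + 50]
        opt.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        opt.step()
    with torch.no_grad():
        print(f"epoch {epoch}: validation loss {loss_fn(model(x_va), y_va).item():.4f}")
```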

3.3.2. Assessment and Results

After obtaining the optimized and well-trained DNN assessment model, the entire feature dataset, composed of feature data of the vibration signals collected under both normal and abnormal conditions, was used to evaluate the lifelong running condition of the machine.
Firstly, the 9810 × 6 array of the six mapping features in the projected space was input into DNN1 to conduct the assessment experiment, the result of which is shown in Figure 7. Then, in order to make a comparison and demonstrate the advantages of the proposed method, the 9810 × 38 array of the thirty-eight original features without feature space conversion was used in the same experiment; Figure 8 plots the result, for which the DNN model was constructed as

DNN2 [38; 100, 50, 20, 10; 1]
By analyzing and comparing the results for these two kinds of features, the following phenomena can be observed: (1) During the long initial period in which the bearing runs normally, both curves show a basically linear trend, but the curve of the mapping features is more stable, while that of the original features fluctuates markedly, indicating that the former is far less sensitive to noise than the latter. (2) Both kinds of features can detect the serious degradation, such as cracking and fatigue spalling, that occurred at the 7000th min; however, the original features could not detect the early slight faults of wear, pitting, or overheating that began in the vicinity of the 5300th min, whereas the mapping features performed well on this issue. (3) At the end, the second curve reverses its trend and behaves very erratically, while the first curve shows a good unilateral trend and rises monotonically and sharply after the 9400th min, indicating that the bearing had begun to deteriorate so violently that it could no longer work.
Compared with the actual experimental situation, the assessment result of the proposed method, which transforms the features from the original higher dimensional space to the projected lower dimensional space by LE, was consistent with the real operating status of the bearings, while the assessment results of the original features without feature space conversion differed greatly from the actual situation.

3.4. Comparison Experiments and Analysis

3.4.1. Comparisons of Space Conversion Methods

LE, as adopted in this study, is a nonlinear local feature space conversion method. In order to make a comparative analysis and highlight its effectiveness, several contrast experiments were carried out with the linear space conversion method PCA and the global nonlinear space conversion method Isomap, both of which are widely applied for feature transformation. Considering fairness and rationality, the most suitable DNN structures for PCA and Isomap were constructed, respectively, as

DNN3 [6; 100, 50, 25, 5; 1]

and

DNN4 [6; 100, 50, 25, 10; 1]
The evaluation results of the DNNs with the PCA- and Isomap-based space conversion techniques, applying the same procedures as in Section 3.3, are shown in Figure 9. It can be seen that the PCA-based and LE-based results exhibit the same abnormal behavior: both began to appear abnormal at the 5300th min and mutated around the 7000th min, but the former shows greater volatility before the start of the anomaly and is chaotic at the end. The result of the Isomap-based technique performs worse during the initial normal period, and its mutation at the 7000th min is less obvious, although its unidirectional drastic descent (at about the 9400th min) demonstrates the validity of this method for detecting serious bearing failures. It is easy to see that LE performs best overall in these comparisons.

3.4.2. Comparisons of Assessment Models

In the following study, two other artificial intelligence models, BPNN and HMM, which have shown excellent performance in pattern recognition [26], data processing, and other fields, were also applied to carry out similar comparative experiments. The specific algorithmic theories of HMM and BPNN can be found in References [7] and [8], respectively.
In the comparison experiments, the feature space conversion method remained LE, but the evaluation models were changed to BPNN and HMM, respectively. The evaluation results are shown in Figure 10. It can be seen that BPNN accurately detects the anomaly at the 7000th min, but it is not sensitive to the early slight fault at about the 5300th min and behaves erratically at the end, where the waveform moves in the opposite direction. The waveform of the HMM-based method is quite smooth in the early stage, and it identifies the early deterioration of the bearings around the 5300th min more clearly than BPNN. However, the inability of HMM to detect the mutation at about the 7000th min suggests that this approach is not competent enough for the assessment task either. Hence, the DNN outperforms the other assessment models.

4. Conclusions

In view of the complexity of modern mechanical systems as well as their harsh and unstable running conditions, effective methods for evaluating and monitoring the running conditions of machines are urgently needed. This work reports a novel and effective method combining Laplacian Eigenmaps feature conversion and a particle swarm optimization-based deep neural network for evaluating the health state of the target machine (rolling-element bearings). Firstly, three popular approaches, time domain analysis, frequency domain analysis, and WPT, were applied to extract thirty-eight features of the vibration signals collected from the machine in the original high dimensional space. Then, the nonlinear local algorithm LE was introduced to transform the original features into the projected lower dimensional space and obtain six more representative parameters. Next, the transformed six-dimensional feature dataset was fed into the PSO-optimized DNN to assess the whole-life running condition of the target bearing in the test. Finally, a series of comprehensive and persuasive comparison experiments proved that the proposed method of integrating LE into DNN is more effective for machine running state assessment. In future work, the proposed method may also be applied to prognosis, classification, and other fields.

Author Contributions

B.K., S.X., and X.W. conceived and designed the experiments; N.Y. performed the experiments; N.Y. and W.Y. analyzed the data; N.Y. contributed analysis tools; all the authors wrote and revised the paper.

Funding

This research received no external funding.

Acknowledgments

The authors thank Juncheng Lu for data collection, and Chengjiang Li and Bin Su for suggestions to improve this manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Yuan, N.; Yang, W.; Kang, B.; Xu, S.; Li, C. Signal fusion-based deep fast random forest method for machine health assessment. J. Manuf. Syst. 2018, 48, 1–8. [Google Scholar] [CrossRef]
  2. Loutridis, S. Instantaneous energy density as a feature for gear fault detection. Mech. Syst. Signal Process. 2006, 20, 1239–1253. [Google Scholar] [CrossRef]
  3. Öztürk, H.; Sabuncu, M.; Yesilyurt, I. Early detection of pitting damage in gears using mean frequency of scalogram. J. Vib. Control 2008, 14, 469–484. [Google Scholar] [CrossRef]
  4. Loutridis, S. Self-similarity in vibration time series: Application to gear fault diagnostics. J. Vib. Acoust. 2008, 130, 031004. [Google Scholar] [CrossRef]
  5. Yu, D.; Yang, Y.; Cheng, J. Application of time–frequency entropy method based on Hilbert–Huang transform to gear fault diagnosis. Measurement 2007, 40, 823–830. [Google Scholar] [CrossRef]
  6. Cui, J.; Wang, Y. A novel approach of analog circuit fault diagnosis using support vector machines classifier. Measurement 2011, 44, 281–289. [Google Scholar] [CrossRef]
  7. Zhu, J.; Ge, Z.; Song, Z. HMM-driven robust probabilistic principal component analyzer for dynamic process fault classification. IEEE Trans. Ind. Electron. 2015, 62, 3814–3821. [Google Scholar] [CrossRef]
  8. Yan, J.; Guo, C.; Wang, X. A dynamic multi-scale Markov model based methodology for remaining life prediction. Mech. Syst. Signal Process. 2011, 25, 1364–1376. [Google Scholar] [CrossRef]
  9. LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324. [Google Scholar] [CrossRef] [Green Version]
  10. Yang, B.-S.; Oh, M.-S.; Tan, A.C.C. Fault diagnosis of induction motor based on decision trees and adaptive neuro-fuzzy inference. Expert Syst. Appl. 2009, 36, 1840–1849. [Google Scholar] [Green Version]
  11. Hinton, G.E.; Salakhutdinov, R.R. Reducing the dimensionality of data with neural networks. Science 2006, 313, 504–507. [Google Scholar] [CrossRef] [PubMed]
  12. Gan, M.; Wang, C. Construction of hierarchical diagnosis network based on deep learning and its application in the fault pattern recognition of rolling element bearings. Mech. Syst. Signal Process. 2016, 72, 92–104. [Google Scholar] [CrossRef]
  13. Lin, Q.; Liu, S.; Zhu, Q.; Tang, C.; Song, R.; Chen, J.; Coello, C.A.C.; Wong, K.-C.; Zhang, J. Particle swarm optimization with a balanceable fitness estimation for many-objective optimization problems. IEEE Trans. Evol. Comput. 2018, 22, 32–46. [Google Scholar] [CrossRef]
  14. Klein, R.; Ingman, D.; Braun, S. Non-stationary signals: Phase-energy approach—Theory and simulations. Mech. Syst. Signal Process. 2001, 15, 1061–1089. [Google Scholar] [CrossRef]
  15. Baydar, N.; Ball, A. A comparative study of acoustic and vibration signals in detection of gear failures using Wigner–Ville distribution. Mech. Syst. Signal Process. 2001, 15, 1091–1107. [Google Scholar] [CrossRef]
  16. He, Q. Vibration signal classification by wavelet packet energy flow manifold learning. J. Sound Vib. 2013, 332, 1881–1894. [Google Scholar] [CrossRef]
  17. Gharavian, M.; Ganj, F.A.; Ohadi, A.; Bafroui, H.H. Comparison of FDA-based and PCA-based features in fault diagnosis of automobile gearboxes. Neurocomputing 2013, 121, 150–159. [Google Scholar] [CrossRef]
  18. Zhu, Z.-B.; Song, Z.-H. A novel fault diagnosis system using pattern classification on kernel FDA subspace. Expert Syst. Appl. 2011, 38, 6895–6905. [Google Scholar] [CrossRef]
  19. Tenenbaum, J.B.; De Silva, V.; Langford, J.C. A global geometric framework for nonlinear dimensionality reduction. Science 2000, 290, 2319–2323. [Google Scholar] [CrossRef]
  20. Belkin, M.; Niyogi, P. Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput. 2003, 15, 1373–1396. [Google Scholar] [CrossRef]
  21. Roweis, S.T.; Saul, L.K. Nonlinear dimensionality reduction by locally linear embedding. Science 2000, 290, 2323–2326. [Google Scholar] [CrossRef] [PubMed]
  22. Hemmati, F.; Orfali, W.; Gadala, M.S. Roller bearing acoustic signature extraction by wavelet packet transform, applications in fault detection and size estimation. Appl. Acoust. 2016, 104, 101–118. [Google Scholar] [CrossRef]
  23. Hauberg, S. Principal curves on Riemannian manifolds. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 38, 1915–1921. [Google Scholar] [CrossRef] [PubMed]
  24. Jafari, A.; Almasganj, F. Using Laplacian eigenmaps latent variable model and manifold learning to improve speech recognition accuracy. Speech Commun. 2010, 52, 725–735. [Google Scholar] [CrossRef]
  25. Hinton, G.E.; Osindero, S.; Teh, Y.-W. A fast learning algorithm for deep belief nets. Neural Comput. 2006, 18, 1527–1554. [Google Scholar] [CrossRef] [PubMed]
  26. Yuan, N.; Yang, W.; Kang, B.; Xu, S.; Wang, X.; Li, C. Manifold learning-based fuzzy k-principal curve similarity evaluation for wind turbine condition monitoring. In Energy Science & Engineering; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2018. [Google Scholar]
Figure 1. Main processes of LE space conversion.
Figure 2. Structure of DNN.
Figure 3. Flow chart of the proposed method.
Figure 4. Test rig.
Figure 5. Eight of the features in original feature space.
Figure 6. The six features in projected feature space.
Figure 7. Assessment result of the 6 mapping features.
Figure 8. Assessment result of the 38 original features.
Figure 9. Assessment results of the compared space conversion methods. (a) Assessment result of PCA conversion; (b) assessment result of Isomap conversion.
Figure 10. Assessment results of the compared assessment models. (a) Assessment results of BPNN model; (b) Assessment results of HMM model.
Table 1. Features in Time-domain.
Feature | Name | Equation
p1 | Mean | $\frac{1}{N}\sum_{n=1}^{N} x(n)$
p2 | Variance | $\frac{1}{N-1}\sum_{n=1}^{N} \left(x(n)-p_1\right)^2$
p3 | Square root amplitude | $\left(\frac{1}{N}\sum_{n=1}^{N} \sqrt{\left|x(n)\right|}\right)^2$
p4 | Valid (RMS) value | $\sqrt{\frac{1}{N}\sum_{n=1}^{N} x(n)^2}$
p5 | Peak | $\max \left|x(n)\right|$
p6 | Skewness index | $p_4 \Big/ \frac{1}{N}\sum_{n=1}^{N} \left|x(n)\right|$
p7 | Kurtosis index | $\frac{1}{N}\sum_{n=1}^{N} \left(\frac{x(n)-p_1}{p_2}\right)^3$
p8 | Peak factor | $\frac{1}{N}\sum_{n=1}^{N} \left(\frac{x(n)-p_1}{p_2}\right)^4$
p9 | Margin indicator | $p_5/p_4$
p10 | Waveform indicator | $p_5/p_3$
p11 | Pulse indicator | $p_5 \Big/ \frac{1}{N}\sum_{n=1}^{N} \left|x(n)\right|$
Note: $x(n)$ is the time domain signal sequence, $n = 1, 2, \ldots, N$, and $N$ is the total number of samples.
Table 2. Features in Frequency-domain.
Feature | Name | Equation
p12 | Mean frequency | $\frac{1}{A}\sum_{a=1}^{A} s(a)$
p13 | Standard deviation frequency | $\frac{1}{A-1}\sum_{a=1}^{A} \left(s(a)-p_{12}\right)^2$
p14 | Spectral skewness | $\sum_{a=1}^{A} \left(s(a)-p_{12}\right)^3 \Big/ \left(A\left(\sqrt{p_{13}}\right)^{3}\right)$
p15 | Spectral kurtosis | $\sum_{a=1}^{A} \left(s(a)-p_{12}\right)^4 \Big/ \left(A\,p_{13}^{2}\right)$
p16 | First-order center of gravity | $\sum_{a=1}^{A} f_a s(a) \Big/ \sum_{a=1}^{A} s(a)$
p17 | Second-order center of gravity | $\sqrt{\frac{1}{A}\sum_{a=1}^{A} \left(f_a-p_{16}\right)^2 s(a)}$
p18 | Second-order moment of spectrum | $\sqrt{\sum_{a=1}^{A} f_a^{2} s(a) \Big/ \sum_{a=1}^{A} s(a)}$
p19 | — | $\sqrt{\sum_{a=1}^{A} f_a^{4} s(a) \Big/ \sum_{a=1}^{A} f_a^{2} s(a)}$
p20 | — | $\sum_{a=1}^{A} f_a^{2} s(a) \Big/ \sqrt{\sum_{a=1}^{A} s(a)\,\sum_{a=1}^{A} f_a^{4} s(a)}$
p21 | — | $p_{17}/p_{16}$
p22 | — | $\sum_{a=1}^{A} \left(f_a-p_{16}\right)^3 s(a) \Big/ \left(A\,p_{17}^{3}\right)$
p23 | — | $\sum_{a=1}^{A} \left(f_a-p_{16}\right)^4 s(a) \Big/ \left(A\,p_{17}^{4}\right)$
p24 | — | $\sum_{a=1}^{A} \left|f_a-p_{16}\right|^{1/2} s(a) \Big/ \left(A\,p_{17}^{1/2}\right)$
Note: $s(a)$ is the signal spectrum, $a$ is the index of the spectral line, $f_a$ is the frequency of the $a$-th spectral line, and $A$ is the total number of spectral lines, $a = 1, 2, \ldots, A$.
Table 3. Bearing parameters and experimental conditions.
Type | Number | Ball Diameter (mm) | Contact Angle (deg) | Rotation Speed (RPM) | Load (kN·m) | Sampling Rate (kHz)
ZA-2115 | 4 | 10 | 0 | 1500 | 26.50 | 20
