Data Augmentation Using Generative Adversarial Network for Automatic Machine Fault Detection Based on Vibration Signals

Abstract: In the last decade, predictive maintenance has attracted considerable attention in industrial factories because of its wide use of the Internet of Things and artificial intelligence algorithms for data management. However, in the early phases, when abnormal and faulty machines rarely appear in factories, only limited sets of machine fault samples are available. With limited fault samples, it is difficult to train a fault classification model because of the imbalance of the input data. Therefore, data augmentation is required to increase the accuracy of the learning model. However, few methods exist to generate and evaluate such data for data analysis. In this paper, we introduce a method that uses the generative adversarial network to augment fault signals and enrich the dataset. The enhanced dataset can increase the accuracy of the machine fault detection model in the training process. We also performed fault detection using a variety of preprocessing approaches and classification models to evaluate the similarity between the generated data and the authentic data. The generated fault data has high similarity with the original data and significantly improves the accuracy of the model. The accuracy of machine fault detection reaches 99.41% with 20% of the original fault machine dataset and 93.1% with 0% of the original fault machine dataset (generated data only). Based on this, we conclude that the generated data can be mixed with the original data to improve model performance.


Introduction
Machine maintenance is one of the most important fields in the industrial environment. Under normal working conditions, maintenance costs range from 15% to 40% of the total production cost [1]. Most manufacturing industries today use preventive maintenance, which replaces machine parts on a fixed schedule, to guarantee high maintenance safety. However, preventive maintenance cannot be considered an effective maintenance method because a fixed schedule does not make full use of resources. In contrast, predictive maintenance can save 8% to 12% of the traditional maintenance cost and 40% of the reactive maintenance cost [2]. In the last decade, predictive maintenance has played an increasingly important role in manufacturing because of improvements in the Internet of Things (IoT) and real-time data management. In the modern industrial environment, predictive maintenance focuses on IoT and artificial intelligence (AI) platforms. As shown in Figure 1, a system built on these platforms performs data collection and signal processing to conduct early fault detection and diagnosis by applying the concepts of data collection and management [3]. Some signals are popular in machine fault detection, such as vibration and acoustic signals. However, the acoustic signal is harder to collect and more sensitive to noise than the vibration signal [4]. Various predictive maintenance schemes and AI models, which mostly use supervised learning, have been proposed lately [5,6]. However, in the early phases, when abnormal operations and faulty machines rarely appear in real-time scenarios, the initial set of fault vibration samples is restricted. When fault samples are absent or limited, the training process for fault classification applications is difficult to conduct because of the imbalance of the input data.
Therefore, data augmentation is necessary to increase the performance of the model training process when dealing with small fault datasets. One approach for limited training data is transfer learning, where the target signal is created based on a source signal that has the same distribution [7,8]. However, the source signal used for transfer learning also requires a balance between the normal and fault datasets. Therefore, data augmentation is also necessary for transfer learning techniques when dealing with small fault datasets at the initial phase. In practice, work on data augmentation for time series is very limited and mostly focuses on traditional data transformation methods. Examples of these methods are jittering [9,10], scaling [9,10], window slicing [10,11], and flipping [10,12,13]. In these studies, the traditional data transformation methods do not significantly improve the accuracy of the model [9,10,11,14]. Data augmentation, therefore, has not been fully evaluated in time-series data analysis and fault detection applications. With the growing demand for fault detection applications in smart factories, the need for effective data augmentation methods has been increasing [9,15].
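To make the traditional transformations named above concrete, the following is a minimal sketch written with the Python standard library only. The function names, the noise levels (`sigma`), and the slice `ratio` are illustrative assumptions and are not taken from the cited works. Flipping is interpreted here as sign inversion; some works instead reverse the time axis.

```python
import random


def jitter(signal, sigma=0.05, rng=random):
    """Jittering: add independent Gaussian noise to every sample."""
    return [v + rng.gauss(0.0, sigma) for v in signal]


def scale(signal, sigma=0.1, rng=random):
    """Scaling: multiply the whole signal by one random factor near 1."""
    factor = rng.gauss(1.0, sigma)
    return [v * factor for v in signal]


def window_slice(signal, ratio=0.9, rng=random):
    """Window slicing: keep one random contiguous part of the signal."""
    if not signal:
        return []
    length = min(len(signal), max(1, int(len(signal) * ratio)))
    start = rng.randint(0, len(signal) - length)
    return signal[start:start + length]


def flip(signal):
    """Flipping, taken here as sign inversion (an assumption)."""
    return [-v for v in signal]


# Toy ramp signal; a seeded generator keeps the example repeatable.
rng = random.Random(0)
base = [float(i) for i in range(100)]
augmented = {
    "jitter": jitter(base, rng=rng),
    "scale": scale(base, rng=rng),
    "slice": window_slice(base, ratio=0.5, rng=rng),
    "flip": flip(base),
}
```

Each transformation only perturbs an existing sample, which is consistent with the observation above that such methods add little new information to a small fault dataset.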
General generative models produce outputs similar to the samples in the training dataset [16] by mimicking the probability distribution function of the original data. The most popular generative method in data augmentation is the generative adversarial network (GAN) [17,18]. The GAN algorithm is mostly applied in image processing and image generation. The main drawback of GAN is its instability during the training process, where the discriminator and generator try to fool each other [19]. Several studies have been conducted to improve GAN stability in the training process [20]. However, GAN evaluation remains limited because it requires human inspection, especially for picture generation and computer vision.
In this paper, vibration data from Spectra Quest's Gearbox Prognostics Simulator (GPS) is tested using various fault detection approaches for both limited and unlimited input data. Another data source that could be considered is real-scenario data, such as that in Reference [21]. However, those experiments are not compatible with the GPS dataset, and the measurements were not conducted thoroughly. Therefore, we only use the GPS dataset for data augmentation in this study. Using GAN, we generate the broken-machine signal to improve the training performance of the model. By using different approaches in both the experiments and the tests, we evaluate the generated data comprehensively and avoid misjudgment during data generation when obtaining the final results. Our main contributions in this paper are as follows:

• We briefly review the characteristics of the vibration signal data with different approaches in fault detection applications. These approaches are mainly used to verify that the generated data are of high quality and suitable for fault detection applications.
• We introduce GAN algorithms to generate a broken machine signal that has high quality and is similar to the original signal.
• The main contribution of this paper is the set of different approaches used to evaluate the generated data and to guarantee similarity with the original data. These approaches include different preprocessing processes and a variety of machine learning techniques in pattern recognition.
The remainder of this paper is organized as follows. Section 2 describes the fault diagnosis input data and the Fast Fourier Transform (FFT), which provides the refined signal for further data processing and classification. Section 3 provides the working scheme with real data, which includes the full signal, principal component analysis (PCA) transform, and statistical analysis (SA). We also introduce different machine learning techniques for each approach and perform a comprehensive analysis. We discuss data generation using GAN and compare it with the original data in Section 4. With various preprocessing approaches and AI models, the generated data is evaluated carefully and shows high similarity to the real data. Conclusions for data augmentation using GAN are presented in Section 5.

Gearbox Data
The vibration signal used in this study is collected from the GPS and then uploaded to OpenEI data storage [22]. The GPS simulates a real gearbox device and has configurations with different options and working behaviors. Based on these configurations, the GPS can simulate gearbox working behavior, condition monitoring, and vibration data for further study.
The basic GPS includes replaceable parts that are combined for gearbox operation simulation as follows:
• one shaft test gearbox with two parallel stages;
• different torsion and radial loadings;
• replaceable gears with large spaces for additional devices;
• parallel gearbox that can support rolling element bearings or sleeve bearings;
• option for installing intentional error gearing to study the changes in the vibration signature;
• modular design that keeps the simulation easy to understand and doable;
• different mounting locations; and
• gearbox fault simulation and diagnosis methods.
Based on these characteristics, the GPS can be customized to handle and examine heavy loads. GPS is also designed with a large reserve space so that the users can place, set up, and install new monitoring devices. In this paper, we collected data in the four directions: g_x, g_y, g_z, and g_t. In Figure 2, the GPS is set at 50% of the load condition, and we record the vibration signal under normal conditions and the broken tooth condition. The GPS data includes 450 s of normal machine vibration signal and 400 s of broken machine vibration signal.

Fast Fourier Transform
In the first phase, we extract significant characteristics of the input signal by performing feature extraction. These characteristics vary from signal to signal and are statistical features, domain-specific features, or both. As shown in Figure 2, the vibration data are collected and stored as time series, and we transform them into the frequency domain using the Fast Fourier Transform (FFT) (Figure 3). The main purpose of this study is to generate and evaluate broken-machine data, which is limited in the experiments. Therefore, we need to analyze both the original and generated data with different approaches and AI models that can affect the final results.
Let $x_0, \ldots, x_{N-1}$ be complex numbers. We can calculate the Discrete Fourier Transform (DFT) of this time-domain sequence by using the following formula:

$$X_k = \sum_{n=0}^{N-1} x_n e^{-i 2\pi k n / N}, \quad k = 0, \ldots, N-1, \tag{1}$$

where $e^{i 2\pi / N}$ is a primitive N-th root of 1 [23]. For the FFT, the formula changes into the following expressions:

$$X_k = \sum_{m=0}^{N/2-1} x_{2m} e^{-i 2\pi k (2m) / N} + \sum_{m=0}^{N/2-1} x_{2m+1} e^{-i 2\pi k (2m+1) / N} \tag{2}$$

$$X_k = \sum_{m=0}^{N/2-1} x_{2m} e^{-i 2\pi k m / (N/2)} + e^{-i 2\pi k / N} \sum_{m=0}^{N/2-1} x_{2m+1} e^{-i 2\pi k m / (N/2)} \tag{3}$$

By transforming Formula (1) into Formulas (2) and (3), the Fourier Transform is split into two smaller transforms over the even-numbered and odd-numbered values. At this point, we have not decreased the computational complexity, which remains $O(N^2)$ because each of the two terms requires $(N/2) \cdot N$ computations. With the symmetrical characteristics of Formulas (2) and (3), we can reduce the number of computations. The value of k is defined as $0 \le k < N$, whereas the summation index m satisfies $0 \le m < M \equiv N/2$; each sub-problem requires half the computation of the original one. Applying this split recursively reduces the total computation from $O(N^2)$ to $O(N \log N)$. The FFT therefore requires less computation than the original DFT, which makes it suitable for real-time applications. In this study, we focus on commercial applications in the industrial environment, which require both high accuracy and real-time response. For these reasons, the FFT is a better choice than the DFT. An advantage of the FFT is that it exposes more significant features in the frequency domain for classifying the normal and broken machine signals.
Another advantage of the FFT transform is that the generated signal is evaluated indirectly, which leads to better performance analysis. We can consider FFT as one of the most effective methods used to extract the vibration input pattern [5]. Therefore, we use FFT as the basic processing method for further research. Based on the FFT input signal, we propose three approaches: full analysis, PCA transform, and statistical analysis. In the next section, we will discuss different approaches and AI models that can be applied to the fault diagnosis results.
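The even/odd split behind the FFT can be illustrated with a short sketch. The code below is not the implementation used in this study; it is a minimal standard-library example, with hypothetical function names `dft` and `fft`, that evaluates the direct DFT sum of Formula (1) and a recursive radix-2 FFT, and checks that both give the same spectrum on a toy signal.

```python
import cmath


def dft(x):
    """Evaluate the DFT sum directly; the nested loops cost O(N^2)."""
    n_samples = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_samples)
            for n in range(n_samples))
        for k in range(n_samples)
    ]


def fft(x):
    """Recursive radix-2 FFT; the input length must be a power of two."""
    n_samples = len(x)
    if n_samples == 0 or n_samples & (n_samples - 1):
        raise ValueError("input length must be a power of two")
    if n_samples == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # transform of x_0, x_2, ...
    odd = fft(x[1::2])   # transform of x_1, x_3, ...
    half = n_samples // 2
    out = [0j] * n_samples
    for k in range(half):
        twiddle = cmath.exp(-2j * cmath.pi * k / n_samples) * odd[k]
        out[k] = even[k] + twiddle
        # The twiddle factor changes sign for index k + N/2.
        out[k + half] = even[k] - twiddle
    return out


# Toy signal: a sine wave with two cycles in eight samples.
signal = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
slow = dft(signal)
fast = fft(signal)
max_diff = max(abs(a - b) for a, b in zip(slow, fast))
magnitudes = [abs(value) for value in fast]
```

For this toy input, the energy concentrates in frequency bins 2 and 6, and `max_diff` stays at the level of floating-point rounding. In practice, an optimized library routine would be used instead of this recursive sketch.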

Fault Diagnosis with Original Data
In general, data augmentation is mostly used in image processing [15] because human judgement makes it easy to evaluate whether the generated data is similar to the original data. In contrast, it is difficult to evaluate data augmentation in data analysis because the evaluation depends on the different characteristics of the data. Therefore, to evaluate the generated data, we require a comprehensive test under various conditions. This section introduces three fault detection approaches for vibration signals, which can also be applied to the generated signal. The overall fault diagnosis diagram with different data processing and multiple AI models is shown in Figure 4. The vibration data in each direction is windowed and transformed into the frequency domain for further processing. Then, we provide three different methods for feature extraction: full analysis, PCA transform, and statistical analysis. After that, all four signals are combined and fed to different AI algorithms, including artificial neural network (ANN), support vector machine (SVM), and K-means clustering. All the approaches introduced in this section form the basis for further analysis and evaluation. Figure 5 introduces the training and testing process using both the data generated by GAN and the real data. The input data is divided into three groups: the training data comprises 70% of the total data, the testing data comprises 30% of the total data, and the GAN data comprises 11.75% of the fault machine data in the training set (8.2% of the total fault machine data).
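The windowing step and the data proportions described above can be sketched as follows. This is only an illustration under stated assumptions: the window size, the chronological (unshuffled) split, and the helper names `window_signal` and `split_fault_windows` are hypothetical and are not specified in the text.

```python
def window_signal(samples, window_size):
    """Cut a 1-D recording into consecutive, non-overlapping windows."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    count = len(samples) // window_size  # an incomplete last window is dropped
    return [samples[i * window_size:(i + 1) * window_size] for i in range(count)]


def split_fault_windows(windows, train_fraction=0.70, gan_fraction=0.1175):
    """Return (train, test, gan_subset) using the proportions from the text."""
    n_train = round(len(windows) * train_fraction)
    train, test = windows[:n_train], windows[n_train:]
    n_gan = round(len(train) * gan_fraction)  # share of the fault training data
    return train, test, train[:n_gan]


# Toy recording: 400 windows of 8 samples each (hypothetical sizes).
recording = [float(i % 8) for i in range(400 * 8)]
fault_windows = window_signal(recording, window_size=8)
train, test, gan_subset = split_fault_windows(fault_windows)
gan_share = len(gan_subset) / len(fault_windows)
```

With 400 fault windows, 11.75% of the 280 training windows rounds to 33 windows, so `gan_share` is close to the 8.2% of the total fault data quoted above; the small difference comes from rounding to whole windows.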

Full Analysis and PCA Transform
In this approach, the FFT of the signal is fed directly into the AI model to determine whether or not the machine has errors. The AI models analyzed in this study are artificial neural network (ANN) [25], K-means clustering [26], and support vector machines (SVM) [27]. These models have proved to be robust for classification applications. Moreover, ANN, K-means clustering, and SVM are very flexible when dealing with different data types and structures. However, K-means clustering and SVM-based models are not effective when applied to high-dimension input data [28]. Therefore, we also consider PCA to reduce the dimensions of the input data. Figure 6 shows normal and broken sample data of the signal after the PCA process, which transforms high-dimension input data into low-dimension input data for the K-means clustering and SVM algorithms. The vibration signals, in the form of both the full FFT and the PCA transform, are fed into the ANN networks. Because the inputs are different, we consider two ANN structures for the machine fault detection application. The first ANN model, for the full FFT of the signal, has a large structure because the input contains all four signals in the g_x, g_y, g_z, and g_t directions. This ANN model has an input shape of 200 × 4, 200 input neurons, 100 hidden units, and 2 output neurons with the Softmax activation function for classification. The Softmax function is defined as follows:

$$\sigma(\mathbf{z})_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}},$$

where $i = 1, \ldots, K$ and $\mathbf{z} = (z_1, \ldots, z_K)$. The predicted probability that an output of the neural network is classified as class j (normal or broken machine) is:

$$P(y = j \mid \mathbf{x}) = \frac{e^{\mathbf{x}^{T}\mathbf{w}_j}}{\sum_{k=1}^{K} e^{\mathbf{x}^{T}\mathbf{w}_k}},$$

where $\mathbf{x}$ is the input vector; $\mathbf{w}$ is the weighting vector of the output neural network; and $\mathbf{x}^{T}\mathbf{w}$ is the inner product of $\mathbf{x}$ and $\mathbf{w}$. The PCA transform reduces the FFT of the signal into a 20 × 1 vector, which is much smaller than the original signal.
Therefore, the ANN has a small structure with 64 input neurons, 32 hidden units, and 2 output neurons with the Softmax activation function for classification. For comparison, we apply the same optimization to both ANN structures. Xavier initialization [29] is applied to set up the weights of all ANN cells, and the Dropout technique [30] with a keeping probability of 0.7 is also used to improve the ANN performance. We use the Leaky Rectified Linear Unit (Leaky ReLU) [31] as the activation function for both the input and hidden layers, which is defined as follows:

$$f(x) = \begin{cases} x, & x > 0 \\ \alpha x, & x \le 0 \end{cases}$$

where $\alpha$ is a small positive slope. Sparse categorical cross-entropy is used as the loss function for our ANN models, and Adaptive Moment Estimation (ADAM) is selected as the optimization algorithm with 1000 epochs. Table 1 shows the accuracy of ANN, K-means clustering, and SVM based on the two types of input data (the full FFT of the signal and the PCA-transformed signal). The accuracy of ANN, K-means clustering, and SVM reaches 100% in both cases. Compared with the acoustic signal in Reference [4], we achieve higher accuracy with a simpler data collection method. However, because of their high training and testing speed, K-means clustering and SVM are more suitable for real-time scenarios. Table 2 shows the test results with a small broken machine signal training set (40 s of broken signal, only 10% of the original amount). When broken machine signal data are scarce, the accuracy drops significantly compared with the original condition. Table 3 shows the results with traditional data augmentation, including jittering, scaling, and slicing. The accuracy improves only slightly compared with Table 2, which shows that this method cannot significantly improve the performance of the model.
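The two activation functions used by the ANN models in this subsection can be checked numerically with a few lines of code. The sketch below is a standalone illustration, not the training code of this study; the slope `alpha=0.01` and the example scores in `logits` are assumed values chosen only for the demonstration.

```python
import math


def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: identity for positive inputs, a small slope otherwise."""
    return x if x > 0 else alpha * x


def softmax(z):
    """Softmax over a list of scores, shifted by max(z) for numerical stability."""
    shift = max(z)
    exps = [math.exp(v - shift) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical output scores for the two classes (normal, broken).
logits = [2.0, 0.5]
probs = softmax(logits)
hidden = [leaky_relu(v) for v in (-2.0, 0.0, 3.0)]
```

The Softmax output always sums to 1, so the two output neurons can be read as the probabilities of the normal and broken classes, and the class with the larger score receives the larger probability.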

Statistical Analysis
The third approach is statistical analysis, which uses features extracted from the FFT of the signal. The FFT of the signal is summarized by eight statistical features: mean, median, min, max, kurtosis, skewness, standard deviation, and range. These features are obtained along all four axes: g_x, g_y, g_z, and g_t. Each data sample therefore contains 32 elements. Because of the low input data dimension, this approach is suitable for ANN, K-means clustering, and SVM. The statistical approach not only provides another efficient method for fault classification but also plays an important role in evaluating the generated data. Using statistical analysis, we extract the overall characteristics of the vibration signal. Based on these characteristics, we can compare the generated data with the original data. If the generated data achieves high accuracy in the statistical analysis, we can conclude that it has a high level of similarity with the original data and can be used for the training process. Before feeding data into the AI models, we should understand the variations in the statistical characteristics of the FFT of the signal.
The statistical relationship between the normal and broken machine signals can be considered a reliable basis for classification, as shown in Figure 7. The statistical elements of the normal and broken signals are similar in the g_y, g_z, and g_t directions, but they differ in the g_x direction. The largest differences are seen in the maximum value (4) and kurtosis value (5) of the g_x signal. As shown in Figure 8, if we break down the relationship between them, we can easily classify the normal and broken machine signals. The performance of ANN, K-means clustering, and SVM based on statistical analysis is shown in Table 4. The ANN algorithm reaches over 95% accuracy in three scenarios, which indicates that robust classification features were adopted. In contrast, with real data, K-means clustering and SVM perform worse than with the full FFT analysis approach. However, this characteristic can be used to analyze how the generated data behaves under an unstable classification method. This approach needs to be considered when the generated data is used.
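The 32-element statistical sample described in this subsection can be assembled as in the following sketch. It is a minimal standard-library illustration; the function names are hypothetical, and the use of population moments with non-excess (Pearson) kurtosis is an assumption, since the text does not state which estimator was used. The feature order follows the list in the text, so the maximum and kurtosis values are the fourth and fifth elements for each axis.

```python
import math


def stat_features(spectrum):
    """Return [mean, median, min, max, kurtosis, skewness, std, range]."""
    n = len(spectrum)
    ordered = sorted(spectrum)
    mean = sum(spectrum) / n
    middle = n // 2
    median = ordered[middle] if n % 2 else (ordered[middle - 1] + ordered[middle]) / 2
    centered = [v - mean for v in spectrum]
    m2 = sum(c ** 2 for c in centered) / n  # population variance
    m3 = sum(c ** 3 for c in centered) / n
    m4 = sum(c ** 4 for c in centered) / n
    kurtosis = m4 / m2 ** 2 if m2 > 0 else 0.0    # non-excess kurtosis (assumption)
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return [mean, median, ordered[0], ordered[-1], kurtosis, skewness,
            math.sqrt(m2), ordered[-1] - ordered[0]]


def sample_vector(spectra_by_axis):
    """Concatenate the eight features of the four axes into one 32-element sample."""
    vector = []
    for axis in ("g_x", "g_y", "g_z", "g_t"):
        vector.extend(stat_features(spectra_by_axis[axis]))
    return vector


# Toy magnitude spectra for the four directions (hypothetical values).
toy_spectra = {
    "g_x": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g_y": [0.0, 0.0, 1.0, 0.0],
    "g_z": [2.0, 2.0, 2.0, 2.0],
    "g_t": [1.0, 3.0],
}
features = sample_vector(toy_spectra)
```

A constant spectrum such as the toy `g_z` input has zero variance, so the sketch returns 0.0 for kurtosis and skewness instead of dividing by zero.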

Data Generation in Machine Fault Detection
In the previous section, we concluded that a lack of fault vibration signals in the training process decreases the accuracy of the predictive models, regardless of the kind of data processing. To improve the accuracy of the predictive model, we introduce data augmentation using GAN to generate broken data similar to the original data. Data generation is based on the original vibration signal of the broken machine, which comprises 47 s of vibration (11.75% of the broken-machine signal). The generated signal is analyzed using the approaches described in Section 3. With different approaches and AI models, we can verify with high confidence that the generated data is similar to the original data and can therefore help to improve the predictive model.

GAN
GAN is a technique designed by Ian Goodfellow to generate new data from a fixed training dataset. In this technique, the discriminative and the generative neural networks compete in a zero-sum game to improve themselves. Using a limited training set, the GAN techniques learn by themselves to generate data with the same structure. The most well-known GAN applications are in computer vision, where a model trained on a set of photographs generates new output with characteristics that look realistic to human observers. Although early studies focused mostly on unsupervised learning, GANs were later used in semi-supervised learning and reinforcement learning. The core idea is to "indirectly" train a generator using the discriminator. Based on this idea, the generator is trained to fool the discriminator rather than to minimize the total loss of the function, which gives it the ability to generate new data in a different manner.
GAN was first proposed in Reference [33] to generate images similar to the original image. In a practical situation, GAN can be considered as a competition between a discriminative network D and a generative network G. With noise z drawn from a normal distribution or random noise, which ranges from 0 to 1, as the input, the generative network G generates the "fake" data G(z), whose distribution $p_g$ is close to the data distribution $p_{data}$. The role of the discriminative network D is to distinguish the true data sample $x \sim p_{data}(x)$ from the generated sample $G(z) \sim p_g(G(z))$. In the original GAN, this adversarial training process was formulated as follows:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))],$$

where $p_z$ is the distribution of the input noise z. The adversarial procedure is illustrated in Figure 9. Most existing GANs perform a similar adversarial procedure with different adversarial objective functions. In this paper, the GAN algorithm is used to generate the broken machine data signal; therefore, only broken training data is fed into the GAN as real data. The generator generates the broken data from random noise, which ranges from 0 to 1 with a normal distribution, to guarantee variety in the output data.
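The adversarial objective of the original GAN can be made concrete by evaluating it on a few discriminator outputs. The sketch below is not the training code of this study: it only computes the batch estimates of the two terms of the value function for hypothetical discriminator scores, with function names chosen for illustration.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Negative batch estimate of V(D, G); the discriminator minimizes this."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return -(real_term + fake_term)


def generator_loss(d_fake):
    """Minimax generator loss: batch mean of log(1 - D(G(z)))."""
    return sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)


# D(x) on real samples and D(G(z)) on generated samples (hypothetical scores).
undecided = discriminator_loss([0.5, 0.5], [0.5, 0.5])
confident = discriminator_loss([0.9, 0.9], [0.1, 0.1])
generator_caught = generator_loss([0.1, 0.1])
generator_fooling = generator_loss([0.9, 0.9])
```

When the discriminator outputs 0.5 everywhere, its loss equals 2 log 2, the value reached when the generated and real distributions cannot be told apart. The generator loss decreases as the discriminator assigns higher scores to generated samples, which is the sense in which the generator is trained to fool the discriminator.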
The generator G and discriminator D have the ANN structure and are implemented as shown in Tables 5 and 6, respectively. The generator G has a complex structure to generate a high-quality broken-machine signal, whereas the discriminator D has a simpler structure for classification. Note that the output of the generator G has the same shape as the input, whereas the output of the discriminator D is a single value between 0 and 1 because the sigmoid activation function is applied for classification.
Table 6. ANN structure of the discriminator D.

Data Generating
The data generated using GAN comprises 500 s of vibration signals in the four directions g_x, g_y, g_z, and g_t. The generated broken signal for the individual g_x direction is shown in Figure 10. The preprocessing procedure for the generated signal is the same as that for the original signal; therefore, we can evaluate its quality using the previous fault detection methods. Note that we generate the signal only for the broken machine because broken-machine samples are assumed to be scarcer than normal-machine samples in the original data. The FFT of the g_x signal is calculated and shown in Figure 11.

Fault Diagnosis Using Generated Signal
Based on both the original and generated data, we evaluated the final results using the three approaches described in Section 3. By keeping the same testing data, we simulated the actual situation of the AI model in real life. In contrast, the training data is a mix of the real data and the generated data. The proportion of real data in this mix is 100%, 80%, 60%, 40%, 20%, and 0%. With this testing condition, different signal processing approaches, and different AI models, the generated data is evaluated comprehensively to determine whether it satisfies the data augmentation requirement in a fault machine detection application.
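The mixing of real and generated fault samples at the stated proportions can be sketched as follows. The exact sampling procedure is not described in the text, so the random selection, the fixed total size, and the helper name `mix_fault_data` are assumptions made for illustration.

```python
import random


def mix_fault_data(real, generated, real_fraction, total, seed=0):
    """Draw `total` fault samples, a `real_fraction` share of them from `real`."""
    if not 0.0 <= real_fraction <= 1.0:
        raise ValueError("real_fraction must lie in [0, 1]")
    n_real = round(total * real_fraction)
    n_generated = total - n_real
    if n_real > len(real) or n_generated > len(generated):
        raise ValueError("not enough samples for the requested mix")
    rng = random.Random(seed)
    mixed = rng.sample(real, n_real) + rng.sample(generated, n_generated)
    rng.shuffle(mixed)
    return mixed


# Tagged placeholders stand in for real and GAN-generated fault windows.
real_fault = [("real", i) for i in range(100)]
generated_fault = [("generated", i) for i in range(100)]
counts = {}
for fraction in (1.0, 0.8, 0.6, 0.4, 0.2, 0.0):
    batch = mix_fault_data(real_fault, generated_fault, fraction, total=50)
    counts[fraction] = sum(1 for tag, _ in batch if tag == "real")
```

Keeping the total size fixed while varying only the real fraction isolates the effect of replacing real samples with generated ones; the test set is left untouched, as described above.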
To verify the similarity between the original data and the generated data, we conducted the Kolmogorov-Smirnov test using Python version 3.7.4 [34]. The statistic value is 0.059 and the p-value is 0.336, so we cannot reject the hypothesis that the signals are drawn from the same distribution. Figure 12 illustrates the relationship among all statistical elements of an original broken machine signal and a generated signal. The generated signal has a wider distribution, with taller boxes, which reflects the variation introduced during the generation phase. In Figure 13, by further breaking down the relationship between the maximum and kurtosis values of the original and generated broken FFT signals, we can see high variation in the generated values. The linear regression test [35] is conducted to verify the correlation between the maximum and kurtosis values. The coefficient of determination is calculated and shown in Table 7. With the high coefficient of determination in all directions g_x, g_y, g_z, and g_t, we can conclude that the maximum and kurtosis values are highly correlated in the fault signal. Compared with the original signal, the generated signal has a lower coefficient of determination, indicating higher variation and weaker dependence between the two values. This variation is good for the training phase because it exposes the machine learning models to different input patterns.
Table 8 shows the accuracies of ANN, K-means clustering, and SVM with different approaches based on the FFT of the vibration signal. With the full FFT of the signal, data augmentation with more than 40% original data worked well under the testing condition, with 100% accuracy (ANN). This result is better than that of the 0% original dataset, which has an accuracy of 89.4% (ANN), 60.2% (K-means clustering), and 93.1% (SVM).
The classification algorithms worked better when 20% original data was used, achieving an accuracy of 95.1% (ANN), 61.3% (K-means clustering), and 95.3% (SVM). When the PCA transform is applied to the FFT signal, data augmentation with more than 40% original data worked well under the testing condition, with 98.1% accuracy (ANN). This result is better than that of the 0% original dataset, which has an accuracy of 87.2% (ANN), 55.3% (K-means clustering), and 91.0% (SVM). The classification algorithms again worked better when 20% original data was used, achieving an accuracy of 93.1% (ANN), 65.2% (K-means clustering), and 94.7% (SVM). In the last approach, the statistical analysis of the FFT of the signal, the ANN provides almost 100% accuracy with more than 20% original data. The accuracy of the ANN algorithm decreases to 79.41% when there is 0% original data; this is still much better than the traditional augmentation method in Section 3. In this approach, K-means clustering and SVM perform poorly compared with ANN. The SVM is accurate and stable across the different test sets, so it is strong enough to be considered in real-life scenarios. In contrast, K-means clustering is not stable and needs higher accuracy before it can be considered a reliable classification approach. The results in Table 8 indicate that the generated data has high similarity to the original data in the FFT of the signal and can replace it in the training process. Data augmentation is useful in the training process when the number of fault samples is so small that the model cannot be trained effectively. This characteristic is well suited to machine fault detection because faulty machines are scarce at the start of the implementation phase. With improved GANs, we can generate fault data with high similarity to the original data for use in machine fault diagnosis.
Based on the various experiments and evaluations, we conclude that the generated data has high similarity with the original data in both the time domain and the frequency domain. The generated data significantly improved training performance by supplying a large machine fault sample. Although we could generate high-quality input data, the original fault data is still necessary for testing and for part of the training.
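The two similarity checks used in this section, the two-sample Kolmogorov-Smirnov statistic and the coefficient of determination of a simple linear regression, can be sketched with the standard library as shown below. This is an illustration with toy numbers, not the analysis script of this study, and it omits the p-value; a library routine such as SciPy's `scipy.stats.ks_2samp` returns both the statistic and the p-value.

```python
import bisect


def ks_statistic(sample_a, sample_b):
    """Largest gap between the two empirical cumulative distribution functions."""
    a_sorted, b_sorted = sorted(sample_a), sorted(sample_b)
    largest = 0.0
    for point in sorted(set(a_sorted + b_sorted)):
        cdf_a = bisect.bisect_right(a_sorted, point) / len(a_sorted)
        cdf_b = bisect.bisect_right(b_sorted, point) / len(b_sorted)
        largest = max(largest, abs(cdf_a - cdf_b))
    return largest


def r_squared(x, y):
    """Coefficient of determination of the least-squares line y = a*x + b."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Toy values standing in for original and generated feature samples.
same = ks_statistic([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
shifted = ks_statistic([1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 5.0, 6.0])
disjoint = ks_statistic([1.0, 2.0, 3.0], [10.0, 11.0, 12.0])
perfect_fit = r_squared([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
weak_fit = r_squared([1.0, 2.0, 3.0, 4.0], [1.0, -1.0, 1.0, -1.0])
```

A small statistic, like the 0.059 reported above, means the two empirical distributions almost overlap. A coefficient of determination close to 1 corresponds to a strong linear dependence between two quantities such as the maximum and kurtosis values, and a lower value corresponds to a weaker one.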

Conclusions
This study proposed a novel method to generate fault machine vibration signal data, thus enhancing model performance when only a limited fault dataset is available for training. In this study, our generated fault data improves the accuracy of the model to 93.1% with 0% of the original fault machine dataset (full signal analysis) and 99.41% with 20% of the original fault machine dataset (statistical analysis). After testing, we conclude that the generated data has high similarity to the original data and significantly improves the accuracy of the model when real fault machine data in the training dataset is limited.
However, the data augmentation method using GAN still has a limitation, since high variety can reduce the quality of the output signal. Another drawback of GAN is its instability during the training process, in which the balance between the discriminator and the generator needs to be maintained. Therefore, the architectures of both the generator and the discriminator should be considered carefully, and the output of GAN has to be carefully evaluated.
Given these disadvantages of GAN, we plan to investigate other generative AI models for data augmentation and compare them with the current scheme. Another option is to provide a model that generates fault data for different machines using only the fault data of one machine and the normal data of the other machines.