Predictive Maintenance of Machinery with Rotating Parts Using Convolutional Neural Networks

Apeiranthitis, Stamatis; Zacharia, Paraskevi; Chatzopoulos, Avraam; Papoutsidakis, Michail

doi:10.3390/electronics13020460

Department of Industrial Design and Production Engineering, University of West Attica, 12241 Egaleo, Greece

* Author to whom correspondence should be addressed.

Electronics 2024, 13(2), 460; https://doi.org/10.3390/electronics13020460

Submission received: 3 October 2023 / Revised: 19 December 2023 / Accepted: 17 January 2024 / Published: 22 January 2024

(This article belongs to the Special Issue Intelligent Manufacturing Systems and Applications in Industry 4.0)

Abstract

All kinds of vessels carry dozens of complex machines with rotating parts and electric motors that operate continuously in harsh environments with excess temperature, humidity, vibration, fatigue, and load. A breakdown or malfunction in one of these machines can significantly impact a vessel’s operation and safety and, consequently, the safety of the crew and the environment. To maintain operational efficiency and seaworthiness, the shipping industry invests substantial resources in preventive maintenance and repairs. This study presents the economic and technical benefits of predictive maintenance over traditional preventive maintenance and repair-by-replacement approaches in the maritime domain. By leveraging modern technology and artificial intelligence, we can analyze the operating conditions of machinery by obtaining measurements either from sensors permanently installed on the machinery or from portable measuring instruments. This facilitates the early identification of potential damage, thereby enabling efficient planning of future maintenance and repair work. In this paper, we propose and develop a convolutional neural network that is fed with raw vibration measurements acquired in a laboratory environment from the ball bearings of a motor. Then, we investigate whether the proposed network can accurately detect the functional state of ball bearings and categorize any possible failures present, contributing to improved maintenance practices in the shipping industry.

Keywords:

predictive maintenance; convolutional neural network; deep learning; vibration

1. Introduction

According to the World Economic Forum, 90% of global trade relies on merchant vessels for transportation by sea [1]. According to the Statista Research Department, in 2022 the world merchant fleet consisted of 58,590 vessels of all types and sizes, such as bulkers, tankers, container ships, reefers, RoRo vessels, chemical tankers, and LPG and LNG carriers [2]. The ultimate goal of shipping companies is to operate the ships under their management with the highest possible freight rates and the lowest operating and maintenance costs. Considering that these vessels carry an array of advanced machinery, a malfunction in any of these systems not only has the potential to delay the vessel’s expected arrival time but could also pose a significant threat to the safety of both the vessel and its crew, with possible adverse effects on the environment. To keep the vessels fully functional and seaworthy, the International Maritime Organization (IMO) has established and enforced the International Safety Management (ISM) Code. Mandated by this code, the shipping industry has implemented corrective and proactive maintenance of all major machinery onboard shipping vessels.

In proactive maintenance, each piece of machinery is immobilized and inspected in a predetermined calendar or operating period, while major parts are proactively replaced regardless of their actual condition. This entails the risk that proactively replaced parts may still be functional, meaning that their replacement has resulted in unnecessary waste of resources and money. In addition, it is still possible for a machine to break down before its scheduled maintenance, and therefore, unscheduled repair works must be executed as soon as possible. It is believed that for proactive maintenance to be properly applied, vessels must always procure and carry spare parts for all major machinery. This leads to increased overall maintenance costs and the commitment of significant financial resources to procuring and storing the spare parts.

With the development of technology over time, new maintenance methods based on the real condition of machinery have appeared and are gradually, though still slowly, being applied in the marine industry. The overarching objective is to draw useful conclusions about the machinery’s operational status and, finally, to implement predictive maintenance (PdM). The ultimate goal of PdM is to optimize performance and efficiency, minimizing both the downtime and maintenance costs of machinery.

To examine how predictive maintenance (PdM) can help optimize the operation of a vessel’s rotating machinery, in this paper we analyze how computer and data technology, and more specifically convolutional neural networks (CNNs), which are part of deep learning (DL) and artificial intelligence (AI), can detect faults in machines early by analyzing their fundamental operational features. Since machines with rotating parts vary, each having different operational characteristics, the focus of this work is on identifying and classifying issues related to ball bearings and their internal parts, like balls and rings. This involves the analysis of captured vibration patterns associated with these components. Bearings and ball joints, being essential elements of machines with rotating parts, are among the components most prone to faults.

To this end, we propose a simple, 1D convolutional neural network that is trained with raw vibration data from the Case Western Reserve University (CWRU) Bearing Data Center. Then, the proposed network is fed with new, unknown raw vibration signals to check whether it succeeds in detecting and categorizing the status of the ball bearings. The CWRU motor vibration dataset has been used by many researchers and is provided by the Case Western Reserve University Bearing Data Center for research purposes [3].

This paper is organized as follows: Section 2 presents some of the most interesting studies based on a review of the recent literature; Section 3 provides a concise overview of the two primary machinery maintenance methods commonly employed in the shipping industry, preventive maintenance and predictive maintenance, as well as the principles of CNNs; Section 4 analyzes the proposed CNN and presents the training, evaluation, and, subsequently, the testing dataset and the results of the network’s response. Finally, conclusions and directions for further research are drawn in Section 5.

2. Literature Review

Nowadays, there is increasing research interest in using artificial intelligence techniques for fault diagnosis. Some representative papers are presented below.

Han et al. [4] proposed a combination of genetic algorithms (GAs) and neural networks (NNs) for diagnosing faults in induction motors based on feature recognition using motor vibration. A GA was employed as a feature selection and classifier optimization tool. The proposed system’s performance was evaluated in a real-life test. It was proven that the proposed system has high effectiveness, a compact structure, and is promising for real applications.

Chen et al. [5] used a CNN for the detection and categorization of faults in gearboxes. The network was trained and tested using measurements taken from a gearbox at five different operating frequencies, each subjected to four different load profiles. The proposed model demonstrated a classification accuracy ranging from 89.46% to 98.35%.

Janssens et al. [6] used feature learning in the form of a CNN model applied to raw amplitudes of the frequency spectrum of vibration data. The network learned transformations on the data that resulted in a better representation of the data for an eventual classification task in the output layer. The results showed that employing the proposed CNN model led to improved outcomes in detecting various faults, including outer-raceway faults and diverse levels of lubricant degradation. In contrast to a traditional manual feature extraction approach, the CNN-based method demonstrated an overall enhancement in classification accuracy, without necessitating an extensive reliance on domain knowledge for fault detection.

Guo et al. [7] proposed a new CNN trained for the detection of faults in ball bearings using an enhanced adaptive learning rate hierarchical algorithm (adaptive deep CNN). The primary focus was on employing this network for the diagnosis of bearing faults and the quantification of their severity. To assess the efficacy of the proposed approach, an experimental study was conducted, utilizing bearing fault data samples acquired from a dedicated test rig. The method demonstrated satisfactory performance in terms of both recognizing fault patterns and evaluating fault sizes.

Zhang et al. [8] presented a two-dimensional convolutional model for the detection of faults in ball bearings. For their study, they used data from the Case Western Reserve University Bearing Data Center (CWRU). Initially, they transformed one-dimensional time series of raw vibration measurements, each consisting of 2400 data points, into two-dimensional images of size 60 × 40. The model was trained and tested with two datasets, A and B. Their research results indicated that the proposed model successfully classified 99.7% of the cases in dataset A, while this percentage decreased by approximately 1.8% when dataset B was used. According to the authors, this suggests the need for the network to be trained on large datasets. The model also achieved fairly high accuracy in a noisy environment and when the working load was changed.
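The reshaping step described for Zhang et al. [8] can be illustrated with a minimal Python sketch. Row-major ordering and the helper name `to_image` are assumptions made here for illustration only; the cited work may arrange the samples differently.

```python
def to_image(signal, rows=60, cols=40):
    """Reshape a 1D sequence into a rows x cols nested list (row-major order)."""
    if len(signal) != rows * cols:
        raise ValueError(f"expected {rows * cols} samples, got {len(signal)}")
    return [list(signal[r * cols:(r + 1) * cols]) for r in range(rows)]

# Synthetic stand-in for one segment of 2400 raw vibration samples.
segment = [float(i) for i in range(2400)]
image = to_image(segment)
print(len(image), len(image[0]))
```

Each row of the resulting image holds 40 consecutive samples, so neighbouring pixels within a row are adjacent in time, while vertically adjacent pixels are 40 samples apart.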

Guo et al. [9] introduced a novel diagnostic approach where a CNN was employed to directly categorize a continuous wavelet transform scalogram (CWTS), a time–frequency domain transformation of the primary signal that encapsulates the majority of information within vibration signals. The CNN was trained for fault diagnosis utilizing the CWTS as the input. Several experiments were executed on the rotor experimental platform and the outcomes affirmed the capability of the proposed method to conduct accurate fault diagnosis. To assess the method’s universality, the trained CNN was additionally applied to conduct fault diagnosis in alternative rotor equipment, yielding favorable results.

In the work of Wu et al. [10], a fault diagnosis model for gearboxes utilizing a one-dimensional CNN was introduced. This model was designed to learn features and perform the diagnostic classification directly from the original vibration signals. Compared with three conventional approaches based on signal decomposition, the one-dimensional convolutional neural network (1-DCNN) model exhibited superior accuracy in diagnosing faults in fixed-shaft gearboxes and planetary gearboxes.

Abdeljaber et al. [11] presented a one-dimensional convolutional network for the detection, classification, and assessment of the severity of ball bearing faults. The proposed network was trained and then tested on scenarios involving multiple ball bearing faults simultaneously. In this way, the authors demonstrated the accuracy and validity that simple one-dimensional convolutional networks can achieve without requiring significant computational power and extensive datasets for training. The experimental results demonstrated that the proposed approach could achieve a high level of accuracy for damage detection, localization, and quantification.

Ma et al. [12] investigated fault prediction using a one-dimensional CNN comprising only two layers. They processed the input signal by applying a wavelet packet transform to feed the network with frequency domain data. To validate the effectiveness of the proposed network, the authors used data from Case Western Reserve University (CWRU), introducing white Gaussian noise ranging from 10% to 100%. The results showed that the proposed network exhibited a significantly better noise response (5–58%) compared to existing classification models, improved learning (up to 44%), and significantly lower computational power requirements, reaching 88.5%.

Zhao et al. [13] performed a comprehensive evaluation of four deep learning models: multi-layer perceptron (MLP), auto-encoder (AE), convolutional neural network (CNN), and recurrent neural network (RNN). They utilized nine publicly available datasets containing vibration measurements from ball bearings and evaluated the models’ ability to draw meaningful conclusions based on the included data. Their work focused on evaluating DL-based intelligent diagnosis algorithms from different perspectives and providing benchmark accuracy (a lower bound) to avoid useless improvement.

Souza et al. [14] proposed the application of convolutional neural networks for developing a prognostic maintenance model that detects and classifies faults in machines with rotating parts and provides recommendations on when maintenance actions should be taken. The proposed model demonstrated high accuracy in the classification of failures using the MaFaulDa database. The proposed method showed great potential for application in the diagnosis and classification of failures of rotating machines in industrial environments with different severity levels, even with the usage of only one vibration sensor and features.

A study by Kiranyaz et al. [15] presented a comprehensive review of the general architecture and principles of 1D CNNs along with their major engineering and biomedical applications, focusing especially on recent progress in this field. It has already become apparent that AI will further assist or perhaps even replace humans in those complex tasks that require a high level of expertise and training, such as medical operations, health monitoring and diagnosis, taxonomy, and even higher education.

Tama et al. [16] presented the advantages and disadvantages of applying deep learning in the predictive maintenance of rotating machinery. Based on a thorough review, they conducted an in-depth analysis of 59 studies on fault diagnosis methods based on vibration signals. In their work, they investigated a series of deep learning algorithms such as convolutional neural networks (CNNs), deep belief networks (DBNs), recurrent neural networks (RNNs), generative neural networks, and graph neural networks (GNNs), providing insights into the latest techniques for fault diagnosis.

As previously mentioned, in recent years, researchers have increasingly applied artificial intelligence and deep learning for fault detection. In the preceding paragraphs, a summary of research works published on the subject was presented, highlighting the advantages and challenges faced. Within the current work, the authors aimed to present their initial exploration of predictive maintenance by developing a 1D convolutional neural network for the detection and categorization of ball bearing faults. The ultimate goal of this study is to demonstrate that a simple 1D CNN can accurately identify a defective ball bearing and categorize its fault.

3. Fundamental Concepts

3.1. Maintenance Strategies

Today, society has embraced an increasingly consumerist lifestyle, leading to unprecedented growth in the demand for production worldwide. However, the challenges of meeting this demand, such as machinery breakdowns, have become apparent: unexpected production line disruptions increase downtime and hinder the productivity and profitability of companies. As a result, the concept of machinery maintenance has gained prominence in sectors such as construction, production, and transportation. Regardless of its specific type, the ultimate objective of effective maintenance is to enhance working conditions and optimize machinery performance.

Machinery breakdowns and interruptions in production have a significant and increasingly swift economic impact. The process of mechanization followed by automation in production is a progression with no turning back. It was only in 1962 when the term reliability-centered maintenance (RCM) was introduced, and in 1970, Japan emphasized the concept of total productive maintenance (TPM) [17]. These developments highlight the growing recognition of the importance of maintenance strategies in optimizing reliability and productivity in industrial settings. According to Coanda et al. [18], the lifetime of a product includes five stages: conception of the idea, defining its objectives, manufacturing, use, and disposal/recycling. It is evident that the middle phase, involving the use of the product, should last longer; therefore, the product itself must be maintained in a manner that ensures optimal performance throughout its entire lifespan.

In contemporary maintenance practices, the focus on minimizing environmental impact has given rise to two key approaches: “Sustainable Maintenance” and “Energy-Based Maintenance” [19,20,21,22,23]. These methodologies aim to enhance resource efficiency and overall sustainability. Sustainable maintenance, acknowledging the significance of environmentally responsible asset upkeep, expands beyond the traditional emphasis on sustainability in new constructions. It incorporates eco-friendly practices, emphasizing waste reduction, cost efficiency, and minimizing social impact. The strategy involves evaluating environmental impacts throughout the maintenance cycle, addressing energy consumption, pollutants, and waste hazardousness to extend asset lifecycles while achieving operational efficiency and reducing environmental, social, and economic footprints.

Energy-based maintenance, as a subset of condition-based maintenance, monitors equipment performance by measuring energy consumption during regular operation. It compares measured energy with predetermined standards, identifying deviations as triggers for timely maintenance actions. Unlike preventive maintenance, which may lead to resource wastage, energy-based maintenance allows for prolonged operation while continuously monitoring energy consumption, preventing unnecessary labor and spare part expenditures. This approach contributes to efficient resource use, locates potential breakdown conditions, and enhances productivity, albeit with considerations for instrument costs and potential training needs of maintenance personnel.
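The comparison of measured energy against a predetermined standard can be sketched in a few lines of Python. The function name `needs_maintenance`, the 10% tolerance, and the readings are hypothetical values chosen for illustration; they are not taken from the cited works.

```python
def needs_maintenance(measured_kwh, baseline_kwh, tolerance=0.10):
    """Return True when consumption deviates from the baseline by more than `tolerance`."""
    deviation = abs(measured_kwh - baseline_kwh) / baseline_kwh
    return deviation > tolerance

readings = [5.0, 5.1, 5.2, 5.8]   # hypothetical hourly energy readings, kWh
baseline = 5.0                    # hypothetical reference consumption, kWh
flags = [needs_maintenance(r, baseline) for r in readings]
print(flags)
```

Only the last reading, which deviates by 16% from the baseline, would trigger a maintenance action under this assumed tolerance.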

Although the ultimate purpose of maintenance is the same, i.e., keeping a machine or production line in its optimal operating condition, its philosophy depends on the production area it is applied to, but also on the mentality of the stakeholders. As determined by the EN 13306 standard [24], maintenance is divided into two major categories, reactive and proactive; each one is divided into more subcategories [25].

Reactive maintenance is applied after the consequences of machinery breakdown or production line immobilization have occurred. It can be categorized into two types: corrective and emergency maintenance. Corrective maintenance is the first maintenance strategy ever applied and is the most straightforward. It follows the run to failure approach, where no interventions are made until the machinery fails. Once the machinery breaks down, the faulty parts are repaired or replaced and the machinery is brought back into operation. This method is applicable, without unpleasant implications, when the defective machine does not affect production, or when demand far outweighs production, or the profit margin is so large that the cost of repair is negligible.

On the other hand, proactive maintenance refers to a preventive approach that focuses on identifying and addressing potential issues before they result in machinery breakdowns or production line disruptions. It aims to minimize unexpected downtime and optimize the overall performance and reliability of the equipment. Proactive maintenance encompasses two strategies: preventive and predictive maintenance.

Preventive maintenance involves regularly scheduled inspections, repairs, and servicing of equipment or facilities to identify and address potential issues before they develop into major problems [26]. The EN 13306 standard defines preventive maintenance as maintenance performed at predetermined time intervals or other measurement units of use without prior knowledge of the state of the machine, with the sole purpose of reducing failure or degradation of the functional state of a system, component, or machine [24]. One of the main objectives of preventive maintenance is to reduce the rate, but also the frequency, of failures [27]. One key benefit of preventive maintenance is that, with only a few exceptions, it minimizes operational issues by proactively conducting maintenance, replacing worn parts, and assessing the machine’s condition before problems arise. However, relying solely on time-based maintenance can lead to tasks being performed either too early or too late, resulting, for example, in premature replacement of parts that still have a significant remaining lifespan.

Predictive maintenance is defined as “Condition-based maintenance carried out following a forecast derived from repeated analysis or known characteristics and evaluation of the significant parameters of the degradation of the item” [24]. This approach aims to optimize maintenance schedules, reduce unplanned downtime, and minimize unnecessary maintenance tasks. By addressing potential issues before they lead to failures, predictive maintenance helps to improve operational efficiency, extend equipment lifespan, and reduce maintenance costs. The authors of [28] state that 99% of the time, specific signs, conditions, or indicators manifest themselves before the occurrence of any equipment failure.

Another advantage of predictive maintenance is that it can be applied while the machine is in operation and, because it relies on data analysis, the data can also be analyzed in a state of rest, when the machine is not functioning. Predictive maintenance can be divided into two basic categories: detection, where data are analyzed to identify the existence of a fault or the potential future occurrence of a breakdown, and prediction, where data are analyzed to forecast/estimate when a fault may occur and, therefore, calculate the remaining useful life (RUL) of the machine.

3.2. Convolutional Neural Networks—An Overview

Convolutional neural networks are a type of deep learning architecture specifically designed for processing grid-like data, such as images or sequential data. They are considered a special multi-layer perceptron (MLP) topology network, consisting of multiple filter stages and one classification stage. CNNs are inspired by the organization and functioning of the visual cortex in the human brain, which allows them to effectively extract and learn hierarchical representations of complex patterns in data. They have revolutionized the field of computer vision and have been successfully applied in various tasks, including image classification, object detection, and image generation.

An artificial neural network architecture that would allow computers to recognize images was first proposed in the late 1980s. The first convolutional neural network, named LeNet, was developed in 1994, and it was capable of recognizing handwritten graphic characters and applying back-propagation training. In [29], a new convolutional model, LeNet5, was trained to recognize handwritten characters using the NIST database and utilizing the gradient descent method. In the same year, more than 10% of bank checks in the USA were processed using this technology [30].

CNNs have been at the forefront of deep learning. Researchers are constantly coming up with new ideas with improved network characteristics and performance. Some of the most well-known CNNs are GoogLeNet (2013), VGG (2015), ResNeXt (2017), and EfficientNet (2020). Figure 1 illustrates an example of a common convolutional neural network architecture [31].

3.2.1. CNN Architecture

CNNs mainly consist of three types of layers: convolutional layers, pooling layers, and fully connected layers.

Convolutional layer: A convolutional layer is the fundamental component of CNN architecture and is responsible for feature extraction. Convolution is a specialized type of linear operation that applies a set of learnable arrays of numbers (known as kernels or filters) to the input data, which is also an array of numbers (called a tensor). The kernel scans the entire input, and at each position an element-wise multiplication between the kernel and the underlying input values is computed and summed; the result is the value at the corresponding position of the output tensor, called a feature map. This procedure is repeated by applying multiple kernels to form an arbitrary number of feature maps that represent different characteristics of the input tensors. Each kernel captures specific local patterns or features in the input data, such as edges or textures. Convolution is described by Equation (1) below, while Figure 2 illustrates the mathematical operation of convolution [32].

O_{j}^{l} = f\left( \sum_{i \in M_{j}} \left( \sigma_{ij} O_{i}^{l-1} \ast W_{ij}^{l} \right) + b_{j}^{l} \right)    (1)

where:

O_{i}^{l-1} is the output of convolution layer l − 1 and the input of layer l through feature map i;

O_{j}^{l} is the jth feature map of layer l;

f is the transfer function (ReLU, sigmoid, etc.);

M_{j} is the set of input feature maps connected to feature map j;

W_{ij}^{l} is the weight (kernel) that connects the ith feature map of layer l − 1 with the jth feature map of layer l;

b_{j}^{l} is the bias of the jth feature map of layer l.

Finally, the symbol \ast represents the convolution operation.
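As an illustration of Equation (1), the following pure-Python sketch computes a single output feature map from two input feature maps. Several assumptions are made here that are not stated in the text: the kernel is slid without flipping (cross-correlation, as in most deep learning libraries), the stride is 1 with no padding, the σ_{ij} factor is taken as 1, and f is ReLU. The function names are hypothetical.

```python
def relu(x):
    return x if x > 0.0 else 0.0

def conv1d_valid(signal, kernel):
    """Slide the kernel over the signal (stride 1, no padding, no kernel flip)."""
    k = len(kernel)
    return [sum(signal[t + m] * kernel[m] for m in range(k))
            for t in range(len(signal) - k + 1)]

def feature_map(inputs, kernels, bias):
    """Compute f(sum_i(O_i * W_ij) + b_j) for one output map j, with f = ReLU."""
    partial = [conv1d_valid(o, w) for o, w in zip(inputs, kernels)]
    # Sum the per-input results position by position, add the bias, apply f.
    return [relu(sum(col) + bias) for col in zip(*partial)]

inputs = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]]   # two input feature maps
kernels = [[1.0, -1.0], [0.5, 0.5]]                     # one kernel per input map
out = feature_map(inputs, kernels, bias=1.0)
print(out)
```

With kernels of length 2 and inputs of length 4, the output feature map has 4 − 2 + 1 = 3 values.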

Pooling layer: A pooling layer typically follows a convolutional layer. It is mainly used to provide a down-sampling operation that reduces the in-plane spatial dimensions of feature maps. Similar to the convolutional layer, a window is applied to local regions of the layer input, replacing them with a summarized representation. The most commonly used pooling operation is max pooling, which extracts the maximum value within a defined window. By retaining the maximum value, max pooling keeps the most prominent features present in the region while discarding less relevant information. Another type of pooling operation, less frequently used, is average pooling, which computes the average value within a predefined region. In cases where the overall level of activation across a region is more informative than its single strongest response, average pooling can be more effective.

The pooling layer’s operation helps to preserve the most salient information while reducing the computational complexity of the overall network. It also contributes to translation invariance, enabling the network to recognize patterns regardless of their precise spatial location in the input data.
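The difference between the two pooling operations can be seen in a short Python sketch. Non-overlapping windows (stride equal to the window size) are assumed here, and the helper name `pool1d` is hypothetical.

```python
def pool1d(values, size, mode="max"):
    """Non-overlapping 1D pooling; trailing samples that do not fill a window are dropped."""
    windows = [values[i:i + size] for i in range(0, len(values) - size + 1, size)]
    if mode == "max":
        return [max(w) for w in windows]
    if mode == "avg":
        return [sum(w) / len(w) for w in windows]
    raise ValueError("mode must be 'max' or 'avg'")

fmap = [1.0, 3.0, 2.0, 6.0, 0.0, 4.0, 5.0]   # synthetic feature map
max_pooled = pool1d(fmap, 2, "max")
avg_pooled = pool1d(fmap, 2, "avg")
print(max_pooled)
print(avg_pooled)
```

Both operations halve the length of the feature map; max pooling keeps the strongest response in each window, whereas average pooling smooths it.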

Fully connected layer: Fully connected layers, also known as dense layers, are the last layers of the network, where all the neurons of one layer are interconnected with the neurons of the next. Fully connected layers are responsible for making final predictions and classifications based on the features extracted by the convolutional layers. The number of fully connected layers depends on the complexity of the classification task. In most cases where multi-class classification is required, the final fully connected layer uses one-hot encoded class labels and a softmax function that normalizes the real-valued outputs of the previous layer to target class probabilities between 0 and 1, with all probabilities summing to 1.
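A minimal Python sketch of the softmax normalization and one-hot encoding follows. The logits are hypothetical outputs of a last dense layer with three classes, and subtracting the maximum before exponentiating is a common numerical-stability measure rather than something specified in the text.

```python
import math

def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def one_hot(index, num_classes):
    """Encode a class index as a vector with a single 1.0 entry."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]

logits = [2.0, 1.0, 0.1]               # hypothetical outputs of the last dense layer
probs = softmax(logits)
predicted = probs.index(max(probs))    # class with the highest probability
print(probs, predicted, one_hot(predicted, len(logits)))
```

The predicted class is the index of the largest probability, and the one-hot vector is the form in which the true label is usually compared against the softmax output during training.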

3.2.2. CNN Training

The ultimate goal of CNNs is to effectively categorize an image or a data time series into a specific type, acting on detection, regression, or classification. To achieve that, CNNs have parameters that must be calculated during the training process. CNNs utilize supervised learning to optimize their internal parameters (weights and biases) by minimizing a loss function that measures the discrepancy between predicted and true labels. In this process, backpropagation propagates the error back through the network layers to compute the gradient of the loss with respect to each parameter, and an optimization algorithm such as gradient descent then uses these gradients to update the network’s parameters. This allows the network to learn and recognize relevant patterns or features at different spatial locations in the input data. The training process iterates until a stopping criterion is met. The most commonly used criteria are reaching a maximum number of epochs or observing diminishing returns in validation set performance.
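The training loop and its two stopping criteria can be sketched in Python on a deliberately tiny problem. As an assumption made for brevity, a single-weight linear model y = w·x with a mean squared error loss stands in for a CNN, so backpropagation reduces to one derivative; the learning rate, patience, and data are illustrative values only.

```python
def mse(w, data):
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def grad(w, data):
    # Derivative of the mean squared error with respect to w.
    return sum(2 * (w * x - y) * x for x, y in data) / len(data)

def train(train_set, val_set, lr=0.05, max_epochs=200, patience=5, min_delta=1e-9):
    w, best_val, stale = 0.0, float("inf"), 0
    for epoch in range(1, max_epochs + 1):
        w -= lr * grad(w, train_set)          # gradient descent update
        val_loss = mse(w, val_set)
        if best_val - val_loss > min_delta:   # validation loss still improving
            best_val, stale = val_loss, 0
        else:
            stale += 1
            if stale >= patience:             # diminishing returns: stop early
                break
    return w, epoch

train_set = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]   # synthetic data, y = 2x
val_set = [(4.0, 8.0), (5.0, 10.0)]
w, epochs_run = train(train_set, val_set)
print(w, epochs_run)
```

The loop ends either when `max_epochs` is reached or when the validation loss has failed to improve by more than `min_delta` for `patience` consecutive epochs.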

One of the most challenging goals of network training is to strike a balance between overfitting and underfitting. Overfitting is when a network performs well on the training data but poorly on new, unknown data, while underfitting is when it fails to capture the underlying patterns in the input data. To mitigate overfitting, regularization techniques such as dropout can be applied between the layers of the network. Dropout randomly deactivates (sets to zero) a specific percentage of neurons during each training iteration; because an excessive dropout rate can itself lead to underfitting, it must be used sparingly.
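The effect of dropout on a layer’s activations can be sketched as follows. The sketch uses the "inverted dropout" variant, in which surviving activations are rescaled by 1/(1 − rate) so that no rescaling is needed at inference time; this variant and the function signature are assumptions, not details given in the text.

```python
import random

def dropout(activations, rate, training=True, rng=random):
    """Inverted dropout: zero each activation with probability `rate`, rescale survivors."""
    if not training or rate == 0.0:
        return list(activations)      # dropout is disabled at inference time
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)                # fixed seed so the sketch is repeatable
acts = [1.0] * 10
dropped = dropout(acts, rate=0.5, rng=rng)
print(dropped)
print(dropout(acts, rate=0.5, training=False))
```

During training, each activation is either zeroed or scaled up; with `training=False`, the activations pass through unchanged.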

4. Experimental Results of the Proposed CNN

Conventional CNNs were originally designed to operate on 2D data such as images and videos. To feed such a network with 1D data, like voice and data series, weather forecast data, vibration measurements, traffic flow, and electrocardiogram signals [33], different techniques, like reshaping, have been utilized to transform the 1D signal into a 2D representation [11]. However, 2D CNNs, especially deep architectures with more than 1 M (usually above 10 M) parameters, exhibit high computational complexity [15]. To address this drawback, researchers have developed 1D convolutional neural networks, which, by leveraging 1D kernels, can operate directly on 1D data, like time series. They have been proven to be more effective when extracting features from fixed-length segments of the whole dataset, where the position of the feature does not matter [33]. Moreover, they have displayed faster and more accurate behavior in real-time data monitoring applications. Due to their “shallow” architecture, they can be trained and operated with low computational requirements.

In this section, we present a 1D convolutional neural network architecture and assess its adherence to the previously outlined requirements. Our initial step involves training the proposed model and evaluating its performance using data from the Case Western Reserve University (CWRU) Bearing Data Center. Subsequently, the model is fed with new, previously unseen vibration data to determine whether the network is able to successfully identify the condition of the ball bearings and classify any potential faults.

4.1. Experimental Setup

The basic structure of the laboratory model used by CWRU for capturing vibration measurements is shown in Figure 3.

It consists of a 2 Hp electric motor, a torque transducer for measuring motor torque, and a dynamometer for load simulation. The ball bearings measured and used in the tests support the motor shaft at both the load side (drive end, DE) and the motor fan side (non-drive end, NDE). Damage ranging from 0.18 to 1.02 mm (7–40 mils) was induced in the balls as well as in the inner and outer raceways of the ball bearings using electro-discharge machining (EDM). The ball bearings used were 6205-2RS JEM deep groove for the DE and 6203-2RS JEM deep groove for the NDE, both from SKF, Göteborg, Sweden.

Vibration measurements were carried out using accelerometers placed vertically (in the radial plane) on both the DE and NDE of the motor. The measurements were sampled at 12 and 48 kHz. A total of 161 Matlab data files were generated and grouped into four categories: 48 kHz no fault, 48 kHz DE faults, 12 kHz DE faults, and 12 kHz NDE faults. It is worth noting that for this particular dataset, the concept of “load” has no significant meaning in vibration measurements since it originates from an electromagnetic mechanism and not from actual load, such as a gearbox or a fan, which would convert the motor torque into radial load on the ball bearings. Here, the imposed load mainly affects motor speed, which decreases by almost 4% at maximum load [34]. Additionally, for damage to the outer raceway, there are three measurements, labeled @3, @6, and @12. This indicates the relative position of the measurement with respect to the position of the ball bearing. Measurements of the outer raceway were conducted in three directions as the potential point of damage is fixed (it does not rotate with the ball bearing) and therefore has a direct impact on the vibration response of the motor–ball bearing system.

4.2. CWRU Dataset

As described above, the CWRU dataset consists of 161 files, each one including 65 to 450 thousand data rows. For our model, we used the 48 kHz sampling data, i.e., DE measuring data for 0 and 1 Hp motor loads and all the available ball bearing conditions (no fault, inner race, outer race@3H, outer race@6H, outer race@12H, Base Support), with induced fault sizes of 7, 14, and 21 mils. The resulting dataset consisted of 27 Matlab files, which were transformed and combined into a single .csv file.

The created dataset was then split in half. Fifty percent of the data, approximately 2.2 million data points, was used for the training and evaluation of the model, while the other fifty percent was fed to the network as “new”, “unseen” measurements. This research study aimed to assess whether the proposed network can successfully identify and classify the condition of ball bearings by analyzing raw DE vibration signals. Figure 4 shows raw vibration data for a ball bearing operating without damage and without load, as well as the footprint of a ball bearing with 14 mils damage in the inner raceway at 0 Hp load. The figure was generated in Python (version 3.12.1) by plotting the corresponding part of the dataset.

Likewise, Figure 5 shows a scatter plot, also generated in Python, of the vibration amplitudes per fault category for the dataset described above. It is readily apparent that measurements from undamaged ball bearings (N) have the smallest amplitude, while those from bearings with damage to the outer raceway (OR) have the largest.

4.3. Architecture Design

To arrive at the final network architecture, we performed multiple tests with different module combinations. The resulting architecture is illustrated in Figure 6. The proposed network is quite simple and consists of a data input layer, two convolutional layers using the ReLU activation function, and one max pooling layer. These are followed by one flatten layer and two dense layers. The first dense layer consists of 200 neurons, while the second and last one consists of 27 neurons, each representing one of the ball bearing condition classes.

Table 1 lists the output shape and the number of trainable parameters for each layer. The total number of trainable parameters of the network is 1,554,619.
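The figures in Table 1 can be cross-checked with the standard parameter-count formulas for convolutional and dense layers. The sketch below does this in plain Python. Note that the input length (500 samples), the single input channel, the kernel sizes (100 and 50), and the pooling size (4) are not stated in the text; they are assumptions inferred here because they are the values that reproduce the output shapes and parameter counts reported in Table 1.

```python
def conv1d_params(kernel_size, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel_size * in_channels + 1) * filters


def conv1d_out_len(in_len, kernel_size):
    """Output length for stride 1 and no padding."""
    return in_len - kernel_size + 1


def dense_params(in_features, units):
    """Weights plus one bias per unit."""
    return (in_features + 1) * units


# ASSUMPTIONS (inferred from Table 1, not stated in the text).
input_len, in_channels = 500, 1
k1, f1 = 100, 128   # first convolutional layer
k2, f2 = 50, 64     # second convolutional layer
pool = 4            # max pooling size

len1 = conv1d_out_len(input_len, k1)   # 401
len2 = conv1d_out_len(len1, k2)        # 352
len_pool = len2 // pool                # 88
flat = len_pool * f2                   # 5632

layer_params = {
    "conv1d_18": conv1d_params(k1, in_channels, f1),
    "conv1d_19": conv1d_params(k2, f1, f2),
    "dense_18": dense_params(flat, 200),
    "dense_19": dense_params(200, 27),
}
total = sum(layer_params.values())

for name, count in layer_params.items():
    print(f"{name}: {count:,}")
print(f"total: {total:,}")
```

Under these assumptions, the first dense layer accounts for roughly 72% of all parameters, which is typical when a flatten layer feeds directly into a wide dense layer.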

4.4. Evaluation Results

As described in Section 4.2, the network was trained using a portion (50%) of the CWRU dataset. To ensure accurate evaluation, we further divided this portion in a 70/30 ratio for training and evaluation purposes. Through repeated testing, we determined that the best results were achieved with a batch size of 300 and a total of 20 epochs. The entire training process took less than 14 min. During training, the model reached an accuracy of 99.62%, whereas during the evaluation phase, accuracy decreased to 93.52%. The progress of each epoch is listed in Table 2, while Figure 7 provides insights into the training and evaluation process in terms of determining the number of epochs.
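The two-stage split described above (half of the data held back as unseen, and the remaining half divided 70/30) can be sketched with index ranges as follows. This is an illustration under stated assumptions: the function name is hypothetical, and the split is taken contiguously without shuffling, which the text does not specify.

```python
def split_indices(n_samples, holdout_pct=50, train_pct=70):
    """Return index ranges for training, evaluation, and unseen data.

    ASSUMPTION: contiguous, unshuffled split; integer percentages are
    used to avoid floating-point rounding surprises.
    """
    n_dev = n_samples * (100 - holdout_pct) // 100
    n_train = n_dev * train_pct // 100
    train = range(0, n_train)
    evaluation = range(n_train, n_dev)
    unseen = range(n_dev, n_samples)
    return train, evaluation, unseen


train, evaluation, unseen = split_indices(1000)
print(len(train), len(evaluation), len(unseen))
```

With these ratios, 35% of the full dataset is used for training, 15% for evaluation, and 50% is reserved as unseen data.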

Considering the success of the network in determining the condition of ball bearings and categorizing specific faults, it is worth mentioning that the proposed CNN model was developed, trained, and tested on a conventional laptop equipped with an 11th generation i7 CPU, without the assistance of GPU technology or any other enhanced processing or memory capabilities.

Figures 8–10 depict scatter plots of the data per fault category. These plots help illustrate how the classes become progressively more separable after the convolutional layers and the extraction of feature maps. At the input of the flatten layer (Figure 8), a relative grouping of features is observed, which becomes more distinct at the input of dense layer 18 (Figure 9); finally, the fault categories become clearly separated at the input of the last dense layer 19 (Figure 10). For all scatter plots, t-distributed stochastic neighbor embedding (TSNE) from the sklearn library is utilized. The TSNE parameters are the same for all three scatter plots: n_components: 2, perplexity: 40, learning_rate: auto, n_iter: 30, verbose: 1.

Further, the remaining 50% of the CWRU dataset was utilized to test the network. As previously mentioned, our objective was to evaluate the feasibility and accuracy of the proposed network in categorizing new, previously unencountered data samples. These measurements were segmented using the same window size and stride as the training data. As observed in Table 3, different amounts of data and measurements were utilized in each test, all of which were significantly smaller than the dataset used to train the network. The data ranged from a minimum of 2000 to a maximum of 15,000 measurements, whereas the network was trained on 4,544,767 measurements. To evaluate the network’s performance, we employed the inverse encoder process, comparing the network’s predictions with the actual state of the ball bearing. The results are also presented in Table 3.
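The segmentation and label-decoding steps mentioned above can be sketched as follows. This is a minimal illustration, not the code used in the study: the window size and stride in the example are placeholder values (the actual values are not given in the text), the function names are hypothetical, and the class names are simply three of the labels that appear in Table 3.

```python
def segment(signal, window, stride):
    """Cut a 1D signal into fixed-length windows; incomplete tails are dropped."""
    last_start = len(signal) - window
    return [signal[s:s + window] for s in range(0, last_start + 1, stride)]


def decode(scores, class_names):
    """Map a vector of class scores back to its label (inverse encoding)."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return class_names[best]


# Placeholder values: 10 samples, window of 4, stride of 3.
signal = list(range(10))
windows = segment(signal, window=4, stride=3)
print(windows)

# Hypothetical network output for one window.
class_names = ["0N", "7_0_IR", "14_0_BN"]
print(decode([0.1, 0.7, 0.2], class_names))
```

Each window is classified independently, so a test signal yields as many predictions as it has windows, and the per-signal classification rate is the share of windows assigned to the correct class.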

From the results above, several observations can be made:

The network demonstrates an overall response ranging from 78.8% to 100%, with the exception of detecting signals 14_0_BN and 21_0_IR;
For signals 14_0_BN and 21_0_IR, the network’s response is 25% and 20%, respectively. It is worth noting that incorrect predictions are not related to the type of damage but rather to load. The network misinterprets a 0 Hp sample as 1 Hp. Nonetheless, this is not considered an error since the network successfully recognizes the type of fault;
Remarkably, the network achieves a 100% accuracy in distinguishing between damage and non-damage. This is particularly surprising considering the simplicity of the network and the circumstances under which it was tested and trained within the context of our work;
As previously mentioned, the data used for testing the model consist of measurements ranging from 2000 to 15,000, corresponding to signals of approximately 42 to 313 milliseconds at the 48 kHz sampling rate. It is important to note that these signals are short compared to real-world scenarios, where measurements of several seconds are typically used. This further highlights the network’s success in its response.

5. Conclusions

This study focused on predictive maintenance of rotating machinery using a convolutional neural network. Throughout the study, the main focus was on the importance of proper machine operation for safe navigation and ship management, as well as the implications that a malfunction can have on performance, ultimately impacting the profits of the ship-owning company, the safety of seafarers, and environmental protection. The selection of a maintenance approach (corrective, predictive, prognostic, etc.) depends on various factors, but the primary objectives remain reducing machinery breakdowns, minimizing downtime, optimizing machine performance, and ultimately lowering maintenance costs. Moreover, when combined with artificial intelligence and deep learning, predictive maintenance enables the detection of faults at an early stage, before they cause damage to the monitored machine.

In this paper, a model based on a convolutional neural network was introduced. This model was trained using the CWRU vibration dataset gathered from accelerometers positioned on the DE and NDE of a 2 Hp motor under laboratory conditions. The proposed network classified specific faults with accuracy ranging from 20% to 100%. Further analysis of the cases where its response was low, such as the data for 14_0_BN and 21_0_IR, showed that the network succeeded in classifying the type of fault but could not distinguish between different loads (0 and 1 Hp) or different defect sizes (7, 14, and 21 mils). Therefore, when faults are grouped only by type, i.e., BN, IR, OR, and N, without considering the defect size or the load applied to the motor shaft, the network’s response is consistently above 90%.
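The effect of grouping faults by type can be illustrated with a short sketch that scores the same predictions twice, once on the full label and once on the fault type alone. The helper names are hypothetical, the label pattern (size, load, and type separated by underscores) is assumed from the labels in Table 3, and the predictions below are made-up values for illustration, not results from the study.

```python
import re


def fault_type(label):
    """Reduce a label such as '14_0_BN', '7_0_OR1', or '0N' to BN, IR, OR, or N."""
    match = re.search(r"(BN|IR|OR|N)\d*$", label)
    if match is None:
        raise ValueError(f"unrecognized label: {label!r}")
    return match.group(1)


def accuracy(true_labels, predicted_labels, key=lambda label: label):
    """Share of predictions that match the truth after applying `key`."""
    hits = sum(key(t) == key(p) for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


# Illustrative only: four windows of a ball fault at 0 Hp, three of which
# are predicted as the same fault at 1 Hp.
truth = ["14_0_BN"] * 4
predicted = ["14_0_BN", "14_1_BN", "14_1_BN", "14_1_BN"]

exact = accuracy(truth, predicted)
by_type = accuracy(truth, predicted, key=fault_type)
print(exact, by_type)
```

In this made-up case the exact-label score is low although every window is assigned the correct fault type, which mirrors the distinction drawn above between per-class and per-type results.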

As the training and testing of the network were conducted using data collected in a laboratory environment (CWRU dataset), the authors intend to continue the research and, at a later stage, test the proposed network on real vibration data collected from machinery controlling and reliquefying vapors in vessels carrying liquefied natural gas (LNG carriers). This endeavor poses a challenge because preventive maintenance measures are in place to assure the optimal performance of these machines and minimize the occurrence of substantial damage. As a result, there are ample data available from machines operating correctly, but data from machines prior to the occurrence of a fault are very limited or nearly nonexistent.

Since few studies focus on developing machine learning or deep learning models for calculating the remaining useful life (RUL) of machinery, the authors intend to expand the proposed network beyond fault detection and prediction, aiming to also estimate the remaining useful life of specific machines. In the future, we also intend to apply the same model to real-time measurements taken by sensors permanently installed on rotating machinery.

Author Contributions

Conceptualization, S.A.; methodology, S.A.; software, S.A.; validation, P.Z., A.C. and M.P.; formal analysis, S.A. and P.Z.; investigation, S.A. and P.Z.; resources, S.A.; data curation, S.A.; writing—original draft preparation, S.A. and P.Z.; writing—review and editing, S.A., P.Z. and A.C.; visualization, A.C. and M.P.; supervision, A.C. and M.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data is contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Our Economy Relies on Shipping Containers. This Is What Happens When They’re ‘Stuck in the Mud’. Available online: https://www.weforum.org/agenda/2021/10/global-shortagof-shipping-containers/ (accessed on 9 September 2023).
2. Number of Ships in the World Merchant Fleet as of January 1, 2022, by Type. Available online: https://www.statista.com/statistics/264024/number-of-merchant-ships-worldwide-by-type/ (accessed on 9 September 2023).
3. Welcome to the Case Western Reserve University Bearing Data Center Website. Available online: https://engineering.case.edu/bearingdatacenter/welcome (accessed on 9 September 2023).
4. Han, T.; Yang, B.-S.; Yin, Z.-J. Feature-based fault diagnosis system of induction motors using vibration signal. J. Qual. Maint. Eng. 2007, 13, 163–175.
5. Chen, Z.; Li, C.; Sanchez, R.-V. Gearbox Fault Identification and Classification with Convolutional Neural Networks. Shock Vib. 2015, 2015, 390134.
6. Janssens, O.; Slavkovikj, V.; Vervisch, B.; Stockman, K.; Loccufier, M.; Verstockt, S.; Van de Walle, R.; Van Hoecke, S. Convolutional Neural Network Based Fault Detection for Rotating Machinery. J. Sound Vib. 2016, 377, 331–345.
7. Guo, X.; Chen, L.; Shen, C. Hierarchical adaptive deep convolution neural network and its application to bearing fault diagnosis. Measurement 2016, 93, 490–502.
8. Zhang, W.; Li, C.; Peng, G.; Chen, Y.; Zhang, Z. A deep convolutional neural network with new training methods for bearing fault diagnosis under noisy environment and different working load. Mech. Syst. Signal Process. 2017, 100, 439–453.
9. Guo, S.; Yang, T.; Gao, W.; Zhang, C. A Novel Fault Diagnosis Method for Rotating Machinery Based on a Convolutional Neural Network. Sensors 2018, 18, 1429.
10. Wu, C.; Jiang, P.; Ding, C.; Feng, F.; Chen, T. Intelligent fault diagnosis of rotating machinery based on one-dimensional convolutional neural network. Comput. Ind. 2019, 108, 53–61.
11. Abdeljaber, O.; Sassi, S.; Avci, O.; Kiranyaz, S.; Aly Ibrahim, A.; Gabbouj, M. Fault Detection and Severity Identification of Ball Bearings by Online Condition Monitoring. IEEE Trans. Ind. Electron. 2019, 66, 8136–8147.
12. Ma, S.; Cai, W.; Liu, W.; Shang, Z.; Liu, G. A Lighted Deep Convolutional Neural Network Based Fault Diagnosis of Rotating Machinery. Sensors 2019, 19, 2381.
13. Zhao, Z.; Li, F.; Wu, J.; Sun, C.; Wang, S.; Yan, R.; Chen, X. Deep learning algorithms for rotating machinery intelligent diagnosis: An open source benchmark study. ISA Trans. 2020, 107, 224–255.
14. Souza, R.M.; Nascimento, E.G.S.; Miranda, U.A.; Silva, W.J.D.; Lepikson, H.A. Deep learning for diagnosis and classification of faults in industrial rotating machinery. Comput. Ind. Eng. 2021, 153, 107060.
15. Kiranyaz, S.; Avci, O.; Abdeljaber, O.; Ince, T.; Gabbouj, M.; Inman, D.J. 1D convolutional neural networks and applications: A survey. Mech. Syst. Signal Process. 2021, 151, 107398.
16. Tama, B.A.; Vania, M.; Lee, S.; Lim, S. Recent advances in the application of deep learning for fault diagnosis of rotating machinery using vibration signals. Artif. Intell. Rev. 2023, 56, 4667–4709.
17. Mushiri, T.; Mbohwa, C. Machinery Maintenance Yesterday, Today and Tomorrow in the Manufacturing Sector. In Proceedings of the World Congress on Engineering Vol II, WCE 2015, London, UK, 1–3 July 2015.
18. Coanda, P.; Avram, M.; Constantin, V. A state of the art of predictive maintenance techniques. In Proceedings of the IOP Conference Series: Materials Science and Engineering 997, Iași, Romania, 4–5 June 2020.
19. Jasiulewicz-Kaczmarek, M.; Gola, A. Maintenance 4.0 Technologies for Sustainable Manufacturing—An Overview. IFAC-PapersOnLine 2019, 52, 91–96.
20. Ibrahim, Y.M.; Hami, N.; Othman, S.N. Integrating Sustainable Maintenance into Sustainable Manufacturing Practices and its Relationship with Sustainability Performance: A Conceptual Framework. Int. J. Energy Econ. Policy 2019, 9, 30–39.
21. Bányai, A. Energy Consumption-Based Maintenance Policy Optimization. Energies 2021, 14, 5674.
22. Orošnjak, M.; Jocanović, M.; Čavić, M.; Karanović, V.; Penčić, M. Industrial maintenance 4(.0) Horizon Europe: Consequences of the Iron Curtain and Energy-Based Maintenance. J. Clean. Prod. 2021, 314, 128034.
23. Orošnjak, M.; Brkljač, N.; Šević, D.; Čavić, M.; Oros, D.; Penčić, M. From predictive to energy-based maintenance paradigm: Achieving cleaner production through functional-productiveness. J. Clean. Prod. 2023, 408, 137177.
24. EN 13306:2010; Maintenance Terminology. CEN (European Committee for Standardization): Brussels, Belgium, 2010.
25. Konrad, E.; Schnürmacher, C.; Adolphy, S.; Stark, R. Proactive maintenance as success factor for use-oriented Product-Service Systems. Procedia CIRP 2017, 64, 330–335.
26. Poór, P.; Ženíšek, D.; Basl, J. Historical Overview of Maintenance Management Strategies: Development from Breakdown Maintenance to Predictive Maintenance in Accordance with Four Industrial Revolutions. In Proceedings of the International Conference on Industrial Engineering and Operations Management, Pilsen, Czech Republic, 23–26 July 2019.
27. Ahmad, R.; Kamaruddin, S. An overview of time-based and condition-based maintenance in industrial application. Comput. Ind. Eng. 2012, 63, 135–149.
28. Bloch, H.P.; Geitner, F.K. Machinery Failure Analysis and Troubleshooting; Gulf Publishing Company: Houston, TX, USA, 1983.
29. Lecun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient Based Learning Applied to Document Recognition. Proc. IEEE 1998, 86, 2278–2324.
30. Yann LeCun: An Early AI Prophet. Available online: https://www.historyofdatascience.com/yann-lecun/ (accessed on 9 September 2023).
31. Kolar, D.; Lisjak, D.; Payak, M.; Pavkovic, D. Fault Diagnosis of Rotary Machines Using Deep Convolutional Neural Network with Wide Three Axis Vibration Signal Input. Sensors 2020, 20, 4017.
32. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; The MIT Press: Cambridge, MA, USA, 2016; ISBN 9780262035613.
33. Li, Z.; Liu, F.; Yang, W.; Peng, S.; Zhou, J. A Survey of Convolutional Neural Networks: Analysis, Applications, and Prospects. IEEE Trans. Neural Netw. Learn. Syst. 2022, 33, 6999–7019.
34. Smith, W.A.; Randall, R.B. Rolling element bearing diagnostics using the Case Western Reserve University data: A benchmark study. Mech. Syst. Signal Process. 2015, 64–65, 100–131.

Figure 1. A 2D convolutional neural network model for image processing with an input layer of 28 × 28 pixels.

Figure 2. Overview of a 2D convolutional neural network.

Figure 3. Test rig at CWRU Bearing Data Center [34].

Figure 4. Ball bearing vibration measurement footprint 0N and 14-0-IR.

Figure 5. Scatter plot of data amplitudes and fault data.

Figure 6. The designed network architecture.

Figure 7. Training/evaluation process.

Figure 8. Scatter plot input at the first flatten layer.

Figure 9. Scatter plot input at the first dense layer.

Figure 10. Scatter plot input at the last dense layer.

Table 1. The proposed model summary.

Model: “sequential_9”
Layer (Type)	Output Shape	No. of Params
conv1d_18 (Conv1D)	(None, 401, 128)	12,928
conv1d_19 (Conv1D)	(None, 352, 64)	409,664
max_pooling1d_9 (MaxPooling1D)	(None, 88, 64)	0
flatten_9 (Flatten)	(None, 5632)	0
dense_18 (Dense)	(None, 200)	1,126,600
dense_19 (Dense)	(None, 27)	5,427
Total params: 1,554,619; Trainable params: 1,554,619; Non-trainable params: 0

Table 2. Results of network training process.

Epoch No.	Execution Time (1 s/Step)	Loss	Accuracy	Validation Loss	Validation Accuracy
1	32 s	2.7129	0.1744	2.2570	0.3157
2	36 s	1.8781	0.4033	1.5318	0.5392
3	38 s	1.1082	0.6664	0.8049	0.7263
4	38 s	0.5795	0.8183	0.6281	0.8043
5	47 s	0.4406	0.8501	0.4276	0.8555
6	42 s	0.2788	0.8995	0.3097	0.8869
7	42 s	0.1934	0.9360	0.2891	0.8988
8	42 s	0.2184	0.9296	0.3952	0.8602
9	44 s	0.2039	0.9290	0.2341	0.9144
10	45 s	0.1419	0.9538	0.3140	0.8955
11	42 s	0.1154	0.9615	0.2062	0.9250
12	42 s	0.0846	0.9722	0.2252	0.9253
13	43 s	0.0616	0.9810	0.2025	0.9289
14	43 s	0.0694	0.9779	0.2445	0.9170
15	41 s	0.0875	0.9694	0.2773	0.9187
16	40 s	0.0840	0.9719	0.2587	0.9160
17	44 s	0.0586	0.9795	0.2461	0.9283
18	41 s	0.0467	0.9851	0.1934	0.9369
19	46 s	0.0236	0.9943	0.2365	0.9263
20	45 s	0.0195	0.9962	0.2167	0.9352

Table 3. Experimental results for new unknown data.

No.	Dataset	Data Size	Data	Classification
1	7_0_OR1	2999	6	83.3%
2	7_0_IR	5001	11	100.0%
3	7_0_OR2	15,000	21	100.0%
4	14_0_IR	15,000	33	84.8%
5	14_0_BN	1999	4	25.0%
6	21_0_IR	2500	5	20.0%
7	21_0_IR	15,000	33	78.8%
8	0N	2000	4	100.0%
9	1N	9999	22	100.0%
10	21_0_OR3	10,001	22	95.4%
11	21_0_OR2	4999	10	90.0%
12	14_0_OR1	5003	11	90.9%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
