Article

Use of Image Recognition and Machine Learning for the Automatic and Objective Evaluation of Standstill Marks on Rolling Bearings

1 Competence Center for Tribology, Mannheim University of Applied Sciences (UAS), 68163 Mannheim, Germany
2 Faculty for Mechanical Engineering, Mannheim University of Applied Sciences (UAS), 68163 Mannheim, Germany
3 Faculty for Mechanical and Energy Engineering, Technische Hochschule Mittelhessen (UAS), 35390 Gießen, Germany
* Author to whom correspondence should be addressed.
Former student at the Mannheim University of Applied Sciences (UAS).
Machines 2024, 12(12), 840; https://doi.org/10.3390/machines12120840
Submission received: 7 October 2024 / Revised: 12 November 2024 / Accepted: 21 November 2024 / Published: 23 November 2024
(This article belongs to the Special Issue Remaining Useful Life Prediction for Rolling Element Bearings)

Abstract

One main research area of the Competence Centre for Tribology (CCT) is so-called standstill marks (SSMs) on roller bearings, which occur when a bearing is exposed to vibrations or performs only micromovements. SSMs obtained from experiments are usually photographed, evaluated and manually categorized into six classes. An internal project has now investigated the extent to which this evaluation can be automated and objectified. Images of standstill marks were classified using convolutional neural networks implemented with the deep learning library PyTorch. With basic convolutional neural networks, an accuracy of 70.19% for the classification of all six classes and 83.65% for the classification of pairwise classes was achieved. Classification accuracies were improved by image augmentation and transfer learning with pre-trained convolutional neural networks. Overall, an accuracy of 83.65% for the classification of all six standstill mark classes and 91.35% for the classification of pairwise classes was achieved. Since 16 individual marks are generated per test run in a typical quasi-standstill test (QSST) of the CCT, and the deviation of the predicted class is at most one grade, the accuracy achieved is already sufficient for a reliable and objective evaluation of the marks.

1. Introduction

Worldwide, approximately 0.5% of all roller bearings, about 50 million in absolute numbers, fail during operation every year. As roller bearings are key components in rotating machinery, high follow-up costs due to production downtime and repairs are the consequence [1]. Therefore, it is clearly necessary to expand the knowledge regarding failure causes and their prevention. Standstill marks are among these failure causes. The Competence Centre for Tribology (CCT) at the Mannheim University of Applied Sciences researches the effect of lubricants on the development and severity of standstill marks (SSMs). In this research, SSMs are obtained from experiments, which are described in Section 2, and then photographed. Afterwards, the resulting images are evaluated and categorized into six classes. This visual evaluation method requires experts, who are scarce, and even specialists are prone to human error, e.g., due to a lack of concentration. Furthermore, ambiguous images defy clear classification. Considering that there are several thousand unclassified SSM images, their manual classification would take a long time; it would not be economical and would also result in many misclassified images. For these reasons, a more robust and faster evaluation method would benefit SSM classification and, overall, the research on SSM-severity-reducing lubricants.
In consideration of current developments in image classification with machine learning (ML), especially with convolutional neural networks (CNNs), it makes sense to investigate the use of these methods in SSM evaluation. Rapid improvements in image classification have been made since the breakthrough of AlexNet, which used CNNs to win the 2012 ImageNet Large Scale Visual Recognition Challenge [2,3]. Because they automatically extract effective features from the input data through convolutional and pooling operations, CNNs have proven more efficient than earlier ML techniques, which rely on manually engineered features. Following AlexNet, the development of CNNs has advanced so rapidly that modern versions even surpass human-level performance in the classification of generic images [4]. They are now widespread in object detection and classification tasks, for example, medical image retrieval, traffic sign detection and facial recognition. All this shows their enormous potential, even for niche applications such as the classification of SSMs.
The overall aim of this work is to perform image classification of SSM images using CNNs. First, it is determined whether this is possible at all. Then, it is investigated what classification accuracy can be achieved and how well the six classes can be distinguished from each other. To this end, various CNNs are examined in this work, as are methods that can improve them.

2. Basics

2.1. Standstill Marks at Roller Bearings

A standstill mark (SSM), an example of which is shown in Figure 1, is a type of tribologically caused damage in the raceways of roller bearings. SSMs affect roller bearings in machines that are not operated under normal conditions (rotation or large-angle oscillation), namely those that are exposed to vibrations and micromovements at machine standstill. It is important to distinguish standstill marks from classic false brinelling marks, as the processes taking place in the tribologically stressed contact zone differ significantly and, therefore, require different solution approaches [5]. In the extensive work by Presilla et al., various international research groups agreed on a clear terminology and described it in detail [6]. The distinction is made on the basis of the so-called x/2b ratio, which relates the movement of the rolling element (x) to the width of the Hertzian contact (2b). In rolling systems with a ratio smaller than 1, the contact area never opens completely. In the middle of the contact zone, there are areas that adhere, while micro-sliding movements already occur in the outer areas. If the x/2b ratio is greater than 1, the contact opens cyclically and “normal” rolling movements occur.
Destructive conditions occur, for example, during the transport of cars or other machinery by ship or rail due to dynamic driving vibrations, especially in regions with poor infrastructure [5,7]. In addition, SSMs frequently occur in the roller bearings of hydraulic components, generators, automotive components, construction machinery, machine tools, special machinery that requires small swivel angles and the pitch bearings of wind turbines [5]. After roller bearings have been affected by SSMs, which can happen after a short period of time under the previously mentioned conditions, their operational lifetime is reduced significantly. Further operation of roller bearings affected by SSMs leads to uneven running and premature failure [5].
In addition to other factors such as temperature, ball material, raceway roughness, coatings and kinematics, the severity of a SSM is also influenced by the lubricant of its roller bearing. The CCT researches the effect of different lubricants on SSMs, with the intention of finding a lubricant that reduces their severity. For this purpose, a special false brinelling tester is used. It is a modified SNR-FEB2 tester, which induces tangential forces through small-angle oscillations and a constant axial force on an axial ball bearing in order to emulate the conditions under which SSMs form in the contact zone between the ball and the raceway. For all experiment runs, axial deep groove ball bearings of type 51206 are used [5].
A problem arises in the evaluation of the SSMs obtained from the CCT’s experiments. In the past, there have been evaluation attempts that included the measurement of the mass difference of roller bearings, acoustic measurements during testing and measurements of the SSM’s area, maximum depth and wear volume. However, all these measurements have proven to be flawed because they do not include the evaluation of SSM images, which contain important information about the type of damage, wear mechanisms, topography and color [5].
That is why the CCT’s current evaluation method is based on the visual comparison of close-up images of cleaned SSMs with reference images. After evaluation, the SSMs are sorted into six classes based on the German grading system, with class 1 being the best and class 6 the worst. There is actually a seventh class, class x, to which images of experiments in which something went wrong are assigned. In this work, class x is omitted for simplification. Exemplary representations of the six classes can be seen in Figure 2. This work covers only the SSMs of axial ball bearings.

2.2. Machine Learning

Machine learning is a subdomain of artificial intelligence in which data-driven algorithms are used, i.e., algorithms that adjust their parameters in response to a dataset. Artificial neural networks, henceforth referred to as NNs, are a popular type of such algorithms. In principle, there are numerous areas of application for machine learning in tribology. The largest and economically most important area at the moment is probably the condition monitoring of machines (predictive maintenance). There are numerous publications on the use of various ML approaches, and of CNNs explicitly, for the evaluation of vibration signals [8,9,10]. Another major area of application for NNs, especially CNNs, is image classification tasks [11]. Here, too, there are numerous examples of the successful use of these methods. However, focusing specifically on issues in tribology, very few publications can be found. Herwig et al. applied Layer-wise Relevance Propagation (LRP) to an established CNN for tribological image analysis [12]. Subsequently, an automated global meta-analysis of the LRP results on the relevant clusters of oil gearbox images was used to characterize how the network classifies and to compare this with the approach of experts. Wang et al. proposed an online image-detection method based on an improved Faster R-CNN model for wear location and wear mechanism identification [13]. For this purpose, the wear surface of a ball-on-plate test was continuously observed and evaluated for signs of adhesion, abrasion or tribooxidation. Staar et al. utilized CNNs to automate a common ball-on-disk test, whereby an endoscopic camera was used to measure the contact area between a rubber sample and a spherical counterpart [14].
To the best of our knowledge, the use of this method for evaluating standstill marks or similar rolling bearing damage is new.
A static program for image classification would have to compute a function f which, in the case of our SSM images, has 3,686,400 input values (1280 by 960 pixels and three color channels). Because of this enormous parameter space and the ambiguous features that define the SSM classes, it would be impossible to code such a program by hand. Instead, NNs are used, which do not compute this function directly but approximate it iteratively with another function F. In this iterative process, the function learns from labeled image data and updates its parameters in every training epoch to perform better classification. Starting from an initial function F0, the difference between f and the function Fi after epoch i is expected to decrease, so that in the limit [11]:
$$\lim_{n \to \infty} F_n = f$$
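For illustration, the following minimal PyTorch sketch shows one such epoch-wise approximation loop; the model, the random stand-in data and the hyperparameters are placeholders, not the networks examined later in this work.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Dummy stand-ins for the labeled SSM images (3 x 45 x 61, six classes);
# the real dataset is described in Section 3.3.
images = torch.randn(64, 3, 45, 61)
labels = torch.randint(0, 6, (64,))
train_loader = DataLoader(TensorDataset(images, labels), batch_size=8)

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 45 * 61, 6))
criterion = nn.CrossEntropyLoss()      # categorical cross-entropy (Section 3.2)
optimizer = torch.optim.SGD(model.parameters(), lr=5e-4)

for epoch in range(50):                # each epoch yields the next F_i
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)  # mismatch between F_i and f on a batch
        loss.backward()
        optimizer.step()               # parameter update: F_i -> F_(i+1)
```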

3. Computational Environment, Dataset and Methods

This section describes the computing environment, the methods used and the dataset, together with its processing and augmentation.

3.1. Computational Environment

All computations of this project are conducted on Google Colaboratory, a free-to-use cloud service that provides access to computing hardware. Jupyter notebooks are hosted on Google Colaboratory in a virtual machine running Ubuntu 22.04.3 LTS. The experiments are performed on a single-core dual-thread 2.30 GHz Intel(R) Xeon(R) CPU and an NVIDIA T4 GPU.
The experiments’ code is implemented with Python 3.10.12. For ML, the libraries used include torch 2.1.0+cu121, pytorch-lightning 2.1.4, torchmetrics 1.3.0.post0 and torchvision 0.16.0+cu121. Image augmentations are realized with albumentations 1.3.1 and torchvision 0.16.0+cu121. The latter library also provides model architectures and their pretrained weights for transfer learning (TL).

3.2. Methods

The key challenge in ML is the development of a model that achieves a high classification accuracy on completely new data. Developing a reliable model is an optimization problem with a variety of architectures, techniques and hyperparameters to choose from. Hyperparameters include, for example, the learning rate, the batch size and the NN’s layer configuration. Hyperparameters are not learned by the NN; they have to be set manually. Therefore, hyperparameter optimization has to be performed to find the most suitable ones [12,13].
A simple hyperparameter optimization technique is grid search. When grid search is performed, values for certain hyperparameters are defined first, then, for all combinations of these hyperparameter values, the CNN is trained and cross-validated. The hyperparameter combination with which the best validation accuracy is obtained is further utilized and refined. As grid search is computationally expensive, large hyperparameter increments are used [13].
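As a schematic illustration of this procedure, the following Python sketch enumerates a coarse grid; `train_and_validate` is a hypothetical helper standing in for a full training and cross-validation run.

```python
from itertools import product

def train_and_validate(learning_rate: float, batch_size: int) -> float:
    """Hypothetical stand-in: train the CNN with these hyperparameters,
    cross-validate and return the validation accuracy."""
    return 0.0  # placeholder; the real run trains for a fixed epoch budget

# Coarse grid with large increments, as is typical for grid search.
learning_rates = [1e-2, 5e-3, 1e-3, 5e-4, 1e-4]
batch_sizes = [8, 16]

results = {
    (lr, bs): train_and_validate(lr, bs)
    for lr, bs in product(learning_rates, batch_sizes)
}
best_lr, best_bs = max(results, key=results.get)  # combination with best ACCval
```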
Cross-validation is a method for checking the results of ML training using validation and test data. In this work, hold-out cross-validation is used, in which the dataset is split into a training subset, a validation subset and a test subset. In general, dataset splits are performed because learning the best parameters of a NN from a set of images and then predicting the NN’s performance on the same set of images would be a methodological mistake. The NN’s performance needs to be evaluated on unknown data to check whether it merely memorizes images it has processed during training or whether it has actually learned to distinguish classes by their specific features [12,14].
The training set is used for training in which the NN optimizes its weights and biases. The validation set is used for an unbiased evaluation of the NN’s performance on unknown data during training. It is also used to investigate the influence of hyperparameters on the classification performance and to check for overfitting [15]. Model development decisions are made with the help of the validation set. The test set is used for an unbiased evaluation of the final model performance on new input data. No decision throughout the model development should be made on the basis of the test set. It is used to determine whether the developed model, which is highly adapted to the validation set, will also perform in its designated real-world task by exposure to completely new data [12,16].
For every CNN architecture, the categorical cross-entropy loss was optimized with the SGD optimizer provided by PyTorch. Feature clustering methods were not applied in this work.

3.3. Image Dataset

The image dataset used in this work consists of 1032 SSM images. All images were classified into the six categories mentioned in Section 2.1 by an expert from the CCT. The images are saved as .jpg files and have a width of 1280 pixels, a height of 960 pixels and three color channels: red, green and blue (RGB).
The image dataset is split with a ratio of 70:20:10 into the following three sets: a training set, a validation set and a test set. The split is performed with a Stratified Shuffle Split, which means that the images are first shuffled randomly and then split such that the class distribution of the whole dataset is preserved in all three sets [17]. The exact distribution of images is shown in Table 1 for the original dataset and in Table 2 for the re-sorted dataset. In the latter dataset, images that showed large deviations between the real value and the predicted value in a first run were checked again and reclassified if necessary (for details, see the Results section).
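One way to realize this 70:20:10 split with scikit-learn is sketched below; the random labels merely stand in for the expert classification.

```python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit

labels = np.random.randint(1, 7, size=1032)   # placeholder classes 1-6
indices = np.arange(len(labels))

# First split off 70% for training ...
sss = StratifiedShuffleSplit(n_splits=1, train_size=0.7, random_state=0)
train_idx, rest_idx = next(sss.split(indices, labels))

# ... then divide the remaining 30% into validation (20%) and test (10%),
# again preserving the class distribution.
sss2 = StratifiedShuffleSplit(n_splits=1, train_size=2 / 3, random_state=0)
val_rel, test_rel = next(sss2.split(rest_idx, labels[rest_idx]))
val_idx, test_idx = rest_idx[val_rel], rest_idx[test_rel]
```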
It is extremely important to keep the three datasets strictly separate. The largest part of the total dataset (70%) is used for training. Twenty percent is used as a validation dataset to optimize the hyperparameters. The remaining ten percent is used as a test dataset for the final testing of the algorithms on “fresh” data. In the following Tables 3–6, the accuracies for the prediction quality on the training dataset (ACCtrain) and on the validation dataset (ACCval) are given separately. Table 7 then compares the accuracies on the validation dataset with those on the test dataset (ACCtest). The accuracy values in the Conclusions always refer to the test dataset.

3.3.1. Data Preprocessing

Before the SSM images can be used for training and evaluation, certain preparation procedures need to be performed on them. These procedures include renaming, resizing and normalization. In some experiment runs, image augmentations are applied.
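A plausible torchvision-based sketch of the resizing and normalization steps follows; the working resolution is the 90 × 122 px found best in Section 4, and the normalization statistics are placeholders that would be computed from the training set.

```python
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize((90, 122)),               # (height, width) in pixels
    transforms.ToTensor(),                      # [0, 255] -> [0.0, 1.0]
    transforms.Normalize(mean=[0.5, 0.5, 0.5],  # placeholder statistics
                         std=[0.5, 0.5, 0.5]),
])
```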

3.3.2. Data Augmentation

Not every augmentation type is suitable for every classification problem [18,19,20,21,22,23,24]. Therefore, only augmentation types that reproduce visual features already present in the dataset and that do not change the class affiliation of an image are utilized and examined. As stated before, augmentations are implemented with the library albumentations 1.3.1; in the following, A. stands for albumentations. The investigated types can be categorized into photometric and geometric augmentations.
Photometric augmentations preserve the geometric properties of an image while changing its visual features [22]. The photometric augmentation types tested in this work are visualized in Figure 3, and they include the following types:
  • Spot Reflections: Some images show randomly located spots, which are bright because they reflect light. Spot Reflections are implemented with A.CoarseDropout(), which places a white dot with a width and height of 1 pixel in a random location of an image.
  • Contrast: Images in the dataset vary in contrast. Contrast is implemented with A.RandomContrast(), which randomly changes an image’s contrast.
  • Brightness: Images in the dataset vary in brightness. Brightness is implemented with A.RandomBrightness(), which randomly modifies the brightness of an image.
  • Color: Images in the dataset differ in color. There are yellowish, reddish or greyish tinted images. Color is implemented with A.HueSaturationValue(), which randomly changes the hue, saturation and lightness of an image.
  • Overexposure: Some images in the dataset are overexposed. Overexposure is implemented with A.Sharpen(), which overlays the original image with a sharpened one. The sharpened image has a random lightness.
Geometric augmentations change the geometric features of an image, for example, its orientation [22]. All geometric augmentations can be justified by the fact that neither the position, the angle nor the orientation of a SSM within an image changes its class affiliation. In addition, the position, angle and orientation may differ when the image is taken. The geometric augmentation types tested in this work are visualized in Figure 4; they include the following types (a combined augmentation pipeline is sketched after this list):
  • Flips: implemented with A.HorizontalFlip() and A.VerticalFlip(), which mirror the image horizontally and vertically, respectively. Only one operation is applied at a time.
  • Rotation by 180°: implemented with both A.HorizontalFlip() and A.VerticalFlip(). Applying both flips in succession to an image rotates it by 180°.
  • Rotation by Small Angles: implemented with A.Rotate(), which rotates an image by a small angle set to +7° or −7°.
  • Striding Crops: implemented with A.Crop() and A.CenterCrop(). The image is not only cropped in the center, but also in its four corners. The probability that one particular location will be cropped is 20%.
  • Transposing: implemented with A.Transpose(), which transposes the image tensor. The image is resized after the transposing operation and, thus, regains its original aspect ratio.
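A combined albumentations pipeline over both augmentation groups could look as follows; the probabilities and limits are illustrative, not the values tuned in this work.

```python
import albumentations as A

pipeline = A.Compose([
    # photometric types
    A.RandomContrast(p=0.5),                    # Contrast
    A.HueSaturationValue(p=0.5),                # Color
    A.CoarseDropout(max_holes=8, max_height=1,  # Spot Reflections:
                    max_width=1, fill_value=255,
                    p=0.5),                     # white 1 px dots
    # geometric types
    A.Transpose(p=0.2),                         # Transposing
    A.Rotate(limit=7, p=0.2),                   # Rotation by Small Angles
    A.HorizontalFlip(p=0.5),                    # Flips; applying both flips
    A.VerticalFlip(p=0.5),                      # amounts to a 180° rotation
])

# albumentations operates on NumPy arrays (H x W x C):
# augmented = pipeline(image=image)["image"]
```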

4. Results

In the first part of this work, various methodological approaches were evaluated based on their confusion matrix and accuracy. The confusion matrix shows how predicted and true values differ. In the simplest case of two classes, a distinction is made between True Positive (TP) and True Negative (TN) for correct predictions and False Positive (FP) and False Negative (FN) for incorrect predictions. Large deviations in the confusion matrix are particularly critical, for example, if a standstill mark rated as class 2 is rated as class 5 by the AI. Deviations of one class are still relatively uncritical for the current question and can also stem from the subjective assessment by the expert.
All images with deviations greater than one class were output after the first training and checked again by the human experts. In the process, images were found that had obviously been inadvertently misclassified. These comprised approx. 3% of all images (30 of 1032). Four of these 30 images had clearly been misclassified by the expert. For 22 images, the classification was not clear; these images were shifted by one class. For the remaining four images, the expert classification was confirmed during the follow-up inspection and no changes were made. This new dataset was then labelled ‘re-sorted’ and used for all further investigations. In order to provide the CNN with better decision criteria, an investigation is currently being carried out using LIME (Local Interpretable Model-Agnostic Explanations) [19,20].
BasicNet was the first CNN to be analyzed. Figure 5 shows the architecture of this simple CNN as a flowchart. In a first approach, the number of classes was varied. In the first step, only clearly different classes were analyzed. In the second step, classes were analyzed in pairs, and in the last step, all six classes were analyzed. Table 3 shows the results.
Figure 5. BasicNet with its convolutional part on the left side and its fully connected part on the right side. The depth value in the convolutional and max pooling layers denotes the number of channels. In each layer, the input and output sizes are listed. The input to the overall CNN is a single image with three channels, a height of 45 pixels and a width of 61 pixels. The output is a tensor with three values, as this BasicNet example classifies three classes.
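To make the architecture concrete, the following is a rough PyTorch sketch of a BasicNet-style CNN. Only the input size (3 × 45 × 61) and the number of output classes are given in Figure 5, so the channel counts, kernel sizes and hidden-layer width here are illustrative assumptions.

```python
import torch
import torch.nn as nn

class BasicNetSketch(nn.Module):
    def __init__(self, num_classes: int = 3):
        super().__init__()
        self.features = nn.Sequential(       # convolutional part
            nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                 # 45 x 61 -> 22 x 30
            nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                 # 22 x 30 -> 11 x 15
        )
        self.classifier = nn.Sequential(     # fully connected part
            nn.Flatten(),
            nn.Linear(16 * 11 * 15, 64), nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

# out = BasicNetSketch()(torch.randn(1, 3, 45, 61))  # -> shape (1, 3)
```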
Table 3. Validation and training accuracies for the classification of different class configurations with BasicNet. The used hyperparameters are LR = 1 × 10−4, BS = 8, H = 90 and W = 122.

| Dataset | Number of Classes | Classes | ACCtrain [%] | ACCval [%] |
|---|---|---|---|---|
| Original | 2 | 1, 5 | 99.52 | 96.67 |
| Re-Sorted | 2 | 1, 5 | 99.52 | 98.33 |
| Original | 3 | 1, 3, 5 | 92.31 | 88.43 |
| Re-Sorted | 3 | 1, 3, 5 | 91.04 | 91.80 |
| Original | 3 | 2, 4, 6 | 91.55 | 86.05 |
| Re-Sorted | 3 | 2, 4, 6 | 95.27 | 87.21 |
| Original | 3 | (1, 2), (3, 4), (5, 6) | 83.19 | 84.06 |
| Re-Sorted | 3 | (1, 2), (3, 4), (5, 6) | 90.83 | 81.64 |
| Original | 6 | 1, 2, 3, 4, 5, 6 | 85.56 | 63.94 |
| Re-Sorted | 6 | 1, 2, 3, 4, 5, 6 | 86.67 | 69.08 |
To find suitable hyperparameters, a grid search was conducted with BasicNet in which all six classes were classified. The grid spans two dimensions: the learning rate, which is the most important hyperparameter to tune [16], and the batch size. For the batch size, the values 8 and 16 were chosen, as smaller batch sizes can improve the classification performance [21]. For the learning rate, which is usually small and lies between 0 and 1, the values 0.01, 0.005, 0.001, 0.0005 and 0.0001 were chosen. Generally, it is recommended to set the learning rate values on a logarithmic scale when a grid search is performed [16]; the values in the middle of the respective intervals were selected to refine the grid. For every grid combination, BasicNet was trained for 50 epochs. The best validation accuracy, 69.57%, was achieved with a learning rate of 0.0005 and a batch size of 16.
Because training with images at their original size caused the GPU memory to overflow and took a long time, various smaller image sizes were examined with regard to their effect on the classification accuracy [25]. The largest evaluated image size comprised 43,920 px and the smallest 2745 px. The best validation accuracy, 91.80%, was achieved with a width of 122 pixels and a height of 90 pixels (Table 4).
Table 4. Validation and training accuracies for the classification of classes 1, 3 and 5 with BasicNet for different image sizes. The used hyperparameters are LR = 1 × 10−4 and BS = 8. The best accuracy is highlighted in bold.

| H [px] | W [px] | Number of Pixels | ACCtrain [%] | ACCval [%] |
|---|---|---|---|---|
| 180 | 244 | 43,920 | 98.80 | 91.74 |
| 90 | 122 | 10,980 | 91.04 | **91.80** |
| 60 | 81 | 4860 | 91.59 | 89.26 |
| 60 | 60 | 3600 | 91.35 | 90.91 |
| 45 | 61 | 2745 | 85.10 | 85.95 |
In the next step, different types of augmentation were tested, first separately and later in combination (Table 5).
Table 5. Validation and training accuracies for the classification of classes 1, 3 and 5 with BasicNet for different photometric augmentation types. The used hyperparameters are LR = 1 × 10−4, BS = 8, H = 45 and W = 61. The best accuracy is highlighted in bold.

| Augmentation Type | ACCtrain [%] | ACCval [%] |
|---|---|---|
| None | 85.10 | 85.95 |
| Color | 84.86 | 89.26 |
| Brightness | 79.81 | 84.30 |
| Contrast | 85.10 | 89.26 |
| Overexposure | 84.13 | 81.82 |
| Spot Reflections | 88.46 | 86.78 |
| Contrast and Color | 87.02 | 87.60 |
| Color and Spot Reflection | 86.78 | 87.60 |
| Contrast and Spot Reflection | 86.54 | **90.08** |
| Contrast, Spot Reflection and Color | 86.06 | 88.43 |
Compared to the basic run, three photometric augmentation types improved the validation accuracy: Color, Contrast and Spot Reflections. Two augmentation types, Overexposure and Brightness, worsened the validation accuracy. In the second run of photometric augmentation testing, the augmentation types that had improved the validation accuracy in the first run were combined. All augmentation combinations increased the validation and training accuracy compared to the basic run.
The second CNN analyzed in detail was ZhouNet by Zhou et al. [26]. ZhouNet consists of three two-dimensional convolutional layers and two fully connected layers. The first convolutional layer has a kernel size of 5 × 5, a stride of 1 and 20 channels. The subsequent two-dimensional max pooling layer has a kernel size of 2 × 2, a stride of 2 and 20 channels. The second convolutional layer also has a kernel size of 5 × 5 and a stride of 1; it has 50 channels. Likewise, its following max pooling layer, with a kernel size of 2 × 2 and a stride of 2, has 50 channels. The third and last convolutional layer has a kernel size of 4 × 4, a stride of 1 and 400 channels and is connected to a fully connected layer with 6400 neurons. ZhouNet’s output layer is adapted from its original eight neurons to six neurons. A depiction of the original ZhouNet can be found in [26].
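Assuming ReLU activations (not specified in the description above), this layer sequence translates into PyTorch roughly as follows; the sketch is for orientation only, and the original publication [26] should be consulted for exact details.

```python
import torch
import torch.nn as nn

class ZhouNetSketch(nn.Module):
    def __init__(self, in_channels: int = 1, num_classes: int = 6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 20, kernel_size=5, stride=1),  # 40 -> 36
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),                # 36 -> 18
            nn.Conv2d(20, 50, kernel_size=5, stride=1),           # 18 -> 14
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),                # 14 -> 7
            nn.Conv2d(50, 400, kernel_size=4, stride=1),          # 7 -> 4
            nn.ReLU(),
        )
        # 4 x 4 x 400 = 6400 neurons in the fully connected layer
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(6400, num_classes),  # output layer adapted to six classes
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

# out = ZhouNetSketch()(torch.randn(1, 1, 40, 40))  # -> shape (1, 6)
```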
Three experiment runs with all six SSM classes were conducted for different combinations of image width, height and number of color channels. The achieved training and validation accuracies are shown in Table 6. The learning rate of 1 × 10−3 and the momentum of 0.9 were taken from the publication. Only the batch size was lowered to eight, since Zhou et al. state that a smaller batch size may increase accuracy [26]. ZhouNet was tested for the classification of SSMs because it achieved an accuracy of 99.4% in the classification of eight different surface defect classes that could be similar to SSMs.
Table 6. Validation and training accuracies for the classification of all six SSM classes with ZhouNet on the re-sorted dataset.

| Color Channels | H [px] | W [px] | ACCtrain [%] | ACCval [%] |
|---|---|---|---|---|
| 3 | 40 | 40 | 99.72 | 64.25 |
| 1 | 40 | 40 | 98.89 | 62.32 |
| 1 | 80 | 80 | 99.58 | 65.70 |
In the experiments with ZhouNet, validation accuracies between 62.32% and 65.70% were achieved. The training accuracies are much higher; all are above 98.89%.
Finally, transfer learning (TL) was used to test further CNNs. Four pre-trained CNNs were chosen for TL based on their test accuracy on the ImageNet-1K dataset: ConvNeXt_Base, ResNeXt101_64X4D, RegNet_Y_16GF and EfficientNet_V2_S. All four were used in training both as fixed feature extractors (FFEs) and for fine-tuning (FT). In both training modes, a grid search was conducted with all six SSM classes for each of these CNNs. The fine-tuned CNNs achieve higher classification accuracies. With validation accuracies of at least 78.26%, all four CNNs achieve better results than BasicNet. EfficientNet_V2_S achieves the best validation accuracy, 82.13%. For RegNet_Y_16GF and EfficientNet_V2_S, there seems to be no simple correlation between learning rate and validation accuracy: they achieve higher validation accuracies for small batch sizes at large learning rates and for large batch sizes at small learning rates. For ResNeXt101_64X4D, the validation accuracies are higher at larger learning rates. For ConvNeXt_Base, the validation accuracies are higher at smaller learning rates and do not seem to depend strongly on the batch size.
Because the validation accuracies for a fine-tuned ConvNeXt_Base and a ConvNeXt_Base used as an FFE are similar, it was investigated whether a partially frozen ConvNeXt_Base achieves better classification results. The idea behind partial freezing is that layers close to the input are more likely to extract unspecific features, while layers close to the output are more likely to extract dataset-specific features. Therefore, if the front layers are frozen, the back layers can be fine-tuned to the new dataset.
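With torchvision, these TL variants can be sketched as follows for ConvNeXt_Base (the other three CNNs are handled analogously); the freezing granularity of one top-level feature stage per "layer" is an illustrative assumption.

```python
import torch.nn as nn
from torchvision.models import convnext_base, ConvNeXt_Base_Weights

# Load ImageNet-1K weights and replace the 1000-class head
# with a six-class SSM head.
model = convnext_base(weights=ConvNeXt_Base_Weights.IMAGENET1K_V1)
model.classifier[2] = nn.Linear(model.classifier[2].in_features, 6)

def freeze_first_stages(net: nn.Module, k: int) -> None:
    """Freeze the first k stages of the feature extractor.

    k = len(net.features) corresponds to the fixed feature extractor (FFE),
    k = 0 to full fine-tuning, anything in between to partial freezing.
    """
    for stage in net.features[:k]:
        for param in stage.parameters():
            param.requires_grad = False

freeze_first_stages(model, k=2)  # partially frozen variant from this section
```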
The classification results can be seen in Figure 6. It is striking that validation accuracies of only less than 50% were achieved in five attempts, four times with a learning rate of 0.01 and once with a learning rate of 0.005, i.e., overall, at high learning rates. In general, a trend can be observed that with many frozen layers, a high learning rate leads to a better validation accuracy and with few frozen layers, a low learning rate leads to a better validation accuracy.
The final classification within this work was performed on the re-sorted dataset for pairwise classes and for all six classes. Augmentations that had proven to increase the classification accuracy were applied in the following groups:
  • no augmentations;
  • geometric augmentations: Transposing; Striding Crops; Rotation by 180°; Striding Crops and Transposing; Striding Crops and Rotation by 180°; Transposing and Rotation by 180°; and Striding Crops, Transposing and Rotation by 180°;
  • photometric augmentations: Color; Contrast; Spot Reflections; Color and Spot Reflection; Contrast and Spot Reflection; and Contrast, Spot Reflection and Color.
Both augmentation groups were also applied at the same time.
The final classification was performed with EfficientNet_V2_S with a learning rate of 0.005 and a batch size of 16, ConvNeXt_Base with a learning rate of 0.0005, a batch size of 16 and two frozen layers, and BasicNet with a learning rate of 0.0005. These hyperparameters were selected because they produced the best results in the previous grid searches. The test accuracy of each run was calculated using the CNN weights of the epoch with the best validation accuracy. The achieved final accuracies for the classification of six classes can be seen in Table 7 (subdivided according to accuracy on the validation dataset and on the test dataset).
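Selecting the weights of the best validation epoch for the test run can be realized with pytorch-lightning's checkpointing; the following is a sketch, assuming a LightningModule that logs a metric under the illustrative name "val_acc".

```python
import pytorch_lightning as pl
from pytorch_lightning.callbacks import ModelCheckpoint

# Keep only the checkpoint of the epoch with the best validation accuracy;
# "val_acc" is an assumed metric name logged by the LightningModule.
checkpoint = ModelCheckpoint(monitor="val_acc", mode="max", save_top_k=1)
trainer = pl.Trainer(max_epochs=50, callbacks=[checkpoint])

# trainer.fit(lit_model, train_loader, val_loader)
# trainer.test(lit_model, test_loader, ckpt_path="best")  # best-epoch weights
```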
Table 7. Validation and test accuracies for the final classification of all six SSM classes depending on CNN and augmentation type. The highest achieved test accuracy is highlighted in bold.

| Model | Accuracy | None [%] | Geom. [%] | Photom. [%] | Both [%] |
|---|---|---|---|---|---|
| EfficientNet_V2_S | ACCval | 82.13 | 80.68 | 75.36 | 82.61 |
| | ACCtest | 73.08 | 82.69 | 76.92 | 75.96 |
| ConvNeXt_Base | ACCval | 81.16 | 79.71 | 78.26 | 79.23 |
| | ACCtest | 79.81 | 80.77 | 81.73 | **83.65** |
| BasicNet | ACCval | 69.57 | 70.53 | 67.15 | 73.43 |
| | ACCtest | 67.31 | 70.19 | 68.27 | 70.19 |
The best test accuracy for the classification of six classes, 83.65%, was achieved with ConvNeXt_Base and both augmentation types. A mean AUC of 0.927 shows that all classes are well distinguishable. Class 4 is the least and class 1 the most distinguishable. In the confusion matrix in Figure 7, it can be seen that of the misclassified images, all except one were only placed one class away.
The best final test accuracy for pairwise classification was achieved with ConvNeXt_Base and photometric augmentations; it is 91.35%. The mean AUC is 0.919, which indicates a good differentiability between the classes. Class (5,6) is the least and class (1,2) is the most distinguishable. All misclassified images were only placed one class away.
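The confusion matrices and per-class AUC values reported above can be computed with torchmetrics; a sketch with placeholder model outputs for the 104 test images and six classes follows.

```python
import torch
from torchmetrics.classification import MulticlassAUROC, MulticlassConfusionMatrix

# Placeholder softmax outputs and labels standing in for the test set.
probs = torch.softmax(torch.randn(104, 6), dim=1)
targets = torch.randint(0, 6, (104,))

confmat = MulticlassConfusionMatrix(num_classes=6)
print(confmat(probs, targets))               # rows: true class, cols: predicted

auroc = MulticlassAUROC(num_classes=6, average="macro")
print(auroc(probs, targets))                 # mean AUC over all classes
```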
The test accuracies depending on the model and augmentation group are shown in Figure 8. It appears that the six-class classification benefits from augmentations, as the score with no augmentations is the lowest for each model. In pairwise classification, no trend regarding augmentation group is recognizable.

5. Discussion

The classification success for two SSM classes using BasicNet, where a validation accuracy of 98.33% was achieved, has shown that the classification of SSMs is in principle possible with CNNs. However, the drop in accuracy to 69.08% when classifying all six classes showed that BasicNet is not suitable for distinguishing between all classes. Since the training accuracy also dropped, it was investigated whether better results can be achieved with other hyperparameters, whether data augmentation can improve the classification results, and whether more complex CNNs and TL can improve the classification performance.
In order to find out whether better classification results can be achieved with different hyperparameters, a grid search was carried out with two batch sizes and five learning rates. Indeed, a hyperparameter combination was found with which the validation accuracy could be improved to 69.57%. As expected, the accuracies for larger learning rates are worse than those for lower learning rates. With too high a learning rate, the SGD optimizer may update the weights and biases too strongly, which can lead to the optimal values being skipped over [27]. It was also found that the optimal learning rate is likely to lie between 0.0001 and 0.001, as the achieved accuracy values are similar in this hyperparameter range. A further grid search with more grid points in this range might have found the optimal hyperparameter combination. However, grid search is computationally expensive; for further work, less expensive methods such as random search should be used.
The image size, which can also be regarded as a hyperparameter, was examined as well, mainly because the GPU memory was overloaded at full image size [28].
For the largest examined image size, the difference between the validation accuracy and the training accuracy was found to be high, which indicates overfitting (Table 4). BasicNet may be overfitting here because larger images are more detailed and, therefore, contain more specific features. BasicNet could, thus, memorize the training images based on their details and, therefore, distinguish less accurately between the classes with larger image sizes. The accuracies for the middle three examined image sizes are very close to each other. In addition, the difference between the respective validation and training accuracy is small, which indicates that BasicNet does not overfit to the training data. In other words, it can generalize well, which means that it distinguishes classes based on their characteristic features. However, the training accuracy is only about 91% in each case. Therefore, it is very unlikely that better validation accuracy can be achieved with these image sizes. The optimal image size probably lies between the largest and the second largest tested image size, because at these two image sizes, the best validation accuracies were achieved. No further studies were conducted in this regard. A sharp drop in accuracy can be seen at the smallest selected image size. Looking at Figure 9, it is noticeable that details are omitted at this image size. In addition, artifacts appear in the image. They are recognizable by the new pixelated pattern that replaces the previously existing grooves.
Image augmentation was identified as a further possibility to increase the classification accuracy. Five different geometric and photometric augmentation types were investigated. Of the geometric augmentation types, Transposing, Striding Crops and Rotation by 180° improved the validation accuracy. That is why combinations of them were examined in a second experiment run, in which the validation accuracy could be increased in comparison to the non-augmented classification, but not in comparison to the classification with the non-combined augmentation types.
When Transposing and Rotation by 180° are applied, the training accuracy is lower than the validation accuracy, which is usually not the case. This may indicate that the validation dataset is less complex than the training dataset and, therefore, easier to classify. The lower complexity may also be a consequence of the fact that the images of the validation dataset are not transformed by augmentations. Accordingly, all SSMs of the validation dataset have the same orientation. The lower validation accuracy can also be an indication that the validation set is too small and, therefore, too homogeneous. It may be that the validation dataset, in contrast to the training dataset, does not sufficiently represent the diversity of all SSMs. Another indication that the validation data set is too small is the fact that three validation accuracies have exactly the same value. The training accuracy increased slightly and the validation accuracy remained the same, in comparison to the non-augmented classification, when Rotation by Small Angles was applied. This may be related to the shift invariance, which is ensured by the pooling layer. If patterns from one area of the image differ only slightly from each other, which is the case for images transformed with the Rotation by Small Angles augmentation type, the pooling layer produces the same output for them. Therefore, this augmentation type has only minor effects. In the Flips augmentation experiment, the training accuracy improved slightly, but the validation accuracy declined to 50.41%. This is due to the fact that the SSM images are flipped with a probability of 50% in the training run and, therefore, half of the images have a different orientation compared to the validation set images. BasicNet may not be able to find the flipped patterns in the validation set images.
Of the photometric augmentation types, Color, Contrast and Spot Reflections improved the validation accuracy compared to the non-augmented classification. With three photometric types of augmentation, it should, again, be noted that the validation accuracy is higher than the training accuracy. This may, again, be due to the fact that the validation set is too homogeneous compared to the augmented training set. The application of the Overexposure augmentation decreased the training and validation accuracy in comparison to the non-augmented run. This could be due to the fact that with this type of augmentation, an image is overlaid with a brightened image. This overlay might introduce image artifacts. In addition, a change in brightness has also led to a reduction in validation and training accuracy, when the Brightness augmentation type was applied. It is surprising that a simple change in image brightness does not improve the validation accuracy, unlike other simple modifications such as contrast or color changes. In a second run, combinations of classification-improving augmentations were tested again. With all photometric augmentation types, the validation accuracy was improved in comparison to the non-augmented classification. Compared to the runs with non-combined augmentation types, the validation accuracy appears to deteriorate for the combination Contrast and Color. The combination of Contrast and Spot Reflection has clearly improved the validation accuracy. For the remaining two combinations, the validation accuracy lies between the validation accuracies of the respective basic runs. For photometric augmentation combinations, the validation accuracies are, again, higher than the training accuracies, which again indicates that the validation set is too small or homogeneous.
In the final classification experiments, image augmentations were tested in a geometric group, a photometric group and a group with both augmentation types. It has been found that augmentations definitely improve the accuracy for the classification of six classes. In case of pairwise classification, it has been found that in some cases, non-augmented classification achieves higher results than augmented classification.
Further work could investigate whether there are more augmentation types that improve the SSM classification. In addition, a grid search or other hyperparameter tuning methods could be used to search for the optimal augmentation parameters, such as the probability with which they are applied.
As BasicNet did not perform well in classifying all six classes, a CNN was tested that has already proven itself in classifying surface defect classes. This CNN was developed by Zhou et al. [26] and is called ZhouNet in this work. Compared to the achieved accuracy of 99.4% by Zhou et al. in the classification of eight classes, this work only achieved an accuracy of 65.70% in the classification of six classes.
Three different combinations of image width, height and number of color channels were examined with ZhouNet. In its original architecture, in which monochromatic images with a height and width of 40 pixels each are classified, ZhouNet achieved a validation accuracy of only 62.32%. Because colored and larger images contain more details, ZhouNet was adapted to process them by adapting the fully connected layer. Better validation accuracies were achieved by adapting both the number of color channels and the image size. The training accuracy is above 98% for all runs and, therefore, far above the validation accuracy. This indicates that ZhouNet heavily overfits and cannot generalize, probably because the SSM images are heavily compressed by resizing them to such a small size. As a result, they likely lose details that would be important for distinguishing the classes. Zhou et al. resize images of 200 by 200 pixels to a size of 40 by 40 pixels; the SSM images, with an original size of 960 by 1280 pixels, are therefore much more strongly compressed when resized. The validation accuracy in the classification of SSMs is far below the test accuracy mentioned in the paper. This may be due to the fact that Zhou et al. classify images of various defect classes, which seem easier to distinguish than classes that contain images of only a single surface defect with varying severity.
Overall, the maximum validation accuracy of 65.70% that was achieved with ZhouNet is lower than BasicNet’s maximum validation accuracy of 69.57% for the classification of all six SSM classes. ZhouNet was not examined further because the GPU memory was almost overloaded when images with a width and height of 80 by 80 pixels were processed.
As with BasicNet, a grid search was carried out with pre-trained CNNs, which were expected to produce better classification results. These CNNs, namely ConvNeXt_Base, ResNeXt101_64X4D, RegNet_Y_16GF and EfficientNet_V2_S, were used both for fine-tuning and as FFEs.
It has been found that all examined pre-trained CNNs work better when they are fine-tuned. Only ConvNeXt_Base achieves similar performance both as an FFE and when fine-tuned. Overall, only ConvNeXt_Base as an FFE exceeds the validation accuracy of BasicNet. The other three CNNs perform worse than BasicNet, which is related to the fact that all layers except the last classification layer are frozen during their use as FFEs and, therefore, their weights and biases cannot be trained on the new image data.
As expected, the CNNs were able to achieve higher validation accuracies after being fine-tuned. All of them at least achieve a validation accuracy of 78.26% and, thus, perform better than BasicNet. With 82.13%, the fine-tuned EfficientNet_V2_S achieves the highest validation accuracy.
Because it achieved similar validation accuracies when fine-tuned and as an FFE, ConvNeXt_Base was further investigated by examining its classification accuracy for different numbers of frozen layers. With two frozen layers, the classification accuracy could indeed be increased to 81.2%.
Final classifications were performed with EfficientNet_V2_S and with ConvNeXt_Base, because these CNNs were the only two that achieved classification accuracies of over 80%. BasicNet was added for comparison. All three CNNs were trained with hyperparameters that were found to be beneficial for them in the process of this work.
In the end, for the classification of all six classes, an accuracy of 83.65%, and for the classification of pairwise classes, an accuracy of 91.35%, was achieved. Both highest accuracies were achieved with a partially frozen ConvNeXt_Base and image augmentations. BasicNet as a non-pretrained CNN only reached an accuracy of 70.19% for the classification of all six classes and an accuracy of 83.65% for the classification of pairwise classes. Mean AUC values of 0.927 for the classification of six classes and 0.919 for the classification of pairwise classes indicate that the six SSM classes can be distinguished reasonably well from each other.

6. Conclusions and Outlook

In this work, it is shown that the SSMs of ball bearings can be classified using CNNs. It is also shown that the classification accuracy can be improved by image augmentation and transfer learning with pre-trained CNNs. Overall, an accuracy of 83.65% for the classification of all six SSM classes and 91.35% for the classification of pairwise classes was achieved. Due to its achieved accuracy, the pairwise classification could be used as a preliminary decision maker in the evaluation of SSMs and, thereby, reduce the human error rate.
Standstill marks mainly occur in ball bearings, as the width of the Hertzian contact area in roller bearings (radial and tapered roller bearings) is relatively narrow and the contact zone is exposed even with small pivoting movements. However, we see similar damage in highly loaded tapered roller bearings [5], so that the ML-based classification approach presented here could also be applied to these bearing types.
In follow-up projects, the evaluation of SSMs could be investigated, using ML regression. This could help to classify SSMs whose class affiliation is ambiguous.
In addition, it could be investigated whether better classification accuracy can be achieved with Vision Transformers (ViTs). ViTs sometimes perform better than CNNs in image-related deep learning tasks.
Furthermore, an attempt could be made to achieve a higher classification accuracy by using a model ensemble of several CNNs with different strengths.
In future work, more image data should also be utilized, as the size of the validation set could have caused problems in some cases. Alternatively, the entire data set could be split up differently.

Author Contributions

Conceptualization, M.G., A.B. and D.M.; methodology, D.M.; software, D.M.; validation, D.M.; formal analysis, M.G. and A.B.; investigation, D.M.; resources, M.G.; data curation, M.G.; writing—original draft preparation, D.M.; writing—review and editing, M.G. and A.B.; visualization, D.M.; supervision, M.G. and A.B.; project administration, M.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The image data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. SKF. Bearing Damage and Failure Analysis. 2017. Available online: https://cdn.skfmediahub.skf.com/api/public/0901d1968064c148/pdf_preview_medium/0901d1968064c148_pdf_preview_medium.pdf (accessed on 2 August 2024).
  2. Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Huang, Z.; Karpathy, A.; Khosla, A.; Bernstein, M.; et al. ImageNet Large Scale Visual Recognition Challenge. Int. J. Comput. Vis. 2015, 115, 211–252. [Google Scholar] [CrossRef]
  3. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 84–90. [Google Scholar] [CrossRef]
  4. He, K.; Zhang, X.; Ren, S.; Sun, J. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 1026–1034. [Google Scholar] [CrossRef]
  5. Grebe, M. Wälzlager im Betrieb bei Kleinen Schwenkwinkeln oder Unter Vibrationsbelastung; Narr Francke Attempto Verlag: Tübingen, Germany, 2017. [Google Scholar]
  6. de la Presilla, R.; Wandel, S.; Stammler, M.; Grebe, M.; Poll, G.; Glavatskih, S. Oscillating Rolling Element Bearings: A Review of Tribotesting and Analysis Approaches. Tribol. Int. 2023, 188, 108805. [Google Scholar] [CrossRef]
  7. Grebe, M.; Molter, J.; Schwack, F.; Poll, G. Damage mechanisms in pivoting rolling bearings and their differentiation and simulation. Bear. J. 2018, 3, 71–86. [Google Scholar]
  8. Zhang, K.; Wang, J.; Shi, H.; Zhang, X.; Tang, Y. A fault diagnosis method based on improved convolutional neural network for bearings under variable working conditions. Measurement 2021, 182, 109749. [Google Scholar] [CrossRef]
  9. Ye, M.; Yan, X.; Jiang, D.; Xiang, L.; Chen, N. MIFDELN: A multi-sensor information fusion deep ensemble learning network for diagnosing bearing faults in noisy scenarios. Knowl. Based Syst. 2024, 284, 111294. [Google Scholar] [CrossRef]
  10. Yan, X.; Jiang, D.; Xiang, L.; Xu, Y.; Wang, Y. CDTFAFN: A novel coarse-to-fine dual-scale time-frequency attention fusion network for machinery vibro-acoustic fault diagnosis. Inf. Fusion 2024, 112, 102554. [Google Scholar] [CrossRef]
  11. Guilhoto, L.F. An Overview Of Artificial Neural Networks for Mathematicians. 2018. Available online: https://api.semanticscholar.org/CorpusID:85504929 (accessed on 6 March 2024).
  12. Herwig, N.; Peng, Z.; Borghesani, P. Bridging the trust gap: Evaluating feature relevance in neural network-based gear wear mechanism analysis with explainable AI. Tribol. Int. 2023, 187, 108670. [Google Scholar] [CrossRef]
  13. Wang, M.; Yang, L.; Zhao, Z.; Guo, Y. Intelligent prediction of wear location and mechanism using image identification based on improved Faster R-CNN model. Tribol. Int. 2022, 169, 107466. [Google Scholar] [CrossRef]
  14. Staar, B.; Bayrak, S.; Paulkowski, D.; Freitag, M. A U-Net Based Approach for Automating Tribological Experiments. Sensors 2020, 20, 6703. [Google Scholar] [CrossRef] [PubMed]
  15. Lemley, J.; Bazrafkan, S.; Corcoran, P. Smart Augmentation Learning an Optimal Data Augmentation Strategy. IEEE Access 2017, 5, 5858–5869. [Google Scholar] [CrossRef]
  16. Stanford University. Image Classification. Available online: https://cs231n.github.io/classification/#validation-sets-for-hyperparameter-tuning (accessed on 29 February 2024).
  17. Li, Q.; Luo, Z.; Chen, H.; Li, C. An Overview of Deeply Optimized Convolutional Neural Networks and Research in Surface Defect Classification of Workpieces. IEEE Access 2022, 10, 26443–26462. [Google Scholar] [CrossRef]
  18. Yadav, S.; Shukla, S. Analysis of k-Fold Cross-Validation over Hold-Out Validation on Colossal Datasets for Quality Classification. In Proceedings of the 2016 IEEE 6th International Conference on Advanced Computing (IACC), Bhimavaram, India, 27–28 February 2016; pp. 78–83. [Google Scholar] [CrossRef]
  19. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016; Available online: http://www.deeplearningbook.org (accessed on 6 March 2024).
  20. Scikit-Learn. Stratified Shuffle Split. Available online: https://scikit-learn.org/stable/modules/cross_validation.html#stratified-shuffle-split (accessed on 15 January 2024).
  21. Steiner, A.; Kolesnikov, A.; Zhai, X.; Wightman, R.; Uszkoreit, J.; Beyer, L. How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers. arXiv 2022, arXiv:2106.10270. [Google Scholar]
  22. Molnar, C. Interpretable Machine Learning. 2019. Available online: https://christophm.github.io/interpretable-ml-book/ (accessed on 11 April 2024).
  23. Zafar, M.R.; Khan, N. Deterministic Local Interpretable Model-Agnostic Explanations for Stable Explainability. Mach. Learn. Knowl. Extr. 2021, 3, 525–541. [Google Scholar] [CrossRef]
  24. Buslaev, A.; Iglovikov, V.I.; Khvedchenya, E.; Parinov, A.; Druzhinin, M.; Kalinin, A.A. Albumentations: Fast and Flexible Image Augmentations. Information 2020, 11, 125. [Google Scholar] [CrossRef]
  25. Wong, S.C.; Gatt, A.; Stamatescu, V.; McDonnell, M.D. Understanding data augmentation for classification: When to warp? arXiv 2016, arXiv:1609.08764. [Google Scholar]
  26. Zhou, S.; Chen, Y.; Zhang, D.; Xie, J.; Zhou, Y. Classification of surface defects on steel sheet using convolutional neural networks. Mater. Teh. 2017, 51, 123–131. [Google Scholar] [CrossRef]
  27. PyTorch. Illustration of Transforms. Available online: https://pytorch.org/vision/main/auto_examples/transforms/plot_transforms_illustrations.html (accessed on 3 March 2024).
  28. Saponara, S.; Elhanashi, A.; Saponara, A. Impact of Image Resizing on Deep Learning Detectors for Training Time and Model Performance. In Applications in Electronics Pervading Industry, Environment and Society; Springer International Publishing: Berlin/Heidelberg, Germany, 2022. [Google Scholar]
Figure 1. A typical standstill mark on the raceway of an axial ball bearing (type 51206). The undamaged inner zone and the elliptical shape of the damage with areas of severe tribooxidation are characteristic features.
Figure 2. Evaluation of standstill marks with school grades (upper left: grade 1; lower right: grade 6).
Figure 3. Top row: original image, overexposure and spot reflections. Bottom row: color, brightness and contrast.
Figure 4. Top row: original image, horizontal flip and vertical flip. Bottom row: transposing, rotation by 180° and rotation by small angles. Note: The small black border of the images rotated by small angles vanishes when the image is cropped before being fed into the CNN.
Figure 6. The validation accuracies for the classification of all six SSM classes on the re-sorted dataset with a pre-trained and partially frozen ConvNeXt_Base, depending on learning rate and number of frozen layers.
Figure 7. The confusion matrices of the best final classification runs, on the left for six classes and on the right for pairwise classes.
Figure 8. The test accuracies of the best final classification runs, depending on CNN and augmentation group, on the left for six classes and on the right for pairwise classes.
Figure 9. The left image has the original resolution of 960 by 1280 pixels. The middle image has a resolution of 90 by 122 pixels; many details are still visible. The right image has a resolution of 45 by 61 pixels; here, details are completely missing, such as the one marked by the red circle, or are deformed, such as the one marked by the yellow circle.
Table 1. Image distribution of the original dataset.

| Set | Total | Class 1 | Class 2 | Class 3 | Class 4 | Class 5 | Class 6 |
|---|---|---|---|---|---|---|---|
| Dataset | 1032 | 129 | 117 | 305 | 249 | 166 | 66 |
| Training | 720 | 90 | 82 | 213 | 174 | 115 | 46 |
| Validation | 208 | 26 | 24 | 61 | 50 | 34 | 13 |
| Test | 104 | 13 | 11 | 31 | 25 | 17 | 7 |
Table 2. Image distribution of the re-sorted dataset.

| Set | Total | Class 1 | Class 2 | Class 3 | Class 4 | Class 5 | Class 6 |
|---|---|---|---|---|---|---|---|
| Dataset | 1032 | 128 | 118 | 306 | 245 | 170 | 65 |
| Training | 721 | 90 | 82 | 214 | 171 | 118 | 46 |
| Validation | 207 | 25 | 24 | 61 | 49 | 35 | 13 |
| Test | 104 | 13 | 12 | 31 | 25 | 17 | 6 |
Back to TopTop