Segmented Embedded Rapid Defect Detection Method for Bearing Surface Defects

Abstract: The rapid development of machine vision has prompted the continuous emergence of new detection systems and algorithms for surface defect detection. However, most existing methods are established with few comparisons and verifications, and the methods described still have various problems. Thus, an original defect detection method, the Segmented Embedded Rapid Defect Detection Method for Surface Defects (SERDD), is proposed in this paper. This method realizes the two-way fusion of image processing and defect detection, and can efficiently and accurately detect surface defects such as depressions, scratches, notches, oil, shallow characters, and abnormal dimensions. In addition, a character recognition method based on Spatial Pyramid Character Proportion Matching (SPCPM) is used to identify the engraved characters on the bearing dust cover. Moreover, the problem of characters being cut during coordinate transformation is solved through Image Self-Stitching-and-Cropping (ISSC). This paper uses real image data to verify and compare the methods, and demonstrates their effectiveness and advancement through detection accuracy, missing alarm rate, and false alarm rate. This method can provide practical machine vision support for bearing surface defect detection.


Introduction
Bearings are important mechanical parts produced in large quantities on assembly lines. During bearing production and assembly, factors such as equipment wear, collision, extrusion, and oil pollution inevitably cause some bearings to show abnormal shapes, colors, or dimensions, which are the so-called bearing defects. Bearing defects can cause quality problems of varying severity in the machine, leading to failures or serious damage. Therefore, bearing surface defect detection [1] and bearing fault diagnosis [2][3][4] are of great significance.
Bearing fault diagnosis generally analyzes the signals in bearings to identify damage or faults. Peng [5] uses an improved Hilbert-Huang transform (HHT), a time-frequency method, to detect bearing faults. Randall [6] analyzes the impulsive signals from bearing faults by spectral correlation and envelope analysis. Lou [7] designs a scheme for bearing fault diagnosis that processes the vibration signals with the wavelet transform and neuro-fuzzy classification. Moreover, with the development of artificial intelligence, some deep learning based methods [8][9][10] analyze time/frequency-domain vibration signals, extract features from them, and the neural networks can be

Methods
In existing surface defect detection methods, defect detection is a separate module that follows image processing [24][25][26]. This paper changes this mode and adopts an innovative two-way fusion of image processing and defect detection. Defect detection is no longer an independent step in the whole system, but a set of parts integrated and synchronized with image processing. Some defect detection processes are even the image processing itself. We call this defect detection method the Segmented Embedded Rapid Defect Detection Method for Surface Defects (SERDD). Because it is integrated and synchronized with image processing, SERDD can streamline the system's execution steps and reduce algorithm complexity and time overhead.
The core idea of SERDD is to embed defect detection into image processing by exploiting particular characteristics of the image processing steps, forming a segmented detection mode. Notches, abnormal dimensions, non-character region defects, and character defects are detected in the four image processing stages of bearing positioning, bearing segmentation, character extraction, and character recognition, respectively, as shown in Figure 1. The principle and implementation processes are described in the following sections.

Algorithm Flow
The SERDD algorithm consists of two parts: image processing and defect detection. The specific algorithm flow is shown in Figure 2. It can be seen from Figure 2 that image processing proceeds in sequence from image reading to character recognition. SERDD embeds defect detection into the image processing modules shown in the yellow boxes. Therefore, in addition to completing their image processing tasks, these modules also perform defect detection at the same time. These will be described in detail in the following sections.

Image Preprocessing
The bearing surface image obtained by the image acquisition system is shown in Figure 3.
Bilateral filtering and median filtering are successively used on the image, eliminating most of the noise on the bearing surface and background.

Bearing Positioning
Because of the large difference between the bearing and the background, the bearing's outer contour can be found easily. From the contour, the center and radius of the bearing can be calculated by the least squares method. The outer circle can then be reconstructed from this center and radius.
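As an illustration of the fitting step, the following is a minimal sketch of a least-squares circle fit. The text does not say which least-squares formulation is used, so the algebraic (Kasa) fit below is an assumption, and the function name `fit_circle` is hypothetical.

```python
import math


def _det3(a):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def fit_circle(points):
    """Algebraic least-squares circle fit (assumed variant); returns (cx, cy, r).

    Fits x^2 + y^2 + D*x + E*y + F = 0 to the contour points by solving
    the 3x3 normal equations with Cramer's rule.
    """
    sxx = sxy = syy = sx = sy = 0.0
    bx = by = b1 = 0.0
    for x, y in points:
        z = x * x + y * y
        sxx += x * x
        sxy += x * y
        syy += y * y
        sx += x
        sy += y
        bx -= z * x
        by -= z * y
        b1 -= z
    m = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(len(points))]]
    rhs = [bx, by, b1]
    d = _det3(m)
    if abs(d) < 1e-12:
        raise ValueError("points are degenerate (for example, collinear)")
    sol = []
    for k in range(3):
        mk = [row[:] for row in m]
        for i in range(3):
            mk[i][k] = rhs[i]  # replace column k with the right-hand side
        sol.append(_det3(mk) / d)
    D, E, F = sol
    cx, cy = -D / 2.0, -E / 2.0
    return cx, cy, math.sqrt(cx * cx + cy * cy - F)
```

The input is a list of (x, y) contour points; in practice these would come from the contour extraction step, which is outside the scope of this sketch.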

Notch Detection of Bearing Outer Ring
By calculating the Euclidean distance between the contour of the outer ring and the circle derived by the least square method, it can be determined whether there is a notch on the bearing's outer circle. Namely, if the Euclidean distance is greater than the reference threshold, some notches are on the outer contour.
Assume that the point set on the contour of the outer circle is $Con = \{x_1, x_2, \ldots, x_n\}$, where $x_i$ is a point on the contour, and that the point set on the circle derived from the center and radius is $Cir = \{y_1, y_2, \ldots, y_n\}$, where $y_i$ is the point on the circle corresponding to $x_i$. The distance between the two is:

$$\mathrm{dis}(x_i, y_i) = \lVert x_i - y_i \rVert_2$$

The basis for judging whether there is a notch on the outer ring is:

$$d = \begin{cases} 1, & \max_i \mathrm{dis}(x_i, y_i) > T \\ 0, & \text{otherwise} \end{cases}$$

Here, d = 1 means there is a defect; otherwise, there is no defect. T is the distance threshold for judging whether there is a notch. Figure 4 is an instance of the notch defect.
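The notch test can be sketched as follows. This sketch assumes that the point on the fitted circle corresponding to a contour point lies on the same ray from the center, so the Euclidean distance between the pair reduces to the absolute difference between the point's distance to the center and the radius. The function name `has_notch` is hypothetical.

```python
import math


def has_notch(contour, center, radius, T):
    """Return 1 if any contour point deviates from the fitted circle by more than T, else 0."""
    cx, cy = center
    for x, y in contour:
        rho = math.hypot(x - cx, y - cy)
        # Distance between the contour point and its radial projection on the circle.
        if abs(rho - radius) > T:
            return 1
    return 0
```

The threshold T would be tuned on real images; no value is given in the text.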

Bearing Segmentation
Since the radius ratio of the outer ring, inner ring, and dust cover of the standard bearing of the same model is fixed, we can divide the outer ring, inner ring, and dust cover of the bearing accordingly, as shown in Figure 5.

Abnormal Dimension Detection
The reason for adopting segmentation based on a fixed ratio is as follows. If the outer ring, inner ring, or dust cover obtained by dividing according to the standard bearing ratio deviates from its expected position, there must be a dimension defect. SERDD performs abnormal dimension detection at this point. Since the inside and outside edges of the three parts have low gray values, we only need to count the number of low-gray pixels in the edge regions to determine whether there is a dimension defect, as shown in Figure 6.
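One possible way to realize this counting step is sketched below. The text does not give the decision rule, so everything specific here is an assumption: the edge region is taken as a thin annulus around each expected edge radius, and a dimension defect is flagged when too small a fraction of that annulus is dark. The names and the parameters `half_width`, `low_gray`, and `min_fraction` are hypothetical.

```python
import math


def dark_fraction_on_edge(img, center, edge_radius, half_width=1.0, low_gray=60):
    """Fraction of pixels near the expected edge radius whose gray value is below low_gray."""
    cx, cy = center
    dark = total = 0
    for y, row in enumerate(img):
        for x, g in enumerate(row):
            if abs(math.hypot(x - cx, y - cy) - edge_radius) <= half_width:
                total += 1
                if g < low_gray:
                    dark += 1
    return dark / total if total else 0.0


def has_dimension_defect(img, center, edge_radii, min_fraction=0.8,
                         half_width=1.0, low_gray=60):
    """Flag a defect if any expected edge is not sufficiently dark (assumed rule)."""
    return any(
        dark_fraction_on_edge(img, center, r, half_width, low_gray) < min_fraction
        for r in edge_radii
    )
```

Here `img` is a grayscale image as a list of rows, and `edge_radii` would be derived from the fitted outer radius and the fixed ratios of the bearing model.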

Polar to Cartesian (P2C) Coordinate Transformation
The three regions are all in the shape of a ring. In order to facilitate subsequent character recognition and defect detection, we introduced Polar to Cartesian (P2C) coordinate transformation [27] to transform the ring into a more tractable rectangular band.
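A minimal sketch of such an unwrapping is given below, using nearest-neighbor sampling. The conventions are assumptions made for illustration: row 0 of the band corresponds to the outer edge of the ring, the band length defaults to the outer circumference, and the angle increases with the column index. The function name `unwrap_ring` is hypothetical.

```python
import math


def unwrap_ring(img, center, r_in, r_out, length=None):
    """Unwrap the ring between r_in and r_out into a rectangular band (list of rows)."""
    cx, cy = center
    height = int(round(r_out - r_in))
    if length is None:
        length = int(round(2 * math.pi * r_out))
    rows, cols = len(img), len(img[0])
    band = [[0] * length for _ in range(height)]
    for m in range(height):
        rho = r_out - m  # row 0 samples the outer edge (assumed convention)
        for n in range(length):
            theta = 2 * math.pi * n / length
            x = int(round(cx + rho * math.cos(theta)))
            y = int(round(cy + rho * math.sin(theta)))
            if 0 <= y < rows and 0 <= x < cols:
                band[m][n] = img[y][x]
    return band
```

Column 0 of the band corresponds to the polar axis, which is where the ring is cut.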
As shown in Figure 7, R is the outer radius of the ring, r is the inner radius of the ring, H is the width of the rectangular band, L is the length of the rectangular band, f(x, y) is a point on the ring, F(m, n) is the point of the rectangular band corresponding to f(x, y), m and n are the row and column indices in the band (the upper left corner of the rectangular band is (0, 0)), and h is the height of F(m, n) in the band. The corresponding relationship is as follows:

After P2C, the three-part image in Figure 5 is converted into the rectangular band images in Figure 8.

Image Self-Stitching-And-Cropping (ISSC)
Unlike the outer and inner rings, the dust cover has engraved characters whose grayscale appearance is nearly the same as that of defects. Thus, they need to be removed through character recognition. P2C cuts a ring along the polar axis and converts it into a rectangular band. If a character crosses the polar axis, as shown by the red line in Figure 9a, it will be cut during P2C. As shown in Figure 9b, "C" is cut, so the subsequent character recognition will fail to recognize "C" normally.
At an actual industrial site, the bearings are placed at random orientations, so the problem of characters being cut cannot be avoided.
Liu [20] proposes a band expansion method based on character width to solve the problem of characters being cut. This method restores the cut characters but retains their cut parts and may cut another character. Chen [28] does not use P2C but computes binary histograms at short intervals along the dust cover, rotates the dust cover until the histograms match the template's histograms, and then separates the character region. Because this matching is difficult, the performance is not satisfactory.
ISSC, proposed in this paper, solves the problem of characters being cut. It is divided into two steps: stitching and cropping. In the stitching step, the band acquired by P2C is copied and horizontally stitched to itself, which restores the originally cut character in the center of the "new band". In the cropping step, a simple binary segmentation is used to roughly distinguish foreground from background. The pixel value of the foreground becomes 0, and the pixel value of the background becomes 255. After that, the cropping position is located by counting the number of foreground pixels in every column. If the number of foreground pixels in a column does not exceed a certain threshold, the column is a background region, which means the cropping position can be set at this column. The distance between the second cropping position and this one is the length of the original band. After these two steps, a new band without any cut character is acquired. Figure 10 illustrates the method with an example. First, the original image is copied and horizontally stitched, increasing the length of the band from the original L to 2L and restoring the originally cut character at the purple column in the figure.
After the binary segmentation, the black pixels in every column of the band are counted from left to right. Once the number of black pixels in a column does not exceed a certain threshold, the column is a background region, as shown by the left yellow column in the figure. A band of length L is cropped from the left yellow column to the right one to obtain a complete band without any cut character. Compensation light reflection and color attenuation may create redundant contours, which may disturb the segmentation. However, the cropping position can only be located at a column without any contours, so all columns with normal or redundant contours are excluded. Therefore, compensation light reflection and color attenuation do not influence the performance of ISSC. This method effectively retains all the information of the original image without adding redundant information. Figure 11 and Algorithm 1 show the process.
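The two steps can be sketched as follows on an already binarized band (foreground 0, background 255, as described above). This is an illustrative sketch rather than a transcription of Algorithm 1; the function name `issc` and the parameter `max_foreground` are hypothetical.

```python
def issc(band, max_foreground=0, foreground=0):
    """Image Self-Stitching-and-Cropping on a binary band given as a list of rows.

    Stitching: the band is concatenated with a copy of itself (length L -> 2L).
    Cropping: the first column whose foreground count does not exceed
    max_foreground is taken as the left cut, and L columns are kept from there.
    """
    height = len(band)
    length = len(band[0])
    stitched = [row + row for row in band]
    for c in range(length):
        count = sum(1 for r in range(height) if stitched[r][c] == foreground)
        if count <= max_foreground:
            return [row[c:c + length] for row in stitched]
    raise ValueError("no background column found; cannot choose a cropping position")
```

Because the cropped window is a cyclic shift of the original band that starts at a background column, every pixel of the original band is kept exactly once.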

Threshold Segmentation
Characters and defects cannot be extracted accurately from the grayscale image directly, so we need to perform binary segmentation on the image. The binary segmentation method we use is the OTSU algorithm [29][30][31]. Assuming that the band has N pixels, m is the number of pixels occupied by the foreground (characters or defects), n is the number of pixels occupied by the background, and T is the set threshold, then:

$$w_0 = \frac{m}{N}, \qquad w_1 = \frac{n}{N}, \qquad m + n = N$$

$$\mu_T = w_0 \mu_0 + w_1 \mu_1$$

$$\sigma_T^2 = w_0 (\mu_0 - \mu_T)^2 + w_1 (\mu_1 - \mu_T)^2$$

Here, $w_0$ and $w_1$ respectively represent the proportions of foreground and background pixels, and $\mu_0$ and $\mu_1$ respectively represent the average gray levels of foreground and background pixels. $\mu_T$ and $\sigma_T^2$ respectively represent the overall average gray level and the between-class variance when T is the set threshold. The OTSU algorithm traverses the threshold from 0 to 255 and finds $T_{OTSU}$, which makes $\sigma_T^2$ the largest, that is:

$$T_{OTSU} = \arg\max_{0 \le T \le 255} \sigma_T^2$$

The OTSU algorithm segments gray images by finding the threshold that maximizes the variance between foreground and background. It can quickly and accurately segment characters, defects, and backgrounds because of the apparent differences between characters and backgrounds in the image. Besides, it can increase the segmentation accuracy of some images, such as low saturation images.
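The threshold search can be sketched as below. The sketch uses the equivalent form w0 * w1 * (mu0 - mu1)^2 of the between-class variance, which is a standard identity for two classes. Which class is foreground depends on image polarity; here class 0 is simply the pixels with gray value at most T. The function name `otsu_threshold` is hypothetical.

```python
def otsu_threshold(pixels):
    """Return the threshold in 0..255 that maximizes the between-class variance.

    pixels: iterable of integer gray values in 0..255.
    """
    hist = [0] * 256
    total = 0
    for g in pixels:
        hist[g] += 1
        total += 1
    total_sum = sum(g * h for g, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    count0 = 0
    sum0 = 0
    for t in range(256):
        count0 += hist[t]
        sum0 += t * hist[t]
        count1 = total - count0
        if count0 == 0 or count1 == 0:
            continue  # one class is empty, the variance is undefined
        w0 = count0 / total
        w1 = count1 / total
        mu0 = sum0 / count0
        mu1 = (total_sum - sum0) / count1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t
```

A band stored as a list of rows can be flattened before the call, for example with `[g for row in band for g in row]`.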

The Removal of Small Connected Domains and Holes
After the threshold segmentation of the outer ring, dust cover, and inner ring bands, some noise or reflective spots remain and form small white connected domains in the non-character regions and small black holes inside the characters. These connected domains and holes will interfere with the subsequent algorithms. Figure 12 demonstrates an example of removing connected domains and holes in a dust cover. It can be seen that all the small white connected domains in the non-character region have been removed [32] and all the holes inside the characters have been filled, which is helpful for character extraction and recognition.

Character Extraction
Before character recognition, each character needs to be extracted from the dust cover band. The first step is to extract all the contours in the band but keep only the outermost contours [33]. This step ensures that each character has only one contour. The next step is to calculate the minimum bounding rectangle of each outermost contour and crop each character, as shown in Figure 13.
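The removal of small white domains and the filling of small black holes can both be expressed as one operation applied twice: repaint every connected region of a given value whose area is at most some limit. The sketch below assumes 4-connectivity and a single area limit for both passes; neither is specified in the text, and the function names are hypothetical.

```python
from collections import deque


def remove_small_regions(binary, value, max_area, replacement):
    """Return a copy in which 4-connected regions of `value` with area <= max_area are repainted."""
    height, width = len(binary), len(binary[0])
    out = [row[:] for row in binary]
    seen = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if seen[y][x] or binary[y][x] != value:
                continue
            seen[y][x] = True
            region = [(y, x)]
            queue = deque(region)
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and binary[ny][nx] == value):
                        seen[ny][nx] = True
                        region.append((ny, nx))
                        queue.append((ny, nx))
            if len(region) <= max_area:
                for ry, rx in region:
                    out[ry][rx] = replacement
    return out


def clean_band(binary, max_area):
    """Remove small white specks, then fill small black holes."""
    no_specks = remove_small_regions(binary, 255, max_area, 0)
    return remove_small_regions(no_specks, 0, max_area, 255)
```

The large black background is itself a connected region, but it exceeds the area limit and is therefore left untouched.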

Defect Detection of Non-Character Regions
As can be seen from Figure 12, the bands contain no noise, so probably only defects or characters remain. Therefore, in the character extraction step, characters can be found easily by contour extraction. It should be noted that if there is a defect in the band, the defect contour will also be extracted as a character candidate. Before character cropping, a simple contour counting step is added. If the number of contours is not equal to the original number of characters in the band, it can be judged that there is a defect in the non-character region (defects may also cause two character contours to adhere, and this step can detect such character defects as well). For example, the normal band has 13 characters: WTOO 608Z CHINA, so the number of contours in the band is 13. Once the number of contours of a band is more than 13, the situation is abnormal regardless of whether the extra contour is a character or a defect. Some instances are shown in Figure 14. There is one additional scenario: one defect adds a contour while another defect removes a character from the bearing surface, so the contour count is unchanged. The counting step cannot detect this case. However, it is classified as a character defect and can be detected by the character recognition method in the following section.
Defects such as depressions, scratches, and oil can all be detected in this step. There are no characters on the outer and inner rings, so it must be a defect as long as any contour is extracted.
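The counting check can be sketched as follows. As a stand-in for outermost-contour extraction, the sketch labels 8-connected white components, which yields one region per outer contour once holes are filled, and it returns axis-aligned bounding boxes rather than minimum-area rectangles. Both substitutions are assumptions, and the function names are hypothetical.

```python
from collections import deque


def component_boxes(binary, fg=255):
    """Bounding boxes (x0, y0, x1, y1) of 8-connected foreground components, in scan order."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if seen[y][x] or binary[y][x] != fg:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            y0 = y1 = y
            x0 = x1 = x
            while queue:
                cy, cx = queue.popleft()
                y0, y1 = min(y0, cy), max(y1, cy)
                x0, x1 = min(x0, cx), max(x1, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and not seen[ny][nx] and binary[ny][nx] == fg):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((x0, y0, x1, y1))
    return boxes


def non_character_defect(binary, expected_chars):
    """True if the number of extracted regions differs from the expected character count."""
    return len(component_boxes(binary)) != expected_chars
```

For the dust cover band in the example above, `expected_chars` would be 13; for the outer and inner ring bands it would be 0.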

Character Recognition
The existing machine vision literature adopts targeted character recognition methods [34]. Liu [20] trains a small neural network on 50 samples to classify seven classes of characters and defective characters. Shen [21] divides each character into 3 × 3 cells, calculates the proportion of the character part in each cell as a feature vector, and then recognizes the character by the 2-norm of the difference between the template feature vector and this feature vector. Chen's method [22] rotates and matches character regions through cross-correlation coefficients, and then recognizes characters through the moment invariant features of the character edge envelopes. Based on the position, division, and normalization features of characters, Wang [34] uses LabVIEW's virtual instrument technology and image processing technology to recognize characters.
This paper proposes a character recognition method based on Spatial Pyramid Character Proportion Matching (SPCPM). Inspired by Shen's method [21], this method identifies whether a character is correct by character proportion matching. Our experiments show that the character recognition method of Shen [21] cannot detect all the defects on characters and easily confuses different characters such as "8" and "6". Moreover, inspired by the improvement from HOG [35] to PHOG [36,37], we give character features a sense of "space" and "level" by adding the concept of Spatial Pyramid Matching (SPM) [38,39]. This enriches and concretizes the features and further increases the difference between different characters, so that each character is recognized more accurately and defects on the characters are identified more accurately.
To get the character proportion features, a single character is first divided into m × n cells, as shown in Figure 15. Consider a pixel p in the i-th cell. Since the image is binary, the value of p can only be 0 or 255. The proportion of character in this cell can be calculated by Formula (10).

$$P_i = \frac{n_{(p=255)}}{n_{(p=255)} + n_{(p=0)}} \tag{10}$$

Here, $n_{(p=255)}$ and $n_{(p=0)}$ respectively represent the number of character pixels and the number of background pixels in the cell. After obtaining the proportion of each cell, they are combined into a 1 × (m × n) vector, which is the feature vector of this character, as shown in Equation (11).

$$C = [P_1, P_2, \ldots, P_{m \times n}] \tag{11}$$

Here, C is the feature vector of this character. Finally, the 2-norm of the difference between the template feature vector $C_t$ and C is calculated. If the 2-norm is less than a certain threshold, the characters match successfully. Otherwise, the recognition fails. The 2-norm is calculated as in Equation (12), where the sum runs over all components of the feature vectors.

$$\lVert C_t - C \rVert_2 = \sqrt{\sum_{i} \left(C_t^{(i)} - C^{(i)}\right)^2} \tag{12}$$
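The single-scale proportion feature and the 2-norm match can be sketched as below. Cell boundaries are obtained by integer division of the character image, which is an assumption about how uneven sizes are handled; the function names are hypothetical and no threshold value is implied.

```python
import math


def cell_proportions(char_img, m, n):
    """Proportion of character pixels (value 255) in each of the m x n cells, row by row."""
    height, width = len(char_img), len(char_img[0])
    feats = []
    for i in range(m):
        r0, r1 = i * height // m, (i + 1) * height // m
        for j in range(n):
            c0, c1 = j * width // n, (j + 1) * width // n
            area = (r1 - r0) * (c1 - c0)
            white = sum(1 for r in range(r0, r1) for c in range(c0, c1)
                        if char_img[r][c] == 255)
            feats.append(white / area if area else 0.0)
    return feats


def l2_distance(a, b):
    """2-norm of the difference between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def matches(char_img, template_feats, m, n, threshold):
    """True if the character's feature vector is within `threshold` of the template's."""
    return l2_distance(cell_proportions(char_img, m, n), template_feats) < threshold
```

The template feature vector would be computed once from a clean reference image of each character with the same m and n.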
To improve the accuracy of character matching and recognition, SPM [38,39] has been introduced. SPM divides the image into several cells and calculates the feature information of each cell. Meanwhile, a multi-scale segmentation method is adopted to extract different fine-grained feature information.
SPCPM divides characters at several scales. For example, as shown in Figure 16, the character "8" is divided into 1 × 1, 2 × 2, and 3 × 3 cells. The character proportion of each cell at each scale is still calculated by Formula (10), and the feature vector at each scale is still calculated by Formula (11). Assuming that the feature vectors corresponding to the different scales are $C_{1\times1}, C_{2\times2}, \ldots, C_{k\times k}$, these feature vectors are flattened into a single one-dimensional vector, as shown in Formula (13).

$$C = [C_{1\times1}, C_{2\times2}, \ldots, C_{k\times k}] \tag{13}$$

Figure 17 intuitively reflects this process. Finally, character proportion matching is achieved by Equation (12). The whole process is shown in Figure 18 and Algorithm 2.

Character proportion is statistical information about image features and lacks structural information; the spatial pyramid makes up for this weakness. Moreover, owing to the hierarchical features, the sensitivity of the features increases. Therefore, SPCPM can improve the accuracy of character recognition and defect detection. Specific comparisons will be given in the experimental section.
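A minimal sketch of the pyramid feature and the template matching is given below. It is not a transcription of Algorithm 2: the level set (1, 2, 3) follows the "8" example, while the nearest-template rule, the integer cell boundaries, and the function names are assumptions made for illustration.

```python
import math


def _cell_proportions(img, k):
    """Character-pixel proportion (value 255) in each cell of a k x k grid, row by row."""
    height, width = len(img), len(img[0])
    feats = []
    for i in range(k):
        r0, r1 = i * height // k, (i + 1) * height // k
        for j in range(k):
            c0, c1 = j * width // k, (j + 1) * width // k
            area = (r1 - r0) * (c1 - c0)
            white = sum(1 for r in range(r0, r1) for c in range(c0, c1)
                        if img[r][c] == 255)
            feats.append(white / area if area else 0.0)
    return feats


def spcpm_feature(img, levels=(1, 2, 3)):
    """Concatenate the proportion features of all pyramid levels into one vector."""
    feat = []
    for k in levels:
        feat.extend(_cell_proportions(img, k))
    return feat


def classify(feat, templates, threshold):
    """Label of the closest template within `threshold` (2-norm), or None if none matches."""
    best_label, best_dist = None, threshold
    for label, template in templates.items():
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(feat, template)))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label
```

A return value of None corresponds to the case in which the character matches no template, which the method treats as a character defect.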

Defect Detection of Characters
The character recognition in this paper is not only for recognizing characters but also for detecting character defects. The character recognition and defect detection work perfectly together here. Once the character to be recognized does not match any template character, it can be judged that the character has a defect. The defect detection process is the same as the previous section, so it will not be described here. Possible defects are depressions, scratches, and shallow characters, as shown in Figure 19.

Experimental Setup
A suitable combination of image acquisition devices is used in this research. Firstly, according to the actual requirements of the industrial field, a CCD area-array camera with a resolution of 2 million pixels and a frame rate of 24 fps was chosen. The effects of different combinations of lens and light source on bearing surface defect detection were then compared and analyzed. Finally, a 25 mm prime lens and a coaxial surface light source were assembled on the camera. All parameters such as exposure and brightness were set to proper values according to the actual situation to acquire high-quality images. The image acquisition devices could avoid dark, blurred, low saturation, and low contrast images, reduce the complexity of the algorithm, and improve the detection speed.
Because the authors of the two articles we compared against did not provide their data sets and there was no recognized bearing surface image data set, we could only conduct experiments and verification on our own data set. In this paper, 650 bearings produced by a bearing enterprise were used as experimental samples. The bearing specifications were the same, and the characters on the dust cover were of two types: "608Z WTOO CHINA" and "608Z WTOO". Among them, there were 483 standard bearing samples and 167 defective bearing samples. A total of 1285 effective bearing surface images were obtained, including 120 images with depressions, 82 images with scratches, 45 images with abnormal dimensions, 36 images with notches, 22 images with shallow characters, and 15 images with oil. All these bearing images were input into our algorithm for defect detection, and the corresponding results were obtained.
We measured the effectiveness and advancement of this method by the following indicators. The number of normal bearings was defined as $N_1$, the number of normal bearings detected successfully was $n_1$, the number of defective bearings was $N_2$, and the number of defective bearings detected successfully was $n_2$. The detection accuracy of normal bearings $R_1$, the detection accuracy of defective bearings $R_2$, the false alarm rate $R_3$, and the missing alarm rate $R_4$ are then:

$$R_1 = \frac{n_1}{N_1} \times 100\%, \qquad R_2 = \frac{n_2}{N_2} \times 100\%$$

$$R_3 = \frac{N_1 - n_1}{N_1} \times 100\%, \qquad R_4 = \frac{N_2 - n_2}{N_2} \times 100\%$$

Table 1 shows the experimental results obtained by testing with all the collected images. In the actual bearing defect detection process, detection accuracy was more important than the false alarm and missing alarm rates because detection errors were the source of the alarms. Lowering the error rate was the most fundamental way to lower the alarm rates, so the primary aim of SERDD was to control the detection error. Moreover, a missing alarm was more serious than a false alarm because a missed detection allowed defective bearings to enter the market. Once defective bearings were assembled on machines, they would bring huge security risks. Therefore, we further tightened the basis for determining defects in each step of SERDD and finally achieved a zero missing alarm rate.

We sorted out the experimental data in the papers [20,21] and compared them with our results, as shown in Figure 20. It can be seen from the figure that the false alarm rate of the paper [20] was the highest, reaching 4%. The false alarm rate of paper [21] was relatively low, only 2.06%, but its missing alarm rate was 6%, which is not acceptable in actual detection. Our experimental results were as follows: the detection accuracy of normal bearings was 97.31%, the detection accuracy of defective bearings was 100%, the average detection accuracy was 98.66%, the false alarm rate was controlled at 2.69%, and the missing alarm rate was 0%.
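As a sketch of how the four indicators follow from the raw counts, the helper below computes them as fractions. The function name is hypothetical, and taking the average detection accuracy as the mean of the two accuracies is an assumption (it is consistent with the reported 97.31%, 100%, and 98.66%).

```python
def detection_metrics(N1, n1, N2, n2):
    """Compute the detection indicators as fractions in [0, 1].

    N1, n1: number of normal bearings and how many were detected as normal.
    N2, n2: number of defective bearings and how many were detected as defective.
    """
    R1 = n1 / N1            # detection accuracy of normal bearings
    R2 = n2 / N2            # detection accuracy of defective bearings
    R3 = (N1 - n1) / N1     # false alarm rate
    R4 = (N2 - n2) / N2     # missing alarm rate
    return {"R1": R1, "R2": R2, "R3": R3, "R4": R4, "average": (R1 + R2) / 2}
```

For instance, with hypothetical counts of 200 normal images of which 195 pass and 100 defective images that are all caught, the call `detection_metrics(200, 195, 100, 100)` gives a false alarm rate of 2.5% and a missing alarm rate of 0%. These counts are illustrative only and are not taken from Table 1.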
It should be noted that the average detection accuracy was 0.66 percentage points higher than [20] and 2.69 percentage points higher than [21]. The detection accuracy of normal bearings was slightly lower than that of the paper [21], and the false alarm rate was slightly higher than that of the paper [21]. This is because the missing alarm rate was controlled at 0%, which resulted in a partial loss of accuracy.

In terms of processing speed, as shown in Figure 21, ref [20] took 2.11 s on average to detect a bearing image whose resolution was 480 × 640 pixels, and [21] took 1.56 s on average to detect a bearing image whose resolution was 1600 × 1200 pixels. SERDD took an average of 0.66 s to detect a bearing image whose resolution was 1296 × 972 pixels. When the image resolution was at the same level, the speed of SERDD was more than twice that of state-of-the-art methods. Moreover, if SERDD detected defects in its first segment, it took only 0.32 s. The time shown here includes the CPU time of processing the programs and reading the images.

Due to the segmented nature of SERDD, we tested each step of SERDD separately. For each of the four defect detection parts, we chose 100 targeted bearing images. In order to simulate the actual situation as much as possible, the 100 images contained the defects to be detected as well as various other defects. The results of the four defect detection parts are shown in Tables 2-5, respectively.

It can be seen from Tables 2-4 that the detection of notches, abnormal dimensions, and defects in non-character regions could almost reach 100% accuracy. Only the detection of character defects had a 3.92% false alarm rate. We picked out the two wrongly detected images and found that the characters detected incorrectly in both were "N, I, A". After a series of checks, we found that the feature vectors of these three characters were sensitive to the endpoints of these letters, which led to false alarms.

Experimental Results and Analysis of Character Recognition
SPCPM, proposed in this paper, is an improved method based on Shen's method [21]. Therefore, we re-implemented Shen's method and conducted a set of control experiments against the method of this article. The results are shown in Tables 6 and 7. The data in Tables 6 and 7 are drawn as histograms for intuitive comparison, as shown in Figure 22. As shown in Figure 22, the SPCPM method is superior to the CPM method in all indicators. The normal character recognition accuracy improves by nearly 9%, and the false alarm rate decreases by 9% accordingly. The defective character recognition rate increases to 100%, and the missing alarm rate declines to 0%. It can be seen that neither SPCPM nor CPM had a high recognition accuracy for normal characters, and there were two reasons. First, under the influence of defects in non-character regions, the shape of characters changed slightly, so the characters could not be recognized correctly. Second, in order to control the missing alarm rate, we used a relatively strict judgment basis for normal characters, so it was easy to judge normal characters as defective characters. We found that most of the normal character recognition errors were caused by defects in non-character regions, which could be detected before character recognition, so the overall defect detection accuracy was not affected. Besides, we also tested the character recognition of Tesseract [40], the mainstream OCR system, and the results were not satisfactory. The overall recognition rate of Tesseract was about 52%.

Conclusions
The research work in this paper aims at detecting bearing surface defects, and this paper proposes a novel surface defect detection method, the Segmented Embedded Rapid Defect Detection Method for Surface Defects (SERDD). This method has the advantages of high speed, high accuracy, high portability, low error, and low cost. In this method, defect detection is embedded and fused into specific suitable image processing steps so that image processing and defect detection tasks are completed at the same time. This paper also proposes Image Self-Stitching-and-Cropping (ISSC) to prevent characters from being cut. Moreover, this paper proposes a character recognition method based on Spatial Pyramid Character Proportion Matching (SPCPM), which can efficiently and accurately recognize specific characters. Experiments have shown that the algorithm can accurately and quickly detect defects such as depressions, scratches, notches, oil, shallow characters, and abnormal dimensions on the bearing, and is state-of-the-art in peer research.
However, much follow-up work remains to be done:
1. Improve the accuracy of SPCPM.
2. Deal with some possible situations of particular defects.
3. Set up the related execution equipment to build a defect detection system for bearing surface defects.