Anomaly Segmentation Based on Depth Image for Quality Inspection Processes in Tire Manufacturing

Abstract: This paper introduces and implements an efficient training method for deep learning–based anomaly area detection in the depth image of a tire. Depth images with 16-bit integer precision are used in various fields, such as manufacturing, industry, and medicine. Moreover, the advent of the 4th Industrial Revolution and the development of deep learning have created demand for deep learning–based problem solving in many fields. Accordingly, various research efforts use deep learning technology to detect errors, such as product defects and diseases, in depth images. However, a depth image expressed in grayscale carries limited information compared with a three-channel image containing color, shape, and brightness cues. In addition, in the case of tires, instances of the same defect often have different sizes and shapes, making deep learning training difficult. Therefore, in this paper, a four-step process of (1) image input, (2) highlight image generation, (3) image stacking, and (4) image training is applied to a deep learning segmentation model that can detect atypical defect data. Defect detection targets vent spews that occur during tire manufacturing. For experiment and evaluation, we compare the training results of the process proposed in this paper with those of the general training method. For evaluation, we use intersection over union (IoU), which compares the pixel area where the actual error is located in the depth image with the pixel area of the error inferred by the deep learning network. The experimental results confirmed that the proposed methodology improved the mean IoU by more than 7% and the IoU for the vent spew error by more than 10%, compared to the general method. In addition, the time it takes for the mean IoU to remain stable at 60% is reduced by 80%. The experiments and results demonstrate that the methodology proposed in this paper trains efficiently without losing the information of the original depth data.


Introduction
The 4th Industrial Revolution has stimulated the need for innovation in the manufacturing industry, and intelligent manufacturing systems are becoming an important issue [1][2][3][4][5]. In addition, the manufacturing paradigm is shifting from the mass production of a few product types to multi-product production [6][7][8]. To adapt to this change in the manufacturing paradigm and meet the needs of consumers, existing manufacturing processes are becoming more flexible. Manufacturing process data collection and analysis technologies are crucial elements of the intelligent factory. With the development of IoT technology, it has become possible to extract data from all manufacturing processes. Based on this vast amount of data, intelligent systems aim to improve productivity and energy efficiency and to reduce defect rates [9][10][11][12]. Quality inspection of a product is an important task that can reduce the defect rate in the manufacturing process through various tests before shipment [13,14]. Product quality inspection comprises visual inspection and inner structure inspection. Surface inspection is performed with vision systems, such as ultraviolet, microscopy, RGB, and depth imaging [15][16][17][18], while inner structure inspection is performed by X-ray, 3D-CT, and ultrasound [19][20][21].
In the automobile tire manufacturing process, the final inspection stage detects defects in the tire using visual inspection. In most of these inspections, the operator visually determines whether a tire is defective [22]. Among the errors occurring on the tire surface, the vent spew error refers to rubber hairs that should be cut during the manufacturing process but remain over a certain length. This error implies that the tool that cuts the hairs on the tire surface needs to be replaced. There are more than 200 points on a single tire surface where a vent spew error can occur.
Additionally, more than 100 kinds of errors, including the vent spew error, can appear on the tire surface. Inspecting these errors with the naked eye wastes time and human resources, and the error detection rate depends strongly on the operator's skill level, because the size and location of the errors vary widely [23]. Driving a vehicle with tires in poor condition can lead to critical accidents, so a more precise error detection method should be applied.
Therefore, this paper applies a deep learning method to the final quality inspection of automobile tires. For this proposal, we demonstrate the process and architecture as follows: (1) use a depth image of the tire surface as training data, (2) segment the error region using a deep learning model, and (3) preprocess the data by applying the anomaly detection concept for more effective accuracy.

Depth Image
In a depth image, each pixel value represents the distance from the camera to the subject. Depth images can be post-processed to add realism to 2D images, or used in various areas, such as games, motion recognition, and 3D printing [24]. Depending on the factors of interest, such as the material and hardness of the subject, various types of images, such as CT images or X-ray radiographic images, can be used to analyze the subject. Such images are actively studied in fields including medicine, agriculture, and manufacturing [25][26][27][28][29][30][31]. These images have the advantage of showing changes in parameters of interest (e.g., constituent materials, hardness, and height) at a glance from a photograph of the object [32]. A depth image is the most suitable for judging tire errors, because the value of each pixel represents height, which is the most reliable criterion for detecting faults on the tire surface.
It is challenging to use traditional computer vision algorithms to detect the many errors occurring on the tire surface from simple height information. Because the tire tread forms a finely curved shape, methods using conventional computer vision algorithms require curvature correction before error detection [33]. Additionally, accurately detecting an error on a tire requires filtering the noise of the height values and determining the shape of the information conveyed by the height differences. Moreover, excessive noise filtering applied to detect one error may remove features that play an important role in detecting another. To solve this problem, in this paper, we detect tire errors by training a deep learning network model on the depth image.

Deep Learning Segmentation
Deep learning methods applied to error detection comprise (1) classification and (2) segmentation methods. Defects of the same type on the tire surface often have atypical shapes [34,35]. For example, a vent spew error rarely has the same shape twice, because the type of tire, the location of the error, the size of the tire, and the length of the protruding hair all differ. The classification method is not suitable for training on and classifying such unstructured data [36]. Since the segmentation method detects errors in units of pixels, more accurate fault detection is possible. We train on and detect tire error data using DeepLab, one of several deep learning–based segmentation networks.
DeepLab is a deep learning network for semantic segmentation, used in fields that require precise pixel-by-pixel prediction [37]. Unlike conventional CNNs that classify an entire image, semantic segmentation classifies each image pixel [38,39]. Therefore, several classes existing in one image can be classified in units of pixels. Because of these characteristics, semantic segmentation is used in various fields, such as autonomous driving and medical care [40]. A semantic segmentation algorithm extracts abstract semantic features that are global and robust to change through its training filters. This process may lose features that are not global but can be important factors in error detection. The DeepLab network uses atrous convolution and a fully connected conditional random field (CRF) to compensate for these shortcomings [41]. In this paper, we train on and detect tire defects using the latest version of the DeepLab network, the V3+ model [42].
Unlike the classification method, the segmentation method must also label pixels on the normal tire surface. This means that during the training process, the deep learning model also trains on normal pixels. The rate of occurrence of defective tires in the tire manufacturing process is quite low. In addition, the area occupied by the defect in the overall image of a defective tire is very small. Therefore, the tire error detection problem is similar to an anomaly detection problem.

Anomaly Detection
Anomaly detection refers to distinguishing between normal and abnormal samples in data [43,44]. It is mainly used to detect anomalous data or situations in fields such as manufacturing, medical care, and image processing [45][46][47][48][49][50]. The difficulty facing anomaly detection is that abnormal data occur far less frequently than normal data; therefore, significant time and effort are required to collect abnormal data. Methodologies that train on normal data can solve the problems arising from this imbalance [51].
The autoencoder methodology [52,53] trains an autoencoder on the characteristics of normal data. When an input contains abnormal data, the autoencoder, which has learned only normal data, reconstructs it as if it were normal. The difference between the input data and the reconstruction indicates faults in the input data. However, this method uses unsupervised learning without label data, and it depends on the hyperparameters and the performance of the autoencoder. Therefore, it has the disadvantage that the overall reconstruction performance is somewhat unstable.
The DeepLabV3+ neural network model used for tire error detection in this paper has a structure similar to the autoencoder methodology of anomaly detection [42]. Because it labels all training data, it avoids various problems of conventional autoencoders. However, when using images with only simple depth information, the training results may be limited. Therefore, in this paper, based on the neural network architecture described above, training is performed with anomaly detection–based global preprocessing that does not depend on a specific tire error. The system architecture presented in this paper generates additional information by applying the anomaly detection concept to the tire depth image. Because these images cannot be generated by convolution with image filters, they give the neural network new information about tire errors.

Summary
This paper designs and implements a tire error segmentation system. To this end, after acquiring a depth image of the tire surface with a 3D camera, deep learning is performed using the concept of anomaly detection. For tires, defects such as bulges, dents, and scratches are detected by visual inspection [54]. Therefore, there is a need to detect errors through a depth image, which can express the height of the tire surface more precisely than an RGB camera. Since the number of bad tires produced per day in a manufacturing plant is low, it takes a significant amount of time to collect enough data to ensure the reliability of the training results. Therefore, before segmenting other tire errors, we aim to detect vent spew errors, for which the most data can be obtained when a bad tire occurs. The model trained on the tire depth images detects the fault area on the tire surface. Since the tire detection system must be robust enough to distinguish various tire errors, it aims to derive the best classification result without losing the original data. Therefore, in this paper, the deep learning network trains on preprocessed tire depth images that apply anomaly detection concepts. This paper is structured as follows. Section 2 describes the acquisition of tire data and the data preprocessing process and architecture for more effective training. Section 3 compares the training results with the preprocessing presented in this paper against the training results of the same model without it, analyzes the points to be improved, and suggests future research. Section 4 draws our conclusion.

Materials and Methods
This section introduces the process and architecture for preprocessing depth image data for efficient tire defect detection and training. For tire error detection, a depth image containing tire height information is acquired using a 3D camera, and this image is used to detect defects in the tire.
The tire depth image contains the depth information that best describes the errors that may occur in a tire. However, (1) the height values are concentrated in a narrow section, and (2) instances of the same tire error often have different shapes. For these two reasons, error detection accuracy is limited when the deep learning model trains on pure depth images alone. Although this paper aims to segment the vent spew error area, the method must be robust enough to classify additional errors in the future. In other words, it should not damage the original data while showing maximum classification performance.
Therefore, this paper generates three images based on the original depth image: (1) the original image, (2) a histogram-equalized depth image, and (3) a height information heatmap image. A heatmap is a graphical representation of data in which values are represented as colors. These images are stacked into one three-channel image and used for deep learning network training. The histogram-equalized data compensate for the concentration of height data in a narrow section, and the heatmap data indicate the degree of anomaly of the height values in the depth image. Providing these additional data as training information yields better results. Figure 1 compares deep learning using original depth images with the learning method presented in this paper.

System Process
This paper uses the depth image containing 16-bit depth information. The vent spew error, which is the target of detection, mainly appears in the form of a protrusion from the tire's surface. However, errors can be found at low height points in the depth image, such as tears and dents. Therefore, data preprocessing that is specialized for vent spew errors, such as filtering values above a certain height, is unsuitable for the training and detection of different errors, as it loses information on the lower height part.
This section introduces a depth image preprocessing process that efficiently trains and detects tire vent spews while preventing the loss of existing information. The image preprocessing process consists of four steps: (1) image input, (2) highlighted image creation, (3) image stacking, and (4) image training. Figure 2 shows this four-step process of preprocessing the depth image before training on tire errors. In this paper, we use the DeepLabV3+ model, one of the image segmentation models, to train on and detect atypical errors. Figure 3 shows a schematic detailing the application of the four-step process to train the DeepLab model.

Step 1: Image Input
This step loads the images for training on the tire depth image. We created tire depth images for training and labeled images indicating the errors in them, as required by the image segmentation model. In this paper, we defined a vent spew error as a vent protruding more than 2 mm from the tire surface. Figure 4 shows the tire tread photographed with a general RGB camera to help illustrate the vent spew error, since an image expressed in depth is difficult to judge with the naked eye. In Figure 4a, the vents are points protruding from the tire surface; the green circle indicates one of the vents. Among these vents, a vent with a length of 2 mm or more is a vent spew error. Figure 4b shows the result of labeling vent spew errors in yellow; the green circle indicates one of the vent spew errors. To express the tire surface as a depth image, we used the following method: (1) a laser is projected onto the tire tread surface, which rotates at a constant angular velocity; (2) the projected laser is photographed with a 3D camera positioned at an angle of 30 degrees; (3) the captured image is converted into a depth image. Because the 3D camera captures the tire surface at an angle of 30 degrees, shadows may appear when photographing a vent spew error, which protrudes from the tire surface. Areas of unknown depth, such as shadow areas, have a value of 0. The depth is mapped to an integer range (0 to 65,535) according to the degree of protrusion. Figure 5 shows the conversion of the tire surface into a depth image using a laser. To label the vent spew errors, we compared the actual tire with the captured depth image and measured vents greater than 2 mm. Since a depth image composed of 16-bit integers has an extensive range of values, it is difficult to distinguish high and low points with the naked eye.
Therefore, we performed labeling by comparing the depth image converted to 3D form with the actual location. Figure 6 shows a labeled part of the photographed tire tread depth image.

Step 2: Highlight Image Creation
In Step 2, highlight images are created to provide the classifier with information beyond the raw depth values. The tire depth image taken in this paper expresses height information as a 16-bit integer (0 to 65,535), but errors such as vent spew involve a small height difference of about 0.1 mm. In addition, since the height variation of the tire surface is small, the data are concentrated in a narrow section of the full range. Such data concentration can hinder training, because height-difference information is lost during the data normalization performed before training, or the distance between pixel values becomes very narrow. Therefore, in Step 2, additional image data based on the original data are created to mitigate this problem.

Histogram Equalization
To compensate for the data loss that may occur during normalization and the minimization of height differences between pixels, an image is generated by performing histogram equalization on the original image. Since there is almost no height difference except in areas where errors occur, the height data are concentrated in one section. In the process of normalizing the height values, concentrated in a narrow range, to real numbers (between 0 and 1) for training, information on the height difference between pixels may be lost. In addition, when scanning the tire surface with a 3D camera, all surface values that cannot be measured due to shadows are filled with zeros, so the ratio of zero values in the depth image is high. We performed histogram equalization using the accumulated values of the image histogram.
Histogram equalization is a method of rearranging contrast values to emphasize the contrast in grayscale images [55][56][57].
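As an illustrative sketch (not the paper's exact implementation), histogram equalization of a 16-bit depth image can be done with a lookup table built from the cumulative histogram; excluding the zero (shadow) bin from the cumulative histogram is an assumption motivated by the high ratio of zero values noted above.

```python
import numpy as np

def equalize_histogram_16bit(depth, ignore_zero=True):
    """Histogram-equalize a 16-bit depth image via its cumulative histogram (CDF)."""
    hist = np.bincount(depth.ravel(), minlength=65536).astype(np.float64)
    if ignore_zero:
        hist[0] = 0.0                      # keep shadow pixels from skewing the CDF
    cdf = hist.cumsum()
    if cdf[-1] == 0:
        return np.zeros_like(depth)
    cdf_min = cdf[cdf > 0].min()
    span = cdf[-1] - cdf_min
    if span == 0:                          # image has a single nonzero value
        return np.zeros_like(depth)
    # Map each intensity through the normalized CDF back to the 16-bit range.
    lut = np.clip(np.round((cdf - cdf_min) / span * 65535.0), 0, 65535).astype(np.uint16)
    return lut[depth]

# Heights concentrated in a narrow band are spread over the full range.
rng = np.random.default_rng(0)
depth = rng.integers(29900, 30100, size=(64, 64)).astype(np.uint16)
eq = equalize_histogram_16bit(depth)
print(eq.min(), eq.max())
```

After equalization, the narrow band of raw heights spans nearly the full 16-bit range, so small height differences become visible to the network.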

Histogram Heatmap
From an anomaly detection point of view, the height value at a defect location is more likely to be exceptional than elsewhere in the image. Thus, an image weighting the pixel values of the image is created and included in the training data. The weight of a pixel value is defined as the number of occurrences of that value in the entire image. Therefore, the weight of the normal tire surface is higher than that of the abnormal surface. This weighted image cannot be computed by convolution with an image filter; therefore, it provides additional information that the DeepLabV3+ network cannot infer through its internal convolution layers. In addition, it helps to improve the training speed by expressing unusual parts of the tire surface with low values during training. Figure 8 shows the weighted image calculated from the input image. The normal tire surface appears bright in this weighted image, while abnormal or rare values appear dark.
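The frequency-based weighting described above can be sketched as follows. The paper does not specify the normalization, so scaling the most frequent value to 1.0 is an assumption made for illustration.

```python
import numpy as np

def frequency_weight_image(depth):
    """Weight each pixel by how often its value occurs in the whole image.

    Common (normal-surface) values get weights near 1; rare (anomalous)
    values get weights near 0, matching the bright/dark description above.
    """
    counts = np.bincount(depth.ravel(), minlength=65536)
    weights = counts[depth].astype(np.float64)
    return weights / weights.max()   # most common value maps to 1.0

depth = np.full((32, 32), 500, dtype=np.uint16)
depth[10, 10] = 9000                 # a single anomalous pixel
w = frequency_weight_image(depth)
print(w[0, 0], w[10, 10])            # normal surface near 1.0, anomaly near 0
```

Because this map depends on the global value histogram, no local convolution filter can reproduce it, which is the point made in the text.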

Step 3: Image Stacking
In Step 3, (1) the original image, (2) the histogram-equalized image, and (3) the weighted image are stacked to create a three-channel image. Since the three images all have different pixel value ranges, we normalized the data. Before stacking, we converted the image data to a 32-bit float type and normalized it to the range (0 to 255). This minimizes data loss during deep learning model training. Figure 9 shows the three-channel image created by stacking the three images.
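A minimal sketch of this stacking step, assuming simple per-channel min–max normalization (the paper states only that the channels are converted to 32-bit floats in the 0–255 range):

```python
import numpy as np

def stack_channels(original, equalized, weighted):
    """Normalize each single-channel image to float32 in [0, 255] and stack."""
    chans = []
    for img in (original, equalized, weighted):
        img = img.astype(np.float32)
        lo, hi = img.min(), img.max()
        if hi > lo:
            img = (img - lo) / (hi - lo) * 255.0
        else:
            img = np.zeros_like(img)       # constant channel maps to zeros
        chans.append(img)
    return np.dstack(chans)                # shape (H, W, 3), dtype float32

a = np.arange(16, dtype=np.uint16).reshape(4, 4)
rgbish = stack_channels(a, a, a)
print(rgbish.shape, rgbish.dtype)
```

The resulting (H, W, 3) float32 array plays the role of an RGB image for the segmentation network while keeping the original depth channel intact.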

Step 4: Image Training
DeepLabV3+ model training is performed on the images and labels with added information. The average image taken with the actual 3D camera is vast, with a width of 10,000 px and a height of 1024 px, so training is carried out through a separate process: (1) crop the tire image into square shapes suitable for DeepLabV3+ training and save them; (2) load the saved images; (3) preprocess them through Steps 1-3; (4) feed the images and labels into the DeepLabV3+ model for training. When cropping and saving the tire image, the sliding window is moved by half the crop size so that a defective part divided by a crop boundary is still trained in its overall shape. When the sliding window reaches the edge of the image and the remaining tire image is smaller than the window, it is stored with zero padding.
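The sliding-window cropping with half-size steps and zero padding can be sketched as follows (an illustrative implementation, not the paper's exact code):

```python
import numpy as np

def crop_with_overlap(image, size=128):
    """Cut an image into size x size squares, sliding by half the crop size.

    Windows that run past the image edge are zero-padded, as described above.
    """
    step = size // 2
    h, w = image.shape[:2]
    crops = []
    for y in range(0, h, step):
        for x in range(0, w, step):
            tile = np.zeros((size, size), dtype=image.dtype)
            patch = image[y:y + size, x:x + size]
            tile[:patch.shape[0], :patch.shape[1]] = patch
            crops.append(tile)
    return crops

img = np.ones((200, 300), dtype=np.uint16)
tiles = crop_with_overlap(img, size=128)
print(len(tiles), tiles[0].shape)
```

With a 64 px step, every interior point is covered by overlapping windows, so a defect split by one crop boundary appears whole in a neighboring crop.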

System Architecture
The tire fault detection system consists of (1) a depth image scan and labeling component, which creates depth images for training and for actual tire inspection, and (2) a tire fault inspection system, which trains the deep learning model and detects defects. Figure 10 shows the overall system architecture.

Depth Image Scan and Labeling
A 3D camera scans the finished tire to create a depth image. If the tire is not used for training, the image is sent to the tire defect inspection system. When used as training data, pixel-by-pixel labeling of erroneous and normal pixels is performed.

Tire Fault Inspection System
The system trains on tire defects and inspects tires using the trained weights. When the tire inspection is completed, the detected defect area is visually communicated to the operator. If an erroneous detection or a non-detection error occurs, the operator re-labels the relevant parts and includes them in the training data.

• Data loader
The data loader loads tire depth images for defect detection and training. If the loaded data are for training, the label data paired with them are also loaded.
Data matching module: This module lists the data for training and inspection and loads them in order. When importing training data, it lists the label data pairs matching the data. In addition, it separates the data into training data and validation data. When one epoch finishes during training, the training data set is randomly shuffled and then loaded in order again.
Data slice module: The tire depth image is a very large, long horizontal rectangle; therefore, a significant amount of GPU memory is required for training and detection. Thus, the imported tire data are cut into square shapes before training and detection. If an error region lies on a cut boundary, the complete error shape may not be trained. Considering this, when slicing the image, we slide the window by half the slice size. In this paper, data are sliced into 128 px squares with a step of 64 px. Any excess area that occurs while cutting the data is filled with zeros.
Data stacking module: A three-channel image is created through the four-step process proposed in this paper. Finally, this image is used for training and error detection.

• Visualizer

The visualizer visually shows the training process or the actual tire error detection results.

Result visualizing module: During training, this module shows various training indicators, such as test and validation loss, mean IoU, and validation results, in real time. In the case of tire error detection, the cropped tire images are merged to show the defect area within the entire tire image.
Feedback module: As a result of detecting tire defects, false detections or data requiring additional training may occur. In this case, the corresponding data are additionally labeled and later included in the training data.

Experimental Evaluation
This section describes (1) tire depth image training using the general method, and (2) tire depth image training with the preprocessing proposed in this paper. For a fair experiment, we use the same hyperparameters and datasets. For quantitative evaluation, the precision, recall, and F1-score of each training result are compared. Additionally, we compare the training results for several backbone networks that can be applied to the DeepLabV3+ model. Table 1 shows the device specifications used for model training in this paper. A total of 18 types of tire images were used for model training. We split the cropped tire images in an 8:2 ratio, using 80% of the images for training and the rest for validation. Each image was cropped into a square 128 px wide and 128 px high.
Since the proportion of vent spew errors in the tire images is very small, most of the cropped images contain no vent spew error. Since an imbalance between normal and defective data in the training set can lead to poor training results [58], we included only 5% of the images without a vent spew error in the training and validation sets. As a result, 5686 training images and 1425 validation images were generated. Figure 11 shows sample images and labeling data used for training; in each pair, the left image is the tire depth image (.tif), and the right image marks the vent spew error.
We trained on all training data for 1000 epochs. The initial learning rate was 0.001, the weight decay was set to 0.0001, and the momentum to 0.9. Additionally, we applied a polynomial decay strategy, as in Equation (1) [59], to the training process. This gradually decreased the learning rate, so that performance remained stable as the learning parameters approached their optimal values.
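The polynomial ("poly") decay referred to as Equation (1) is commonly defined as below; the power value of 0.9 is an assumption (the usual DeepLab default), since the text does not state it.

```python
def poly_lr(base_lr, epoch, max_epoch, power=0.9):
    """Polynomial ("poly") learning-rate decay commonly used with DeepLab:
    lr = base_lr * (1 - epoch / max_epoch) ** power
    """
    return base_lr * (1.0 - epoch / max_epoch) ** power

base = 0.001
print(poly_lr(base, 0, 1000))     # full learning rate at the start
print(poly_lr(base, 500, 1000))   # decayed at the midpoint
print(poly_lr(base, 999, 1000))   # nearly zero at the end
```

The schedule decays smoothly to zero over the 1000 training epochs, which matches the stated goal of stabilizing performance near the optimum.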
Finally, the output stride of DeepLabV3+ was 8, and the backbone convolutional neural network inside the model was ResNet-101. The batch size was set to 150. Segmentation classifies in units of pixels, and since the number of vent spew error pixels in the training images is relatively small, the number of pixels per class is highly imbalanced. Therefore, in this experiment, when calculating the training loss with the cross-entropy function, per-class weights were applied to compensate for the imbalance. The pixels in the vent spew error region occupy about 5% of the total pixels in the training data, so a weight of 0.05 was applied to the normal class, and 0.95 to the vent spew class.
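A sketch of the class-weighted cross-entropy loss with the stated weights (0.05 for the normal class, 0.95 for the vent spew class), written in plain NumPy rather than the actual training framework, so the details are illustrative assumptions:

```python
import numpy as np

def weighted_cross_entropy(probs, labels, class_weights):
    """Pixel-wise cross-entropy with per-class weights.

    probs:  (H, W, C) softmax probabilities
    labels: (H, W) integer class ids (0 = normal, 1 = vent spew)
    class_weights: weight per class, e.g. [0.05, 0.95] as in the text above
    """
    h, w = labels.shape
    p = probs[np.arange(h)[:, None], np.arange(w)[None, :], labels]
    weights = np.asarray(class_weights)[labels]
    return float(np.mean(-weights * np.log(p + 1e-12)))

probs = np.zeros((2, 2, 2))
probs[..., 0] = 0.9     # model predicts "normal" with 0.9 everywhere
probs[..., 1] = 0.1
labels = np.array([[0, 0], [0, 1]])          # one vent-spew pixel
loss = weighted_cross_entropy(probs, labels, [0.05, 0.95])
print(round(loss, 4))
```

With these weights, the single misclassified vent spew pixel dominates the loss, which is exactly how the weighting counteracts the 5% class imbalance.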

Precision, Recall, and F1-Score Analysis
Precision is the ratio of correctly predicted positive pixels to all pixels predicted positive. Recall is the ratio of correctly predicted positive pixels to all actually positive pixels. Equations (2) and (3) define precision and recall.
where TP (true positive) is a pixel predicted to be positive that is actually positive; FP (false positive) is a pixel predicted to be positive that is actually negative; and FN (false negative) is a pixel predicted to be negative that is actually positive. The F1-score combines precision and recall. Since precision and recall are in a trade-off relationship, either may take an extreme value; the F1-score compensates for this, which matters when, as in the data used in this paper, the imbalance between the normal and error classes is severe. Equation (4) shows the F1-score.
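The definitions in Equations (2)-(4) translate directly into code; the counts below are made-up example values, not results from the paper:

```python
def precision_recall_f1(tp, fp, fn):
    """Pixel-wise metrics from the counts defined above:
    precision = TP / (TP + FP),  recall = TP / (TP + FN),
    F1 = 2 * precision * recall / (precision + recall)
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=40)
print(p, r, round(f1, 4))   # F1 is the harmonic mean of precision and recall
```

Because F1 is a harmonic mean, it stays low whenever either precision or recall collapses, which is why it is the preferred summary under severe class imbalance.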

Intersection over Union (IoU)
IoU is the most popular evaluation metric for object detection benchmarks. Many object detection methods in computer vision use a bounding box to indicate the location of the object of interest. IoU is calculated from the ground-truth bounding box and the bounding box inferred by the object detection method. Equation (5) shows the calculation of IoU.

IoU = (Ground truth area ∩ Inferred area) / (Ground truth area ∪ Inferred area) (5)

Deep learning–based object segmentation methods describe object locations pixel-wise, not with bounding boxes. Therefore, we calculate IoU using pixel counts instead of bounding-box areas. Figure 12 compares the two methods: bounding box-based and pixel-based.

Training Results
We evaluate by comparing the mean IoU of the proposed method and the original method. The original method trains on the raw depth image data (one channel) without extending it to three channels through the proposed process. At the end of each training epoch, IoU is calculated on the validation data. We use the average of these validation IoUs as the accuracy index of the deep learning model.
For comparison, (1) the original one-channel depth image and (2) the three-channel image with additional information were used as training data, with the same model and hyperparameters. As a result, the proposed method showed a mean IoU improvement of about 7% over the original method. Figure 13 compares the validation mean IoU of the original and proposed methods. Defect areas were inferred from the test image using the weights of the last epoch; Figure 14 shows the inference results.
We measured the mean IoU on the training set and the separate validation set over 1000 epochs of training. Figure 13 shows that the proposed method converges more stably than the original as training proceeds. In the original method, the mean IoU stayed stably above 0.61 from epoch 516, whereas the proposed method reached this point at epoch 100, a reduction of about 80%.
In addition, performance indicators were derived with the various CNN backbones (MobileNet, ResNet-50, and ResNet-101) that can be applied inside the DeepLabV3+ model. From the test results, pixel-wise precision and recall values were derived, and their harmonic mean, the F1-score, was calculated.
Comparing the models using ResNet-101, which performed best, the precision of the proposed method was lower than that of the existing method. However, the F1-score, the harmonic mean with recall, increased, yielding more reliable results than the existing method. Table 2 shows these indicators.

Discussion
The method proposed in this paper preserves the original data while generating additional data that cannot be extracted by the CNN itself, and training on these data shows higher performance than the existing method. In addition, because the original data are maintained without filter-based preprocessing, such as height thresholding, the method remains robust for training additional errors in the future.
The proposed method improves the precision, recall, and F1-score values compared with the previous one, but the absolute values are still not excellent. The reasons for this are the (1) lack of data and (2) inclusion of human error.
Collecting enough tire error data of the same kind for training takes a long time, and the methodology presented in this paper is partly intended to compensate for this lack of data. Nevertheless, as the amount of data increases, the absolute precision and recall values will also increase.
In the data labeling performed in this paper, a person directly measures the length of the vent spew error and labels the depth image data. Since the error area is delimited by a person, the labels are inconsistent; for example, it is ambiguous whether only the pixels of the rubber hair where the error occurred, or also the area around it, should be taken as the error area. A more accurate error measurement and labeling method would improve the results.
Furthermore, in the training process, the training loss converges to 0 in both methods, but the validation loss diverges after a certain epoch. Figure 15 shows the training/validation losses during the training of the original and proposed methods. This behavior occurs because the error criterion and the labeling of the vent spew error are ambiguous. A vent spew error is a vent protruding more than 2 mm from the tire surface, and the protrusion heights of the vents cluster around this 2 mm boundary. Because differences on the order of 0.1 mm are difficult to recognize with the naked eye, the data labels are ambiguous unless a vent is clearly larger than 2 mm. In addition, the inclination of the protruding vent is not considered, so a vent that exceeds 2 mm but lies flat against the tire surface can be recognized as normal.
Therefore, in future research, to improve precision, recall, and training loss, we will study a classification method that considers both an accurate measurement criterion for the vent spew error and the protrusion angle of the vent.
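The 2 mm criterion discussed above can be expressed directly on the depth data. The sketch below assumes a hypothetical calibration in which one 16-bit depth unit equals 0.01 mm and a known surface level; the real camera calibration would supply both values, and the function name is ours.

```python
import numpy as np

# Hypothetical scale: assume one 16-bit depth unit equals 0.01 mm.
MM_PER_UNIT = 0.01
SPEW_THRESHOLD_MM = 2.0  # vent spew: protrusion of more than 2 mm

def spew_mask(depth, surface_level):
    """Flag pixels protruding more than 2 mm above the estimated
    tire-surface level (both given in raw 16-bit depth units)."""
    height_mm = (depth.astype(np.int32) - surface_level) * MM_PER_UNIT
    return height_mm > SPEW_THRESHOLD_MM
```

Such a fixed threshold illustrates why labels near the boundary are ambiguous: a pixel at 1.9 mm and one at 2.1 mm differ by less than the naked eye can judge, yet fall on opposite sides of the mask.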

Conclusions
This paper introduced and implemented the process of segmenting the vent spew error, a type of tire failure error, through the four steps of (1) image input, (2) highlight image generation, (3) image stacking, and (4) image training. Detecting a vent spew error, in which rubber hairs protrude more than 2 mm from the tire tread, is inefficient when done manually because the protrusion length must be measured for every rubber hair on a tire. Therefore, in this paper, the vent spew error was detected easily and quickly during the tire inspection process by acquiring the height values of the tire surface with a 3D camera and training a deep learning network on them. However, since the height values measured by the 3D camera were concentrated in a narrow range, performance was poor when training on the one-channel depth image alone. Therefore, additional (1) histogram-equalization and (2) histogram-heatmap images were generated from the existing data and included in the training data, keeping the original data available so that training for other errors remains easy in the future. In this proposed method, the original data were not lost.
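The preprocessing part of the four-step process can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the equalization spreads the narrow 16-bit height band as described, but the paper's exact heatmap channel is not specified here, so a plain min-max normalized height map stands in for it, and both function names are ours.

```python
import numpy as np

def equalize_16bit(depth):
    """Histogram-equalize a 16-bit depth image into the 8-bit range,
    spreading the narrow band of surface heights (step 2, highlight
    image generation)."""
    hist, _ = np.histogram(depth.ravel(), bins=65536, range=(0, 65536))
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0].min()
    lut = np.clip(np.round((cdf - cdf_min)
                           / max(cdf[-1] - cdf_min, 1) * 255), 0, 255)
    return lut.astype(np.uint8)[depth]  # look up each pixel in the LUT

def stack_channels(depth):
    """Step 3, image stacking: combine (1) the original depth scaled
    to 8 bits, (2) its equalized highlight image, and (3) a stand-in
    normalized-height map into one 3-channel training image."""
    d = depth.astype(np.int64)
    ch0 = (d >> 8).astype(np.uint8)          # coarse raw depth
    ch1 = equalize_16bit(depth)              # equalized highlight
    span = max(int(d.max()) - int(d.min()), 1)
    ch2 = ((d - d.min()) * 255 // span).astype(np.uint8)
    return np.dstack([ch0, ch1, ch2])
```

The resulting three-channel array has the same shape as an RGB image, so it can be fed to a standard segmentation network such as DeepLabV3+ without changing the input layer, while the first channel still carries the (coarsened) original depth.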
The training results for the existing one-channel depth image were compared with the results of the proposed process for the experiments and evaluations. In the experiment, the mean IoU increased by about 7% and the vent spew error detection rate by about 10%. The number of training epochs required for the mean IoU to remain stable at 61% was reduced by 80%. However, accuracy was limited when classifying atypical rubber hairs protruding more than 2 mm from the tire surface on the basis of a depth image with only small differences in one-dimensional height information. This limitation is reflected in the graph in which the validation loss diverges instead of converging with the training loss. In addition, since the tire tread photographed with the 3D camera has a curved shape, the heights of the center and edge parts of the tire must be corrected. To address these limitations, we plan to research more accurate vent spew error measurement methods, standards, and depth image generation in future work.
The introduction of artificial intelligence technology into the visual inspection stage, often the last stage of the manufacturing process, is expected to greatly increase productivity by eliminating inspection errors that arise because operators differ in skill level and in their error evaluation standards.