Advancing Tire Safety: Explainable Artificial Intelligence-Powered Foreign Object Defect Detection with Xception Networks and Grad-CAM Interpretation

Abstract: Automatic detection of tire defects has become an important issue for tire production companies since these defects cause road accidents and loss of human lives. Defects in the inner structure of the tire cannot be detected with the naked eye; thus, a radiographic image of the tire is gathered using X-ray cameras. This image is then examined by a quality control operator


Introduction
The global rise in the human population increases the need for various modes of transportation, including automobiles, buses, and trucks. This demand pushes numerous manufacturing facilities to produce a crucial automotive component, namely, the tire. However, the annual return rate of defective tires is 7 percent of total tire production, leading to a yearly restitution of USD 100 million [1]. To minimize the number of tire returns, it is necessary to implement quality-inspection procedures that use X-ray imaging to detect defects in tires.
Non-destructive testing (NDT) methods, such as radiographic X-ray testing, have been instrumental in identifying latent defects in tires. Following tire production, each tire undergoes an X-ray inspection to create images that facilitate rapid and reliable interpretation. These images are then evaluated by quality control operators, who categorize tires as non-defective or defective. Figure 1 presents examples of non-defective (perfect) tires obtained from the X-ray device. While some defects are identifiable through computer-aided demarcation lines, detecting foreign objects with the naked eye proves challenging. Such objects often vary in size and properties, making detection even more arduous, particularly during periods of operator fatigue. Furthermore, the process is inherently subjective, inefficient, time-consuming, and prone to bias, and it demands sustained concentration [2]. With the development of technology, artificial intelligence has been used in autonomous vehicles, quality control processes, and many similar areas of automotive technology [3]. In recent years, the field of deep learning (DL) has undergone significant advancements that have led to innovative solutions for numerous industrial challenges, including tire defect detection. DL techniques aim to categorize images by extracting meaningful features from them.
In many datasets, especially those from industrial applications, it is difficult to obtain sufficient labeled data for each class. In such cases, pre-trained transfer learning (TL) models can enhance performance with less labeled data by benefiting from the knowledge gained during pre-training. Thus, rather than training a model from scratch, a model previously trained on a comprehensive dataset can be applied to another dataset using TL methods. AlexNet, VGG, ResNet, Xception, DenseNet, and their variations are some of the most frequently used TL models [4]. TL techniques can be applied in two ways. First, the TL model can use the weights obtained with the ImageNet dataset to extract features. It learns general data features and patterns from this dataset, and these features can be reused because they generalize well across different but similar tasks. Second, the model can be fine-tuned with a new dataset and transformed into a model suitable for that dataset [5]. In the literature, the Xception model demonstrated better accuracy in the ImageNet classification task, with fewer parameters than many other deep learning models [4].
While DL has demonstrated remarkable success in various real-world applications [6,7], its inherent stochastic nature can undermine trust in its outcomes. To enhance reliability, there is a critical need to elucidate DL models' decisions, ensuring transparency and fostering trust in their results. Thus, the concept of Explainable Artificial Intelligence (XAI) has become increasingly important in the domain of DL [8][9][10][11][12].
XAI seeks to enhance the transparency and reliability of the results produced by artificial intelligence (AI) systems by offering visual explanations in the form of heatmaps generated through techniques such as Grad-CAM [13]. Visually presenting incorrect or unexpected decisions of the model with Grad-CAM provides the opportunity to evaluate the validity of the model's accuracy and make corrections when necessary. At the same time, by seeing which visual features the model focuses on, it can be understood how decisive these features are. Grad-CAM visualizes the decisions of a pre-trained transfer model, allowing interpretation of how well the model can generalize across different tasks. In the context of tire defect detection, transparency is of utmost importance in establishing trust among quality control operators, regulatory bodies, and other stakeholders. Understanding how and why an AI system arrives at a particular defect classification is essential for the acceptance and adoption of such systems in the tire manufacturing industry [8].
XAI serves as a valuable tool for tire quality control professionals by providing them with insights into the rationale behind the AI system's defect determinations [13]. By offering interpretable explanations through Grad-CAM, XAI empowers operators to validate and verify the decisions made by AI models, thereby aiding them in making well-informed judgments about the tire's quality. This support not only improves the overall efficiency of defect detection but also provides operators with a deeper understanding of the factors that influence the AI system's output.

• This paper introduces a novel approach to detecting foreign objects in tire radiographic X-ray images using an Xception-based deep learning network.
• The proposed method demonstrates high efficiency in identifying foreign objects in defective tire images, with a primary focus on building a comprehensive tire foreign object dataset.
• Furthermore, the proposed detection model is investigated using the Grad-CAM method to provide a comprehensive interpretation of the decisions made.
• By analyzing the interaction between AI models and a wide range of tire defect images, we seek to provide a more comprehensive understanding of both the capabilities and limitations of AI-powered tire defect detection systems in industrial settings.
• For this study, an original dataset is obtained from a global tire manufacturer located in Kocaeli, Türkiye.
• This research contributes to the tire manufacturing industry's efforts to leverage cutting-edge technology for improving quality control processes and ensuring the production of safer and more reliable tires.
The rest of this paper is organized as follows. Section 2 briefly presents the related works. Section 3 presents the methodology and dataset of this study. Section 4 shows and discusses the experimental results. Section 5 presents the conclusions.

Related Works
The adoption of machine learning methods for automatic tire defect detection has gained popularity, with various algorithms and approaches explored. Our previous study [1] focused on tire classification using texture features and basic machine learning classifiers such as SVM (Support Vector Machine), kNN (k-Nearest Neighbors), and ANN (Artificial Neural Network), achieving approximately 99% accuracy.
Although limited in number, the literature contains some notable studies concentrating solely on foreign object detection. For instance, the authors of [2] proposed a wavelet multiscale representation method to detect tire defects in 400 radiographic images. They achieved 96.9% detection accuracy by selecting the optimal scale and threshold parameters for defect edge detection.
In [14], an end-to-end tire defect detection method was proposed that combines an optimized Semantic Segmentation Network with a compact CNN classifier. The authors used 3234 test images with four different defect types and achieved 96.5% classification accuracy. They used texture segmentation based on the Gabor filter and fuzzy c-means clustering to decrease the computational complexity.
In Ref. [15], the authors proposed a TireNet model by using the Siamese network with some modifications. They preprocessed images as object and non-object images to improve the learning models. The classification model is generated by a Siamese network and weighted cross-entropy loss. The final model is fine-tuned with a balanced threshold. The experimental results were obtained for a total of 120,000 images with 20,000 defective tire images.
In Ref. [16], an unsupervised approach is proposed to avoid labeling the defective images. The authors augmented the images with a Generative Adversarial Network (GAN)-based method. The features of the non-defective tire images were extracted, and these features were used for image reconstruction. The network memorized these features, and when a defective image was input into the generator, the difference between the features could be detected. This method achieved 79.8% accuracy with a training set of 20,000 defect-free images and a test set of 2077 defective and 3947 defect-free images.
The authors of [17] proposed a Faster R-CNN network based on the feature pyramid network to detect tire bubble defects. Their model contains four parts: the backbone, Tire Feature Pyramid Network, region proposal, and region prediction. Experimental results were evaluated with mean average precision and average precision. The model obtains better results than the classical feature pyramid network.
In Ref. [18], a class-level, weighted, partial domain adaptation network-based defect detection method is proposed for five different defect types (namely, impurity, bubble, slack, bend, and overlap). An average accuracy of 95.07% was obtained with the defective tire dataset. The same authors present a Transferable Swin Transformer-based method [19] for tire defect detection under domain shift conditions. The Swin Transformer is used as the feature extractor for all the images. This model exhibits 96.17% average accuracy.
In Ref. [20], a pixel-level defect detection method based on transformers is proposed. This method detects the type of defect as well as the geometric shape of the defect. The experimental results were obtained with 1450 tire images that contain six types of defects: Tread and Sidewall Foreign Matter, Tread and Sidewall Core Cracking, Sidewall Core Overlapping, and Sidewall Bubbles.
In another study [21], the authors augmented an imbalanced dataset with the Wasserstein Generative Adversarial Network (WGAN). The new balanced dataset was classified with transfer learning methods, and the best classification accuracy obtained with the ResNet model was 95.92%.

Dataset Collection and Labeling
The defective tire dataset for this study was collected from the Pirelli Automobile Tyre Factory, which is a global tire manufacturer located in Kocaeli, Türkiye. Radiological tire X-ray images were obtained from X-ray devices (Tire-X 3000 system, Alfautomazione, Lissone, Italy). The dataset was created with an advanced tire X-ray examination machine, as shown in [4]. The diode and the receiver are the two primary parts of the technology used in X-ray machines. This setup is housed in a lead-lined chamber to prevent radiation leakage, which keeps operation safe. When the cabin door is closed, a high voltage is supplied to the internal diode, which releases X-rays. The cabin is sealed as soon as the tire enters the chamber, and the tire then starts an exact 360-degree spin. Following the same principles as medical X-ray devices, a U-shaped receiver outside the tire records the resulting X-ray image. The diode is kept at a controlled temperature by a water-cooling system, which supports operational efficiency and safety. The length of the inspection process varies based on tire diameter and is typically one minute. Because the device is connected to a computer user interface, quality professionals can study the tire X-ray images in detail, which helps maintain high standards of integrity and quality.
All images are labeled as defective or non-defective with the assistance of the factory's expert quality control operators. We collected 2303 defective tire X-ray images and 49,198 non-defective tire images. The dataset used in this study contains several tire textures for different traffic modes and densities. The resolution of the collected images varied due to the wide range of tire shapes and sizes. To standardize the input, each image was resized to a resolution of 299 × 299 × 3 and normalized by dividing the pixel values by 255. Some examples of defective tire images are given in Figure 3. These are long images of tires, such as the images in Figure 1, but in Figure 3 only the defective part of each image is shown, and the foreign object zones are enclosed in green rectangles.
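The standardization step described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the nearest-neighbour resizing, the replication of a grayscale radiograph to three channels, and the function name `preprocess` are all assumptions, since the text specifies only the 299 × 299 × 3 target shape and the division by 255.

```python
import numpy as np

TARGET_SIZE = 299  # input height and width stated in the text


def preprocess(image: np.ndarray, size: int = TARGET_SIZE) -> np.ndarray:
    """Resize an 8-bit X-ray image to size x size x 3 and scale it to [0, 1]."""
    if image.ndim == 2:
        # Assumption: grayscale radiographs are replicated to three channels.
        image = np.stack([image] * 3, axis=-1)
    height, width = image.shape[:2]
    # Nearest-neighbour index maps (assumed; the interpolation is not stated).
    rows = np.arange(size) * height // size
    cols = np.arange(size) * width // size
    resized = image[rows[:, None], cols[None, :], :]
    # Normalize by dividing the pixel values by 255, as described in the text.
    return resized.astype(np.float32) / 255.0


# Hypothetical long, narrow tire strip filled with random 8-bit intensities.
rng = np.random.default_rng(0)
strip = rng.integers(0, 256, size=(1200, 400), dtype=np.uint8)
x = preprocess(strip)
print(x.shape, x.dtype)
```

In practice, a library resize with bilinear or area interpolation would probably be preferred for long strips; the nearest-neighbour version is used here only to keep the sketch dependency-free beyond NumPy.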

Dataset Augmentation
To address the class imbalance in our dataset, we used custom systematic augmentation techniques, namely shifting and illumination changes, specifically for the "foreign object" class. This enriched the data and made them more representative of real-world scenarios, ensuring better model performance in detecting foreign objects under various lighting and spatial conditions. By applying shifting techniques, we introduced small translations to the images in the "foreign object" class. This approach not only increased the diversity of the samples but also made the model more invariant to spatial transformations, enabling it to recognize foreign objects regardless of their location within an image. In addition, we employed illumination change techniques to vary the lighting conditions in the images of the "foreign object" class. Other augmentation techniques, aside from vertical shifting and brightness adjustment, were considered unsuitable for this application because they do not correctly recreate real-world conditions. For example, flipping images along the vertical or horizontal axis may cause severe distortion and break the coherence of layers and stripes. As a result, such augmentation strategies may cause the model to learn patterns that do not correspond to real-world conditions, resulting in unwanted outcomes. It is therefore critical to prioritize augmentation strategies that closely mimic real-world conditions observed in tire manufacturing operations to ensure model robustness and generalizability. Thus, in this work, we analyzed many augmentation methods and retained only those that were validated by manufacturing operators and specialists.
Table 1 summarizes the number of images in the dataset before and after augmentation. This augmentation method improved the model's ability to handle varying lighting conditions, making it more robust and adaptable to different environments. The combination of these systematic augmentation techniques allowed us to enrich the "foreign object" class with synthetic yet realistic data, which addressed the imbalance issue and facilitated the model's learning process.
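The two retained augmentations, vertical shifting and brightness adjustment, can be sketched as below. The shift sizes, brightness factors, edge-fill strategy, and function names are hypothetical choices for illustration; the text does not report the ranges that were actually used.

```python
import numpy as np


def vertical_shift(image: np.ndarray, pixels: int) -> np.ndarray:
    """Translate the image along axis 0; vacated rows repeat the nearest edge row."""
    if pixels == 0:
        return image.copy()
    shifted = np.roll(image, pixels, axis=0)  # np.roll returns a new array
    if pixels > 0:
        shifted[:pixels] = image[0]    # overwrite rows that wrapped to the top
    else:
        shifted[pixels:] = image[-1]   # overwrite rows that wrapped to the bottom
    return shifted


def adjust_brightness(image: np.ndarray, factor: float) -> np.ndarray:
    """Scale the intensities of a [0, 1] image and clip back to the valid range."""
    return np.clip(image * factor, 0.0, 1.0)


def augment(image, shifts=(-10, 10), factors=(0.8, 1.2)):
    """Return one variant per shift and per brightness factor (assumed ranges)."""
    variants = [vertical_shift(image, s) for s in shifts]
    variants += [adjust_brightness(image, f) for f in factors]
    return variants


# Hypothetical normalized image of the "foreign object" class.
sample = np.random.default_rng(0).random((299, 299, 3)).astype(np.float32)
print(len(augment(sample)))
```

Flips are deliberately absent from the sketch, in line with the argument above that they would break the coherence of the tire's layers and stripes.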

AI Model Training
The framework of this study is based on the Xception model, fine-tuned for our specific problem of foreign object detection [22]. Fine-tuning is crucial when using a model that was trained to solve a different task. In this study, fine-tuning was implemented by replacing the last fully connected layer with a separable convolutional layer followed by a global average pooling layer, and by using the sigmoid function in a final fully connected layer with one neuron for prediction. This reduced the parameter count and thus the model's complexity. Additionally, batch normalization was employed to stabilize the data distribution, resulting in enhanced model performance. Table 2 shows the architecture of the proposed model. The final output shape of the proposed model is "1", representing the defect index. The total number of parameters is 25,084,457, of which 25,025,833 are trainable and 58,624 are non-trainable. The model was trained on 90% of the dataset for 30 epochs with the Adam optimizer and a batch size of 16. Given the large size of our training dataset and the limits imposed by our hardware, we did not perform advanced hyperparameter tuning or compare other optimization algorithms. Instead, we reused the hyperparameters of our previous study, which used the Xception model to detect COVID-19 [23].
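To illustrate why a separable convolution reduces the parameter count, the following sketch compares the number of weights in a standard convolution with that of a depthwise-plus-pointwise pair. The layer shape (3 × 3 kernel, 2048 input and output channels) is a hypothetical example and is not taken from Table 2; only the three parameter totals at the end come from the text.

```python
def conv_params(kernel: int, c_in: int, c_out: int, bias: bool = False) -> int:
    """Weights in a standard kernel x kernel convolution."""
    return kernel * kernel * c_in * c_out + (c_out if bias else 0)


def separable_conv_params(kernel: int, c_in: int, c_out: int, bias: bool = False) -> int:
    """Weights in a depthwise (kernel x kernel per channel) plus pointwise (1 x 1) pair."""
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise + (c_out if bias else 0)


# Hypothetical layer shape for illustration only.
K, C_IN, C_OUT = 3, 2048, 2048
standard = conv_params(K, C_IN, C_OUT)
separable = separable_conv_params(K, C_IN, C_OUT)
print(f"standard:  {standard:,}")
print(f"separable: {separable:,}")
print(f"reduction: {standard / separable:.2f}x")

# Consistency check on the totals reported in the text.
TRAINABLE, NON_TRAINABLE, TOTAL = 25_025_833, 58_624, 25_084_457
assert TRAINABLE + NON_TRAINABLE == TOTAL
```

Global average pooling adds no weights of its own, so the one-neuron sigmoid layer that follows it needs only one weight per channel plus a bias.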

Evaluation Criteria
The proposed model was tested and evaluated using the validation set. The evaluation was conducted using common performance metrics, namely, accuracy, precision, recall, and F1-score. These metrics were calculated using the outcomes of the confusion matrix. To justify and provide more insights into the results, heatmaps using the Grad-CAM technique were included for each class. These heatmaps offer a visual representation of the key areas within the tire image that influenced the AI model's defect detection.
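As a brief sketch of how these metrics follow from a confusion matrix, the snippet below uses the standard definitions with defective tires as the positive class. The counts are those reported with the confusion matrix in the Experimental Results section (Figure 5); the function name `binary_metrics` is illustrative.

```python
def binary_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Standard binary classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Defective = positive: 3639 of 3685 defective tires detected (46 missed),
# 4897 of 4921 non-defective tires accepted (24 false alarms).
scores = binary_metrics(tp=3639, fn=46, fp=24, tn=4897)
for name, value in scores.items():
    print(f"{name}: {100 * value:.2f}%")
```

Rounded to two decimals, these counts give the accuracy, precision, recall, and F1-score values listed in the Experimental Results.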

Learning and Execution Environment
Experiments were conducted on a workstation equipped with an Intel Core i7-11700 CPU (Intel, Santa Clara, CA, USA) clocked at 2.50 GHz, an NVIDIA GeForce RTX 3060 Ti GPU (NVIDIA, Santa Clara, CA, USA), and 16 GB of RAM. The models were trained and evaluated with the TensorFlow v2.13.0 deep learning framework. Development was carried out in the Spyder IDE 5.1.5, running within an Anaconda environment with Python 3.7.10, on the Windows 11 operating system.

Experimental Results
The dataset was split into 90% for training and 10% for testing according to the standard splitting ratio for large datasets [24]. All performance results of the model's evaluation for the 10% validation set are presented in Table 3. The performance evaluation criteria used are accuracy, precision, recall, and F1-score. These metrics were used as defined in [25,26]. From the table, we can observe that the model achieved an accuracy of 99.19%, recall of 98.75%, precision of 99.34%, and an F1-score of 99.05%. These results show that the model is capable of distinguishing defective tires from non-defective ones. Recall, precision, and F-score were also recorded for the training and testing (validation) datasets alongside the convergence plots. The convergence curves, which depict the progression of the loss and accuracy over successive epochs during training, are illustrated in Figure 4. The curves illustrate the effectiveness of training and highlight potential issues such as overfitting that might arise during the training process, providing an initial view of the model's performance [27]. The curves show that, over the training epochs, the model exhibits a steady increase in accuracy and a decrease in loss, signifying that the model is learning and adapting to the patterns in the data. As the model approaches the final epochs of training, the curves change only slightly, with small increments in accuracy and decrements in loss.
During the validation stage, on the other hand, we observed some fluctuations in the model's performance with the validation set as it trained. This behavior is typical, indicating that the model starts to capture significant features from the data and is transitioning from a state of randomness to one of structured learning. The number of epochs was limited to 30 since we observed no further improvements in the model's performance.
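The 90/10 partition described at the start of this section can be sketched as follows. Stratifying by class, the fixed seed, and the function name `stratified_split` are assumptions made for illustration; the text states only the split ratio.

```python
import random


def stratified_split(labels, test_fraction=0.10, seed=0):
    """Return (train_idx, test_idx), keeping each class's share in both parts."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical toy labels: 1 = defective, 0 = non-defective.
labels = [1] * 40 + [0] * 60
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Splitting per class keeps the defective share of the held-out part close to that of the training part, which matters when one class has been enlarged by augmentation.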
The classification results of the proposed model are depicted in Figure 5 as a confusion matrix. The positive class is the defective tires, and the negative class is the non-defective tires. As the confusion matrix illustrates, 3639 defective tires were correctly classified out of 3685, with 46 being misclassified as non-defective. Similarly, 4897 non-defective tires were correctly classified out of 4921, with only 24 misclassifications. A defect-free tire that is flagged as defective can be re-examined and found to be defect-free. However, a defective tire that is classified as defect-free may be put on sale by the factory, so missed defects are the more costly error. It is therefore important that 3639 of the 3685 defective tire samples were classified correctly, which indicates that the model can minimize significant errors in real-world scenarios. The correct classification of 4897 out of 4921 defect-free samples likewise indicates that the model is generally robust in distinguishing defect-free tires.
In Figure 6, we observe heatmaps of two tires, one that is non-defective and the other that is defective. These heatmaps show a prominent and vibrant red area around a specific region of the tire, precisely where the tire's sidewall and tread layer meet. This intense coloration signifies that the AI model placed significant emphasis on this region when classifying the tire. This area contains features that may be decisive for the structural integrity or performance of the tire, and the AI model appears to give it the greatest weight in the classification. The heatmap thus suggests that the model has captured a weak point or an important feature in the structure of the tire within this red region. With this figure, the decision-making mechanism of the model in the classification process and the priority order of the features are better understood.
Upon closer inspection of the defective tire, we identify a small foreign object inside the red area. This small but critical defect, nearly imperceptible to the human eye, is precisely what the AI model detected and highlighted through the heatmap. This interpretation provides quality control operators with a clear understanding of why the AI system classified the tire as defective, enabling them to take appropriate action and ensure the production of safer and higher-quality tires. Table 4 presents a comparison with related studies in the literature. Compared with these similar works, our method achieves high accuracy. During testing, the proposed model is also efficient, with a testing time of 0.00919 seconds per image and an average frame rate of 108.84 frames per second (FPS). Furthermore, the training time per epoch is 333 seconds, indicating the speed of the model.
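The core Grad-CAM computation behind the heatmaps discussed above can be sketched as follows. This is a minimal NumPy illustration of the weighting step only, under the assumption that the feature maps of the last convolutional layer and the gradients of the defect score with respect to them are already available (in practice they would be obtained through the framework's automatic differentiation); the toy tensors and the function name `grad_cam` are hypothetical.

```python
import numpy as np


def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Combine (H, W, K) feature maps and score gradients into an (H, W) heatmap."""
    # Channel weights: global average of the gradients over the spatial axes.
    weights = gradients.mean(axis=(0, 1))  # shape (K,)
    # Weighted sum of the feature maps, then ReLU to keep positive evidence.
    cam = np.maximum((activations * weights).sum(axis=-1), 0.0)
    # Normalize to [0, 1] for display; an all-zero map is returned unchanged.
    peak = cam.max()
    return cam / peak if peak > 0 else cam


# Toy example: channel 0 supports the class at (2, 3), channel 1 opposes it at (0, 0).
activations = np.zeros((5, 5, 2))
activations[2, 3, 0] = 1.0
activations[0, 0, 1] = 1.0
gradients = np.zeros((5, 5, 2))
gradients[..., 0] = 1.0
gradients[..., 1] = -1.0

heatmap = grad_cam(activations, gradients)
print(np.unravel_index(heatmap.argmax(), heatmap.shape))
```

For display, the low-resolution map is typically upsampled to the input size and overlaid on the X-ray image with a color map, which is where the red regions in the figure would come from.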

Conclusions
In brief, this study introduces a novel approach to detecting tire defects, with a specific emphasis on the identification of foreign objects in radiographic X-ray images. The proposed approach utilizes a deep learning network based on the Xception architecture and heatmaps generated by Grad-CAM to achieve both efficiency and accuracy in identifying foreign objects. The development of a comprehensive dataset on foreign objects in tires enhances the robustness of our methodology.
This work undertakes a thorough examination to gain a comprehensive understanding of the capabilities and limitations of AI-powered tire defect detection in real-world environments. The experimental results demonstrate the strong performance of our model, with an accuracy of 99.19%, recall of 98.75%, precision of 99.34%, and F1-score of 99.05%. The incorporation of Grad-CAM into the system improves interpretability, hence facilitating the process of making quality control decisions.
The results of this study not only contribute to the progress of AI-supported tire fault identification but also establish novel benchmarks for quality assurance within the tire manufacturing sector. The use of Grad-CAM in the process of continuous refinement holds the potential to further enhance the accuracy of detection systems. This supports the tire industry's efforts to manufacture safer and more reliable tires. However, it is crucial to acknowledge that our work has limitations. We addressed only one kind of tire fault because it is common, but future research will look into other defect types to develop a more comprehensive model. Furthermore, our future research efforts will focus on creating a comprehensive system that integrates defect detection with automated decision-making processes, with the goal of increasing efficiency and accuracy in tire manufacturing operations.

Figure 1. X-ray images of the non-defective tires.

Figure 2 visualizes the framework designed for tire defect detection and classification. The framework comprises six stages, namely, data collection and labeling, preprocessing, data augmentation, data splitting, model training, and finally testing and evaluating the trained model on unseen images. The integration of these clearly defined stages supports the framework's reliability and efficacy.

Figure 3. X-ray images of defective tire parts showing the foreign objects.

Figure 4. Convergence rate of the proposed method.

Figure 5. Confusion matrix of the proposed model.

Figure 6. The heatmap using Grad-CAM of a non-defective tire (on the left) and a defective tire (on the right).

Table 1. Dataset size before and after augmentation.

Table 2. The proposed model architecture.

Table 3. The performance results of the proposed model.

Table 4. Comparison with related studies in the literature.