1. Introduction
Traditional defect detection research has focused mainly on planar surfaces, regular spherical surfaces, and surfaces with diffuse reflection properties. In recent years, as inspection systems have been required to become more intelligent, several scholars have begun to study defect detection on irregular reflective surfaces, where visual inspection remains difficult. Objects with such surfaces include ceramics, fine-ground metal parts, and solar panels. When they are photographed, their high reflectivity causes local overexposure, and the images are easily affected by disturbances in the surrounding environment. Detection is made harder still because the high reflectivity of the surface and its irregular shape are coupled. To address the problem of detecting defects on irregular surfaces, this study uses sanitary ceramics, which have an irregular surface with specular reflection, as the research object.
Uncertainty factors in production, such as spraying processes, product transportation, and personnel operation, can cause cracks, breakages, peeling, and other appearance defects on the surface of the workpiece. These defects restrict the normal use of the product to a large extent [1,2,3,4,5]. A sanitary ceramic defect detection system that identifies and locates defects is therefore of practical value. However, the automatic detection of defects on irregular surfaces involves several challenges: the defects are very small, different defect types look very similar, and the contrast between the defect and the background is low. The types of defects identified in this study were pinholes, hard cracks, cracks, spots, pits, and impurities on the surface of sanitary ceramics.
To solve the above problems, researchers have carried out defect detection on irregular surfaces by manually extracting shallow features of the research object and feeding them to a conventional machine learning classifier, such as a Support Vector Machine (SVM). Shanmugamani et al. [6] calculated a histogram, the gray-level co-occurrence matrix, and texture features of the images and obtained the classification results of surface defects in gun barrels using an SVM classifier. However, the algorithm needs to extract multiple texture features, which makes it time-consuming. Saeed et al. [7] used a closing operation for the morphological processing of defect areas on tiles and classified the defects using a one-to-many (one-versus-rest) support vector machine. Li et al. [8] located the defect profile of a track by using a trajectory extraction program and a projection algorithm. Feng et al. [9] proposed a probabilistic model for detecting the wear or loss of railway fasteners. The performance of a defect detection method based on traditional image processing depends largely on the quality of the extracted features. Therefore, when defect profiles are very similar, the recognition performance of the model suffers. In addition, when a new type of defect occurs, the feature extractor needs to be redesigned; thus, such algorithms are not robust to the defect shape.
In recent years, with the development of deep learning [10,11], object detection algorithms based on Convolutional Neural Networks (CNNs) have achieved great success in many fields. A deep learning model avoids the cumbersome image processing steps of the traditional method by autonomously extracting high-dimensional information from the image. Some researchers have tried to apply deep learning to the detection of defects on irregular surfaces. Maestro-Watson et al. [12] proposed a U-Net-based Fully Convolutional Neural Network (FCN) for the semantic segmentation of defects on reflective surfaces. The network performed a pixel-wise classification on the basis of local curvatures and data modulation to determine the location and boundaries of defects, and it achieved excellent recognition performance in industrial environments. Park et al. [3] detected defects such as dirt and scratches on the surface of a product by constructing deep networks with different depths and numbers of layer nodes. Zhao et al. [13] trained a backpropagation network and an SVM classifier on the area feature of the target. This method could effectively segment the defect area, but the expressive power of the shallow features was limited, resulting in low recognition accuracy. Xu et al. [14] proposed an improved Faster R-CNN model that combines the characteristics of different layers and applies the Soft-NMS algorithm to detect defects in a tunnel. However, because of the excessive number of region proposals considered by Soft-NMS, the detection speed of the model was not ideal. Although these deep learning-based methods have advanced recognition performance, for micro-defects with different contours on a reflective surface they have poor positioning accuracy and do not extract shallow features that carry local detail information.
In summary, there is a need for a model with improved robustness to the defect profile and a feature matrix that is able to distinguish defects. Thus, this study used Faster R-CNN with powerful feature expression and a structure that can be easily modified as the recognition framework.
However, since the image of a sanitary ceramic surface defect is quite different from the image of a natural scene, the direct use of the standard Faster R-CNN network encounters the following challenges: some defects (such as pinholes) in the image are 15 × 15 pixels, and the image size is 640 × 480 pixels; thus, the target area is too small relative to the entire picture. Furthermore, similar contour features (such as cracks and hard cracks) increase the difficulty of distinguishing defects. In order to solve these problems and thus increase the model’s ability to recognize micro-defects on irregular surfaces, we introduced the K-Means clustering algorithm and multi-scale feature combination to Faster R-CNN.
The structure of this paper is as follows: Section 2 introduces the method proposed in this paper, and Section 3 shows the experimental results. Finally, Section 4 summarizes this article.
3. Experimental Results
This section analyzes our proposed defect detection method. For the experiments, TensorFlow was selected as the deep learning framework. The program was run on a PC with an Intel Core i7-8700 CPU, an NVIDIA GTX 1070 GPU, and 16 GB of memory, running Windows 10 (version 1903). The test set comprised 2295 images containing various types of surface defects.
3.1. Evaluation Indicators
To measure the detection performance of the proposed model, we used three common evaluation indicators, namely, Precision, Recall, and F-score. Precision is the ratio of correctly identified positive detections to all predicted positive detections, while recall is the ratio of correctly identified positive detections to all actual positives. The F-score is used to measure the overall performance of the model and is defined as follows:
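As a minimal sketch, assuming the F-score used here is the standard balanced F1 (the harmonic mean of precision and recall), the three indicators can be computed from detection counts as shown below. The function name `detection_scores` and the example counts are illustrative and are not taken from the experiments.

```python
from typing import Tuple


def detection_scores(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F-score) from detection counts.

    tp: defects correctly detected, fp: false alarms, fn: missed defects.
    Assumption: the F-score is the balanced F1, i.e. the harmonic mean
    of precision and recall.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Illustrative counts only (not results from the paper).
p, r, f = detection_scores(tp=90, fp=10, fn=30)
print(f"precision={p:.3f} recall={r:.3f} F={f:.3f}")
```

Because the F-score is a harmonic mean, it stays low whenever either precision or recall is low, which is why it is a convenient single number for comparing configurations.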
3.2. The Importance of K-Means
Figure 9 shows the detection effect of Faster R-CNN on micro-miniature defects with and without the K-Means algorithm generating the aspect ratio of the anchor box. The figure shows that introducing the K-Means clustering algorithm increases the detection framework’s accuracy in locating small defects, such as spots. Additionally, the K-Means algorithm increases the confidence in the judgment of the defect type.
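The idea of deriving anchor aspect ratios from the data can be illustrated with a small sketch. It assumes plain Lloyd's K-Means on the scalar width/height ratio of the labeled boxes with an absolute-difference distance; the distance measure and the number of clusters actually used are not stated in this section, and the box sizes below are hypothetical.

```python
def kmeans_1d(values, k, iters=50):
    """Lloyd's k-means on scalars; returns the sorted cluster centres.

    Centres are initialised deterministically at evenly spaced positions
    of the sorted data (an illustrative choice, not the paper's).
    """
    ordered = sorted(values)
    n = len(ordered)
    centres = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centres[i]))
            clusters[nearest].append(v)
        # Keep the old centre if a cluster ends up empty.
        new_centres = [
            sum(c) / len(c) if c else centres[i] for i, c in enumerate(clusters)
        ]
        if new_centres == centres:
            break
        centres = new_centres
    return sorted(centres)


# Hypothetical ground-truth boxes as (width, height) in pixels.
boxes = [(15, 15), (14, 16), (16, 15), (40, 12), (44, 11),
         (38, 13), (10, 36), (12, 41), (11, 39)]
ratios = [w / h for w, h in boxes]
anchor_ratios = kmeans_1d(ratios, k=3)
print([round(a, 2) for a in anchor_ratios])
```

The resulting centres would replace hand-picked aspect ratios when the anchor boxes are generated, so that the initial boxes follow the shapes that actually occur in the dataset.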
In the target detection task, a larger anchor box contains more redundant background, which reduces the precision of the model and increases the computational cost. Therefore, we set the base size n of the generated anchor boxes to 2, 4, and 8 in turn and compared each setting with the standard network (whose base size is 16). By comparing the resulting F-scores, we determined the most suitable size for the research object.
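To make the effect of the base size concrete, the following sketch generates anchor sizes in the style of the reference Faster R-CNN implementation. The scales (8, 16, 32) and ratios (0.5, 1, 2) are that implementation's defaults and are assumed here for illustration; the settings used in this study may differ, and `make_anchors` is a hypothetical helper.

```python
import math


def make_anchors(base_size, scales=(8, 16, 32), ratios=(0.5, 1.0, 2.0)):
    """Return (width, height) for every scale/ratio pair.

    Each anchor keeps the area (base_size * scale) ** 2, and
    ratio is defined as height / width.
    """
    anchors = []
    for scale in scales:
        area = (base_size * scale) ** 2
        for ratio in ratios:
            w = math.sqrt(area / ratio)
            h = w * ratio
            anchors.append((round(w), round(h)))
    return anchors


# Square anchor at the smallest scale for each base size under comparison.
for n in (2, 4, 8, 16):
    print(n, make_anchors(n)[1])
```

Under these assumed defaults, the smallest square anchor of the standard network (base size 16) is 128 × 128 pixels, far larger than a 15 × 15 pixel pinhole, whereas smaller base sizes bring the anchors closer to the defect size.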
The models used for defect detection were trained for 40k iterations, and the confidence threshold was set to 0.5 in the test phase. The changes in the F-score at each stage of training are shown in Table 3 at 10k-iteration intervals. It can be seen from the table that the network with a base size of 8 maintains good detection performance throughout. However, a smaller base size is not always better. When the base size is 2, the initial anchor boxes are so small that the detection performance of the model is markedly weakened. The model with a base size of 8 has the highest detection performance, and its F-score is 0.015 higher than that of the standard network (base size of 16).
3.3. Analysis of the Multi-Scale Fusion Feature
After multi-scale feature fusion, the feature matrix can improve the recall rate of the defect detection model. To determine the best performance, we combined the features from different layers. These networks completed 40k iterations in training and had the same hyperparameters.
Table 4 shows the recall rate and the F-score of the feature matrix composed of different layers. The value 4-5 in the table indicates that the feature map is composed of the feature matrices output by Block4 and Block5, and Conv indicates that the feature map comes only from the last layer output of the standard network. It can be seen from the table that when the feature matrix comes from Block4 and Block5, the model has the highest recall rate and F-score because the feature matrix combines abstract semantic information with local detail features. The recognition results of the fused feature matrix from Block3, Block4, and Block5 are not ideal: the shallow features contained in Block3 improve positioning, but they discriminate poorly between defects with similar contours. The feature map from Conv5_3 in Faster R-CNN has a lower recall rate than that of 4-5, so we fused the features from Block4 and Block5.
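The fusion of Block4 and Block5 can be sketched as follows. This is an illustration under stated assumptions rather than the network's actual implementation: Block5 is taken to be at half the spatial resolution of Block4 (as in a VGG16-style backbone), upsampling is nearest-neighbour, and the maps are combined by channel concatenation. Feature maps are plain nested lists indexed as [channel][row][column], and the values are toy numbers.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a [C][H][W] nested list."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]  # repeat each column
            rows.append(wide)
            rows.append(list(wide))                    # repeat each row
        out.append(rows)
    return out


def fuse(shallow, deep):
    """Upsample the deeper map to the shallow map's size, then
    concatenate the two along the channel axis."""
    deep_up = upsample2x(deep)
    same_h = len(deep_up[0]) == len(shallow[0])
    same_w = len(deep_up[0][0]) == len(shallow[0][0])
    if not (same_h and same_w):
        raise ValueError("spatial sizes do not match after upsampling")
    return shallow + deep_up


# Toy maps: "block4" has 2 channels of 4x4, "block5" has 1 channel of 2x2.
block4 = [[[c for _ in range(4)] for _ in range(4)] for c in (1, 2)]
block5 = [[[9, 8], [7, 6]]]
fused = fuse(block4, block5)
print(len(fused), len(fused[0]), len(fused[0][0]))
```

The fused map keeps the finer grid of the shallower block while carrying the channels of the deeper block, which is the sense in which it combines local detail with abstract semantic information.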
To compare the convergence of Faster R-CNN with that of the improved model, we studied the change in the recognition precision and the loss function values over the first 10k training iterations. The models in the comparative experiment shared the same hyperparameters (learning rate, maximum number of iterations, and weight decay rate). Stochastic Gradient Descent (SGD) was used as the optimizer, with a test interval of 100 iterations. The detection performance line graph is shown in Figure 10.
In the figure, the loss values of the two models converge to around 0.1, indicating a good fit. In the early stage of training, the precision of our algorithm rises faster because the newly generated anchor boxes better match the morphological characteristics of the research object, so micro-miniature defects are located more accurately. For Faster R-CNN, the precision fluctuates greatly because the oversized initial anchor boxes introduce more redundant information. After entering the steady rising phase, the improved model’s ability to perceive micro-miniature defects with similar contours is further enhanced, which increases the model’s recognition accuracy. In the first 10k iterations of training, the average precision of the improved model increases by 8.17%.
3.4. Comparison of Four Models for Micro-Defect Recognition
We compared the proposed method with other detection frameworks. Table 5 shows the performance indicators of Faster R-CNN, YOLO-V3 [28], SVM + LBP (Local Binary Pattern) [29], and our method on the dataset we collected. The core idea of the YOLO (You Only Look Once) target detection framework is to use the global information of the entire image to regress the position and class of each bounding box directly in the output layer, which gives it a faster detection speed. SVM is a typical machine learning method that uses statistical concepts to solve classification problems. Here, texture features are extracted at the defects and mapped to a high-dimensional space to train the classifier and perform detection. SVM is suitable for classifying small-scale data.
It can be seen from the table that the method we propose has the best detection performance. SVM has poor recognition performance for micro-defects with irregular contours because only the shallow features of the image are extracted. In terms of detection time, YOLO-V3 is faster, but its precision and recall are too low for practical use. Because the proposed algorithm upsamples the features of different layers and fuses them, its detection time is longer. However, because a production line runs at a low speed and requires high precision, the detection time still meets real-time requirements.
In addition, to further verify the robustness of the proposed method in an actual application scenario, we simulated illumination changes by changing the contrast of the images in the test set and tested our model directly on these samples. The test samples under different contrasts are shown in Figure 11. The F-scores under different lighting conditions are shown in Table 6. It can be seen that the model has the highest F-score when the brightness of the test sample is similar to that of the training sample. When the brightness fluctuates within a certain range, the detection performance of the model does not change significantly. These results indicate that our method is robust to illumination changes.
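The contrast perturbation used to simulate illumination changes can be sketched as below. The exact transformation and contrast levels are not given in this section, so the sketch assumes a simple linear adjustment that scales each pixel's deviation from the image mean; `adjust_contrast`, the factors, and the pixel values are hypothetical.

```python
def adjust_contrast(image, factor):
    """Scale each pixel's deviation from the image mean by `factor`
    and clip the result to the 8-bit range [0, 255].

    image: grayscale image as a list of rows of integer intensities.
    factor < 1 lowers contrast, factor > 1 raises it.
    """
    pixels = [v for row in image for v in row]
    mean = sum(pixels) / len(pixels)
    return [
        [min(255, max(0, round(mean + factor * (v - mean)))) for v in row]
        for row in image
    ]


# Toy 2x2 grayscale patch (values are illustrative).
gray = [[100, 120], [140, 160]]
low = adjust_contrast(gray, 0.5)
high = adjust_contrast(gray, 1.5)
print(low, high)
```

Applying such a transformation only to the test images, while leaving the trained model unchanged, matches the protocol described above of testing directly on the perturbed samples.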