Automatic Fabric Defect Detection Method Using PRAN-Net

Abstract: Fabric defect detection is an important step in textile quality control. Current deep learning algorithms are not effective at detecting tiny fabric defects or defects with extreme aspect ratios. In this paper, we propose a strong detection method, the Priori Anchor Convolutional Neural Network (PRAN-Net), for fabric defect detection to improve detection and location accuracy and decrease inspection time. First, we use a Feature Pyramid Network (FPN) with selected multi-scale feature maps to preserve more detailed information about tiny defects. Second, we propose a trick that generates sparse priori anchors from the ground-truth boxes of fabric defects, instead of fixed anchors, to locate extreme defects more accurately and efficiently. Finally, a classification network classifies the fabric defects and refines their positions. The method was validated on two self-made fabric datasets. Experimental results indicate that our method significantly improves the accuracy and efficiency of fabric defect detection and is better suited to automatic fabric defect inspection.


Introduction
Fabric defect detection is a necessary quality inspection process that aims to classify and locate defects in textiles. However, most fabric defects are still identified by human visual inspection, whose accuracy is only 50-70% [1] and whose speed is only 6-8 m of fabric per minute [2]. Automatic defect detection technology based on deep learning dramatically improves detection accuracy and efficiency. Therefore, automatic fabric defect inspection algorithms have received more and more research attention.
However, there are more than 70 categories of fabric defects [3] in actual production. Most of them are tiny in images and have extreme aspect ratios, such as coarse warp, coarse weft and mispick. Current deep learning algorithms for object detection are divided into one-stage algorithms [4][5][6][7][8][9] and two-stage algorithms [10][11][12][13][14][15]. One-stage algorithms such as RetinaNet have fast inspection speeds that meet the demands of on-line inspection, but their detection accuracy is unsatisfactory. Compared with one-stage algorithms, two-stage algorithms achieve higher classification and positioning accuracy in object detection. Faster R-CNN [13], an excellent two-stage algorithm, is easy to implement and highly accurate for multi-object detection, and many researchers have applied it to ship detection [16] and medical image detection [17,18]. However, the objects detected in those tasks occupy large regions of the image and have moderate aspect ratios, as with people, airplanes, etc. Many methods, such as GANs [19], deep convolutional neural networks [20] and YOLO v3 [21], are effective at detecting fabric defects, but cannot handle exceedingly small defects or defects with extreme aspect ratios.
To improve the detection accuracy of tiny defects, some researchers use the Feature Pyramid Network (FPN) [22] to fuse feature maps of different scales. The Mask R-CNN [14] method improves tiny-defect detection accuracy, but its detection runtime is much longer. Although these methods improve the detection accuracy of tiny defects, their location accuracy for extreme-shape defects is still unsatisfactory, because preset fixed-size anchors cannot accurately match fabric defects with extreme aspect ratios.
Anchor-free algorithms such as CenterNet [23] model an object as a point, which improves the detection accuracy of tiny objects and partly improves the detection of extreme objects. However, CenterNet still cannot accurately detect extremely long and narrow fabric defects. The Guided Anchoring method [24] was proposed to replace fixed anchors; it lowers the detection speed but increases the detection accuracy for extreme defects. However, Guided Anchoring cannot detect defects when the defect distribution is too concentrated.
Inspired by FPN and Guided Anchoring, this paper proposes a Priori Anchor Convolutional Neural Network (PRAN-Net) based on Faster R-CNN to detect tiny and extreme fabric defects. First, we introduce the FPN to preserve more detailed information, which suits tiny fabric defects. Second, we propose a trick to generate sparse priori anchors, which match extreme-aspect-ratio defects well and remove a large number of redundant anchors, improving both the accuracy and efficiency of fabric defect detection. Finally, a classification network classifies the fabric defects and refines their positions. Figure 1 shows the fabric defect detection process of the proposed PRAN-Net. First, the FPN [22] extracts feature maps at different scales from the fabric images. Then, priori anchors are adaptively generated on each scale of the feature maps as defect proposals. Finally, a classification network classifies the defects and refines their positions in the fabric images using the defect proposals.

Feature Extraction Based on Multi-Scales Feature Maps
As shown in Figure 2, we use the typical deep residual network ResNet-101-FPN [25] as the backbone to extract fabric defect features. The whole network is divided into a bottom-up line and a top-down line, which consist of several convolutional layers and upsampling layers. The two lines are laterally connected by 1 × 1 convolutional layers, which reduce the channel dimensions. On the bottom-up line, the classical FPN obtains feature maps at five scales. To preserve enough information about tiny defects while discarding redundant information and increasing efficiency, we retain only three feature maps, with strides of {4, 8, 32}, defined as {C1, C2, C3}, respectively. On the top-down line, each high-level feature map is fused with the low-level map below it. M3, the third-scale map among the merged maps, is obtained by the 1 × 1 convolution of C3.
The upsampled M3 and the 1 × 1 convolution result of C2 are summed to get M2, and the same operation on M2 and C1 gives M1. To reduce the aliasing effect of upsampling, {M1, M2, M3} are convolved with a 3 × 3 kernel to obtain three new feature maps {P1, P2, P3}. This top-down interlayer fusion preserves more detailed defect information, which is especially beneficial for detecting tiny defects.
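The top-down fusion described above can be sketched in PyTorch. The channel widths (256/512/2048 for C1/C2/C3, as in a ResNet backbone) and the use of nearest-neighbour upsampling are assumptions, since the text does not specify them:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopDownFusion(nn.Module):
    """Sketch of the top-down fusion of {C1, C2, C3} into {P1, P2, P3}.
    Channel widths are assumed values for a ResNet-101 backbone whose
    retained maps have strides {4, 8, 32}."""
    def __init__(self, in_channels=(256, 512, 2048), out_channels=256):
        super().__init__()
        # 1 x 1 lateral convolutions reduce each C_i to a common channel width
        self.lateral = nn.ModuleList(
            nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels)
        # 3 x 3 convolutions reduce the aliasing effect of upsampling
        self.smooth = nn.ModuleList(
            nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)
            for _ in in_channels)

    def forward(self, c1, c2, c3):
        m3 = self.lateral[2](c3)  # M3 = 1 x 1 conv of C3
        # sum each upsampled high-level map with the lateral result below it
        m2 = self.lateral[1](c2) + F.interpolate(m3, size=c2.shape[-2:])
        m1 = self.lateral[0](c1) + F.interpolate(m2, size=c1.shape[-2:])
        return tuple(s(m) for s, m in zip(self.smooth, (m1, m2, m3)))
```

Each output P_i keeps the spatial size of its C_i, so fine detail from the stride-4 map survives alongside the semantics of the stride-32 map.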

Priori Anchor Generation
The Region Proposal Network (RPN) [13] predicts defect proposals on feature maps using anchors. Generally, the series of fixed-size anchors set at each pixel of the feature map in the RPN works well for detecting defects with moderate aspect ratios. Because most fabric defects have extreme aspect ratios, fixed-size anchors cannot match the defective regions exactly, which leads to location deviations. Furthermore, a large number of fixed anchors are placed on each feature map by sliding windows; most of them contribute nothing to defect detection and add extra computation time. PRAN-Net instead generates sparse priori anchors to locate diverse fabric defects accurately and efficiently.
A fabric defect in an image is determined by its location and shape, represented by a quaternary tuple (x, y, w, h), where (x, y) is the center location of the defect and w and h are its width and height.

Location Prediction
The first step is to predict the locations of the priori anchors on each feature map using the location prediction network N_l, as shown in Figure 3. A semantic segmentation is performed by a 1 × 1 convolutional layer to obtain a defect score map. The scores are then passed through an element-wise sigmoid function to produce a defect probability map of the same size as the feature map. Pixels whose probability exceeds the set threshold Δ_L are treated as defect pixels.
We determine the defect sub-regions by gathering defect pixels that lie within a set distance of one another. If a GT box, represented by (x_g, y_g, w_g, h_g), lies in one of the sub-regions, the enclosing rectangle of that sub-region is taken as a candidate anchor, described as (x_0, y_0, w_0, h_0), where (x_0, y_0) are its center coordinates and w_0 and h_0 are its width and height.
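The location step can be illustrated with a small sketch. The paper does not specify the exact grouping rule, so the "set distance" is assumed here to be implemented by binary dilation followed by connected-component labelling; `merge_dist` is a hypothetical parameter:

```python
import numpy as np
from scipy import ndimage

def candidate_anchors(prob_map, thresh=0.5, merge_dist=1):
    """Threshold the defect probability map, group nearby defect pixels
    into sub-regions, and return the enclosing rectangle of each region
    as a candidate anchor (x0, y0, w0, h0) in pixel units.
    merge_dist is a hypothetical parameter: gaps up to this size are
    bridged by dilation so that nearby pixels share one sub-region."""
    mask = prob_map > thresh
    if merge_dist > 0:
        mask = ndimage.binary_dilation(mask, iterations=merge_dist)
    labels, _ = ndimage.label(mask)  # connected components = sub-regions
    anchors = []
    for ys, xs in ndimage.find_objects(labels):
        w0, h0 = xs.stop - xs.start, ys.stop - ys.start
        x0, y0 = xs.start + w0 / 2.0, ys.start + h0 / 2.0  # center coords
        anchors.append((x0, y0, w0, h0))
    return anchors
```

Only regions that actually contain defect pixels yield anchors, which is why the resulting anchor set is sparse compared with sliding-window RPN anchors.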
We introduce the GHM-C loss [26] as the priori anchor location loss L_loc.

Shape Prediction
As shown in Figure 3, the second step is to predict the shapes of the priori anchors a_priori. IoU_can is the intersection over union (IoU) of the candidate anchor and the matched GT box. We try to enlarge this IoU by adjusting the width and height of the candidate anchor. First, the shape prediction network N_s, which consists of a 1 × 1 convolutional layer and an element-wise transform layer, predicts the width w0' and height h0' as follows:

w0' = w0 · e^(tw), h0' = h0 · e^(th),

where tw and th are the scale parameters obtained by the above 1 × 1 convolutional layer. The adjusted width w0' and height h0' give the shape of the priori anchor. Different from the thousands of fixed anchors per feature map, only sparse priori anchors are generated by PRAN-Net. The priori anchor shape loss L_shape is as follows:

L_shape = L1(1 − min(w0'/w_g, w_g/w0')) + L1(1 − min(h0'/h_g, h_g/h0')),

where L1 is the smooth L1 loss and w_g and h_g are the width and height of the matched GT box.
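Under the assumption that the element-wise transform rescales the candidate anchor exponentially (as in Guided Anchoring, from which this step is adapted), the shape adjustment and its bounded-ratio loss can be sketched as:

```python
import torch

def adjust_shape(w0, h0, t_w, t_h):
    """Element-wise transform (assumed exponential form): the predicted
    scale parameters t_w, t_h rescale the candidate anchor in log space,
    so a wide range of shapes can be regressed stably."""
    return w0 * torch.exp(t_w), h0 * torch.exp(t_h)

def shape_loss(w, h, w_gt, h_gt, beta=1.0):
    """Bounded shape loss: smooth L1 of (1 - min(w/w_gt, w_gt/w)) plus the
    same term for heights; zero when the anchor matches the GT shape."""
    def smooth_l1(x):
        return torch.where(x < beta, 0.5 * x * x / beta, x - 0.5 * beta)
    dw = 1.0 - torch.minimum(w / w_gt, w_gt / w)
    dh = 1.0 - torch.minimum(h / h_gt, h_gt / h)
    return (smooth_l1(dw) + smooth_l1(dh)).mean()
```

The min-of-ratios term is symmetric, so over- and under-estimating a dimension by the same factor is penalized equally, which matters for very long, narrow defects.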

Defect Classification Network
The structure of the classification network is shown in Figure 4. Defect proposals and their corresponding feature maps are resized to a fixed size by an RoI Align layer [14]. The features are then fed through a sequence of fully-connected layers and finally branched into classification and bounding-box regression. The classification branch predicts the defect category, and the regression branch predicts the refined defect position after this secondary localization.

Experiment and Results
The fabric defect detection experiments ran on an Intel Core i7 v4 CPU, an RTX 2080Ti GPU, 128 GB of RAM and the Ubuntu 16.04 operating system. All experiments were implemented in PyTorch. Two experimental fabric defect datasets were used: the plain fabric dataset and the denim dataset.

The Plain Fabric Dataset
The plain fabric dataset has 1054 images with 1164 defects labeled in PASCAL VOC [27] format. The image size is 128 × 256 pixels. There are five defect categories: oil stains, coarse warp, long coarse weft, short coarse weft and mispick, as shown in Figure 5. Most defects are tiny, and their shapes are diverse. Table 1 gives the number of defects in each category. All defects except oil stains are tiny. Long coarse weft, coarse warp and mispick are the extreme-aspect-ratio defects, which make up 54.4% of the total defects.

The Denim Dataset
The denim dataset has 6913 images with 9523 defects labeled in PASCAL VOC [27] format. The image size is 2446 × 1000 pixels. There are 20 defect categories. The number of defects in each category is shown in Table 2 and Figure 6a. Most defects are tiny, with areas less than 1% of the entire image. The aspect-ratio distribution of the defects is shown in Figure 6b. The red bounding box corresponds to the number of defects with moderate aspect ratios. Defects with extreme aspect ratios, from 0 to 0.5 and from 2 to 5, account for 47.7% of the total defects.
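For illustration, per-defect aspect ratios and relative areas can be computed directly from the PASCAL VOC annotation files; the 0.5 and 2 thresholds for "extreme" follow the distribution described above, and the field names are standard VOC, not dataset-specific:

```python
import xml.etree.ElementTree as ET

def defect_stats(xml_text, img_w, img_h):
    """Parse one PASCAL VOC annotation and report, for each object, its
    aspect ratio (w/h) and its area as a fraction of the image. Defects
    with ratio below 0.5 or above 2 are flagged as extreme."""
    root = ET.fromstring(xml_text)
    stats = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        w = int(box.find("xmax").text) - int(box.find("xmin").text)
        h = int(box.find("ymax").text) - int(box.find("ymin").text)
        ratio = w / h
        stats.append({
            "name": obj.find("name").text,
            "ratio": ratio,
            "area_frac": (w * h) / (img_w * img_h),
            "extreme": ratio < 0.5 or ratio > 2,
        })
    return stats
```

Running this over every annotation file yields the category counts and aspect-ratio histogram of the kind shown in Figure 6.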

Defect Detection Evaluation Metrics
The IoU of a detected region with its matched GT box is used to evaluate detection performance, as shown in Figure 8. The IoU threshold is set to 0.5 to decide whether a detected bounding box is a defect. Figure 9 illustrates the counting of detections: the upper part of the outer rectangle contains the true defects, TP and FN; the lower part contains the non-defects, FP and TN; and the inner rectangle contains the detected defects, i.e., the TP plus the FP detected by mistake. Precision p is the ratio of detected true defects to all detected defects. Recall r is the ratio of detected true defects to all true defects. ACC, p and r are defined, respectively, as follows:

ACC = (TP + TN) / (TP + FP + FN + TN) (7)

p = TP / (TP + FP) (8)

r = TP / (TP + FN) (9)

For each defect class, p and r decrease as the IoU threshold increases. AR is the ratio of the detected true defects of all classes to the true defects of all classes, which is calculated as follows:

AR = Σi TPi / Σi (TPi + FNi) (10)
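These metrics are straightforward to compute from the TP/FP/FN/TN counts. The helpers below also include the IoU test (threshold 0.5) used to decide whether a detection counts as a true positive; boxes are assumed to be (x1, y1, x2, y2) tuples:

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2); a detection whose IoU
    with its matched GT box reaches 0.5 is counted as a true positive."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def detection_metrics(tp, fp, fn, tn):
    """ACC, precision p and recall r from the counts in Figure 9."""
    acc = (tp + tn) / (tp + fp + fn + tn)
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    return acc, p, r

def average_recall(tp_per_class, fn_per_class):
    """AR: detected true defects of all classes over all true defects."""
    return sum(tp_per_class) / (sum(tp_per_class) + sum(fn_per_class))
```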

Experimental Settings
We divide the images of the two datasets into a train set, a validation set and a test set with a defect ratio of 4:1:1. The training set is doubled by random flips. The batch size is 4 for the denim dataset and 32 for the plain fabric dataset because of their different image sizes. The optimizer is stochastic gradient descent (SGD) with momentum 0.9 and weight decay 0.0001. The defect probability threshold Δ_L is set to 0.5. The initial learning rate is 0.005, and the network is trained for 20 epochs in total, with the learning rate reduced to 0.0005 at epoch 8 and to 0.00005 at epoch 11. The training loss curves and the accuracy curve of PRAN-Net on the denim dataset are shown in Figure 10. As the number of epochs increases, the training loss curves converge. After epoch 12, the total loss steadies at 0.04, and the corresponding accuracy reaches its maximum and stabilizes.
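The optimization settings above map directly onto PyTorch's SGD optimizer and MultiStepLR schedule; the model here is only a placeholder standing in for PRAN-Net:

```python
import torch

# Sketch of the training schedule: SGD with momentum 0.9 and weight decay
# 0.0001, initial lr 0.005, dropped tenfold at epochs 8 and 11
# (0.005 -> 0.0005 -> 0.00005) over 20 epochs in total.
model = torch.nn.Linear(16, 2)  # placeholder for the actual network
optimizer = torch.optim.SGD(model.parameters(), lr=0.005,
                            momentum=0.9, weight_decay=0.0001)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[8, 11], gamma=0.1)

lrs = []
for epoch in range(20):
    # ... one training pass over the flip-augmented train set ...
    optimizer.step()  # placeholder step so the scheduler can advance
    scheduler.step()
    lrs.append(scheduler.get_last_lr()[0])
```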

Detection Results
The proposed PRAN-Net was compared with RetinaNet, Mask R-CNN and Faster R-CNN with Guided Anchoring (GA-Faster R-CNN) on the two fabric datasets; all compared algorithms used ResNet-101-FPN [25] as the backbone. The detection results were evaluated by ACC, AR, mAP, IoUs and FPS.

Detection Results of the Denim Dataset
The detection results of RetinaNet, Mask R-CNN, GA-Faster R-CNN and PRAN-Net are shown in Table 3. The inspection speed of RetinaNet is slightly faster than that of PRAN-Net, but its detection accuracy is much lower. Compared with Mask R-CNN, the ACC and mAP of PRAN-Net improve by 2.5% and 3.9%, respectively, and the inspection speed is greatly improved. Compared with GA-Faster R-CNN, PRAN-Net also improves detection and positioning accuracy, and its detection speed is faster as well.

The detection results of PRAN-Net for the denim dataset are shown in Figure 11. (Mask R-CNN was not considered because its detection speed is too slow for online inspection.) The blue rectangles are the GT boxes of the defects, the yellow rectangles are the detected locations, and the green intersection regions show the IoU of the GT boxes with the detected defects for the three methods: 55.1% for RetinaNet, 72.4% for GA-Faster R-CNN and 80.6% for PRAN-Net. These values are consistent with the tendency of the IoUs in Table 3, which are 63.8% for RetinaNet, 71.4% for GA-Faster R-CNN and 72.9% for PRAN-Net, respectively. All of this indicates the better location precision of PRAN-Net for extreme-shape defects.

Detection Results of the Plain Fabric Dataset
The test results of the four methods on the plain fabric dataset are given in Table 4. PRAN-Net also shows an obvious improvement on this dataset. Furthermore, the detection results are better than those on the denim dataset because the defects of the plain fabric dataset are easier to distinguish from the background. Notably, the FPS improves more markedly than on the denim dataset because the plain fabric images are much smaller. On a comprehensive comparison of the four methods, the detection performance of PRAN-Net on the plain fabric dataset is also the best.

Conclusions
This paper proposed PRAN-Net, a method that improves the detection accuracy of fabric defects. To address the problem that the initial anchors in the Faster R-CNN model are unsuitable for tiny and extreme-shape fabric defects, we proposed priori anchors, which adaptively generate sparse anchors. On this basis, the FPN is used to obtain multi-scale feature maps that carry more detailed defect information. Two different fabric datasets were used to verify the performance of PRAN-Net against RetinaNet, Mask R-CNN and GA-Faster R-CNN. Compared with the one-stage algorithm, the detection accuracy improves by 7.2% and 7.4% on the denim and plain fabric datasets, respectively, while the detection speed decreases by less than 0.7 f/s. Compared with the two-stage algorithms, the mAP increases by at least 2.1% and 2.4%, respectively, on the two datasets. PRAN-Net greatly increases the detection and location accuracy for tiny and extreme fabric defects, and its detection speed also improves because fewer priori anchors are used, satisfying the requirements of real-time detection. In the future, we will try PRAN-Net with other deep learning networks and apply it to the detection of multiple tiny objects in other application fields.