Figure 1.
Some lens samples of ski goggles in different sizes, curvatures, and colors.
Figure 2.
Some standard sizes of ski goggles lenses. The width ranges from 92 to 205 mm, and the height ranges from 60 to 126 mm.
Figure 3.
Some main components in the optical inspection system.
Figure 4.
The flowchart of the proposed method. There are two modules for processing images. The first is image acquisition, which captures the raw images, extracts regions of interest, and labels the data. The second is defect detection, which covers training on the labeled data and inferring defects. This module combines Faster R-CNN, MobileNetV3, and FPN into a customized end-to-end model tailored to the ski goggles lens data.
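As a rough illustration of how the combined Faster R-CNN + MobileNetV3 + FPN detector in Figure 4 can be assembled, the sketch below uses the stock torchvision constructor; the class count (six defect types plus background) follows the dataset tables, while the keyword names assume torchvision ≥ 0.13 and are not the authors' exact configuration.

```python
# Sketch: Faster R-CNN with a MobileNetV3-Large + FPN backbone via torchvision.
# num_classes = 6 defect categories + background; keyword names assume torchvision >= 0.13.
import torch
from torchvision.models.detection import fasterrcnn_mobilenet_v3_large_fpn

model = fasterrcnn_mobilenet_v3_large_fpn(
    weights=None,            # no COCO-pretrained detector weights
    weights_backbone=None,   # train the backbone from scratch as well (an assumption)
    num_classes=7,           # 6 defect types + background
)
model.eval()

# One forward pass on a dummy lens image (inference mode returns boxes, labels, scores).
dummy = [torch.rand(3, 800, 1333)]
with torch.no_grad():
    detections = model(dummy)
print(detections[0]["boxes"].shape, detections[0]["labels"].shape)
```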
Figure 5.
The ski goggles lens sample. It is wide and curved; we therefore mark its surface so that the cameras can cover it in a controlled manner.
Figure 6.
The custom light source is designed to match the curvature of the ski goggles lens as closely as possible. Five dot-matrix LED modules are connected at an angle of 125°.
Figure 7.
Design diagram of the image acquisition system. Five cameras are placed at the top. The custom light source (yellow) is placed at the bottom. The ski goggles’ lens sample (red) is positioned in the middle and held on two sides.
Figure 8.
The actual image acquisition setup. At the bottom is a custom light source that shines through the ski goggles lens surface toward the cameras. The computer controls the five cameras through the acquisition card; the developed program accesses the card’s interface to capture images from all cameras simultaneously.
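The capture program itself is not listed in the paper; a minimal sketch of simultaneous grabbing from several Basler cameras with the pypylon SDK could look as follows. The device count, timeout, and result handling are illustrative assumptions.

```python
# Hedged sketch of simultaneous capture from several Basler cameras with pypylon.
from pypylon import pylon

tl_factory = pylon.TlFactory.GetInstance()
devices = tl_factory.EnumerateDevices()          # expect five cameras on the USB 3.0 card

cameras = pylon.InstantCameraArray(len(devices))
for cam, dev in zip(cameras, devices):
    cam.Attach(tl_factory.CreateDevice(dev))

cameras.StartGrabbing(pylon.GrabStrategy_LatestImageOnly)
frames = {}
while len(frames) < len(devices) and cameras.IsGrabbing():
    result = cameras.RetrieveResult(5000, pylon.TimeoutHandling_ThrowException)
    if result.GrabSucceeded():
        frames[result.GetCameraContext()] = result.Array   # raw image as a NumPy array
    result.Release()
cameras.StopGrabbing()
```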
Figure 9.
Input and output of image acquisition. The input is ski goggles lens samples, and the output is the raw images from the five cameras.
Figure 10.
Cropping regions of interest from raw images. The top row shows the five images captured by the five cameras. To facilitate data labeling, the well-imaged parts are extracted; the bottom row shows the five resulting regions of interest.
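A minimal sketch of this cropping step, assuming a fixed region of interest per camera; the coordinates below are placeholders rather than the calibrated values used in the paper.

```python
# Sketch of cropping a fixed region of interest from each raw camera image.
import cv2

ROI = {  # camera index -> (x, y, width, height), hypothetical values
    0: (400, 300, 2400, 2200),
    1: (380, 280, 2400, 2200),
}

def crop_roi(image_path, cam_id):
    img = cv2.imread(image_path)        # raw image from one camera
    x, y, w, h = ROI[cam_id]
    return img[y:y + h, x:x + w]        # region of interest used for labeling
```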
Figure 11.
Some defect samples on the surface of ski goggles lenses. Column (a): scratch defects, (b): watermark, (c): spotlight, (d): stain, (e): dust-line, (f): dust-spot.
Figure 12.
The GUI of LabelMe: the image annotation tool used to label defects on the surface of ski goggles lenses.
Figure 13.
A synthetic image generated by the flip technique.
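A minimal sketch of the flip technique, assuming boxes are stored as pixel (x_min, y_min, x_max, y_max) coordinates; the actual augmentation pipeline may differ.

```python
# Horizontal flip of an image together with its bounding-box labels.
import numpy as np

def hflip_with_boxes(image, boxes):
    """image: HxWxC array; boxes: Nx4 array of [x_min, y_min, x_max, y_max]."""
    height, width = image.shape[:2]
    flipped = image[:, ::-1, :].copy()
    flipped_boxes = boxes.astype(float).copy()
    flipped_boxes[:, [0, 2]] = width - boxes[:, [2, 0]]   # mirror the x coordinates
    return flipped, flipped_boxes
```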
Figure 14.
An instance of the inverted-residual block in the MobileNetV3 architecture. First, a convolutional expansion layer widens the channels from 24 to 72. Second, a depth-wise convolutional layer offers better efficiency than traditional convolution; its input and output channels are both 72, and its stride of 2 halves the spatial resolution. Next, a squeeze-and-excitation module strengthens the features in the network. The final convolutional projection layer maps the features into a lower-dimensional space, from 72 to 40 channels.
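A compact PyTorch sketch of the block in Figure 14 (expand 24→72, stride-2 depth-wise convolution, squeeze-and-excitation, project 72→40). Kernel size, activations, and the squeeze ratio follow common MobileNetV3 conventions and are assumptions, not the authors' exact layer settings.

```python
# Sketch of the inverted-residual block from Figure 14.
import torch
import torch.nn as nn

class InvertedResidual(nn.Module):
    def __init__(self, c_in=24, c_exp=72, c_out=40, stride=2):
        super().__init__()
        self.expand = nn.Sequential(                     # 1x1 expansion: 24 -> 72 channels
            nn.Conv2d(c_in, c_exp, 1, bias=False), nn.BatchNorm2d(c_exp), nn.ReLU(inplace=True))
        self.depthwise = nn.Sequential(                  # 3x3 depth-wise conv, stride 2 halves resolution
            nn.Conv2d(c_exp, c_exp, 3, stride=stride, padding=1, groups=c_exp, bias=False),
            nn.BatchNorm2d(c_exp), nn.ReLU(inplace=True))
        self.se = nn.Sequential(                         # squeeze-and-excitation channel reweighting
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(c_exp, c_exp // 4, 1), nn.ReLU(inplace=True),
            nn.Conv2d(c_exp // 4, c_exp, 1), nn.Hardsigmoid())
        self.project = nn.Sequential(                    # 1x1 projection: 72 -> 40 channels
            nn.Conv2d(c_exp, c_out, 1, bias=False), nn.BatchNorm2d(c_out))

    def forward(self, x):
        x = self.depthwise(self.expand(x))
        x = x * self.se(x)
        return self.project(x)                           # no residual: stride 2 changes the shape
```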
Figure 15.
Backbone: the feature pyramid network and the MobileNetV3 backbone together. The input image is first processed by MobileNetV3, which extracts multi-scale feature maps. These multi-scale outputs are then fed into the FPN. The final result is a set of feature maps at multiple levels.
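The multi-scale output of this backbone can be inspected directly; the sketch below uses the stock torchvision backbone and an arbitrary 800 × 800 input, so the level names and shapes are indicative only.

```python
# Sketch: inspecting the multi-scale feature maps produced by the MobileNetV3 + FPN backbone.
import torch
from torchvision.models.detection import fasterrcnn_mobilenet_v3_large_fpn

model = fasterrcnn_mobilenet_v3_large_fpn(weights=None, weights_backbone=None)
model.eval()
with torch.no_grad():
    feature_maps = model.backbone(torch.rand(1, 3, 800, 800))  # ordered dict of pyramid levels

for level, fmap in feature_maps.items():
    print(level, tuple(fmap.shape))   # same channel count per level, lower resolution at deeper levels
```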
Figure 16.
Illustration of how box anchors are created in the region proposal network. The left image contains red ground-truth boxes. The RPN generates reference boxes, called “anchors”, that are matched to the ground-truth boxes. Multi-scale anchors are generated at various positions in the right image; they are the rectangular boxes drawn in white, black, yellow, green, and blue.
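A minimal sketch of multi-scale anchor generation with torchvision's AnchorGenerator; the per-level sizes mirror the anchor scales explored in Table 7, and the aspect ratios are assumed.

```python
# Sketch of multi-scale anchor generation for the RPN.
from torchvision.models.detection.anchor_utils import AnchorGenerator

anchor_generator = AnchorGenerator(
    sizes=((8,), (16,), (32,), (64,), (128,)),     # one anchor scale per pyramid level
    aspect_ratios=((0.5, 1.0, 2.0),) * 5,          # three box shapes at every position
)
# Passed as `rpn_anchor_generator` when building Faster R-CNN, this replaces the
# default anchors with ones sized for the lens defects.
```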
Figure 17.
Box regression transforms the proposal anchor set into the ground-truth label set.
Figure 18.
Illustration of the transformation from an anchor to its ground-truth box. The formula is given in Equation (12).
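Equation (12) is not reproduced in this section; the sketch below implements the standard Faster R-CNN parameterization of the anchor-to-ground-truth transform, which the paper's equation is assumed to follow (possibly with different notation).

```python
# Standard Faster R-CNN box-regression targets: offsets (tx, ty, tw, th) that map an anchor
# (xa, ya, wa, ha) onto a ground-truth box (x, y, w, h), all in center/size form.
import math

def encode_box(anchor, gt):
    xa, ya, wa, ha = anchor
    x, y, w, h = gt
    tx = (x - xa) / wa          # horizontal shift, normalized by anchor width
    ty = (y - ya) / ha          # vertical shift, normalized by anchor height
    tw = math.log(w / wa)       # log-scale width change
    th = math.log(h / ha)       # log-scale height change
    return tx, ty, tw, th

# Example: anchor centered at (50, 50), 32x32, regressed onto a box at (58, 47), 40x24.
print(encode_box((50, 50, 32, 32), (58, 47, 40, 24)))
```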
Figure 19.
Selected examples of defect detection results. Defects are marked by red rectangular boxes, and the label above each box gives the defect category. (a–c) display a variety of defect types, including dust-spot and dust-line. (d) showcases the spotlight defect type, while (e) highlights the stain defect type. (f–h) feature the scratch defect type.
Table 1.
The equipment description of the image acquisition system.
| Equipment | Producer | Specification |
| --- | --- | --- |
| Camera | Basler | Model acA4112-30uc, sensor Sony IMX352, resolution 12 MP, pixel size 3.45 × 3.45 µm, frame rate 30 fps. |
| Acquisition Card | Basler | USB 3.0 Interface Card PCIe, Fresco FL1100, 4HC, x4, 4 ports. Data transfer rates of up to 380 MB/s per port. |
| Vision Lens | Tokina | Model TC3520-12MP, image format 4/3 inch, mount C, focal length 35 mm, aperture range F2.0–22. |
| Light Source | Custom | The custom-designed light source comprises five dot-matrix LED modules connected at an angle of 125°. |
| Computer | Asus | Windows 10 Pro; hardware: mainboard Asus Z590-A, CPU Intel i7-11700K, RAM 16 GB, VGA Gigabyte RTX 3080 Ti 12 GB. |
Table 2.
The defect detection dataset of ski goggles lenses: defect type and its respective quantity.
| Type | Defects | Type | Defects | Type | Defects |
| --- | --- | --- | --- | --- | --- |
| scratch | 1972 | spotlight | 229 | dust-line | 7292 |
| watermark | 120 | stain | 281 | dust-spot | 1898 |
| Total | 11,792 | | | | |
Table 3.
The statistics of the number of defects in the synthetic dataset.
| Type | Defects | Type | Defects | Type | Defects |
| --- | --- | --- | --- | --- | --- |
| scratch | 0 | spotlight | 1093 | dust-line | 0 |
| watermark | 973 | stain | 1328 | dust-spot | 0 |
| Total | 3394 | | | | |
Table 4.
The statistics of the defect categories from the combined dataset, which merges the initial and synthetic datasets.
| Type | Defects | Type | Defects | Type | Defects |
| --- | --- | --- | --- | --- | --- |
| scratch | 1972 | spotlight | 1322 | dust-line | 7292 |
| watermark | 1093 | stain | 1609 | dust-spot | 1898 |
| Total | 15,186 | | | | |
Table 5.
The number of images containing each defect category, extracted from the JSON files that store the labels.
| Defect Type | Scratch | Watermark | Spotlight | Stain | Dust-Line | Dust-Spot |
| --- | --- | --- | --- | --- | --- | --- |
| Images | 447 | 199 | 352 | 316 | 546 | 612 |
| Instances | 1972 | 1093 | 1322 | 1609 | 7292 | 1898 |
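The counts in Table 5 can be tallied from the LabelMe annotation files; below is a minimal sketch, assuming one JSON file per image with defects listed under the standard LabelMe “shapes”/“label” keys and a placeholder directory name.

```python
# Sketch: per-category image and instance counts from LabelMe JSON files.
import json
from collections import Counter
from pathlib import Path

image_counts, instance_counts = Counter(), Counter()
for json_file in Path("labels/").glob("*.json"):        # placeholder directory
    shapes = json.loads(json_file.read_text())["shapes"]
    labels = [shape["label"] for shape in shapes]
    instance_counts.update(labels)
    image_counts.update(set(labels))                     # each image counted once per category

print(image_counts)     # images containing each defect type
print(instance_counts)  # total defect instances per type
```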
Table 6.
Comparison of defect detection across architectures trained on the ski goggles defect dataset without any hyper-parameter adaptation.
| Architecture | Backbone | AP | AP50 | AP75 | Train (s/it) | Test (s/it) |
| --- | --- | --- | --- | --- | --- | --- |
| Faster R-CNN | ResNet50 | 56.3 | 78.5 | 63.3 | 0.528 | 0.126 |
| Faster R-CNN | Mobile-large | 41.3 | 72.8 | 38.1 | 0.127 | 0.059 |
| Faster R-CNN | Mobile-small | 10.0 | 25.1 | 8.4 | 0.086 | 0.045 |
| FCOS | ResNet50 | 59.6 | 78.6 | 64.0 | 0.352 | 0.126 |
| RetinaNet | ResNet50 | 10.2 | 25.2 | 5.9 | 0.331 | 0.140 |
Table 7.
Defect detection results from fine-tuning the output channels of the FPN and the anchor scales of the RPN.
| FPN Out Channels | RPN Anchor Scales | mAP | APS | Train (s/it) | Test (s/it) |
| --- | --- | --- | --- | --- | --- |
| 256 | {8², 16², 32², 64², 128²} | 55.3 | 46.4 | 0.4864 | 0.1161 |
| 256 | {16², 32², 64², 128², 256²} | 49.2 | 42.8 | 0.4867 | 0.1179 |
| 128 | {8², 16², 32², 64², 128²} | 53.6 | 45.4 | 0.3040 | 0.1133 |
| 128 | {16², 32², 64², 128², 256²} | 55.0 | 47.0 | 0.3080 | 0.1074 |
| 96 | {8², 16², 32², 64², 128²} | 46.7 | 39.6 | 0.2857 | 0.1094 |
| 96 | {16², 32², 64², 128², 256²} | 51.8 | 42.7 | 0.2860 | 0.1046 |
| 64 | {8², 16², 32², 64², 128²} | 47.6 | 38.3 | 0.2517 | 0.0993 |
| 64 | {16², 32², 64², 128², 256²} | 51.4 | 46.0 | 0.2520 | 0.0968 |
Table 8.
The comparative assessment of the initial dataset and the combined dataset using COCO metrics.
Model: Faster R-CNN with the MobileNetV3 backbone; FPN output channels = 64; RPN anchor scales {16², 32², 64², 128², 256²}.

| COCO Metric | Initial DS (InDS) | Combined DS (CoDS) |
| --- | --- | --- |
| AP | 51.4 | 55.1 |
| AP50 | 71.5 | 75.8 |
| AP75 | 40.7 | 47.4 |
| APS | 46.0 | 48.2 |
| APm | 47.3 | 50.3 |
| APl | 59.1 | 63.4 |
| AR1 | 31.3 | 32.6 |
| AR10 | 50.9 | 53.8 |
| AR100 | 55.6 | 59.4 |
| ARS | 45.2 | 49.7 |
| ARm | 61.6 | 54.6 |
| ARl | 60.4 | 64.9 |