A Common Knowledge-Driven Generic Vision Inspection Framework for Adaptation to Multiple Scenarios, Tasks, and Objects
Abstract
1. Introduction
- We propose a generic inspection framework that includes pattern connection, progressive alignment, and adaptive detection to adapt to composite vision tasks and achieve universality and traceability;
- We introduce embedding strategies that encompass knowledge of industry common sense, field-task knowledge, and expert experience to integrate learning models and knowledge models, ultimately enhancing adaptability and accuracy;
- This study allows us to construct an inspection pipeline, accumulate data–knowledge–experience for different scenarios, tasks, and objects under weak annotation, and transfer the result to other applications;
- Experimental results demonstrate that compared to end-to-end learning strategies, an inspection pipeline continuously optimized through fault tracking and knowledge improvements has higher performance potential and controllability.
2. Literature Review
2.1. Generic Vision Inspection Framework
2.2. Appearance Quality Assurance of Complex Product
2.3. Knowledge Application for Vision-Based Methods
2.4. Research Gaps and Motivation
3. Proposed Method
3.1. Virtual–Real Connection Based on Industry Common Sense
3.1.1. Significant Elements and Feature Space
3.1.2. Constraints Method and Embedding Strategy
3.2. Task-Driven Progressive Perception of Multi-Object
3.2.1. Structured Representation of Task
3.2.2. Multi-Granularity Pattern Alignment
3.3. Knowledge-Driven Adaptive Inspection
3.3.1. Method Pool
3.3.2. Inspection under Different Situations
3.3.3. Knowledge Improvement Strategies
- Complete observation. The current observation is well represented by the configured sub-pattern (or combination of sub-patterns), and adding further data has little impact. For example, the presence or absence of fasteners can be determined using typical templates or area-based conditions. Because of their good visibility and few unpredictable changes, some objects can be modeled with a CNN and existing datasets to improve their (semantic) sub-pattern;
- Incomplete observation. Incomplete observation is a challenge that less constrained vision-based applications will inevitably face. Any change in Man–Machine–Material–Method–Environment (4M1E), such as an unseen viewpoint, may cause deviations in perception that the pattern space constructed from existing knowledge cannot cover. If the difference from the internal sub-pattern space is too large, manual intervention is necessary. For the semantic space, difficult case mining and augmentation [57,59] have proven effective. For other spaces, exploring new sub-patterns or reconstructing existing patterns can be considered;
- Intermediate state. More objects fall in the intermediate state between complete and incomplete observation. For these data, a combined disturbance strategy oriented towards perception and pattern space is adopted to simulate potential variation based on existing knowledge:
  - (a) Perception. Perform pose perturbation around the viewpoint of the existing sample and simulate possible geometric matching differences using virtual rendering. For example, in small-part recognition, geometric dictionaries under different states can be constructed by perturbing the local virtual viewpoint, which improves perception. For foreign object debris, normal patches can be collected around frequently occurring areas of the data after viewpoint perturbation, to improve observation of the reference pattern and enhance metric ability;
  - (b) Pattern. For high-dimensional semantic features, data augmentation strategies and feature-layer Gaussian noise are recommended, while for low-dimensional manual features and modeling functions, internal parameter perturbations are recommended. For example, in fine-defect detection on small objects, recognition results obtained through manual rules or empirical functions based on multiple factors, such as shape, texture, and geometric attributes, may fail because abnormal changes in these factors were not sufficiently considered. Therefore, establishing disturbance and simulation (e.g., shape deformation, feature space noise) based on intermediate results at each stage can enhance the adaptability of the method to new patterns.

  Then, re-train or refit (cluster) the data after perturbation and embed the discovered inherent invariance as constraints into the learning process.
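The three kinds of disturbance named for the intermediate state (viewpoint perturbation, feature-layer Gaussian noise, and internal parameter perturbation) can be illustrated with a minimal sketch. The function names, default ranges, and the `canny_low` parameter are hypothetical and chosen only for illustration; the paper does not specify magnitudes or an implementation, and virtual rendering of the perturbed poses is not shown.

```python
import math
import random


def rotation_about_axis(axis, angle):
    """3x3 rotation matrix (nested lists) for `angle` radians about `axis` (Rodrigues' formula)."""
    x, y, z = axis
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm
    c, s, t = math.cos(angle), math.sin(angle), 1.0 - math.cos(angle)
    return [
        [c + x * x * t, x * y * t - z * s, x * z * t + y * s],
        [y * x * t + z * s, c + y * y * t, y * z * t - x * s],
        [z * x * t - y * s, z * y * t + x * s, c + z * z * t],
    ]


def matmul3(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def perturb_pose(rotation, translation, max_angle_deg=5.0, max_shift=10.0, rng=None):
    """(a) Perception: jitter a viewpoint by a small random rotation and shift (assumed ranges)."""
    rng = rng or random
    axis = [rng.gauss(0.0, 1.0) for _ in range(3)]  # random rotation axis
    angle = math.radians(rng.uniform(-max_angle_deg, max_angle_deg))
    delta = rotation_about_axis(axis, angle)
    shifted = [t + rng.uniform(-max_shift, max_shift) for t in translation]
    return matmul3(delta, rotation), shifted


def add_feature_noise(features, sigma=0.05, rng=None):
    """(b) Pattern, semantic features: add zero-mean Gaussian noise to a feature vector."""
    rng = rng or random
    return [f + rng.gauss(0.0, sigma) for f in features]


def perturb_parameters(params, rel_scale=0.1, rng=None):
    """(b) Pattern, manual features: scale each internal parameter by a factor in [1 - r, 1 + r]."""
    rng = rng or random
    return {k: v * rng.uniform(1.0 - rel_scale, 1.0 + rel_scale) for k, v in params.items()}


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print(perturb_pose(identity, [0.0, 0.0, 500.0], rng=rng))
    print(add_feature_noise([0.2, 0.8, 0.5], rng=rng))
    print(perturb_parameters({"canny_low": 50.0}, rng=rng))
```

The perturbed samples would then feed the re-training or refitting step described above; how the discovered invariance is embedded as a constraint is left to the specific method.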
4. Case Study
4.1. Preliminary Work
4.1.1. Scenarios and Datasets
4.1.2. Criterion and Method Pool
4.2. Evaluation on Complex Assembled Product
4.2.1. Deployment Work
4.2.2. Validation under Possible Non-Ideal Conditions in Actual Production
4.2.3. Validation on Mobile Robot Capture System
4.3. Validation on AR Projection under Different Object
4.4. Validation on Different Scenario and Task
4.5. Knowledge Integration and Reapplication
- Design the Criterion (or reuse an existing one based on experience, or have one recommended automatically from the established pattern space, e.g., a knowledge graph);
- Build pattern connection and alignment;
- Conduct internal debugging of Criterion + Parameter;
- Implement joint testing of Connection + Alignment;
- Determine if repetition is necessary.
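The five steps above form a loop that repeats until the joint test passes. A minimal sketch follows, assuming each step is supplied as a callable; `reapply_pipeline`, the toy stand-in functions, and the integer threshold and score are all hypothetical and serve only to show the control flow, not the paper's actual criteria.

```python
def reapply_pipeline(design, build, debug, joint_test, max_rounds=5):
    """Run the reapplication steps until the joint test passes or rounds run out."""
    history = []
    criterion = design(None)                      # step 1: design, reuse, or recommend
    for round_idx in range(1, max_rounds + 1):
        pipeline = build(criterion)               # step 2: pattern connection and alignment
        criterion = debug(criterion, pipeline)    # step 3: internal Criterion + Parameter debugging
        passed = joint_test(criterion, pipeline)  # step 4: joint Connection + Alignment test
        history.append((round_idx, passed))
        if passed:                                # step 5: repeat only if the test failed
            break
        criterion = design(criterion)             # redesign, starting from the previous Criterion
    return criterion, history


# Toy stand-ins (hypothetical): the Criterion is relaxed by 10 points per redesign.
def design(previous):
    if previous is None:
        return {"threshold": 90}
    return {"threshold": previous["threshold"] - 10}


def build(criterion):
    return {"match_score": 75}


def debug(criterion, pipeline):
    return dict(criterion)


def joint_test(criterion, pipeline):
    return pipeline["match_score"] >= criterion["threshold"]


if __name__ == "__main__":
    final, history = reapply_pipeline(design, build, debug, joint_test)
    print(final, history)
```

Passing the steps as callables keeps the loop independent of any particular scenario, which matches the goal of reusing the same procedure across tasks and objects.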
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Pang, G.; Shen, C.; Cao, L.; Hengel, A.V.D. Deep learning for anomaly detection: A review. ACM Comput. Surv. 2021, 54, 38.
- Wang, J.; Ma, Y.; Zhang, L.; Gao, R.X.; Wu, D. Deep learning for smart manufacturing: Methods and applications. J. Manuf. Syst. 2018, 48, 144–156.
- Kong, F.; Zhao, D.; Du, F. A doubt–confirmation-based visual detection method for foreign object debris aided by assembly models. Trans. Can. Soc. Mech. Eng. 2023, 47, 508–520.
- Realyvásquez-Vargas, A.; Arredondo-Soto, K.C.; García-Alcaraz, J.L.; Márquez-Lobato, B.Y.; Cruz-García, J. Introduction and configuration of a collaborative robot in an assembly task as a means to decrease occupational risks and increase efficiency in a manufacturing company. Robot. Comput. Manuf. 2018, 57, 315–328.
- Guo, S.; Diao, Q.; Xi, F. Vision based navigation for omni-directional mobile industrial robot. Procedia Comput. Sci. 2017, 105, 20–26.
- Rentzos, L.; Papanastasiou, S.; Papakostas, N.; Chryssolouris, G. Augmented reality for human-based assembly: Using product and process semantics. IFAC Proc. 2013, 46, 98–101.
- Hořejší, P.; Novikov, K.; Šimon, M. A smart factory in a Smart City: Virtual and augmented reality in a Smart assembly line. IEEE Access 2020, 8, 94330–94340.
- Yang, S.; Wang, W.; Liu, C.; Deng, W. Scene understanding in deep learning-based end-to-end controllers for autonomous vehicles. IEEE Trans. Syst. Man Cybern. Syst. 2018, 49, 53–63.
- Zhang, C.; Zhou, G.; Hu, J.; Li, J. Deep learning-enabled intelligent process planning for digital twin manufacturing cell. Knowl.-Based Syst. 2020, 191, 105247.
- Sharfuddin, A.K.; Iram, N.; Simonov, K.-S.; Himanshu, G.; Ashraf, R.I. A knowledge-based experts’ system for evaluation of digital supply chain readiness. Knowl.-Based Syst. 2021, 228, 107262.
- Wang, M.; Deng, W. Deep visual domain adaptation: A survey. Neurocomputing 2018, 312, 135–153.
- Wuest, T.; Weimer, D.; Irgens, C.; Thoben, K.D. Machine learning in manufacturing: Advantages, challenges, and applications. Prod. Manuf. Res. 2016, 4, 23–45.
- Zheng, P.; Wang, H.; Sang, Z.; Zhong, R.Y.; Liu, Y.; Liu, C.; Khamdi, M.; Yu, S.; Xu, X. Smart manufacturing systems for Industry 4.0: Conceptual framework, scenarios, and future perspectives. Front. Mech. Eng. 2018, 13, 137–150.
- Kamble, S.S.; Gunasekaran, A.; Gawankar, S.A. Sustainable Industry 4.0 framework: A systematic literature review identifying the current trends and future perspectives. Process Saf. Environ. Prot. 2018, 117, 408–425.
- Insa-Iglesias, M.; Jenkins, M.D.; Morison, G. 3D visual inspection system framework for structural condition monitoring and analysis. Autom. Constr. 2021, 128, 103755.
- Xu, Z.; Chen, B.; Zhan, X.; Xiu, Y.; Suzuki, C.; Shimada, K. A vision-based autonomous UAV inspection framework for unknown tunnel construction sites with dynamic obstacles. arXiv 2023.
- Liu, T.; Li, B.; Du, X.; Jiang, B.; Jin, X.; Jin, L.; Zhao, Z. Component-aware anomaly detection framework for adjustable and logical industrial visual inspection. arXiv 2023.
- Yang, X.; Cai, J.; Li, K.; Fan, X.; Cao, H. A monocular-based tracking framework for industrial augmented reality applications. Int. J. Adv. Manuf. Technol. 2023, 128, 2571–2588.
- Zhu, Q.; Zhang, Y.; Luan, J.; Hu, L. A Machine Vision Development Framework for Product Appearance Quality Inspection. Appl. Sci. 2022, 12, 11565.
- Singh, S.A.; Desai, K.A. Automated surface defect detection framework using machine vision and convolutional neural networks. J. Intell. Manuf. 2023, 34, 1995–2011.
- Hridoy, M.W.; Rahman, M.M.; Sakib, S. A Framework for Industrial Inspection System using Deep Learning. Ann. Data Sci. 2022, 11, 445–478.
- Zhao, D.; Xue, D.; Wang, X.; Du, F. Adaptive vision inspection for multi-type electronic products based on prior knowledge. J. Ind. Inf. Integr. 2022, 27, 100283.
- Xiao, M.; Yang, B.; Wang, S.; Mo, F.; He, Y.; Gao, Y. GRA-Net: Global receptive attention network for surface defect detection. Knowl.-Based Syst. 2023, 280, 111066.
- Xiang, Y.; Schmidt, T.; Narayanan, V.; Fox, D. PoseCNN: A convolutional neural network for 6D object pose estimation in cluttered scenes. arXiv 2017.
- Hu, Y.; Hugonot, J.; Fua, P.; Salzmann, M. Segmentation-driven 6D object pose estimation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 3385–3394.
- Li, Y.; Wang, G.; Ji, X.; Xiang, Y.; Fox, D. DeepIM: Deep iterative matching for 6D pose estimation. In Computer Vision–ECCV 2018, Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; Springer: Cham, Switzerland, 2018; pp. 695–711.
- Hu, Y.; Fua, P.; Wang, W.; Salzmann, M. Single-stage 6D object pose estimation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 2930–2939.
- Labbé, Y.; Carpentier, J.; Aubry, M.; Sivic, J. Cosypose: Consistent multi-view multi-object 6D pose estimation. In Computer Vision–ECCV 2020, Proceedings of the 16th European Conference, Glasgow, UK, 23–28 August 2020; Proceedings, Part XVII; Springer: Cham, Switzerland, 2020; pp. 574–591.
- Kendall, A.; Cipolla, R. Geometric loss functions for camera pose regression with deep learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 5974–5983.
- Tekin, B.; Sinha, S.N.; Fua, P. Real-Time Seamless Single Shot 6D Object Pose Prediction. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 292–301.
- Peng, S.; Liu, Y.; Huang, Q.; Zhou, X.; Bao, H. PVNet: Pixel-wise voting network for 6DoF pose estimation. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 44, 3212–3223.
- Park, K.; Patten, T.; Vincze, M. Pix2Pose: Pixel-wise coordinate regression of objects for 6D pose estimation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 7668–7677.
- Li, Z.; Wang, G.; Ji, X. CDPN: Coordinates-based disentangled pose network for real-time RGB-based 6-dof object pose estimation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 7678–7687.
- Song, C.; Song, J.; Huang, Q. HybridPose: 6D object pose estimation under hybrid representations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 431–440.
- Mariotti, O.; Bilen, H. Semi-supervised Viewpoint Estimation with Geometry-Aware Conditional Generation. In Computer Vision–ECCV 2020 Workshops, Proceedings of the ECCV 2020, Glasgow, UK, 23–28 August 2020; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2020; Volume 12536.
- Zhou, G.; Wang, D.; Yan, Y.; Chen, H.; Chen, Q. Semi-Supervised 6D Object Pose Estimation Without Using Real Annotations. IEEE Trans. Circuits Syst. Video Technol. 2022, 32, 5163–5174.
- Wang, G.; Manhardt, F.; Shao, J.; Ji, X.; Navab, N.; Tombari, F. Self6D: Self-supervised monocular 6D object pose estimation. In Computer Vision–ECCV 2020, Proceedings of the 16th European Conference, Glasgow, UK, 23–28 August 2020; Proceedings, Part I; Springer: Cham, Switzerland, 2020; pp. 108–125.
- Langerman, J.; Qiu, Z.; Sörös, G.; Sebők, D.; Wang, Y.; Huang, H. Domain Adaptation of Networks for Camera Pose Estimation: Learning Camera Pose Estimation without Pose Labels. arXiv 2021.
- Ito, S.; Aizawa, H.; Kato, K. Few-Shot NeRF-Based View Synthesis for Viewpoint-Biased Camera Pose Estimation. In Artificial Neural Networks and Machine Learning–ICANN 2023, Proceedings of the ICANN 2023, Heraklion, Crete, Greece, 26–29 September 2023; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2023; Volume 14255.
- Shu, Q.; Luan, Z.; Poslad, S.; Bourguet, M.L.; Xu, M. MCAPR: Multi-modality Cross Attention for Camera Absolute Pose Regression. In Artificial Neural Networks and Machine Learning–ICANN 2023, Proceedings of the ICANN 2023, Heraklion, Crete, Greece, 26–29 September 2023; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2023; Volume 14255.
- Lee, T.; Lee, B.U.; Shin, I.; Choe, J.; Shin, U.; Kweon, I.S.; Yoon, K.J. UDA-COPE: Unsupervised domain adaptation for category-level object pose estimation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 14891–14900.
- Zhang, D.; Barbot, A.; Seichepine, F.; Lo, F.P.-W.; Bai, W.; Yang, G.-Z.; Lo, B. Micro-object pose estimation with sim-to-real transfer learning using small dataset. Commun. Phys. 2022, 5, 80.
- Kendall, A.; Grimes, M.; Cipolla, R. PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 2938–2946.
- Manon, P.J.; Arnaud, P.; Dominique, N.; Jean-Luc, M.; Jean-Philippe, P. Survey on the View Planning Problem for Reverse Engineering and Automated Control Applications. Comput.-Aided Des. 2021, 141, 103094.
- Mehdi, M.; MohammadReza, H.; Soohwan, S.; Shirin, M.; Mohammad, S. A Review on Viewpoints and Path Planning for UAV-Based 3-D Reconstruction. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 16, 5026–5048.
- Youkachen, S.; Ruchanurucks, M.; Phatrapomnant, T.; Kaneko, H. Defect Segmentation of Hot-rolled Steel Strip Surface by using Convolutional Auto-Encoder and Conventional Image processing. In Proceedings of the 2019 10th International Conference of Information and Communication Technology for Embedded Systems (IC-ICTES), Bangkok, Thailand, 25–27 March 2019; pp. 1–5.
- Wang, K. Contrastive learning-based semantic segmentation for In-situ stratified defect detection in additive manufacturing. J. Manuf. Syst. 2023, 68, 465–476.
- Hu, X.; Yang, J.; Jiang, F.; Amir, H.; Kia, D.; Mandar, G. Steel surface defect detection based on self-supervised contrastive representation learning with matching metric. Appl. Soft Comput. 2023, 145, 110578.
- Kim, J.; Oh, T.H.; Lee, S.; Pan, F.; Kweon, I.S. Variational prototyping-encoder: One-shot learning with prototypical images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 9462–9470.
- Zhou, Y.; Zhang, Y. SiamET: A Siamese based visual tracking network with enhanced templates. Appl. Intell. 2020, 52, 9782–9794.
- Xia, X.; Pan, X.; Li, N.; He, X.; Ma, L.; Zhang, X.; Ding, N. GAN-based anomaly detection: A review. Neurocomputing 2022, 493, 497–535.
- Wang, C.-Y.; Bochkovskiy, A.; Liao, H.-Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 7464–7475.
- Kirillov, A.; Mintun, E.; Ravi, N.; Mao, H.; Rolland, C.; Gustafson, L.; Xiao, T.; Whitehead, S.; Berg, A.C.; Lo, W.-Y.; et al. Segment anything. arXiv 2023.
- Gou, J.; Yu, B.; Maybank, S.J.; Tao, D. Knowledge Distillation: A Survey. Int. J. Comput. Vis. 2021, 129, 1789–1819.
- Ben Abdallah, H.; Jovančević, I.; Orteu, J.J.; Brèthes, L. Automatic inspection of aeronautical mechanical assemblies by matching the 3D CAD model and real 2D images. J. Imaging 2019, 5, 81.
- Li, D.-C.; Lin, L.-S.; Chen, C.-C.; Yu, W.-H. Using virtual samples to improve learning performance for small datasets with multimodal distributions. Soft Comput. 2019, 23, 11883–11900.
- Zhang, H.; Cisse, M.; Dauphin, Y.N.; Lopez-Paz, D. Mixup: Beyond empirical risk minimization. arXiv 2017.
- Siu, C.; Wang, M.; Cheng, J.C. A framework for synthetic image generation and augmentation for improving automatic sewer pipe defect detection. Autom. Constr. 2022, 13, 104213.
- Wang, X.; Shrivastava, A.; Gupta, A. A-Fast-RCNN: Hard positive generation via adversary for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2606–2615.
- Zhou, Q.; Chen, R.; Huang, B.; Xu, W.; Yu, J. DeepInspection: Deep learning based hierarchical network for specular surface inspection. Measurement 2020, 160, 107834.
- Wang, C.; Ge, S.; Jiang, Z.; Hao, H.; Gu, Q. SiamFuseNet: A pseudo-siamese network for detritus detection from polarized microscopic images of river sands. Comput. Geosci. 2021, 156, 104912.
- Chen, B.; Parra, A.; Cao, J.; Li, N.; Chin, T.-J. End-to-End Learnable Geometric Vision by Backpropagating PnP Optimization. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 8097–8106.
- Xu, C.; Wang, J.; Tao, J.; Zhang, J.; Zheng, P. A knowledge augmented deep learning method for vision-based yarn contour detection. J. Manuf. Syst. 2022, 63, 317–328.
- Xu, Y.; Qiao, W.; Zhao, J.; Zhang, Q.; Li, H. Vision-based multi-level synthetical evaluation of seismic damage for RC structural components: A multi-task learning approach. Earthq. Eng. Eng. Vib. 2023, 22, 69–85.
- Dong, X.; Taylor, C.J.; Cootes, T.F. Defect Classification and Detection Using a Multitask Deep One-Class CNN. IEEE Trans. Autom. Sci. Eng. 2022, 19, 1719–1730.
- Wu, H.; Li, B.; Tian, L.; Feng, J.; Dong, C. An adaptive loss weighting multi-task network with attention-guide proposal generation for small size defect inspection. Vis. Comput. 2023, 40, 681–698.
- Wright, L.G.; Onodera, T.; Stein, M.M.; Wang, T.; Schachter, D.T.; Hu, Z.; McMahon, P.L. Deep physical neural networks trained with backpropagation. Nature 2022, 601, 549–555.
- Bazighifan, O.; Cesarano, C. A Philos-Type Oscillation Criteria for Fourth-Order Neutral Differential Equations. Symmetry 2020, 12, 379.
- Chang, A.; Zhang, Y.; Zhang, S.; Zhong, L.; Zhang, L. Detecting prohibited objects with physical size constraint from cluttered X-ray baggage images. Knowl.-Based Syst. 2022, 237, 107916.
- Wang, X.; Peng, Z.; Kong, D.; Zhang, P.; He, Y. Infrared dim target detection based on total variation regularization and principal component pursuit. Image Vis. Comput. 2017, 63, 1–9.
- Zhang, W.; Şerban, O.; Sun, J.; Guo, Y. Conflict-aware multilingual knowledge graph completion. Knowl.-Based Syst. 2023, 281, 111070.
- Ge, Y.; Ma, J.; Zhang, L.; Li, X.; Lu, H. Trustworthiness-aware knowledge graph representation for recommendation. Knowl.-Based Syst. 2023, 278, 110865.
- Li, X.; Liu, G.; Sun, S.; Bai, C. Contour detection and salient feature line regularization for printed circuit board in point clouds based on geometric primitives. Measurement 2021, 185, 109978.
- Zhang, Q.; Liu, J.; Zheng, S.; Yu, C. A novel accurate positioning method of reference hole for complex surface in aircraft assembly. Int. J. Adv. Manuf. Technol. 2021, 119, 571–586.
- Koch, C.; Neges, M.; König, M.; Abramovici, M. Natural markers for augmented reality-based indoor navigation and facility maintenance. Autom. Constr. 2014, 48, 18–30.
- Vázquez Nava, A. Vision System for Quality Inspection of Automotive Parts Based on Non-Defective Samples. Master’s Thesis, Instituto Tecnológico y de Estudios Superiores de Monterrey, Monterrey, Mexico, 2021. Available online: https://hdl.handle.net/11285/648442 (accessed on 11 June 2021).
- Yuan, G.; Fu, Q.; Mi, Z.; Luo, Y.; Tao, W. SSRNet: Scalable 3D Surface Reconstruction Network. IEEE Trans. Vis. Comput. Graph. 2022, 29, 4906–4919.
- Xing, C.; Rostamzadeh, N.; Oreshkin, B.; Pinheiro, P.O. Adaptive cross-modal few-shot learning. In Proceedings of the 33rd International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; Article number 436; pp. 4847–4857. Available online: https://dl.acm.org/doi/10.5555/3454287.3454723 (accessed on 11 June 2021).
- Du, F.; Kong, F.; Zhao, D. A Knowledge Transfer Method for Unsupervised Pose Keypoint Detection Based on Domain Adaptation and CAD Models. Adv. Intell. Syst. 2023, 5, 2200214.
- Zhao, D.; Kong, F.; Du, F. Vision-based adaptive stereo measurement of pins on multi-type electrical connectors. Meas. Sci. Technol. 2019, 30, 105002.
- Bergström, P.; Edlund, O. Robust registration of point sets using iteratively reweighted least squares. Comput. Optim. Appl. 2014, 58, 543–561.
- Yang, S. A high-precision linear method for camera pose determination. In Proceedings of the 2010 IEEE International Conference on Mechatronics and Automation, Xi’an, China, 4–7 August 2010; pp. 595–600.
- Leon, K.; Mery, D.; Pedreschi, F.; Leon, J. Color measurement in L* a* b* units from RGB digital images. Food Res. Int. 2006, 39, 1084–1091.
- Zhao, H.; Gallo, O.; Frosio, I.; Kautz, J. Loss functions for image restoration with neural networks. IEEE Trans. Comput. Imaging 2016, 3, 47–57.
- Deng, C.; Wang, B.; Lin, W.; Huang, G.; Zhao, B. Effective visual tracking by pairwise metric learning. Neurocomputing 2017, 261, 266–275.
- Li, P.; Chen, B.; Wang, D.; Lu, H. Visual tracking by dynamic matching-classification network switching. Pattern Recognit. 2020, 107, 107419.
- Tsin, Y.; Kanade, T. A correlation-based approach to robust point set registration. In Computer Vision–ECCV 2004, Proceedings of the 8th European Conference on Computer Vision, Prague, Czech Republic, 11–14 May 2004; Proceedings, Part III; Springer: Berlin/Heidelberg, Germany, 2004; pp. 558–569.
- Myronenko, A.; Song, X. Point Set Registration: Coherent Point Drift. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 32, 2262–2275.
- Shi, X.; Zhang, S.; Cheng, M.; He, L.; Tang, X.; Cui, Z. Few-shot semantic segmentation for industrial defect recognition. Comput. Ind. 2023, 148, 103901.
- Danzer, A.; Griebel, T.; Bach, M.; Dietmayer, K. 2D car detection in radar data with pointnets. In Proceedings of the 2019 IEEE Intelligent Transportation Systems Conference (ITSC), Auckland, New Zealand, 27–30 October 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 61–66.
- Zhao, Z.; Li, B.; Dong, R.; Zhao, P. A Surface Defect Detection Method Based on Positive Samples. In PRICAI 2018: Trends in Artificial Intelligence, Proceedings of the PRICAI 2018, Nanjing, China, 28–31 August 2018; Geng, X., Kang, B.H., Eds.; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2018; Volume 11013.
- Fang, Y.; Zeng, T. Learning deep edge prior for image denoising. Comput. Vis. Image Underst. 2020, 200, 103044.
- Park, S.; Bang, S.; Kim, H.; Kim, H. Patch-Based Crack Detection in Black Box Images Using Convolutional Neural Networks. J. Comput. Civ. Eng. 2019, 33, 04019017.
- Tsai, D.M.; Wu, S.C.; Li, W.C. Defect detection of solar cells in electroluminescence images using Fourier image reconstruction. Sol. Energy Mater. Sol. Cells 2012, 99, 250–262.
- Duan, J.; Liu, X.; Wu, X.; Mao, C. Detection and segmentation of iron ore green pellets in images using lightweight U-net deep learning network. Neural Comput. Appl. 2020, 32, 5775–5790.
- Wang, J.; Bai, X.; You, X.; Liu, W.; Latecki, L.J. Shape Matching and Classification Using Height Functions. Pattern Recognit. Lett. 2012, 33, 134–143.
- Achanta, R.; Shaji, A.; Smith, K.; Lucchi, A.; Fua, P.; Süsstrunk, S. SLIC Superpixels Compared to State-of-the-Art Superpixel Methods. IEEE Trans. Pattern Anal. Mach. Intell. 2012, 34, 2274–2282.
- Zhao, D.; Du, F. A novel approach for scale and rotation adaptive estimation based on time series alignment. Vis. Comput. 2020, 36, 175–189.
- Bay, H.; Ess, A.; Tuytelaars, T.; Van Gool, L. Speeded-Up Robust Features (SURF). Comput. Vis. Image Underst. 2008, 110, 346–359.
- Ravi, N.; Reizenstein, J.; Novotny, D.; Gordon, T.; Lo, W.-Y.; Johnson, J.; Gkioxari, G. Accelerating 3D Deep Learning with PyTorch3D. arXiv 2020.

| ID | EB2 | EA6 | ED49 | ED50 | ED13 | ED14 | EC11 | ED43 | ED44 | EB5 | ED25 | ED26 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 58 | 97 | 73 | 45 | 41 | 35 | 88 | 43 | 69 | 77 | 52 | 53 |
| | 41 | 72 | 36 | 53 | 31 | 30 | 61 | 35 | 48 | 61 | 41 | 39 |

| Item | Configurations |
|---|---|
| Significant elements | ES0, ES1 |
| Feature design | Projected cuboid described by 8 + 1 2D points |
| Constraints | Auxiliary recognition task; approximate parallel constraint |
| Post-processing | Weight adjustment [82] |

| ID | Method |
|---|---|
| A1 | CPD [88] |
| A2 | Edge-based [96] |
| A3 | Weighted-ICP [81] |
| A4 | Shape fitting algorithm set |
| B1 | Color segmentation, K-means |
| B2 | SLIC [97] |
| B4 | HOG, Canny, Sobel |
| C3 | [98] with multiple templates |
| C4 | SURF [99] with parallel constraints |

| Stage | Description | Datasets | Criterion | Indicators |
|---|---|---|---|---|
| A | Connection and alignment | PSC | 10-(a) | , , , , |
| B | Inspection initialization; knowledge improvement; connection and alignment | PSC | 10-(b) | , , , , and , |
| C | Connection, alignment and inspection | TO | 10-(b) | , ,, TO, PT |
| D | Knowledge improvement; connection, alignment and inspection | PSC, TO | 10-(c) | all ALL |

| | Connection | Alignment | Group I | Group II | Group III | Group IV | Group V |
|---|---|---|---|---|---|---|---|
| () | 2.678 | 0.817 | 3.357 | 1.594 | 1.613 | 2.181 | 3.895 |
| (mm) | 142.196 | 35.304 | 163.623 | 64.725 | 62.014 | 97.864 | 165.074 |
| | 0.553 | 0.805 | 0.515 | 0.700 | 0.688 | 0.612 | 0.394 |
| | 0.374 | 0.643 | 0.336 | 0.469 | 0.460 | 0.405 | 0.177 |
| () | 2.771 | 1.342 | 3.429 | 1.767 | 1.838 | 2.475 | 4.023 |
| (mm) | 153.584 | 59.641 | 169.240 | 69.802 | 74.125 | 114.523 | 181.451 |
| | 0.547 | 0.721 | 0.497 | 0.673 | 0.662 | 0.577 | 0.367 |
| | 0.369 | 0.489 | 0.314 | 0.432 | 0.419 | 0.398 | 0.113 |

| Before | After Improvement | |||||||
|---|---|---|---|---|---|---|---|---|
| PSC | ||||||||
| 0.778 | 0.517 | 27.304 | 0.882 | 0.730 | 0.894 | |||
| PT | ||||||||
| 0.712 | 0.571 | 1.034 | 52.308 | 0.755 | 0.492 | 0.845 | 0.597 | |

| EA/B | ED | EB | ED | EA/B | ED | EB | ED | |||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Proposed (Green) | 0.856 | 0.671 | 0.736 | 0.585 | 0.104 | 0.244 | 0.845 | 0.425 | 0.642 | 0.538 | 0.049 | 0.270 |
| DL1 (Red) | 0.797 | 0.665 | 0.704 | 0.724 | 0.247 | 0.387 | 0.766 | 0.681 | 0.692 | 0.670 | 0.213 | 0.325 |
| DL2 (Blue) | 0.812 | 0.697 | 0.821 | 0.805 | 0.260 | 0.401 | 0.773 | 0.664 | 0.808 | 0.714 | 0.213 | 0.399 |
| Yolo7 (Cyan) | 0.914 | 0.742 | 0.805 | 0.862 | 0.026 | 0.069 | 0.901 | 0.735 | 0.783 | 0.835 | 0.016 | 0.074 |
| Proposed+ | 0.981 | 0.984 | 0.967 | 0.908 | 0.932 | 0.936 | 0.956 | 0.884 |
| DL2+ | 0.917 | 0.930 | 0.927 | 0.786 | 0.825 | 0.910 | 0.945 | 0.741 |
| Yolo7+ | 0.943 | 0.962 | 0.959 | 0.721 | 0.938 | 0.944 | 0.967 | 0.683 |

| Stage | Criterion Scheme | Object ID | Expert Experience |
|---|---|---|---|
| A | At first, there were only a few samples, and the framework built through experience and expertise served as the agent for data acquisition. | | |
| | (a)-Alignment | ES0/ES1/EBx/EAx | Extracting domain-independent geometric information in the case of a small number of samples and no annotation. |
| | | EA3/EA18/EA16 | As the significant feature, color-region can reduce the amount of information received by A2. |
| | | ECx | The small silver-gray plug has the characteristics of low information volume and consistent texture. Its simple shape allows us to pre-build a shape library. |
| | | EDx | The materials of standard parts are uniform. The small size of the fastener makes its features clustered and its pattern uniform. |
| B, C | After knowledge improvement, the state recognition pipeline is built according to the collected samples, and the general task model and various pattern libraries are enabled. | | |
| | (b)-Alignment | ES0/ES1/EBx | A1 can be maintained under the constraint of Connection. |
| | | ECx | B2 tends to group the plug and cable (bracket) together, so B4 is adopted to extract the gradient, and A2’s shape library is enriched. |
| | | EDx | A4 is added to prevent the shadow from being counted as part of the shape, which would offset the centroid. |
| | (b)-Detection | EA0/EA1/…/EA14 | D5 is used to mitigate noise. Besides matching, A2 also serves to enrich the shape library. |
| | | EA5/EB3 | The texture is complex but regular, which makes geometric matching prone to bias but benefits analysis in gradient mode. |
| | | EA8/EA12/EB1/EB2 | Significant geometric features (box structure). |
| | | EA3/EA16/EA18 | Significant visual features (color). |
| | | EA10/EA15/EA17 | Significant geometric features (quasi-circular structure). |
| | | EB4/EB5 | The pattern change under visible conditions is single, so C3 is configured. |
| | | ECx | C3 is added to prevent some missed detections. |
| | | EDx | A2 replaces the A4 used in alignment to determine the position of the object, and C3 is added to help determine whether the object exists. |
| D | (c)-Alignment | EBx | Observing the alignment of (b) shows that defects in EBx cause large matching deviations, so C4 is configured to limit the search of A1. |
| | | EAx/ECx | D2 is introduced to provide observation of the semantic space and prevent local matching errors. |
| | (d)-Detection | EA0/EA1/…/EA14 | Configure D3 to enrich the observation space. |
| | | EA5/EB3 | Configure A1 to add position sensitivity. |
| | | EA12 | Due to the special shape and invisibility (based on experience), recognition can be completed only by C3. |
| | | EB4/EB5 | Although significant shapes are easy to fit, in practice cables and brackets readily cause matching errors, so D2 and D5 are added. |
| | | EB1/EB2 | Configure D3 to enrich the observation space. |
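
The per-object configurations listed in the table above can be read as a lookup from stage and object ID to method-pool entries. The following is a minimal sketch of that idea, not the authors' implementation. It encodes only a few of the detection rows, treats the trailing "x" in IDs such as ECx as a wildcard, and assumes that methods configured in a later stage are added to those from earlier stages; the names `SCHEME` and `methods_for` are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical encoding of a few detection rows from the table.
# Keys are object-ID patterns; the table's trailing "x" (e.g. "ECx")
# is assumed to be a wildcard and is written here as "*".
SCHEME = {
    "(b)-Detection": {
        "EB4": ["C3"],
        "EB5": ["C3"],
        "EC*": ["C3"],
        "ED*": ["A2", "C3"],
    },
    "(d)-Detection": {
        "EA5": ["A1"],
        "EB3": ["A1"],
        "EA12": ["C3"],
        "EB4": ["D2", "D5"],
        "EB5": ["D2", "D5"],
        "EB1": ["D3"],
        "EB2": ["D3"],
    },
}


def methods_for(object_id, stages):
    """Collect the method IDs configured for object_id across the given
    stages, in stage order and without duplicates.

    Accumulating across stages is an assumption of this sketch."""
    selected = []
    for stage in stages:
        for pattern, methods in SCHEME.get(stage, {}).items():
            if fnmatchcase(object_id, pattern):
                for method in methods:
                    if method not in selected:
                        selected.append(method)
    return selected


if __name__ == "__main__":
    stages = ["(b)-Detection", "(d)-Detection"]
    for oid in ("EB4", "ED7", "EA12"):
        print(oid, methods_for(oid, stages))
```

Keeping the scheme as plain data rather than code means a knowledge-improvement step only has to add or change an entry, which keeps each configuration change traceable to the expert note that motivated it.
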
| Potential deviation | | (0.623, 0.403) | (0.636, 0.407) | (0.579, 0.364) | (0.659, 0.411) | (0.655, 0.412) |
| Visible Number | Normal | 54 | 60 | 61 | 50 | 49 |
| | Defects | 4 | 8 | 7 | 7 | 6 |
| Group I | Normal | 42 | 44 | 35 | 37 | 36 |
| | Defects | 3 | 3 | 6 | 3 | 5 |
| Group II | Alignment | (0.907, 0.812) | (0.918, 0.808) | (0.911, 0.815) | (0.933, 0.820) | (0.929, 0.812) |
| | Normal | 52 | 58 | 58 | 50 | 47 |
| | Defects | 4 | 8 | 7 | 6 | 6 |
| Proposed_(a) D5 + C3 | 0.986 | 43.423 | 0.874 | 0.561 | 
| Proposed_(a) D0 + EPnP | 1.021 | 52.786 | 0.833 | 0.515 | 
| Proposed_(b) | 0.753 | 37.665 | 0.927 | 0.722 | 
| MTCI | 0.625 | 0.854 | 0.691 | 0.837 | 0.219 | 
| MTCI2 | 0.723 | 0.878 | 0.648 | 0.844 | 0.286 | 
| Proposed | 0.705 | 0.963 | 0.722 | 0.955 | 0.425 | 
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhao, D.; Kong, F.; Lv, N.; Xu, Z.; Du, F. A Common Knowledge-Driven Generic Vision Inspection Framework for Adaptation to Multiple Scenarios, Tasks, and Objects. Sensors 2024, 24, 4120. https://doi.org/10.3390/s24134120