Hierarchical Open-Set Object Detection in Unseen Data
Abstract
1. Introduction
2. Related Works
2.1. Multi-Object Detection
2.2. Open-Set Recognition
2.3. Active Learning and Semi-Supervised Learning Combination
3. System Overview
4. Dynamic Hierarchical Feature Model
5. Outlier Detection
6. Open-Set-Aware Incremental ASSL
Algorithm 1. Open-set-aware incremental active semi-supervised learning using outlier detection
Input: Confident labeled dataset, tentatively labeled dataset, and dynamic HFM.
Output: Optimal dynamic HFM.
1: while do
2: Train initial CNN model using.
3: while not converged do
4: Select batch pool of candidate samples from .
5: Select tentatively labeled samples filtered by, parameters using (2) and (3). and each selection criterion.
6: Assign a pseudo-label and a score to each unlabeled .
7: Sort the pseudo-labeled tentative samples in decreasing order.
8: Divide the sorted tentative samples into j bins, in decreasing order; the i-th bin holds the samples in the range (i − 1)/j to i/j. Generate bin sequence by partitioning .
9: while i < j do
10: , train using and calculate .
11: If , ; ; else if or outlier detected by (1), oracle labels incorrectly labeled data in and return . i++
12: end
13: Retrain using ;
14: end
15: Update dynamic HFM with .
16: end
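
Steps 6 to 11 of Algorithm 1 (score, sort, bin, then add bins one at a time with an oracle fallback) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the `Sample` tuple layout, the function names `split_into_bins` and `incremental_round`, and the callbacks `evaluate`, `is_outlier`, and `oracle` are all hypothetical. In particular, the acceptance test used here ("the evaluation score does not drop and no outlier is flagged") is an assumed stand-in, because the exact condition in step 11 is not recoverable from the text.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical layout: (sample_id, pseudo_label, confidence_score)
Sample = Tuple[str, str, float]


def split_into_bins(samples: Sequence[Sample], j: int) -> List[List[Sample]]:
    """Sort by score (descending) and cut the ranking into j bins.

    Bin i (1-based) holds the samples ranked from (i - 1)/j to i/j.
    """
    if j <= 0:
        raise ValueError("j must be positive")
    ranked = sorted(samples, key=lambda s: s[2], reverse=True)
    n = len(ranked)
    return [ranked[(i - 1) * n // j: i * n // j] for i in range(1, j + 1)]


def incremental_round(
    labeled: List[Sample],
    bins: List[List[Sample]],
    evaluate: Callable[[List[Sample]], float],
    is_outlier: Callable[[Sample], bool],
    oracle: Callable[[Sample], Sample],
) -> List[Sample]:
    """Add bins one by one; send a bin to the oracle when it looks harmful.

    ASSUMPTION: a bin is accepted as-is when the evaluation score does not
    drop and no sample in it is flagged as an outlier. Otherwise every sample
    in the bin is passed to the oracle for relabeling before it is added.
    """
    current = list(labeled)
    best = evaluate(current)
    for bin_samples in bins:
        candidate = current + bin_samples
        score = evaluate(candidate)
        flagged = any(is_outlier(s) for s in bin_samples)
        if score >= best and not flagged:
            current, best = candidate, score
        else:
            current = current + [oracle(s) for s in bin_samples]
            best = evaluate(current)
    return current


# Toy usage with made-up data: "evaluate" is the share of correct labels.
truth = {"s": "cat", "a": "cat", "b": "dog", "c": "cat", "d": "dog"}
tentative = [("a", "cat", 0.9), ("c", "dog", 0.4), ("b", "dog", 0.8), ("d", "dog", 0.3)]
bins = split_into_bins(tentative, j=2)
result = incremental_round(
    labeled=[("s", "cat", 1.0)],
    bins=bins,
    evaluate=lambda data: sum(truth[i] == y for i, y, _ in data) / len(data),
    is_outlier=lambda sample: False,
    oracle=lambda sample: (sample[0], truth[sample[0]], 1.0),
)
```

In the toy run, the high-confidence bin is accepted unchanged, while the low-confidence bin lowers the score and is therefore relabeled by the oracle. Processing bins in decreasing order of confidence means the oracle is only consulted for the part of the ranking where pseudo-labels start to hurt.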
7. Experiment
7.1. Dataset Overview
7.2. Results
8. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
Method | Backbone | mAP (%): 07 Test + Local Data (Fire Extinguisher) | mAP (%): 07 Test + Local Data (Hog) | mAP (%): 07 Test + COCO 2017 val Data (Animal) | mAP (%): 07 Test + ILSVRC DET 2017 val Data (Vehicle) |
---|---|---|---|---|---|
Faster RCNN | VGG16 | 69.3 | 67.1 | 59.8 | 52.9 |
Faster RCNN | Resnet101 | 75.5 | 75.8 | 59.7 | 55.8 |
SSD 300 | VGG16 | 73.3 | 73.6 | 64.1 | 54.2 |
YOLOv2 | Darknet19 | 71.5 | 72.3 | 61.4 | 57.9 |
YOLOv2 | Resnet50 | 67.3 | 67.6 | 57.0 | 54.1 |
YOLOv2 | Resnet152 | 69.9 | 70.4 | 59.3 | 56.4 |
YOLOv2 | Densenet201 | 71.9 | 72.5 | 61.4 | 58.1 |
Ours | Darknet19 | 77.9 | 77.5 | 72.3 | 70.8 |
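
As a reading aid for the table above, the margin of the proposed method over the strongest baseline in each test setting can be computed directly from the listed mAP values. The snippet below is a small sketch that only copies numbers from the table; the names `BASELINES`, `OURS`, and `margins_over_best_baseline` are hypothetical.

```python
# mAP values copied from the results table; list order follows the table columns:
# Fire Extinguisher, Hog, Animal, Vehicle.
BASELINES = {
    ("Faster RCNN", "VGG16"): [69.3, 67.1, 59.8, 52.9],
    ("Faster RCNN", "Resnet101"): [75.5, 75.8, 59.7, 55.8],
    ("SSD 300", "VGG16"): [73.3, 73.6, 64.1, 54.2],
    ("YOLOv2", "Darknet19"): [71.5, 72.3, 61.4, 57.9],
    ("YOLOv2", "Resnet50"): [67.3, 67.6, 57.0, 54.1],
    ("YOLOv2", "Resnet152"): [69.9, 70.4, 59.3, 56.4],
    ("YOLOv2", "Densenet201"): [71.9, 72.5, 61.4, 58.1],
}
OURS = [77.9, 77.5, 72.3, 70.8]


def margins_over_best_baseline(baselines, ours):
    """Return, per column, ours minus the best baseline value (rounded to 0.1)."""
    best = [max(column) for column in zip(*baselines.values())]
    return [round(o - b, 1) for o, b in zip(ours, best)]


margins = margins_over_best_baseline(BASELINES, OURS)
# margins == [2.4, 1.7, 8.2, 12.7]
```

The margins are smallest on the two local-data settings and largest on the COCO and ILSVRC settings.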
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Kim, Y.H.; Shin, D.K.; Ahmed, M.U.; Rhee, P.K. Hierarchical Open-Set Object Detection in Unseen Data. Symmetry 2019, 11, 1271. https://doi.org/10.3390/sym11101271