Unsupervised Domain Adaptation in Semantic Segmentation: A Review
Abstract
1. Introduction
1.1. Semantic Segmentation
1.2. Domain Adaptation (DA)
1.3. Unsupervised Domain Adaptation (UDA)
1.4. Application Motivations
1.5. Outline
2. Unsupervised Domain Adaptation for Semantic Segmentation
2.1. Problem Formulation
- Closed Set DA: all the possible categories appear in both the source and target domains;
- Partial DA: all the categories appear in the source domain, but just a subset appears in the target domain;
- Open Set DA: some categories appear in the source domain and all categories appear in the target domain;
- Open-Partial DA: some categories belong only to the source or to the target set and others belong to both sets;
- Boundless DA: an Open Set DA where all the target domain categories are learned individually.
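The settings above differ only in how the source and target category sets relate to each other, so they can be told apart with plain set comparisons. The following is a minimal sketch, not part of the reviewed methods: the function name `da_setting` and the example category names are hypothetical, and Boundless DA is not distinguished because it shares the set relation of Open Set DA and differs only in how the target-only categories are learned.

```python
def da_setting(source_classes, target_classes):
    """Name the DA setting implied by two category sets (illustrative helper)."""
    source, target = set(source_classes), set(target_classes)
    if source == target:
        return "closed set"      # same categories in both domains
    if target < source:
        return "partial"         # target categories are a strict subset of the source ones
    if source < target:
        return "open set"        # source categories are a strict subset of the target ones
    if source & target:
        return "open-partial"    # some shared categories, some private to each domain
    return "no shared categories"  # degenerate case, outside the taxonomy above


# Hypothetical road-scene category names, used only to exercise the helper.
print(da_setting({"road", "car"}, {"road", "car"}))
print(da_setting({"road", "car", "sky"}, {"road", "car"}))
print(da_setting({"road", "car"}, {"road", "car", "train"}))
print(da_setting({"road", "car"}, {"road", "train"}))
```

Using strict subset operators keeps the closed-set case separate from the partial and open-set cases, which would otherwise overlap when the two sets are equal.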
2.2. UDA in Semantic Segmentation: Adaptation Spaces
3. Review of Unsupervised Domain Adaptation Strategies
3.1. Weakly- and Semi-Supervised Learning
3.2. Domain Adversarial Discriminative
3.3. Generative-Based Approaches
3.4. Classifier Discrepancy
3.5. Self-Training
3.6. Entropy Minimization
3.7. Curriculum Learning
3.8. Multi-Tasking
3.9. New Research Directions
4. A Case Study: Synthetic to Real Adaptation for Semantic Understanding of Road Scenes
- Autonomous driving is currently one of the most active research areas, and it is backed by substantial funding [114];
- Autonomous vehicles must fully understand the surrounding environment to plan their decisions [116], and similar navigation tasks arise in many other applications, for example, in robotics;
- The first works on the topic addressed this setting, and it has become the de facto standard for comparing performance with the state of the art in UDA for semantic segmentation.
4.1. Source Domain: Synthetic Datasets of Urban Scenes
4.2. Target Domain: Real World Datasets of Urban Scenes
4.3. Methods Comparison
5. Conclusions and Future Directions
Author Contributions
Funding
Conflicts of Interest
References
- Wang, M.; Deng, W. Deep visual domain adaptation: A survey. Neurocomputing 2018, 312, 135–153.
- Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 3431–3440.
- Zhao, H.; Shi, J.; Qi, X.; Wang, X.; Jia, J. Pyramid scene parsing network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2881–2890.
- Yu, F.; Koltun, V.; Funkhouser, T.A. Dilated Residual Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 636–644.
- Chen, L.C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI) 2018, 40, 834–848.
- Chen, L.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 833–851.
- Chen, L.; Papandreou, G.; Schroff, F.; Adam, H. Rethinking Atrous Convolution for Semantic Image Segmentation. arXiv 2017, arXiv:1706.05587.
- Cordts, M.; Omran, M.; Ramos, S.; Rehfeld, T.; Enzweiler, M.; Benenson, R.; Franke, U.; Roth, S.; Schiele, B. The Cityscapes dataset for semantic urban scene understanding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 3213–3223.
- Neuhold, G.; Ollmann, T.; Rota Bulo, S.; Kontschieder, P. The Mapillary vistas dataset for semantic understanding of street scenes. In Proceedings of the International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 4990–4999.
- Everingham, M.; Van Gool, L.; Williams, C.K.; Winn, J.; Zisserman, A. The Pascal Visual Object Classes (VOC) challenge. Int. J. Comput. Vis. (IJCV) 2010, 88, 303–338.
- Lin, T.Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft coco: Common objects in context. In European Conference on Computer Vision (ECCV); Springer: Berlin, Germany, 2014; pp. 740–755.
- Zhou, B.; Zhao, H.; Puig, X.; Fidler, S.; Barriuso, A.; Torralba, A. Scene parsing through ade20k dataset. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 633–641.
- Silberman, N.; Hoiem, D.; Kohli, P.; Fergus, R. Indoor Segmentation and Support Inference from RGBD Images. In Proceedings of the European Conference on Computer Vision (ECCV), Florence, Italy, 7–13 October 2012.
- Song, S.; Lichtenberg, S.P.; Xiao, J. Sun rgb-d: A rgb-d scene understanding benchmark suite. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 567–576.
- Sun, S.; Shi, H.; Wu, Y. A survey of multi-source domain adaptation. Inf. Fusion 2015, 24, 84–92.
- Csurka, G. Domain adaptation for visual applications: A comprehensive survey. arXiv 2017, arXiv:1702.05374.
- Jiang, J.; Zhai, C. Instance weighting for domain adaptation in NLP. In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics, Prague, Czech Republic, 23–30 June 2007; pp. 264–271.
- Fang, F.; Dutta, K.; Datta, A. Domain adaptation for sentiment classification in light of multiple sources. INFORMS J. Comput. 2014, 26, 586–598.
- Jiang, J. A Literature Survey on Domain Adaptation of Statistical Classifiers. 2008. Available online: http://www.mysmu.edu/faculty/jingjiang/papers/da_survey.pdf (accessed on 19 June 2020).
- Patel, V.M.; Gopalan, R.; Li, R.; Chellappa, R. Visual Domain Adaptation: A survey of recent advances. IEEE Signal Process. Mag. 2015, 32, 53–69.
- Ho, H.T.; Gopalan, R. Model-driven domain adaptation on product manifolds for unconstrained face recognition. Int. J. Comput. Vis. (IJCV) 2014, 109, 110–125.
- Saenko, K.; Kulis, B.; Fritz, M.; Darrell, T. Adapting visual category models to new domains. In Proceedings of the European Conference on Computer Vision (ECCV), Crete, Greece, 5–10 September 2010; Springer: Berlin/Heidelberg, Germany, 2010; pp. 213–226.
- Richter, S.R.; Vineet, V.; Roth, S.; Koltun, V. Playing for Data: Ground Truth from Computer Games. In Proceedings of the European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands, 11–14 October 2016; Leibe, B., Matas, J., Sebe, N., Welling, M., Eds.; Springer International Publishing: Berlin/Heidelberg, Germany, 2016; Volume 9906, pp. 102–118.
- Bucher, M.; Vu, T.H.; Cord, M.; Pérez, P. BUDA: Boundless Unsupervised Domain Adaptation in Semantic Segmentation. arXiv 2020, arXiv:2004.01130.
- Yang, Y.; Soatto, S. FDA: Fourier Domain Adaptation for Semantic Segmentation. arXiv 2020, arXiv:2004.05498.
- Vezhnevets, A.; Buhmann, J.M. Towards weakly supervised semantic segmentation by means of multiple instance and multitask learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, USA, 13–18 June 2010; pp. 3249–3256.
- Pathak, D.; Shelhamer, E.; Long, J.; Darrell, T. Fully convolutional multi-class multiple instance learning. arXiv 2014, arXiv:1412.7144.
- Papandreou, G.; Chen, L.C.; Murphy, K.P.; Yuille, A.L. Weakly-and semi-supervised learning of a deep convolutional network for semantic image segmentation. In Proceedings of the International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 1742–1750.
- Pathak, D.; Krahenbuhl, P.; Darrell, T. Constrained convolutional neural networks for weakly supervised segmentation. In Proceedings of the International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 1796–1804.
- Wei, Y.; Liang, X.; Chen, Y.; Shen, X.; Cheng, M.M.; Feng, J.; Zhao, Y.; Yan, S. STC: A simple to complex framework for weakly-supervised semantic segmentation. IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI) 2017, 39, 2314–2320.
- Hong, S.; Noh, H.; Han, B. Decoupled deep neural network for semi-supervised semantic segmentation. In Proceedings of the Neural Information Processing Systems (NeurIPS), Montreal, QC, Canada, 7–12 December 2015; pp. 1495–1503.
- Dai, J.; He, K.; Sun, J. Boxsup: Exploiting bounding boxes to supervise convolutional networks for semantic segmentation. In Proceedings of the International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 1635–1643.
- Souly, N.; Spampinato, C.; Shah, M. Semi and weakly supervised semantic segmentation using generative adversarial network. arXiv 2017, arXiv:1703.09695.
- Kolesnikov, A.; Lampert, C.H. Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; Springer: Berlin, Germany, 2016; pp. 695–711.
- Huang, Z.; Wang, X.; Wang, J.; Liu, W.; Wang, J. Weakly-supervised semantic segmentation network with deep seeded region growing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018; pp. 7014–7023.
- Wei, Y.; Feng, J.; Liang, X.; Cheng, M.M.; Zhao, Y.; Yan, S. Object region mining with adversarial erasing: A simple classification to semantic segmentation approach. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1568–1576.
- Ahn, J.; Kwak, S. Learning pixel-level semantic affinity with image-level supervision for weakly supervised semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 4981–4990.
- Lee, J.; Kim, E.; Lee, S.; Lee, J.; Yoon, S. Ficklenet: Weakly and semi-supervised semantic image segmentation using stochastic inference. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 5267–5276.
- Ahn, J.; Cho, S.; Kwak, S. Weakly supervised learning of instance segmentation with inter-pixel relations. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 2209–2218.
- Ramirez, P.Z.; Tonioni, A.; Salti, S.; Stefano, L.D. Learning Across Tasks and Domains. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019.
- Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. In Proceedings of the Neural Information Processing Systems (NeurIPS), Montreal, QC, Canada, 8–13 December 2014; pp. 2672–2680.
- Ganin, Y.; Lempitsky, V. Unsupervised Domain Adaptation by Backpropagation. In Proceedings of the International Conference on Machine Learning (ICML), Lille, France, 7–9 July 2015; pp. 1180–1189.
- Ganin, Y.; Ustinova, E.; Ajakan, H.; Germain, P.; Larochelle, H.; Laviolette, F.; Marchand, M.; Lempitsky, V. Domain-adversarial training of neural networks. J. Mach. Learn. Res. 2016, 17, 1–35.
- Tzeng, E.; Hoffman, J.; Saenko, K.; Darrell, T. Adversarial discriminative domain adaptation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 7167–7176.
- Hoffman, J.; Wang, D.; Yu, F.; Darrell, T. FCNs in the wild: Pixel-level adversarial and constraint-based adaptation. arXiv 2016, arXiv:1612.02649.
- Chen, Y.; Li, W.; Van Gool, L. Road: Reality oriented adaptation for semantic segmentation of urban scenes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018; pp. 7892–7901.
- Zhang, Y.; Qiu, Z.; Yao, T.; Liu, D.; Mei, T. Fully convolutional adaptation networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018; pp. 6810–6818.
- Li, Y.; Yuan, L.; Vasconcelos, N. Bidirectional Learning for Domain Adaptation of Semantic Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019.
- Huang, H.; Huang, Q.; Krähenbühl, P. Domain Transfer Through Deep Activation Matching. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018.
- Hoffman, J.; Tzeng, E.; Park, T.; Zhu, J.Y.; Isola, P.; Saenko, K.; Efros, A.; Darrell, T. CyCADA: Cycle-Consistent Adversarial Domain Adaptation. In Proceedings of the International Conference on Machine Learning (ICML), Stockholm, Sweden, 10–15 July 2018.
- Chen, Y.C.; Lin, Y.Y.; Yang, M.H.; Huang, J.B. CrDoCo: Pixel-Level Domain Transfer With Cross-Domain Consistency. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019.
- Luo, Y.; Liu, P.; Guan, T.; Yu, J.; Yang, Y. Significance-Aware Information Bottleneck for Domain Adaptive Semantic Segmentation. In Proceedings of the International Conference on Computer Vision (ICCV), Seoul, Korea, 27 October–2 November 2019.
- Toldo, M.; Michieli, U.; Agresti, G.; Zanuttigh, P. Unsupervised Domain Adaptation for Mobile Semantic Segmentation based on Cycle Consistency and Feature Alignment. arXiv 2020, arXiv:2001.04692.
- Chen, Y.H.; Chen, W.Y.; Chen, Y.T.; Tsai, B.C.; Frank Wang, Y.C.; Sun, M. No more discrimination: Cross city adaptation of road scene segmenters. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 1992–2001.
- Du, L.; Tan, J.; Yang, H.; Feng, J.; Xue, X.; Zheng, Q.; Ye, X.; Zhang, X. SSF-DAN: Separated Semantic Feature Based Domain Adaptation Network for Semantic Segmentation. In Proceedings of the International Conference on Computer Vision (ICCV), Seoul, Korea, 27 October–2 November 2019.
- Deng, J.; Dong, W.; Socher, R.; Li, L.; Li, K.; Li, F. ImageNet: A large-scale hierarchical image database. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, USA, 20–25 June 2009; pp. 248–255.
- Zhu, X.; Zhou, H.; Yang, C.; Shi, J.; Lin, D. Penalizing top performers: Conservative loss for semantic segmentation adaptation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 568–583.
- Murez, Z.; Kolouri, S.; Kriegman, D.J.; Ramamoorthi, R.; Kim, K. Image to Image Translation for Domain Adaptation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018.
- Sankaranarayanan, S.; Balaji, Y.; Jain, A.; Nam Lim, S.; Chellappa, R. Learning from synthetic data: Addressing domain shift for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018; pp. 3752–3761.
- Tsai, Y.H.; Hung, W.C.; Schulter, S.; Sohn, K.; Yang, M.H.; Chandraker, M. Learning to adapt structured output space for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018; pp. 7472–7481.
- Chen, Y.; Li, W.; Chen, X.; Van Gool, L. Learning Semantic Segmentation from Synthetic Data: A Geometrically Guided Input-Output Adaptation Approach. arXiv 2018, arXiv:1812.05040.
- Chang, W.; Wang, H.; Peng, W.; Chiu, W. All About Structure: Adapting Structural Information Across Domains for Boosting Semantic Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019; pp. 1900–1909.
- Luo, Y.; Zheng, L.; Guan, T.; Yu, J.; Yang, Y. Taking A Closer Look at Domain Shift: Category-level Adversaries for Semantics Consistent Domain Adaptation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019.
- Yang, J.; An, W.; Wang, S.; Zhu, X.; Yan, C.; Huang, J. Label-Driven Reconstruction for Domain Adaptation in Semantic Segmentation. arXiv 2020, arXiv:2003.04614.
- Biasetton, M.; Michieli, U.; Agresti, G.; Zanuttigh, P. Unsupervised Domain Adaptation for Semantic Segmentation of Urban Scenes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Long Beach, CA, USA, 16–20 June 2019.
- Michieli, U.; Biasetton, M.; Agresti, G.; Zanuttigh, P. Adversarial Learning and Self-Teaching Techniques for Domain Adaptation in Semantic Segmentation. IEEE Trans. Intell. Veh. 2020.
- Spadotto, T.; Toldo, M.; Michieli, U.; Zanuttigh, P. Unsupervised Domain Adaptation with Multiple Domain Discriminators and Adaptive Self-Training. arXiv 2020, arXiv:2004.12724.
- Vu, T.H.; Jain, H.; Bucher, M.; Cord, M.; Pérez, P. Advent: Adversarial entropy minimization for domain adaptation in semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019; pp. 2517–2526.
- Vu, T.; Jain, H.; Bucher, M.; Cord, M.; Pérez, P. DADA: Depth-Aware Domain Adaptation in Semantic Segmentation. In Proceedings of the International Conference on Computer Vision (ICCV), Seoul, Korea, 27 October–2 November 2019; pp. 7363–7372.
- Tsai, Y.H.; Sohn, K.; Schulter, S.; Chandraker, M. Domain Adaptation for Structured Output via Discriminative Patch Representations. In Proceedings of the International Conference on Computer Vision (ICCV), Seoul, Korea, 27 October–2 November 2019; pp. 1456–1465.
- Zhou, Q.; Feng, Z.; Cheng, G.; Tan, X.; Shi, J.; Ma, L. Uncertainty-Aware Consistency Regularization for Cross-Domain Semantic Segmentation. arXiv 2020, arXiv:2004.08878.
- Qin, C.; Wang, L.; Zhang, Y.; Fu, Y. Generatively Inferential Co-Training for Unsupervised Domain Adaptation. In Proceedings of the International Conference on Computer Vision Workshops (ICCVW), Seoul, Korea, 27 October–2 November 2019; pp. 1055–1064.
- Li, P.; Liang, X.; Jia, D.; Xing, E.P. Semantic-aware Grad-GAN for Virtual-to-Real Urban Scene Adaption. In Proceedings of the British Machine Vision Conference (BMVC), Newcastle, UK, 3–6 September 2018.
- Yang, Y.; Lao, D.; Sundaramoorthi, G.; Soatto, S. Phase Consistent Ecological Domain Adaptation. arXiv 2020, arXiv:2004.04923.
- Gong, R.; Li, W.; Chen, Y.; Gool, L.V. DLOW: Domain Flow for Adaptation and Generalization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019; pp. 2477–2486.
- Zhu, J.; Park, T.; Isola, P.; Efros, A.A. Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks. In Proceedings of the International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017.
- Lee, K.; Ros, G.; Li, J.; Gaidon, A. SPIGAN: Privileged Adversarial Learning from Simulation. In Proceedings of the International Conference on Learning Representations (ICLR), New Orleans, LA, USA, 6–9 May 2019.
- Choi, J.; Kim, T.; Kim, C. Self-Ensembling With GAN-Based Data Augmentation for Domain Adaptation in Semantic Segmentation. In Proceedings of the International Conference on Computer Vision (ICCV), Seoul, Korea, 27 October–2 November 2019; pp. 6830–6840.
- Hong, W.; Wang, Z.; Yang, M.; Yuan, J. Conditional Generative Adversarial Network for Structured Domain Adaptation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018; pp. 1335–1344.
- Pizzati, F.; Charette, R.d.; Zaccaria, M.; Cerri, P. Domain bridge for unpaired image-to-image translation and unsupervised domain adaptation. In Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), Snowmass Village, CO, USA, 1–5 March 2020; pp. 2990–2998.
- Huang, X.; Liu, M.; Belongie, S.J.; Kautz, J. Multimodal Unsupervised Image-to-Image Translation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 179–196.
- Wu, Z.; Han, X.; Lin, Y.; Uzunbas, M.G.; Goldstein, T.; Lim, S.; Davis, L.S. DCAN: Dual Channel-Wise Alignment Networks for Unsupervised Scene Adaptation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 535–552.
- Wu, Z.; Wang, X.; Gonzalez, J.; Goldstein, T.; Davis, L. ACE: Adapting to Changing Environments for Semantic Segmentation. In Proceedings of the International Conference on Computer Vision (ICCV), Seoul, Korea, 27 October–2 November 2019; pp. 2121–2130.
- Dundar, A.; Liu, M.; Wang, T.; Zedlewski, J.; Kautz, J. Domain Stylization: A Strong, Simple Baseline for Synthetic to Real Image Domain Adaptation. arXiv 2018, arXiv:1807.09384.
- Gatys, L.A.; Ecker, A.S.; Bethge, M. Texture Synthesis Using Convolutional Neural Networks. In Proceedings of the Neural Information Processing Systems (NeurIPS), Montreal, QC, Canada, 7–12 December 2015; pp. 262–270.
- Gatys, L.A.; Ecker, A.S.; Bethge, M. Image Style Transfer Using Convolutional Neural Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 2414–2423.
- Huang, X.; Belongie, S.J. Arbitrary Style Transfer in Real-Time with Adaptive Instance Normalization. In Proceedings of the International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 1510–1519.
- Saito, K.; Ushiku, Y.; Harada, T.; Saenko, K. Adversarial Dropout Regularization. In Proceedings of the International Conference on Learning Representations (ICLR), Vancouver, BC, Canada, 30 April–3 May 2018.
- Saito, K.; Watanabe, K.; Ushiku, Y.; Harada, T. Maximum Classifier Discrepancy for Unsupervised Domain Adaptation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018; pp. 3723–3732.
- Watanabe, K.; Saito, K.; Ushiku, Y.; Harada, T. Multichannel Semantic Segmentation with Unsupervised Domain Adaptation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018.
- Lee, S.; Kim, D.; Kim, N.; Jeong, S.G. Drop to Adapt: Learning Discriminative Features for Unsupervised Domain Adaptation. In Proceedings of the International Conference on Computer Vision (ICCV), Seoul, Korea, 27 October–2 November 2019; pp. 91–100.
- Grandvalet, Y.; Bengio, Y. Semi-supervised Learning by Entropy Minimization. In Proceedings of the Actes de CAP 05, Conférence Francophone sur L’apprentissage Automatique, Nice, France, 31 May–3 June 2005; pp. 281–296.
- Zou, Y.; Yu, Z.; Vijaya Kumar, B.; Wang, J. Unsupervised domain adaptation for semantic segmentation via class-balanced self-training. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 289–305.
- Zou, Y.; Yu, Z.; Liu, X.; Kumar, B.V.; Wang, J. Confidence Regularized Self-Training. In Proceedings of the International Conference on Computer Vision (ICCV), Seoul, Korea, 27 October–2 November 2019; pp. 5982–5991.
- Chen, M.; Xue, H.; Cai, D. Domain Adaptation for Semantic Segmentation With Maximum Squares Loss. In Proceedings of the International Conference on Computer Vision (ICCV), Seoul, Korea, 27 October–2 November 2019.
- Zhang, Y.; David, P.; Gong, B. Curriculum domain adaptation for semantic segmentation of urban scenes. In Proceedings of the International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2020–2030.
- Zhang, Y.; David, P.; Foroosh, H.; Gong, B. A curriculum domain adaptation approach to the semantic segmentation of urban scenes. IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI) 2019, in press.
- Sakaridis, C.; Dai, D.; Hecker, S.; Van Gool, L. Model adaptation with synthetic and real data for semantic dense foggy scene understanding. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 687–704.
- Dai, D.; Sakaridis, C.; Hecker, S.; Van Gool, L. Curriculum model adaptation with synthetic and real data for semantic foggy scene understanding. Int. J. Comput. Vis. (IJCV) 2019, 128, 1182–1204.
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. In Proceedings of the Neural Information Processing Systems (NeurIPS), Lake Tahoe, NV, USA, 3–6 December 2012; pp. 1106–1114.
- Lian, Q.; Lv, F.; Duan, L.; Gong, B. Constructing Self-motivated Pyramid Curriculums for Cross-Domain Semantic Segmentation: A Non-Adversarial Approach. In Proceedings of the International Conference on Computer Vision (ICCV), Seoul, Korea, 27 October–2 November 2019; pp. 6758–6767.
- Chen, Y.; Li, W.; Chen, X.; Gool, L.V. Learning semantic segmentation from synthetic data: A geometrically guided input-output adaptation approach. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019; pp. 1841–1850.
- Busto, P.P.; Gall, J. Open Set Domain Adaptation. In Proceedings of the International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 754–763.
- Saito, K.; Kim, D.; Sclaroff, S.; Saenko, K. Universal Domain Adaptation through Self Supervision. arXiv 2020, arXiv:2002.07953.
- Zhuo, J.; Wang, S.; Cui, S.; Huang, Q. Unsupervised Open Domain Recognition by Semantic Discrepancy Minimization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019; pp. 750–759.
- Bucher, M.; Vu, T.; Cord, M.; Pérez, P. Zero-Shot Semantic Segmentation. In Proceedings of the Neural Information Processing Systems (NeurIPS), Vancouver, BC, Canada, 8–14 December 2019; pp. 466–477.
- Kirkpatrick, J.; Pascanu, R.; Rabinowitz, N.; Veness, J.; Desjardins, G.; Rusu, A.A.; Milan, K.; Quan, J.; Ramalho, T.; Grabska-Barwinska, A.; et al. Overcoming catastrophic forgetting in neural networks. Proc. Natl. Acad. Sci. USA 2017, 114, 3521–3526.
- Li, Z.; Hoiem, D. Learning without forgetting. IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI) 2017, 40, 2935–2947.
- Shmelkov, K.; Schmid, C.; Alahari, K. Incremental learning of object detectors without catastrophic forgetting. In Proceedings of the International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 3400–3409.
- Michieli, U.; Zanuttigh, P. Incremental learning techniques for semantic segmentation. In Proceedings of the International Conference on Computer Vision Workshops (ICCVW), Seoul, Korea, 27 October–2 November 2019.
- Michieli, U.; Zanuttigh, P. Knowledge Distillation for Incremental Learning in Semantic Segmentation. arXiv 2019, arXiv:1911.03462.
- Cermelli, F.; Mancini, M.; Bulò, S.R.; Ricci, E.; Caputo, B. Modeling the Background for Incremental Learning in Semantic Segmentation. arXiv 2020, arXiv:2002.00718.
- Mel, M.; Michieli, U.; Zanuttigh, P. Incremental and Multi-Task Learning Strategies for Coarse-To-Fine Semantic Segmentation. Technologies 2020, 8, 1.
- Smith, D.; Burke, B. Gartner’s 2019 Hype Cycle for Emerging Technologies; Technical Report; Gartner: Stamford, CT, USA, August 2019.
- Ros, G.; Sellart, L.; Materzynska, J.; Vazquez, D.; Lopez, A.M. The synthia dataset: A large collection of synthetic images for semantic segmentation of urban scenes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 3234–3243.
- Michieli, U.; Badia, L. Game Theoretic Analysis of Road User Safety Scenarios Involving Autonomous Vehicles. In Proceedings of the IEEE 29th Annual International Symposium on Personal, Indoor and Mobile Radio Communications, Bologna, Italy, 9–12 September 2018; pp. 1377–1381.
- Dosovitskiy, A.; Ros, G.; Codevilla, F.; Lopez, A.; Koltun, V. CARLA: An open urban driving simulator. arXiv 2017, arXiv:1711.03938.
- Kim, J.; Park, C. Attribute Dissection of Urban Road Scenes for Efficient Dataset Integration. In Proceedings of the International Joint Conference on Artificial Intelligence Workshops, Stockholm, Sweden, 13–15 July 2018; pp. 8–15.
- Maddern, W.; Pascoe, G.; Linegar, C.; Newman, P. 1 year, 1000 km: The Oxford RobotCar dataset. Int. J. Robot. Res. 2017, 36, 3–15.
- Liu, M.Y.; Breuel, T.; Kautz, J. Unsupervised image-to-image translation networks. In Proceedings of the Neural Information Processing Systems (NeurIPS), Long Beach, CA, USA, 4–9 December 2017; pp. 700–708.
Method | Backbone | mIoU | Method | Backbone | mIoU |
---|---|---|---|---|---|
Biasetton et al. [65] | ResNet-101 | 30.4 | Chen et al. [46] | VGG-16 | 35.9 |
Chang et al. [62] | ResNet-101 | 45.4 | Chen et al. [51] | VGG-16 | 38.1 |
Chen et al. [46] | ResNet-101 | 39.4 | Choi et al. [78] | VGG-16 | 42.5 |
Chen et al. [95] | ResNet-101 | 46.4 | Du et al. [55] | VGG-16 | 37.7 |
Du et al. [55] | ResNet-101 | 45.4 | Hoffman et al. [45] | VGG-16 | 27.1 |
Gong et al. [75] | ResNet-101 | 42.3 | Hoffman et al. [50] | VGG-16 | 35.4 |
Hoffman et al. [50] | ResNet-101 | 42.7 * | Huang et al. [49] | VGG-16 | 32.6 |
Li et al. [48] | ResNet-101 | 48.5 | Li et al. [48] | VGG-16 | 41.3 |
Lian et al. [101] | ResNet-101 | 47.4 | Lian et al. [101] | VGG-16 | 37.2 |
Luo et al. [52] | ResNet-101 | 42.6 | Luo et al. [52] | VGG-16 | 34.2 |
Luo et al. [63] | ResNet-101 | 43.2 | Luo et al. [63] | VGG-16 | 36.6 |
Michieli et al. [66] | ResNet-101 | 33.3 | Saito et al. [89] | VGG-16 | 28.8 |
Spadotto et al. [67] | ResNet-101 | 35.1 | Sankaranarayanan et al. [59] | VGG-16 | 37.1 |
Tsai et al. [60] | ResNet-101 | 42.4 | Tsai et al. [60] | VGG-16 | 35.0 |
Tsai et al. [70] | ResNet-101 | 46.5 | Tsai et al. [70] | VGG-16 | 37.5 |
Vu et al. [68] | ResNet-101 | 45.5 | Vu et al. [68] | VGG-16 | 36.1 |
Wu et al. [82] | ResNet-101 | 38.5 | Wu et al. [82] | VGG-16 | 36.2 |
Yang et al. [25] | ResNet-101 | 50.5 | Yang et al. [25] | VGG-16 | 42.2 |
Zhang et al. [47] | ResNet-101 | 47.8 | Zhang et al. [96] | VGG-16 | 28.9 |
Zou et al. [94] | ResNet-101 | 47.1 | Zhang et al. [97] | VGG-16 | 31.4 |
Murez et al. [58] | ResNet-34 | 31.8 | Zhou et al. [71] | VGG-16 | 47.8 |
Lian et al. [101] | ResNet-38 | 48.0 | Zhu et al. [57] | VGG-16 | 38.1 * |
Zou et al. [93] | ResNet-38 | 47.0 | Zou et al. [93] | VGG-16 | 36.1 |
Zou et al. [94] | ResNet-38 | 49.8 | Hong et al. [79] | VGG-19 | 44.5 |
Lee et al. [91] | ResNet-50 | 35.8 | Chen et al. [51] | DRN-26 | 45.1 |
Saito et al. [88] | ResNet-50 | 33.3 | Dundar et al. [84] | DRN-26 | 38.3 |
Wu et al. [82] | ResNet-50 | 41.7 | Hoffman et al. [50] | DRN-26 | 39.5 |
Hoffman et al. [50] | MobileNet-v2 | 37.3 * | Huang et al. [49] | DRN-26 | 40.2 |
Toldo et al. [53] | MobileNet-v2 | 41.1 | Liu et al. [120] | DRN-26 | 39.1 * |
Zhu et al. [76] | MobileNet-v2 | 29.3 * | Yang et al. [74] | DRN-26 | 42.6 |
Murez et al. [58] | DenseNet | 35.7 | Zhu et al. [76] | DRN-26 | 39.6 * |
Huang et al. [49] | ERFNet | 31.3 | Saito et al. [89] | DRN-105 | 39.7 |
Method | Backbone | mIoU | mIoU | Method | Backbone | mIoU | mIoU |
---|---|---|---|---|---|---|---|
Biasetton et al. [65] | ResNet-101 | - | 30.2 | Chen et al. [54] | VGG-16 | 35.7 | - |
Bucher et al. [24] | ResNet-101 | - | 36.2 | Chen et al. [46] | VGG-16 | - | 36.2 |
Chang et al. [62] | ResNet-101 | - | 41.5 | Chen et al. [46] | VGG-16 | 41.8 * | 36.2 * |
Chen et al. [95] | ResNet-101 | 48.2 | 41.4 | Chen et al. [51] | VGG-16 | - | 38.2 |
Du et al. [55] | ResNet-101 | 50.0 | - | Chen et al. [102] | VGG-16 | 43.0 | 37.3 |
Li et al. [48] | ResNet-101 | 51.4 | - | Choi et al. [78] | VGG-16 | 46.6 | 38.5 |
Lian et al. [101] | ResNet-101 | 53.3 | 46.7 | Du et al. [55] | VGG-16 | 43.4 | - |
Luo et al. [52] | ResNet-101 | 46.3 | - | Hoffman et al. [45] | VGG-16 | 17.0 | 20.2 * |
Luo et al. [63] | ResNet-101 | 47.8 | - | Huang et al. [49] | VGG-16 | - | 30.7 * |
Michieli et al. [66] | ResNet-101 | - | 31.3 | Lee et al. [77] | VGG-16 | 42.4 * | 36.8 |
Spadotto et al. [67] | ResNet-101 | - | 34.6 | Li et al. [48] | VGG-16 | - | 39.0 |
Tsai et al. [70] | ResNet-101 | 46.5 | 40.0 | Lian et al. [101] | VGG-16 | 42.6 | 35.9 |
Tsai et al. [60] | ResNet-101 | 46.7 | - | Luo et al. [63] | VGG-16 | 39.3 | - |
Vu et al. [68] | ResNet-101 | 48.0 | 41.2 | Luo et al. [52] | VGG-16 | 37.2 | - |
Vu et al. [69] | ResNet-101 | 49.8 | 42.6 | Sankaranarayanan et al. [59] | VGG-16 | 42.1 * | 36.1 |
Wu et al. [82] | ResNet-101 | - | 36.5 | Tsai et al. [70] | VGG-16 | 39.6 | 33.7 |
Yang et al. [25] | ResNet-101 | 52.5 | - | Tsai et al. [60] | VGG-16 | 37.6 | - |
Zou et al. [94] | ResNet-101 | 50.1 | 43.8 | Vu et al. [68] | VGG-16 | 36.6 | 31.4 |
Zou et al. [93] | ResNet-38 | - | 38.4 | Wu et al. [82] | VGG-16 | - | 35.4 |
Wu et al. [82] | ResNet-50 | 48.4 | 42.5 | Yang et al. [25] | VGG-16 | - | 40.5 |
Hoffman et al. [50] | MobileNet-v2 | - | 27.5 * | Yang et al. [74] | VGG-16 | 48.7 | 41.1 |
Toldo et al. [53] | MobileNet-v2 | - | 32.6 | Zhang et al. [96] | VGG-16 | 34.8 * | 29.0 |
Zhu et al. [76] | MobileNet-v2 | - | 24.2 * | Zhang et al. [97] | VGG-16 | - | 29.7 |
Chen et al. [51] | DRN-26 | - | 33.4 | Zhou et al. [71] | VGG-16 | 48.6 | 41.5 |
Dundar et al. [84] | DRN-26 | - | 29.5 | Zhu et al. [57] | VGG-16 | 40.3 * | 34.2 * |
Liu et al. [120] | DRN-26 | - | 28.0 * | Zou et al. [93] | VGG-16 | 36.1 | 35.4 |
Zhu et al. [76] | DRN-26 | - | 27.1 * | Hong et al. [79] | VGG-19 | - | 41.2 |
Saito et al. [89] | DRN-105 | 43.5 * | 37.3 * |
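
The mIoU values in the tables above are mean intersection over union scores. As a rough illustration of how such a score is typically obtained, the following minimal sketch assumes the standard definition (per-class IoU = TP / (TP + FP + FN), averaged over the classes that occur and expressed in percent). The function names, the flattened label lists, and the ignore label 255 are assumptions made for this example; they are not taken from any of the compared methods, whose evaluation protocols may differ in detail.

```python
def confusion_matrix(y_true, y_pred, num_classes, ignore_label=255):
    """Count pixels: rows are ground-truth classes, columns are predictions.

    `ignore_label` is an assumed convention for unlabeled pixels.
    """
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        if t == ignore_label:
            continue  # unlabeled pixels do not contribute to the score
        cm[t][p] += 1
    return cm


def mean_iou(cm):
    """Average the per-class IoU, in percent, over classes that occur."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # class c pixels missed
        fp = sum(cm[r][c] for r in range(n)) - tp   # pixels wrongly assigned to c
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both labels and predictions
            ious.append(tp / denom)
    return 100.0 * sum(ious) / len(ious) if ious else float("nan")


# Toy example: six flattened pixels, three classes, one ignored pixel.
labels = [0, 0, 1, 1, 2, 255]
preds = [0, 1, 1, 1, 2, 0]
score = mean_iou(confusion_matrix(labels, preds, num_classes=3))
print(f"mIoU = {score:.1f}")
```

In practice the confusion matrix is accumulated over the whole validation set before the per-class ratios are taken, rather than averaging per-image scores.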
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Toldo, M.; Maracani, A.; Michieli, U.; Zanuttigh, P. Unsupervised Domain Adaptation in Semantic Segmentation: A Review. Technologies 2020, 8, 35. https://doi.org/10.3390/technologies8020035