HDRSeg-UDA: Semantic Segmentation for HDR Images with Unsupervised Domain Adaptation
Highlights
- Multi-exposure feature extraction from HDR images enables efficient pixel-wise road marking segmentation of driving images under adverse weather; a minimal sketch of this extraction step follows the list.
- A comprehensive dataset specifically designed for road marking segmentation is introduced, providing a valuable resource for evaluating and improving HDR-based semantic segmentation under different illumination conditions.
- The baseline segmentation architecture can be modified to better exploit the rich features of HDR images, and combined with adversarial training and self-training it enhances driving scene understanding.
- The HDR dataset serves as a benchmark for future research in semantic segmentation under various weather conditions.
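To make the first highlight concrete, the sketch below simulates multi-exposure feature extraction from an HDR frame: the frame is rendered at a few virtual exposures, each rendering passes through a shared encoder, and the resulting feature maps are fused. The exposure gains, gamma encoding, two-layer encoder, and fusion by averaging are illustrative assumptions, not the exact HDRSeg-UDA design described in Section 3.1.

```python
import torch
import torch.nn as nn

class MultiExposureEncoder(nn.Module):
    """Illustrative sketch of multi-exposure feature extraction:
    render an HDR frame at several virtual exposures, encode each
    rendering with a shared encoder, and fuse the feature maps.
    The gains, gamma, and mean-fusion are assumptions."""

    def __init__(self, gains=(0.25, 1.0, 4.0), gamma=1.0 / 2.2, channels=64):
        super().__init__()
        self.gains = gains
        self.gamma = gamma
        # One shared encoder applied to every simulated exposure.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, channels, 3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, hdr):  # hdr: (B, 3, H, W), linear radiance
        feats = []
        for g in self.gains:
            ldr = (hdr * g).clamp(0.0, 1.0) ** self.gamma  # simulated exposure + gamma
            feats.append(self.encoder(ldr))
        return torch.stack(feats, dim=0).mean(dim=0)       # fuse by averaging

if __name__ == "__main__":
    x = torch.rand(1, 3, 256, 512)          # stand-in for a normalized HDR frame
    print(MultiExposureEncoder()(x).shape)  # torch.Size([1, 64, 64, 128])
```

In the full model, the fused features would presumably feed the segmentation head trained as described in Sections 3.2–3.4.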
Abstract
1. Introduction
- We develop a road marking segmentation model capable of accurate pixel-wise classification on HDR images.
- We demonstrate the feasibility of using HDR images for semantic segmentation under adverse weather.
- The effectiveness of the ClassMix approach for training a semantic segmentation model on HDR driving images is verified; the mixing step is sketched after this list.
- We establish a new HDR driving dataset for road marking segmentation benchmarking.
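ClassMix [23], referenced in the third contribution, builds mixed training samples by copying the pixels belonging to roughly half of the classes of one image, selected from its label or pseudo-label, onto a second image; the labels are combined with the same mask. The sketch below shows only this mixing step; the NumPy implementation, array shapes, and random class split are assumptions rather than the authors' training code.

```python
import numpy as np

def classmix(img_a, lbl_a, img_b, lbl_b, rng=None):
    """Illustrative ClassMix-style mixing: paste roughly half of the
    classes present in image A (per its label or pseudo-label) onto
    image B. Shapes and the random class split are assumptions."""
    rng = rng or np.random.default_rng()
    classes = np.unique(lbl_a)
    chosen = rng.choice(classes, size=max(1, len(classes) // 2), replace=False)
    mask = np.isin(lbl_a, chosen)                         # (H, W) boolean paste mask
    mixed_img = np.where(mask[..., None], img_a, img_b)   # (H, W, 3)
    mixed_lbl = np.where(mask, lbl_a, lbl_b)              # (H, W)
    return mixed_img, mixed_lbl

if __name__ == "__main__":
    h, w = 128, 256
    img_a, img_b = np.random.rand(h, w, 3), np.random.rand(h, w, 3)
    lbl_a = np.random.randint(0, 5, (h, w))               # source ground-truth label
    lbl_b = np.random.randint(0, 5, (h, w))               # target pseudo-label
    img, lbl = classmix(img_a, lbl_a, img_b, lbl_b)
    print(img.shape, lbl.shape)
```

In cross-domain self-training such as DACS [24], image A is typically a labeled source image and image B an unlabeled target image whose pseudo-label comes from a teacher network.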
2. Related Work
2.1. Semantic Segmentation
2.2. Unsupervised Domain Adaptation
2.3. High Dynamic Range Image
3. Method
3.1. Multi-Exposure Feature Extraction for HDR Images
3.2. Source Domain Supervised Training
3.3. Target Domain Unsupervised Training
3.4. Domain Discrimination Training
4. Experimental Section
4.1. Datasets
4.1.1. Cityscapes
4.1.2. BDD100K
4.1.3. CeyMo
4.1.4. VPGNet
4.1.5. RLMD-AC
4.1.6. Evaluation Metric
4.2. Implementation
4.3. Results
4.4. Ablation Study
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Meaning |
|---|---|
| TLD | Traffic Light Detection |
| C2I | Car-to-Infrastructure |
| CNN | Convolutional Neural Network |
| SOTA | State-Of-The-Art |
| SimAM | Simple Attention Module |
| ECA | Efficient Channel Attention Mechanism |
| CIoU | Complete Intersection over Union |
| EIoU | Efficient Intersection over Union |
| HSM | Hard Sample Mining |
| GFLOPs | Giga Floating-point Operations Per Second |
References
- Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3431–3440. [Google Scholar]
- Xie, E.; Wang, W.; Yu, Z.; Anandkumar, A.; Alvarez, J.M.; Luo, P. SegFormer: Simple and efficient design for semantic segmentation with transformers. Adv. Neural Inf. Process. Syst. 2021, 34, 12077–12090. [Google Scholar]
- Gidaris, S.; Komodakis, N. Object detection via a multi-region and semantic segmentation-aware CNN model. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1134–1142. [Google Scholar]
- Tokunaga, H.; Teramoto, Y.; Yoshizawa, A.; Bise, R. Adaptive weighting multi-field-of-view CNN for semantic segmentation in pathology. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 12597–12606. [Google Scholar]
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16 × 16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929. [Google Scholar]
- Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 10012–10022. [Google Scholar]
- Wang, J.; Liao, X.; Wang, Y.; Zeng, X.; Ren, X.; Yue, H.; Qu, W. M-SKSNet: Multi-scale spatial kernel selection for image segmentation of damaged road markings. Remote Sens. 2024, 16, 1476. [Google Scholar] [CrossRef]
- Hao, S.; Wu, H.; Du, C.; Zeng, X.; Ji, Z.; Zhang, X.; Ganchev, I. CACDU-Net: A novel DoubleU-Net based semantic segmentation model for skin lesions detection in images. IEEE Access 2023, 11, 82449–82463. [Google Scholar] [CrossRef]
- Zhang, X.; Li, L.; Bian, Z.; Dai, C.; Ji, Z.; Liu, J. RDL-YOLO: A Method for the Detection of Leaf Pests and Diseases in Cotton Based on YOLOv11. Agronomy 2025, 15, 1989. [Google Scholar] [CrossRef]
- Hou, Y.; Ma, Z.; Liu, C.; Hui, T.W.; Loy, C.C. Inter-region affinity distillation for road marking segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 12486–12495. [Google Scholar]
- Wu, J.; Liu, W.; Maruyama, Y. Automated road-marking segmentation via a multiscale attention-based dilated convolutional neural network using the road marking dataset. Remote Sens. 2022, 14, 4508. [Google Scholar] [CrossRef]
- Hsiao, H.C.; Cai, Y.C.; Lin, H.Y.; Chiu, W.C.; Chan, C.T.; Wang, C.C. FuseRoad: Enhancing Lane Shape Prediction Through Semantic Knowledge Integration and Cross-Dataset Training. In Proceedings of the 2025 IEEE Intelligent Vehicles Symposium (IV), Cluj-Napoca, Romania, 22–25 June 2025; pp. 897–902. [Google Scholar]
- Hoyer, L.; Dai, D.; Van Gool, L. DAFormer: Improving network architectures and training strategies for domain-adaptive semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 9924–9935. [Google Scholar]
- Hoyer, L.; Dai, D.; Van Gool, L. HRDA: Context-aware high-resolution domain-adaptive semantic segmentation. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; pp. 372–391. [Google Scholar]
- Hoyer, L.; Dai, D.; Wang, H.; Van Gool, L. MIC: Masked image consistency for context-enhanced domain adaptation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 11721–11732. [Google Scholar]
- Xie, B.; Li, S.; Li, M.; Liu, C.H.; Huang, G.; Wang, G. SePiCo: Semantic-guided pixel contrast for domain adaptive semantic segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 9004–9021. [Google Scholar] [CrossRef] [PubMed]
- Chen, L.C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 834–848. [Google Scholar] [CrossRef] [PubMed]
- Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial networks. Commun. ACM 2020, 63, 139–144. [Google Scholar] [CrossRef]
- Vu, T.H.; Jain, H.; Bucher, M.; Cord, M.; Pérez, P. Advent: Adversarial entropy minimization for domain adaptation in semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 2517–2526. [Google Scholar]
- Wang, H.; Shen, T.; Zhang, W.; Duan, L.Y.; Mei, T. Classes matter: A fine-grained adversarial approach to cross-domain semantic segmentation. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; pp. 642–659. [Google Scholar]
- Cai, Y.C.; Hsiao, H.C.; Chiu, W.C.; Lin, H.Y.; Chan, C.T. RMSeg-UDA: Unsupervised Domain Adaptation for Road Marking Segmentation Under Adverse Conditions. In Proceedings of the 2025 IEEE International Conference on Robotics and Automation (ICRA), Atlanta, GA, USA, 19–23 May 2025; pp. 13471–13477. [Google Scholar]
- Hsiao, H.C.; Cai, Y.C.; Lin, H.Y.; Chiu, W.C.; Chan, C.T. RLMD: A Dataset for Road Marking Segmentation. In Proceedings of the 2023 International Conference on Consumer Electronics-Taiwan (ICCE-Taiwan), PingTung, Taiwan, 17–19 July 2023; pp. 427–428. [Google Scholar]
- Olsson, V.; Tranheden, W.; Pinto, J.; Svensson, L. ClassMix: Segmentation-based data augmentation for semi-supervised learning. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 3–8 January 2021; pp. 1369–1378. [Google Scholar]
- Tranheden, W.; Olsson, V.; Pinto, J.; Svensson, L. DACS: Domain adaptation via cross-domain mixed sampling. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 3–8 January 2021; pp. 1379–1389. [Google Scholar]
- Wang, J.G.; Zhou, L.; Song, Z.; Yuan, M. Real-time vehicle signal lights recognition with HDR camera. In Proceedings of the 2016 IEEE International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData), Chengdu, China, 15–18 December 2016; pp. 355–358. [Google Scholar]
- Wang, J.G.; Zhou, L.B. Traffic light recognition with high dynamic range imaging and deep learning. IEEE Trans. Intell. Transp. Syst. 2018, 20, 1341–1352. [Google Scholar] [CrossRef]
- Kocdemir, I.H.; Akyuz, A.O.; Koz, A.; Chalmers, A.; Alatan, A.; Kalkan, S. Object detection for autonomous driving: High-dynamic range vs. low-dynamic range images. In Proceedings of the 2022 IEEE 24th International Workshop on Multimedia Signal Processing (MMSP), Shanghai, China, 26–28 September 2022; pp. 1–5. [Google Scholar]
- Weiher, M. Domain Adaptation of HDR Training Data for Semantic Road Scene Segmentation by Deep Learning. 2019. Available online: https://mediatum.ub.tum.de/1525857 (accessed on 23 November 2025).
- Huang, T.; Song, S.; Liu, Q.; He, W.; Zhu, Q.; Hu, H. A novel multi-exposure fusion approach for enhancing visual semantic segmentation of autonomous driving. Proc. Inst. Mech. Eng. Part D J. Automob. Eng. 2023, 237, 1652–1667. [Google Scholar] [CrossRef]
- Singh, K.; Parihar, A.S. MRN-LOD: Multi-exposure refinement network for low-light object detection. J. Vis. Commun. Image Represent. 2024, 99, 104079. [Google Scholar] [CrossRef]
- Onzon, E.; Bömer, M.; Mannan, F.; Heide, F. Neural exposure fusion for high-dynamic range object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 16–22 June 2024; pp. 17564–17573. [Google Scholar]
- Cordts, M.; Omran, M.; Ramos, S.; Rehfeld, T.; Enzweiler, M.; Benenson, R.; Franke, U.; Roth, S.; Schiele, B. The Cityscapes dataset for semantic urban scene understanding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 3213–3223. [Google Scholar]
- Yu, F.; Chen, H.; Wang, X.; Xian, W.; Chen, Y.; Liu, F.; Madhavan, V.; Darrell, T. BDD100K: A diverse driving dataset for heterogeneous multitask learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 2636–2645. [Google Scholar]
- Jayasinghe, O.; Hemachandra, S.; Anhettigama, D.; Kariyawasam, S.; Rodrigo, R.; Jayasekara, P. CeyMo: See more on roads - a novel benchmark dataset for road marking detection. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 3–8 January 2022; pp. 3104–3113. [Google Scholar]
- Lee, S.; Kim, J.; Shin Yoon, J.; Shin, S.; Bailo, O.; Kim, N.; Lee, T.H.; Seok Hong, H.; Han, S.H.; So Kweon, I. VPGNet: Vanishing point guided network for lane and road marking detection and recognition. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 1947–1955. [Google Scholar]
- Liu, Y.L.; Lai, W.S.; Chen, Y.S.; Kao, Y.L.; Yang, M.H.; Chuang, Y.Y.; Huang, J.B. Single-image HDR reconstruction by learning to reverse the camera pipeline. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 1651–1660. [Google Scholar]





| Task | Method | Cityscapes & BDD100K: Clear ↑ | Cityscapes & BDD100K: Night ↑ | Cityscapes & BDD100K: Rainy ↑ | CeyMo: Clear ↑ | CeyMo: Night ↑ | CeyMo: Rainy ↑ |
|---|---|---|---|---|---|---|---|
| Clear → Night | Baseline | - | - | - | 71.33 | 36.29 | - |
| Clear → Night | HDRSeg-UDA | - | - | - | 76.99 | 75.44 | - |
| Clear → Rainy | Baseline | - | - | - | 71.33 | - | 61.70 |
| Clear → Rainy | HDRSeg-UDA | - | - | - | 76.42 | - | 80.56 |
| Clear → Mixed | Baseline | 57.83 | 21.47 | 34.74 | 71.33 | 36.29 | 61.70 |
| Clear → Mixed | MIC [15] | 61.22 | 29.18 | 38.73 | 68.28 | 59.12 | 73.68 |
| Clear → Mixed | HDRSeg-UDA | 65.55 | 28.33 | 40.13 | 77.03 | 74.55 | 79.21 |

| Task | Method | VPGNet: Clear ↑ | VPGNet: Night ↑ | VPGNet: Rainy ↑ | RLMD-AC: Clear ↑ | RLMD-AC: Night ↑ | RLMD-AC: Rainy ↑ |
|---|---|---|---|---|---|---|---|
| Clear → Night | Baseline | 33.98 | 29.29 | - | 52.36 | 28.53 | - |
| Clear → Night | HDRSeg-UDA | 34.01 | 32.91 | - | 55.99 | 37.72 | - |
| Clear → Rainy | Baseline | 33.98 | - | 29.59 | 52.36 | - | 32.32 |
| Clear → Rainy | HDRSeg-UDA | 34.35 | - | 36.82 | 53.83 | - | 37.79 |
| Clear → Mixed | Baseline | 33.98 | 29.29 | 29.59 | 52.36 | 28.53 | 32.32 |
| Clear → Mixed | MIC [15] | 32.79 | 23.15 | 30.75 | 53.03 | 35.43 | 39.60 |
| Clear → Mixed | HDRSeg-UDA | 35.80 | 35.58 | 37.28 | 57.88 | 40.06 | 40.93 |

| Method | Clear ↑ | Night ↑ | Rainy ↑ |
|---|---|---|---|
| Baseline | 52.62 | 25.68 | 32.86 |
| Base + Dual-Encoder | 53.47 | 25.58 | 33.63 |
| Base + Triple-Encoder | 54.12 | 29.06 | 33.53 |
| Base + DE + Self-Training | 54.67 | 26.27 | 36.52 |
| Base + DE + ST + Discriminator | 54.97 | 31.66 | 38.04 |
| Base + DE + ST + Dis + ClassMix | 57.88 | 40.06 | 40.93 |
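The per-condition scores in the tables above are not explicitly defined in this excerpt; assuming the metric of Section 4.1.6 is mean intersection-over-union (the common choice for road marking segmentation), the sketch below shows how such a score is computed from per-pixel predictions. The class count and ignore index are placeholders.

```python
import numpy as np

def mean_iou(pred, gt, num_classes, ignore_index=255):
    """Illustrative mean-IoU computation from per-pixel predictions;
    assumes an IoU-based metric, which this excerpt does not confirm."""
    valid = gt != ignore_index
    pred, gt = pred[valid], gt[valid]
    # Confusion matrix via a flat bincount over (gt, pred) class pairs.
    conf = np.bincount(num_classes * gt + pred,
                       minlength=num_classes ** 2).reshape(num_classes, num_classes)
    inter = np.diag(conf).astype(np.float64)
    union = conf.sum(axis=0) + conf.sum(axis=1) - inter
    iou = inter[union > 0] / union[union > 0]   # skip classes absent from both maps
    return 100.0 * iou.mean()

if __name__ == "__main__":
    gt = np.random.randint(0, 4, (256, 512))
    pred = np.random.randint(0, 4, (256, 512))
    print(f"mIoU: {mean_iou(pred, gt, num_classes=4):.2f}")
```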
| Datasets | Weather | SDR (8-bit) ↑ | HDR (16/24-bit) ↑ |
|---|---|---|---|
| Cityscapes | Clear | 62.09 | 65.55 |
| RLMD-AC | Night | 29.30 | 32.26 |
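The table above contrasts 8-bit SDR inputs with 16/24-bit HDR inputs. The exact SDR conversion used for that comparison is not restated in this excerpt; a common way to derive an 8-bit SDR frame from linear HDR radiance is global tone mapping followed by gamma encoding and quantization, sketched below under those assumptions.

```python
import numpy as np

def hdr_to_sdr_8bit(hdr, gamma=1.0 / 2.2):
    """Illustrative HDR -> 8-bit SDR conversion: Reinhard-style global
    tone mapping, gamma encoding, and quantization. The paper's actual
    SDR pipeline may differ; this is an assumption."""
    tone_mapped = hdr / (1.0 + hdr)                  # compress radiance to [0, 1)
    encoded = np.clip(tone_mapped, 0.0, 1.0) ** gamma
    return (encoded * 255.0 + 0.5).astype(np.uint8)  # round and quantize

if __name__ == "__main__":
    hdr = np.random.rand(256, 512, 3).astype(np.float32) * 10.0  # linear radiance
    sdr = hdr_to_sdr_8bit(hdr)
    print(sdr.dtype, sdr.min(), sdr.max())
```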