PF-ConvNeXt: An Adverse Weather Recognition Network for Autonomous Driving Scenes
Abstract
1. Introduction
- (1) We propose PF-ConvNeXt for adverse-weather recognition. Using ConvNeXt as the backbone, we build an enhanced network that better captures the degradation patterns caused by rain, snow, fog, and dust, enabling stable multi-class recognition.
- (2) We propose a feature enhancement channel and spatial attention (FECS) module that adaptively recalibrates features through two complementary paths: channel semantics and spatial locations. This emphasizes weather-related responses while suppressing interference from complex backgrounds and noise, improving cross-scene robustness.
- (3) We introduce a lightweight pyramid split attention (PSA) module for multi-scale feature fusion. Through multi-scale splitting and cross-scale interaction, it captures both global haze-like scattering patterns and fine-grained cues such as rain streaks and snow textures, strengthening the representation of weather cues across different intensities and scales. In addition, Focal Loss is adopted to emphasize hard samples and minority classes, alleviating class imbalance.
- (4) We build a dataset by integrating RTTS, DAWN, and a self-collected rainy-weather dataset, and expand it to 5000 samples via augmentation. Extensive ablation and comparative experiments validate the effectiveness of each component; the results show that the proposed method outperforms multiple mainstream models in accuracy and precision.
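This excerpt does not spell out the FECS equations, but the two complementary recalibration paths it describes (channel semantics, then spatial locations) can be sketched in a CBAM-style form. The tiny MLP weights and the spatial gate below are untrained placeholders standing in for learned parameters, not the paper's actual module:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, r=4):
    """Path 1: squeeze spatial dims, re-weight channels.

    x: feature map of shape (C, H, W); r: channel reduction ratio.
    The two weight matrices are placeholders for a learned 2-layer MLP.
    """
    c = x.shape[0]
    avg = x.mean(axis=(1, 2))                 # (C,) average-pooled descriptor
    mx = x.max(axis=(1, 2))                   # (C,) max-pooled descriptor
    w1 = np.ones((c // r, c)) / c             # placeholder reduction weights
    w2 = np.ones((c, c // r)) / (c // r)      # placeholder expansion weights
    scale = sigmoid(w2 @ np.maximum(w1 @ avg, 0) + w2 @ np.maximum(w1 @ mx, 0))
    return x * scale[:, None, None]           # per-channel gate in (0, 1)

def spatial_attention(x):
    """Path 2: pool over channels, re-weight spatial locations."""
    avg = x.mean(axis=0, keepdims=True)       # (1, H, W)
    mx = x.max(axis=0, keepdims=True)         # (1, H, W)
    scale = sigmoid(avg + mx)                 # stand-in for a learned conv gate
    return x * scale

x = np.random.randn(8, 4, 4)
y = spatial_attention(channel_attention(x))   # sequential two-path recalibration
assert y.shape == x.shape
```

Because both gates are sigmoids in (0, 1), the module can only attenuate responses, which is how background and noise interference get suppressed relative to weather-related activations.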
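The Focal Loss adopted above is the standard formulation of Lin et al., FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class and gamma down-weights easy examples. A minimal NumPy sketch follows; the alpha = 0.25 default is an assumption borrowed from the original paper, as this excerpt does not state the class weighting used:

```python
import numpy as np

def focal_loss(probs, targets, gamma=2.0, alpha=0.25):
    """Multi-class focal loss on predicted class probabilities.

    probs:   (N, K) softmax outputs
    targets: (N,) integer class labels
    gamma:   focusing parameter; gamma = 0 recovers (alpha-scaled) cross-entropy
    alpha:   assumed weighting factor (not specified in this excerpt)
    """
    p_t = probs[np.arange(len(targets)), targets]  # probability of the true class
    p_t = np.clip(p_t, 1e-12, 1.0)                 # numerical safety for log
    return float(np.mean(-alpha * (1.0 - p_t) ** gamma * np.log(p_t)))

# A hard example (low p_t) contributes far more than an easy one.
probs = np.array([[0.9, 0.1], [0.4, 0.6]])
targets = np.array([0, 0])
assert focal_loss(probs[1:2], targets[1:2]) > focal_loss(probs[:1], targets[:1])
```

The modulating factor (1 - p_t)^gamma is what emphasizes hard samples and minority classes: confidently classified majority-class images are multiplied toward zero, so the gradient is dominated by the examples the model still gets wrong.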
2. Related Work
2.1. CNN-Based Image Classification Methods
2.2. Transformer-Based Image Classification Methods
2.3. CNN-Transformer Hybrid Methods for Image Classification
3. Method
3.1. ConvNeXt
3.2. PF-ConvNeXt
3.3. PSA
3.4. FECS
3.5. Focal Loss
4. Experiment
4.1. Experimental Design
4.2. Dataset
4.3. Evaluation Metrics
4.4. Ablation Analysis
4.5. Results of Comparative Experiments
4.5.1. Sensitivity Analysis of γ in Focal Loss
4.5.2. Experimental Comparison with Other Models
5. Discussion
5.1. Limitations and Generalization Considerations
5.2. Failure Cases and Error Analysis
5.3. Comparison with Related Studies and Practical Considerations
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324.
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90.
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
- Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1–9.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. arXiv 2017, arXiv:1706.03762.
- Liu, Z.; Mao, H.; Wu, C.Y.; Feichtenhofer, C.; Darrell, T.; Xie, S. A ConvNet for the 2020s. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 19–24 June 2022; pp. 11976–11986.
- Zhang, Y.; Carballo, A.; Yang, H.; Takeda, K. Perception and sensing for autonomous vehicles under adverse weather conditions: A survey. ISPRS J. Photogramm. Remote Sens. 2023, 196, 146–177.
- Gupta, H.; Kotlyar, O.; Andreasson, H.; Lilienthal, A.J. Video weather recognition (VARG): An intensity-labeled video weather recognition dataset. J. Imaging 2024, 10, 281.
- Karvat, M.; Givigi, S. Adver-city: Open-source multi-modal dataset for collaborative perception under adverse weather conditions. arXiv 2024, arXiv:2410.06380.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708.
- Shigesawa, A.; Yagi, M.; Takahashi, S.; Takedomi, S.; Mori, T. Winter road surface condition recognition in snowy regions based on image-to-image translation. Sensors 2025, 26, 241.
- Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv 2017, arXiv:1704.04861.
- Chung, Y.L. Efficient lightweight image classification via coordinate attention and channel pruning for resource-constrained systems. Future Internet 2025, 17, 489.
- Manivannan, P.; Sathyaprakash, P.; Jayakumar, V.; Chandrasekaran, J.; Ananthanarayanan, B.; Sayeed, M. Weather classification for autonomous vehicles under adverse conditions using multi-level knowledge distillation. Comput. Mater. Contin. 2024, 81, 4327.
- Introvigne, M.; Ramazzina, A.; Walz, S.; Scheuble, D.; Bijelic, M. Real-time environment condition classification for autonomous vehicles. In Proceedings of the 2024 IEEE Intelligent Vehicles Symposium, Sorrento, Italy, 2–6 June 2024; pp. 1527–1533.
- Dosovitskiy, A. An image is worth 16×16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929.
- Touvron, H.; Cord, M.; Douze, M.; Massa, F.; Sablayrolles, A.; Jégou, H. Training data-efficient image transformers & distillation through attention. In Proceedings of the International Conference on Machine Learning, Virtual, 18–24 July 2021; pp. 10347–10357.
- Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Virtual, 11–17 October 2021; pp. 10012–10022.
- Cui, J.; Chen, Y.; Wu, Z.; Wu, H.; Wu, W. A driver behavior detection model for human-machine co-driving systems based on an improved Swin Transformer. World Electr. Veh. J. 2024, 16, 7.
- Wang, W.; Xie, E.; Li, X.; Fan, D.P.; Song, K.; Liang, D.; Lu, T.; Luo, P.; Shao, L. Pyramid vision transformer: A versatile backbone for dense prediction without convolutions. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Virtual, 11–17 October 2021; pp. 568–578.
- Chen, S.; Shu, T.; Zhao, H.; Tang, Y.Y. MASK-CNN-Transformer for real-time multi-label weather recognition. Knowl.-Based Syst. 2023, 278, 110881.
- Wu, H.; Xiao, B.; Codella, N.; Liu, M.; Dai, X.; Yuan, L.; Zhang, L. CvT: Introducing convolutions to vision transformers. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Virtual, 11–17 October 2021; pp. 22–31.
- Dai, Z.; Liu, H.; Le, Q.V.; Tan, M. CoAtNet: Marrying convolution and attention for all data sizes. In Proceedings of the Advances in Neural Information Processing Systems, Virtual, 6–14 December 2021; pp. 3965–3977.
- Mehta, S.; Rastegari, M. MobileViT: Light-weight, general-purpose, and mobile-friendly vision transformer. arXiv 2021, arXiv:2110.02178.
- Aloufi, N.; Alnori, A.; Basuhail, A. Enhancing autonomous vehicle perception in adverse weather: A multi objectives model for integrated weather classification and object detection. Electronics 2024, 13, 3063.
- Marathe, A.; Ramanan, D.; Walambe, R.; Kotecha, K. Wedge: A multi-weather autonomous driving dataset built from generative vision-language models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 3318–3327.
- Zhang, H.; Zu, K.; Lu, J.; Zou, Y.; Meng, D. EPSANet: An efficient pyramid squeeze attention block on convolutional neural network. In Proceedings of the Asian Conference on Computer Vision, Macau, China, 4–8 December 2022; pp. 1161–1177.
- Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal Loss for dense object detection. In Proceedings of the 2017 IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2980–2988.
- Li, B.; Ren, W.; Fu, D. Benchmarking single-image dehazing and beyond. IEEE Trans. Image Process. 2018, 28, 492–505.
- Kenk, M.A.; Hassaballah, M. DAWN: Vehicle detection in adverse weather nature dataset. arXiv 2020, arXiv:2008.05402.
- Chen, J.; Kao, S.H.; He, H.; Zhuo, W.; Wen, S.; Lee, C.H.; Chan, S.H.G. Run, don’t walk: Chasing higher FLOPS for faster neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 12021–12031.
- Zhou, C.; Zhang, H.; Zhou, Z.; Yu, L.; Huang, L.; Fan, X.; Tian, Y. QKFormer: Hierarchical spiking transformer using QK attention. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 9–15 December 2024; pp. 13074–13098.
- Shin, H.; Choi, D.W. Teacher as a lenient expert: Teacher-agnostic data-free knowledge distillation. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 20–27 February 2024; pp. 14991–14999.
- Qiu, X.; Zhu, R.J.; Chou, Y.; Wang, Z.; Deng, L.J.; Li, G. Gated attention coding for training high-performance and efficient spiking neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 20–27 February 2024; pp. 601–610.
- Wu, X.; Gao, S.; Zhang, Z.; Li, Z.; Bao, R.; Zhang, Y.; Wang, X.; Huang, H. Auto-Train-Once: Controller network guided automatic network pruning from scratch. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 17–21 June 2024; pp. 16163–16173.

| Software/Hardware | Specification |
|---|---|
| Operating System | Windows 10 |
| CPU | Intel(R) Core(TM) i7-13650HX |
| GPU | NVIDIA GeForce RTX 4060 |
| Memory | 24 GB |
| Programming Language | Python 3.10 |
| Deep Learning Framework | PyTorch 2.0.1 |
| Parallel Computing Platform | CUDA 11.8 |
| Class | Raw Images | After Augmentation | Train (80%) | Test (20%) |
|---|---|---|---|---|
| Fog (RTTS) | 500 | 1667 | 1333 | 334 |
| Rain (Self-collected) | 500 | 1667 | 1334 | 333 |
| Sandstorm (DAWN) | 300 | 1000 | 800 | 200 |
| Snow (DAWN) | 200 | 666 | 533 | 133 |
| Total | 1500 | 5000 | 4000 | 1000 |
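The 80/20 split shown above is applied within each weather class, so the ratio holds per category rather than only in aggregate. A stdlib-only sketch follows; the class names and counts come from the table, but the rounding rule is an assumption, so individual per-class counts may differ by one from the published figures:

```python
import random

def stratified_split(samples_by_class, train_frac=0.8, seed=0):
    """Split each class independently so the train/test ratio holds per class.

    samples_by_class: mapping of class name -> iterable of samples.
    Returns (train, test) lists of (class, sample) pairs.
    """
    rng = random.Random(seed)  # fixed seed for a reproducible split
    train, test = [], []
    for cls, items in samples_by_class.items():
        items = list(items)
        rng.shuffle(items)
        cut = round(train_frac * len(items))  # assumed rounding rule
        train += [(cls, s) for s in items[:cut]]
        test += [(cls, s) for s in items[cut:]]
    return train, test

# Per-class counts after augmentation, taken from the dataset table.
counts = {"fog": 1667, "rain": 1667, "sandstorm": 1000, "snow": 666}
data = {c: range(n) for c, n in counts.items()}
train, test = stratified_split(data)
assert len(train) + len(test) == 5000
```

Splitting per class matters here because the classes are imbalanced (666 snow vs. 1667 fog samples): a single global 80/20 shuffle could leave the smallest class under-represented in the test set.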
| Model | PSA | FECS | Focal Loss | Accuracy (%) | Precision (%) | F1 (%) | Params (M) | GFLOPs |
|---|---|---|---|---|---|---|---|---|
| ConvNeXt (baseline) | | | | 85.42 | 89.78 | 86.23 | 28.0 | 4.5 |
| | √ | | | 86.84 | 91.02 | 88.12 | 28.3 | 4.6 |
| | | √ | | 87.32 | 91.66 | 88.69 | 28.5 | 4.8 |
| | | | √ | 86.23 | 90.45 | 87.35 | 28.0 | 4.5 |
| | √ | √ | | 88.17 | 92.96 | 89.67 | 28.9 | 4.9 |
| | √ | | √ | 89.23 | 94.05 | 90.86 | 28.3 | 4.6 |
| PF-ConvNeXt | √ | √ | √ | 90.16 | 95.24 | 92.18 | 30.1 | 5.0 |
| γ | Accuracy (%) | Precision (%) | F1 (%) |
|---|---|---|---|
| 0 | 84.80 | 88.90 | 86.01 |
| 1 | 85.60 | 89.80 | 86.80 |
| 2 | 86.23 | 90.45 | 87.35 |
| 3 | 85.90 | 90.00 | 87.04 |
| 4 | 85.52 | 89.56 | 86.45 |
| Model | Accuracy (%) | Precision (%) | F1 (%) | Params (M) | GFLOPs | FPS |
|---|---|---|---|---|---|---|
| ConvNeXt | 85.42 | 89.78 | 86.23 | 28.0 | 4.5 | 90 |
| ResNet50 | 86.59 | 91.06 | 87.67 | 25.5 | 4.1 | 115 |
| MobileNet | 87.03 | 92.94 | 88.35 | 4.2 | 1.1 | 96 |
| MobileViT | 86.12 | 91.64 | 87.23 | 5.6 | 1.8 | 56 |
| FasterNet | 86.78 | 92.53 | 87.96 | 31.1 | 4.5 | 92 |
| QKFormer | 89.82 | 95.04 | 91.86 | 16.5 | 3.0 | 33 |
| TLENet | 88.24 | 94.12 | 90.37 | 22.0 | 3.8 | 24 |
| GAC-SNN | 89.11 | 94.65 | 91.24 | 18.0 | 3.3 | 28 |
| ATONet | 87.42 | 93.13 | 89.14 | 12.3 | 2.2 | 40 |
| PF-ConvNeXt | 90.16 | 95.24 | 92.18 | 30.1 | 5.0 | 78 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Wang, Q.; Zhou, Z.; Zhang, Z. PF-ConvNeXt: An Adverse Weather Recognition Network for Autonomous Driving Scenes. Electronics 2026, 15, 920. https://doi.org/10.3390/electronics15050920
