Semantic-Enhanced Bidirectional Multimodal Fusion for 3D Object Detection Under Adverse Weather
Abstract
1. Introduction
1. We propose SeBFusion to address adverse-weather noise and multimodal information dilution via semantic enhancement and bidirectional fusion.
2. We design a virtual point generation and camera semantic injection mechanism to selectively map image semantics into 3D space, enhancing point-cloud feature representations.
3. We propose a confidence-aware bidirectional cross-attention fusion module that adaptively regulates bidirectional information flow to suppress noise propagation and improve fusion robustness.
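The camera semantic injection idea in contribution 2 can be sketched as follows. This is an illustrative implementation, not the paper's actual module: the function name `inject_camera_semantics`, the pinhole projection, and the zero-padding of out-of-view points are all assumptions about how image semantics might be gathered onto LiDAR points via the camera calibration.

```python
import numpy as np

def inject_camera_semantics(points, sem_map, K):
    """Append per-pixel semantic scores to LiDAR points (illustrative).

    points : (N, 3) XYZ coordinates, assumed already in the camera frame.
    sem_map: (H, W, C) per-pixel class scores from an image segmenter.
    K      : (3, 3) pinhole camera intrinsics.
    Returns (N, 3 + C); points projecting outside the image (or behind
    the camera) receive zero semantic channels.
    """
    N = points.shape[0]
    H, W, C = sem_map.shape
    # Project to pixel coordinates (homogeneous divide by depth).
    uvw = points @ K.T                       # (N, 3)
    z = uvw[:, 2:3]
    uv = uvw[:, :2] / np.clip(z, 1e-6, None)
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    # Keep only points in front of the camera and inside the image.
    valid = (z[:, 0] > 0) & (u >= 0) & (u < W) & (v >= 0) & (v < H)
    sem = np.zeros((N, C))
    sem[valid] = sem_map[v[valid], u[valid]]  # gather semantics per point
    return np.concatenate([points, sem], axis=1)
```

In the actual method the injection is selective (contribution 2 speaks of *selectively* mapping semantics), so one would expect an additional filtering or weighting step beyond this plain projection-and-gather.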
2. Related Work
2.1. Enhancing Data Quality Under Adverse Weather
2.2. Improving Multimodal Fusion Mechanisms for Robust 3D Detection
3. Method
3.1. Overall Framework
3.2. Semantic LiDAR Feature Generation Module
3.2.1. Virtual Point Generation
3.2.2. Camera Semantic Injection
3.3. Bidirectional Cross-Attention Fusion Module
4. Experiments
4.1. Dataset
4.2. Implementation Details
4.3. Comparison with Other Methods
4.4. Ablation Study
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Wan, R.; Zhao, T.; Lu, W. Robust 3D sparse object detection through multi-modal framework with dynamic feature encoding and hierarchical object-guided feature enhancement. Inf. Fusion 2026, 126, 103648. [Google Scholar] [CrossRef]
- Wang, Z.; Huang, Z.; Gao, Y.; Wang, N.; Liu, S. MV2DFusion: Leveraging Modality-Specific Object Semantics for Multi-Modal 3D Detection. IEEE Trans. Pattern Anal. Mach. Intell. 2026, 48, 609–623. [Google Scholar] [CrossRef] [PubMed]
- Huang, X.; Xu, Z.; Wu, H.; Wang, J.; Xia, Q.; Xia, Y.; Li, J.; Gao, K.; Wen, C.; Wang, C. L4DR: LiDAR-4DRadar Fusion for Weather-Robust 3D Object Detection. In Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, Philadelphia, PA, USA, 25 February–4 March 2025; Walsh, T., Shah, J., Kolter, Z., Eds.; AAAI Press: Menlo Park, CA, USA, 2025; pp. 3806–3814. [Google Scholar] [CrossRef]
- Mudavath, T.; Mamidi, A. Object detection challenges: Navigating through varied weather conditions—A comprehensive survey. J. Ambient Intell. Humaniz. Comput. 2025, 16, 443–457. [Google Scholar] [CrossRef]
- Chen, Z.; Zhang, Z.; Su, Q.; Yang, K.; Wu, Y.; He, L.; Tang, X. Object detection for autonomous vehicles under adverse weather conditions. Expert Syst. Appl. 2026, 296, 128994. [Google Scholar] [CrossRef]
- Xu, J.; Hu, X.; Zhu, L.; Heng, P.A. Unifying Physically-Informed Weather Priors in a Single Model for Image Restoration Across Multiple Adverse Weather Conditions. IEEE Trans. Circuits Syst. Video Technol. 2025, 35, 9575–9591. [Google Scholar] [CrossRef]
- Yoon, J.H.; Jung, J.W.; Yoo, S.B. Equirectangular Point Reconstruction for Domain Adaptive Multimodal 3D Object Detection in Adverse Weather Conditions. In Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, Philadelphia, PA, USA, 25 February–4 March 2025; Walsh, T., Shah, J., Kolter, Z., Eds.; AAAI Press: Menlo Park, CA, USA, 2025; pp. 9553–9561. [Google Scholar] [CrossRef]
- Graf, M.; Steinhauser, D.; Vaculín, O.; Brandmeier, T. Impact of Adverse Weather on Road Safety: A Survey of Test Methods for Enhancing Safety of Automated Vehicles and Sensor Robustness in Challenging Environmental Conditions. IEEE Access 2025, 13, 179817–179838. [Google Scholar] [CrossRef]
- Batten, B.; Lomuscio, A. Improving Weather-based OOD Generalisation in Lidar-based Object Detection Models via Adversarial Training. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Nashville, TN, USA, 11–12 June 2025; pp. 4360–4368. [Google Scholar]
- Xing, L.; Ye, J.; Deng, K.; Wu, H.; Ma, H.; Gao, J. Cerberus: Accurate Real-Time Object Detection System Under Adverse Weather Conditions via Multimodal Fusion. IEEE Internet Things J. 2025, 12, 52837–52849. [Google Scholar] [CrossRef]
- Valanarasu, J.M.J.; Yasarla, R.; Patel, V.M. TransWeather: Transformer-based Restoration of Images Degraded by Adverse Weather Conditions. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, LA, USA, 18–24 June 2022; IEEE: New York, NY, USA, 2022; pp. 2343–2353. [Google Scholar] [CrossRef]
- Zamir, S.W.; Arora, A.; Khan, S.; Hayat, M.; Khan, F.S.; Yang, M. Restormer: Efficient Transformer for High-Resolution Image Restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, LA, USA, 18–24 June 2022; IEEE: New York, NY, USA, 2022; pp. 5718–5729. [Google Scholar] [CrossRef]
- Cui, Y.; Ren, W.; Cao, X.; Knoll, A. Focal Network for Image Restoration. In Proceedings of the IEEE/CVF International Conference on Computer Vision, ICCV 2023, Paris, France, 1–6 October 2023; IEEE: New York, NY, USA, 2023; pp. 12955–12965. [Google Scholar] [CrossRef]
- Hahner, M.; Sakaridis, C.; Dai, D.; Gool, L.V. Fog Simulation on Real LiDAR Point Clouds for 3D Object Detection in Adverse Weather. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021, Montreal, QC, Canada, 10–17 October 2021; IEEE: New York, NY, USA, 2021; pp. 15263–15272. [Google Scholar] [CrossRef]
- Hahner, M.; Sakaridis, C.; Bijelic, M.; Heide, F.; Yu, F.; Dai, D.; Gool, L.V. LiDAR Snowfall Simulation for Robust 3D Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, LA, USA, 18–24 June 2022; IEEE: New York, NY, USA, 2022; pp. 16343–16353. [Google Scholar] [CrossRef]
- Kilic, V.; Hegde, D.; Cooper, A.B.; Patel, V.M.; Foster, M.A. LiDAR Light Scattering Augmentation (LISA): Physics-based Simulation of Adverse Weather Conditions for 3D Object Detection. In Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2025, Hyderabad, India, 6–11 April 2025; IEEE: New York, NY, USA, 2025; pp. 1–5. [Google Scholar] [CrossRef]
- Li, B.; Li, J.; Chen, G.; Wu, H.; Huang, K. De-snowing LiDAR Point Clouds With Intensity and Spatial-Temporal Features. In Proceedings of the 2022 International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA, 23–25 May 2022; pp. 2359–2365. [Google Scholar] [CrossRef]
- Yang, J.; Shi, S.; Wang, Z.; Li, H.; Qi, X. ST3D++: Denoised Self-Training for Unsupervised Domain Adaptation on 3D Object Detection. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 6354–6371. [Google Scholar] [CrossRef] [PubMed]
- Chang, G.; Roh, W.; Jang, S.; Lee, D.; Ji, D.; Oh, G.; Park, J.; Kim, J.; Kim, S. CMDA: Cross-Modal and Domain Adversarial Adaptation for LiDAR-Based 3D Object Detection. In Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, AAAI 2024, Vancouver, BC, Canada, 20–27 February 2024; Wooldridge, M.J., Dy, J.G., Natarajan, S., Eds.; AAAI Press: Menlo Park, CA, USA, 2024; pp. 972–980. [Google Scholar] [CrossRef]
- Huang, X.; Wu, H.; Li, X.; Fan, X.; Wen, C.; Wang, C. Sunshine to Rainstorm: Cross-Weather Knowledge Distillation for Robust 3D Object Detection. In Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, AAAI 2024, Vancouver, BC, Canada, 20–27 February 2024; Wooldridge, M.J., Dy, J.G., Natarajan, S., Eds.; AAAI Press: Menlo Park, CA, USA, 2024; pp. 2409–2416. [Google Scholar] [CrossRef]
- Liu, Y.; Zhang, Y.; Lan, R.; Cheng, C.; Wu, Z. AWARDistill: Adaptive and robust 3D object detection in adverse conditions through knowledge distillation. Expert Syst. Appl. 2025, 266, 126032. [Google Scholar] [CrossRef]
- Song, Z.; Zhang, G.; Liu, L.; Yang, L.; Xu, S.; Jia, C.; Jia, F.; Wang, L. RoboFusion: Towards Robust Multi-Modal 3D Object Detection via SAM. In Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, IJCAI 2024, Jeju, Republic of Korea, 3–9 August 2024; pp. 1272–1280. [Google Scholar]
- Dong, Y.; Kang, C.; Zhang, J.; Zhu, Z.; Wang, Y.; Yang, X.; Su, H.; Wei, X.; Zhu, J. Benchmarking Robustness of 3D Object Detection to Common Corruptions in Autonomous Driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023, Vancouver, BC, Canada, 17–24 June 2023; IEEE: New York, NY, USA, 2023; pp. 1022–1032. [Google Scholar] [CrossRef]
- Zhang, Y.; Sun, Y.; Li, H.; Zheng, S.; Zhu, C.; Yang, L. Benchmarking the Robustness of Deep Neural Networks to Common Corruptions in Digital Pathology. In Proceedings of the Medical Image Computing and Computer Assisted Intervention—MICCAI 2022—25th International Conference, Singapore, 18–22 September 2022; Proceedings, Part II; Lecture Notes in Computer Science; Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S., Eds.; Springer: Berlin/Heidelberg, Germany, 2022; Volume 13432, pp. 242–252. [Google Scholar] [CrossRef]
- Sindagi, V.A.; Zhou, Y.; Tuzel, O. MVX-Net: Multimodal VoxelNet for 3D Object Detection. In Proceedings of the International Conference on Robotics and Automation, ICRA 2019, Montreal, QC, Canada, 20–24 May 2019; IEEE: New York, NY, USA, 2019; pp. 7276–7282. [Google Scholar] [CrossRef]
- Huang, T.; Liu, Z.; Chen, X.; Bai, X. EPNet: Enhancing Point Features with Image Semantics for 3D Object Detection. In Proceedings of the Computer Vision—ECCV 2020—16th European Conference, Glasgow, UK, 23–28 August 2020; Proceedings, Part XV; Lecture Notes in Computer Science; Vedaldi, A., Bischof, H., Brox, T., Frahm, J., Eds.; Springer: Berlin/Heidelberg, Germany, 2020; Volume 12360, pp. 35–52. [Google Scholar] [CrossRef]
- Bai, X.; Hu, Z.; Zhu, X.; Huang, Q.; Chen, Y.; Fu, H.; Tai, C. TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection with Transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, LA, USA, 18–24 June 2022; IEEE: New York, NY, USA, 2022; pp. 1080–1089. [Google Scholar] [CrossRef]
- Liu, Z.; Tang, H.; Amini, A.; Yang, X.; Mao, H.; Rus, D.L.; Han, S. BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird’s-Eye View Representation. In Proceedings of the IEEE International Conference on Robotics and Automation, ICRA 2023, London, UK, 29 May–2 June 2023; IEEE: New York, NY, USA, 2023; pp. 2774–2781. [Google Scholar] [CrossRef]
- Yang, Z.; Chen, J.; Miao, Z.; Li, W.; Zhu, X.; Zhang, L. DeepInteraction: 3D Object Detection via Modality Interaction. In Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022, New Orleans, LA, USA, 28 November–9 December 2022; Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2022. [Google Scholar]
- Xie, Y.; Xu, C.; Rakotosaona, M.; Rim, P.; Tombari, F.; Keutzer, K.; Tomizuka, M.; Zhan, W. SparseFusion: Fusing Multi-Modal Sparse Representations for Multi-Sensor 3D Object Detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, ICCV 2023, Paris, France, 1–6 October 2023; IEEE: New York, NY, USA, 2023; pp. 17545–17556. [Google Scholar] [CrossRef]
- Xu, S.; Zhou, D.; Fang, J.; Yin, J.; Zhou, B.; Zhang, L. FusionPainting: Multimodal Fusion with Adaptive Attention for 3D Object Detection. In Proceedings of the 24th IEEE International Intelligent Transportation Systems Conference, ITSC 2021, Indianapolis, IN, USA, 19–22 September 2021; IEEE: New York, NY, USA, 2021; pp. 3047–3054. [Google Scholar] [CrossRef]
- Nabati, R.; Qi, H. CenterFusion: Center-based Radar and Camera Fusion for 3D Object Detection. In Proceedings of the IEEE Winter Conference on Applications of Computer Vision, WACV 2021, Waikoloa, HI, USA, 3–8 January 2021; IEEE: New York, NY, USA, 2021; pp. 1526–1535. [Google Scholar] [CrossRef]
- Bijelic, M.; Gruber, T.; Mannan, F.; Kraus, F.; Ritter, W.; Dietmayer, K.; Heide, F. Seeing Through Fog Without Seeing Fog: Deep Multimodal Sensor Fusion in Unseen Adverse Weather. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020, Seattle, WA, USA, 13–19 June 2020; Computer Vision Foundation/IEEE: New York, NY, USA, 2020; pp. 11679–11689. [Google Scholar] [CrossRef]
- Palladin, E.; Dietze, R.; Narayanan, P.; Bijelic, M.; Heide, F. SAMFusion: Sensor-Adaptive Multimodal Fusion for 3D Object Detection in Adverse Weather. arXiv 2025, arXiv:2508.16408. [Google Scholar] [CrossRef]
- Tian, Y.; Ye, Q.; Doermann, D. YOLOv12: Attention-Centric Real-Time Object Detectors. arXiv 2025, arXiv:2502.12524. [Google Scholar]
- Carballo, A.; Lambert, J.; Monrroy, A.; Wong, D.R.; Narksri, P.; Kitsukawa, Y.; Takeuchi, E.; Kato, S.; Takeda, K. LIBRE: The Multiple 3D LiDAR Dataset. In Proceedings of the IEEE Intelligent Vehicles Symposium, IV 2020, Las Vegas, NV, USA, 19 October–13 November 2020; IEEE: New York, NY, USA, 2020; pp. 1094–1101. [Google Scholar] [CrossRef]
- Yang, Z.; Sun, Y.; Liu, S.; Jia, J. 3DSSD: Point-Based 3D Single Stage Object Detector. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020, Seattle, WA, USA, 13–19 June 2020; Computer Vision Foundation/IEEE: New York, NY, USA, 2020; pp. 11037–11045. [Google Scholar] [CrossRef]
- Shi, S.; Jiang, L.; Deng, J.; Wang, Z.; Guo, C.; Shi, J.; Wang, X.; Li, H. PV-RCNN++: Point-Voxel Feature Set Abstraction With Local Vector Representation for 3D Object Detection. Int. J. Comput. Vis. 2023, 131, 531–551. [Google Scholar] [CrossRef]
- Yan, Y.; Mao, Y.; Li, B. SECOND: Sparsely Embedded Convolutional Detection. Sensors 2018, 18, 3337. [Google Scholar] [CrossRef] [PubMed]
- Shih, Y.; Liao, W.; Lin, W.; Wong, S.; Wang, C. Reconstruction and Synthesis of Lidar Point Clouds of Spray. IEEE Robot. Autom. Lett. 2022, 7, 3765–3772. [Google Scholar] [CrossRef]
- Lang, A.H.; Vora, S.; Caesar, H.; Zhou, L.; Yang, J.; Beijbom, O. PointPillars: Fast Encoders for Object Detection From Point Clouds. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, 16–20 June 2019; Computer Vision Foundation/IEEE: New York, NY, USA, 2019; pp. 12697–12705. [Google Scholar] [CrossRef]
- Shi, S.; Wang, Z.; Shi, J.; Wang, X.; Li, H. From Points to Parts: 3D Object Detection From Point Cloud With Part-Aware and Part-Aggregation Network. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 2647–2664. [Google Scholar] [CrossRef] [PubMed]
- Liu, Z.; Wu, Z.; Tóth, R. SMOKE: Single-Stage Monocular 3D Object Detection via Keypoint Estimation. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR Workshops 2020, Seattle, WA, USA, 14–19 June 2020; Computer Vision Foundation/IEEE: New York, NY, USA, 2020; pp. 4289–4298. [Google Scholar] [CrossRef]
- Wang, T.; Zhu, X.; Pang, J.; Lin, D. Probabilistic and Geometric Depth: Detecting Objects in Perspective. In Proceedings of the Conference on Robot Learning, London, UK, 8–11 November 2021; Proceedings of Machine Learning Research; Faust, A., Hsu, D., Neumann, G., Eds.; PMLR: New York, NY, USA, 2021; Volume 164, pp. 1475–1485. [Google Scholar]
- Rukhovich, D.; Vorontsova, A.; Konushin, A. ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2022, Waikoloa, HI, USA, 3–8 January 2022; IEEE: New York, NY, USA, 2022; pp. 1265–1274. [Google Scholar] [CrossRef]
- Chen, Y.; Li, Y.; Zhang, X.; Sun, J.; Jia, J. Focal Sparse Convolutional Networks for 3D Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, LA, USA, 18–24 June 2022; IEEE: New York, NY, USA, 2022; pp. 5418–5427. [Google Scholar] [CrossRef]
- Zhu, X.; Ma, Y.; Wang, T.; Xu, Y.; Shi, J.; Lin, D. SSN: Shape Signature Networks for Multi-class Object Detection from Point Clouds. In Proceedings of the Computer Vision—ECCV 2020—16th European Conference, Glasgow, UK, 23–28 August 2020; Proceedings, Part XXV; Lecture Notes in Computer Science; Vedaldi, A., Bischof, H., Brox, T., Frahm, J., Eds.; Springer: Berlin/Heidelberg, Germany, 2020; Volume 12370, pp. 581–597. [Google Scholar] [CrossRef]
- Yin, T.; Zhou, X.; Krähenbühl, P. Center-Based 3D Object Detection and Tracking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2021, Virtual, 19–25 June 2021; Computer Vision Foundation/IEEE: New York, NY, USA, 2021; pp. 11784–11793. [Google Scholar] [CrossRef]
- Wang, T.; Zhu, X.; Pang, J.; Lin, D. FCOS3D: Fully Convolutional One-Stage Monocular 3D Object Detection. arXiv 2021, arXiv:2104.10956. [Google Scholar]
- Wang, Y.; Guizilini, V.; Zhang, T.; Wang, Y.; Zhao, H.; Solomon, J. DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries. In Proceedings of the Conference on Robot Learning, London, UK, 8–11 November 2021; Proceedings of Machine Learning Research; Faust, A., Hsu, D., Neumann, G., Eds.; PMLR: New York, NY, USA, 2021; Volume 164, pp. 180–191. [Google Scholar]
- Li, Z.; Wang, W.; Li, H.; Xie, E.; Sima, C.; Lu, T.; Qiao, Y.; Dai, J. BEVFormer: Learning Bird’s-Eye-View Representation From LiDAR-Camera via Spatiotemporal Transformers. IEEE Trans. Pattern Anal. Mach. Intell. 2025, 47, 2020–2036. [Google Scholar] [CrossRef] [PubMed]
- Chen, X.; Zhang, T.; Wang, Y.; Wang, Y.; Zhao, H. FUTR3D: A Unified Sensor Fusion Framework for 3D Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023—Workshops, Vancouver, BC, Canada, 17–24 June 2023; IEEE: New York, NY, USA, 2023; pp. 172–181. [Google Scholar] [CrossRef]
| Methods | Modality | Clean | Fog | Rain | Snow | Light |
|---|---|---|---|---|---|---|
| 3DSSD [37] | LiDAR | 80.18 | 46.26 | 28.31 | 28.33 | 25.14 |
| PV-RCNN [38] | LiDAR | 85.17 | 79.22 | 52.21 | 53.12 | 80.55 |
| SECOND [39] | LiDAR | 81.15 | 74.63 | 52.12 | 55.81 | 77.21 |
| PointRCNN [40] | LiDAR | 82.23 | 77.15 | 51.02 | 52.64 | 62.08 |
| PointPillars [41] | LiDAR | 78.41 | 64.28 | 36.18 | 36.47 | 62.28 |
| Part-A2 [42] | LiDAR | 82.45 | 71.61 | 41.63 | 42.70 | 76.45 |
| SMOKE [43] | Camera | 7.09 | 5.63 | 3.94 | 2.47 | 6.00 |
| PGD [44] | Camera | 8.10 | 0.87 | 3.06 | 0.63 | 7.07 |
| ImVoxelNet [45] | Camera | 11.49 | 1.34 | 1.24 | 0.22 | 10.08 |
| EPNet [26] | LiDAR + Camera | 85.13 | 44.16 | 40.12 | 34.71 | 69.12 |
| Focals Conv [46] | LiDAR + Camera | 84.16 | 44.15 | 40.12 | 35.23 | 80.75 |
| AWARDistill [21] | LiDAR + Camera | 88.62 | 82.74 | 70.92 | 65.74 | 80.19 |
| Ours | LiDAR + Camera | 89.05 | 83.60 | 74.20 | 69.10 | 82.10 |
| Methods | Modality | Clean | Fog | Rain | Snow | Light |
|---|---|---|---|---|---|---|
| PointPillars [41] | LiDAR | 27.69 | 24.49 | 27.71 | 27.57 | 23.71 |
| SSN [47] | LiDAR | 46.65 | 41.64 | 46.50 | 46.38 | 40.28 |
| CenterPoint [48] | LiDAR | 59.28 | 43.78 | 56.08 | 55.90 | 54.20 |
| FCOS3D [49] | Camera | 23.86 | 13.53 | 13.00 | 2.01 | 17.20 |
| PGD [44] | Camera | 23.19 | 12.83 | 13.51 | 2.30 | 22.77 |
| DETR3D [50] | Camera | 34.71 | 27.89 | 20.39 | 5.08 | 34.66 |
| BEVFormer [51] | Camera | 41.65 | 32.76 | 24.97 | 5.73 | 41.68 |
| FUTR3D [52] | LiDAR + Camera | 64.17 | 53.19 | 58.40 | 52.73 | 57.70 |
| TransFusion [27] | LiDAR + Camera | 66.38 | 53.67 | 65.35 | 63.30 | 55.14 |
| BEVFusion [28] | LiDAR + Camera | 68.45 | 54.10 | 66.13 | 62.84 | 64.42 |
| RoboFusion-B [22] | LiDAR + Camera | 69.40 | 65.54 | 67.01 | 66.07 | 66.71 |
| AWARDistill [21] | LiDAR + Camera | 68.11 | 60.11 | 66.93 | 66.03 | 62.92 |
| Ours | LiDAR + Camera | 69.90 | 66.20 | 67.80 | 66.60 | 67.30 |
| Method | Modality | Snow 0–30 m | Snow 30–50 m | Snow 50–80 m | Fog 0–30 m | Fog 30–50 m | Fog 50–80 m |
|---|---|---|---|---|---|---|---|
| MVXNet [25] | Camera + LiDAR | 76.23 | 59.73 | 25.83 | 73.89 | 50.98 | 16.73 |
| BEVFusion [28] | Camera + LiDAR | 71.12 | 62.61 | 10.01 | 76.24 | 58.04 | 8.61 |
| SparseFusion [30] | Camera + LiDAR | 73.33 | 66.84 | 19.87 | 79.25 | 58.39 | 17.05 |
| Ours | Camera + LiDAR | 80.64 | 74.63 | 38.59 | 81.79 | 63.37 | 26.51 |
| Setting | MDC | CSI | BCAF | Clean | Fog | Rain | Snow | Light |
|---|---|---|---|---|---|---|---|---|
| Baseline | | | | 62.15 | 53.08 | 58.11 | 51.96 | 57.52 |
| +MDC | ✓ | | | 64.95 | 58.08 | 61.61 | 57.46 | 61.12 |
| +CSI | ✓ | ✓ | | 67.25 | 62.08 | 64.61 | 61.96 | 64.12 |
| +BCAF (Full) | ✓ | ✓ | ✓ | 69.90 | 66.20 | 67.80 | 66.60 | 67.30 |
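The BCAF row above corresponds to the confidence-aware bidirectional cross-attention fusion of contribution 3. A minimal sketch of the gating idea, under assumed names and shapes (the function `bidirectional_fusion`, scalar per-modality confidences, and single-head attention are all simplifications, not the paper's implementation):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(q, k, v):
    # Scaled dot-product cross-attention: queries from one modality,
    # keys/values from the other.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def bidirectional_fusion(f_lidar, f_cam, conf_lidar, conf_cam):
    """Confidence-gated bidirectional cross-attention (illustrative).

    f_lidar: (N, D) LiDAR tokens; f_cam: (M, D) camera tokens.
    conf_lidar, conf_cam: scalars in [0, 1]. A low camera confidence
    (e.g. under fog) shrinks the camera-to-LiDAR message, limiting
    noise propagation into the LiDAR branch, and vice versa.
    """
    msg_to_lidar = conf_cam   * cross_attend(f_lidar, f_cam, f_cam)
    msg_to_cam   = conf_lidar * cross_attend(f_cam, f_lidar, f_lidar)
    return f_lidar + msg_to_lidar, f_cam + msg_to_cam
```

With `conf_cam = 0` the LiDAR features pass through unchanged, which is the degenerate case the gating is meant to approximate when one modality is fully degraded; a learned, per-token confidence would replace the scalars in practice.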
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Jiao, T.; Chen, Y.; Feng, X.; Guo, C.; Song, J. Semantic-Enhanced Bidirectional Multimodal Fusion for 3D Object Detection Under Adverse Weather. Appl. Sci. 2026, 16, 2943. https://doi.org/10.3390/app16062943