WS-DINO: A DINOv2-Based Weed Segmentation Method with Feature Priors and Spatial Fusion
Abstract
1. Introduction
2. Materials and Methods
2.1. Datasets
2.1.1. PhenoBench Dataset
2.1.2. MotionBlurred Dataset
2.2. Weed Segmentation Model
2.2.1. Feature Prior Module
2.2.2. Spatial Feature Fusion Module
2.2.3. Loss Function
3. Results
3.1. Model Evaluation Metrics
3.2. Experimental Settings
3.3. Comparisons with Benchmarks on Segmentation Datasets
3.4. Ablation Studies
3.5. Computational Expense
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References







| Method | Bg IoU (%) | Crop IoU (%) | Weed IoU (%) | mIoU (%) | F1-Score (%) | Precision (%) | Recall (%) |
|---|---|---|---|---|---|---|---|
| U-Net | 99.24 | 95.48 | 63.47 | 86.06 | 91.53 | 90.67 | 92.41 |
| DeepLabV3 | 99.19 | 95.19 | 63.50 | 85.96 | 90.96 | 92.27 | 89.68 |
| SegNet | 99.56 | 95.32 | 62.74 | 85.87 | 91.27 | 90.22 | 92.35 |
| Swin-UNet | 99.02 | 95.26 | 65.61 | 86.63 | 92.28 | 90.51 | 94.13 |
| SegFormer-B0 | 99.21 | 95.06 | 66.05 | 86.77 | 92.82 | 91.61 | 94.06 |
| WS-DINO | 99.50 | 95.82 | 70.68 | 88.67 | 93.77 | 92.31 | 95.28 |

| Method | Bg IoU (%) | Crop IoU (%) | Weed IoU (%) | mIoU (%) | F1-Score (%) | Precision (%) | Recall (%) |
|---|---|---|---|---|---|---|---|
| U-Net | 99.24 | 83.54 | 77.44 | 86.74 | 91.76 | 89.28 | 94.39 |
| DeepLabV3 | 99.18 | 82.23 | 75.56 | 85.66 | 91.97 | 90.26 | 93.75 |
| SegNet | 99.27 | 83.72 | 77.49 | 86.83 | 92.69 | 91.36 | 94.05 |
| Swin-UNet | 99.20 | 83.65 | 78.20 | 87.02 | 92.84 | 92.91 | 92.78 |
| SegFormer-B0 | 98.91 | 84.76 | 75.75 | 86.47 | 92.82 | 91.61 | 94.06 |
| WS-DINO | 99.35 | 85.55 | 81.34 | 88.75 | 93.92 | 93.13 | 94.72 |

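
The columns of the comparison tables are linked by simple arithmetic, which makes a quick sanity check possible. The sketch below assumes that mIoU is the unweighted mean of the three class IoUs and that the F1-score is the harmonic mean of the reported precision and recall. These definitions are assumptions, since the metric definitions are not reproduced here, but the rows of the first comparison table agree with them to within 0.01. The names `ROWS`, `mean_iou`, `f1_score`, and `check_row` are illustrative only.

```python
# Rows copied from the first comparison table:
# (Bg IoU, Crop IoU, Weed IoU, mIoU, F1, Precision, Recall), all in percent.
ROWS = {
    "U-Net":        (99.24, 95.48, 63.47, 86.06, 91.53, 90.67, 92.41),
    "DeepLabV3":    (99.19, 95.19, 63.50, 85.96, 90.96, 92.27, 89.68),
    "SegNet":       (99.56, 95.32, 62.74, 85.87, 91.27, 90.22, 92.35),
    "Swin-UNet":    (99.02, 95.26, 65.61, 86.63, 92.28, 90.51, 94.13),
    "SegFormer-B0": (99.21, 95.06, 66.05, 86.77, 92.82, 91.61, 94.06),
    "WS-DINO":      (99.50, 95.82, 70.68, 88.67, 93.77, 92.31, 95.28),
}


def mean_iou(class_ious):
    """Unweighted mean of per-class IoU values (assumed mIoU definition)."""
    return sum(class_ious) / len(class_ious)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (assumed F1 definition)."""
    return 2 * precision * recall / (precision + recall)


def check_row(row, tol=0.01):
    """Return True if the reported mIoU and F1 match the recomputed values."""
    bg, crop, weed, miou, f1, precision, recall = row
    miou_ok = abs(mean_iou((bg, crop, weed)) - miou) <= tol
    f1_ok = abs(f1_score(precision, recall) - f1) <= tol
    return miou_ok and f1_ok


if __name__ == "__main__":
    for name, row in ROWS.items():
        print(f"{name:<13} consistent: {check_row(row)}")
```

Because the background class sits above 99% for every method, most of the spread in mIoU comes from the weed class, which is where the methods differ most.
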
| Baseline | FPM | SFFM | Bg IoU (%) | Crop IoU (%) | Weed IoU (%) | mIoU (%) | F1-Score (%) | Precision (%) | Recall (%) |
|---|---|---|---|---|---|---|---|---|---|
| ✔ | | | 99.23 | 95.39 | 64.46 | 86.36 | 91.88 | 91.25 | 92.53 |
| ✔ | ✔ | | 99.26 | 95.72 | 69.46 | 88.15 | 93.14 | 91.94 | 94.43 |
| ✔ | | ✔ | 99.27 | 95.69 | 69.15 | 88.04 | 93.06 | 92.25 | 93.91 |
| ✔ | ✔ | ✔ | 99.50 | 95.82 | 70.68 | 88.67 | 93.77 | 92.31 | 95.28 |

| Metric | WS-DINO | SegFormer-B0 | Swin-UNet | SegNet | DeepLabV3 | U-Net |
|---|---|---|---|---|---|---|
| Parameters (M) | 24.29 | 3.82 | 36.10 | 27.20 | 55.28 | 24.21 |
| Inference Speed (FPS) | 34.10 | 77.25 | 28.32 | 45.15 | 32.63 | 54.32 |

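
Frames per second can be easier to compare as a per-frame latency. The sketch below converts the reported speeds to milliseconds per frame, assuming the value columns of the table above are ordered WS-DINO, SegFormer-B0, Swin-UNet, SegNet, DeepLabV3, U-Net, and assuming frames are processed one at a time so that latency is simply 1000 / FPS. The names `MODELS` and `latency_ms` are illustrative only.

```python
# (parameters in millions, inference speed in FPS), copied from the table above.
MODELS = {
    "WS-DINO":      (24.29, 34.10),
    "SegFormer-B0": (3.82, 77.25),
    "Swin-UNet":    (36.10, 28.32),
    "SegNet":       (27.20, 45.15),
    "DeepLabV3":    (55.28, 32.63),
    "U-Net":        (24.21, 54.32),
}


def latency_ms(fps):
    """Per-frame latency in milliseconds, assuming one frame at a time."""
    return 1000.0 / fps


if __name__ == "__main__":
    # List the models from fastest to slowest.
    by_speed = sorted(MODELS.items(), key=lambda kv: kv[1][1], reverse=True)
    for name, (params_m, fps) in by_speed:
        print(f"{name:<13} {params_m:6.2f} M  {fps:6.2f} FPS  "
              f"{latency_ms(fps):6.2f} ms/frame")
```

Under these assumptions WS-DINO takes roughly 29 ms per frame, against roughly 13 ms for SegFormer-B0 and 35 ms for Swin-UNet.
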
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Zhou, H.; Liu, J.; Wu, R.; Zhao, B. WS-DINO: A DINOv2-Based Weed Segmentation Method with Feature Priors and Spatial Fusion. Agriculture 2026, 16, 1105. https://doi.org/10.3390/agriculture16101105

