Sensors
  • This is an early access version, the complete PDF, HTML, and XML versions will be available soon.
  • Article
  • Open Access

18 December 2025

YOLO-SAM AgriScan: A Unified Framework for Ripe Strawberry Detection and Segmentation with Few-Shot and Zero-Shot Learning

1 Department of Biological and Agricultural Engineering, Texas A&M AgriLife Research, Texas A&M University System, Dallas, TX 75252, USA
2 Department of Information Engineering, University of Pisa, 56122 Pisa, Italy
* Author to whom correspondence should be addressed.
This article belongs to the Special Issue Computer Vision and Pattern Recognition for Advanced Smart Agriculture Solutions

Abstract

Traditional segmentation methods are slow and rely on manual annotations, which are labor-intensive. To address these limitations, we propose YOLO-SAM AgriScan, a unified framework that combines the fast object detection capabilities of YOLOv11 with the zero-shot segmentation power of the Segment Anything Model 2 (SAM2). Our approach adopts a hybrid paradigm for on-plant ripe strawberry segmentation, wherein YOLOv11 is fine-tuned using a few-shot learning strategy with minimal annotated samples, and SAM2 performs mask generation without additional supervision. This architecture eliminates the bottleneck of pixel-wise manual annotation and enables the scalable and efficient segmentation of strawberries in both controlled and natural farm environments. Experimental evaluations on two datasets, a custom-collected dataset and a publicly available benchmark, demonstrate strong detection and segmentation performance in both full-data and data-constrained scenarios. The proposed framework achieved a mean Dice score of 0.95 and an IoU of 0.93 on our collected dataset and maintained competitive performance on public data (Dice: 0.95, IoU: 0.92), demonstrating its robustness, generalizability, and practical relevance in real-world agricultural settings. Our results highlight the potential of combining few-shot detection and zero-shot segmentation to accelerate the development of annotation-light, intelligent phenotyping systems.
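
For readers unfamiliar with the two segmentation metrics reported above, the following minimal sketch shows how a Dice score and an IoU are commonly computed for a single pair of binary masks. It is an illustration only and not the authors' evaluation code: the function name `dice_and_iou`, the nested-list mask representation, the toy masks, and the convention of scoring two empty masks as 1.0 are all assumptions made here.

```python
def dice_and_iou(pred, truth):
    """Return (Dice, IoU) for two same-shaped binary masks.

    Masks are nested lists of 0/1 values (hypothetical representation
    chosen to keep the sketch dependency-free).
    """
    inter = pred_area = truth_area = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            p, t = bool(p), bool(t)
            inter += p and t      # pixel is foreground in both masks
            pred_area += p        # foreground pixels in the prediction
            truth_area += t       # foreground pixels in the ground truth
    union = pred_area + truth_area - inter
    if union == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0, 1.0
    dice = 2 * inter / (pred_area + truth_area)
    iou = inter / union
    return dice, iou


# Toy example: the prediction misses one ground-truth pixel.
pred = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
truth = [
    [0, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]
dice, iou = dice_and_iou(pred, truth)
print(f"Dice = {dice:.3f}, IoU = {iou:.3f}")
```

For any single mask pair the two quantities are linked by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU. Dataset-level figures such as those in the abstract are typically averages of per-image scores, so the identity need not hold exactly for the reported means.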
