In this paper, a lightweight detection model DSW-YOLO based on improved YOLOv10n is proposed. After comparing mainstream lightweight models (YOLOv5n, YOLOv6n, YOLOv8n, YOLOv9t and YOLOv10n), YOLOv10n with the best performance was selected as the baseline. The DWRR block was then designed and integrated with the C2f module to form C2f-DWRR, replacing the original
C2f blocks in the backbone. With this change, the model’s P, R, mAP50, and mAP50-95 increased by 2.3%, 2.1%, 1.8%, and 3.4%, respectively, while the parameter count dropped by 0.16 M and the model size was reduced by 0.25 MB. The parameter-free SimAM attention mechanism was then added to the last layer of the backbone, raising P, R, mAP50, and mAP50-95 to 90.6%, 84.0%, 91.8%, and 68.5%, respectively, and reducing the average detection time to 1.1 ms. Finally, the CIoU loss function was replaced with WIoUv3 to accelerate convergence, lower the loss, and further improve detection performance. Experimental results show that on a custom green pepper dataset,
DSW-YOLO outperformed the baseline, with gains of 2.9%, 2.7%, 2.2%, and 3.4% in P, R, mAP50, and mAP50-95, respectively, while reducing parameters by 1.6 M, cutting inference time by 0.7 ms, and shrinking the model size to 5.31 MB.
DSW-YOLO detects green peppers efficiently and accurately under complex field conditions, improving detection accuracy while remaining lightweight, and provides theoretical and technical support for the design and optimization of vision systems for pepper-picking robots.
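To illustrate why SimAM adds no learnable parameters, the following minimal sketch computes SimAM-style attention weights for a single feature-map channel in plain Python. It follows the commonly published SimAM formulation and is not taken from the DSW-YOLO implementation; the function names `simam_weights` and `simam`, the nested-list input format, and the regularization value `lam=1e-4` are assumptions made for illustration.

```python
import math


def simam_weights(channel, lam=1e-4):
    """Return SimAM-style attention weights for one channel (2D list of floats).

    Assumption: `lam` is the small regularization constant from the SimAM
    formulation; 1e-4 is a commonly used default, not a value from this paper.
    """
    values = [v for row in channel for v in row]
    n = len(values) - 1  # number of "other" neurons in the channel
    if n < 1:
        raise ValueError("channel must contain at least two values")
    mean = sum(values) / len(values)
    # Squared deviation of every activation from the channel mean.
    sq_dev = [[(v - mean) ** 2 for v in row] for row in channel]
    variance = sum(d for row in sq_dev for d in row) / n
    weights = []
    for row in sq_dev:
        # Inverse energy: activations far from the mean get a larger score.
        scores = [d / (4.0 * (variance + lam)) + 0.5 for d in row]
        weights.append([1.0 / (1.0 + math.exp(-s)) for s in scores])
    return weights


def simam(channel, lam=1e-4):
    """Rescale each activation by its attention weight (no learned parameters)."""
    weights = simam_weights(channel, lam)
    return [[v * w for v, w in zip(row, wrow)]
            for row, wrow in zip(channel, weights)]


if __name__ == "__main__":
    # Hypothetical 2x2 channel with one activation that stands out.
    print(simam([[0.0, 0.0], [0.0, 4.0]]))
```

The weights depend only on the statistics of the channel itself, which is why adding such a module leaves the parameter count unchanged.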
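The WIoUv3 loss that replaces CIoU can be sketched for a single pair of boxes as follows. This is a minimal illustration based on the published Wise-IoU v3 formulation, not the authors' training code: the function name `wiou_v3`, the `(x1, y1, x2, y2)` box format, and the hyperparameters `alpha=1.9` and `delta=3.0` are assumptions, and `mean_iou_loss` stands in for the running mean of the IoU loss that a real training loop would maintain. In a framework with automatic differentiation, the enclosing-box term and the focusing gain would typically be detached from the gradient, which plain Python cannot express.

```python
import math


def wiou_v3(pred, target, mean_iou_loss, alpha=1.9, delta=3.0):
    """Wise-IoU v3-style loss for one predicted box and one ground-truth box.

    Boxes are (x1, y1, x2, y2) tuples with positive area (assumed format).
    `mean_iou_loss` is a stand-in for the running mean of the IoU loss.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Plain IoU loss.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou_loss = 1.0 - inter / union

    # Distance-based attention term using the smallest enclosing box.
    wg = max(px2, tx2) - min(px1, tx1)
    hg = max(py2, ty2) - min(py1, ty1)
    dx = (px1 + px2) / 2.0 - (tx1 + tx2) / 2.0
    dy = (py1 + py2) / 2.0 - (ty1 + ty2) / 2.0
    r_wiou = math.exp((dx * dx + dy * dy) / (wg * wg + hg * hg))

    # Non-monotonic focusing gain driven by the outlier degree beta.
    beta = iou_loss / mean_iou_loss
    gain = beta / (delta * alpha ** (beta - delta))
    return gain * r_wiou * iou_loss


if __name__ == "__main__":
    # Hypothetical boxes; 0.5 is an arbitrary running-mean value.
    print(wiou_v3((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0), 0.5))
```

The gain equals 1 when the outlier degree matches `delta` and shrinks for both very easy and very poor predictions, so boxes of ordinary quality dominate the gradient.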