Accurate fruit segmentation in images is a prerequisite and key step for precision agriculture. In this article, aiming at the segmentation of grape clusters of different varieties, three state-of-the-art semantic segmentation networks, i.e., the Fully Convolutional Network (FCN), U-Net, and DeepLabv3+, were studied on six different datasets. We investigated: (1) the difference in segmentation performance among the three networks; (2) the impact of different input representations on segmentation performance; (3) the effect of an image enhancement method that compensates for poor illumination and thereby further improves segmentation performance; and (4) the impact of the distance between the grape clusters and the camera on segmentation performance. The experimental results show that, compared with FCN and U-Net, DeepLabv3+ combined with transfer learning is more suitable for the task, achieving an intersection over union (IoU) of 84.26%. Five input representations, namely RGB, HSV, L*a*b, HHH, and YCrCb, yielded different IoU values, ranging from 81.5% to 88.44%; among them, L*a*b achieved the highest IoU. In addition, the adopted Histogram Equalization (HE) image enhancement method improved the model's robustness against poor illumination conditions: with HE preprocessing, the IoU on the enhanced dataset increased by 3.88 percentage points, from 84.26% to 88.14%. The distance between the target and the camera also affected segmentation performance: in every dataset, the closer the distance, the better the segmentation. In summary, the conclusions of this research provide meaningful suggestions for the study of grape and other fruit segmentation.
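The abstract does not specify how the HE preprocessing was implemented. A minimal sketch of standard histogram equalization for an 8-bit image channel, written in plain NumPy (the function name `equalize_hist` and the pure-NumPy formulation are assumptions for illustration; in a color pipeline one would typically equalize only the luminance channel, e.g. Y of YCrCb, to avoid color shifts):

```python
import numpy as np

def equalize_hist(channel):
    """Histogram equalization of a 2-D uint8 image channel.

    Spreads the pixel intensities over the full [0, 255] range by
    mapping each gray level through the normalized cumulative
    distribution function (CDF) of the input histogram.
    """
    # 256-bin histogram of the input intensities
    hist = np.bincount(channel.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]  # first non-empty bin's cumulative count
    # Build a lookup table: gray level -> equalized gray level
    lut = np.round((cdf - cdf_min) / (channel.size - cdf_min) * 255)
    lut = lut.astype(np.uint8)
    return lut[channel]

# Example: a low-contrast patch (values 50-150) is stretched to 0-255
patch = np.array([[50, 100], [100, 150]], dtype=np.uint8)
out = equalize_hist(patch)  # -> [[0, 170], [170, 255]]
```

This is the same operation exposed by common imaging libraries (e.g. OpenCV's `cv2.equalizeHist`), shown here only to make the preprocessing step concrete.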
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.