Study of a QueryPNet Model for Accurate Detection and Segmentation of Goose Body Edge Contours
Simple Summary
Abstract
1. Introduction
- We propose a novel neck module that obtains and fuses multiscale target features, shortening the feature-fusion path between high and low levels and making both target detection and individual-instance segmentation more effective.
- We construct a new, efficient query-based instance segmentation method by reducing the number of training parameters through a rational design and combining it with the neck module.
- We build a new goose dataset of 639 instance segmentation images covering 80 geese, which can serve as a reference for poultry instance segmentation research. The dataset comes from a free-range meat goose farm; its images contain both single geese and dense flock activity, disturbed by natural factors such as vegetation shading, non-goose animals, water bodies, and litter. Because free-range production farming has a more complex background environment than captive breeding, such data can make the trained model more robust.
2. Materials and Methods
2.1. Data Collection
- Green, scientific, free-range farming environments are more complex than the narrow, homogeneous environment of captivity, with various vegetation, running water, and other occluding factors; this strong interference made the dataset, and thus the experiment, more challenging.
- The background environment of the dataset included feed-feeding core areas, edge areas, and so on. Goose locations changed frequently, and the images balanced sparse and dense goose distributions, demanding stronger robustness and generalization ability from our network design.
- The high degree of similarity in appearance between goose bodies made it difficult for both the human eye and the network to distinguish specific geese, complicating later flock analysis, so improving segmentation accuracy was key.
- Detecting individual goose instances against a complex environmental background was likewise a great challenge for the goose detection task.
2.2. Data Enhancement
2.2.1. CutMix Data Enhancement [21]
2.2.2. Mosaic
2.2.3. Flip
2.2.4. Random Color (Color Jitter)
2.2.5. Contrast Enhancement
2.2.6. Rotate
2.2.7. Center Clipping and Random Clipping
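The subsections above list the augmentation operations used; since the paper does not reproduce code, the following is a minimal sketch of image-level CutMix [21] in PyTorch, assuming a batched tensor input. For instance segmentation, the boxes and masks inside the pasted region would have to be transferred as well, which is omitted here.

```python
import numpy as np
import torch

def rand_bbox(h, w, lam):
    """Sample a random box whose area fraction is (1 - lam), as in the CutMix paper."""
    cut_ratio = np.sqrt(1.0 - lam)
    cut_h, cut_w = int(h * cut_ratio), int(w * cut_ratio)
    cy, cx = np.random.randint(h), np.random.randint(w)
    y1, y2 = np.clip(cy - cut_h // 2, 0, h), np.clip(cy + cut_h // 2, 0, h)
    x1, x2 = np.clip(cx - cut_w // 2, 0, w), np.clip(cx + cut_w // 2, 0, w)
    return y1, y2, x1, x2

def cutmix(images, alpha=1.0):
    """Apply CutMix to a batch (N, C, H, W): paste a random crop from a shuffled batch."""
    lam = np.random.beta(alpha, alpha)
    perm = torch.randperm(images.size(0))
    y1, y2, x1, x2 = rand_bbox(images.size(2), images.size(3), lam)
    mixed = images.clone()
    mixed[:, :, y1:y2, x1:x2] = images[perm, :, y1:y2, x1:x2]
    # Effective mixing weight after clipping the box to the image boundary.
    lam = 1.0 - (y2 - y1) * (x2 - x1) / float(images.size(2) * images.size(3))
    return mixed, perm, lam
```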
2.3. Method
2.3.1. QueryInst Network
2.3.2. Model Architecture—QueryPNet
- The information path was shortened, and the feature pyramid was enhanced with the precise localization signals present in the lower layers. The resulting high-level feature maps were then additionally processed using a bottom-up path enhancement method.
- Through adaptive feature pooling, the features of each level were aggregated, and the highest-level features were distributed to the N5 levels obtained by the bottom-up path enhancement (see the sketch after this list).
- To capture different views of each task, our model used tiny fully connected layers to enhance the predictions. For the mask branch, this layer is complementary to the FCN originally used by Mask R-CNN: fusing predictions from these two views increases information diversity and generates a better-quality mask, while for the detection branch it generates a better-quality box.
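The bullets above describe a PANet-style bottom-up path enhancement on top of the FPN outputs. Below is a minimal PyTorch sketch of that idea; the channel width, the stride-2 3×3 downsampling convolutions, and the level names N2–N5 are illustrative assumptions, not the authors' exact implementation.

```python
import torch.nn as nn

class BottomUpPath(nn.Module):
    """Illustrative PANet-style bottom-up path: fuse FPN levels P2..P5 into N2..N5."""
    def __init__(self, channels=256):
        super().__init__()
        # Stride-2 convs carry precise low-level localization signals upward.
        self.down = nn.ModuleList(
            [nn.Conv2d(channels, channels, 3, stride=2, padding=1) for _ in range(3)]
        )
        self.smooth = nn.ModuleList(
            [nn.Conv2d(channels, channels, 3, padding=1) for _ in range(3)]
        )

    def forward(self, p2, p3, p4, p5):
        outs = [p2]  # the shortened path starts at the highest-resolution level
        for i, p in enumerate((p3, p4, p5)):
            # Downsample the previous N level and fuse it with the same-scale P level.
            outs.append(self.smooth[i](self.down[i](outs[-1]) + p))
        return outs  # [N2, N3, N4, N5]
```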
3. Results
3.1. Experimental Setup
3.2. Performance Evaluation Metrics
3.3. Comparison with State-of-the-Art Methods and Results
4. Discussion
4.1. Contribution and Effectiveness of the Proposed Method
4.2. Limitations and Future Developments
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Statistical Bulletin of the People’s Republic of China on National Economic and Social Development in 2020. Available online: http://www.stats.gov.cn/xxgk/sjfb/zxfb2020/202102/t20210228_1814159.html (accessed on 1 August 2021).
- Hu, Z.; Yang, H.; Lou, T. Dual attention-guided feature pyramid network for instance segmentation of group pigs. Comput. Electron. Agric. 2021, 186, 106140. [Google Scholar] [CrossRef]
- Berckmans, D. Precision livestock farming (PLF). Comput. Electron. Agric. 2008, 62, 1. [Google Scholar] [CrossRef]
- Fournel, S.; Rousseau, A.N.; Laberge, B. Rethinking environment control strategy of confined animal housing systems through precision livestock farming—ScienceDirect. Biosyst. Eng. 2017, 155, 96–123. [Google Scholar] [CrossRef]
- Hertem, T.V.; Rooijakkers, L.; Berckmans, D.; Fernández, A.P.; Norton, T.; Berckmans, D.; Vranken, E. Appropriate data visualisation is key to Precision Livestock Farming acceptance. Comput. Electron. Agric. 2017, 138, 1–10. [Google Scholar] [CrossRef]
- Neethirajan, S. Recent advances in wearable sensors for animal health management. Sens. Bio-Sens. Res. 2017, 12, 15–29. [Google Scholar] [CrossRef]
- Zhang, G.; Tao, S.; Lina, Y.; Chu, Q.; Jia, J.; Gao, W. Pig Body Temperature and Drinking Water Monitoring System Based on Implantable RFID Temperature Chip. Trans. Chin. Soc. Agric. Mach. 2019, 50, 297–304. [Google Scholar]
- Long, J.; Shelhamer, E.; Darrell, T. Fully Convolutional Networks for Semantic Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 640–651. [Google Scholar]
- Salau, J.; Krieter, J. Instance Segmentation with Mask R-CNN Applied to Loose-Housed Dairy Cows in a Multi-Camera Setting. Animals 2020, 10, 2402. [Google Scholar] [CrossRef] [PubMed]
- Zheng, X.; Li, F.; Lin, B.; Xie, D.; Liu, Y.; Jiang, K.; Gong, X.; Jiang, H.; Peng, R.; Duan, X. A Two-Stage Method to Detect the Sex Ratio of Hemp Ducks Based on Object Detection and Classification Networks. Animals 2022, 12, 1177. [Google Scholar] [CrossRef] [PubMed]
- Lin, B.; Jiang, K.; Xu, Z.; Li, F.; Li, J.; Mou, C.; Gong, X.; Duan, X. Feasibility Research on Fish Pose Estimation Based on Rotating Box Object Detection. Fishes 2021, 6, 65. [Google Scholar] [CrossRef]
- Liao, J.; Li, H.; Feng, A.; Wu, X.; Luo, Y.; Duan, X.; Ni, M.; Li, J. Domestic pig sound classification based on TransformerCNN. Appl. Intell. 2022, 1–17. [Google Scholar] [CrossRef]
- He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017. [Google Scholar]
- Bolya, D.; Zhou, C.; Xiao, F.; Lee, Y.J. YOLACT: Real-Time Instance Segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 9157–9166. [Google Scholar]
- Wang, X.; Kong, T.; Shen, C.; Jiang, Y.; Li, L. SOLO: Segmenting Objects by Locations. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020. [Google Scholar]
- Wang, X.; Zhang, R.; Kong, T.; Li, L.; Shen, C. SOLOv2: Dynamic and Fast Instance Segmentation. In Proceedings of the Conference on Neural Information Processing Systems, Virtual, 6–12 December 2020. [Google Scholar]
- Chen, H.; Sun, K.; Tian, Z.; Shen, C.; Huang, Y.; Yan, Y. BlendMask: Top-Down Meets Bottom-Up for Instance Segmentation. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020. [Google Scholar]
- Fang, Y.; Yang, S.; Wang, X.; Li, Y.; Fang, C.; Shan, Y.; Feng, B.; Liu, W. Instances as queries. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 6910–6919. [Google Scholar]
- Bello, R.W.; Mohamed, A.; Talib, A.Z. Contour Extraction of Individual Cattle from an Image Using Enhanced Mask R-CNN Instance Segmentation Method. IEEE Access 2021, 9, 56984–57000. [Google Scholar] [CrossRef]
- Brünger, J.; Gentz, M.; Traulsen, I.; Koch, R. Panoptic Instance Segmentation on Pigs. arXiv 2020, arXiv:2005.10499. [Google Scholar]
- Yun, S.; Han, D.; Oh, S.J.; Chun, S.; Choe, J.; Yoo, Y. CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019. [Google Scholar]
- Zhang, H.; Cisse, M.; Dauphin, Y.N.; Lopez-Paz, D. Mixup: Beyond empirical risk minimization. arXiv 2017, arXiv:1710.09412. [Google Scholar]
- Redmon, J.; Farhadi, A. YOLO9000: Better, faster, stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7263–7271. [Google Scholar]
- Redmon, J.; Farhadi, A. YOLOv3: An Incremental Improvement. arXiv 2018, arXiv:1804.02767. [Google Scholar]
- Milletari, F.; Navab, N.; Ahmadi, S.-A. V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation. In Proceedings of the 2016 Fourth International Conference on 3D Vision (3DV), Stanford, CA, USA, 25–28 October 2016. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016. [Google Scholar]
- Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature pyramid networks for object detection. In Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017. [Google Scholar]
- Sun, P.; Zhang, R.; Jiang, Y.; Kong, T.; Xu, C.; Zhan, W.; Tomizuka, M.; Li, L.; Yuan, Z.; Wang, C.; et al. Sparse R-CNN: End-to-End Object Detection with Learnable Proposals. arXiv 2020, arXiv:2011.12450. [Google Scholar]
- van den Oord, A.; Li, Y.; Babuschkin, I.; Simonyan, K.; Vinyals, O.; Kavukcuoglu, K.; van den Driessche, G.; Lockhart, E.; Cobo, L.; Stimberg, F.; et al. Parallel WaveNet: Fast High-Fidelity Speech Synthesis. In Proceedings of the International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018. [Google Scholar]
- Neubeck, A.; Gool, L.V. Efficient non-maximum suppression. In Proceedings of the 18th International Conference on Pattern Recognition (ICPR’06), Hong Kong, China, 20–24 August 2006; Volume 3. [Google Scholar]
- Zheng, Z.; Wang, P.; Liu, W.; Li, J.; Ye, R.; Ren, D. Distance-IoU loss: Faster and better learning for bounding box regression. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34. [Google Scholar]
Method | Setting |
---|---|
CutMix | Random |
Mosaic | img_scale = (640, 640), prob = 1.0 |
RandomFlip | flip_ratio = [0.4, 0.4], direction = ['horizontal', 'vertical'] |
RandomColor | level = (0.255), prob = 0.8 |
Contrast enhancement | level = (0.90), prob = 0.8 |
Rotate | level = (0.90) |
CenterCrop and RandomCrop | crop_size = (512, 256), prob = 0.8 |
Multi-scale training | height = 1333; width = [480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800] |
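The official QueryInst implementation builds on MMDetection, so the settings above map naturally onto an MMDetection-style training pipeline. The following is a hypothetical assembly of the tabulated values using MMDetection 2.x transform names, not the authors' released configuration; CutMix and Mosaic operate on multiple images and need a mix-dataset wrapper, so they are only noted in comments, and the `level` values are copied from the table verbatim.

```python
# Hypothetical MMDetection 2.x-style train pipeline assembled from the table above.
# CutMix (custom) and Mosaic require a multi-image dataset wrapper and are omitted.
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
    # Multi-scale training: long side fixed at 1333, short side sampled per iteration.
    dict(type='Resize',
         img_scale=[(1333, s) for s in (480, 512, 544, 576, 608, 640,
                                        672, 704, 736, 768, 800)],
         multiscale_mode='value', keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=[0.4, 0.4],
         direction=['horizontal', 'vertical']),
    dict(type='ColorTransform', level=0.255, prob=0.8),    # "RandomColor" row
    dict(type='ContrastTransform', level=0.90, prob=0.8),  # contrast enhancement
    dict(type='Rotate', level=0.90),
    dict(type='RandomCrop', crop_size=(512, 256)),
    dict(type='Normalize', mean=[123.675, 116.28, 103.53],
         std=[58.395, 57.12, 57.375], to_rgb=True),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
]
```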
Software | Type/Version | Hardware | Type/Version |
---|---|---|---|
Operating system | Ubuntu 20.04 | CPU | Intel(R) Xeon(R) Silver 4208 CPU @ 2.10 GHz |
IDE | PyCharm | GPU | NVIDIA Corporation GV100 [TITAN V] (rev a1) |
Python version | Python 3.8 | RAM | DDR4 |
Python library | PyTorch 1.7.0 | Hard disk | 2 TB |
Mean Average Precision | |
---|---|
IoU = 0.50:0.95 | mAP |
IoU = 0.50 | mAP@0.5 |
IoU = 0.75 | mAP@0.75 |
mAP Across Scales | |
mAP for small objects: area < 32² | mAP_s |
mAP for medium objects: 32² < area < 96² | mAP_m |
mAP for large objects: area > 96² | mAP_l |
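These thresholds follow the COCO convention, where object scale is bucketed by pixel area at 32² and 96². As a minimal sketch of the two ingredients behind the table, box IoU and scale bucketing could be computed as follows:

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def scale_bucket(area):
    """COCO scale buckets underlying mAP_s / mAP_m / mAP_l."""
    if area < 32 ** 2:
        return 'small'
    if area < 96 ** 2:
        return 'medium'
    return 'large'
```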
Model | mAP | mAP@0.5 | mAP@0.75 | mAP_s | mAP_m | mAP_l |
---|---|---|---|---|---|---|
Mask R-CNN | 0.772 | 0.934 | 0.876 | 0.182 | 0.739 | 0.829 |
YOLACT | 0.602 | 0.845 | 0.700 | 0.393 | 0.584 | 0.661 |
PointRend | 0.786 | 0.931 | 0.878 | 0.259 | 0.769 | 0.829 |
BlendMask | 0.761 | 0.899 | 0.825 | 0.420 | 0.725 | 0.809 |
QueryInst | 0.790 | 0.951 | 0.870 | 0.132 | 0.757 | 0.842 |
QueryPNet | 0.811 | 0.963 | 0.893 | 0.209 | 0.797 | 0.857 |
Model | mAP | mAP@0.5 | mAP@0.75 | mAP_s | mAP_m | mAP_l |
---|---|---|---|---|---|---|
Mask R-CNN | 0.651 | 0.916 | 0.775 | 0.050 | 0.543 | 0.749 |
YOLACT | 0.475 | 0.799 | 0.553 | 0.128 | 0.317 | 0.612 |
PointRend | 0.695 | 0.934 | 0.836 | 0.081 | 0.586 | 0.791 |
SOLO | 0.599 | 0.922 | 0.744 | 0.218 | 0.476 | 0.729 |
SOLOv2 | 0.631 | 0.890 | 0.786 | 0.191 | 0.516 | 0.768 |
BlendMask | 0.682 | 0.903 | 0.799 | 0.308 | 0.544 | 0.810 |
QueryInst | 0.689 | 0.945 | 0.823 | 0.041 | 0.591 | 0.785 |
SparseInst | 0.583 | 0.881 | 0.690 | 0.254 | 0.422 | 0.700 |
QueryPNet | 0.699 | 0.963 | 0.841 | 0.046 | 0.598 | 0.780 |
Model | Size (Pixel) | mAP_bbox | mAP_segm | FLOPs (G) | Params (M) | FPS |
---|---|---|---|---|---|---|
QueryInst | 224 | 0.791 | 0.694 | 31.32 | 176.06 | 4.8 |
QueryPNet | 224 | 0.811 | 0.699 | 26.03 | 141.85 | 7.3 |
▲ | | +0.020 | +0.005 | −5.29 | −34.21 | +2.5 |
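For context, numbers such as FLOPs, parameter count, and FPS are commonly measured along these lines; the sketch below uses the third-party `thop` profiler and a fixed 224-pixel input as assumptions, and the paper's exact measurement protocol may differ.

```python
import time
import torch
from thop import profile  # third-party 'thop' profiler; an assumption about tooling

def benchmark(model, size=224, warmup=10, iters=50, device='cuda'):
    """Estimate FLOPs, parameter count, and FPS for a given input resolution."""
    model = model.to(device).eval()
    x = torch.randn(1, 3, size, size, device=device)
    # thop reports multiply-accumulates (MACs); FLOPs tables often follow this count.
    macs, params = profile(model, inputs=(x,), verbose=False)
    with torch.no_grad():
        for _ in range(warmup):
            model(x)
        if device == 'cuda':
            torch.cuda.synchronize()
        t0 = time.time()
        for _ in range(iters):
            model(x)
        if device == 'cuda':
            torch.cuda.synchronize()
    fps = iters / (time.time() - t0)
    return macs / 1e9, params / 1e6, fps  # GFLOPs, M params, FPS
```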
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).