A Rapid Construction Method for High-Throughput Wheat Grain Instance Segmentation Dataset Using High-Resolution Images
Abstract
1. Introduction
1.1. Background and Motivation
- We propose simple and effective preprocessing and postprocessing algorithms that improve the accuracy of small object detection in ultra-high-resolution images.
- We build a pipeline for fast ground-truth annotation, which improves the efficiency of dataset construction and reduces the associated costs.
- We package the annotation pipeline into software, making manual correction of model annotations convenient.
1.2. Related Work
1.2.1. Weed Seed Datasets and Detection Models
1.2.2. Ultra-High-Resolution Small Object Detection
1.2.3. Applications of Grounding DINO and SAM Models
2. Materials and Methods
2.1. Capturing Images for Testing
2.2. Annotation Pipeline
2.2.1. Preprocessing
2.2.2. Postprocessing
- Incomplete objects falling on the edges of patches;
- Objects falling completely within non-overlapping regions;
- Objects falling partially within overlapping regions;
- Objects falling completely within overlapping regions.
Algorithm 1. Postprocessing method corresponding to the overlapping preprocessing method.
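The body of Algorithm 1 did not survive extraction, so the sketch below only illustrates the idea described in the text: per-patch detections are mapped back to full-image coordinates, objects cut off by an interior patch edge are discarded (the overlap guarantees a neighboring patch sees them whole), and duplicates arising in overlapping regions are removed by an IoU test. The function names and the IoU threshold are assumptions, not the authors' exact procedure.

```python
# Illustrative postprocessing for overlapping patches (not the authors'
# exact Algorithm 1): merge per-patch boxes into full-image coordinates
# and deduplicate objects detected in overlapping regions.

def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def merge_patches(patch_results, patch, image_size, iou_thresh=0.8):
    """patch_results: list of ((ox, oy), boxes) with patch-local boxes."""
    W, H = image_size
    merged = []
    for (ox, oy), boxes in patch_results:
        for x1, y1, x2, y2 in boxes:
            # Discard incomplete objects cut off by an interior patch edge;
            # a neighboring overlapping patch contains them whole.
            cut = ((x1 <= 0 and ox > 0) or (y1 <= 0 and oy > 0) or
                   (x2 >= patch and ox + patch < W) or
                   (y2 >= patch and oy + patch < H))
            if not cut:
                merged.append((x1 + ox, y1 + oy, x2 + ox, y2 + oy))
    kept = []
    # Keep the larger of any duplicated pair from overlapping regions.
    for box in sorted(merged, key=lambda b: (b[2] - b[0]) * (b[3] - b[1]),
                      reverse=True):
        if all(iou(box, k) < iou_thresh for k in kept):
            kept.append(box)
    return kept
```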
2.2.3. Detection
- Grounding DINO: Grounding DINO is a multimodal model that combines visual and linguistic features. Given an object category as a text prompt, it outputs bounding boxes for the objects of interest. Note that changing the text prompt affects the model’s visual feature extraction and therefore its predictions.
- Segment Anything: Segment Anything (SAM) is a promptable segmentation model. It accepts two types of prompts: points and boxes. If no prompt is provided, the model uses a pre-defined grid of points as prompts and segments every object in the image.
- The annotator designs the text prompt, specifying the object category or categories to be detected.
- The text prompt and the corresponding images are fed into the Grounding DINO model, which generates bounding box coordinates for the matching objects.
- The original images and the resulting bounding box coordinates are then fed into the Segment Anything model, which produces the segmentation results (a minimal sketch of this two-stage pass follows).
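The following is a minimal sketch of this two-stage pass using the reference open-source implementations (IDEA-Research’s GroundingDINO and Meta’s segment-anything packages). The file paths, text prompt, and thresholds are placeholders, not the settings used in the paper.

```python
# Two-stage annotation pass: Grounding DINO proposes boxes from a text
# prompt, then SAM converts each box into a pixel mask. Paths and
# thresholds are illustrative placeholders.
import torch
from groundingdino.util.inference import load_model, load_image, predict
from segment_anything import sam_model_registry, SamPredictor

dino = load_model("GroundingDINO_SwinT_OGC.py", "groundingdino_swint_ogc.pth")
image_rgb, image_tensor = load_image("wheat_patch.jpg")

# Stage 1: text-prompted bounding boxes.
boxes, logits, phrases = predict(
    model=dino, image=image_tensor, caption="wheat grain",
    box_threshold=0.35, text_threshold=0.25,
)

# Grounding DINO returns normalized (cx, cy, w, h); SAM expects absolute
# (x1, y1, x2, y2) pixel coordinates.
h, w = image_rgb.shape[:2]
xyxy = boxes * torch.tensor([w, h, w, h])
xyxy[:, :2] -= xyxy[:, 2:] / 2
xyxy[:, 2:] += xyxy[:, :2]

# Stage 2: one mask per box from SAM.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)
predictor.set_image(image_rgb)  # expects an RGB uint8 array
masks = [
    predictor.predict(box=box.numpy(), multimask_output=False)[0]
    for box in xyxy
]
```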
2.3. Manual Correction
- Modifying incorrect object category labels;
- Correcting inaccurate mask annotations;
- Adding mask labels for missed objects;
- Deploying the Grounding DINO model and SAM model within the software to enable users to easily achieve pixel-level annotations using text or bounding boxes.
3. Results
3.1. Data Preprocessing and Postprocessing
- Baseline: the original image is directly downsampled and fed into the Grounding DINO model; the Grounding DINO output is then fed into the SAM model.
- Non-overlapping clipping method: the image is divided into non-overlapping patches, and each patch is processed by the model individually to obtain localization and segmentation results. As shown in Figure 7, objects lying on the boundary between adjacent patches are merged into a single object by a stitching algorithm.
- Overlapping clipping method: the method we propose. The image is cropped into overlapping patches, and a postprocessing algorithm filters out redundantly labeled objects (Figure 4). A minimal sketch of the cropping step follows this list.
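As a reference for the overlapping clipping method, here is a minimal sketch of the cropping step; the patch size and overlap below are illustrative, not the values used in the paper.

```python
# Overlapping cropping of an ultra-high-resolution image into patches.
# Patch size and overlap are illustrative placeholders.
def crop_overlapping(image, patch=1024, overlap=256):
    """Yield ((ox, oy), patch_array) pairs covering the whole image.

    The stride is patch - overlap, so any object near a patch border
    also appears whole in a neighboring patch.
    """
    h, w = image.shape[:2]
    stride = patch - overlap
    ys = list(range(0, max(h - patch, 0) + 1, stride))
    xs = list(range(0, max(w - patch, 0) + 1, stride))
    # Add a final row/column so the bottom and right borders are covered.
    if ys[-1] + patch < h:
        ys.append(h - patch)
    if xs[-1] + patch < w:
        xs.append(w - patch)
    for oy in ys:
        for ox in xs:
            yield (ox, oy), image[oy:oy + patch, ox:ox + patch]
```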
3.2. The Composition of the Pipeline
- Baseline: purely manual annotation.
- SAM-only: images are fed directly into the SAM model (see the sketch after this list).
- Manual bounding box + SAM: annotators draw bounding boxes on the images, and the SAM model then predicts masks from these boxes.
- Grounding DINO + SAM: images are first passed through the Grounding DINO model and then through the SAM model.
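For reference, the SAM-only option corresponds to SAM’s automatic mode, which prompts the model with a pre-defined grid of points. Below is a minimal sketch using the segment-anything package; the checkpoint and image paths are placeholders.

```python
# SAM-only option: automatic mask generation from a grid of point
# prompts. Checkpoint and image paths are placeholders.
import cv2
from segment_anything import SamAutomaticMaskGenerator, sam_model_registry

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
generator = SamAutomaticMaskGenerator(sam)

image = cv2.cvtColor(cv2.imread("wheat_patch.jpg"), cv2.COLOR_BGR2RGB)
masks = generator.generate(image)  # dicts with 'segmentation', 'bbox', ...
print(f"{len(masks)} candidate masks")
```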
4. Discussion
4.1. Data Preprocessing and Postprocessing
4.2. The Composition of the Pipeline
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Results of the preprocessing/postprocessing comparison (Section 3.1):

| Method | Model-Only Accuracy | Accuracy after Manual Correction | Manual Correction Cost |
|---|---|---|---|
| Baseline | 0.00 | 0.95 | 6 h 33 min |
| Non-overlapping | 0.93 | 0.97 | 37 min |
| Overlapping | 0.89 | 0.99 | 54 min |
Results of the pipeline composition comparison (Section 3.2):

| Method | Model-Only Accuracy | Accuracy after Manual Correction | Manual Correction Cost |
|---|---|---|---|
| Baseline | / | 1.00 | 30 h 32 min |
| SAM-only | 0.74 | 0.99 | 11 h 23 min |
| Manual box + SAM | 0.99 | 0.99 | 2 h 28 min |
| Grounding DINO + SAM | 0.89 | 0.99 | 54 min |