Segmentation of Unsound Wheat Kernels Based on Improved Mask RCNN
Abstract
1. Introduction
2. Materials and Methods
2.1. Experimental Settings
2.1.1. Datasets
2.1.2. Experiment Equipment
2.2. Improved Mask RCNN
2.2.1. Improved Mask RCNN with AM
2.2.2. Improvement in FPN
2.2.3. Improvement in RPN
2.3. Evaluation Metrics
3. Results
3.1. Resulting Images
3.2. Quantitative Results
3.3. Comparison with Other Models
4. Discussion
4.1. Attention Mechanism Module
4.2. FPN Module
4.3. Fusion of Attention Mechanisms and FPN
5. Conclusions
- (1) The model resolves the adhesion of multiple touching kernels in dense wheat samples. A classification network cannot separate touching kernels on its own; it has to rely on traditional image segmentation methods such as concave-point segmentation and watershed to split them first, and those methods are not accurate.
- (2) Mask RCNN is improved based on the circularity characteristics of the kernels and on the difference between edge features and low-level features such as color and texture, which makes it more effective for multi-target and fine-grained recognition. Mask RCNN is widely used for foreground-background segmentation, for segmenting a single class of targets, and for segmenting targets from a small number of categories. It performs well on those tasks but worse on fine-grained, multi-target segmentation. Our model largely overcomes this limitation.
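
The circularity characteristic mentioned in (2) is not defined in this section. A common definition, assumed here for illustration only, is 4πA/P², where A is the enclosed area and P the perimeter of a contour; it equals 1 for a perfect circle and drops for elongated or ragged shapes. The sketch below computes it for polygonal contours. The function names and the two sample contours are hypothetical and are not taken from the authors' implementation.

```python
import math


def polygon_area(points):
    """Area of a simple polygon [(x, y), ...] via the shoelace formula."""
    n = len(points)
    total = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(points):
    """Sum of edge lengths, closing the polygon back to its first vertex."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))


def circularity(points):
    """Assumed definition: 4 * pi * area / perimeter**2 (1.0 for a circle)."""
    perimeter = polygon_perimeter(points)
    if perimeter == 0:
        return 0.0
    return 4.0 * math.pi * polygon_area(points) / perimeter ** 2


# Hypothetical contours: a 64-sided approximation of a circle and a 4 x 1 rectangle.
circle_like = [
    (math.cos(2 * math.pi * k / 64), math.sin(2 * math.pi * k / 64))
    for k in range(64)
]
elongated = [(0, 0), (4, 0), (4, 1), (0, 1)]

print(f"circle-like contour: {circularity(circle_like):.3f}")
print(f"elongated contour:   {circularity(elongated):.3f}")
```

A rounder kernel outline scores closer to 1 than a broken or elongated one, which is why such a measure can serve as a shape cue alongside color and texture.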
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Kernel Types | Number of Training Images | Number of Validation Images | Number of Test Images |
---|---|---|---|
Perfect | 24 | 3 | 3 |
Broken | 24 | 3 | 3 |
Moldy | 24 | 3 | 3 |
Spotted | 24 | 3 | 3 |
Sprouted | 24 | 3 | 3 |
Injured | 24 | 3 | 3 |
Mixture | 480 | 60 | 60 |
Total | 624 | 78 | 78 |
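
The split proportions implied by the dataset table can be checked with a few lines of arithmetic. This is a minimal sketch that only re-adds the counts listed above; the variable names are illustrative.

```python
# Image counts copied from the dataset table: (training, validation, test).
counts = {
    "Perfect": (24, 3, 3),
    "Broken": (24, 3, 3),
    "Moldy": (24, 3, 3),
    "Spotted": (24, 3, 3),
    "Sprouted": (24, 3, 3),
    "Injured": (24, 3, 3),
    "Mixture": (480, 60, 60),
}

# Column totals for training, validation, and test images.
totals = [sum(column) for column in zip(*counts.values())]
grand_total = sum(totals)
ratios = [t / grand_total for t in totals]

for name, total, ratio in zip(("training", "validation", "test"), totals, ratios):
    print(f"{name}: {total} images ({ratio:.0%})")
```

The column sums match the Total row (624, 78, 78), which corresponds to an 8:1:1 split of the 780 images.
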
Module | Precision | Recall | Parameters | Time/s |
---|---|---|---|---|
Mask RCNN | 0.58 | 0.67 | 9203 | 6.18 |
Mask RCNN + AM | 0.82 | 0.89 | 9521 | 7.10 |
Mask RCNN + FPN | 0.72 | 0.76 | 9318 | 6.67 |
Mask RCNN + AM + FPN | 0.86 | 0.91 | 9530 | 7.81 |
Model | mAP | AP (Perfect) | AP (Moldy) | AP (Injured) | AP (Spotted) | AP (Sprouted) | AP (Broken) |
---|---|---|---|---|---|---|---|
Mask RCNN | 0.58 | 0.42 | 0.74 | 0.46 | 0.42 | 0.73 | 0.69 |
Mask RCNN + AM | 0.82 | 0.53 | 0.97 | 0.75 | 0.76 | 0.91 | 0.98 |
Mask RCNN + FPN | 0.72 | 0.46 | 0.91 | 0.69 | 0.83 | 0.83 | 0.89 |
Mask RCNN + AM + FPN | 0.86 | 0.60 | 0.99 | 0.82 | 0.89 | 0.89 | 0.98 |
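
The relationship between the mAP column and the six per-class AP columns can be checked numerically. The sketch below assumes that mAP is the unweighted mean of the per-class AP values; that is an assumption, since the evaluation-metric definitions are not reproduced in this section. All numbers are copied from the table above.

```python
# Per-class AP in column order: Perfect, Moldy, Injured, Spotted, Sprouted, Broken.
per_class_ap = {
    "Mask RCNN": [0.42, 0.74, 0.46, 0.42, 0.73, 0.69],
    "Mask RCNN + AM": [0.53, 0.97, 0.75, 0.76, 0.91, 0.98],
    "Mask RCNN + FPN": [0.46, 0.91, 0.69, 0.83, 0.83, 0.89],
    "Mask RCNN + AM + FPN": [0.60, 0.99, 0.82, 0.89, 0.89, 0.98],
}

# mAP values as reported in the same table.
reported_map = {
    "Mask RCNN": 0.58,
    "Mask RCNN + AM": 0.82,
    "Mask RCNN + FPN": 0.72,
    "Mask RCNN + AM + FPN": 0.86,
}


def mean_ap(values):
    """Unweighted mean of per-class AP (assumed definition of mAP)."""
    return sum(values) / len(values)


recomputed = {model: round(mean_ap(v), 2) for model, v in per_class_ap.items()}
mismatches = {
    model: (recomputed[model], reported_map[model])
    for model in per_class_ap
    if recomputed[model] != reported_map[model]
}

for model in per_class_ap:
    print(f"{model}: recomputed {recomputed[model]:.2f}, reported {reported_map[model]:.2f}")
```

Under this assumption, three rows reproduce the reported mAP after rounding. The Mask RCNN + FPN row averages to about 0.77 against a reported 0.72, which suggests either a different averaging scheme or a transcription error in that row.
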
Model | AP | AR | mIoU | Parameters | Time/s |
---|---|---|---|---|---|
Mask RCNN | 0.58 | 0.67 | 0.60 | 9203 | 6.18 |
Swin Transformer | 0.53 | 0.63 | 0.53 | 6513 | 5.10 |
Mask Scoring RCNN | 0.67 | 0.73 | 0.77 | 9210 | 6.67 |
PolarMask | 0.17 | 0.34 | 0.25 | 7653 | 5.61 |
SOLO | 0.38 | 0.53 | 0.44 | 8125 | 6.12 |
Improved Mask RCNN | 0.86 | 0.91 | 0.87 | 9530 | 7.81 |
Module | Precision | Recall | Parameters | Time/s |
---|---|---|---|---|
SE | 0.72 | 0.76 | 9528 | 7.80 |
CBAM | 0.73 | 0.78 | 9538 | 7.96 |
ECA | 0.82 | 0.89 | 9521 | 7.10 |
Model | Precision | Recall | Parameters | Time/s |
---|---|---|---|---|
SE + RFPN | 0.73 | 0.78 | 9545 | 8.12 |
SE + bottom-up FPN | 0.69 | 0.74 | 9538 | 7.96 |
CBAM + RFPN | 0.72 | 0.75 | 9560 | 8.27 |
CBAM + bottom-up FPN | 0.71 | 0.76 | 9540 | 7.91 |
ECA + RFPN | 0.78 | 0.81 | 9538 | 7.96 |
ECA + bottom-up FPN | 0.86 | 0.91 | 9530 | 7.83 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Shen, R.; Zhen, T.; Li, Z. Segmentation of Unsound Wheat Kernels Based on Improved Mask RCNN. Sensors 2023, 23, 3379. https://doi.org/10.3390/s23073379