TTH-Net: Two-Stage Transformer–CNN Hybrid Network for Leaf Vein Segmentation
Abstract
Featured Application
1. Introduction
- (1)
- We extend the fusion of CNNs and Transformers and propose a coarse-to-fine approach that combines the advantages of both, termed the two-stage Transformer–CNN hybrid network (TTH-Net), for leaf vein segmentation (a structural sketch follows this list).
- (2)
- We introduce the cross-stage semantic enhancement (CSE) module to guide feature fusion between the CNN and the Transformer. This design breaks away from the traditional stage-wise cascade and significantly improves the segmentation accuracy of the decoder.
- (3)
- Extensive experiments on the LVN dataset show that the proposed TTH-Net achieves state-of-the-art performance, and the corresponding ablation studies validate the necessity of each stage and of the CSE module.
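To make the coarse-to-fine layout concrete, the following is a minimal PyTorch sketch of the idea described in the contributions above: a Transformer encoder extracts coarse vein features in the first stage, a CNN branch recovers fine-grained detail in the second stage, and a simple concatenation followed by a 1×1 convolution stands in for the cross-stage fusion guided by the CSE module. All module sizes, the fusion design, and the names (TwoStageHybridSketch, ConvBlock, patch, dim) are illustrative assumptions, not the architecture reported in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvBlock(nn.Module):
    """Two 3x3 convolutions with BatchNorm and ReLU (U-Net-style block)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)


class TwoStageHybridSketch(nn.Module):
    """Coarse-to-fine sketch: stage 1 is a Transformer encoder over image patches,
    stage 2 is a full-resolution CNN branch, and a placeholder fusion mixes the two."""
    def __init__(self, img_size=256, patch=16, dim=256, num_classes=1):
        super().__init__()
        # Stage 1: patch embedding + plain Transformer encoder (coarse features).
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.pos = nn.Parameter(torch.zeros(1, (img_size // patch) ** 2, dim))
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True), num_layers=4)
        self.coarse_head = nn.Conv2d(dim, num_classes, 1)    # coarse vein logits
        # Stage 2: CNN branch on the full-resolution image (fine details).
        self.cnn = ConvBlock(3, 64)
        # Placeholder cross-stage fusion (NOT the paper's CSE module): concatenate the
        # upsampled stage-1 features with the stage-2 features and mix with a 1x1 conv.
        self.fuse = nn.Sequential(nn.Conv2d(dim + 64, 64, 1), nn.ReLU(inplace=True))
        self.fine_head = nn.Conv2d(64, num_classes, 1)       # refined vein logits

    def forward(self, x):
        b, _, h, w = x.shape
        grid = self.patch_embed(x)                           # (B, dim, H/16, W/16)
        gh, gw = grid.shape[-2:]
        tokens = grid.flatten(2).transpose(1, 2) + self.pos  # (B, N, dim)
        tokens = self.encoder(tokens)
        feat1 = tokens.transpose(1, 2).reshape(b, -1, gh, gw)
        coarse = F.interpolate(self.coarse_head(feat1), size=(h, w), mode="bilinear", align_corners=False)
        feat1_up = F.interpolate(feat1, size=(h, w), mode="bilinear", align_corners=False)
        feat2 = self.cnn(x)
        fine = self.fine_head(self.fuse(torch.cat([feat1_up, feat2], dim=1)))
        return coarse, fine                                  # both outputs can be supervised


if __name__ == "__main__":
    coarse, fine = TwoStageHybridSketch()(torch.randn(1, 3, 256, 256))
    print(coarse.shape, fine.shape)                          # torch.Size([1, 1, 256, 256]) each
```

Returning both the coarse and the refined outputs so that each can be supervised separately is consistent with the per-stage loss-function ablation reported in Section 4.5.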
2. Related Work
3. The Proposed Method
3.1. First Stage: Coarse-Grained Feature Extraction
3.2. Second Stage: Fine-Grained Feature Recovery
3.3. Cross-Stage Semantic Enhancement Module
3.4. Loss Function
4. Experiment
4.1. Dataset
4.2. Evaluation Metrics
4.3. Implementation Details
4.4. Comparisons with the State of the Art
4.5. Ablation Study
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Larese, M.G.; Granitto, P.M. Finding local leaf vein patterns for legume characterization and classification. Mach. Vis. Appl. 2016, 27, 709–720. [Google Scholar] [CrossRef]
- Darwin, B.; Dharmaraj, P.; Prince, S.; Popescu, D.E.; Hemanth, D.J. Recognition of bloom/yield in crop images using deep learning models for smart agriculture: A review. Agronomy 2021, 11, 646. [Google Scholar] [CrossRef]
- Sack, L.; Scoffoni, C. Leaf venation: Structure, function, development, evolution, ecology and applications in the past, present and future. New Phytol. 2013, 198, 983–1000. [Google Scholar] [CrossRef] [PubMed]
- Lersten, N.R. Modified clearing method to show sieve tubes in minor veins of leaves. Stain Technol. 1986, 61, 231–234. [Google Scholar] [CrossRef]
- Larese, M.G.; Namías, R.; Craviotto, R.M.; Arango, M.R.; Gallo, C.; Granitto, P.M. Automatic classification of legumes using leaf vein image features. Pattern Recognit. 2014, 47, 158–168. [Google Scholar] [CrossRef]
- Price, C.A.; Symonova, O.; Mileyko, Y.; Hilley, T.; Weitz, J.S. Leaf extraction and analysis framework graphical user interface: Segmenting and analyzing the structure of leaf veins and areoles. Plant Physiol. 2011, 155, 236–245. [Google Scholar] [CrossRef]
- Sibi Chakkaravarthy, S.; Sajeevan, G.; Kamalanaban, E.; Varun Kumar, K.A. Automatic leaf vein feature extraction for first degree veins. In Proceedings of the Advances in Signal Processing and Intelligent Recognition Systems: Proceedings of the Second International Symposium on Signal Processing and Intelligent Recognition Systems (SIRS-2015), Trivandrum, India, 16–19 December 2015; Springer International Publishing: Berlin/Heidelberg, Germany, 2015; pp. 581–592. [Google Scholar] [CrossRef]
- Radha, R.; Jeyalakshmi, S. An effective algorithm for edges and veins detection in leaf images. In Proceedings of the IEEE 2014 World Congress on Computing and Communication Technologies, Trichirappalli, India, 27 February–1 March 2014; pp. 128–131. [Google Scholar] [CrossRef]
- Selda, J.D.S.; Ellera, R.M.R.; Cajayon, L.C.; Linsangan, N.B. Plant identification by image processing of leaf veins. In Proceedings of the International Conference on Imaging, Signal Processing and Communication, Penang, Malaysia, 26–28 July 2017; pp. 40–44. [Google Scholar] [CrossRef]
- Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3431–3440. [Google Scholar] [CrossRef]
- Badrinarayanan, V.; Kendall, A.; Cipolla, R. Segnet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495. [Google Scholar] [CrossRef]
- Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, 5–9 October 2015; Proceedings, Part III 18; Springer International Publishing: Berlin/Heidelberg, Germany, 2015; pp. 234–241. [Google Scholar] [CrossRef]
- Han, K.; Wang, Y.; Tian, Q.; Guo, J.; Xu, C.; Xu, C. GhostNet: More features from cheap operations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 1580–1589. [Google Scholar] [CrossRef]
- Xu, H.; Blonder, B.; Jodra, M.; Malhi, Y.; Fricker, M. Automated and accurate segmentation of leaf venation networks via deep learning. New Phytol. 2021, 229, 631–648. [Google Scholar] [CrossRef]
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929. [Google Scholar] [CrossRef]
- Strudel, R.; Garcia, R.; Laptev, I.; Schmid, C. Segmenter: Transformer for semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 7262–7272. [Google Scholar] [CrossRef]
- Zheng, S.; Lu, J.; Zhao, H.; Zhu, X.; Luo, Z.; Wang, Y.; Fu, Y.; Feng, J.; Xiang, T.; Torr, P.H.S.; et al. Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 19–25 June 2021; pp. 6881–6890. [Google Scholar] [CrossRef]
- Bühler, J.; Rishmawi, L.; Pflugfelder, D.; Huber, G.; Scharr, H.; Hülskamp, M.; Koornneef, M.; Schurr, U.; Jahnke, S. phenoVein—A tool for leaf vein segmentation and analysis. Plant Physiol. 2015, 169, 2359–2370. [Google Scholar] [CrossRef] [PubMed]
- Kirchgeßner, N.; Scharr, H.; Schurr, U. Robust vein extraction on plant leaf images. In Proceedings of the 2nd IASTED International Conference Visualization, Imaging and Image Processing, Málaga, Spain, 9–12 September 2002. [Google Scholar]
- Blonder, B.; De Carlo, F.; Moore, J.; Rivers, M.; Enquist, B.J. X-ray imaging of leaf venation networks. New Phytol. 2012, 196, 1274–1282. [Google Scholar] [CrossRef]
- Salima, A.; Herdiyeni, Y.; Douady, S. Leaf vein segmentation of medicinal plant using hessian matrix. In Proceedings of the IEEE 2015 International Conference on Advanced Computer Science and Information Systems (ICACSIS), Depok, Indonesia, 10–11 October 2015; pp. 275–279. [Google Scholar] [CrossRef]
- Katyal, V. Leaf vein segmentation using Odd Gabor filters and morphological operations. arXiv 2012, arXiv:1206.5157. [Google Scholar] [CrossRef]
- Saleem, R.; Shah, J.H.; Sharif, M.; Yasmin, M.; Yong, H.S.; Cha, J. Mango leaf disease recognition and classification using novel segmentation and vein pattern technique. Appl. Sci. 2021, 11, 11901. [Google Scholar] [CrossRef]
- Fan, Y.; Shi, H.; Yu, J.; Liu, D.; Han, W.; Yu, H.; Wang, Z.; Wang, X.; Huang, T.S. Balanced two-stage residual networks for image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA, 21–26 July 2017; pp. 161–168. [Google Scholar] [CrossRef]
- Li, Y.; Liu, M.; Yi, Y.; Li, Q.; Ren, D.; Zuo, W. Two-stage single image reflection removal with reflection-aware guidance. Appl. Intell. 2023, 53, 19433–19448. [Google Scholar] [CrossRef]
- Wang, H.; Huang, T.Z.; Xu, Z.; Wang, Y. A two-stage image segmentation via global and local region active contours. Neurocomputing 2016, 205, 130–140. [Google Scholar] [CrossRef]
- Ong, S.H.; Yeo, N.C.; Lee, K.H.; Venkatesh, Y.V.; Cao, D.M. Segmentation of color images using a two-stage self-organizing network. Image Vis. Comput. 2002, 20, 279–289. [Google Scholar] [CrossRef]
- Kaur, R.; Gupta, N. CFS-MHA: A Two-Stage Network Intrusion Detection Framework. Int. J. Inf. Secur. Priv. (IJISP) 2022, 16, 1–27. [Google Scholar] [CrossRef]
- Mashta, F.; Altabban, W.; Wainakh, M. Two-Stage Spectrum Sensing for Cognitive Radio Using Eigenvalues Detection. Int. J. Interdiscip. Telecommun. Netw. (IJITN) 2020, 12, 18–36. [Google Scholar] [CrossRef]
- Chen, Y.; Xia, R.; Zou, K.; Yang, K. FFTI: Image inpainting algorithm via features fusion and two-steps inpainting. J. Vis. Commun. Image Represent. 2023, 91, 103776. [Google Scholar] [CrossRef]
- Cai, J.F.; Chan, R.H.; Nikolova, M. Fast two-phase image deblurring under impulse noise. J. Math. Imaging Vis. 2010, 36, 46–53. [Google Scholar] [CrossRef]
- Tian, C.; Zheng, M.; Zuo, W.; Zhang, B.; Zhang, Y.; Zhang, D. Multi-stage image denoising with the wavelet transform. Pattern Recognit. 2023, 134, 109050. [Google Scholar] [CrossRef]
- Elhassan, M.A.; Yang, C.; Huang, C.; Legesse Munea, T.; Hong, X. S2-FPN: Scale-ware Strip Attention Guided Feature Pyramid Network for Real-time Semantic Segmentation. arXiv 2022. [Google Scholar] [CrossRef]
- Xue, Y.; Jin, G.; Shen, T.; Tan, L.; Wang, L. Template-guided frequency attention and adaptive cross-entropy loss for UAV visual tracking. Chin. J. Aeronaut. 2023, 36, 299–312. [Google Scholar] [CrossRef]
- Yeung, M.; Rundo, L.; Nan, Y.; Sala, E.; Schönlieb, C.B.; Yang, G. Calibrating the Dice loss to handle neural network overconfidence for biomedical image segmentation. J. Digit. Imaging 2023, 36, 739–752. [Google Scholar] [CrossRef]
- Blonder, B.; Both, S.; Jodra, M.; Majalap, N.; Burslem, D.; Teh, Y.A.; Malhi, Y. Leaf venation networks of Bornean trees: Images and hand-traced segmentations. Ecology 2019, 100. [Google Scholar]
- Gu, Y.; Piao, Z.; Yoo, S.J. STHarDNet: Swin transformer with HarDNet for MRI segmentation. Appl. Sci. 2022, 12, 468. [Google Scholar] [CrossRef]
- Yang, B.; Qin, L.; Peng, H.; Guo, C.; Luo, X.; Wang, J. SDDC-Net: A U-shaped deep spiking neural P convolutional network for retinal vessel segmentation. Digit. Signal Process. 2023, 136, 104002. [Google Scholar] [CrossRef]
- Maqsood, A.; Farid, M.S.; Khan, M.H.; Grzegorzek, M. Deep malaria parasite detection in thin blood smear microscopic images. Appl. Sci. 2021, 11, 2284. [Google Scholar] [CrossRef]
- Zhou, Z.; Siddiquee, M.M.R.; Tajbakhsh, N.; Liang, J. Unet++: Redesigning skip connections to exploit multiscale features in image segmentation. IEEE Trans. Med. Imaging 2019, 39, 1856–1867. [Google Scholar] [CrossRef]
- Howard, A.; Sandler, M.; Chu, G.; Chen, L.C.; Chen, B.; Tan, M.; Wang, W.; Zhu, Y.; Pang, R.; Vasudevan, V.; et al. Searching for mobilenetv3. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 1314–1324. [Google Scholar]
- Chen, L.C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 801–818. [Google Scholar]
- Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 10012–10022. [Google Scholar] [CrossRef]
- Touvron, H.; Cord, M.; Douze, M.; Massa, F.; Sablayrolles, A.; Jégou, H. Training data-efficient image transformers & distillation through attention. In Proceedings of the International Conference on Machine Learning, Virtual, 18–24 July 2021; pp. 10347–10357. [Google Scholar]
- Tang, W.; Zou, D.; Yang, S.; Shi, J.; Dan, J.; Song, G. A two-stage approach for automatic liver segmentation with Faster R-CNN and DeepLab. Neural Comput. Appl. 2020, 32, 6769–6778. [Google Scholar] [CrossRef]
- Jiang, Z.; Ding, C.; Liu, M.; Tao, D. Two-stage cascaded u-net: 1st place solution to brats challenge 2019 segmentation task. In Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries: 5th International Workshop, BrainLes 2019, Held in Conjunction with MICCAI 2019, Shenzhen, China, 17 October 2019; Revised Selected Papers, Part I 5; Springer International Publishing: Berlin/Heidelberg, Germany, 2020; pp. 231–241. [Google Scholar] [CrossRef]
- Božič, J.; Tabernik, D.; Skočaj, D. End-to-end training of a two-stage neural network for defect detection. In Proceedings of the IEEE 2020 25th International Conference on Pattern Recognition (ICPR), Milan, Italy, 10–15 January 2021; pp. 5619–5626. [Google Scholar] [CrossRef]
- Liu, J.; Yang, X.; Lau, S.; Wang, X.; Luo, S.; Lee, V.C.S.; Ding, L. Automated pavement crack detection and segmentation based on two-step convolutional neural network. Comput.-Aided Civ. Infrastruct. Eng. 2020, 35, 1291–1305. [Google Scholar] [CrossRef]
- Park, M.H.; Cho, J.H.; Kim, Y.T. CNN Model with Multilayer ASPP and Two-Step Cross-Stage for Semantic Segmentation. Machines 2023, 11, 126. [Google Scholar] [CrossRef]
- Jiang, Y.; Pang, D.; Li, C.; Yu, Y.; Cao, Y. Two-step deep learning approach for pavement crack damage detection and segmentation. Int. J. Pavement Eng. 2022, 1–14. [Google Scholar] [CrossRef]
- Dina, A.S.; Siddique, A.B.; Manivannan, D. A deep learning approach for intrusion detection in Internet of Things using focal loss function. Internet Things 2023, 22, 100699. [Google Scholar] [CrossRef]
Type | Model | Year | Dice | Spe | Sen | Acc
---|---|---|---|---|---|---
CNN | FCN [10] | 2015 | 76.52 ± 1.29 | 77.57 ± 0.52 | 81.54 ± 2.63 | 83.78 ± 0.05
CNN | U-Net [12] | 2015 | 81.47 ± 1.27 | 81.38 ± 0.45 | 84.28 ± 0.57 | 87.02 ± 0.03
CNN | GhostNet [13] | 2020 | 81.22 ± 1.35 | 80.87 ± 0.68 | 84.33 ± 0.66 | 86.74 ± 0.06
CNN | MobileNetv3 [41] | 2019 | 81.34 ± 1.38 | 80.75 ± 0.60 | 84.36 ± 0.72 | 86.73 ± 0.07
CNN | DeepLabv3+ [42] | 2018 | 82.51 ± 1.25 | 81.90 ± 0.41 | 85.06 ± 0.51 | 87.12 ± 0.03
Transformer | ViT [15] | 2020 | 82.86 ± 2.42 | 81.41 21 | 86.30 ± 0.84 | 87.24 ± 0.04
Transformer | Segmenter [16] | 2021 | 81.71 ± 0.37 | 80.77 02 | 84.88 ± 0.91 | 87.15 ± 0.05
Transformer | Swin Transformer [43] | 2021 | 83.14 ± 0.44 | 81.76 ± 0.86 | 86.75 ± 0.72 | 87.86 ± 0.03
Transformer | SETR [17] | 2021 | 82.82 ± 0.41 | 81.48 ± 0.72 | 86.69 ± 0.88 | 87.59 ± 0.04
Transformer | DeiT [44] | 2021 | 82.78 ± 0.48 | 81.98 ± 0.84 | 86.67 ± 0.90 | 87.81 ± 0.03
Two-stage | ALSNet [45] | 2020 | 82.36 ± 0.23 | 81.64 ± 0.56 | 85.58 ± 0.59 | 87.44 ± 0.03
Two-stage | TSUNet [46] | 2020 | 82.93 ± 0.25 | 81.53 ± 0.72 | 85.72 ± 0.47 | 87.63 ± 0.02
Two-stage | DDNet [47] | 2021 | 83.89 ± 0.31 | 82.45 ± 0.64 | 86.85 ± 0.53 | 87.82 ± 0.01
Two-stage | APCNet [48] | 2020 | 83.40 ± 0.40 | 81.55 ± 0.51 | 86.77 ± 0.58 | 87.47 ± 0.03
Two-stage | CMMANet [49] | 2023 | 83.96 ± 0.29 | 82.03 ± 0.43 | 86.89 ± 0.61 | 87.62 ± 0.04
Two-stage | PCDNet [50] | 2022 | 83.49 ± 0.26 | 82.16 ± 0.49 | 86.38 ± 0.50 | 87.53 ± 0.02
 | TTH-Net | | 85.24 ± 0.28 | 82.95 ± 0.46 | 87.36 ± 0.55 | 89.01 ± 0.03
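The columns above report the Dice coefficient (Dice), specificity (Spe), sensitivity (Sen), and accuracy (Acc) as mean ± standard deviation. As a reference for how these metrics are conventionally computed for binary vein masks, here is a minimal NumPy sketch built on the standard confusion-matrix definitions; the function name and the idea of evaluating a single image pair are assumptions, not details taken from the paper.

```python
import numpy as np


def segmentation_metrics(pred, target, eps=1e-7):
    """Dice, specificity, sensitivity, and accuracy (in %) for binary 0/1 masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    tp = np.logical_and(pred, target).sum()      # vein pixels predicted as vein
    tn = np.logical_and(~pred, ~target).sum()    # background predicted as background
    fp = np.logical_and(pred, ~target).sum()     # background predicted as vein
    fn = np.logical_and(~pred, target).sum()     # vein pixels missed
    dice = 2 * tp / (2 * tp + fp + fn + eps)
    spe = tn / (tn + fp + eps)                   # specificity: true-negative rate
    sen = tp / (tp + fn + eps)                   # sensitivity (recall): true-positive rate
    acc = (tp + tn) / (tp + tn + fp + fn + eps)
    return {name: 100.0 * value for name, value in
            [("Dice", dice), ("Spe", spe), ("Sen", sen), ("Acc", acc)]}


# Example with a random prediction/ground-truth pair of 256x256 masks.
rng = np.random.default_rng(0)
print(segmentation_metrics(rng.integers(0, 2, (256, 256)), rng.integers(0, 2, (256, 256))))
```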
Module | Dice | Spe | Sen | Acc
---|---|---|---|---
with 1st stage | 83.83 ± 0.35 | 81.57 ± 0.63 | 86.03 ± 0.80 | 87.67 ± 0.04
with 1st stage + 2nd stage | 84.47 ± 0.26 | 82.18 ± 0.47 | 86.42 ± 0.61 | 88.22 ± 0.03
with 1st + 2nd stage + CSE module | 85.24 ± 0.28 | 82.95 ± 0.46 | 87.36 ± 0.55 | 89.01 ± 0.03
Loss functions: CE = cross-entropy, DL = Dice loss, Focal = focal loss.

Study | Stage | CE | DL | Focal | Dice | Spe | Sen | Acc
---|---|---|---|---|---|---|---|---
1 | 1st | √ | − | − | 85.24 ± 0.28 | 82.95 ± 0.46 | 87.36 ± 0.55 | 89.01 ± 0.03
1 | 2nd | − | √ | − | | | |
2 | 1st | − | √ | − | 84.56 ± 0.35 | 82.37 ± 0.54 | 86.47 ± 0.67 | 88.37 ± 0.04
2 | 2nd | − | − | √ | | | |
3 | 1st | √ | − | − | 83.89 ± 0.38 | 82.42 ± 0.58 | 85.68 | 87.42 ± 0.03
3 | 2nd | − | − | √ | | | |
4 | 1st | − | √ | − | 84.78 ± 0.26 | 82.67 ± 0.41 | 86.02 ± 0.52 | 87.76 ± 0.04
4 | 2nd | √ | − | − | | | |
5 | 1st | − | − | √ | 84.83 ± 0.29 | 82.25 ± 0.44 | 86.14 ± 0.41 | 88.35 ± 0.02
5 | 2nd | | | | | | |
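Study 1 above, cross-entropy (CE) on the first stage and Dice loss (DL) on the second, reproduces the headline TTH-Net numbers, which suggests the training objective pairs a stage-1 CE term with a stage-2 Dice term. The sketch below shows one plausible way to wire such a two-stage loss in PyTorch; the binary cross-entropy formulation, the equal weighting of the two terms, and the function names are assumptions rather than the paper's exact definition.

```python
import torch
import torch.nn.functional as F


def dice_loss(logits, target, eps=1e-6):
    """Soft Dice loss for binary segmentation; `logits` are raw scores, `target` is 0/1."""
    prob = torch.sigmoid(logits)
    inter = (prob * target).sum(dim=(1, 2, 3))
    union = prob.sum(dim=(1, 2, 3)) + target.sum(dim=(1, 2, 3))
    return (1.0 - (2.0 * inter + eps) / (union + eps)).mean()


def two_stage_loss(coarse_logits, fine_logits, target):
    """Study 1 in the table: cross-entropy on the coarse (stage-1) output and
    Dice loss on the refined (stage-2) output; equal weighting is an assumption."""
    l_coarse = F.binary_cross_entropy_with_logits(coarse_logits, target)
    l_fine = dice_loss(fine_logits, target)
    return l_coarse + l_fine


# Example with random stage outputs and a random ground-truth mask.
coarse = torch.randn(2, 1, 256, 256)
fine = torch.randn(2, 1, 256, 256)
gt = torch.randint(0, 2, (2, 1, 256, 256)).float()
print(two_stage_loss(coarse, fine, gt).item())
```

In practice the two terms would usually be balanced by a weighting coefficient rather than summed equally; the sweep over 0.3–0.7 below appears to tune a hyperparameter of that kind, although the table does not name it.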
 | Dice | Spe | Sen | Acc
---|---|---|---|---
0.3 | 84.56 ± 0.34 | 81.74 ± 0.58 | 87.03 ± 0.64 | 88.58 ± 0.04
0.4 | 85.24 ± 0.28 | 82.95 ± 0.46 | 87.36 ± 0.55 | 89.01 ± 0.03
0.5 | 85.07 ± 0.31 | 82.46 ± 0.47 | 87.10 ± 0.57 | 88.56 ± 0.03
0.6 | 84.57 ± 0.38 | 82.20 ± 0.62 | 87.02 ± 0.71 | 88.14 ± 0.05
0.7 | 84.42 ± 0.35 | 82.28 ± 0.53 | 87.04 ± 0.69 | 88.28 ± 0.04
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).