A Segmentation Network with Two Distinct Attention Modules for the Segmentation of Multiple Renal Structures in Ultrasound Images
Abstract
1. Introduction
- (1) We explore deep-learning-based methods for segmenting multiple renal structures in ultrasound images and propose a novel segmentation model, MAT-UNet, which demonstrates high reliability, accuracy, and robustness.
- (2) We design a multi-convolution pixel-wise attention module (MCPAM), which combines convolution layers of different kernel sizes with pixel-wise attention to guide the network toward the more important features.
- (3) To enhance the model’s ability to capture features, we develop a triple-branch multi-head self-attention mechanism (TBMSM) at the bottom of MAT-UNet. It uses three convolution layers with different kernel sizes to obtain different receptive fields, and employs three multi-head self-attention mechanisms to learn global contextual information.
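
The two attention ideas named in contributions (2) and (3) can be illustrated with a small, dependency-free sketch. This is not the MAT-UNet implementation: the kernel sizes (1, 3, 5), the mean filters standing in for learned convolutions, the identity query/key/value projections, the head count, and the averaging used to fuse branches are all assumptions made for illustration, and every function name below is hypothetical.

```python
import math

# --- Idea behind contribution (2): multi-scale branches + pixel-wise gate (toy, one channel) ---

def mean_conv(img, k):
    """k x k mean filter with zero padding; a fixed stand-in for a learned k x k convolution."""
    h, w, r = len(img), len(img[0]), k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[yy][xx]
                      for yy in range(y - r, y + r + 1)
                      for xx in range(x - r, x + r + 1)
                      if 0 <= yy < h and 0 <= xx < w]
            out[y][x] = sum(window) / (k * k)
    return out

def pixel_attention(img, kernel_sizes=(1, 3, 5)):
    """Sum the multi-scale branches, squash to (0, 1) per pixel, and reweight the input."""
    branches = [mean_conv(img, k) for k in kernel_sizes]
    h, w = len(img), len(img[0])
    return [[img[y][x] / (1.0 + math.exp(-sum(b[y][x] for b in branches)))
             for x in range(w)] for y in range(h)]

# --- Idea behind contribution (3): three receptive fields, one attention block each ---

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]

def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections (no learned weights)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        weights = softmax([sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens])
        out.append([sum(wt * v[j] for wt, v in zip(weights, tokens)) for j in range(d)])
    return out

def multi_head_self_attention(tokens, heads=2):
    """Split channels into `heads` groups, attend within each group, then concatenate.

    Assumes the channel count is divisible by `heads`.
    """
    size = len(tokens[0]) // heads
    per_head = [self_attention([t[h * size:(h + 1) * size] for t in tokens])
                for h in range(heads)]
    return [[v for head in per_head for v in head[i]] for i in range(len(tokens))]

def smooth_tokens(tokens, k):
    """Mean filter of width k along the token axis (stand-in for a size-k convolution)."""
    n, d, r = len(tokens), len(tokens[0]), k // 2
    return [[sum(tokens[j][c] for j in range(i - r, i + r + 1) if 0 <= j < n) / k
             for c in range(d)] for i in range(n)]

def triple_branch_attention(tokens, kernel_sizes=(1, 3, 5), heads=2):
    """Run one attention block per receptive field and average the three outputs."""
    outs = [multi_head_self_attention(smooth_tokens(tokens, k), heads) for k in kernel_sizes]
    return [[sum(o[i][c] for o in outs) / len(outs) for c in range(len(tokens[0]))]
            for i in range(len(tokens))]
```

In this toy version, `pixel_attention` leaves zero-valued pixels at zero and scales the rest by a per-pixel weight in (0, 1), while `triple_branch_attention` returns a token list with the same shape as its input. A real module would use learned convolution and projection weights in a deep-learning framework.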
2. Materials and Methods
2.1. Network Architecture
2.1.1. Overall
2.1.2. Encoder
2.1.3. Decoder
2.1.4. Multi-Convolution Pixel-Wise Attention Module
2.1.5. Triple-Branch Multi-Head Self-Attention Mechanism
2.2. Dataset
2.3. Implementation Details
2.4. Loss Function
2.5. Metrics
3. Results
3.1. Comparison Results of Renal Capsule
3.2. Comparison Results of Internal Renal Structures
3.3. Ablation Results
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Methods | DSC (%) | HD95 (mm) | ASD (mm) | IOU (%) |
---|---|---|---|---|
UNet | 89.95 * | 81.02 * | 25.80 * | 82.54 * |
Attention UNet | 90.69 * | 71.13 * | 21.83 * | 83.79 * |
CMUNeXt | 91.58 * | 52.96 * | 15.54 * | 85.17 * |
VMUNet | 89.34 * | 51.27 * | 17.47 * | 81.96 * |
MambaUNet | 89.33 * | 53.31 * | 18.41 * | 81.89 * |
UNeXt | 89.50 * | 50.39 * | 15.48 * | 82.60 * |
SwinUNet | 78.94 * | 82.09 * | 33.09 * | 67.16 * |
I2UNet | 92.14 * | 38.02 * | 12.37 * | 86.12 * |
Ours | 93.83 | 32.02 | 9.80 | 88.74 |
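
DSC and IOU in the table above are overlap measures (higher is better), whereas HD95 and ASD are boundary-distance measures (lower is better). As a minimal sketch of the two overlap measures under their standard definitions, the hypothetical helper below scores a pair of binary masks; it is not the evaluation code behind the reported numbers, and the convention for two empty masks is an assumption.

```python
def dice_and_iou(pred, target):
    """Return (DSC, IoU) in percent for two binary masks given as nested 0/1 lists."""
    p = [v for row in pred for v in row]
    t = [v for row in target for v in row]
    inter = sum(1 for a, b in zip(p, t) if a and b)   # pixels set in both masks
    p_sum = sum(1 for a in p if a)                    # predicted foreground size
    t_sum = sum(1 for b in t if b)                    # ground-truth foreground size
    union = p_sum + t_sum - inter
    if union == 0:
        # Both masks empty: treated as perfect agreement (an assumed convention).
        return 100.0, 100.0
    dsc = 200.0 * inter / (p_sum + t_sum)             # 2|A ∩ B| / (|A| + |B|)
    iou = 100.0 * inter / union                       # |A ∩ B| / |A ∪ B|
    return dsc, iou
```

For a single mask pair the two scores are tied by IoU = DSC / (2 − DSC) when both are expressed as fractions; that identity need not hold for averages taken over many images.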
Structures | Methods | DSC (%) | HD95 (mm) | ASD (mm) | IOU (%) |
---|---|---|---|---|---|
CEC | UNet | 81.48 * | 50.10 * | 16.45 * | 70.40 * |
CEC | Attention UNet | 81.42 * | 55.43 * | 16.68 * | 70.34 * |
CEC | CMUNeXt | 83.13 | 42.11 | 12.63 | 72.30 |
CEC | VMUNet | 80.73 * | 42.70 | 12.35 | 69.63 * |
CEC | MambaUNet | 79.62 * | 46.15 * | 13.68 | 68.48 * |
CEC | UNeXt | 79.29 * | 60.43 * | 16.34 * | 67.95 * |
CEC | SwinUNet | 72.72 * | 79.60 * | 28.55 * | 59.96 * |
CEC | I2UNet | 82.70 | 33.73 | 10.92 | 72.86 |
CEC | Ours | 84.34 | 35.79 | 11.17 | 74.26 |
renal medulla | UNet | 64.81 | 85.10 | 24.62 * | 50.28 |
renal medulla | Attention UNet | 62.91 * | 82.92 | 22.97 * | 48.13 * |
renal medulla | CMUNeXt | 63.71 | 88.91 | 27.71 * | 48.71 |
renal medulla | VMUNet | 62.38 * | 86.16 | 27.23 * | 47.78 * |
renal medulla | MambaUNet | 62.58 | 83.90 | 25.56 * | 47.77 * |
renal medulla | UNeXt | 60.39 * | 87.36 | 24.67 * | 45.76 * |
renal medulla | SwinUNet | 52.41 * | 112.59 * | 42.89 * | 37.74 * |
renal medulla | I2UNet | 65.56 | 75.48 | 22.40 | 50.91 |
renal medulla | Ours | 66.34 | 82.54 | 19.52 | 51.78 |
renal cortex | UNet | 57.04 | 112.92 | 26.82 | 41.53 |
renal cortex | Attention UNet | 58.30 | 120.28 | 27.51 * | 42.62 |
renal cortex | CMUNeXt | 56.53 | 111.62 | 31.53 * | 41.01 |
renal cortex | VMUNet | 53.40 * | 114.95 | 35.81 * | 38.36 * |
renal cortex | MambaUNet | 54.40 * | 119.38 | 33.35 * | 39.14 * |
renal cortex | UNeXt | 51.55 * | 117.81 | 31.79 * | 36.88 * |
renal cortex | SwinUNet | 43.54 * | 138.05 * | 46.60 * | 29.55 * |
renal cortex | I2UNet | 57.35 | 100.31 | 25.74 | 42.00 |
renal cortex | Ours | 58.93 | 107.02 | 21.69 | 43.61 |
Methods | DSC (%) | HD95 (mm) | ASD (mm) | IOU (%) |
---|---|---|---|---|
Baseline | 91.24 * | 61.63 * | 19.85 * | 84.71 * |
Baseline + MCPAM | 92.56 * | 35.25 | 11.31 | 86.84 * |
Baseline + TBMSM | 93.26 | 37.06 | 10.49 | 87.81 |
Baseline + MCPAM + TBMSM | 93.83 | 32.02 | 9.80 | 88.74 |
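
To make the ablation table above easier to read, the short script below computes each configuration's change relative to the baseline. The numbers are copied from the table; the dictionary layout and the helper name `change_vs_baseline` are hypothetical, chosen only for this illustration.

```python
# Values copied from the ablation table: (DSC %, HD95 mm, ASD mm, IOU %).
ABLATION = {
    "Baseline": (91.24, 61.63, 19.85, 84.71),
    "Baseline + MCPAM": (92.56, 35.25, 11.31, 86.84),
    "Baseline + TBMSM": (93.26, 37.06, 10.49, 87.81),
    "Baseline + MCPAM + TBMSM": (93.83, 32.02, 9.80, 88.74),
}

def change_vs_baseline(name):
    """Return DSC/IOU gains in percentage points and HD95/ASD reductions in percent."""
    b_dsc, b_hd, b_asd, b_iou = ABLATION["Baseline"]
    dsc, hd, asd, iou = ABLATION[name]
    return {
        "dsc_gain_pts": round(dsc - b_dsc, 2),
        "iou_gain_pts": round(iou - b_iou, 2),
        "hd95_drop_pct": round(100.0 * (b_hd - hd) / b_hd, 1),
        "asd_drop_pct": round(100.0 * (b_asd - asd) / b_asd, 1),
    }

for name in ABLATION:
    print(name, change_vs_baseline(name))
```

With both modules enabled, DSC rises by 2.59 points and IOU by 4.03 points over the baseline, while HD95 falls by roughly 48% and ASD by roughly 51%.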
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zuo, Y.; Li, J.; Tian, J. A Segmentation Network with Two Distinct Attention Modules for the Segmentation of Multiple Renal Structures in Ultrasound Images. Diagnostics 2025, 15, 1978. https://doi.org/10.3390/diagnostics15151978