Breast Ultrasound Image Segmentation Integrating Mamba-CNN and Feature Interaction
Abstract
1. Introduction
2. Materials and Methods
2.1. General Structure of the Network Model
2.2. Encoder Module VSS
2.3. Hybrid Attention Enhancement Mechanism (HAEM)
2.4. Cross-Fusion Module (CFM)
2.5. Loss Function
3. Experiments
3.1. Material Preparation
3.1.1. Environment Configuration
3.1.2. Datasets
3.1.3. Evaluation Metrics
3.2. Experimental Results
3.2.1. Module Control Experiment
3.2.2. Comparison of Different Algorithms
3.2.3. Ablation Study
4. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References

| Method | Parameters/M | FLOPs/G |
|---|---|---|
| Unet | 34.53 | 65.53 |
| Transunet | 92.23 | 148.06 |
| Swin-unet | 27.15 | 47.32 |
| HCTNet | 22.20 | 38.91 |
| MambaUnet | 27.84 | 32.51 |
| Ours | 21.76 | 39.19 |
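
The complexity figures above are easier to compare as relative differences. The following is a minimal sketch that recomputes those differences from the table values only; the names `COMPLEXITY`, `relative_change`, and `compare_to` are hypothetical helpers introduced for illustration and do not come from the paper.

```python
# Values copied from the complexity table above:
# (parameters in millions, FLOPs in G) per method.
COMPLEXITY = {
    "Unet": (34.53, 65.53),
    "Transunet": (92.23, 148.06),
    "Swin-unet": (27.15, 47.32),
    "HCTNet": (22.20, 38.91),
    "MambaUnet": (27.84, 32.51),
    "Ours": (21.76, 39.19),
}


def relative_change(value, reference):
    """Percentage change of `value` relative to `reference` (negative = smaller)."""
    return 100.0 * (value - reference) / reference


def compare_to(baseline, target="Ours", table=COMPLEXITY):
    """Return (parameter change %, FLOPs change %) of `target` versus `baseline`."""
    target_params, target_flops = table[target]
    base_params, base_flops = table[baseline]
    return (
        relative_change(target_params, base_params),
        relative_change(target_flops, base_flops),
    )


if __name__ == "__main__":
    for name in COMPLEXITY:
        if name != "Ours":
            d_params, d_flops = compare_to(name)
            print(f"Ours vs {name}: parameters {d_params:+.1f}%, FLOPs {d_flops:+.1f}%")
```

Read this way, the table shows that the proposed model has the fewest parameters of the listed methods, while HCTNet and MambaUnet report lower FLOPs.
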

| Base Model | Dataset | Module | Dice/% ↑ | HD95/mm ↓ | Pre/% | Rec/% |
|---|---|---|---|---|---|---|
| Mamba | Experimental dataset | CAM | 72.98 | 25.43 | 75.11 | 76.06 |
| Mamba | Experimental dataset | SAM | 73.07 | 26.69 | 74.90 | 76.53 |
| Mamba | Experimental dataset | BAM | 73.37 | 24.44 | 78.49 | 75.81 |
| Mamba | Experimental dataset | HAEM | 73.89 | 23.21 | 78.32 | 76.74 |

| Method | Dice/% ↑ | HD95/mm ↓ | Pre/% | Rec/% |
|---|---|---|---|---|
| Unet | 75.23 | 13.47 | 82.19 | 77.91 |
| Transunet | 80.60 | 12.62 | 83.44 | 79.08 |
| Swin-unet | 80.80 | 11.93 | 85.51 | 83.90 |
| HCTNet | 81.24 | 10.53 | 83.59 | 82.12 |
| MambaUnet | 81.57 | 10.62 | 84.85 | 83.94 |
| Ours | 82.89 | 10.38 | 83.97 | 83.82 |

| Method | Dice/% ↑ | HD95/mm ↓ | Pre/% | Rec/% |
|---|---|---|---|---|
| Unet | 72.10 | 13.37 | 76.58 | 74.02 |
| Transunet | 74.61 | 12.63 | 80.28 | 76.63 |
| Swin-unet | 75.64 | 11.37 | 79.22 | 78.75 |
| HCTNet | 76.94 | 9.97 | 80.29 | 76.90 |
| MambaUnet | 77.95 | 10.57 | 80.85 | 79.75 |
| Ours | 79.60 | 9.99 | 82.46 | 79.46 |

| Algorithm | Dice/% ↑ | HD95/mm ↓ | Pre/% | Rec/% |
|---|---|---|---|---|
| Unet | 70.72 | 28.52 | 77.00 | 71.48 |
| FPN | 71.26 | 26.15 | 78.51 | 72.58 |
| DeepLabv3+ | 73.88 | 25.29 | 76.81 | 76.61 |
| Transunet | 73.83 | 23.71 | 79.95 | 73.60 |
| Swin-unet | 74.88 | 22.38 | 78.25 | 76.53 |
| HCTNet | 75.01 | 22.21 | 81.16 | 74.30 |
| VMamba | 73.02 | 23.77 | 80.13 | 72.44 |
| MambaUnet | 75.19 | 22.04 | 81.22 | 75.47 |
| Ours | 76.04 | 20.28 | 81.05 | 76.87 |


| Algorithm | Modules enabled (of HAEM, CFM-1, CFM-3) | Dice/% ↑ | HD95/mm ↓ | Pre/% | Rec/% |
|---|---|---|---|---|---|
| A | none | 74.36 | 22.81 | 77.53 | 75.74 |
| B | √ (one) | 75.00 | 21.19 | 80.75 | 74.91 |
| C | √ √ (two) | 75.52 | 21.58 | 81.04 | 74.69 |
| D | √ √ (two) | 75.99 | 21.06 | 81.67 | 76.06 |
| Ours | √ √ √ (all three) | 76.04 | 20.28 | 81.05 | 76.87 |

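
The Dice, Pre (precision), and Rec (recall) columns in the tables above are overlap scores between a predicted mask and a ground-truth mask. As a minimal sketch, assuming the standard pixel-wise definitions (the paper's exact formulas are not reproduced in this section), they can be computed from binary masks as follows. The function names and the `eps` smoothing term are hypothetical choices for illustration, and HD95 is not covered here.

```python
def confusion_counts(pred, target):
    """Count true positives, false positives, and false negatives.

    `pred` and `target` are equally sized binary masks given as nested
    lists of 0/1 values (rows of pixels).
    """
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    return tp, fp, fn


def overlap_metrics(pred, target, eps=1e-8):
    """Return Dice, precision, and recall as percentages.

    `eps` (an assumption, not taken from the paper) avoids division by
    zero when both masks are empty.
    """
    tp, fp, fn = confusion_counts(pred, target)
    dice = 2 * tp / (2 * tp + fp + fn + eps)
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    return {
        "dice": 100.0 * dice,
        "precision": 100.0 * precision,
        "recall": 100.0 * recall,
    }


if __name__ == "__main__":
    # Toy 3x3 example: 2 overlapping pixels, 1 false positive, 1 false negative.
    predicted = [[0, 1, 1], [0, 1, 0], [0, 0, 0]]
    reference = [[0, 1, 0], [0, 1, 1], [0, 0, 0]]
    for name, value in overlap_metrics(predicted, reference).items():
        print(f"{name}: {value:.2f}%")
```

In this toy example all three scores coincide because the numbers of false positives and false negatives are equal; in the tables they differ whenever a model over- or under-segments the lesion.
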
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Yang, G.; Zhang, Y.; Yang, H. Breast Ultrasound Image Segmentation Integrating Mamba-CNN and Feature Interaction. Sensors 2026, 26, 105. https://doi.org/10.3390/s26010105

