G-UNETR++: A Gradient-Enhanced Network for Accurate and Robust Liver Segmentation from Computed Tomography Images
Abstract
1. Introduction
- We introduce a new network architecture, G-UNETR++, that improves liver segmentation accuracy while keeping model complexity moderate.
- We propose gradient-based encoders that learn 3D geometric features such as the boundaries between different organs and tissues.
- We design a novel hybrid loss function that combines Dice loss, cross-entropy (CE) loss, and Hausdorff distance (HD) loss to address class imbalance and improve segmentation in challenging cases.
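To illustrate the gradient-enhanced idea in the second contribution, the sketch below derives first-order gradient channels from a CT volume with finite differences and stacks them with the raw intensities as encoder input. This is a minimal sketch of feeding boundary-sensitive gradient information to an encoder; the exact gradient operator and the way these channels enter the G-encoders are defined by the paper, not by this code.

```python
import numpy as np

def gradient_features(volume):
    """Append first-order gradient channels to a 3D CT volume.

    volume: (D, H, W) float array of intensities.
    Returns a (4, D, H, W) array: intensity plus the dz, dy, dx
    gradients computed with np.gradient (central differences in the
    interior, one-sided differences at the borders).
    """
    dz, dy, dx = np.gradient(volume.astype(np.float64))
    return np.stack([volume.astype(np.float64), dz, dy, dx], axis=0)

# Toy example: a linear ramp has constant gradients along each axis.
vol = np.arange(64, dtype=np.float64).reshape(4, 4, 4)  # vol[z,y,x] = 16z + 4y + x
feats = gradient_features(vol)
print(feats.shape)  # (4, 4, 4, 4)
```

Because the toy volume is linear in each coordinate, the three gradient channels are constant (16, 4, and 1), which makes the stacking easy to verify by hand.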
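The hybrid loss in the third contribution can be sketched as a weighted sum of three terms. The HD term below follows the distance-transform surrogate of Karimi and Salcudean (cited in the reference list); the equal weights and the brute-force distance transform are illustrative assumptions for self-containment, not the paper's settings.

```python
import numpy as np

def dice_loss(p, g, eps=1e-6):
    # Soft Dice loss between probabilities p and binary ground truth g.
    inter = (p * g).sum()
    return float(1.0 - (2.0 * inter + eps) / (p.sum() + g.sum() + eps))

def ce_loss(p, g, eps=1e-6):
    # Binary cross-entropy averaged over voxels.
    p = np.clip(p, eps, 1.0 - eps)
    return float(-(g * np.log(p) + (1 - g) * np.log(1 - p)).mean())

def distance_map(mask):
    # Brute-force Euclidean distance to the nearest foreground voxel.
    # Fine for toy arrays; real code would use a fast distance transform.
    fg = np.argwhere(mask > 0)
    out = np.zeros(mask.shape)
    for idx in np.ndindex(mask.shape):
        d = np.sqrt(((fg - np.array(idx)) ** 2).sum(axis=1))
        out[idx] = d.min() if len(fg) else 0.0
    return out

def hd_loss(p, g):
    # Distance-transform surrogate of the Hausdorff distance (after
    # Karimi & Salcudean, 2020): segmentation errors are weighted by
    # their squared distances to both mask boundaries.
    dg = distance_map(g)
    dp = distance_map(p > 0.5)
    return float(((p - g) ** 2 * (dg ** 2 + dp ** 2)).mean())

def hybrid_loss(p, g, w=(1.0, 1.0, 1.0)):
    # Equal weights are an assumption; the paper may balance terms differently.
    return w[0] * dice_loss(p, g) + w[1] * ce_loss(p, g) + w[2] * hd_loss(p, g)

# Toy check: a perfect prediction drives all three terms toward zero.
g = np.zeros((4, 4)); g[1:3, 1:3] = 1.0
print(hybrid_loss(g.copy(), g))
```

A perfect prediction yields a loss near zero, while an inverted prediction is penalized by all three terms, with the HD term growing with how far the misclassified voxels lie from the true boundary.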
2. Related Work
2.1. CNN-Based Segmentation Networks
2.2. Transformer-Based Segmentation Networks
2.3. Hybrid Segmentation Networks
3. Methods
3.1. Architecture
3.2. Gradient-Enhanced Encoder Scheme
3.3. Loss Function
4. Experiments and Results
4.1. Datasets
4.2. Data Preprocessing and Augmentation
4.3. Post-Processing
4.4. Experimental Environment and Parameters
4.5. Evaluation Metrics
4.6. Segmentation Results of the Proposed Model
4.7. Ablation Study
4.8. Comparison of Models
5. Discussion and Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Sung, H.; Ferlay, J.; Siegel, R.L.; Laversanne, M.; Soerjomataram, I.; Jemal, A.; Bray, F. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 2021, 71, 209–249.
- Lu, X.; Wu, J.; Ren, X.; Zhang, B.; Li, Y. The study and application of the improved region growing algorithm for liver segmentation. Optik 2014, 125, 2142–2147.
- Yang, X.; Yu, H.; Choi, Y.; Lee, W.; Wang, B.; Yang, J.; Hwang, H.; Kim, J.H.; Song, J.; Cho, B.H.; et al. A hybrid semi-automatic method for liver segmentation based on level-set methods using multiple seed points. Comput. Methods Programs Biomed. 2014, 113, 69–79.
- Lu, D.; Wu, Y.; Harris, G.; Cai, W. Iterative mesh transformation for 3D segmentation of livers with cancers in CT images. Comput. Med. Imaging Graph. 2015, 43, 1–14.
- Dawant, B.M.; Li, R.; Lennon, B.; Li, S. Semi-automatic segmentation of the liver and its evaluation on the MICCAI 2007 grand challenge data set. In Proceedings of the MICCAI 2007 Workshop: 3D Segmentation in the Clinic: A Grand Challenge, Brisbane, Australia, 29 October 2007; pp. 215–221.
- Hermoye, L.; Laamari-Azjal, I.; Cao, Z.; Annet, L.; Lerut, J.; Dawant, B.M.; Van Beers, B.E. Liver segmentation in living liver transplant donors: Comparison of semiautomatic and manual methods. Radiology 2005, 234, 171–178.
- Jiang, L.; Ou, J.; Liu, R.; Zou, Y.; Xie, T.; Xiao, H.; Bai, T. RMAU-Net: Residual Multi-Scale Attention U-Net for liver and tumor segmentation in CT images. Comput. Biol. Med. 2023, 158, 106838.
- Chen, Y.; Zheng, C.; Zhou, T.; Feng, L.; Liu, L.; Zeng, Q.; Wang, G. A deep residual attention-based U-Net with a biplane joint method for liver segmentation from CT scans. Comput. Biol. Med. 2023, 152, 106421.
- Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional networks for biomedical image segmentation. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), Munich, Germany, 5–9 October 2015; pp. 234–241.
- Gul, S.; Khan, M.S.; Bibi, A.; Khandakar, A.; Ayari, M.A.; Chowdhury, M.E.H. Deep learning techniques for liver and liver tumor segmentation: A review. Comput. Biol. Med. 2022, 147, 105620.
- Yu, X.; Yang, Q.; Zhou, Y.; Cai, L.Y.; Gao, R.; Lee, H.H.; Li, T.; Bao, S.; Xu, Z.; Lasko, T.A.; et al. UNesT: Local spatial representation learning with hierarchical transformer for efficient medical segmentation. Med. Image Anal. 2023, 90, 102939.
- Hatamizadeh, A.; Tang, Y.; Nath, V.; Yang, D.; Myronenko, A.; Landman, B.; Roth, H.R.; Xu, D. UNETR: Transformers for 3D medical image segmentation. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 3–8 January 2022; pp. 574–584.
- Hu, H.; Zhang, Z.; Xie, Z.; Lin, S. Local relation networks for image recognition. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 3464–3473.
- Mohajelin, F.; Sheykhivand, S.; Shabani, A.; Danishvar, M.; Danishvar, S.; Lahijan, L.Z. Automatic recognition of multiple emotional classes from EEG signals through the use of graph theory and convolutional neural networks. Sensors 2024, 24, 5883.
- Ardabili, S.Z.; Bahmani, S.; Lahijan, L.Z.; Khaleghi, N.; Sheykhivand, S.; Danishvar, S. A novel approach for automatic detection of driver fatigue using EEG signals based on graph convolutional networks. Sensors 2024, 24, 364.
- Khoshkhabar, M.; Meshgini, S.; Afrouzian, R.; Danishvar, S. Automatic liver tumor segmentation from CT images using graph convolutional network. Sensors 2023, 23, 7561.
- Chen, J.; Lu, Y.; Yu, Q.; Luo, X.; Adeli, E.; Wang, Y.; Lu, L.; Yuille, A.L.; Zhou, Y. TransUNet: Transformers make strong encoders for medical image segmentation. arXiv 2021, arXiv:2102.04306.
- Hatamizadeh, A.; Nath, V.; Tang, Y.; Yang, D.; Roth, H.; Xu, D. Swin UNETR: Swin transformers for semantic segmentation of brain tumors in MRI images. arXiv 2022, arXiv:2201.01266.
- Zhou, H.-Y.; Guo, J.; Zhang, Y.; Yu, L.; Wang, L.; Yu, Y. nnFormer: Interleaved transformer for volumetric segmentation. arXiv 2021, arXiv:2109.03201.
- Shaker, A.M.; Maaz, M.; Rasheed, H.; Khan, S.; Yang, M.-H.; Khan, F.S. UNETR++: Delving into efficient and accurate 3D medical image segmentation. IEEE Trans. Med. Imaging 2024, 43, 3377–3390.
- Landman, B.; Xu, Z.; Iglesias, J.E.; Styner, M.; Langerak, T.R.; Klein, A. 2015 MICCAI multi-atlas labeling beyond the cranial vault workshop and challenge. In Proceedings of the MICCAI Multi-Atlas Labeling Beyond Cranial Vault—Workshop Challenge, Boston, MA, USA, 7–12 June 2015.
- Çiçek, Ö.; Abdulkadir, A.; Lienkamp, S.S.; Brox, T.; Ronneberger, O. 3D U-Net: Learning dense volumetric segmentation from sparse annotation. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Athens, Greece, 17–21 October 2016; pp. 424–432.
- Dou, Q.; Chen, H.; Jin, Y.; Yu, L.; Qin, J.; Heng, P.-A. 3D deeply supervised network for automatic liver segmentation from CT volumes. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Athens, Greece, 17–21 October 2016; pp. 149–157.
- Gibson, E.; Giganti, F.; Hu, Y.; Bonmati, E.; Bandula, S.; Gurusamy, K.; Davidson, B.; Pereira, S.P.; Clarkson, M.J.; Barratt, D.C. Automatic multi-organ segmentation on abdominal CT with dense V-networks. IEEE Trans. Med. Imaging 2018, 37, 1822–1834.
- Milletari, F.; Navab, N.; Ahmadi, S.-A. V-Net: Fully convolutional neural networks for volumetric medical image segmentation. In Proceedings of the 2016 4th International Conference on 3D Vision (3DV), Stanford, CA, USA, 25–28 October 2016; pp. 565–571.
- Roth, H.R.; Oda, H.; Hayashi, Y.; Oda, M.; Shimizu, N.; Fujiwara, M.; Misawa, K.; Mori, K. Hierarchical 3D fully convolutional networks for multi-organ segmentation. arXiv 2017, arXiv:1704.06382.
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2021, arXiv:2010.11929.
- Cao, H.; Wang, Y.; Chen, J.; Jiang, D.; Zhang, X.; Tian, Q.; Wang, M. Swin-Unet: U-Net-like pure transformer for medical image segmentation. arXiv 2021, arXiv:2105.05537.
- Karimi, D.; Vasylechko, S.D.; Gholipour, A. Convolution-free medical image segmentation using transformers. arXiv 2021, arXiv:2102.13645.
- Khan, A.; Rauf, Z.; Sohail, A.; Rehman, A.; Asif, H.M.; Asif, A.; Farooq, U. A survey of the vision transformers and its CNN-transformer based variants. arXiv 2023, arXiv:2305.09880.
- Valanarasu, J.M.J.; Oza, P.; Hacihaliloglu, I.; Patel, V.M. Medical transformer: Gated axial-attention for medical image segmentation. In Proceedings of the 24th International Conference on Medical Image Computing and Computer-Assisted Intervention, Strasbourg, France, 27 September–1 October 2021; pp. 36–46.
- Xie, Y.; Zhang, J.; Shen, C.; Xia, Y. CoTr: Efficiently bridging CNN and transformer for 3D medical image segmentation. In Proceedings of the 24th International Conference on Medical Image Computing and Computer-Assisted Intervention, Strasbourg, France, 27 September–1 October 2021; pp. 171–180.
- Wang, W.; Chen, C.; Ding, M.; Yu, H.; Zha, S.; Li, J. TransBTS: Multimodal brain tumor segmentation using transformer. In Proceedings of the 24th International Conference on Medical Image Computing and Computer-Assisted Intervention, Strasbourg, France, 27 September–1 October 2021; pp. 109–119.
- Zhou, Y.; Kong, Q.; Zhu, Y.; Su, Z. MCFA-UNet: Multiscale cascaded feature attention U-Net for liver segmentation. IRBM 2023, 44, 100789.
- Guo, X.; Schwartz, L.H.; Zhao, B. Automatic liver segmentation by integrating fully convolutional networks into active contour models. Med. Phys. 2019, 46, 4455–4469.
- Song, L.; Wang, H.; Wang, Z.J. Bridging the gap between 2D and 3D contexts in CT volume for liver and tumor segmentation. IEEE J. Biomed. Health Inform. 2021, 25, 3450–3459.
- Lei, T.; Wang, R.; Zhang, Y.; Wan, Y.; Liu, C.; Nandi, A.K. DefED-Net: Deformable encoder-decoder network for liver and liver tumor segmentation. IEEE Trans. Radiat. Plasma Med. Sci. 2022, 6, 68–78.
- Chen, Y.; Hu, F.; Wang, Y.; Zheng, C. Hybrid-attention densely connected U-Net with GAP for extracting livers from CT volumes. Med. Phys. 2022, 49, 1015–1033.
- Kushnure, D.T.; Talbar, S.N. HFRU-Net: High-level feature fusion and recalibration UNet for automatic liver and tumor segmentation in CT images. Comput. Methods Programs Biomed. 2022, 213, 106501.
- Li, R.; Xu, L.; Xie, K.; Song, J.; Ma, X.; Chang, L.; Yan, Q. DHT-Net: Dynamic hierarchical transformer network for liver and tumor segmentation. IEEE J. Biomed. Health Inform. 2023, 27, 3443–3454.
- Zhu, J.; Liu, Z.; Gao, W.; Fu, Y. CotepRes-Net: An efficient U-Net based deep learning method of liver segmentation from Computed Tomography images. Biomed. Signal Process. Control 2024, 88, 105660.
- Karimi, D.; Salcudean, S.E. Reducing the Hausdorff distance in medical image segmentation with convolutional neural networks. IEEE Trans. Med. Imaging 2020, 39, 499–513.
- Zhou, Z.; Siddiquee, M.M.R.; Tajbakhsh, N.; Liang, J. UNet++: A nested U-Net architecture for medical image segmentation. arXiv 2018, arXiv:1807.10165.
- Lee, C.-Y.; Xie, S.; Gallagher, P.; Zhang, Z.; Tu, Z. Deeply-supervised nets. arXiv 2014, arXiv:1409.5185.
- Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2017, arXiv:1412.6980.
- Chibuike, O.; Yang, X. Convolutional Neural Network–Vision Transformer Architecture with Gated Control Mechanism and Multi-Scale Fusion for Enhanced Pulmonary Disease Classification. Diagnostics 2024, 14, 2790.
- Fan, T.; Wang, G.; Li, Y.; Wang, H. MA-Net: A multi-scale attention network for liver and tumor segmentation. IEEE Access 2020, 8, 179656–179665.
- Sakboonyara, B.; Taeprasartsit, P. U-Net and mean-shift histogram for efficient liver segmentation from CT images. In Proceedings of the 11th International Conference on Knowledge and Smart Technology (KST), Phuket, Thailand, 23–26 January 2019; pp. 51–56.
- Seo, H.; Huang, C.; Bassenne, M.; Xiao, R.; Xing, L. Modified U-Net (mU-Net) with incorporation of object-dependent high level features for improved liver and liver-tumor segmentation in CT images. IEEE Trans. Med. Imaging 2020, 39, 1316–1325.
- Budak, Ü.; Guo, Y.; Tanyildizi, E.; Şengür, A. Cascaded deep convolutional encoder-decoder neural networks for efficient liver tumor segmentation. Med. Hypotheses 2020, 134, 109431.
Model | DSC | Parameters (M) | FLOPs (G) |
---|---|---|---|
TransUNet [17] | 0.9408 | 96.07 | 88.91 |
UNETR [12] | 0.9457 | 92.49 | 75.76 |
Swin-UNETR [18] | 0.9572 | 62.83 | 384.2 |
nnFormer [19] | 0.9684 | 150.5 | 213.4 |
UNETR++ [20] | 0.9642 | 42.96 | 47.98 |
Dataset | DSC | VOE | RAVD | ASSD | MSSD | Accuracy | Precision | Recall | F1-Score |
---|---|---|---|---|---|---|---|---|---|
LiTS | 0.9738 | 0.0509 | 0.0201 | 0.64 | 12.97 | 0.9986 | 0.9744 | 0.9736 | 0.9738 |
Sliver07 | 0.9732 | 0.0520 | 0.0141 | 0.76 | 18.34 | 0.9971 | 0.9789 | 0.9678 | 0.9733 |
3D-IRCADb | 0.9750 | 0.0488 | 0.0206 | 0.59 | 12.29 | 0.9975 | 0.9768 | 0.9735 | 0.9750 |
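The volumetric metrics reported above (DSC, VOE, RAVD) can be computed directly from overlap counts of binary masks; ASSD and MSSD additionally require surface-distance computations and are omitted from this sketch.

```python
import numpy as np

def volumetric_metrics(pred, gt):
    """DSC, VOE, and RAVD for binary segmentation masks.

    A sketch of the volumetric metrics in the tables; ASSD/MSSD need
    surface extraction and distance computation and are not covered.
    """
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    dsc = 2.0 * inter / (pred.sum() + gt.sum())        # Dice similarity coefficient
    voe = 1.0 - inter / union                          # volumetric overlap error
    ravd = abs(int(pred.sum()) - int(gt.sum())) / gt.sum()  # relative abs. volume diff.
    return dsc, voe, ravd

# Example: prediction covers 3 of the 4 ground-truth voxels and nothing else.
gt = np.zeros((4, 4), dtype=int); gt[0, :4] = 1
pred = np.zeros((4, 4), dtype=int); pred[0, :3] = 1
dsc, voe, ravd = volumetric_metrics(pred, gt)
print(round(dsc, 4), round(voe, 4), round(ravd, 4))  # 0.8571 0.25 0.25
```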
Model | DSC | VOE | RAVD | ASSD | MSSD |
---|---|---|---|---|---|
UNETR++ (Baseline) | 0.9726 | 0.0531 | 0.0186 | 0.7876 | 18.9311 |
+ G-Encoders (G-UNETR++) | 0.9737 | 0.0511 | 0.0201 | 0.6510 | 18.4870 |
Loss Function | DSC | VOE | RAVD | ASSD | MSSD |
---|---|---|---|---|---|
Dice loss + CE loss | 0.9731 | 0.0523 | 0.0189 | 0.7541 | 19.0292 |
The proposed loss | 0.9737 | 0.0511 | 0.0201 | 0.6510 | 18.4870 |
Model | DSC | VOE | RAVD | ASSD | MSSD |
---|---|---|---|---|---|
Without post-processing | 0.9737 | 0.0511 | 0.0201 | 0.6510 | 18.4870 |
With post-processing | 0.9738 | 0.0509 | 0.0201 | 0.6411 | 12.9686 |
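The comparison above shows that post-processing leaves DSC essentially unchanged but markedly reduces the MSSD. A common post-processing step with exactly this effect is keeping only the largest 3D connected component of the predicted mask, which removes small false-positive islands far from the liver; whether this matches the paper's exact procedure is an assumption here. A minimal pure-NumPy sketch:

```python
import numpy as np
from collections import deque

def largest_component(mask):
    """Keep only the largest 6-connected component of a 3D binary mask.

    A sketch of a typical segmentation post-processing step (an
    assumption; the paper's exact procedure is not restated here).
    """
    mask = mask.astype(bool)
    labels = np.zeros(mask.shape, dtype=int)
    sizes = {}
    current = 0
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    for start in np.argwhere(mask):
        start = tuple(start)
        if labels[start]:
            continue  # voxel already belongs to a labeled component
        current += 1
        labels[start] = current
        queue, size = deque([start]), 0
        while queue:  # breadth-first flood fill of one component
            z, y, x = queue.popleft()
            size += 1
            for dz, dy, dx in offsets:
                nz, ny, nx = z + dz, y + dy, x + dx
                if (0 <= nz < mask.shape[0] and 0 <= ny < mask.shape[1]
                        and 0 <= nx < mask.shape[2]
                        and mask[nz, ny, nx] and not labels[nz, ny, nx]):
                    labels[nz, ny, nx] = current
                    queue.append((nz, ny, nx))
        sizes[current] = size
    if not sizes:
        return mask  # empty mask: nothing to keep
    keep = max(sizes, key=sizes.get)
    return labels == keep

# Toy example: a 3x3x3 blob plus one spurious far-away voxel.
m = np.zeros((6, 6, 6), dtype=bool)
m[0:3, 0:3, 0:3] = True   # main component (27 voxels)
m[5, 5, 5] = True          # spurious island
cleaned = largest_component(m)
print(int(cleaned.sum()))  # 27
```

Dropping the lone voxel leaves the Dice score almost untouched but removes the distant surface point that would otherwise dominate the maximum surface distance, which is consistent with the MSSD drop in the table.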
Methods | DSC | VOE | RAVD | ASSD | MSSD |
---|---|---|---|---|---|
FCN + ACM [35] | 0.9430 | - | - | 2.30 | 34.70 |
MA-Net [47] | 0.9597 | 0.1000 | 0.0600 | - | - |
S-Net + T-Net + SEL + PRL [36] | 0.9680 | 0.0700 | 0.0150 | - | - |
DefED-Net [37] | 0.9630 | 0.0688 | 0.0146 | 1.37 | 37.60 |
HDU-Net [38] | 0.9650 | 0.0670 | 0.0090 | 1.22 | - |
HFRU-Net [39] | 0.9660 | 0.0550 | 0.0050 | 1.02 | 40.37 |
DHT-Net [40] | 0.9660 | 0.0670 | 0.0090 | 1.12 | - |
CotepRes-Net [41] | 0.9688 | 0.0578 | 0.0039 | 1.09 | 16.08 |
DRAUNet [8] | 0.9710 | 0.0548 | 0.0108 | 1.31 | - |
Proposed | 0.9738 | 0.0509 | 0.0201 | 0.64 | 12.97 |
Methods | DSC | VOE | RAVD | ASSD | MSSD |
---|---|---|---|---|---|
FCN + ACM [35] | 0.9620 | - | - | 1.40 | 23.80 |
U-Net [48] | 0.9500 | 0.0932 | 0.0108 | 2.85 | 59.15 |
DRAUNet [8] | 0.9685 | 0.0618 | 0.0105 | 1.18 | - |
Proposed | 0.9732 | 0.0520 | 0.0141 | 0.76 | 18.34 |
Methods | DSC | VOE | RAVD | ASSD | MSSD |
---|---|---|---|---|---|
mU-Net [49] | 0.9601 | 0.0973 | 0.0038 | 3.11 | 9.20 |
EDCNN [50] | 0.9520 | 0.0910 | 0.0700 | 1.43 | 19.37 |
S-Net + T-Net + SEL + PRL [36] | 0.9650 | 0.0960 | 0.0030 | 3.21 | - |
DefED-Net [37] | 0.9660 | 0.0565 | 0.0023 | 2.61 | - |
HDU-Net [38] | 0.9618 | 0.0721 | 0.0002 | 1.20 | - |
HFRU-Net [39] | 0.9720 | 0.0570 | 0.0110 | 1.15 | 36.24 |
RMAU-Net [7] | 0.9703 | 0.0517 | 0.0013 | - | - |
DRAUNet [8] | 0.9741 | 0.0540 | 0.0018 | 0.76 | - |
Proposed | 0.9750 | 0.0488 | 0.0206 | 0.59 | 12.29 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Lee, S.; Han, K.; Shin, H.; Park, H.; Kim, S.; Kim, J.; Yang, X.; Yang, J.D.; Yu, H.C.; You, H. G-UNETR++: A Gradient-Enhanced Network for Accurate and Robust Liver Segmentation from Computed Tomography Images. Appl. Sci. 2025, 15, 837. https://doi.org/10.3390/app15020837