Prompt-Guided Refinement: A Novel Technique for Improving Intervertebral Disc Semantic Labeling
Abstract
1. Introduction
2. Related Work
3. Method
3.1. Problem Formulation
3.2. Encoder–Decoder Backbone
3.3. Multi-Scale Prompt Conditioner (MSPC)
3.3.1. Skeleton Encoding
3.3.2. Prompt–Feature Interaction
3.4. Prompt-Guided Refinement Module (PGR)
3.5. Training Objective
3.5.1. Localization Loss
3.5.2. Prompt Consistency Loss
3.5.3. Skeleton Structural Loss
3.5.4. Total Objective
4. Experimental Results
4.1. Dataset
4.2. Metrics
4.3. Optimization and Implementation Details
4.4. Results
5. Ablation Study
5.1. Impact of Each Module
5.2. Hyper-Parameter Effect
Distributional Stability Analysis on T2-Weighted MRI
5.3. Statistical Significance of the Improvements
5.4. Limitations
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- [1] Azad, R.; Rouhier, L.; Cohen-Adad, J. Stacked Hourglass Network with a Multi-level Attention Mechanism: Where to Look for Intervertebral Disc Labeling. In Machine Learning in Medical Imaging, Proceedings of the 12th International Workshop, MLMI 2021, Strasbourg, France, 27 September 2021; Springer: Cham, Switzerland, 2021; pp. 406–415.
- [2] Urban, J.P.; Roberts, S. Degeneration of the intervertebral disc. Arthritis Res. Ther. 2003, 5, 120.
- [3] Chen, C.; Belavy, D.; Yu, W.; Chu, C.; Armbrecht, G.; Bansmann, M.; Felsenberg, D.; Zheng, G. Localization and segmentation of 3D intervertebral discs in MR images by data driven estimation. IEEE Trans. Med. Imaging 2015, 34, 1719–1729.
- [4] Glocker, B.; Feulner, J.; Criminisi, A.; Haynor, D.R.; Konukoglu, E. Automatic localization and identification of vertebrae in arbitrary field-of-view CT scans. In Medical Image Computing and Computer-Assisted Intervention, Proceedings of the 15th International Conference, Nice, France, 1–5 October 2012; Springer: Berlin/Heidelberg, Germany, 2012; pp. 590–598.
- [5] Ayed, I.B.; Punithakumar, K.; Garvin, G.; Romano, W.; Li, S. Graph cuts with invariant object-interaction priors: Application to intervertebral disc segmentation. In Information Processing in Medical Imaging, Proceedings of the 22nd International Conference, IPMI 2011, Kloster Irsee, Germany, 3–8 July 2011; Springer: Berlin/Heidelberg, Germany, 2011; pp. 221–232.
- [6] Cheng, Y.K.; Lin, C.L.; Huang, Y.C.; Chen, J.C.; Lan, T.P.; Lian, Z.Y.; Chuang, C.H. Automatic Segmentation of Specific Intervertebral Discs through a Two-Stage MultiResUNet Model. J. Clin. Med. 2021, 10, 4760.
- [7] Ji, X.; Zheng, G.; Belavy, D.; Ni, D. Automated intervertebral disc segmentation using deep convolutional neural networks. In Computational Methods and Clinical Applications for Spine Imaging, Proceedings of the 4th International Workshop and Challenge, CSI 2016, Athens, Greece, 17 October 2016; Springer: Cham, Switzerland, 2016; pp. 38–48.
- [8] Dolz, J.; Desrosiers, C.; Ayed, I.B. IVD-Net: Intervertebral disc localization and segmentation in MRI with a multi-modal UNet. In Computational Methods and Clinical Applications for Spine Imaging, Proceedings of the 5th International Workshop and Challenge, CSI 2018, Granada, Spain, 16 September 2018; Springer: Cham, Switzerland, 2018; pp. 130–143.
- [9] Mbarki, W.; Bouchouicha, M.; Frizzi, S.; Tshibasu, F.; Farhat, L.B.; Sayadi, M. Lumbar spine discs classification based on deep convolutional neural networks using axial view MRI. Interdiscip. Neurosurg. 2020, 22, 100837.
- [10] Vania, M.; Lee, D. Intervertebral disc instance segmentation using a multistage optimization mask-RCNN (MOM-RCNN). J. Comput. Des. Eng. 2021, 8, 1023–1036.
- [11] Wimmer, M.; Major, D.; Novikov, A.A.; Bühler, K. Fully automatic cross-modality localization and labeling of vertebral bodies and intervertebral discs in 3D spinal images. Int. J. Comput. Assist. Radiol. Surg. 2018, 13, 1591–1603.
- [12] Liu, L.; Wolterink, J.M.; Brune, C.; Veldhuis, R.N. Anatomy-aided deep learning for medical image segmentation: A review. Phys. Med. Biol. 2021, 66, 11TR01.
- [13] Rouhier, L.; Romero, F.P.; Cohen, J.P.; Cohen-Adad, J. Spine intervertebral disc labeling using a fully convolutional redundant counting model. arXiv 2020, arXiv:2003.04387.
- [14] Azad, R.; Heidari, M.; Cohen-Adad, J.; Adeli, E.; Merhof, D. Intervertebral Disc Labeling With Learning Shape Information, A Look Once Approach. In Predictive Intelligence in Medicine, Proceedings of the 5th International Workshop, PRIME 2022, Singapore, 22 September 2022; Springer: Cham, Switzerland, 2022.
- [15] Chen, C.; Xie, W.; Franke, J.; Grutzner, P.; Nolte, L.P.; Zheng, G. Automatic X-ray landmark detection and shape segmentation via data-driven joint estimation of image displacements. Med. Image Anal. 2014, 18, 487–499.
- [16] Ullmann, E.; Pelletier Paquette, J.F.; Thong, W.E.; Cohen-Adad, J. Automatic labeling of vertebral levels using a robust template-based approach. Int. J. Biomed. Imaging 2014, 2014, 719520.
- [17] Azad, R.; Kazerouni, A.; Azad, B.; Khodapanah Aghdam, E.; Velichko, Y.; Bagci, U.; Merhof, D. Laplacian-former: Overcoming the limitations of vision transformers in local texture detection. In Medical Image Computing and Computer-Assisted Intervention, Proceedings of the 26th International Conference, Vancouver, BC, Canada, 8–12 October 2023; Springer: Cham, Switzerland, 2023; pp. 736–746.
- [18] Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125.
- [19] Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017; Volume 30.
- [20] Senthilkumaran, N.; Vaithegi, S. Image segmentation by using thresholding techniques for medical images. Comput. Sci. Eng. Int. J. 2016, 6, 1–13.
- [21] Zhao, Y.; Gui, W.; Chen, Z.; Tang, J.; Li, L. Medical images edge detection based on mathematical morphology. In Proceedings of the 2005 IEEE Engineering in Medicine and Biology 27th Annual Conference, Shanghai, China, 17–18 January 2006; pp. 6492–6495.
- [22] Yang, Y.; Wang, M.; Ma, L.; Zhang, X.; Zhang, K.; Zhao, X.; Teng, Q.; Liu, H. Cervical Intervertebral Disc Segmentation Based on Multi-Scale Information Fusion and Its Application. Electronics 2024, 13, 432.
- [23] Chen, L.C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. Semantic image segmentation with deep convolutional nets and fully connected crfs. arXiv 2014, arXiv:1412.7062.
- [24] Yu, F.; Koltun, V. Multi-scale context aggregation by dilated convolutions. arXiv 2015, arXiv:1511.07122.
- [25] Badrinarayanan, V.; Kendall, A.; Cipolla, R. Segnet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495.
- [26] Zhang, H.; Dana, K.; Shi, J.; Zhang, Z.; Wang, X.; Tyagi, A.; Agrawal, A. Context encoding for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7151–7160.
- [27] Alryalat, S.A.; Al-Antary, M.; Arafa, Y.; Azad, B.; Boldyreff, C.; Ghnaimat, T.; Al-Antary, N.; Alfegi, S.; Elfalah, M.; Abu-Ameerh, M. Deep learning prediction of response to anti-VEGF among diabetic macular edema patients: Treatment response analyzer system (TRAS). Diagnostics 2022, 12, 312.
- [28] Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3431–3440.
- [29] Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015, Proceedings of the 18th International Conference, Munich, Germany, 5–9 October 2015; Proceedings, Part III 18; Springer: Cham, Switzerland, 2015; pp. 234–241.
- [30] He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- [31] Zhao, H.; Shi, J.; Qi, X.; Wang, X.; Jia, J. Pyramid scene parsing network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2881–2890.
- [32] Yu, C.; Wang, J.; Peng, C.; Gao, C.; Yu, G.; Sang, N. Bisenet: Bilateral segmentation network for real-time semantic segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 325–341.
- [33] Chen, L.C.; Papandreou, G.; Schroff, F.; Adam, H. Rethinking atrous convolution for semantic image segmentation. arXiv 2017, arXiv:1706.05587.
- [34] Chollet, F. Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1251–1258.
- [35] Wang, J.; Sun, K.; Cheng, T.; Jiang, B.; Deng, C.; Zhao, Y.; Liu, D.; Mu, Y.; Tan, M.; Wang, X.; et al. Deep high-resolution representation learning for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 3349–3364.
- [36] Hou, C.; Zhang, W.; Wang, H.; Liu, F.; Liu, D.; Chang, J. A semantic segmentation model for lumbar MRI images using divergence loss. Appl. Intell. 2023, 53, 12063–12076.
- [37] Altun, İ.; Altun, S.; Alkan, A. LSS-UNET: Lumbar spinal stenosis semantic segmentation using deep learning. Multimed. Tools Appl. 2023, 82, 41287–41305.
- [38] Li, Z.; Zhou, X.; Tong, T. A Two-Stage Network for Segmentation of Vertebrae and Intervertebral Discs: Integration of Efficient Local-Global Fusion Using 3D Transformer and 2D CNN. In Neural Information Processing, Proceedings of the 30th International Conference, ICONIP 2023, Changsha, China, 20–23 November 2023; Springer: Singapore, 2023; pp. 467–479.
- [39] Satpute, S.; Manza, R.; Manza, G.; Shaikh, A. Localization of Intervertebral Discs Using Deep-Learning and Region Growing Technique. In Advances in Intelligent Systems Research, Proceedings of the First International Conference on Advances in Computer Vision and Artificial Intelligence Technologies (ACVAIT 2022), Aurangabad, India, 1–2 August 2022; Atlantis Press: Dordrecht, The Netherlands, 2023; pp. 88–98.
- [40] Sáenz-Gamboa, J.J.; Domenech, J.; Alonso-Manjarrés, A.; Gómez, J.A.; de la Iglesia-Vayá, M. Automatic semantic segmentation of the lumbar spine: Clinical applicability in a multi-parametric and multi-center study on magnetic resonance images. Artif. Intell. Med. 2023, 140, 102559.
- [41] Xu, Y.; Su, H.; Ma, G.; Liu, X. A novel dual-modal emotion recognition algorithm with fusing hybrid features of audio signal and speech context. Complex Intell. Syst. 2023, 9, 951–963.
- [42] Zhou, X.; Qi, W.; Ovur, S.E.; Zhang, L.; Hu, Y.; Su, H.; Ferrigno, G.; De Momi, E. A novel muscle-computer interface for hand gesture recognition using depth vision. J. Ambient. Intell. Humaniz. Comput. 2020, 11, 5569–5580.
- [43] Cohen-Adad, J.; Alonso-Ortiz, E.; Abramovic, M.; Arneitz, C.; Atcheson, N.; Barlow, L.; Barry, R.L.; Barth, M.; Battiston, M.; Büchel, C.; et al. Open-access quantitative MRI data of the spinal cord and reproducibility across participants, sites and manufacturers. Sci. Data 2021, 8, 219.
- [44] Bozorgpour, A.; Azad, B.; Azad, R.; Velichko, Y.; Bagci, U.; Merhof, D. HCA-Net: Hierarchical context attention network for intervertebral disc semantic labeling. In Proceedings of the 2024 IEEE International Symposium on Biomedical Imaging (ISBI), Athens, Greece, 27–30 May 2024; pp. 1–5.
- [45] Silvoster, L.; Kumar, R.M.S. Graph cut-based segmentation for intervertebral disc in human MRI. Comput. Methods Biomech. Biomed. Eng. Imaging Vis. 2025, 13, 2475992.
- [46] Cao, H.; Wang, Y.; Chen, J.; Jiang, D.; Zhang, X.; Tian, Q.; Wang, M. Swin-unet: Unet-like pure transformer for medical image segmentation. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; pp. 205–218.




| Method | T1 DTT (mm) | T1 FNR (%) | T1 FPR (%) | T2 DTT (mm) | T2 FNR (%) | T2 FPR (%) |
|---|---|---|---|---|---|---|
| Template Matching [16] | 1.97 (±4.08) | 8.1 | 2.53 | 2.05 (±3.21) | 11.1 | 2.11 |
| Countception [13] | 1.03 (±2.81) | 4.24 | 0.9 | 1.78 (±2.64) | 3.88 | 1.5 |
| Pose Estimation [1] | 1.32 (±1.33) | 0.32 | 0.0 | 1.31 (±2.79) | 1.2 | 0.6 |
| Look Once [14] | 1.20 (±1.90) | 0.7 | 0.0 | 1.28 (±2.61) | 0.9 | 0.0 |
| Graph-method [45] | 1.84 (±1.31) | 7.8 | 2.1 | 2.07 (±2.95) | 12.01 | 2.7 |
| Swin-Net [46] | 1.44 (±1.22) | 1.3 | 0.4 | 1.86 (±3.10) | 4.61 | 1.8 |
| HCA-Net [44] | 1.19 (±1.08) | 0.3 | 0.0 | 1.26 (±2.16) | 0.61 | 0.0 |
| Proposed PGR-Net (Ours) | 1.17 (±1.05) | 0.27 | 0.0 | 1.24 (±2.09) | 0.58 | 0.0 |
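
As a reading aid for the comparison table above, the following minimal Python sketch works out how much lower the DTT and FNR values of PGR-Net are relative to HCA-Net, the closest competing row. The numbers are copied from the table; the names `results`, `relative_reduction`, and `summarize` are hypothetical helpers introduced only for this illustration and are not part of the authors' code.

```python
# Values copied from the comparison table: (DTT in mm, FNR in %) per MRI contrast.
results = {
    "HCA-Net": {"T1": (1.19, 0.30), "T2": (1.26, 0.61)},
    "PGR-Net": {"T1": (1.17, 0.27), "T2": (1.24, 0.58)},
}


def relative_reduction(baseline: float, proposed: float) -> float:
    """Percentage by which `proposed` is lower than `baseline`."""
    return 100.0 * (baseline - proposed) / baseline


def summarize(table: dict) -> dict:
    """Relative DTT and FNR reductions of PGR-Net over HCA-Net, per contrast."""
    summary = {}
    for contrast in ("T1", "T2"):
        base_dtt, base_fnr = table["HCA-Net"][contrast]
        ours_dtt, ours_fnr = table["PGR-Net"][contrast]
        summary[contrast] = {
            "dtt_reduction_pct": round(relative_reduction(base_dtt, ours_dtt), 2),
            "fnr_reduction_pct": round(relative_reduction(base_fnr, ours_fnr), 2),
        }
    return summary


summary = summarize(results)
for contrast, row in summary.items():
    print(contrast, row)
```

Under these table values the DTT reductions come out below 2% on both contrasts, while the FNR reductions are larger in relative terms (roughly 10% on T1 and 5% on T2). The sketch only restates the reported means and says nothing about statistical significance.
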

| Configuration | T1 DTT (mm) | T1 FNR (%) | T1 FPR (%) | T2 DTT (mm) | T2 FNR (%) | T2 FPR (%) |
|---|---|---|---|---|---|---|
| Baseline (Hourglass) | 1.45 | 7.3 | 1.2 | 1.80 | 5.4 | 1.8 |
| + PGR only | 1.39 | 2.8 | 0.8 | 1.42 | 2.4 | 0.9 |
| + MSPC only | 1.31 | 1.9 | 0.6 | 1.34 | 1.6 | 0.3 |
| + MSPC + | 1.26 | 1.1 | 0.4 | 1.30 | 0.9 | 0.2 |
| + MSPC + PGR | 1.23 | 0.9 | 0.3 | 1.28 | 0.8 | 0.2 |
| + PGR + | 1.25 | 0.8 | 0.3 | 1.31 | 0.7 | 0.2 |
| + MSPC + PGR + (Full Model) | 1.17 | 0.27 | 0.0 | 1.24 | 0.58 | 0.0 |

| Prompt weight | Skeleton weight | Configuration | DTT (mm) | FNR (%) |
|---|---|---|---|---|
| 0.2 | 0.5 | Weak prompt conditioning | 1.29 | 0.69 |
| 0.4 | 0.3 | Under-regularized | 1.27 | 0.67 |
| 0.4 | 0.5 | Balanced configuration | 1.24 | 0.61 |
| 0.6 | 0.5 | Strong prompt constraint | 1.26 | 0.63 |
| 0.4 | 0.7 | Over-regularized skeleton | 1.27 | 0.65 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Alharbi, M.N.; Alahmadi, M.D. Prompt-Guided Refinement: A Novel Technique for Improving Intervertebral Disc Semantic Labeling. Mathematics 2025, 13, 3944. https://doi.org/10.3390/math13243944

