Colorectal Polyp Segmentation Based on Deep Learning Methods: A Systematic Review
Abstract
1. Introduction
1.1. Research Progress on Polyp Segmentation
1.2. Relevant Review Survey
1.3. Our Contribution
- (1) Classify polyp segmentation methods into five categories based on their network architectures: CNN, Transformer, Hybrid, Mamba, and Other.
- (2) Conduct a systematic study of polyp segmentation methods, summarizing 87 papers and categorizing the specific problems addressed by each method into five categories.
- (3) Provide a comprehensive compilation of the datasets required for image and video polyp segmentation.
- (4) Evaluate the performance of methods addressing different problems by comparing and analyzing 44 models.
- (5) Discuss existing limitations and future research trends.
2. Literature Search and Review Procedures
3. Review About Polyp
4. Polyp Segmentation Model
4.1. CNN-Based Methods
4.1.1. Polyp Diversity and Boundary Factors
1. Problems of Difficult Boundary Segmentation
2. Issues on Diversity in the Morphology, Size, Brightness, and Color of Polyps
4.1.2. Limited Polyp Dataset
4.1.3. Generalization of Training Models and Domain Adaptation
4.1.4. Real-Time Performance of the Model
4.2. Transformer-Based Methods
4.2.1. Polyp Diversity and Boundary Factors
1. Issues on Diversity in the Morphology, Size, Brightness, and Color of Polyps
2. Problems of Difficult Boundary Segmentation
4.2.2. Limited Polyp Dataset
4.2.3. Generalization of Training Models and Domain Adaptation
4.2.4. Real-Time Performance of the Model
4.2.5. Limitations of Architecture
4.3. Hybrid Architecture Methods
4.3.1. Polyp Diversity and Boundary Factors
1. Issues on Diversity in the Morphology, Size, Brightness, and Color of Polyps
2. Problems of Difficult Boundary Segmentation
4.3.2. Limited Polyp Dataset
4.3.3. Generalization of Training Models and Domain Adaptation
4.3.4. Real-Time Performance of the Model
4.4. Mamba-Based Methods
4.4.1. Polyp Diversity and Boundary Factors
1. Issues on Diversity in the Morphology, Size, Brightness, and Color of Polyps
2. Problems of Difficult Boundary Segmentation
4.4.2. Limitations of Architecture
4.5. Other Architectural Methods
4.5.1. Polyp Diversity and Boundary Factors
1. Issues on Diversity in the Morphology, Size, Brightness, and Color of Polyps
2. Problems of Difficult Boundary Segmentation
4.5.2. Limited Polyp Dataset
4.5.3. Real-Time Performance of the Model
4.6. Video Polyp Segmentation
4.7. Remarks
5. Polyp Segmentation Loss Function
1. Weighted Binary Cross-Entropy Loss
2. Dice Loss
3. Focal Loss
4. Tversky Loss
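As a concrete reference for the four losses named above, the following is a minimal NumPy sketch for a single binary mask. The hyperparameters (`pos_weight`, `alpha`, `gamma`, and the Tversky `alpha`/`beta` trade-off) are illustrative defaults, not values prescribed by the surveyed papers, which typically compute these losses on batched tensors inside a deep learning framework.

```python
import numpy as np

def weighted_bce_loss(pred, target, pos_weight=2.0, eps=1e-7):
    """Binary cross-entropy with an extra weight on the (rarer) polyp pixels."""
    pred = np.clip(pred, eps, 1.0 - eps)
    return -np.mean(pos_weight * target * np.log(pred)
                    + (1.0 - target) * np.log(1.0 - pred))

def dice_loss(pred, target, eps=1e-6):
    """1 - Dice coefficient; penalizes low overlap between prediction and mask."""
    inter = (pred * target).sum()
    return 1.0 - (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def focal_loss(pred, target, alpha=0.25, gamma=2.0, eps=1e-7):
    """Down-weights easy pixels via the (1 - p_t)^gamma modulating factor."""
    pred = np.clip(pred, eps, 1.0 - eps)
    pt = np.where(target == 1, pred, 1.0 - pred)
    alpha_t = np.where(target == 1, alpha, 1.0 - alpha)
    return -np.mean(alpha_t * (1.0 - pt) ** gamma * np.log(pt))

def tversky_loss(pred, target, alpha=0.7, beta=0.3, eps=1e-6):
    """Generalizes Dice: alpha weights false negatives, beta false positives."""
    tp = (pred * target).sum()
    fn = ((1.0 - pred) * target).sum()
    fp = (pred * (1.0 - target)).sum()
    return 1.0 - (tp + eps) / (tp + alpha * fn + beta * fp + eps)
```

With `alpha = beta = 0.5`, the Tversky loss reduces to the Dice loss; setting `alpha > beta`, as above, pushes the model toward fewer missed polyp pixels at the cost of more false positives.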
6. Polyp Segmentation Datasets
6.1. Datasets of Polyp Image
| Category | Dataset | Year | Number of Images | Number of Video Sequences | Resolution |
|---|---|---|---|---|---|
| Polyp image datasets | CVC-ColonDB [44] | 2012 | 300 | 13 | 574 × 500 |
| | ETIS-Larib [139] | 2014 | 196 | 34 | 1225 × 966 |
| | CVC-ClinicDB [140] | 2015 | 612 | 31 | 384 × 288 |
| | CVC-EndoSceneStill [141] | 2017 | 912 | 44 | 384 × 288 to 574 × 500 |
| | Kvasir-SEG [142] | 2020 | 1000 | N/A | 332 × 487 to 1920 × 1072 |
| | PICCOLO [144] | 2020 | 3433 | 40 | 854 × 480 to 1920 × 1080 |
| | Hyper-Kvasir [143] | 2020 | 110,079 | 374 | 332 × 487 to 1920 × 1072 |
| | BKAI-IGH [145] | 2021 | 1200 | N/A | 1280 × 959 |
| | PolypGen [146] | 2023 | 8037 | N/A | 384 × 288 to 1920 × 1080 |
| Polyp video datasets | ASU-Mayo [61] | 2016 | 36,458 | 38 | 688 × 550 |
| | LDPolypVideo [147] | 2021 | 40,266 | 160 | 560 × 480 |
| | Kvasir-Capsule [148] | 2021 | 4,741,504 | 117 | 336 × 336 |
| | SUN-SEG | 2022 | 158,690 | 1013 | 1158 × 1008 to 1240 × 1080 |
6.2. Datasets of Polyp Video
7. Polyp Segmentation Evaluation Metrics
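The core overlap metrics defined in the abbreviations (DSC/mDice and IoU/mIoU) can be computed for a single binary mask as below; this is a minimal NumPy sketch, with mDice and mIoU obtained by averaging these per-image scores over a test set.

```python
import numpy as np

def dice_coefficient(pred, gt, eps=1e-6):
    """DSC = 2|P ∩ G| / (|P| + |G|) for binary masks pred and gt."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return (2.0 * inter + eps) / (pred.sum() + gt.sum() + eps)

def iou(pred, gt, eps=1e-6):
    """IoU = |P ∩ G| / |P ∪ G| for binary masks pred and gt."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return (inter + eps) / (union + eps)
```

Note that Dice ≥ IoU always holds (Dice = 2·IoU / (1 + IoU)), so the two metrics rank models similarly but are not interchangeable as absolute numbers.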
8. Polyp Segmentation Model Performance Evaluation
9. Open Research Challenges
9.1. Lack of Large-Scale Datasets
9.2. Image Quality and Noise
9.3. Complex Shapes and Boundaries
9.4. Insufficient Generalization Ability
9.5. High Real-Time Requirements
9.6. Lack of Interpretability
9.7. Ethics and Privacy
10. Future Research Directions
10.1. Develop Larger-Scale and More Diverse Datasets
10.2. Exploring and Improving Deep Learning Architectures
10.3. Weakly Supervised Learning
10.4. Combine with Prior Knowledge
10.5. Domain Adaptation
10.6. Develop Lightweight Models
10.7. Interpretability
10.8. Federated Learning
11. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
AP | Average Precision |
BSCA | Bit-Slice Context Attention |
BWM | Balanced Weight Module |
CMAM | Convolutional Multi-Scale Attention Module |
CNN | Convolutional Neural Network |
CRC | Colorectal Cancer |
DCL | Domain Contrastive Learning |
DCT | Discrete Cosine Transform |
DDPM | Denoising Diffusion Probabilistic Model |
DLCA | Dataset-Level Color Augmentation |
DPC | Dual-Path Complementary |
DSC | Dice Similarity Coefficient |
FCNs | Fully Convolutional Networks |
FGW | Flow-Guided Warping |
FoBS | Focus on Boundary Segmentation |
FPS | Frames Per Second |
FT | Fourier Transform |
GFlops | Giga Floating Point Operations Per Second |
IFU | Iterative Feedback Unit |
IoU | Intersection over Union |
MACE | Multi-Path Attention Encoder |
MAST | Mixture-Attention Siamese Transformer |
MCAD | Multi-Path Attention Decoder |
mDice | mean Dice coefficient |
mIoU | mean Intersection over Union |
MLP | Multi-Layer Perceptron |
PAM | Parallel Attention Module |
PVTv2 | Pyramid Vision Transformer v2 |
RFS | Reference Frame Selection |
RSA | Region Self-Attention |
SAM | Segment Anything Model |
SAM2 | Segment Anything Model 2 |
SSBU | Segmentation-Squeeze-Bottleneck Union |
SSFM | Selective Shared Fusion Module |
SSM | State Space Model |
TRM | Temporal Reasoning Module |
ViT | Vision Transformer |
VSS | Visual State Space |
WoS | Web of Science |
References
- Siegel, R.L.; Miller, K.D.; Wagle, N.S.; Jemal, A. Cancer statistics, 2023. CA Cancer J. Clin. 2023, 73, 17–48. [Google Scholar] [CrossRef]
- Haggar, F.A.; Boushey, R.P. Colorectal Cancer Epidemiology: Incidence, Mortality, Survival, and Risk Factors. Clin. Colon Rectal Surg. 2009, 22, 191–197. [Google Scholar] [CrossRef]
- Lin, L.; Lv, G.; Wang, B.; Xu, C.; Liu, J. Polyp-LVT: Polyp segmentation with lightweight vision transformers. Knowl.-Based Syst. 2024, 300, 112181. [Google Scholar] [CrossRef]
- Mamonov, A.V.; Figueiredo, I.N.; Figueiredo, P.N.; Richard Tsai, Y.-H. Automated Polyp Detection in Colon Capsule Endoscopy. IEEE Trans. Med. Imaging 2014, 33, 1488–1502. [Google Scholar] [CrossRef]
- Gross, S.; Kennel, M.; Stehle, T.; Wulff, J.; Tischendorf, J.; Trautwein, C.; Aach, T. Polyp Segmentation in NBI Colonoscopy. In Proceedings of the Bildverarbeitung für die Medizin 2009, Heidelberg, Germany, 21–24 March 2009; Meinzer, H.-P., Deserno, T.M., Handels, H., Tolxdorff, T., Eds.; Springer: Berlin, Heidelberg, 2009; pp. 252–256. [Google Scholar]
- Jerebko, A.K.; Teerlink, S.; Franaszek, M.; Summers, R.M. Polyp segmentation method for CT colonography computer-aided detection. In Proceedings of the Medical Imaging 2003, San Diego, CA, USA, 12–20 February 2003; Clough, A.V., Amini, A.A., Eds.; SPIE: Bellingham, WA, USA, 2003; p. 359. [Google Scholar]
- Yao, J.; Miller, M.; Franaszek, M.; Summers, R.M. Colonic polyp segmentation in CT colonography-based on fuzzy clustering and deformable models. IEEE Trans. Med. Imaging 2004, 23, 1344–1352. [Google Scholar] [CrossRef]
- Lu, L.; Barbu, A.; Wolf, M.; Liang, J.; Salganicoff, M.; Comaniciu, D. Accurate polyp segmentation for 3D CT colongraphy using multi-staged probabilistic binary learning and compositional model. In Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA, 23–28 June 2008; pp. 1–8. [Google Scholar]
- Bernal, J.; Núñez, J.M.; Sánchez, F.J.; Vilariño, F. Polyp Segmentation Method in Colonoscopy Videos by Means of MSA-DOVA Energy Maps Calculation. In Clinical Image-Based Procedures. Translational Research in Medical Imaging; Linguraru, M.G., Oyarzun Laura, C., Shekhar, R., Wesarg, S., González Ballester, M.Á., Drechsler, K., Sato, Y., Erdt, M., Eds.; Lecture Notes in Computer Science; Springer International Publishing: Cham, Switzerland, 2014; Volume 8680, pp. 41–49. ISBN 978-3-319-13908-1. [Google Scholar]
- Hwang, S.; Celebi, M.E. Polyp detection in Wireless Capsule Endoscopy videos based on image segmentation and geometric feature. In Proceedings of the 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, Dallas, TX, USA, 14–19 March 2010; pp. 678–681. [Google Scholar]
- Ganz, M.; Yang, X.; Slabaugh, G. Automatic Segmentation of Polyps in Colonoscopic Narrow-Band Imaging Data. IEEE Trans. Biomed. Eng. 2012, 59, 2144–2151. [Google Scholar] [CrossRef]
- Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 3431–3440. [Google Scholar]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention Is All You Need. arXiv 2023, arXiv:1706.03762. [Google Scholar] [CrossRef]
- Gu, A.; Dao, T. Mamba: Linear-Time Sequence Modeling with Selective State Spaces. arXiv 2024, arXiv:2312.00752. [Google Scholar] [CrossRef]
- Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015, Munich, Germany, 5–9 October 2015; Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F., Eds.; Springer International Publishing: Cham, Switzerland, 2015; pp. 234–241. [Google Scholar]
- Manjunath, K.N.; Siddalingaswamy, P.C.; Gopalakrishna Prabhu, K. An Improved Method of Colon Segmentation in Computed Tomography Colonography Images Using Domain Knowledge. J. Med. Imaging Health Inform. 2016, 6, 916–924. [Google Scholar] [CrossRef]
- Brandao, P.; Mazomenos, E.; Ciuti, G.; Caliò, R.; Bianchi, F.; Menciassi, A.; Dario, P.; Koulaouzidis, A.; Arezzo, A.; Stoyanov, D. Fully convolutional neural networks for polyp segmentation in colonoscopy. In Proceedings of the Medical Imaging 2017: Computer-Aided Diagnosis, Orlando, FL, USA, 13–16 February 2017; SPIE: Bellingham, WA, USA, 2017; pp. 101–107. [Google Scholar]
- Zhou, Z.; Rahman Siddiquee, M.M.; Tajbakhsh, N.; Liang, J. Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support. In Proceedings of the Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, Granada, Spain, 20 September 2018; Stoyanov, D., Taylor, Z., Carneiro, G., Syeda-Mahmood, T., Martel, A., Maier-Hein, L., Tavares, J.M.R.S., Bradley, A., Papa, J.P., Belagiannis, V., et al., Eds.; Springer International Publishing: Cham, Switzerland, 2018; pp. 3–11. [Google Scholar]
- Jha, D.; Smedsrud, P.H.; Riegler, M.A.; Johansen, D.; Lange, T.D.; Halvorsen, P.; Johansen, H.D. ResUNet++: An Advanced Architecture for Medical Image Segmentation. In Proceedings of the 2019 IEEE International Symposium on Multimedia (ISM), San Diego, CA, USA, 9–11 December 2019; pp. 225–230. [Google Scholar]
- Fan, D.-P.; Ji, G.-P.; Zhou, T.; Chen, G.; Fu, H.; Shen, J.; Shao, L. PraNet: Parallel Reverse Attention Network for Polyp Segmentation. arXiv 2020, arXiv:2006.11392. [Google Scholar] [CrossRef]
- Zhang, Y.; Liu, H.; Hu, Q. TransFuse: Fusing Transformers and CNNs for Medical Image Segmentation. In Proceedings of the Medical Image Computing and Computer Assisted Intervention—MICCAI 2021, Strasbourg, France, 27 September–1 October 2021; de Bruijne, M., Cattin, P.C., Cotin, S., Padoy, N., Speidel, S., Zheng, Y., Essert, C., Eds.; Springer International Publishing: Cham, Switzerland, 2021; pp. 14–24. [Google Scholar]
- Ji, G.-P.; Xiao, G.; Chou, Y.-C.; Fan, D.-P.; Zhao, K.; Chen, G.; Gool, L.V. Video Polyp Segmentation: A Deep Learning Perspective. Mach. Intell. Res. 2022, 19, 531–549. [Google Scholar] [CrossRef]
- Li, H.; Zhai, D.-H.; Xia, Y. ERDUnet: An Efficient Residual Double-coding Unet for Medical Image Segmentation. IEEE Trans. Circuits Syst. Video Technol. 2023, 34, 2093–2096. [Google Scholar] [CrossRef]
- Cao, J.; Wang, X.; Qu, Z.; Zhuo, L.; Li, X.; Zhang, H.; Yang, Y.; Wei, W. WDFF-Net: Weighted Dual-Branch Feature Fusion Network for Polyp Segmentation with Object-Aware Attention Mechanism. IEEE J. Biomed. Health Inform. 2024, 28, 4118–4131. [Google Scholar] [CrossRef] [PubMed]
- Li, S.; Sui, X.; Fu, J.; Fu, H.; Luo, X.; Feng, Y.; Xu, X.; Liu, Y.; Ting, D.S.W.; Goh, R.S.M. Few-Shot Domain Adaptation with Polymorphic Transformers. In Proceedings of the Medical Image Computing and Computer Assisted Intervention—MICCAI 2021, Strasbourg, France, 27 September–1 October 2021; de Bruijne, M., Cattin, P.C., Cotin, S., Padoy, N., Speidel, S., Zheng, Y., Essert, C., Eds.; Springer International Publishing: Cham, Switzerland, 2021; pp. 330–340. [Google Scholar]
- Wang, J.; Huang, Q.; Tang, F.; Meng, J.; Su, J.; Song, S. Stepwise Feature Fusion: Local Guides Global. In Proceedings of the Medical Image Computing and Computer Assisted Intervention—MICCAI 2022, Singapore, 18–22 September 2022; pp. 110–120. [Google Scholar]
- Rahman, M.M.; Marculescu, R. Medical Image Segmentation via Cascaded Attention Decoding. In Proceedings of the 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 2–7 January 2023; pp. 6211–6220. [Google Scholar]
- Xiao, B.; Hu, J.; Li, W.; Pun, C.-M.; Bi, X. CTNet: Contrastive Transformer Network for Polyp Segmentation. IEEE Trans. Cybern. 2024, 54, 5040–5053. [Google Scholar] [CrossRef]
- Xu, Z.; Tang, F.; Chen, Z.; Zhou, Z.; Wu, W.; Yang, Y.; Liang, Y.; Jiang, J.; Cai, X.; Su, J. Polyp-Mamba: Polyp Segmentation with Visual Mamba. In Proceedings of the Medical Image Computing and Computer Assisted Intervention—MICCAI 2024, Marrakesh, Morocco, 6–10 October 2024; Linguraru, M.G., Dou, Q., Feragen, A., Giannarou, S., Glocker, B., Lekadir, K., Schnabel, J.A., Eds.; Lecture Notes in Computer Science. Springer Nature: Cham, Switzerland, 2024; Volume 15008, pp. 510–521, ISBN 978-3-031-72110-6. [Google Scholar]
- Sánchez-Peralta, L.F.; Bote-Curiel, L.; Picón, A.; Sánchez-Margallo, F.M.; Pagador, J.B. Deep learning to find colorectal polyps in colonoscopy: A systematic literature review. Artif. Intell. Med. 2020, 108, 101923. [Google Scholar] [CrossRef]
- Mei, J.; Zhou, T.; Huang, K.; Zhang, Y.; Zhou, Y.; Wu, Y.; Fu, H. A survey on deep learning for polyp segmentation: Techniques, challenges and future trends. Vis. Intell. 2025, 3, 1. [Google Scholar] [CrossRef]
- Wu, Z.; Lv, F.; Chen, C.; Hao, A.; Li, S. Colorectal Polyp Segmentation in the Deep Learning Era: A Comprehensive Survey. arXiv 2024. [Google Scholar] [CrossRef]
- Gupta, M.; Mishra, A. A systematic review of deep learning based image segmentation to detect polyp. Artif. Intell. Rev. 2024, 57, 7. [Google Scholar] [CrossRef]
- Xia, Q.; Zheng, H.; Zou, H.; Luo, D.; Tang, H.; Li, L.; Jiang, B. A comprehensive review of deep learning for medical image segmentation. Neurocomputing 2025, 613, 128740. [Google Scholar] [CrossRef]
- Li, S.; Ren, Y.; Yu, Y.; Jiang, Q.; He, X.; Li, H. A survey of deep learning algorithms for colorectal polyp segmentation. Neurocomputing 2025, 614, 128767. [Google Scholar] [CrossRef]
- Nemani, P.; Vadali, V.S.S.; Medi, P.R.; Marisetty, A.; Vollala, S.; Kumar, S. Cross-modal hybrid architectures for gastrointestinal tract image analysis: A systematic review and futuristic applications. Image Vis. Comput. 2024, 148, 105068. [Google Scholar] [CrossRef]
- Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 2021, 372, n71. [Google Scholar] [CrossRef]
- Pickhardt, P.J.; Kim, D.H.; Pooler, B.D.; Hinshaw, J.L.; Barlow, D.; Jensen, D.; Reichelderfer, M.; Cash, B.D. Assessment of volumetric growth rates of small colorectal polyps with CT colonography: A longitudinal study of natural history. Lancet Oncol. 2013, 14, 711–720. [Google Scholar] [CrossRef]
- Kudo, S.; Hirota, S.; Nakajima, T.; Hosobe, S.; Kusaka, H.; Kobayashi, T.; Himori, M.; Yagyuu, A. Colorectal tumours and pit pattern. J. Clin. Pathol. 1994, 47, 880–885. [Google Scholar] [CrossRef] [PubMed]
- Putniković, D.; Jevtić, J.; Ristić, N.; Milovanovich, I.D.; Đuknić, M.; Radusinović, M.; Popovac, N.; Đorđić, I.; Leković, Z.; Janković, R.M. Endoscopic and Histopathological Findings in the Upper Gastrointestinal Tract in Pediatric Crohn’s Disease. Preprints 2024. [Google Scholar] [CrossRef]
- Winawer, S.J.; Zauber, A.G.; Ho, M.N.; O’Brien, M.J.; Gottlieb, L.S.; Sternberg, S.S.; Waye, J.D.; Schapiro, M.; Bond, J.H.; Panish, J.F. Prevention of colorectal cancer by colonoscopic polypectomy. The National Polyp Study Workgroup. N. Engl. J. Med. 1993, 329, 1977–1981. [Google Scholar] [CrossRef] [PubMed]
- Lieberman, D.; Moravec, M.; Holub, J.; Michaels, L.; Eisen, G. Polyp Size and Advanced Histology in Patients Undergoing Colonoscopy Screening: Implications for CT Colonography. Gastroenterology 2008, 135, 1100–1105. [Google Scholar] [CrossRef]
- Snover, D.C. Update on the serrated pathway to colorectal carcinoma. Hum. Pathol. 2011, 42, 1–10. [Google Scholar] [CrossRef]
- Bernal, J.; Sánchez, J.; Vilariño, F. Towards automatic polyp detection with a polyp appearance model. Pattern Recognit. 2012, 45, 3166–3182. [Google Scholar] [CrossRef]
- Kuipers, E.J.; Rösch, T.; Bretthauer, M. Colorectal cancer screening—Optimizing current strategies and new directions. Nat. Rev. Clin. Oncol. 2013, 10, 130–142. [Google Scholar] [CrossRef]
- Rex, D.K.; Johnson, D.A.; Lieberman, D.A.; Burt, R.W.; Sonnenberg, A. Colorectal cancer prevention 2000: Screening recommendations of the American College of Gastroenterology1. Am. J. Gastroenterol. 2000, 95, 868–877. [Google Scholar] [CrossRef]
- Van Rijn, J.C.; Reitsma, J.B.; Stoker, J.; Bossuyt, P.M.; Van Deventer, S.J.; Dekker, E. Polyp Miss Rate Determined by Tandem Colonoscopy: A Systematic Review. Am. J. Gastroenterol. 2006, 101, 343–350. [Google Scholar] [CrossRef] [PubMed]
- Ferlitsch, M.; Hassan, C.; Bisschops, R.; Bhandari, P.; Dinis-Ribeiro, M.; Risio, M.; Paspatis, G.A.; Moss, A.; Libânio, D.; Lorenzo-Zúñiga, V.; et al. Colorectal polypectomy and endoscopic mucosal resection: European Society of Gastrointestinal Endoscopy (ESGE) Guideline—Update 2024. Endoscopy 2024, 56, 516–545. [Google Scholar] [CrossRef]
- Granados-Romero, J.J.; Valderrama-Treviño, A.I.; Contreras-Flores, E.H.; Barrera-Mera, B.; Herrera Enríquez, M.; Uriarte-Ruíz, K.; Ceballos-Villalba, J.C.; Estrada-Mata, A.G.; Alvarado Rodríguez, C.; Arauz-Peña, G. Colorectal cancer: A review. Int. J. Res. Med. Sci. 2017, 5, 4667. [Google Scholar] [CrossRef]
- Winawer, S.J.; Zauber, A.G.; O’Brien, M.J.; Geenen, J.; Waye, J.D. The National Polyp Study at 40: Challenges then and now. Gastrointest. Endosc. 2021, 93, 720–726. [Google Scholar] [CrossRef]
- Løberg, M.; Kalager, M.; Holme, Ø.; Hoff, G.; Adami, H.-O.; Bretthauer, M. Long-Term Colorectal-Cancer Mortality after Adenoma Removal. N. Engl. J. Med. 2014, 371, 799–807. [Google Scholar] [CrossRef]
- Heresbach, D.; Barrioz, T.; Lapalus, M.G.; Coumaros, D.; Bauret, P.; Potier, P.; Sautereau, D.; Boustière, C.; Grimaud, J.C.; Barthélémy, C.; et al. Miss rate for colorectal neoplastic polyps: A prospective multicenter study of back-to-back video colonoscopies. Endoscopy 2008, 40, 284–290. [Google Scholar] [CrossRef] [PubMed]
- Pickhardt, P.J.; Nugent, P.A.; Choi, J.R.; Schindler, W.R. Flat Colorectal Lesions in Asymptomatic Adults: Implications for Screening with CT Virtual Colonoscopy. Am. J. Roentgenol. 2004, 183, 1343–1347. [Google Scholar] [CrossRef]
- Ahn, S.B.; Han, D.S.; Bae, J.H.; Byun, T.J.; Kim, J.P.; Eun, C.S. The Miss Rate for Colorectal Adenoma Determined by Quality-Adjusted, Back-to-Back Colonoscopies. Gut Liver 2012, 6, 64–70. [Google Scholar] [CrossRef]
- Hassan, C.; Repici, A.; Sharma, P.; Correale, L.; Zullo, A.; Bretthauer, M.; Senore, C.; Spada, C.; Bellisario, C.; Bhandari, P.; et al. Efficacy and safety of endoscopic resection of large colorectal polyps: A systematic review and meta-analysis. Gut 2016, 65, 806–820. [Google Scholar] [CrossRef]
- Moss, A.; Williams, S.J.; Hourigan, L.F.; Brown, G.; Tam, W.; Singh, R.; Zanati, S.; Burgess, N.G.; Sonson, R.; Byth, K.; et al. Long-term adenoma recurrence following wide-field endoscopic mucosal resection (WF-EMR) for advanced colonic mucosal neoplasia is infrequent: Results and risk factors in 1000 cases from the Australian Colonic EMR (ACE) study. Gut 2015, 64, 57–65. [Google Scholar] [CrossRef]
- Jass, J.R. Classification of colorectal cancer based on correlation of clinical, morphological and molecular features. Histopathology 2007, 50, 113–130. [Google Scholar] [CrossRef]
- Bressler, B.; Paszat, L.F.; Chen, Z.; Rothwell, D.M.; Vinden, C.; Rabeneck, L. Rates of New or Missed Colorectal Cancers After Colonoscopy and Their Risk Factors: A Population-Based Analysis. Gastroenterology 2007, 132, 96–102. [Google Scholar] [CrossRef]
- Kaminski, M.F.; Polkowski, M.; Zwierko, M.; Butruk, E. Quality Indicators for Colonoscopy and the Risk of Interval Cancer. N. Engl. J. Med. 2010, 362, 1795–1803. [Google Scholar] [CrossRef]
- Corley, D.A.; Jensen, C.D.; Marks, A.R.; Zhao, W.K.; Lee, J.K.; Doubeni, C.A.; Zauber, A.G.; De Boer, J.; Fireman, B.H.; Schottinger, J.E.; et al. Adenoma Detection Rate and Risk of Colorectal Cancer and Death. N. Engl. J. Med. 2014, 370, 1298–1306. [Google Scholar] [CrossRef]
- Tajbakhsh, N.; Gurudu, S.R.; Liang, J. Automated Polyp Detection in Colonoscopy Videos Using Shape and Context Information. IEEE Trans. Med. Imaging 2016, 35, 630–644. [Google Scholar] [CrossRef] [PubMed]
- Urban, G.; Tripathi, P.; Alkayali, T.; Mittal, M.; Jalali, F.; Karnes, W.; Baldi, P. Deep Learning Localizes and Identifies Polyps in Real Time With 96% Accuracy in Screening Colonoscopy. Gastroenterology 2018, 155, 1069–1078.e8. [Google Scholar] [CrossRef]
- Bui, N.-T.; Hoang, D.-H.; Nguyen, Q.-T.; Tran, M.-T.; Le, N. MEGANet: Multi-Scale Edge-Guided Attention Network for Weak Boundary Polyp Segmentation. In Proceedings of the 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 3–7 January 2024; pp. 7970–7979. [Google Scholar]
- Kang, X.; Ma, Z.; Liu, K.; Li, Y.; Miao, Q. Multi-scale information sharing and selection network with boundary attention for polyp segmentation. Eng. Appl. Artif. Intell. 2025, 139, 109467. [Google Scholar] [CrossRef]
- Liu, Z.; Zheng, S.; Sun, X.; Zhu, Z.; Zhao, Y.; Yang, X.; Zhao, Y. The Devil Is in the Boundary: Boundary-enhanced Polyp Segmentation. IEEE Trans. Circuits Syst. Video Technol. 2024, 34, 5414–5423. [Google Scholar] [CrossRef]
- Zhou, T.; Zhou, Y.; He, K.; Gong, C.; Yang, J.; Fu, H.; Shen, D. Cross-level Feature Aggregation Network for Polyp Segmentation. Pattern Recognit. 2023, 140, 109555. [Google Scholar] [CrossRef]
- Yue, G.; Han, W.; Jiang, B.; Zhou, T.; Cong, R.; Wang, T. Boundary Constraint Network with Cross Layer Feature Integration for Polyp Segmentation. IEEE J. Biomed. Health Inform. 2022, 26, 4090–4099. [Google Scholar] [CrossRef]
- Lin, Y.; Wu, J.; Xiao, G.; Guo, J.; Chen, G.; Ma, J. BSCA-Net: Bit Slicing Context Attention network for polyp segmentation. Pattern Recognit. 2022, 132, 108917. [Google Scholar] [CrossRef]
- Chen, H.; Ju, H.; Qin, J.; Song, J.; Lyu, Y.; Liu, X. Dataset-level color augmentation and multi-scale exploration methods for polyp segmentation. Expert. Syst. Appl. 2025, 260, 125395. [Google Scholar] [CrossRef]
- Zhou, T.; Zhou, Y.; Li, G.; Chen, G.; Shen, J. Uncertainty-aware Hierarchical Aggregation Network for Medical Image Segmentation. IEEE Trans. Circuits Syst. Video Technol. 2024, 34, 7440–7453. [Google Scholar] [CrossRef]
- Wang, T.; Qi, X.; Yang, G. Polyp Segmentation via Semantic Enhanced Perceptual Network. IEEE Trans. Circuits Syst. Video Technol. 2024, 34, 12594–12607. [Google Scholar] [CrossRef]
- Jin, Q.; Hou, H.; Zhang, G.; Li, Z. FEGNet: A Feedback Enhancement Gate Network for Automatic Polyp Segmentation. IEEE J. Biomed. Health Inform. 2023, 27, 3420–3430. [Google Scholar] [CrossRef]
- Jain, S.; Atale, R.; Gupta, A.; Mishra, U.; Seal, A.; Ojha, A.; Kuncewicz, J.; Krejcar, O. CoInNet: A Convolution-Involution Network with a Novel Statistical Attention for Automatic Polyp Segmentation. IEEE Trans. Med. Imaging 2023, 42, 3987–4000. [Google Scholar] [CrossRef]
- Wang, C.; Xu, R.; Xu, S.; Meng, W.; Zhang, X. Automatic polyp segmentation via image-level and surrounding-level context fusion deep neural network. Eng. Appl. Artif. Intell. 2023, 123, 106168. [Google Scholar] [CrossRef]
- Zhang, R.; Lai, P.; Wan, X.; Fan, D.-J.; Gao, F.; Wu, X.-J.; Li, G. Lesion-Aware Dynamic Kernel for Polyp Segmentation. In Proceedings of the Medical Image Computing and Computer Assisted Intervention—MICCAI 2022, Singapore, 18–22 September 2022; Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S., Eds.; Springer Nature: Cham, Switzerland, 2022; pp. 99–109. [Google Scholar]
- Srivastava, A.; Jha, D.; Chanda, S.; Pal, U.; Johansen, H.D.; Johansen, D.; Riegler, M.A.; Ali, S.; Halvorsen, P. MSRF-Net: A Multi-Scale Residual Fusion Network for Biomedical Image Segmentation. IEEE J. Biomed. Health Inform. 2022, 26, 2252–2263. [Google Scholar] [CrossRef] [PubMed]
- Liu, M.; Jiao, L.; Liu, X.; Li, L.; Liu, F.; Yang, S.; Wang, S.; Hou, B. Multi-scale Contourlet Knowledge Guide Learning Segmentation. IEEE Trans. Multimed. 2023, 26, 4831–4845. [Google Scholar] [CrossRef]
- Song, P.; Li, J.; Fan, H. Attention based multi-scale parallel network for polyp segmentation. Comput. Biol. Med. 2022, 146, 105476. [Google Scholar] [CrossRef]
- Liu, G.; Chen, Z.; Liu, D.; Chang, B.; Dou, Z. FTMF-Net: A Fourier Transform-Multiscale Feature Fusion Network for Segmentation of Small Polyp Objects. IEEE Trans. Instrum. Meas. 2023, 72, 502085. [Google Scholar] [CrossRef]
- Gu, Y.; Zhou, T.; Zhang, Y.; Zhou, Y.; He, K.; Gong, C.; Fu, H. Dual-scale enhanced and cross-generative consistency learning for semi-supervised medical image segmentation. Pattern Recognit. 2025, 158, 110962. [Google Scholar] [CrossRef]
- Ren, G.; Lazarou, M.; Yuan, J.; Stathaki, T. Towards Automated Polyp Segmentation Using Weakly- and Semi-Supervised Learning and Deformable Transformers. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Vancouver, BC, Canada, 17–24 June 2023; pp. 4355–4364. [Google Scholar]
- Yu, Z.; Zhao, L.; Liao, T.; Zhang, X.; Chen, G.; Xiao, G. A novel non-pretrained deep supervision network for polyp segmentation. Pattern Recognit. 2024, 154, 110554. [Google Scholar] [CrossRef]
- Guo, X.; Chen, Z.; Liu, J.; Yuan, Y. Non-equivalent images and pixels: Confidence-aware resampling with meta-learning mixup for polyp segmentation. Med. Image Anal. 2022, 78, 102394. [Google Scholar] [CrossRef]
- Wang, A.; Xu, M.; Zhang, Y.; Islam, M.; Ren, H. S^2ME: Spatial-Spectral Mutual Teaching and Ensemble Learning for Scribble-Supervised Polyp Segmentation. In Proceedings of the Medical Image Computing and Computer Assisted Intervention—MICCAI 2023, Vancouver, BC, Canada, 8–12 October 2023; Greenspan, H., Madabhushi, A., Mousavi, P., Salcudean, S., Duncan, J., Syeda-Mahmood, T., Taylor, R., Eds.; Springer Nature: Cham, Switzerland, 2023; pp. 35–45. [Google Scholar]
- Jia, X.; Shen, Y.; Yang, J.; Song, R.; Zhang, W.; Meng, M.Q.-H.; Liao, J.C.; Xing, L. PolypMixNet: Enhancing semi-supervised polyp segmentation with polyp-aware augmentation. Comput. Biol. Med. 2024, 170, 108006. [Google Scholar] [CrossRef]
- Zhang, Z.; Li, Y.; Shin, B.-S. Generalizable Polyp Segmentation via Randomized Global Illumination Augmentation. IEEE J. Biomed. Health Inform. 2024, 28, 2138–2151. [Google Scholar] [CrossRef]
- Lu, Z.; Zhang, Y.; Zhou, Y.; Wu, Y.; Zhou, T. Domain-interactive Contrastive Learning and Prototype-guided Self-training for Cross-domain Polyp Segmentation. IEEE Trans. Med. Imaging 2024. [Google Scholar] [CrossRef] [PubMed]
- Wang, J.; Chen, C. Unsupervised Adaptation of Polyp Segmentation Models via Coarse-to-Fine Self-Supervision. In Proceedings of the Information Processing in Medical Imaging, San Carlos de Bariloche, Argentina, 18–23 June 2023; Frangi, A., de Bruijne, M., Wassermann, D., Navab, N., Eds.; Springer Nature: Cham, Switzerland, 2023; pp. 250–262. [Google Scholar]
- Wu, C.; Long, C.; Li, S.; Yang, J.; Jiang, F.; Zhou, R. MSRAformer: Multiscale spatial reverse attention network for polyp segmentation. Comput. Biol. Med. 2022, 151, 106274. [Google Scholar] [CrossRef] [PubMed]
- Liu, J.; Chen, Q.; Zhang, Y.; Wang, Z.; Deng, X.; Wang, J. Multi-level feature fusion network combining attention mechanisms for polyp segmentation. Inf. Fusion 2024, 104, 102195. [Google Scholar] [CrossRef]
- Xu, W.; Xu, R.; Wang, C.; Li, X.; Xu, S.; Guo, L. PSTNet: Enhanced Polyp Segmentation with Multi-Scale Alignment and Frequency Domain Integration. IEEE J. Biomed. Health Inform. 2024, 28, 6042–6053. [Google Scholar] [CrossRef] [PubMed]
- Yue, G.; Xiao, H.; Zhou, T.; Tan, S.; Liu, Y.; Yan, W. Progressive Feature Enhancement Network for Automated Colorectal Polyp Segmentation. IEEE Trans. Autom. Sci. Eng. 2024, 22, 5792–5803. [Google Scholar] [CrossRef]
- Dong, B.; Wang, W.; Fan, D.-P.; Li, J.; Fu, H.; Shao, L. Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers. CAAI Artif. Intell. Res. 2023, 2, 9150015. [Google Scholar] [CrossRef]
- Liu, G.; Yao, S.; Liu, D.; Chang, B.; Chen, Z.; Wang, J.; Wei, J. CAFE-Net: Cross-Attention and Feature Exploration Network for polyp segmentation. Expert Syst. Appl. 2024, 238, 121754. [Google Scholar] [CrossRef]
- Li, W.; Lu, W.; Chu, J.; Fan, F. LACINet: A Lesion-Aware Contextual Interaction Network for Polyp Segmentation. IEEE Trans. Instrum. Meas. 2023, 72, 5029112. [Google Scholar] [CrossRef]
- Cai, L.; Chen, L.; Huang, J.; Wang, Y.; Zhang, Y. Know your orientation: A viewpoint-aware framework for polyp segmentation. Med. Image Anal. 2024, 97, 103288. [Google Scholar] [CrossRef]
- Rahman, M.M.; Munir, M.; Jha, D.; Bagci, U.; Marculescu, R. PP-SAM: Perturbed Prompts for Robust Adaption of Segment Anything Model for Polyp Segmentation. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Seattle, WA, USA, 17–18 June 2024; pp. 4989–4995. [Google Scholar]
- Jha, D.; Tomar, N.K.; Bhattacharya, D.; Bagci, U. TransRUPNet for Improved Polyp Segmentation. arXiv 2024. [Google Scholar] [CrossRef]
- Huang, X.; Gong, H.; Zhang, J. HST-MRF: Heterogeneous Swin Transformer with Multi-Receptive Field for Medical Image Segmentation. IEEE J. Biomed. Health Inform. 2024, 28, 4048–4061. [Google Scholar] [CrossRef]
- Lin, A.; Chen, B.; Xu, J.; Zhang, Z.; Lu, G.; Zhang, D. DS-TransUNet: Dual Swin Transformer U-Net for Medical Image Segmentation. IEEE Trans. Instrum. Meas. 2022, 71, 4005615. [Google Scholar] [CrossRef]
- Liu, Y.; Yang, Y.; Jiang, Y.; Xie, Z. Multi-view orientational attention network combining point-based affinity for polyp segmentation. Expert Syst. Appl. 2024, 249, 123663. [Google Scholar] [CrossRef]
- Yin, X.; Zeng, J.; Hou, T.; Tang, C.; Gan, C.; Jain, D.K.; García, S. RSAFormer: A method of polyp segmentation with region self-attention transformer. Comput. Biol. Med. 2024, 172, 108268. [Google Scholar] [CrossRef] [PubMed]
- Yang, C.; Zhang, Z. PFD-Net: Pyramid Fourier Deformable Network for medical image segmentation. Comput. Biol. Med. 2024, 172, 108302. [Google Scholar] [CrossRef] [PubMed]
- Wu, H.; Min, W.; Gai, D.; Huang, Z.; Geng, Y.; Wang, Q.; Chen, R. HD-Former: A hierarchical dependency Transformer for medical image segmentation. Comput. Biol. Med. 2024, 178, 108671. [Google Scholar] [CrossRef]
- Du, H.; Wang, J.; Liu, M.; Wang, Y.; Meijering, E. SwinPA-Net: Swin Transformer-Based Multiscale Feature Pyramid Aggregation Network for Medical Image Segmentation. IEEE Trans. Neural Netw. Learn. Syst. 2022, 35, 5355–5366. [Google Scholar] [CrossRef]
- Zhang, W.; Fu, C.; Zheng, Y.; Zhang, F.; Zhao, Y.; Sham, C.-W. HSNet: A hybrid semantic network for polyp segmentation. Comput. Biol. Med. 2022, 150, 106173. [Google Scholar] [CrossRef]
- Li, W.; Nie, X.; Li, F.; Huang, Z.; Zeng, G. FMCA-Net: A feature secondary multiplexing and dilated convolutional attention polyp segmentation network based on pyramid vision transformer. Expert Syst. Appl. 2025, 260, 125419. [Google Scholar] [CrossRef]
- Jiang, X.; Zhu, Y.; Liu, Y.; Wang, N.; Yi, L. MC-DC: An MLP-CNN Based Dual-path Complementary Network for Medical Image Segmentation. Comput. Methods Programs Biomed. 2023, 242, 107846. [Google Scholar] [CrossRef]
- Wu, H.; Zhao, Z.; Wang, Z. META-Unet: Multi-Scale Efficient Transformer Attention Unet for Fast and High-Accuracy Polyp Segmentation. IEEE Trans. Autom. Sci. Eng. 2023, 21, 4117–4128. [Google Scholar] [CrossRef]
- Sanderson, E.; Matuszewski, B.J. FCN-Transformer Feature Fusion for Polyp Segmentation. In Proceedings of the 26th Annual Conference on Medical Image Understanding and Analysis (MIUA 2022), Cambridge, UK, 27–29 July 2022; Volume 13413, pp. 892–907. [Google Scholar] [CrossRef]
- Li, W.; Huang, Z.; Li, F.; Zhao, Y.; Zhang, H. CIFG-Net: Cross-level information fusion and guidance network for Polyp Segmentation. Comput. Biol. Med. 2024, 169, 107931. [Google Scholar] [CrossRef] [PubMed]
- Shao, H.; Zhang, Y.; Hou, Q. Polyper: Boundary Sensitive Polyp Segmentation. Proc. AAAI Conf. Artif. Intell. 2024, 38, 4731–4739. [Google Scholar] [CrossRef]
- Cai, L.; Wu, M.; Chen, L.; Bai, W.; Yang, M.; Lyu, S.; Zhao, Q. Using Guided Self-Attention with Local Information for Polyp Segmentation. In Proceedings of the Medical Image Computing and Computer Assisted Intervention—MICCAI 2022, Singapore, 18–22 September 2022; Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S., Eds.; Springer Nature: Cham, Switzerland, 2022; pp. 629–638. [Google Scholar]
- Xie, Y.; Zhou, T.; Zhou, Y.; Chen, G. SimTxtSeg: Weakly-Supervised Medical Image Segmentation with Simple Text Cues. arXiv 2024. [Google Scholar] [CrossRef]
- Li, H.; Zhang, D.; Yao, J.; Han, L.; Li, Z.; Han, J. ASPS: Augmented Segment Anything Model for Polyp Segmentation. arXiv 2024. [Google Scholar] [CrossRef]
- Xiong, X.; Wu, Z.; Tan, S.; Li, W.; Tang, F.; Chen, Y.; Li, S.; Ma, J.; Li, G. SAM2-UNet: Segment Anything 2 Makes Strong Encoder for Natural and Medical Image Segmentation. arXiv 2024, arXiv:2408.08870. [Google Scholar] [CrossRef]
- Jha, D.; Tomar, N.K.; Sharma, V.; Bagci, U. TransNetR: Transformer-based Residual Network for Polyp Segmentation with Multi-Center Out-of-Distribution Testing. arXiv 2023, arXiv:2303.07428. [Google Scholar]
- Xie, J.; Liao, R.; Zhang, Z.; Yi, S.; Zhu, Y.; Luo, G. ProMamba: Prompt-Mamba for polyp segmentation. arXiv 2024. [Google Scholar] [CrossRef]
- Zhang, M.; Yu, Y.; Jin, S.; Gu, L.; Ling, T.; Tao, X. VM-UNET-V2: Rethinking Vision Mamba UNet for Medical Image Segmentation. In Proceedings of the Bioinformatics Research and Applications, Milan, Italy, 13–15 September 2024; Peng, W., Cai, Z., Skums, P., Eds.; Springer Nature: Singapore, 2024; pp. 335–346. [Google Scholar]
- Dutta, T.K.; Majhi, S.; Nayak, D.R.; Jha, D. SAM-Mamba: Mamba Guided SAM Architecture for Generalized Zero-Shot Polyp Segmentation. arXiv 2024. [Google Scholar] [CrossRef]
- Zhu, X.; Wang, W.; Zhang, C.; Wang, H. Polyp-Mamba: A Hybrid Multi-Frequency Perception Gated Selection Network for polyp segmentation. Inf. Fusion 2025, 115, 102759. [Google Scholar] [CrossRef]
- Wu, R.; Liu, Y.; Liang, P.; Chang, Q. H-vmunet: High-order Vision Mamba UNet for Medical Image Segmentation. arXiv 2024. [Google Scholar] [CrossRef]
- Du, Y.; Jiang, Y.; Tan, S.; Liu, S.-Q.; Li, Z.; Li, G.; Wan, X. Highlighted Diffusion Model as Plug-in Priors for Polyp Segmentation. IEEE J. Biomed. Health Inform. 2024, 29, 1209–1220. [Google Scholar] [CrossRef]
- Wang, Z.; Liu, M.; Jiang, J.; Qu, X. Colorectal polyp segmentation with denoising diffusion probabilistic models. Comput. Biol. Med. 2024, 180, 108981. [Google Scholar] [CrossRef]
- Mansoori, M.; Shahabodini, S.; Abouei, J.; Plataniotis, K.N.; Mohammadi, A. Polyp SAM 2: Advancing Zero shot Polyp Segmentation in Colorectal Cancer Detection. arXiv 2024. [Google Scholar] [CrossRef]
- Banik, D.; Roy, K.; Krejcar, O.; Bhattacharjee, D. dHBLSN: A diligent hierarchical broad learning system network for cogent polyp segmentation. Knowl.-Based Syst. 2024, 300, 112228. [Google Scholar] [CrossRef]
- Wu, H.; Zhao, Z.; Zhong, J.; Wang, W.; Wen, Z.; Qin, J. PolypSeg+: A Lightweight Context-Aware Network for Real-Time Polyp Segmentation. IEEE Trans. Cybern. 2023, 53, 2610–2621. [Google Scholar] [CrossRef]
- Wan, L.; Chen, Z.; Xiao, Y.; Zhao, J.; Feng, W.; Fu, H. Iterative feedback-based models for image and video polyp segmentation. Comput. Biol. Med. 2024, 177, 108569. [Google Scholar] [CrossRef]
- Lu, Y.; Yang, Y.; Xing, Z.; Wang, Q.; Zhu, L. Diff-VPS: Video Polyp Segmentation via a Multi-task Diffusion Network with Adversarial Temporal Reasoning. arXiv 2024, arXiv:2409.07238. [Google Scholar]
- Chen, G.; Yang, J.; Pu, X.; Ji, G.-P.; Xiong, H.; Pan, Y.; Cui, H.; Xia, Y. MAST: Video Polyp Segmentation with a Mixture-Attention Siamese Transformer. arXiv 2024, arXiv:2401.12439. [Google Scholar] [CrossRef]
- Xu, Z.; Rittscher, J.; Ali, S. SSTFB: Leveraging self-supervised pretext learning and temporal self-attention with feature branching for real-time video polyp segmentation. arXiv 2024, arXiv:2406.10200. [Google Scholar]
- Hu, Q.; Yi, Z.; Zhou, Y.; Peng, F.; Liu, M.; Li, Q.; Wang, Z. SALI: Short-term Alignment and Long-term Interaction Network for Colonoscopy Video Polyp Segmentation. arXiv 2024, arXiv:2406.13532. [Google Scholar]
- Yang, Y.; Xing, Z.; Zhu, L. Vivim: A Video Vision Mamba for Medical Video Object Segmentation. arXiv 2024, arXiv:2401.14168. [Google Scholar] [CrossRef]
- Wang, M.; An, X.; Pei, Z.; Li, N.; Zhang, L.; Liu, G.; Ming, D. An Efficient Multi-Task Synergetic Network for Polyp Segmentation and Classification. IEEE J. Biomed. Health Inform. 2023, 28, 1228–1239. [Google Scholar] [CrossRef]
- Mazumdar, H.; Chakraborty, C.; Sathvik, M.; Jayakumar, P.; Kaushik, A. Optimizing Pix2Pix GAN with Attention Mechanisms for AI-Driven Polyp Segmentation in IoMT-Enabled Smart Healthcare. IEEE J. Biomed. Health Inform. 2023, 29, 3825–3832. [Google Scholar] [CrossRef]
- Tomar, N.K.; Jha, D.; Bagci, U.; Ali, S. TGANet: Text-Guided Attention for Improved Polyp Segmentation. In Proceedings of the Medical Image Computing and Computer Assisted Intervention—MICCAI 2022, Singapore, 18–22 September 2022; pp. 151–160. [Google Scholar]
- Liu, T.; Ye, X.; Hu, K.; Xiong, D.; Zhang, Y.; Li, X.; Gao, X. Polyp segmentation with distraction separation. Expert Syst. Appl. 2023, 228, 120434. [Google Scholar] [CrossRef]
- Ji, Z.; Qian, H.; Ma, X. Progressive Group Convolution Fusion network for colon polyp segmentation. Biomed. Signal Process. Control 2024, 96, 106586. [Google Scholar] [CrossRef]
- Silva, J.; Histace, A.; Romain, O.; Dray, X.; Granado, B. Toward embedded detection of polyps in WCE images for early diagnosis of colorectal cancer. Int. J. Comput. Assist. Radiol. Surg. 2014, 9, 283–293. [Google Scholar] [CrossRef] [PubMed]
- Bernal, J.; Sánchez, F.J.; Fernández-Esparrach, G.; Gil, D.; Rodríguez, C.; Vilariño, F. WM-DOVA maps for accurate polyp highlighting in colonoscopy: Validation vs. saliency maps from physicians. Comput. Med. Imaging Graph. 2015, 43, 99–111. [Google Scholar] [CrossRef]
- Vázquez, D.; Bernal, J.; Sánchez, F.J.; Fernández-Esparrach, G.; López, A.M.; Romero, A.; Drozdzal, M.; Courville, A. A Benchmark for Endoluminal Scene Segmentation of Colonoscopy Images. J. Healthc. Eng. 2017, 2017, 4037190. [Google Scholar] [CrossRef]
- Jha, D.; Smedsrud, P.H.; Riegler, M.A.; Halvorsen, P.; de Lange, T.; Johansen, D.; Johansen, H.D. Kvasir-SEG: A Segmented Polyp Dataset. In Proceedings of the MultiMedia Modeling, Daejeon, Republic of Korea, 5–8 January 2020; Ro, Y.M., Cheng, W.-H., Kim, J., Chu, W.-T., Cui, P., Choi, J.-W., Hu, M.-C., De Neve, W., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp. 451–462. [Google Scholar]
- Borgli, H.; Thambawita, V.; Smedsrud, P.H.; Hicks, S.; Jha, D.; Eskeland, S.L.; Randel, K.R.; Pogorelov, K.; Lux, M.; Nguyen, D.T.D.; et al. HyperKvasir, a comprehensive multi-class image and video dataset for gastrointestinal endoscopy. Sci. Data 2020, 7, 283. [Google Scholar] [CrossRef]
- Sánchez-Peralta, L.F.; Pagador, J.B.; Picón, A.; Calderón, Á.J.; Polo, F.; Andraka, N.; Bilbao, R.; Glover, B.; Saratxaga, C.L.; Sánchez-Margallo, F.M. PICCOLO White-Light and Narrow-Band Imaging Colonoscopic Dataset: A Performance Comparative of Models and Datasets. Appl. Sci. 2020, 10, 8501. [Google Scholar] [CrossRef]
- Ngoc Lan, P.; An, N.S.; Hang, D.V.; Long, D.V.; Trung, T.Q.; Thuy, N.T.; Sang, D.V. NeoUNet: Towards Accurate Colon Polyp Segmentation and Neoplasm Detection. In Proceedings of the Advances in Visual Computing, Virtual, 4–6 October 2021; Bebis, G., Athitsos, V., Yan, T., Lau, M., Li, F., Shi, C., Yuan, X., Mousas, C., Bruder, G., Eds.; Springer International Publishing: Cham, Switzerland, 2021; pp. 15–28. [Google Scholar]
- Ali, S.; Jha, D.; Ghatwary, N.; Realdon, S.; Cannizzaro, R.; Salem, O.E.; Lamarque, D.; Daul, C.; Riegler, M.A.; Anonsen, K.V.; et al. A multi-centre polyp detection and segmentation dataset for generalisability assessment. Sci. Data 2023, 10, 75. [Google Scholar] [CrossRef] [PubMed]
- Ma, Y.; Chen, X.; Cheng, K.; Li, Y.; Sun, B. LDPolypVideo Benchmark: A Large-Scale Colonoscopy Video Dataset of Diverse Polyps. In Proceedings of the Medical Image Computing and Computer Assisted Intervention—MICCAI 2021, Strasbourg, France, 27 September–1 October 2021; de Bruijne, M., Cattin, P.C., Cotin, S., Padoy, N., Speidel, S., Zheng, Y., Essert, C., Eds.; Springer International Publishing: Cham, Switzerland, 2021; pp. 387–396. [Google Scholar]
- Smedsrud, P.H.; Thambawita, V.; Hicks, S.A.; Gjestang, H.; Nedrejord, O.O.; Næss, E.; Borgli, H.; Jha, D.; Berstad, T.J.D.; Eskeland, S.L.; et al. Kvasir-Capsule, a video capsule endoscopy dataset. Sci. Data 2021, 8, 142. [Google Scholar] [CrossRef]
- Misawa, M.; Kudo, S.; Mori, Y.; Hotta, K.; Ohtsuka, K.; Matsuda, T.; Saito, S.; Kudo, T.; Baba, T.; Ishida, F.; et al. Development of a computer-aided detection system for colonoscopy and a publicly accessible large colonoscopy video database (with video). Gastrointest. Endosc. 2021, 93, 960–967.e3. [Google Scholar] [CrossRef]
- Jha, S.; Son, L.H.; Kumar, R.; Priyadarshini, I.; Smarandache, F.; Long, H.V. Neutrosophic image segmentation with Dice Coefficients. Measurement 2019, 134, 762–772. [Google Scholar] [CrossRef]
- Shamir, R.R.; Duchin, Y.; Kim, J.; Sapiro, G.; Harel, N. Continuous Dice Coefficient: A Method for Evaluating Probabilistic Segmentations. arXiv 2019. [Google Scholar] [CrossRef]
- Zhao, R.; Qian, B.; Zhang, X.; Li, Y.; Wei, R.; Liu, Y.; Pan, Y. Rethinking dice loss for medical image segmentation. In Proceedings of the 2020 IEEE International Conference on Data Mining (ICDM), Sorrento, Italy, 17–20 November 2020; pp. 851–860. [Google Scholar]
- Jha, D.; Ali, S.; Tomar, N.; Johansen, H.; Johansen, D.; Rittscher, J.; Riegler, M.; Halvorsen, P. Real-Time Polyp Detection, Localization and Segmentation in Colonoscopy Using Deep Learning. IEEE Access 2021, 9, 40496–40510. [Google Scholar] [CrossRef] [PubMed]
- Rezatofighi, H.; Tsoi, N.; Gwak, J.; Sadeghian, A.; Reid, I.; Savarese, S. Generalized Intersection Over Union: A Metric and a Loss for Bounding Box Regression. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 658–666. [Google Scholar]
- Coleman, C.; Kang, D.; Narayanan, D.; Nardi, L.; Zhao, T.; Zhang, J.; Bailis, P.; Olukotun, K.; Ré, C.; Zaharia, M. Analysis of DAWNBench, a Time-to-Accuracy Machine Learning Performance Benchmark. SIGOPS Oper. Syst. Rev. 2019, 53, 14–25. [Google Scholar] [CrossRef]
- Wickstrøm, K.; Kampffmeyer, M.; Jenssen, R. Uncertainty and interpretability in convolutional neural networks for semantic segmentation of colorectal polyps. Med. Image Anal. 2020, 60, 101619. [Google Scholar] [CrossRef]
- Ramos, D.L.; Hortúa, H.J. Deep Bayesian segmentation for colon polyps: Well-calibrated predictions in medical imaging. Biomed. Signal Process. Control 2025, 104, 107383. [Google Scholar] [CrossRef]
- Char, D.S.; Shah, N.H.; Magnus, D. Implementing Machine Learning in Health Care—Addressing Ethical Challenges. N. Engl. J. Med. 2018, 378, 981–983. [Google Scholar] [CrossRef]
- Price, W.N.; Cohen, I.G. Privacy in the age of medical big data. Nat. Med. 2019, 25, 37–43. [Google Scholar] [CrossRef] [PubMed]
- Mittelstadt, B. Principles alone cannot guarantee ethical AI. Nat. Mach. Intell. 2019, 1, 501–507. [Google Scholar] [CrossRef]
- Kaissis, G.A.; Makowski, M.R.; Rückert, D.; Braren, R.F. Secure, privacy-preserving and federated machine learning in medical imaging. Nat. Mach. Intell. 2020, 2, 305–311. [Google Scholar] [CrossRef]
- Vayena, E.; Blasimme, A.; Cohen, I.G. Machine learning in medicine: Addressing ethical challenges. PLoS Med. 2018, 15, e1002689. [Google Scholar] [CrossRef]
- Geis, J.R.; Brady, A.P.; Wu, C.C.; Spencer, J.; Ranschaert, E.; Jaremko, J.L.; Langer, S.G.; Borondy Kitts, A.; Birch, J.; Shields, W.F.; et al. Ethics of Artificial Intelligence in Radiology: Summary of the Joint European and North American Multisociety Statement. Radiology 2019, 293, 436–440. [Google Scholar] [CrossRef] [PubMed]
- Willemink, M.J.; Koszek, W.A.; Hardell, C.; Wu, J.; Fleischmann, D.; Harvey, H.; Folio, L.R.; Summers, R.M.; Rubin, D.L.; Lungren, M.P. Preparing Medical Imaging Data for Machine Learning. Radiology 2020, 295, 4–15. [Google Scholar] [CrossRef] [PubMed]
- Kelly, C.J.; Karthikesalingam, A.; Suleyman, M.; Corrado, G.; King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 2019, 17, 195. [Google Scholar] [CrossRef] [PubMed]
- Larrazabal, A.J.; Nieto, N.; Peterson, V.; Milone, D.H.; Ferrante, E. Gender imbalance in medical imaging datasets produces biased classifiers for computer-aided diagnosis. Proc. Natl. Acad. Sci. USA 2020, 117, 12592–12594. [Google Scholar] [CrossRef]
- Bai, L.; Chen, T.; Tan, Q.; Nah, W.J.; Li, Y.; He, Z.; Yuan, S.; Chen, Z.; Wu, J.; Islam, M.; et al. EndoUIC: Promptable Diffusion Transformer for Unified Illumination Correction in Capsule Endoscopy. In Proceedings of the Medical Image Computing and Computer Assisted Intervention—MICCAI 2024, Marrakesh, Morocco, 6–10 October 2024; Linguraru, M.G., Dou, Q., Feragen, A., Giannarou, S., Glocker, B., Lekadir, K., Schnabel, J.A., Eds.; Springer Nature: Cham, Switzerland, 2024; pp. 296–306. [Google Scholar]
- Shang, W.; Ren, D.; Yang, Y.; Zuo, W. Aggregating nearest sharp features via hybrid transformers for video deblurring. Inf. Sci. 2025, 694, 121689. [Google Scholar] [CrossRef]
- ELKarazle, K.; Raman, V.; Chua, C.; Then, P. A Hessian-Based Technique for Specular Reflection Detection and Inpainting in Colonoscopy Images. IEEE J. Biomed. Health Inform. 2024, 28, 4724–4736. [Google Scholar] [CrossRef]
- Sharma, V.; Kumar, A.; Jha, D.; Bhuyan, M.K.; Das, P.K.; Bagci, U. ControlPolypNet: Towards Controlled Colon Polyp Synthesis for Improved Polyp Segmentation. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Seattle, WA, USA, 17–18 June 2024; pp. 2325–2334. [Google Scholar]
- Ji, G.-P.; Liu, J.; Xu, P.; Barnes, N.; Khan, F.S.; Khan, S.; Fan, D.-P. Frontiers in Intelligent Colonoscopy. arXiv 2024. [Google Scholar] [CrossRef]
- Schön, R.; Lorenz, J.; Ludwig, K.; Lienhart, R. Adapting the Segment Anything Model During Usage in Novel Situations. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Seattle, WA, USA, 17–18 June 2024; pp. 3616–3626. [Google Scholar]
- Hu, Q.; Yi, Z.; Zhou, Y.; Huang, F.; Liu, M.; Li, Q.; Wang, Z. MonoBox: Tightness-free Box-supervised Polyp Segmentation using Monotonicity Constraint. arXiv 2024. [Google Scholar] [CrossRef]
- Long, J.; Lin, J.; Liu, D. W-PolypBox: Exploring bounding box priors constraints for weakly supervised polyp segmentation. Biomed. Signal Process. Control 2025, 103, 107418. [Google Scholar] [CrossRef]
- Zhang, Z.; Jiang, Y.; Wang, Y.; Xie, B.; Zhang, W.; Li, Y.; Chen, Z.; Jin, X.; Zeng, W. Exploring Contrastive Pre-training for Domain Connections in Medical Image Segmentation. IEEE Trans. Med. Imaging 2025, 44, 1686–1698. [Google Scholar] [CrossRef]
- Fang, Z.; Liu, Y.; Wu, H.; Qin, J. VP-SAM: Taming Segment Anything Model for Video Polyp Segmentation via Disentanglement and Spatio-Temporal Side Network. In Proceedings of the Computer Vision—ECCV 2024, Milan, Italy, 29 September–4 October 2024; Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G., Eds.; Springer Nature: Cham, Switzerland, 2025; pp. 367–383. [Google Scholar]
- Xiao, A.; Xuan, W.; Qi, H.; Xing, Y.; Ren, R.; Zhang, X.; Shao, L.; Lu, S. CAT-SAM: Conditional Tuning for Few-Shot Adaptation of Segment Anything Model. In Proceedings of the Computer Vision—ECCV 2024, Milan, Italy, 29 September–4 October 2024; Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G., Eds.; Springer Nature: Cham, Switzerland, 2025; pp. 189–206. [Google Scholar]
- Wang, L.; Xu, Q.; Chen, C.; Yang, H.; Deng, G. Adaptive cascade decoders for segmenting challenging regions in medical images. Comput. Biol. Med. 2025, 185, 109572. [Google Scholar] [CrossRef]
- Pan, H.; Jha, D.; Biswas, K.; Bagci, U. Frequency-Based Federated Domain Generalization for Polyp Segmentation. arXiv 2024. [Google Scholar] [CrossRef]
- Li, W.; Zhang, Y.; Zhou, H.; Yang, W.; Xie, Z.; He, Y. CLMS: Bridging domain gaps in medical imaging segmentation with source-free continual learning for robust knowledge transfer and adaptation. Med. Image Anal. 2025, 100, 103404. [Google Scholar] [CrossRef]
- Huang, L.; Ruan, S.; Xing, Y.; Feng, M. A review of uncertainty quantification in medical image analysis: Probabilistic and non-probabilistic methods. Med. Image Anal. 2024, 97, 103223. [Google Scholar] [CrossRef]
- Wang, J.; Jin, Y.; Stoyanov, D.; Wang, L. FedDP: Dual Personalization in Federated Medical Image Segmentation. IEEE Trans. Med. Imaging 2024, 43, 297–308. [Google Scholar] [CrossRef] [PubMed]
- Li, L.; Fan, Y.; Tse, M.; Lin, K.-Y. A review of applications in federated learning. Comput. Ind. Eng. 2020, 149, 106854. [Google Scholar] [CrossRef]
- Ma, Y.; Wang, J.; Yang, J.; Wang, L. Model-Heterogeneous Semi-Supervised Federated Learning for Medical Image Segmentation. IEEE Trans. Med. Imaging 2024, 43, 1804–1815. [Google Scholar] [CrossRef] [PubMed]
- Stelter, L.; Corbetta, V.; Beets-Tan, R.; Silva, W. Assessing the Impact of Federated Learning and Differential Privacy on Multi-centre Polyp Segmentation. In Proceedings of the 2024 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Orlando, FL, USA, 15–19 July 2024; pp. 1–4. [Google Scholar]
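The Dice coefficient and intersection over union cited above are the overlap metrics behind the mDice and mIoU figures reported in the comparison tables. As a minimal illustration of the standard formulas for binary masks (our own NumPy sketch, not code from any surveyed method; the function names are ours):

```python
import numpy as np

def dice_coefficient(pred, gt, eps=1e-8):
    """Dice = 2|P ∩ G| / (|P| + |G|) for binary segmentation masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return (2.0 * inter + eps) / (pred.sum() + gt.sum() + eps)

def iou(pred, gt, eps=1e-8):
    """IoU (Jaccard) = |P ∩ G| / |P ∪ G| for binary segmentation masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return (inter + eps) / (union + eps)

# Toy 2x2 example: the prediction covers two pixels, the ground truth one of them.
pred = np.array([[1, 1], [0, 0]])
gt = np.array([[1, 0], [0, 0]])
print(round(dice_coefficient(pred, gt), 4))  # 0.6667
print(round(iou(pred, gt), 4))               # 0.5
```

The mDice/mIoU values in the tables are these scores averaged over a test set; the small `eps` guards against division by zero on empty masks.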
Title of the Paper | Year | Description | Limitations |
---|---|---|---|
Deep learning to find colorectal polyps in colonoscopy: A systematic literature review [30] | 2020 | This review analyzes 35 papers published between 2015 and 2018 that include polyp detection, localization, and segmentation. | The number of analyzed studies is relatively small, and the covered models date only from 2015–2018. |
A survey on deep learning for polyp segmentation: Techniques, challenges, and future trends [31] | 2024 | This review examines 58 papers from 2019 to 2023 and analyzes 24 models. | Real-time performance of the models is not evaluated or analyzed. |
Colorectal Polyp Segmentation in the Deep Learning Era: A Comprehensive Survey [32] | 2024 | This review examines 115 papers from 2014 to 2023 and analyzes 40 model papers from 2015 to 2023. | The analyzed models lack real-time performance evaluation. |
A systematic review of deep learning based image segmentation to detect polyp [33] | 2024 | This review examines 117 papers from 2018 to 2023 on the use of deep learning image segmentation models for polyp segmentation. It analyzes 16 models. | The number of models in the statistical table is relatively small. |
A Comprehensive Review of Deep Learning for Medical Image Segmentation [34] | 2024 | This review analyzes 41 medical image segmentation papers from 2015 to 2024, covering skin lesions, hippocampus, tumors, and polyps. | Only a small sample of polyp segmentation models is covered, and no performance evaluation analysis is provided. |
A survey of deep learning algorithms for colorectal polyp segmentation [35] | 2024 | This review discusses four challenges encountered by deep learning methods in the task of colorectal polyp segmentation, along with related papers. | There is no statistical examination of Mamba-based models, no model statistical table, and no performance evaluation analysis of the model. |
Cross-modal hybrid architectures for gastrointestinal tract image analysis: A systematic review and futuristic applications [36] | 2024 | This review surveys CNN and Transformer methods for organ and polyp segmentation. | There is no statistical examination of Mamba-based models and no assessment of the models’ performance. |
Architecture | Year | Methods | Backbone | Problems Solved | Advantages | Limitations |
---|---|---|---|---|---|---|
CNN | 2024 | RGIA [86] | N/A | Domain transfer problem. | Accurately segmenting polyps under complex shapes and various lighting conditions. | Insufficient handling of domain shift factors such as color and shape differences. |
| | PolypMixNet [85] | ResNet34 | Limited annotated data and class imbalance. | Achieving near fully supervised model performance with just 15% labeled data. | Performance still lags behind fully supervised methods on some datasets.
| | DEC-Seg [80] | Res2Net | Changes in the shape, size, and location of polyps; pixel-by-pixel annotation. | Significantly improves segmentation performance with 10% and 30% labeled data. | Challenges remain in handling fuzzy areas and unclear boundaries.
| | MEGANet [63] | ResNet-34 and Res2Net-50 | Complex background, varying sizes and shapes of polyps, unclear boundaries. | Edge-guided attention; Laplacian operator to enhance feature extraction of polyp boundaries. | Insufficient real-time performance.
| | NPD-Net [82] | Res2Net50 | Overfitting. | Unsupervised deep supervision strategy. | Segmentation ability decreases on abnormal images.
| | DCL-PS [87] | ResNet-101 | Distribution differences between the source domain and the target domain. | Better cross-domain feature alignment. | The edge details of large polyps are not captured accurately enough.
| | DLCA_CMAM [69] | ConvNeXt | The color appearance in the polyp dataset is imbalanced. | Color statistics knowledge for data augmentation. | DLCA requires more computation time compared to Color Jittering.
| | MISNet [64] | Res2Net | The location, size, and shape of the polyps vary; the boundaries are unclear. | Adaptive fusion of multi-layer features; accurately capturing the details of polyp boundaries. | Segmentation performance is poor in extremely low-contrast environments.
| | WDFF-Net [24] | HarDNet68 | Polyps exhibit significant differences in color, size, shape, appearance, and location. | The dual-branch architecture (PFF and SFF) has strong complementarity. | The FPS performance is slightly inferior to that of lightweight models like HarDNet-MSEG.
| | UHA-Net [70] | Res2Net-50 | Scale variation of lesion areas. | Good segmentation performance on small polyps. | Complete pixel-level annotated data is needed for training.
| | SEPNet [71] | PVTv2-B2 | Difficult to learn accurate semantics. | Handling polyps with similar backgrounds. | Loss of edge details.
| | FoBS [65] | DeepLabV3+ | The appearance of polyps varies. | Multilevel boundary enhancement. | Noise-sensitive.
| 2023 | S2ME [84] | ResNet50 | Existing weak supervision methods are mainly limited to spatial domain learning. | Spatial-frequency dual branch; pixel-level adaptive fusion strategy. | Lack of comparison with the latest methods.
| | EMTSNet [134] | Res2Net50 | How to accurately segment polyps. | Effective extraction of multi-scale features. | The segmentation results for small-sized polyps are not satisfactory.
| | GAN-PSNet [135] | Res2Net50 | Polyps with variable appearance. | Attention mechanism integrated into the discriminator. | Higher time complexity.
| | FEGNet [72] | Res2Net50 | The shape, size, and texture of polyps vary; the boundaries are unclear. | Skilled in handling the segmentation of complex-shaped and small-sized polyps. | Insufficient generalization performance.
| | CoInNet [73] | DenseNet121 | The shape, color, size, and texture of polyps vary, and their boundaries are unclear. | Excellent segmentation capability for small polyps (area ≤1000 pixels). | Detection difficulties in specific scenarios.
| | RPANet [88] | ResNet101 | Domain adaptation for polyp segmentation models. | No source domain data is required, suitable for cross-hospital deployment. | Insufficient handling of situations where the foreground and background have minimal differences.
| | CFA-Net [66] | Res2Net-50 | Polyps vary in shape and size, with indistinct boundaries. | Effective in handling small polyps and unclear boundaries. | Segmentation performance on large polyps is not ideal.
| | PolypSeg+ [127] | ResNet50 | Polyps are diverse, with low contrast against the background and blurred boundaries. | High segmentation accuracy; good real-time performance. | Segmentation performance on very small polyps is poor.
| | FTMF-Net [79] | Res2Net | Boundary details; global context information. | Boundary feature extraction from a frequency-domain perspective. | The segmentation of complex samples still needs improvement.
| | ISCNet [74] | ResNet34 | Low contrast, small size, and a wide variety. | Performs well on small polyps. | Segmentation errors still exist in the edge regions.
| | I-RA [77] | Res2Net50 | Representation of directionality, singularity, and regularity in the spectral domain. | Contour knowledge guidance enhances spectral-domain feature representation. | Real-time performance still needs improvement.
| | WS-DefSegNet [81] | Res2Net | Pixel-wise annotated dataset. | Weakly supervised and semi-supervised polyp segmentation framework. | The performance on certain datasets is still lower than that of state-of-the-art fully supervised methods.
| | ERDUnet [23] | U-Net | Extracting global contextual features is difficult; the parameters are too large for clinical application. | The model has a small parameter size and achieves an average of 40.71 FPS, accurately segmenting large-scale targets. | Recognition errors may occur when the target and background features are too similar.
| 2022 | LDNet [75] | Res2Net50 | The diversity of polyps in shape, size, and brightness, as well as their subtle contrast with the background. | Dynamic kernel generation and update mechanism; efficient self-attention and lesion-aware cross-attention. | The real-time performance and computational efficiency of the model have not been discussed in detail.
| | MSRF-Net [76] | N/A | Targets of different sizes; difficult to train on small-scale and biased datasets. | Dual-scale dense fusion; the shape stream. | Weak performance on low-contrast image segmentation.
| | BSCA-Net [68] | Res2Net50 | Extraction of boundary information. | Obtains geometric information from multiple angles. | Loss of some low-level details.
| | AMNet [78] | Res2Net50 | Small polyps; similar to the surrounding environment. | Multi-scale fusion; parallel attention mechanism; reverse context fusion. | Performance still needs to be improved in complex real-world environments.
| | BCNet [67] | Res2Net50 | Inaccurate segmentation. | Excels at handling ambiguous boundaries and complex backgrounds. | There are still cases of inaccurate segmentation in specific complex scenarios.
| | NIP [83] | ResNet101 | Limited training data, significant variation in polyps, and class imbalance. | MLMix data augmentation methods and CAR resampling strategies. | Real-time performance has not been evaluated in detail.
| | TGANet [136] | ResNet50 | Changes in polyp size can affect model training. | Outstanding performance in the segmentation of small and flat polyps. | Real-time performance metrics not discussed in detail.
Transformer | 2024 | CTNet [28] | MiT-b3 | Lack of features with advanced semantic details. | Supervised contrastive learning strategy. | Small polyp segmentation is weak. |
| | HST-MRF [99] | Swin Transformer | Feature information loss problem caused by patch segmentation. | Multi-receptive field patch segmentation strategy. | Relatively weak in precision metrics.
| | MLFF-Net [90] | PVTv2 | Insufficient feature utilization and feature fusion conflicts. | Multi-layer feature fusion and multi-scale attention. | Segmentation performance on multiple polyps is not ideal.
| | PSTNet [91] | Shunted Transformer | Feature misalignment in multi-scale aggregation. | Enhanced noise suppression capabilities. | Computational efficiency needs improvement.
| | Polyp-LVT [3] | PVTv2 | Application limitations in clinical settings. | The parameter count is reduced by approximately 44% compared to the baseline model Polyp-PVT. | There is still room for improvement in accuracy.
| | PFENet [92] | PVTv2-B2 | Limitations of global context modeling and cross-layer feature interactions. | The CFE module enhances feature representation capabilities; the CMG module fuses features. | Decline in out-of-domain generalization performance.
| | VANet [96] | CvT-13 | Significant differences in appearance; unclear boundaries. | VAFormer reduces interference from non-polyp regions; BAFormer optimizes boundary segmentation. | Segmentation performance decreases for polyps that are too small (<5%) or too large (>25%).
| | PP-SAM [97] | ViT | Dependence on large amounts of labeled data. | The PP-SAM framework enhances SAM’s robustness in polyp segmentation. | Only supports binary classification and single bounding box prompts.
| 2023 | Polyp-PVT [93] | PVTv2 | Differences between features at different levels; feature fusion. | Strong feature extraction capability. | Boundary detection has its limitations.
| | PVT-CASCADE [27] | PVTv2 + TransUNet | Limitations of Transformer models. | Multi-stage loss and feature aggregation framework. | Generalization requires more experiments.
| | CAFE-Net [94] | PVTv2 | The ability to aggregate multi-scale features is limited. | The MFA module maximizes the utilization of previously learned features. | Computational efficiency needs improvement.
| | DSNet [137] | PVTv2 | False positive/negative interference. | Robust performance. | False positives can occur under multiple types of image noise.
| | LACINet [95] | Shunted Transformer-S | Complex backgrounds interfere with pixel-level prediction performance. | The LPM mechanism effectively reduces noise interference and redundant information. | False or missed detections may occur under low-light or overexposed conditions.
| | TransRUPNet [98] | pvt_v2_b2 | Real-time polyp segmentation. | Real-time processing speed of 47.07 FPS. | Performance significantly declines on out-of-distribution datasets.
| 2022 | MSRAformer [89] | Swin Transformer | Distinguish between polyp area and boundary. | Enhances the edge details of the segmentation target. | There is still room for improvement in the segmentation of small lesion areas.
SSFormer [26] | PVTv2 | The polyps vary in size and have complex shapes; their borders are unclear. | Strong generalization ability; progressive local decoder. | The findings on previously unseen medical images still require further validation. | ||
DS-TransUNet [100] | Swin | Ignored the pixel-level intrinsic structural features within each block. | Dual-scale encoding mechanism simultaneously captures coarse-grained and fine-grained features. | Performance is limited when using the same patch size. | ||
Hybrid Architecture | 2024 | CIFG-Net [111] | PVTv2 | The boundaries are unclear. | Improved mDice by 3.9% and mIoU by 4.2% compared to PraNet. | Inference speed and resource consumption not mentioned. |
MVOA-Net [101] | PVTv2-b2 | Intra-class inconsistency; inter-class indistinction. | Skilled in multi-target polyp segmentation. | The computational efficiency is not competitive. | ||
RSAFormer [102] | PVTv2-B4 | Boundary similarity. | RSA module can better extract boundary information. | Performs poorly when samples have reflections and shadows. | ||
Polyper [112] | Swin-T | Fuzzy Boundary. | The segmentation effect for small polyps (image proportion <6%) is good. | Unable to handle false positives or false negatives effectively. | ||
TransNetR [117] | ResNet50 | Inefficient real-time processing speeds. | Processing speed reaches 54.60 FPS. | Detection of tiny polyps still has room for improvement. | ||
PFD-Net [103] | PVTv2 | Failed to effectively utilize local details and global semantic information. | Spatial-frequency joint method enhances the local and global features of feature maps. | When the target and background are extremely similar, segmentation performance is poor. | ||
HD-Former [104] | ResNet-34 | Unable to account for multi-level dependencies between spatial and channel dimensions. | Dual Cross-Attention Transformer (DCAT) module is used for multi-level feature fusion. | Increased computational overhead. | ||
ASPS [115] | ViT-B +MSCAN-L | Ignoring local details. | Enhanced OOD performance and domain generalization. | There is still room for improvement in the segmentation of small polyps. | ||
PGCF [138] | PVTv2 | Recognition of complex features. | PGCF module extracts multi-scale features. | Inference time is slow, affecting real-time performance. | ||
FMCA-Net [107] | PVT_v2 | Blurred edges; insufficient feature extraction. | The D-BFRM module extracts and enhances polyp features. | The segmentation of small polyps is not ideal. | ||
SAM2-Unet [116] | Hiera | Design a simple and efficient universal segmentation framework. | Insert an adapter into the encoder to achieve parameter-efficient fine-tuning. | The performance on some datasets is slightly below that of other methods. | ||
SimTxtSeg [114] | ConvNeXt-Tiny +BERT-BASE | Reducing annotation costs while maintaining segmentation performance. | Text-to-visual prompt converter and text–visual hybrid attention module. | Pseudo-label generation relies on the SAM model and is sensitive to its performance. | ||
2023 | MC-DC [108] | Res2Net-50+Wave-MLP | Design of the feature decoder. | Multi-layer features that integrate MLP and CNN. | The generalization ability needs improvement. | |
META-Unet [109] | ResNet34 | Low contrast. | The dual-branch structure can simultaneously capture global and local features. | The accuracy of polyp segmentation with ambiguous boundaries is low. | ||
2022 | SwinPA-Net [105] | Swin-B | The size and type of lesions vary too much. | The dense multiplicative connection module is used for multi-scale feature fusion. | Real-time constraints. | |
HSNet [106] | PVTv2+ Res2Net50 | Ignore the visual details of small polyps. | Interactive attention mechanism; MSP module integrates different scales. | Detection of small polyps still needs improvement. | ||
PPFormer [113] | CvT+VGG-16 | Transformer has insufficient local feature extraction. | PP-guided self-attention mechanism; local-to-global mechanism. | Small polyp segmentation is weak. | ||
FCBFormer [110] | PVTv2-B3 | Only able to predict low-resolution segmentation maps. | FCN and Transformer parallel branches. | Data dependency. | ||
Mamba | 2024 | VM-UNetV2 [119] | VSS | Limitations of CNN and Transformer. | VSS blocks capture long-range dependencies. | Robustness still needs to be further validated. |
Prompt-Mamba [118] | vision-mamba | Polyps of various shapes and colors. | Image feature extraction in Visual Mamba. | Performance is slightly lower on some datasets. | ||
Polyp-Mamba [121] | ResNet34+ Mamba | The polyp has blurred margins. | Mamba and ResNet extract global and local features. | Misjudgments may occur in situations with clear boundaries. | ||
Polyp-Mamba [29] | VSS | Cross-scale dependencies and the consistency of feature representations and semantic embeddings | The SAS module facilitates the interaction of multi-scale semantic information. | Limited validation exists for segmentation of small or complex polyp structures | ||
H-vmunet [122] | N/A | Defects of the SS2D module. | The H-SS2D module reduces redundant information. | Lack of analysis of model robustness. | ||
Other | 2024 | SAM 2 [125] | N/A | Depends on extensive dataset labeling. | Strong zero-shot learning capabilities. | The effects vary significantly between different prompts. |
DDPM [124] | U-Net ϵθ | Efficient and accurate polyp segmentation. | Majority voting strategy enhances performance. | Generalization performance needs improvement. | ||
dHBLSN [126] | BLS(Broad Learning System) | Cost calculation. | BLS does not require multiple layers to learn complex features. | Difficult to remove complex flat polyps. | ||
HDM [123] | Diffusion U-Net | Domain disparities; low efficiency. | Generation of prior features for polyp segmentation using diffusion models. | Unclear boundaries are difficult to accurately segment. |
Architecture | Year | Methods | Backbone | Problems Solved | Advantages | Limitations |
---|---|---|---|---|---|---|
Hybrid Architecture | 2021 | TransFuse [21] | ResNet-34 + DeiT-S | Efficiency in modeling the global context; low-level details. | Parallel branches; the BiFusion module fuses multi-level features. | Performance on extremely small target datasets has not been discussed in detail.
CNN | 2020 | PraNet [20] | Res2Net | Polyps exhibit diverse shapes and have indistinct margins. | Parallel partial decoder and reverse attention module. | Detail segmentation remains inadequate in complex boundary situations. |
CNN | 2019 | ResUNet++ [19] | ResUNet | Accurate polyp segmentation. | ASPP improves the ability to segment polyps of various shapes and sizes. | Resizing the image may cause some details to be lost.
CNN | 2018 | UNet++ [18] | UNet | Semantic gap between encoder and decoder feature maps. | Re-designed skip pathways. | High computational complexity.
Architecture | Year | Methods | Backbone | Problems Solved | Advantages | Limitations |
---|---|---|---|---|---|---|
CNN | 2022 | PNS+ [22] | Res2Net50 | Lack of large-scale datasets with fine-grained segmentation annotations. | Constructed the SUN-SEG dataset; achieved an inference speed of 170 frames per second. | Insufficient robustness. |
CNN | 2024 | SSTFB [131] | Res2Net-50 | Decline in video polyp segmentation performance. | Enhancing representation learning with a spatio-temporal self-attention mechanism. | Some mis-segmentation may occur in complex backgrounds. |
Mamba | 2024 | Vivim [133] | Temporal Mamba Block | Vanilla SSMs cannot preserve non-causal spatial information. | Temporal Mamba Block and ST-Mamba Module. | The model is relatively complex. |
Hybrid | 2024 | MAST [130] | PVTv2-B2 | Modeling long-range spatiotemporal relationships is challenging. | The Siamese Transformer and hybrid attention module capture the spatiotemporal relationships between video frames. | Generalization ability needs to be verified. |
Hybrid | 2024 | FlowICBNet [128] | PVT | Predictive discontinuity problems caused by low-quality frames. | RFS and FGW modules select the optimal historical reference frame. | The IFU exhibits an over-correction phenomenon. |
Hybrid | 2024 | SALI [132] | PVTv2 | Low-quality frames limit segmentation accuracy. | The SAM module alleviates spatial variations; the LIM module reconstructs polyp features. | Segmentation remains inaccurate for frames with drastic changes. |
Hybrid | 2024 | Diff-VPS [129] | N/A | Highly camouflaged polyps and redundant temporal clues. | Multi-task diffusion model; the TRM module captures dynamic video features. | There is still room for improvement in detecting highly concealed polyps. |
Indicator | Definition | Meaning |
---|---|---|
True Positive (TP) | Correctly predicted as positive class. | It is a polyp and has been correctly predicted. |
False Positive (FP) | Incorrectly predicted as positive class. | Falsely predicted as a polyp, but not actually a polyp. |
True Negative (TN) | Correctly predicted as negative class. | Not a polyp and correctly predicted. |
False Negative (FN) | Incorrectly predicted as negative class. | It is a polyp but was incorrectly predicted not to be a polyp. |
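The four counts above combine into the mDice and mIoU scores reported in the benchmark tables that follow: Dice = 2TP/(2TP + FP + FN) and IoU = TP/(TP + FP + FN), averaged over all images. A minimal sketch in plain Python (the helper name `dice_iou` is illustrative, not from any of the surveyed codebases):

```python
def dice_iou(pred, gt):
    """Per-image Dice and IoU from flat 0/1 pixel labels.

    Dice = 2*TP / (2*TP + FP + FN); IoU = TP / (TP + FP + FN).
    Both masks empty is treated as a perfect match (score 1.0).
    """
    tp = sum(1 for p, g in zip(pred, gt) if p == 1 and g == 1)
    fp = sum(1 for p, g in zip(pred, gt) if p == 1 and g == 0)
    fn = sum(1 for p, g in zip(pred, gt) if p == 0 and g == 1)
    denom = tp + fp + fn
    dice = 2 * tp / (2 * tp + fp + fn) if denom else 1.0
    iou = tp / denom if denom else 1.0
    return dice, iou

# mDice/mIoU average these per-image scores over a dataset.
d, i = dice_iou([1, 1, 0, 0], [1, 0, 1, 0])  # TP=1, FP=1, FN=1
```

Note that Dice ≥ IoU always holds for the same prediction, which is a useful sanity check when reading reported results.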
Method | CVC-ClinicDB mDice | CVC-ClinicDB mIoU | Kvasir-SEG mDice | Kvasir-SEG mIoU | CVC-ColonDB mDice | CVC-ColonDB mIoU | ETIS-Larib mDice | ETIS-Larib mIoU | EndoScene mDice | EndoScene mIoU
---|---|---|---|---|---|---|---|---|---|---
MEGANet [63] | 0.938 | 0.894 | 0.913 | 0.863 | 0.793 | 0.714 | 0.739 | 0.665 | 0.899 | 0.834 |
Polyper [112] | 0.945 | 0.899 | 0.948 | 0.904 | 0.837 | 0.746 | 0.865 | 0.785 | 0.924 | 0.867 |
Polyp-Mamba [29] | 0.949 | 0.907 | 0.940 | 0.881 | 0.829 | 0.743 | 0.825 | 0.747 | 0.921 | 0.875 |
ISCNet [74] | 0.961 | 0.920 | 0.939 | 0.898 | 0.828 | 0.741 | 0.804 | 0.716 | 0.871 | 0.778 |
CTNet [28] | 0.936 | 0.887 | 0.917 | 0.863 | 0.813 | 0.734 | 0.810 | 0.734 | 0.908 | 0.844 |
PraNet [20] | 0.899 | 0.849 | 0.898 | 0.840 | 0.709 | 0.640 | 0.628 | 0.567 | 0.871 | 0.797 |
CFA-Net [66] | 0.933 | 0.883 | 0.915 | 0.861 | 0.743 | 0.665 | 0.732 | 0.655 | 0.893 | 0.827 |
DLCA_CMAM [69] | 0.944 | 0.900 | 0.929 | 0.882 | 0.818 | 0.745 | 0.839 | 0.766 | 0.903 | 0.837 |
Polyp-PVT [93] | 0.937 | 0.889 | 0.917 | 0.864 | 0.808 | 0.727 | 0.787 | 0.706 | 0.900 | 0.833 |
CAFE-Net [94] | 0.943 | 0.899 | 0.933 | 0.889 | 0.820 | 0.740 | 0.822 | 0.738 | 0.901 | 0.834 |
UNet [15] | 0.823 | 0.755 | 0.818 | 0.746 | 0.512 | 0.444 | 0.398 | 0.335 | 0.710 | 0.627 |
TGANet [136] | 0.946 | 0.899 | 0.898 | 0.833 | 0.824 | 0.755 | 0.782 | 0.636 | 0.899 | 0.885
HST-MRF [99] | 0.935 | 0.884 | 0.928 | 0.885 | 0.831 | 0.776 | 0.773 | 0.719 | 0.901 | 0.846 |
DS-TransUNet [100] | 0.936 | 0.887 | 0.935 | 0.889 | 0.798 | 0.722 | 0.761 | 0.687 | 0.911 | 0.846 |
TransFuse [21] | 0.942 | 0.897 | 0.920 | 0.870 | 0.781 | 0.706 | 0.737 | 0.663 | 0.894 | 0.826 |
MVOA-Net [101] | 0.947 | 0.902 | 0.935 | 0.891 | 0.824 | 0.745 | 0.820 | 0.743 | 0.904 | 0.838 |
SwinPA-Net [105] | 0.941 | 0.894 | 0.925 | 0.876 | 0.807 | 0.726 | 0.762 | 0.684 | 0.893 | 0.823 |
HSNet [106] | 0.948 | 0.905 | 0.926 | 0.877 | 0.810 | 0.735 | 0.808 | 0.734 | 0.903 | 0.839 |
SAM 2 [125] | 0.930 | 0.870 | 0.939 | 0.885 | 0.934 | 0.877 | 0.941 | 0.890 | - | - |
PFD-Net [103] | 0.939 | 0.893 | 0.928 | 0.880 | 0.816 | 0.737 | 0.826 | 0.746 | - | - |
CoInNet [73] | 0.930 | 0.887 | 0.926 | 0.872 | 0.797 | 0.729 | 0.759 | 0.690 | - | - |
AMNet [78] | 0.936 | 0.888 | 0.912 | 0.865 | 0.762 | 0.690 | 0.756 | 0.679 | - | - |
DEC-Seg [80] | 0.859 | 0.804 | 0.893 | 0.830 | 0.721 | 0.640 | 0.634 | 0.564 | - | - |
WS-DefSegNet [81] | 0.807 | 0.746 | 0.768 | 0.709 | 0.667 | 0.588 | 0.596 | 0.517 | - | - |
NPD-Net [82] | 0.925 | 0.876 | 0.905 | 0.850 | 0.764 | 0.682 | 0.737 | 0.659 | - | - |
MISNet [64] | 0.918 | 0.869 | 0.903 | 0.846 | 0.762 | 0.690 | 0.764 | 0.686 | - | - |
UHA-Net [70] | 0.927 | 0.881 | 0.908 | 0.857 | 0.769 | 0.695 | 0.746 | 0.670 | - | - |
SEPNet [71] | 0.937 | 0.887 | 0.922 | 0.869 | 0.819 | 0.740 | 0.795 | 0.718 | - | - |
FEGNet [72] | 0.943 | 0.901 | 0.923 | 0.874 | 0.767 | 0.686 | 0.719 | 0.645 | - | - |
MLFF-Net [90] | 0.943 | 0.897 | 0.919 | 0.866 | 0.820 | 0.742 | 0.784 | 0.707 | - | - |
PSTNet [91] | 0.945 | 0.901 | 0.935 | 0.895 | 0.827 | 0.748 | 0.800 | 0.726 | - | - |
PFENet [92] | 0.940 | 0.897 | 0.931 | 0.886 | 0.821 | 0.745 | 0.809 | 0.735 | - | - |
FMCA-Net [107] | 0.944 | 0.898 | 0.919 | 0.866 | 0.804 | 0.723 | 0.841 | 0.765 | - | - |
FCBFormer [110] | 0.947 | 0.902 | 0.939 | 0.890 | 0.783 | 0.706 | 0.796 | 0.715 | - | - |
CIFG-Net [111] | 0.938 | 0.891 | 0.925 | 0.876 | 0.815 | 0.733 | 0.806 | 0.726 | - | - |
SAM2-UNet [116] | 0.907 | 0.856 | 0.928 | 0.879 | 0.808 | 0.730 | 0.796 | 0.723 | - | - |
Prompt-Mamba [118] | 0.888 | 0.814 | 0.886 | 0.808 | 0.820 | 0.712 | 0.771 | 0.663 | - | - |
VM-UNetV2 [119] | 0.944 | 0.893 | 0.913 | 0.842 | 0.758 | 0.610 | 0.839 | 0.723 | - | - |
Polyp-Mamba [121] | 0.941 | 0.896 | 0.919 | 0.867 | 0.791 | 0.713 | 0.756 | 0.668 | - | - |
RPANet [88] | 0.800 | 0.719 | 0.858 | 0.782 | - | - | 0.632 | 0.552 | - | - |
TransNetR [117] | 0.766 | 0.691 | 0.871 | 0.802 | - | - | - | - | - | - |
DDPM [124] | 0.967 | 0.937 | 0.934 | 0.886 | - | - | - | - | - | - |
TransRUPNet [98] | 0.854 | 0.777 | 0.901 | 0.845 | - | - | - | - | - | - |
Polyp-LVT [3] | 0.935 | 0.882 | 0.909 | 0.851 | - | - | - | - | 0.904 | 0.835 |
Models | Params (M) | FLOPs (G) | FPS (frames/s)
---|---|---|---|
MISNet [64] | 33.63 | 45.94 | 30.68 |
CFA-Net [66] | 25.24 | 55.36 | 23.50 |
SEPNet [71] | 25.96 | 12.52 | 62.00 |
TGANet [136] | 42.30 | 19.84 | 85.00 |
MSRF-Net [76] | 18.38 | 20.26 | 14.38 |
I-RA [77] | 35.85 | 15.65 | 2.06 |
PraNet [20] | 30.50 | 13.08 | 37.31 |
LACINet [95] | 22.36 | 11.74 | 28.41 |
MVOA-Net [101] | 27.86 | 29.73 | 30.78 |
TransNetR [117] | 27.27 | 10.58 | 54.60 |
Polyp-Mamba [121] | 49.50 | 27.90 | 23.80 |
ERDUnet [23] | 10.21 | 10.30 | 27.03 |
PolypSeg+ [127] | 2.54 | 7.23 | 31.00 |
FlowICBNet [128] | 98.40 | N/A | 29.00 |
SSTFB [131] | 33.40 | N/A | 126.00 |
UNet [15] | 34.52 | N/A | 55.10 |
WDFF-Net [24] | 17.46 | N/A | 83.82 |
META-Unet [109] | 27.32 | N/A | 75.00 |
Polyp-PVT [93] | 125.60 | N/A | 66.00 |
MC-DC [108] | 65.93 | 41.13 | N/A |
FTMF-Net [79] | 44.77 | 8.85 | N/A |
Polyp-LVT [3] | 25.11 | 13.21 | N/A |
NPD-Net [82] | 29.22 | 14.51 | N/A |
CTNet [28] | 44.19 | 15.20 | N/A |
FMCA-Net [107] | 28.61 | 14.36 | N/A |
PFD-Net [103] | 28.89 | 16.81 | N/A |
ISCNet [74] | 23.34 | 15.57 | N/A |
CAFE-Net [94] | 35.53 | 16.12 | N/A |
DS-TransUNet [100] | 287.75 | 51.09 | N/A |
MAST [130] | 25.69 | 21.02 | N/A |
Prompt-Mamba [118] | 102.00 | N/A | N/A |
HSNet [106] | 29.23 | N/A | N/A |
MEGANet [63] | 44.19 | N/A | N/A |
TransFuse [21] | 26.30 | N/A | N/A |
HST-MRF [99] | 174.73 | N/A | N/A |
TransRUPNet [98] | N/A | N/A | 47.07
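FPS figures such as those above are typically obtained by timing repeated forward passes after a few warm-up runs (to exclude one-off costs such as cache or GPU initialization). A model-agnostic sketch of that measurement, where `segment` and `measure_fps` are illustrative stand-ins rather than functions from any surveyed method:

```python
import time

def measure_fps(segment, frames, warmup=2):
    """Average frames-per-second of a segmentation callable.

    segment: callable taking one frame (stand-in for model inference).
    frames:  iterable of inputs; warm-up runs are excluded from timing.
    """
    frames = list(frames)
    for f in frames[:warmup]:       # warm-up passes, not timed
        segment(f)
    start = time.perf_counter()
    for f in frames:
        segment(f)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed

# Example with a trivial stand-in "model" that inverts a binary frame:
fps = measure_fps(lambda frame: [p ^ 1 for p in frame], [[0, 1, 1, 0]] * 100)
```

In practice the reported numbers also depend on input resolution, batch size, and hardware, which is why FPS values from different papers are only roughly comparable.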
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Liu, X.; Isa, N.A.M.; Chen, C.; Lv, F. Colorectal Polyp Segmentation Based on Deep Learning Methods: A Systematic Review. J. Imaging 2025, 11, 293. https://doi.org/10.3390/jimaging11090293