Advancing Colorectal Polyp Detection in Colonoscopy Through Region-Guided Deep Learning
Abstract
1. Introduction
2. Related Work
3. Methods
3.1. Dataset Description
3.2. YOLOv11 Detection
- Backbone (C3k2): Extracts image features using a Cross-Stage Partial (CSP) design that fuses multi-scale information, enabling detection of small or flat polyps.
- Neck (PAN-FPN): Merges shallow and deep features for robust multi-scale detection across varying polyp sizes and textures.
- Head (Decoupled, anchor-free): Predicts objectness, bounding boxes, and class probabilities separately, improving convergence and generalization without relying on anchor boxes.
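The anchor-free box prediction mentioned for the head can be illustrated with a small decoding step. This is a minimal sketch, assuming the head emits four non-negative distances (left, top, right, bottom) measured in grid units from the center of each feature-map cell, as FCOS-style anchor-free heads do. The function name `decode_anchor_free_box` and the sample numbers are hypothetical and are not taken from the paper.

```python
def decode_anchor_free_box(cell_x, cell_y, distances, stride):
    """Turn (left, top, right, bottom) distances into pixel-space corners.

    Assumption: distances are in grid units measured from the cell center,
    and `stride` is the downsampling factor of the feature map.
    """
    left, top, right, bottom = distances
    cx, cy = cell_x + 0.5, cell_y + 0.5  # cell center in grid units
    x1, y1 = (cx - left) * stride, (cy - top) * stride
    x2, y2 = (cx + right) * stride, (cy + bottom) * stride
    return x1, y1, x2, y2


# Illustrative values only: cell (10, 6) on a stride-8 feature map.
box = decode_anchor_free_box(cell_x=10, cell_y=6,
                             distances=(1.5, 2.0, 2.5, 1.0), stride=8)
print(box)  # (72.0, 36.0, 104.0, 60.0)
```

Because no anchor shapes are involved, the only per-cell quantities are the four distances, which is why such a head needs no anchor tuning for unusual polyp sizes.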
Training on YOLOv11
3.3. YOLOv11 Segmentation
3.4. SAM
3.5. SAM 2
4. Results and Discussion
4.1. Experimental Environment
4.2. Evaluation Metrics
4.3. Performance of the Applied Models
4.3.1. YOLOv11 Detection Results
4.3.2. YOLOv11 Segmentation Results
4.3.3. SAM Results
4.3.4. SAM 2 Results
4.3.5. Experimental Environment
4.4. Predicted Results
4.4.1. Classification and Statistics of Error Cases
- False Positive Rate (FPR):
- False Negative Rate (FNR):
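The two error rates listed above can be computed from confusion counts. The sketch below uses the standard definitions, FPR = FP / (FP + TN) and FNR = FN / (FN + TP), and assumes frame-level counts, since true negatives are not naturally defined for individual boxes. The function name and the counts in the example are hypothetical and are not results from the paper.

```python
def error_rates(tp, fp, tn, fn):
    """Return (FPR, FNR) from confusion counts using the standard definitions."""
    fpr = fp / (fp + tn) if (fp + tn) else 0.0  # share of negatives flagged as polyps
    fnr = fn / (fn + tp) if (fn + tp) else 0.0  # share of polyps that were missed
    return fpr, fnr


# Hypothetical counts for illustration only.
print(error_rates(tp=96, fp=2, tn=98, fn=4))  # (0.02, 0.04)
```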
4.4.2. False Positive Case Analysis
4.4.3. Strategies for False Positive Reduction
4.5. Integrated Component Analysis
4.6. Comparative Analysis
4.7. Benchmarking Against Prior Work Based on Dataset
4.8. Extended Comparison with Detection and Segmentation Frameworks
4.9. Robustness Under Clinical Imaging Variability
4.10. Generalization and Clinical Applicability
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Morgan, E.; Arnold, M.; Gini, A.; Lorenzoni, V.; Cabasag, C.J.; Laversanne, M.; Vignat, J.; Ferlay, J.; Murphy, N.; Bray, F. Global burden of colorectal cancer in 2020 and 2040: Incidence and mortality estimates from GLOBOCAN. Gut 2023, 72, 338–344.
2. Tudela, Y.; Majó, M.; de la Fuente, N.; Galdran, A.; Krenzer, A.; Puppe, F.; Yamlahi, A.; Tran, T.N.; Matuszewski, B.J.; Fitzgerald, K.; et al. A complete benchmark for polyp detection, segmentation and classification in colonoscopy images. Front. Oncol. 2024, 14, 1417862.
3. Li, S.; Ren, Y.; Yu, Y.; Jiang, Q.; He, X.; Li, H. A survey of deep learning algorithms for colorectal polyp segmentation. Neurocomputing 2025, 614, 128767.
4. Lalinia, M.; Sahafi, A. Colorectal polyp detection in colonoscopy images using YOLO-V8 network. Signal Image Video Process. 2023, 18, 2047–2058.
5. Wan, J.-J.; Zhu, P.-C.; Chen, B.-L.; Yu, Y.-T. A semantic feature enhanced YOLOv5-based network for polyp detection from colonoscopy images. Sci. Rep. 2024, 14, 15478.
6. Ma, J.; He, Y.; Li, F.; Han, L.; You, C.; Wang, B. Segment anything in medical images. Nat. Commun. 2024, 15, 654.
7. Kirillov, A.; Mintun, E.; Ravi, N.; Mao, H.; Rolland, C.; Gustafson, L.; Xiao, T.; Whitehead, S.; Berg, A.C.; Lo, W.-Y.; et al. Segment anything. arXiv 2023, arXiv:2304.02643.
8. Wang, M.; Xu, C.; Fan, K. An efficient fine tuning strategy of segment anything model for polyp segmentation. Sci. Rep. 2025, 15, 14088.
9. Hasan, M.; Islam, J.; Hasan, M.R.; Khaliluzzaman, M. Colorectal polyp detection and localization in colonoscopy images based on YOLOv8. In Proceedings of the 27th International Conference on Computer and Information Technology (ICCIT), Cox’s Bazar, Bangladesh, 20–22 December 2024; pp. 2858–2863.
10. Delowar, K.E.; Uddin, M.B.; Khaliluzzaman, J.; Rabbi, R.I.; Hossen, J.; Hossen, M.M. PolyNet: A self-attention based CNN model for classifying the colon polyp from colonoscopy image. Inform. Med. Unlocked 2025, 56, 101654.
11. He, L.; Zhou, Y.; Liu, L.; Ma, J. Research and Application of YOLOv11-Based Object Segmentation in Intelligent Recognition at Construction Sites. Buildings 2024, 14, 3777.
12. Khanam, R.; Hussain, M. YOLOv11: An Overview of the Key Architectural Enhancements. arXiv 2024, arXiv:2410.17725.
13. Solak, A.; Ceylan, R. A sensitivity analysis for polyp segmentation with U-Net. Multimed. Tools Appl. 2023, 82, 34199–34227.
14. Mei, J.; Zhou, T.; Huang, K.; Zhang, Y.; Zhou, Y.; Wu, Y. A survey on deep learning for polyp segmentation: Techniques, challenges and future trends. Vis. Intell. 2025, 3, 1.
15. Sahafi, A.; Koulaouzidis, A.; Lalinia, M. Polypoid Lesion Segmentation Using YOLO-V8 Network in Wireless Video Capsule Endoscopy Images. Diagnostics 2024, 14, 474.
16. Lee, G.-E.; Cho, J.; Choi, S.-I. Shallow and reverse attention network for colon polyp segmentation. Sci. Rep. 2023, 13, 15243.
17. Jha, D.; Smedsrud, P.H.; Riegler, M.A.; Halvorsen, P.; de Lange, T.; Johansen, D.; Johansen, H.D. Kvasir-SEG: A Segmented Polyp Dataset. arXiv 2019, arXiv:1911.07069.
18. Hidayatullah, N.; Syakrani, M.R.; Sholahuddin, T.; Gelar, R.T. YOLOv8 to YOLO11: A comprehensive architecture in-depth comparative review. arXiv 2025, arXiv:2501.13400.
19. Shaukat, A.; Yadav, A.; Kwon, R.S.; Wallace, M.B.; Byrne, M.F.; Repici, A.; East, J.E.; Gupta, N.; Wani, S.; Sharma, P.; et al. Framework and metrics for the clinical use and evaluation of AI in endoscopy. Gastrointest. Endosc. 2022, 96, 1185–1193.
20. Ghose, P.; Ghose, A.; Sadhukhan, D.; Pal, S.; Mitra, M. Improved polyp detection from colonoscopy images using finetuned YOLO-v5. Multimed. Tools Appl. 2023, 83, 42929–42954.
21. Wan, J.; Chen, B.; Yu, Y. Polyp Detection from Colorectum Images by Using Attentive YOLOv5. Diagnostics 2021, 11, 2264.
22. Wang, S.; Xie, J.; Cui, Y.; Chen, Z. Colorectal Polyp Detection Model by Using Super-Resolution Reconstruction and YOLO. Electronics 2024, 13, 2298.

| Data Type | Quantity | Description |
|---|---|---|
| Colonoscopy images | 1000 | RGB frames containing polyps. |
| Annotated images | 1000 | Images with colored polyp regions. |
| Binary masks | 1000 | Pixel-level segmentation masks. |
| Bounding box labels | 1000 | Text files containing normalized coordinates (x, y, width, height). |
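The dataset table lists both binary masks and bounding box labels with normalized coordinates (x, y, width, height). One plausible way to derive such a label from a mask is sketched below. It assumes that x and y denote the box center, as in the common YOLO label convention, and that the mask is a non-empty grid of 0/1 values. The function name `mask_to_yolo_box` and the toy mask are hypothetical, and the authors' actual conversion procedure may differ.

```python
def mask_to_yolo_box(mask):
    """Return (x_center, y_center, width, height), each normalized to [0, 1].

    `mask` is a non-empty list of equal-length rows of 0/1 values.
    Returns None when the mask contains no foreground pixel.
    """
    height = len(mask)
    width = len(mask[0])
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(width) if any(row[c] for row in mask)]
    if not rows:
        return None
    x_min, x_max = cols[0], cols[-1] + 1  # +1 treats each pixel as a unit square
    y_min, y_max = rows[0], rows[-1] + 1
    return (
        (x_min + x_max) / 2 / width,
        (y_min + y_max) / 2 / height,
        (x_max - x_min) / width,
        (y_max - y_min) / height,
    )


# Toy 4x4 mask with a 2x2 foreground block in the middle.
toy_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
print(mask_to_yolo_box(toy_mask))  # (0.5, 0.5, 0.5, 0.5)
```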
| Characteristics | YOLOv8 | YOLOv9 | YOLOv11 (Ours) |
|---|---|---|---|
| Backbone structure | C2f-based CNN | C2f reparameterization | C3k2 with enhanced CSP & transformer-inspired blocks |
| Feature extraction | Local convolutional features | Improved gradient flow | Stronger multi-scale and contextual feature learning |
| Neck (feature fusion) | PAN-FPN | Optimized PAN-FPN | Refined PAN-FPN with better scale interaction |
| Detection head | Decoupled head | Decoupled head | Improved decoupled head with better box regression stability |
| Anchor design | Anchor-free | Anchor-free | Anchor-free |
| Small object sensitivity | Moderate | Improved | High (better small/flat polyp detection) |
| Inference latency | Low | Low–moderate | Lower/comparable with higher accuracy |
| Localization precision | High | Higher | Highest (improved IoU stability) |

| Metric | Value |
|---|---|
| Accuracy | 99.00% |
| Precision | 0.9796 |
| Recall | 1.0000 |
| F1-score | 0.9897 |
| IoU | 0.9764 |
| mAP@0.5 | 0.9937 |
| mAP@0.5:0.95 | 0.9935 |
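The F1-score in the table above is the harmonic mean of the reported precision and recall, and the IoU is the overlap ratio between predicted and ground-truth boxes. The sketch below reproduces the F1 value from the tabulated precision (0.9796) and recall (1.0000). The helper names and the two example boxes are hypothetical, and boxes are assumed to be given as (x1, y1, x2, y2) corners.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) corners."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


# Precision and recall as reported in the table above.
print(round(f1_score(0.9796, 1.0), 4))  # 0.9897

# Illustrative boxes: two 10x10 squares overlapping by half.
print(round(box_iou((0, 0, 10, 10), (5, 0, 15, 10)), 4))  # 0.3333
```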
| Metric | Value |
|---|---|
| Accuracy | 88.28% |
| Precision | 0.9992 |
| Recall | 1.0000 |
| F1-score | 0.9996 |
| IoU | 0.9154 |
| mAP@0.5 | 0.9950 |
| mAP@0.5:0.95 | 0.9909 |

| Metric | Value |
|---|---|
| Precision | 0.9898 |
| Recall | 0.9601 |
| F1-score | 0.9708 |
| IoU | 0.9500 |
| Dice | 0.9708 |
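For segmentation masks, IoU and Dice are both computed from pixel overlap, and Dice coincides with the pixel-level F1-score, which is consistent with the identical F1 and Dice values in the table above. A minimal sketch on binary masks follows. The function name and the toy masks are hypothetical, and scoring two empty masks as a perfect match is an assumption of this sketch.

```python
def mask_overlap_scores(pred, truth):
    """Return (IoU, Dice) for two binary masks given as lists of 0/1 rows."""
    inter = sum(p & t
                for pred_row, truth_row in zip(pred, truth)
                for p, t in zip(pred_row, truth_row))
    pred_area = sum(map(sum, pred))
    truth_area = sum(map(sum, truth))
    union = pred_area + truth_area - inter
    # Assumption: two empty masks count as a perfect match.
    iou = inter / union if union else 1.0
    dice = 2 * inter / (pred_area + truth_area) if (pred_area + truth_area) else 1.0
    return iou, dice


# Toy masks: the prediction covers one pixel more than the ground truth.
pred = [[0, 1, 1], [0, 1, 1], [0, 0, 0]]
truth = [[0, 1, 1], [0, 1, 0], [0, 0, 0]]
iou, dice = mask_overlap_scores(pred, truth)
print(iou, round(dice, 4))  # 0.75 0.8571
```

For a single image the two scores are tied by Dice = 2·IoU / (1 + IoU). Averages taken over a whole test set need not satisfy this relation exactly.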
| Metric | Value |
|---|---|
| Precision | 0.9946 |
| Recall | 0.9660 |
| F1-score | 0.9771 |
| IoU | 0.9608 |
| Dice | 0.9771 |

| Model | Avg. Latency (ms) | FPS | p95 Latency (ms) | p95 FPS |
|---|---|---|---|---|
| YOLOv11-Detection (Proposed) | 18.96 | 52.7 | 24.53 | 40.8 |
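The FPS figures in the latency table follow from the latencies as 1000 / latency in milliseconds. The sketch below shows how such a summary could be derived from per-frame timings. The nearest-rank percentile rule, the function name, and the sample timings are assumptions, since the table gives only the aggregated values and not the measurement procedure.

```python
def latency_summary(latencies_ms):
    """Summarize per-frame latencies: mean, 95th percentile, and matching FPS."""
    ordered = sorted(latencies_ms)
    mean = sum(ordered) / len(ordered)
    # Nearest-rank 95th percentile, using integer ceiling to avoid float rounding.
    rank = (95 * len(ordered) + 99) // 100
    p95 = ordered[rank - 1]
    return {
        "avg_ms": mean,
        "fps": 1000 / mean,
        "p95_ms": p95,
        "p95_fps": 1000 / p95,
    }


# Consistency check against the values reported in the table above.
print(round(1000 / 18.96, 1))  # 52.7
print(round(1000 / 24.53, 1))  # 40.8

# Hypothetical timings for illustration only.
summary = latency_summary([10, 12, 14, 16, 18, 20, 22, 24, 26, 100])
print(summary["p95_ms"])  # 100
```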
| Configuration | Mask-Based Bounding Box Generation | Region-Aware Preprocessing | Region-Guided Supervision | Precision | Recall | F1-Score | IoU | mAP@0.5 |
|---|---|---|---|---|---|---|---|---|
| Baseline YOLOv11 | ✗ | ✗ | ✗ | 0.9152 | 0.8801 | 0.89721 | 0.9083 | 0.9284 |
| Mask-based Bounding Box Generation | ✓ | ✗ | ✗ | 0.9421 | 0.9254 | 0.93342 | 0.9354 | 0.9561 |
| Mask-based Bounding Box Generation and Region-aware Preprocessing | ✓ | ✓ | ✗ | 0.9653 | 0.9851 | 0.97492 | 0.9584 | 0.9823 |
| Proposed (All modules) | ✓ | ✓ | ✓ | 0.9796 | 1.0000 | 0.9897 | 0.9764 | 0.9935 |

| Model | Precision | Recall | F1-Score | IoU | mAP@0.5 |
|---|---|---|---|---|---|
| YOLOv11-Segmentation | 0.9991 | 1.0000 | 0.9959 | 0.9154 | 0.9909 |
| SAM | 0.9898 | 0.9601 | 0.9708 | 0.9500 | - |
| SAM 2 | 0.9946 | 0.9660 | 0.9771 | 0.9608 | - |
| YOLOv11-Detection (Best model) | 0.9796 | 1.0000 | 0.9897 | 0.9764 | 0.9935 |

| Papers | Method | Precision (%) | Recall (%) | F1-Score (%) | IoU (%) |
|---|---|---|---|---|---|
| Lalinia and Sahafi [4] | YOLOv8 | 95.60 | 91.70 | 92.40 | - |
| Ghose et al. [20] | Fine-tuned YOLOv5 with augmentation | 99.01 | 98.95 | 98.54 | - |
| Wan et al. [21] | Attention–YOLOv5-Lite-Prune | 91.40 | 74.70 | - | - |
| Wang et al. [22] | YOLOv5 | 91.3 | 92.1 | 91.7 | - |
| Ours (YOLOv11) | YOLOv11 Detection | 97.96 | 100.00 | 98.97 | 97.64 |
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Nahiyan, F.; Nahar, S.; Alam, T.; Khaliluzzaman, M.; Hassan, M.M. Advancing Colorectal Polyp Detection in Colonoscopy Through Region-Guided Deep Learning. Eng. Proc. 2026, 124, 118. https://doi.org/10.3390/engproc2026124118

