Deep Learning Models for Anatomical Location Classification in Esophagogastroduodenoscopy Images and Videos: A Quantitative Evaluation with Clinical Data
Abstract
1. Introduction
2. Materials and Methods
2.1. Data Acquisition
2.2. Experimental Environments
2.3. Data Labeling
2.4. Classification Model for Gastrointestinal Anatomical Positions
2.5. Post-Processing Algorithm for Real-Time Video
2.6. Model Performance Evaluation
3. Results
3.1. Evaluation of Model on Still Images
3.2. Evaluation of a Gastrointestinal Anatomical Position Prediction Model with Endoscopy Video Data
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
| Primary Classification | Number of Image Data (%) | Secondary Classification | Number of Image Data (%) |
|---|---|---|---|
| Esophagus Part | 6910 (22.0) | Esophagus | 4707 (68.1) |
| | | Gastroesophageal junction | 2203 (31.8) |
| Cardia | 2126 (6.7) | | |
| Gastric Bodies Part | 8815 (28.0) | Upper body | 3046 (34.5) |
| | | Middle body | 3521 (39.9) |
| | | Lower body | 2248 (25.5) |
| Angle | 2226 (7.0) | | |
| Antrum | 4161 (13.2) | | |
| Duodenum Part | 2631 (8.3) | Duodenal bulb | 1036 (39.3) |
| | | Duodenum second portion | 1595 (60.6) |
| Non-clear | 4534 (14.4) | | |
| Total | 31,403 (100.0) | | |
| Model | Accuracy (95% CI) | F1 Score (95% CI) | Precision (95% CI) | Recall (95% CI) | AUC (95% CI) |
|---|---|---|---|---|---|
| Primary classification: Esophagus Part, Cardia, Gastric Bodies Part, Angle, Antrum, Duodenum Part, and Non-Clear | | | | | |
| ResNet101 | 75.68 (73.33–78.02) | 75.58 (73.20–77.92) | 75.71 (73.35–78.06) | 75.67 (73.33–78.02) | 95.26 (94.50–95.95) |
| InceptionV3 | 85.35 (83.41–87.22) | 85.24 (83.26–87.17) | 85.33 (83.32–87.28) | 85.32 (83.41–87.22) | 97.63 (97.13–98.10) |
| InceptionResNetV2 | 84.75 (82.70–86.75) | 84.67 (82.53–86.62) | 84.65 (82.48–86.66) | 84.76 (82.70–86.75) | 97.82 (97.32–98.29) |
| Secondary classification: Esophagus and Gastroesophageal Junction | | | | | |
| ResNet101 | 77.11 (72.85–81.45) | 77.15 (72.83–81.45) | 77.21 (72.85–81.45) | 77.11 (72.85–81.45) | 82.70 (87.48–86.45) |
| InceptionV3 | 83.65 (79.83–87.37) | 83.54 (79.52–87.34) | 84.65 (81.13–88.23) | 83.60 (79.83–87.37) | 88.59 (85.10–91.75) |
| InceptionResNetV2 | 82.01 (77.95–85.75) | 81.94 (77.79–85.72) | 82.64 (78.60–86.14) | 81.99 (77.95–85.75) | 86.60 (82.82–90.04) |
| Secondary classification: Upper Body, Middle Body, and Lower Body | | | | | |
| ResNet101 | 51.58 (47.47–55.67) | 50.72 (46.62–54.97) | 51.12 (46.82–55.41) | 51.48 (47.47–55.67) | 70.25 (67.02–73.33) |
| InceptionV3 | 57.71 (53.75–61.95) | 57.35 (53.35–61.66) | 57.77 (53.36–61.71) | 57.77 (53.75–61.95) | 74.68 (53.75–61.95) |
| InceptionResNetV2 | 60.16 (56.02–64.22) | 59.45 (55.15–63.49) | 59.50 (55.21–63.58) | 60.21 (56.02–64.22) | 77.31 (74.25–80.11) |
| Secondary classification: Duodenal Bulb and Duodenum Second Portion | | | | | |
| ResNet101 | 83.60 (78.39–88.64) | 83.52 (78.26–88.65) | 83.79 (78.43–89.03) | 83.52 (78.39–88.64) | 92.96 (89.11–96.21) |
| InceptionV3 | 89.81 (85.23–94.32) | 89.78 (85.22–94.32) | 89.91 (85.23–94.34) | 89.77 (85.23–94.32) | 96.85 (94.56–98.73) |
| InceptionResNetV2 | 93.07 (89.20–96.59) | 93.17 (89.18–96.59) | 93.22 (89.23–96.62) | 93.18 (89.20–96.59) | 97.21 (94.95–99.02) |
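The "(95% CI)" columns in the table above can be reproduced with a percentile bootstrap over the test set, a standard way to attach confidence intervals to accuracy, F1, and related metrics. The sketch below is illustrative only (not the authors' released code), and the toy labels are an assumption for demonstration:

```python
# Hypothetical sketch: percentile-bootstrap 95% CI for a classification metric.
import numpy as np

def accuracy(y_true, y_pred):
    # Fraction of predictions that match the ground-truth labels.
    return float(np.mean(y_true == y_pred))

def bootstrap_ci(y_true, y_pred, metric, n_boot=2000, alpha=0.05, seed=0):
    """Resample (y_true, y_pred) pairs with replacement; take percentiles."""
    rng = np.random.default_rng(seed)
    n = len(y_true)
    scores = [metric(y_true[idx], y_pred[idx])
              for idx in (rng.integers(0, n, n) for _ in range(n_boot))]
    return (float(np.percentile(scores, 100 * alpha / 2)),
            float(np.percentile(scores, 100 * (1 - alpha / 2))))

# Toy multi-class labels (illustrative, not study data).
y_true = np.array([0, 1, 2, 1, 0, 2, 1, 0, 2, 1])
y_pred = np.array([0, 1, 2, 0, 0, 2, 1, 0, 1, 1])
point = accuracy(y_true, y_pred)
lo, hi = bootstrap_ci(y_true, y_pred, accuracy)
print(f"Accuracy {point:.2%} (95% CI {lo:.2%}-{hi:.2%})")
```

The same `bootstrap_ci` helper works unchanged for any metric with the `(y_true, y_pred) -> float` signature, which is why one resampling loop can populate every CI column of such a table.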
| Model | Accuracy (95% CI) | F1 Score (95% CI) | Precision (95% CI) | Recall (95% CI) |
|---|---|---|---|---|
| ResNet101 | 68.40 (66.00–70.90) | 67.29 (64.72–70.04) | 67.29 (64.52–70.05) | 68.41 (65.87–70.87) |
| InceptionV3 | 80.10 (77.90–82.10) | 79.79 (77.36–82.09) | 80.57 (78.20–82.92) | 80.08 (77.70–82.30) |
| InceptionResNetV2 | 78.00 (75.70–80.30) | 77.72 (75.31–80.09) | 78.28 (75.74–80.68) | 78.02 (75.71–80.24) |
| Data Categorization Criteria | Sensitivity (95% CI) | Specificity (95% CI) |
|---|---|---|
| Esophagus | 79.49 (72.46–86.57) | 99.91 (99.73–100.00) |
| Gastroesophageal junction | 87.23 (77.19–95.24) | 97.60 (96.76–98.40) |
| Cardia | 90.00 (85.56–94.09) | 97.07 (96.02–98.02) |
| Upper body | 37.88 (27.14–49.33) | 98.57 (97.81–99.24) |
| Middle body | 43.84 (31.14–56.90) | 98.50 (97.75–99.09) |
| Lower body | 55.88 (40.62–71.05) | 97.54 (96.63–98.37) |
| Angle | 90.62 (86.23–94.59) | 97.97 (97.10–98.71) |
| Antrum | 88.30 (83.62–92.68) | 88.30 (96.61–98.42) |
| Duodenal bulb | 68.18 (56.34–80.00) | 98.42 (97.67–99.08) |
| Duodenum second portion | 92.44 (87.13–96.77) | 98.42 (98.87–99.82) |
| Non-clear | 79.81 (74.21–85.96) | 95.33 (94.13–96.56) |
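Per-class sensitivity and specificity values like those above are conventionally derived one-vs-rest from a multi-class confusion matrix: each anatomical class in turn is treated as positive and all others as negative. A minimal sketch, with three-class toy data assumed purely for demonstration:

```python
# Illustrative sketch: one-vs-rest sensitivity/specificity per class.
import numpy as np

def per_class_sens_spec(y_true, y_pred, n_classes):
    """For each class c, treat c as positive and every other class as negative."""
    results = {}
    for c in range(n_classes):
        tp = int(np.sum((y_true == c) & (y_pred == c)))
        fn = int(np.sum((y_true == c) & (y_pred != c)))
        tn = int(np.sum((y_true != c) & (y_pred != c)))
        fp = int(np.sum((y_true != c) & (y_pred == c)))
        # (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP))
        results[c] = (tp / (tp + fn), tn / (tn + fp))
    return results

# Toy three-class data (assumption, not study data).
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])
stats = per_class_sens_spec(y_true, y_pred, 3)
```

Note that with many classes the "rest" side is large, so specificity tends to be high even when sensitivity for a class is modest, a pattern visible in the body subregions of the table above.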
| Window Size | F1 Score (95% CI) | Precision (95% CI) | Recall (95% CI) |
|---|---|---|---|
| 4 Frames | 59.13 (41.63–75.24) | 75.60 (53.07–90.10) | 57.81 (40.34–75.00) |
| 7 Frames | 61.37 (41.08–76.86) | 73.08 (51.70–87.79) | 57.21 (39.33–77.18) |
| 10 Frames | 59.84 (38.42–76.31) | 72.15 (49.84–85.00) | 57.15 (37.14–77.01) |
| 13 Frames | 55.69 (36.93–74.97) | 65.97 (47.26–82.28) | 56.72 (35.53–75.52) |
| Model | Post-Processing | F1 Score (95% CI) | Precision (95% CI) | Recall (95% CI) |
|---|---|---|---|---|
| ResNet101 | Without post-processing | 45.15 (26.36–55.05) | 61.46 (44.20–73.23) | 45.76 (27.49–54.05) |
| | With post-processing | 47.87 (21.03–61.03) | 61.74 (39.62–82.06) | 47.79 (24.92–59.21) |
| InceptionV3 | Without post-processing | 55.12 (38.73–71.83) | 69.63 (46.56–86.52) | 54.69 (39.29–66.58) |
| | With post-processing | 59.66 (40.76–76.82) | 70.07 (43.35–91.28) | 59.07 (42.10–74.32) |
| InceptionResNetV2 | Without post-processing | 56.25 (39.25–73.20) | 69.51 (49.78–86.92) | 54.23 (38.34–72.08) |
| | With post-processing | 61.37 (41.08–76.86) | 73.08 (51.70–87.79) | 57.21 (39.33–77.18) |
| Labels | Top 1 Sensitivity | Top 1 Specificity | Top 1 Avg. Frames | Top 5 Sensitivity | Top 5 Specificity | Top 5 Avg. Frames | Total (n = 20) Sensitivity | Total (n = 20) Specificity | Total (n = 20) Avg. Frames |
|---|---|---|---|---|---|---|---|---|---|
| ES | 100.00 | 98.10 | 280 | 91.09 | 95.07 | 462 | 89.79 | 95.68 | 650 |
| GE | 65.00 | 100.00 | 400 | 54.96 | 98.14 | 318 | 52.11 | 98.64 | 345 |
| CR | 81.13 | 99.54 | 1060 | 78.43 | 96.46 | 564 | 57.41 | 98.05 | 600 |
| UB | 32.00 | 99.35 | 500 | 26.40 | 95.91 | 655 | 19.27 | 95.88 | 248 |
| MB | 38.88 | 95.80 | 720 | 19.15 | 97.97 | 440 | 19.36 | 95.88 | 1218 |
| LB | 66.66 | 96.53 | 540 | 25.39 | 98.08 | 420 | 27.22 | 93.37 | 811 |
| AG | 96.80 | 93.41 | 1880 | 69.95 | 95.76 | 573 | 65.18 | 95.07 | 525 |
| AT | 76.35 | 93.84 | 2960 | 77.72 | 96.13 | 1188 | 74.10 | 93.45 | 955 |
| BB | 78.94 | 95.74 | 380 | 75.78 | 95.96 | 192 | 56.69 | 95.27 | 216 |
| SD | 83.01 | 100.00 | 1060 | 74.03 | 99.44 | 426 | 78.55 | 99.18 | 292 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Kang, S.M.; Lee, G.P.; Kim, Y.J.; Kim, K.O.; Kim, K.G. Deep Learning Models for Anatomical Location Classification in Esophagogastroduodenoscopy Images and Videos: A Quantitative Evaluation with Clinical Data. Diagnostics 2024, 14, 2360. https://doi.org/10.3390/diagnostics14212360