Real-World Evaluation of an AI-Assisted Diagnostic Support System for Early Gastric Cancer: Diagnostic Performance, Confidence Stratification, and Determinants of False-Positive Diagnosis
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Design and Patients
2.2. Ethics Statement
2.3. Endoscopic Equipment and Examination Procedure
2.4. AI Assessment and Confidence Categorization
2.5. Regional Reproducibility
2.6. Statistical Analysis
3. Results
3.1. Overall Diagnostic Performance of the AI System
3.2. Stepwise Diagnostic Characteristics Across Four AI Confidence Categories
3.3. Factors Associated with False-Positive Diagnosis Among AI-Positive Lesions (TP vs. FP)
3.4. Factors Associated with False-Positive Diagnosis Among Non-Neoplastic Lesions (FP vs. TN)
4. Discussion
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| AI | Artificial intelligence |
| B | Consider biopsy |
| LC | Low confidence |
| NPV | Negative predictive value |
| PPV | Positive predictive value |
| OR | Odds ratio |
| CI | Confidence interval |




| Characteristics | Value |
|---|---|
| Patients, n | 47 |
| Lesions, n | 89 |
| Age, years (mean ± SD) | 71.9 ± 9.5 |
| Male sex, n (%) | 74 (83.1%) |
| Lesion size, mm (median, IQR) | 10.0 (5.0–24.0) |
| Lesion size ≥ 30 mm, n (%) | 17 (19.1%) |
| Lesion location | |
| Upper third | 21 (23.6%) |
| Middle third | 31 (34.8%) |
| Lower third | 35 (39.3%) |
| Esophagogastric junction | 2 (2.2%) |
| Helicobacter pylori status | |
| Never infected | 18 (20.2%) |
| Currently infected | 15 (16.9%) |
| Previously infected | 53 (59.6%) |
| Unknown | 3 (3.4%) |
| Pathological diagnosis | |
| Adenocarcinoma | 41 (46.1%) |
| Malignant lymphoma * | 2 (2.2%) |
| Non-neoplastic lesions | 46 (51.7%) |
| Fundic gland polyp | 11 |
| Hyperplastic polyp | 8 |
| Ulcer scar | 8 |
| Erosion | 8 |
| Others ‡ | 11 |
| Evaluations per lesion, median | 5 |
| Total AI judgments, n | 474 |
| "Consider biopsy" (B) judgments, n (%) | 373 (78.9%) |
| AI-positive lesions (B judgment rate ≥ 50%), n (%) | 66 (74.2%) |

| | AI Positive (n = 66) | AI Negative (n = 23) | Diagnostic Index |
|---|---|---|---|
| Pathology positive (n = 41) | 40 | 1 | Sensitivity: 97.6% |
| Pathology negative (n = 48) | 26 | 22 | Specificity: 45.8% |
| | PPV: 60.6% | NPV: 95.7% | Overall accuracy: 69.7% |
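
The diagnostic indices in the 2 × 2 table above follow directly from its four cell counts. The short Python sketch below recomputes them as an arithmetic check; the variable and function names are illustrative and are not taken from the paper.

```python
# Cell counts read from the 2x2 table above (AI result vs. pathology).
tp, fn = 40, 1    # pathology-positive lesions: AI positive / AI negative
fp, tn = 26, 22   # pathology-negative lesions: AI positive / AI negative


def pct(numerator, denominator):
    """Return a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


indices = {
    "sensitivity": pct(tp, tp + fn),            # detected cancers / all cancers
    "specificity": pct(tn, tn + fp),            # correct negatives / all non-cancers
    "ppv": pct(tp, tp + fp),                    # cancers among AI-positive lesions
    "npv": pct(tn, tn + fn),                    # non-cancers among AI-negative lesions
    "accuracy": pct(tp + tn, tp + fn + fp + tn),
}

for name, value in indices.items():
    print(f"{name}: {value}%")
```

The recomputed values agree with the percentages reported in the table.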

| AI Confidence Category | B Judgment Rate (%) | N | True Positive | False Positive | True Negative | False Negative | Pathologically Confirmed Cancer Rate (%) |
|---|---|---|---|---|---|---|---|
| B | 100 | 41 | 31 | 10 | 0 | 0 | 75.6 |
| B/LC | 70–99 | 25 | 9 | 16 | 0 | 0 | 36.0 |
| LC/B | 50–69 | 9 | 0 | 0 | 8 | 1 | 11.1 |
| LC | 1–49 | 14 | 0 | 0 | 14 | 0 | 0.0 |
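
The cancer rate in each confidence category above is the share of pathology-positive lesions (true positives plus false negatives) among all lesions in that category. A minimal Python sketch of that calculation follows, using only the counts in the table; the names are illustrative.

```python
# Per-category counts read from the table above: (label, TP, FP, TN, FN).
categories = [
    ("B",    31, 10,  0, 0),
    ("B/LC",  9, 16,  0, 0),
    ("LC/B",  0,  0,  8, 1),
    ("LC",    0,  0, 14, 0),
]

cancer_rates = {}
for label, tp, fp, tn, fn in categories:
    n = tp + fp + tn + fn                      # lesions in this category
    cancer_rates[label] = round(100 * (tp + fn) / n, 1)

# Check that the rate falls at every step from B down to LC.
rates = list(cancer_rates.values())
is_stepwise = all(a > b for a, b in zip(rates, rates[1:]))

print(cancer_rates, is_stepwise)
```

The rates reproduce the last column of the table and decrease at each step from B to LC.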

| Variable | Adjusted OR | 95% CI | p Value |
|---|---|---|---|
| Low regional reproducibility (score = 3) | 2.70 | 1.22–6.67 | 0.015 |
| Lesion size ≥ 30 mm | 2.41 | 1.08–5.38 | 0.032 |
| Scar | 3.68 | 1.02–13.2 | 0.047 |
| Erosion | 5.43 | 1.11–26.6 | 0.037 |

| Variable | Adjusted OR | 95% CI | p Value |
|---|---|---|---|
| Lesion size ≥ 30 mm | 2.66 | 1.01–7.03 | 0.048 |
| Fundic gland polyp | 0.59 | 0.18–1.94 | 0.39 |
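
An odds ratio, its 95% CI, and its p value are linked on the log scale. The Python sketch below approximates the p value from the odds ratio and interval in the table directly above. It assumes Wald-type intervals (symmetric on the log-odds scale), which this text does not confirm, so it is a rough consistency check rather than a reconstruction of the authors' method. The function name is hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_p_from_ci(odds_ratio, ci_low, ci_high):
    """Approximate two-sided p value, ASSUMING a Wald interval on the log scale."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)  # SE of log(OR)
    z = math.log(odds_ratio) / se
    return math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))


# (adjusted OR, CI lower bound, CI upper bound) from the table above.
rows = {
    "Lesion size >= 30 mm": (2.66, 1.01, 7.03),
    "Fundic gland polyp": (0.59, 0.18, 1.94),
}

p_values = {name: wald_p_from_ci(*values) for name, values in rows.items()}
for name, p in p_values.items():
    print(f"{name}: p ~ {p:.3f}")
```

Under this assumption the approximations land close to the reported p values; small differences are expected because the published odds ratios and interval bounds are rounded.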
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Osawa, S.; Yamada, T.; Inui, W.; Niwa, T.; Takahashi, K.; Egami, T.; Inagaki, K.; Takebe, T.; Ito, T.; Takahashi, S.; et al. Real-World Evaluation of an AI-Assisted Diagnostic Support System for Early Gastric Cancer: Diagnostic Performance, Confidence Stratification, and Determinants of False-Positive Diagnosis. J. Clin. Med. 2026, 15, 2609. https://doi.org/10.3390/jcm15072609
APA Style: Osawa, S., Yamada, T., Inui, W., Niwa, T., Takahashi, K., Egami, T., Inagaki, K., Takebe, T., Ito, T., Takahashi, S., Onoue, S., Asai, Y., Sugiura, K., Matsuura, T., Ishida, N., Yamade, M., Iwaizumi, M., Hamaya, Y., & Sugimoto, K. (2026). Real-World Evaluation of an AI-Assisted Diagnostic Support System for Early Gastric Cancer: Diagnostic Performance, Confidence Stratification, and Determinants of False-Positive Diagnosis. Journal of Clinical Medicine, 15(7), 2609. https://doi.org/10.3390/jcm15072609

