Performance of a Chest Radiography AI Algorithm for Detection of Missed or Mislabeled Findings: A Multicenter Study
Abstract
1. Introduction
Related Work
2. Materials and Methods
2.1. Approval and Disclosures
2.2. Chest Radiographs
2.3. AI Algorithm
2.4. Statistical Analyses
3. Results
4. Discussion
5. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Ekpo, E.U.; Egbe, N.O.; Akpan, B.E. Radiographers’ performance in chest X-ray interpretation: The Nigerian experience. Br. J. Radiol. 2015, 88, 20150023.
- Speets, A.M.; van der Graaf, Y.; Hoes, A.W.; Kalmijn, S.; Sachs, A.P.; Rutten, M.J.; Gratama, J.W.C.; van Swijndregt, A.D.M.; Mali, W.P. Chest radiography in general practice: Indications, diagnostic yield and consequences for patient management. Br. J. Gen. Pract. 2006, 56, 574–578.
- Forrest, J.V.; Friedman, P.J. Radiologic errors in patients with lung cancer. West J. Med. 1981, 134, 485–490.
- Kelly, B. The chest radiograph. Ulster Med. J. 2012, 81, 143–148.
- Schaefer-Prokop, C.; Neitzel, U.; Venema, H.W.; Uffmann, M.; Prokop, M. Digital chest radiography: An update on modern technology, dose containment and control of image quality. Eur. Radiol. 2008, 18, 1818–1830.
- Satia, I.; Bashagha, S.; Bibi, A.; Ahmed, R.; Mellor, S.; Zaman, F. Assessing the accuracy and certainty in interpreting chest X-rays in the medical division. Clin. Med. 2013, 13, 349–352.
- Fancourt, N.; Deloria Knoll, M.; Barger-Kamate, B.; De Campo, J.; De Campo, M.; Diallo, M.; Ebruke, B.E.; Feikin, D.R.; Gleeson, F.; Gong, W.; et al. Standardized Interpretation of Chest Radiographs in Cases of Pediatric Pneumonia From the PERCH Study. Clin. Infect. Dis. 2017, 64 (Suppl. 3), S253–S261.
- Berlin, L. Reporting the “missed” radiologic diagnosis: Medicolegal and ethical considerations. Radiology 1994, 192, 183–187.
- Quekel, L.G.; Kessels, A.G.; Goei, R.; van Engelshoven, J.M. Miss rate of lung cancer on the chest radiograph in clinical practice. Chest 1999, 115, 720–724.
- Institute of Medicine (US) Committee on Quality of Health Care in America; Kohn, L.T.; Corrigan, J.M.; Donaldson, M.S. (Eds.) To Err Is Human: Building a Safer Health System; National Academies Press: Washington, DC, USA, 2000.
- Ebrahimian, S.; Kalra, M.K.; Agarwal, S.; Bizzo, B.C.; Elkholy, M.; Wald, C.; Allen, B.; Dreyer, K.J. FDA-regulated AI algorithms: Trends, strengths, and gaps of validation studies. Acad. Radiol. 2021; in press.
- Li, B.; Kang, G.; Cheng, K.; Zhang, N. Attention-Guided Convolutional Neural Network for Detecting Pneumonia on Chest X-rays. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. 2019, 2019, 4851–4854.
- Wu, J.T.; Wong, K.C.; Gur, Y.; Ansari, N.; Karargyris, A.; Sharma, A.; Morris, M.; Saboury, B.; Ahmad, H.; Boyko, O.; et al. Comparison of Chest Radiograph Interpretations by Artificial Intelligence Algorithm vs Radiology Residents. JAMA Netw. Open 2020, 3, e2022779.
- Li, X.; Shen, L.; Xie, X.; Huang, S.; Xie, Z.; Hong, X.; Yu, J. Multi-resolution convolutional networks for chest X-ray radiograph based lung nodule detection. Artif. Intell. Med. 2020, 103, 101744.
- Lan, C.C.; Hsieh, M.S.; Hsiao, J.K.; Wu, C.W.; Yang, H.H.; Chen, Y.; Hsieh, P.C.; Tzeng, I.S.; Wu, Y.K. Deep Learning-based Artificial Intelligence Improves Accuracy of Error-prone Lung Nodules. Int. J. Med. Sci. 2022, 19, 490.
- Zhang, Y.; Jiang, B.; Zhang, L.; Greuter, M.J.; de Bock, G.H.; Zhang, H.; Xie, X. Lung nodule detectability of artificial intelligence-assisted CT image reading in lung cancer screening. Curr. Med. Imaging 2022, 18, 327–334.
- Rudolph, J.; Huemmer, C.; Ghesu, F.C.; Mansoor, A.; Preuhs, A.; Fieselmann, A.; Fink, N.; Dinkel, J.; Koliogiannis, V.; Schwarze, V.; et al. Artificial Intelligence in Chest Radiography Reporting Accuracy: Added Clinical Value in the Emergency Unit Setting Without 24/7 Radiology Coverage. Investig. Radiol. 2022, 57, 90–98.
- Nguyen, N.H.; Nguyen, H.Q.; Nguyen, N.T.; Nguyen, T.V.; Pham, H.H.; Nguyen, T.N.-M. Deployment and validation of an AI system for detecting abnormal chest radiographs in clinical settings. Front. Digit. Health 2022, 4, 890759.
- Ajmera, P.; Kharat, A.; Gupte, T.; Pant, R.; Kulkarni, V.; Duddalwar, V.; Lamghare, P. Observer performance evaluation of the feasibility of a deep learning model to detect cardiomegaly on chest radiographs. Acta Radiol. Open 2022, 11, 20584601221107345.
- Homayounieh, F.; Digumarthy, S.; Ebrahimian, S.; Rueckel, J.; Hoppe, B.F.; Sabel, B.O.; Conjeti, S.; Ridder, K.; Sistermanns, M.; Wang, L.; et al. An Artificial Intelligence–Based Chest X-ray Model on Human Nodule Detection Accuracy From a Multicenter Study. JAMA Netw. Open 2021, 4, e2141096.
- Engle, E.; Gabrielian, A.; Long, A.; Hurt, D.E.; Rosenthal, A. Performance of Qure.ai automatic classifiers against a large annotated database of patients with diverse forms of tuberculosis. PLoS ONE 2020, 15, e0224445.
- Yoo, H.; Kim, K.H.; Singh, R.; Digumarthy, S.R.; Kalra, M.K. Validation of a deep learning algorithm for the detection of malignant pulmonary nodules in chest radiographs. JAMA Netw. Open 2020, 3, e2017135.
- Behzadi-Khormouji, H.; Rostami, H.; Salehi, S.; Derakhshande-Rishehri, T.; Masoumi, M.; Salemi, S.; Keshavarz, A.; Gholamrezanezhad, A.; Assadi, M.; Batouli, A. Deep learning, reusable and problem-based architectures for detection of consolidation on chest X-ray images. Comput. Methods Programs Biomed. 2020, 185, 105162.
- Itri, J.N.; Tappouni, R.R.; McEachern, R.O.; Pesch, A.J.; Patel, S.H. Fundamentals of diagnostic error in imaging. Radiographics 2018, 38, 1845–1865.
- Thian, Y.L.; Ng, D.; Hallinan, J.T.; Jagmohan, P.; Sia, D.S.; Tan, C.H.; Ting, Y.H.; Kei, P.L.; Pulickal, G.G.; Tiong, V.T.; et al. Deep Learning Systems for Pneumothorax Detection on Chest Radiographs: A Multicenter External Validation Study. Radiol. Artif. Intell. 2021, 3, e200190.
- Arora, R.; Bansal, V.; Buckchash, H.; Kumar, R.; Sahayasheela, V.J.; Narayanan, N.; Pandian, G.N.; Raman, B. AI-based diagnosis of COVID-19 patients using X-ray scans with stochastic ensemble of CNNs. Phys. Eng. Sci. Med. 2021, 44, 1257–1271.
- Nabulsi, Z.; Sellergren, A.; Jamshy, S.; Lau, C.; Santos, E.; Kiraly, A.P.; Ye, W.; Yang, J.; Pilgrim, R.; Kazemzadeh, S.; et al. Deep learning for distinguishing normal versus abnormal chest radiographs and generalization to two unseen diseases tuberculosis and COVID-19. Sci. Rep. 2021, 11, 1–5.
- Baltruschat, I.; Steinmeister, L.; Nickisch, H.; Saalbach, A.; Grass, M.; Adam, G.; Knopp, T.; Ittrich, H. Smart chest X-ray worklist prioritization using artificial intelligence: A clinical workflow simulation. Eur. Radiol. 2021, 31, 3837–3845.
- O’Neill, T.J.; Xi, Y.; Stehel, E.; Browning, T.; Ng, Y.S.; Baker, C.; Peshock, R.M. Active reprioritization of the reading worklist using artificial intelligence has a beneficial effect on the turnaround time for interpretation of head CT with intracranial hemorrhage. Radiol. Artif. Intell. 2020, 3, e200024.
| Authors (Year) | Sample Size and Approach | Results |
|---|---|---|
| Lan et al. (2022) [15] | 60 chest CTs assessed both manually and with AI assistance | Unaided false-positive (FP) rate was 0.617–0.650/CT and sensitivity was 59.2–67.0%; with AI assistance, the FP rate was 0.067–0.2/CT and sensitivity was 59.2–77.3% |
| Zhang et al. (2022) [16] | 860 chest CT screenings assessed by 14 residents and 15 radiologists; in addition, one radiologist and one resident re-evaluated the CTs with AI assistance | Radiologists' accuracy and sensitivity for solid nodules were 86% and 52%, compared with 99.1% and 98.8% with AI assistance |
| Rudolph et al. (2022) [17] | 563 CXRs retrospectively assessed by multiple radiologists and compared with an AI system | AI-assisted interpretation was comparable to the most sensitive unassisted interpretation, with AUCs of 0.837 (pneumothorax), 0.823 (pleural effusion), and 0.747 (lung lesions) |
| Nguyen et al. (2022) [18] | 6285 CXRs for abnormality detection with an AI algorithm | AI had 79.6% accuracy, 68.6% sensitivity, and 83.9% specificity; AI algorithms can assist CXR interpretation as a second reader |
| Ajmera et al. (2022) [19] | 1012 posteroanterior CXRs for diagnosis of cardiomegaly | The AI algorithm improved sensitivity for identifying cardiomegaly from 40.5% to 88.4% |
| Homayounieh et al. (2021) [20] | 100 posteroanterior CXRs; detection of pulmonary nodules | Mean detection accuracy for pulmonary nodules increased by 6.4% with AI assistance across levels of detection difficulty and reader experience |
| Findings | Site A PA, MS | Site A PA, ML | Site A Portable, MS | Site A Portable, ML | Site B PA, MS | Site B PA, ML | Site B Portable, MS | Site B Portable, ML | Remaining Sites PA, MS | Remaining Sites PA, ML | Remaining Sites Portable, MS | Remaining Sites Portable, ML |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Consolidation | 49 | 3 | 3 | 0 | 4 | 0 | 0 | 0 | 3 | 0 | 0 | 0 |
| Pulmonary nodule | 28 | 1 | 8 | 1 | 8 | 1 | 3 | 0 | 2 | 0 | 2 | 0 |
| Pneumothorax | 68 | 6 | 20 | 1 | 2 | 0 | 0 | 0 | 3 | 0 | 0 | 0 |
| Pleural effusion | 10 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 2 | 0 | 1 | 0 |
| Rib fracture | 36 | 0 | 9 | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

CXR = chest radiograph; PA = posteroanterior; MS = missed finding; ML = mislabeled finding.
| Findings | Sensitivity (%) | Specificity (%) | Accuracy (%) | AUC (95% CI) |
|---|---|---|---|---|
| Pulmonary nodule | 96 | 100 | 96 | 0.98 (0.94–1.00) |
| Consolidation | 98 | 100 | 98 | 0.99 (0.97–1.00) |
| Rib fracture | 87 | 100 | 94 | 0.94 (0.85–1.00) |
| Pleural effusion | 100 | 17 | 67 | 0.82 (0.54–1.00) |
| Pneumothorax | 84 | 100 | 85 | 0.92 (0.86–0.98) |

AUC = area under the receiver operating characteristic curve; CI = confidence interval.
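The Statistical Analyses section is not reproduced here, so the exact method behind the confidence intervals in the table above is not shown. As a minimal, hedged sketch, the snippet below illustrates one common way to obtain an AUC with a percentile-bootstrap 95% CI from per-image AI confidence scores and ground-truth labels; the `y_true` and `y_score` arrays are hypothetical placeholders, not the study data.

```python
# Minimal sketch (not the paper's method): AUC with a percentile-bootstrap
# 95% CI from per-image scores and labels. Arrays below are hypothetical.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Hypothetical example: 1 = finding present on the CXR, 0 = absent,
# with one continuous AI confidence score per image.
y_true = np.array([1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1])
y_score = np.array([0.91, 0.80, 0.65, 0.88, 0.30, 0.45,
                    0.72, 0.55, 0.95, 0.60, 0.20, 0.83])

auc = roc_auc_score(y_true, y_score)

# Percentile bootstrap over images; resamples lacking both classes are skipped.
boot = []
for _ in range(2000):
    idx = rng.integers(0, len(y_true), len(y_true))
    if len(np.unique(y_true[idx])) < 2:
        continue
    boot.append(roc_auc_score(y_true[idx], y_score[idx]))
lo, hi = np.percentile(boot, [2.5, 97.5])

print(f"AUC = {auc:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```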
| Findings | True Positive | True Negative | False Positive | False Negative |
|---|---|---|---|---|
| Pulmonary nodule | 51 | 1 | 0 | 2 |
| Consolidation | 62 | 62 | 0 | 1 |
| Rib fracture | 20 | 25 | 0 | 3 |
| Pleural effusion | 9 | 1 | 0 | 5 |
| Pneumothorax | 80 | 5 | 0 | 15 |
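Under the standard definitions, the per-finding metrics reported earlier follow directly from these confusion-matrix counts: sensitivity = TP/(TP+FN), specificity = TN/(TN+FP), and accuracy = (TP+TN)/(TP+TN+FP+FN). The sketch below is only a worked example using the pulmonary nodule row; the paper's own computation (e.g., its handling of equivocal cases) is not reproduced here.

```python
# Worked example: sensitivity, specificity, and accuracy from the
# pulmonary nodule row of the confusion-matrix table above.
tp, tn, fp, fn = 51, 1, 0, 2  # true/false positives and negatives

sensitivity = 100 * tp / (tp + fn)                 # 51/53 ≈ 96%
specificity = 100 * tn / (tn + fp)                 # 1/1   = 100%
accuracy = 100 * (tp + tn) / (tp + tn + fp + fn)   # 52/54 ≈ 96%

print(f"sensitivity {sensitivity:.0f}%, "
      f"specificity {specificity:.0f}%, accuracy {accuracy:.0f}%")
```

For this row, the computed values (96%, 100%, 96%) match the figures reported in the performance table.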