Article

Nuances of Interpreting X-ray Analysis by Deep Learning and Lessons for Reporting Experimental Findings

School of Computer Science, University of St Andrews, North Haugh, St Andrews KY16 9SX, UK
*
Author to whom correspondence should be addressed.
Academic Editor: Ahmad Taher Azar
Received: 5 November 2021 / Revised: 20 December 2021 / Accepted: 5 January 2022 / Published: 16 January 2022
With the increase in the availability of annotated X-ray image data, there has been an accompanying and consequent increase in research on machine-learning-based, and in particular deep-learning-based, X-ray image analysis. A major problem with this body of work lies in how newly proposed algorithms are evaluated. Usually, comparative analysis is reduced to the presentation of a single metric, often the area under the receiver operating characteristic curve (AUROC), which does not provide much clinical value or insight and thus fails to communicate the applicability of proposed models. In the present paper, we address this limitation of previous work by presenting a thorough analysis of a state-of-the-art learning approach and hence illuminate various weaknesses of similar algorithms in the literature, which have not yet been fully acknowledged and appreciated. Our analysis was performed on the ChestX-ray14 dataset, which has 14 lung disease labels and meta-information such as patient age, gender, and the relative X-ray direction. We examine the diagnostic significance of different metrics used in the literature, including those proposed by the International Medical Device Regulators Forum, and present a qualitative assessment of the spatial information learned by the model. We show that models that have very similar AUROCs can exhibit widely differing clinical applicability. As a result, our work demonstrates the importance of detailed reporting and analysis of the performance of machine-learning approaches in this field, which is crucial both for progress in the field and for the adoption of such models in practice.
Keywords: roentgen; chest; disease; thorax; error; label
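The abstract's point that a single AUROC value can hide clinically relevant differences can be illustrated with a small sketch. The example below is not the authors' method or data: the labels and scores are synthetic, and the function names `auroc` and `operating_point` are hypothetical helpers introduced here for illustration. It relies on the fact that AUROC depends only on the ranking of scores, so two models that rank cases identically share an AUROC yet can behave differently at a fixed decision threshold.

```python
import math


def auroc(labels, scores):
    """AUROC as the probability that a random positive case scores higher
    than a random negative case (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def operating_point(labels, scores, threshold):
    """Confusion-matrix metrics when predicting 'disease' for score >= threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        # Predictive values are undefined if no case falls on that side.
        "ppv": tp / (tp + fp) if (tp + fp) else math.nan,
        "npv": tn / (tn + fn) if (tn + fn) else math.nan,
    }


# Synthetic data (assumption): 4 healthy (0) and 4 diseased (1) cases.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores_a = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]
# Hypothetical "model B": same ranking as model A, differently calibrated.
scores_b = [s ** 3 for s in scores_a]

for name, scores in (("A", scores_a), ("B", scores_b)):
    print(name, auroc(labels, scores), operating_point(labels, scores, 0.5))
```

Both synthetic models have the same AUROC, but at the threshold of 0.5 model A detects three of the four diseased cases while model B detects only two, trading sensitivity for specificity. Reporting the operating-point metrics alongside AUROC makes that difference visible.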

MDPI and ACS Style

Valsson, S.; Arandjelović, O. Nuances of Interpreting X-ray Analysis by Deep Learning and Lessons for Reporting Experimental Findings. Sci 2022, 4, 3. https://doi.org/10.3390/sci4010003

AMA Style

Valsson S, Arandjelović O. Nuances of Interpreting X-ray Analysis by Deep Learning and Lessons for Reporting Experimental Findings. Sci. 2022; 4(1):3. https://doi.org/10.3390/sci4010003

Chicago/Turabian Style

Valsson, Steinar, and Ognjen Arandjelović. 2022. "Nuances of Interpreting X-ray Analysis by Deep Learning and Lessons for Reporting Experimental Findings" Sci 4, no. 1: 3. https://doi.org/10.3390/sci4010003
