Towards Automatic Detection of Pneumothorax in Emergency Care with Deep Learning Using Multi-Source Chest X-ray Data
Abstract
1. Introduction
- We present a unified, harmonized chest X-ray dataset framework that facilitates multi-source deep learning for pneumothorax detection. We restructured and harmonized the individual datasets (including label standardization) so that they can be combined.
- We run a set of experiments with models trained on multi-source data to address biases that commonly affect pneumothorax models (chest tubes acting as confounders, label mismatch, etc.), with the aim of producing a model suitable for clinical use.
- We evaluate the trained models on unseen data from other public datasets in a pseudo-clinical scenario. Our results show that multi-source datasets generally lead to models that perform better than those trained on single-source datasets.
2. Materials and Methods
2.1. Datasets
- ChestX-ray14 (‘NIH’) [26]: This was the first large-scale chest X-ray dataset, released in 2016 by the US National Institutes of Health (NIH). It contains 112,120 frontal images from 30,805 individual patients with 14 different labels (amongst them, 5302 images labelled with pneumothorax (4.73%) and 60,361 with no findings), obtained mainly via NLP techniques. A small subset of the dataset also contains bounding boxes for disease localization. Subsequent research and expert relabelling of the dataset changed the final pneumothorax count to 5138 (4.58%).
- CheXpert [27,44]: This was released in 2019 by Stanford University (US) and contains 224,316 frontal and lateral images from 65,240 patients, also using 14 labels (most of them overlapping with those of the NIH dataset). It has 19,466 pneumothorax images (8.68%) and 22,528 normal images (10.04%). It also contains a ‘Support Devices’ label for external devices such as pacemakers, endotracheal tubes, valves, catheters, etc., and introduces the possibility of marking a label as uncertain (‘−1’ value). Labels were extracted via NLP, and the authors have recently released an extension that includes the radiology reports [32].
- MIMIC-CXR [28,45]: This was released in 2019 by the Massachusetts Institute of Technology (MIT, US); it contains 377,110 multi-view images from 65,379 patients, using the same labels and NLP labeler as the previous dataset, but also releasing the radiology reports and original images in DICOM format. It contains 14,239 pneumothorax images (3.78%) and 143,363 normal images (38.02%).
- PadChest [29]: This was made public in 2019 by the University of Alicante (Spain); it comprises 160,868 multi-view images from 67,625 unique patients, together with the Spanish radiology reports and hierarchical labels extracted using Unified Medical Language System (UMLS) terminology. A subset of the dataset is also manually labeled. It contains 851 pneumothorax images (0.52%, 411 of them manually labeled) and 50,616 normal images (31.47%).
- VinDr-CXR [30,46]: This was released by VinBigData in 2022; it contains 18,000 frontal images from two Vietnamese hospitals, manually annotated by radiologists using both global and local labels. It contains 12,657 normal images (70.3%) and 76 pneumothorax images (0.42%). The dataset was released in DICOM format.
- SIIM-ACR Pneumothorax Detection Challenge [47]: This dataset was released in 2019 by the Society for Imaging Informatics in Medicine (SIIM) and the American College of Radiology (ACR) for a competition focused on detecting pneumothorax in chest radiographs using segmentation. It contains 12,047 frontal images, 9378 (78%) without pneumothorax and 2669 (22%) with pneumothorax, together with their corresponding segmentation masks.
- BRAX [48,49]: This was released in 2022, and contains 40,967 frontal and lateral images from 18,442 patients from the Hospital Israelita Albert Einstein (Brazil). They adapted the NLP labeler used in CheXpert and MIMIC-CXR to Portuguese, thus extracting the same 14 disease labels. Original DICOM files were also publicly released. It contains 214 pneumothorax images (0.52%) and 29,009 images without findings (71%).
- Indiana [25,50]: This was the first multilabel chest X-ray dataset, released in 2012 by Indiana University (US), and contains 7470 images (frontal and lateral) from 3851 studies coming from two hospitals from the Indiana region, along with the radiology reports and MeSH codification. It contains 54 images with pneumothorax (0.72%) and 2696 normal images (36.09%).
- CRADI [55,56]: This dataset was released in 2021 with images from several institutions in Shanghai, China. It contains 25 different labels annotated using NLP. The training set contains 74,082 frontal images, but it is available only on request because of PII concerns; the external test set contains 10,440 images (one per patient), 201 with pneumothorax (1.92%) and 2737 without findings (26.22%).
2.2. Dataset Harmonization
- Metadata organization: We cataloged all images in a single master table (CSV file), in which each row corresponds to one image and holds its relevant metadata (patient ID, study ID, image ID, view position, date, patient sex and age, image path, and image labels, amongst other values).
- Label harmonization: We mapped each label to a common convention (e.g., “Pneumothorax” and “PTX” were mapped to “pneumothorax”, whereas “normal” or “pathological” were mapped to “no_finding”). We performed one-hot encoding on all labels. For images with uncertainty labels (labeled as ‘−1’ in the CheXpert and MIMIC datasets), we relabeled them as disease-negative (‘0’). For datasets without a ‘no_finding’ column, it was set to positive if none of the other pathological columns was positive, and negative otherwise.
- Split identifiers: To prevent data leakage, we assigned each patient study a unique cross-dataset identifier, which allows train–validation splits at the patient level and ensures that no patient’s images appear in both sets.
- Projections: We handled image projections consistently across datasets and corrected labelling errors in specific images, especially pneumothorax-positive ones.
- DICOM conversion: Datasets in DICOM (Digital Imaging and Communications in Medicine) format required additional processing to extract both the metadata and the raw images from the DICOM files; the images were converted to 8-bit PNG and saved to disk prior to training.
- Original splits: Where a dataset provided original training, validation, and test splits, we did not use them, because their ratios sometimes differed from ours and our goal was evaluation on external datasets.
- Segmentation information: Where segmentation masks were available, binary labels of interest such as ‘pneumothorax’ or ‘tubes’ were inferred from them. The segmentation masks themselves were not used for training.
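The label harmonization and split-identifier steps above can be illustrated with a minimal sketch. It is not the pipeline used in this work: the alias table, the function names (`harmonize_labels`, `global_patient_id`, `assign_split`), the hash-based split assignment, and the 10% validation fraction are all assumptions introduced for illustration.

```python
import hashlib

# Hypothetical alias table: normalized source label -> harmonized label.
LABEL_ALIASES = {
    "ptx": "pneumothorax",
    "normal": "no_finding",
}


def harmonize_labels(raw_labels, pathology_columns, has_no_finding_column=True):
    """Return a one-hot dict over pathology_columns plus 'no_finding'.

    raw_labels maps a source label name to 1 (positive), 0 (negative)
    or -1 (uncertain). Uncertain labels are treated as negative.
    """
    one_hot = {column: 0 for column in list(pathology_columns) + ["no_finding"]}
    for name, value in raw_labels.items():
        normalized = name.strip().lower().replace(" ", "_")
        key = LABEL_ALIASES.get(normalized, normalized)
        if key in one_hot and value == 1:
            one_hot[key] = 1  # only explicit positives are kept; -1 and 0 stay 0
    if not has_no_finding_column:
        # Derive 'no_finding': positive only if no pathology column is positive.
        one_hot["no_finding"] = int(all(one_hot[c] == 0 for c in pathology_columns))
    return one_hot


def global_patient_id(dataset, patient_id):
    """Prefix the patient ID with its dataset so IDs cannot collide across sources."""
    return f"{dataset}:{patient_id}"


def assign_split(dataset, patient_id, val_fraction=0.1):
    """Deterministically assign a patient (and so all their images) to one split."""
    digest = hashlib.sha256(global_patient_id(dataset, patient_id).encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # uniform value in [0, 1)
    return "val" if bucket < val_fraction else "train"


if __name__ == "__main__":
    columns = ["pneumothorax", "support_devices"]
    print(harmonize_labels({"PTX": 1, "Support Devices": -1}, columns, has_no_finding_column=False))
    print(assign_split("nih", "patient_0001"))
```

Because the split depends only on the dataset-qualified patient identifier, every image of a given patient lands in the same set regardless of how many studies or projections that patient has.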
2.3. Preprocessing
2.4. Model Architecture
2.5. Evaluation
2.6. Experiments
- Single-source vs. multi-source training: We retrained the same network architecture on each dataset independently (NIH, CheXpert, MIMIC-CXR, PadChest, SIIM-ACR, and VinDr-CXR), and compared the test-set metrics of the single-source models with those of the multi-source model.
- Threshold optimization: To maximize clinical utility, we evaluated threshold selection strategies on the validation set. Choosing an appropriate threshold is crucial, as a suboptimal choice can yield a model that achieves high performance metrics but has no clinical utility. Our aim was to balance false negatives (missed pneumothoraces, which are potentially life-threatening) against false positives (which can erode clinical confidence in the model).
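As an illustration of threshold selection on a validation set, the following sketch implements two common strategies: maximizing Youden's J statistic and choosing the highest threshold that still reaches a minimum sensitivity. These are assumptions for illustration only; this section does not state which strategies were evaluated, and the function names and the 0.95 sensitivity target are hypothetical.

```python
def confusion_counts(y_true, y_score, threshold):
    """Count TP, FP, TN, FN when predicting positive for score >= threshold."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, y_score):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(y_true, y_score, threshold):
    tp, fp, tn, fn = confusion_counts(y_true, y_score, threshold)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


def youden_threshold(y_true, y_score):
    """Threshold maximizing Youden's J = sensitivity + specificity - 1."""
    best_threshold, best_j = None, -1.0
    for threshold in sorted(set(y_score)):
        sensitivity, specificity = sensitivity_specificity(y_true, y_score, threshold)
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold


def threshold_for_min_sensitivity(y_true, y_score, min_sensitivity=0.95):
    """Highest threshold whose sensitivity is at least min_sensitivity."""
    for threshold in sorted(set(y_score), reverse=True):
        sensitivity, _ = sensitivity_specificity(y_true, y_score, threshold)
        if sensitivity >= min_sensitivity:
            return threshold
    return min(y_score)  # fall back to predicting everything positive


if __name__ == "__main__":
    # Toy validation labels and scores, purely illustrative.
    labels = [0, 0, 1, 1, 0, 1, 0]
    scores = [0.1, 0.4, 0.5, 0.8, 0.2, 0.9, 0.6]
    print(youden_threshold(labels, scores))
    print(threshold_for_min_sensitivity(labels, scores))
```

A sensitivity-constrained threshold favors avoiding missed pneumothoraces at the cost of more false positives, whereas Youden's J weights both error types equally; which trade-off is appropriate depends on the clinical setting.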
3. Results
3.1. Evaluation of Multi-Source Model on External Datasets
- Mean ± SD (macro-averaged): The arithmetic mean and standard deviation of each metric are taken across the individual datasets, so each dataset is weighted equally (irrespective of its number of samples). This is useful for evaluating model robustness across different dataset sources.
- Overall (micro-averaged): All test samples are combined into a single dataset before the metrics are calculated, so each image has equal weight irrespective of its source. This is useful for reflecting real-world performance but can hide biases in the smaller test datasets.
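The difference between the two aggregation schemes can be made concrete with a short sketch using sensitivity as the metric. The dataset names and counts below are invented for illustration and are not results from this study; the use of the sample standard deviation is also an assumption.

```python
from statistics import mean, stdev


def sensitivity(tp, fn):
    """Fraction of positive images that were detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def macro_average(per_dataset_counts):
    """Mean and sample SD of per-dataset sensitivity (each dataset weighted equally)."""
    values = [sensitivity(tp, fn) for tp, fn in per_dataset_counts.values()]
    spread = stdev(values) if len(values) > 1 else 0.0
    return mean(values), spread


def micro_average(per_dataset_counts):
    """Sensitivity over the pooled samples (each image weighted equally)."""
    total_tp = sum(tp for tp, _ in per_dataset_counts.values())
    total_fn = sum(fn for _, fn in per_dataset_counts.values())
    return sensitivity(total_tp, total_fn)


if __name__ == "__main__":
    # Hypothetical (true positives, false negatives) per external test set.
    counts = {"large_set": (90, 10), "small_set": (5, 5)}
    macro_mean, macro_sd = macro_average(counts)
    print(f"macro: {macro_mean:.3f} ± {macro_sd:.3f}")
    print(f"micro: {micro_average(counts):.3f}")
```

In this toy case the pooled sensitivity is dominated by the larger dataset and sits well above the macro mean, which shows how micro-averaging can mask weak performance on a small test set.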
3.2. Single-Source vs. Multi-Source Training
3.3. Threshold Optimization
4. Discussion
5. Deployment and Integrability
5.1. In-Device Integration
5.2. Server Integration with PACS-RIS Service
5.3. Local Web Dashboard
5.4. Cloud Dashboard
5.5. Mobile Integration
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
Abbreviation | Meaning |
---|---|
ACP | Acute Chest Pain |
ACR | American College of Radiology |
AES | Advanced Encryption Standard |
AI | Artificial Intelligence |
AUC | Area Under the Curve |
AP | Anteroposterior (radiographic projection); Average Precision |
AUROC | Area Under the Receiver Operating Characteristic Curve |
CNN | Convolutional Neural Network |
CSV | Comma Separated Values |
C-STORE | DICOM Composite Object Store Service Class |
CPU | Central Processing Unit |
CT | Computed Tomography |
DICOM | Digital Imaging and Communications in Medicine |
EMR | Electronic Medical Record |
EU | European Union |
FDA | Food and Drug Administration (U.S.) |
FHIR | Fast Healthcare Interoperability Resources |
GPU | Graphics Processing Unit |
HL7 | Health Level 7 |
HTTPS | Hypertext Transfer Protocol Secure |
IMIB | Instituto Murciano de Investigación Biosanitaria |
JPG | Joint Photographic Experts Group |
JSON | JavaScript Object Notation |
MDR | Medical Device Regulation |
MeSH | Medical Subject Headings |
MDPI | Multidisciplinary Digital Publishing Institute |
MRI | Magnetic Resonance Imaging |
NIH | National Institutes of Health (U.S.) |
NLP | Natural Language Processing |
NPV | Negative Predictive Value |
PA | Posteroanterior (radiographic projection) |
PACS | Picture Archiving and Communication System |
PHI | Personal Health Information |
PII | Personally Identifiable Information |
PNG | Portable Network Graphics |
PPV | Positive Predictive Value |
PR | Precision-Recall |
PTX | Pneumothorax |
RIS | Radiology Information System |
RGB | Red Green Blue |
ROC | Receiver Operating Characteristic |
SIIM | Society for Imaging Informatics in Medicine |
SOTA | State Of The Art |
TLS | Transport Layer Security |
UMLS | Unified Medical Language System |
US | United States |
VPC | Virtual Private Cloud |
WHO | World Health Organization |
References
- Wenderott, K.; Krups, J.; Zaruchas, F.; Weigl, M. Effects of Artificial Intelligence Implementation on Efficiency in Medical Imaging—A Systematic Literature Review and Meta-Analysis. npj Digit. Med. 2024, 7, 265. [Google Scholar] [CrossRef]
- Chen, X.; Wang, X.; Zhang, K.; Fung, K.-M.; Thai, T.C.; Moore, K.; Mannel, R.S.; Liu, H.; Zheng, B.; Qiu, Y. Recent Advances and Clinical Applications of Deep Learning in Medical Image Analysis. Med. Image Anal. 2022, 79, 102444. [Google Scholar] [CrossRef] [PubMed]
- Mazurowski, M.A.; Buda, M.; Saha, A.; Bashir, M.R. Deep Learning in Radiology: An Overview of the Concepts and a Survey of the State of the Art with Focus on MRI. J. Magn. Reson. Imaging JMRI 2019, 49, 939–954. [Google Scholar] [CrossRef]
- Choy, G.; Khalilzadeh, O.; Michalski, M.; Do, S.; Samir, A.E.; Pianykh, O.S.; Geis, J.R.; Pandharipande, P.V.; Brink, J.A.; Dreyer, K.J. Current Applications and Future Impact of Machine Learning in Radiology. Radiology 2018, 288, 318–328. [Google Scholar] [CrossRef] [PubMed]
- Kapoor, N.; Lacson, R.; Khorasani, R. Workflow Applications of Artificial Intelligence in Radiology and an Overview of Available Tools. J. Am. Coll. Radiol. 2020, 17, 1363–1370. [Google Scholar] [CrossRef]
- Selby, I.A.; González Solares, E.; Breger, A.; Roberts, M.; Escudero Sánchez, L.; Babar, J.; Rudd, J.H.F.; Walton, N.A.; Sala, E.; Schönlieb, C.-B.; et al. A Pipeline for Automated Quality Control of Chest Radiographs. Radiol. Artif. Intell. 2025, 7, e240003. [Google Scholar] [CrossRef] [PubMed]
- Hosny, A.; Parmar, C.; Quackenbush, J.; Schwartz, L.H.; Aerts, H.J.W.L. Artificial Intelligence in Radiology. Nat. Rev. Cancer 2018, 18, 500–510. [Google Scholar] [CrossRef]
- Hwang, E.J.; Goo, J.M.; Park, C.M. AI Applications for Thoracic Imaging: Considerations for Best Practice. Radiology 2025, 314, e240650. [Google Scholar] [CrossRef]
- Benjamens, S.; Dhunnoo, P.; Meskó, B. The State of Artificial Intelligence-Based FDA-Approved Medical Devices and Algorithms: An Online Database. npj Digit. Med. 2020, 3, 118. [Google Scholar] [CrossRef]
- Afshari Mirak, S.; Tirumani, S.H.; Ramaiya, N.; Mohamed, I. The Growing Nationwide Radiologist Shortage: Current Opportunities and Ongoing Challenges for International Medical Graduate Radiologists. Radiology 2025, 314, e232625. [Google Scholar] [CrossRef]
- World Health Organization (WHO). To X-Ray or Not to X-Ray? Available online: https://www.who.int/news-room/feature-stories/detail/to-x-ray-or-not-to-x-ray- (accessed on 27 April 2025).
- NHS England. Diagnostic Imaging Dataset Statistical Release: 24 April 2025; NHS England: London, UK, 2025. Available online: https://www.england.nhs.uk/statistics/wp-content/uploads/sites/2/2025/04/Statistical-Release-24th-April-2025.pdf (accessed on 27 April 2025).
- Sociedad Española de Radiología Médica (SERAM). Informe sobre la Radiografía Convencional. Versión 3; SERAM: Madrid, Spain, 2021; Available online: https://seram.es/wp-content/uploads/2021/09/informe_rx_simple_v3.pdf (accessed on 27 April 2025).
- Tajmir, S.H.; Alkasab, T.K. Toward Augmented Radiologists: Changes in Radiology Education in the Era of Machine Learning and Artificial Intelligence. Acad. Radiol. 2018, 25, 747–750. [Google Scholar] [CrossRef]
- Annarumma, M.; Withey, S.J.; Bakewell, R.J.; Pesce, E.; Goh, V.; Montana, G. Automated Triaging of Adult Chest Radiographs with Deep Artificial Neural Networks. Radiology 2019, 291, 196–202. [Google Scholar] [CrossRef] [PubMed]
- Hwang, E.J.; Nam, J.G.; Lim, W.H.; Park, S.J.; Jeong, Y.S.; Kang, J.H.; Hong, E.K.; Kim, T.M.; Goo, J.M.; Park, S.; et al. Deep Learning for Chest Radiograph Diagnosis in the Emergency Department. Radiology 2019, 293, 573–580. [Google Scholar] [CrossRef]
- Khader, F.; Han, T.; Müller-Franzes, G.; Huck, L.; Schad, P.; Keil, S.; Barzakova, E.; Schulze-Hagen, M.; Pedersoli, F.; Schulz, V.; et al. Artificial Intelligence for Clinical Interpretation of Bedside Chest Radiographs. Radiology 2022, 307, 220510. [Google Scholar] [CrossRef] [PubMed]
- Yun, J.; Ahn, Y.; Cho, K.; Oh, S.Y.; Lee, S.M.; Kim, N.; Seo, J.B. Deep Learning for Automated Triaging of Stable Chest Radiographs in a Follow-up Setting. Radiology 2023, 309, e230606. [Google Scholar] [CrossRef]
- Kolossváry, M.; Raghu, V.K.; Nagurney, J.T.; Hoffmann, U.; Lu, M.T. Deep Learning Analysis of Chest Radiographs to Triage Patients with Acute Chest Pain Syndrome. Radiology 2023, 306, e221926. [Google Scholar] [CrossRef] [PubMed]
- Bintcliffe, O.; Maskell, N. Spontaneous pneumothorax. BMJ 2014, 348, g2928. [Google Scholar] [CrossRef] [PubMed]
- O’Connor, A.R.; Morgan, W.E. Radiological Review of Pneumothorax. BMJ 2005, 330, 1493–1497. [Google Scholar] [CrossRef]
- Medical Devices; Radiology Devices; Classification of the Radiological Computer Aided Triage and Notification Software. Available online: https://www.federalregister.gov/documents/2020/01/22/2020-00496/medical-devices-radiology-devices-classification-of-the-radiological-computer-aided-triage (accessed on 3 May 2025).
- Electronic Code of Federal Regulations. U.S. Government Publishing Office. Code of Federal Regulations—21 CFR 892.2080—Radiological Computer Aided Triage and Notification Software. Available online: https://www.ecfr.gov/current/title-21/part-892/section-892.2080 (accessed on 3 May 2025).
- European Commission MDCG 2021-24—Guidance on Classification of Medical Devices—Annex VIII Rule 11. Available online: https://health.ec.europa.eu/latest-updates/mdcg-2021-24-guidance-classification-medical-devices-2021-10-04_en (accessed on 3 May 2025).
- Demner-Fushman, D.; Kohli, M.D.; Rosenman, M.B.; Shooshan, S.E.; Rodriguez, L.; Antani, S.; Thoma, G.R.; McDonald, C.J. OpenI Indiana Dataset: Preparing a Collection of Radiology Examinations for Distribution and Retrieval. J. Am. Med. Inform. Assoc. 2016, 23, 304–310. [Google Scholar] [CrossRef]
- Wang, X.; Peng, Y.; Lu, L.; Lu, Z.; Bagheri, M.; Summers, R.M. ChestX-Ray8: Hospital-Scale Chest X-Ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA, 21–26 July 2017; pp. 3462–3471. [Google Scholar] [CrossRef]
- Irvin, J.; Rajpurkar, P.; Ko, M.; Yu, Y.; Ciurea-Ilcus, S.; Chute, C.; Marklund, H.; Haghgoo, B.; Ball, R.; Shpanskaya, K.; et al. CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison. arXiv 2019, arXiv:1901.07031. [Google Scholar] [CrossRef]
- Johnson, A.E.W.; Pollard, T.J.; Berkowitz, S.J.; Greenbaum, N.R.; Lungren, M.P.; Deng, C.; Mark, R.G.; Horng, S. MIMIC-CXR: A Large Publicly Available Database of Labeled Chest Radiographs. arXiv 2019, arXiv:1901.07042. [Google Scholar] [CrossRef]
- Bustos, A.; Pertusa, A.; Salinas, J.-M.; de la Iglesia-Vayá, M. PadChest: A Large Chest x-Ray Image Dataset with Multi-Label Annotated Reports. arXiv 2019, arXiv:1901.07441. [Google Scholar] [CrossRef]
- Nguyen, H.Q.; Lam, K.; Le, L.T.; Pham, H.H.; Tran, D.Q.; Nguyen, D.B.; Le, D.D.; Pham, C.M.; Tong, H.T.T.; Dinh, D.H.; et al. VinDr-CXR: An Open Dataset of Chest X-Rays with Radiologist’s Annotations. Sci. Data 2022, 9, 429. [Google Scholar] [CrossRef]
- Oakden-Rayner, L. Exploring Large Scale Public Medical Image Datasets. Acad. Radiol. 2019, 27, 106–112. [Google Scholar] [CrossRef] [PubMed]
- Chambon, P.; Delbrouck, J.-B.; Sounack, T.; Huang, S.-C.; Chen, Z.; Varma, M.; Truong, S.Q.; Chuong, C.T.; Langlotz, C.P. CheXpert Plus: Augmenting a Large Chest X-Ray Dataset with Text Radiology Reports, Patient Demographics and Additional Image Formats. arXiv 2024, arXiv:2405.19538. [Google Scholar] [CrossRef]
- Oakden-Rayner, L.; Dunnmon, J.; Carneiro, G.; Ré, C. Hidden Stratification Causes Clinically Meaningful Failures in Machine Learning for Medical Imaging. arXiv 2019, arXiv:1909.12475. [Google Scholar] [CrossRef]
- Majkowska, A.; Mittal, S.; Steiner, D.F.; Reicher, J.J.; McKinney, S.M.; Duggan, G.E.; Eswaran, K.; Cameron Chen, P.-H.; Liu, Y.; Kalidindi, S.R.; et al. Chest Radiograph Interpretation with Deep Learning Models: Assessment with Radiologist-Adjudicated Reference Standards and Population-Adjusted Evaluation. Radiology 2019, 294, 421–431. [Google Scholar] [CrossRef] [PubMed]
- Nabulsi, Z.; Sellergren, A.; Jamshy, S.; Lau, C.; Santos, E.; Ye, W.; Yang, J.; Pilgrim, R.; Kazemzadeh, S.; Yu, J.; et al. Deep Learning for Distinguishing Normal versus Abnormal Chest Radiographs and Generalization to Unseen Diseases. Nat. Sci. Rep. 2021. [Google Scholar] [CrossRef]
- Damgaard, C.; Eriksen, T.N.; Juodelyte, D.; Cheplygina, V.; Jiménez-Sánchez, A. Augmenting Chest X-Ray Datasets with Non-Expert Annotations. arXiv 2023, arXiv:2309.02244. [Google Scholar] [CrossRef]
- Cheplygina, V.; Cathrine, D.; Eriksen, T.N.; Jiménez-Sánchez, A. NEATX: Non-Expert Annotations of Tubes in X-Rays. Zenodo 2025. [Google Scholar] [CrossRef]
- Hallinan, J.T.P.D.; Feng, M.; Ng, D.; Sia, S.Y.; Tiong, V.T.Y.; Jagmohan, P.; Makmur, A.; Thian, Y.L. Detection of Pneumothorax with Deep Learning Models: Learning From Radiologist Labels vs Natural Language Processing Model Generated Labels. Acad. Radiol. 2022, 29, 1350–1358. [Google Scholar] [CrossRef] [PubMed]
- Filice, R.W.; Stein, A.; Wu, C.C.; Arteaga, V.A.; Borstelmann, S.; Gaddikeri, R.; Galperin-Aizenberg, M.; Gill, R.R.; Godoy, M.C.; Hobbs, S.B.; et al. Crowdsourcing Pneumothorax Annotations Using Machine Learning Annotations on the NIH Chest X-Ray Dataset. J. Digit. Imaging 2020, 33, 490–496. [Google Scholar] [CrossRef]
- Goldberger, A.L.; Amaral, L.A.; Glass, L.; Hausdorff, J.M.; Ivanov, P.C.; Mark, R.G.; Mietus, J.E.; Moody, G.B.; Peng, C.K.; Stanley, H.E. PhysioBank, PhysioToolkit, and PhysioNet: Components of a New Research Resource for Complex Physiologic Signals. Circulation 2000, 101, E215–E220. [Google Scholar] [CrossRef] [PubMed]
- Pham, H.H.; Tran, T.T.; Nguyen, H.Q. VinDr-PCXR: An Open, Large-Scale Pediatric Chest X-Ray Dataset for Interpretation of Common Thoracic Diseases (Version 1.0.0). PhysioNet. RRID:SCR_007345. 2022. Available online: https://doi.org/10.13026/k8qc-na36 (accessed on 26 June 2025).
- Fan, W.; Yang, Y.; Qi, J.; Zhang, Q.; Liao, C.; Wen, L.; Wang, S.; Wang, G.; Xia, Y.; Wu, Q.; et al. A Deep-Learning-Based Framework for Identifying and Localizing Multiple Abnormalities and Assessing Cardiomegaly in Chest X-Ray. Nat. Commun. 2024, 15, 1347. [Google Scholar] [CrossRef]
- Feng, S. CANDID-II Dataset. 2022. Available online: https://doi.org/10.17608/k6.auckland.19606921.v1 (accessed on 26 June 2025).
- Stanford Center for AI in Medicine & Imaging. CheXpert Dataset. Available online: https://doi.org/10.71718/y7pj-4v93 (accessed on 28 May 2025).
- Johnson, A.; Pollard, T.; Mark, R.; Berkowitz, S.; Horng, S. MIMIC-CXR Database (Version 2.1.0). PhysioNet. RRID:SCR_007345. 2024. Available online: https://doi.org/10.13026/4jqj-jw95 (accessed on 26 June 2025).
- Nguyen, H.Q.; Pham, H.H.; Tuan Linh, L.; Dao, M.; Khanh, L. VinDr-CXR: An Open Dataset of Chest X-Rays with Radiologist Annotations (Version 1.0.0). PhysioNet. RRID:SCR_007345. 2021. Available online: https://doi.org/10.13026/3akn-b287 (accessed on 26 June 2025).
- SIIM-ACR Pneumothorax Segmentation. Available online: https://kaggle.com/competitions/siim-acr-pneumothorax-segmentation (accessed on 5 March 2023).
- Reis, E.P.; de Paiva, J.P.Q.; da Silva, M.C.B.; Ribeiro, G.A.S.; Paiva, V.F.; Bulgarelli, L.; Lee, H.M.H.; Santos, P.V.; Brito, V.M.; Amaral, L.T.W.; et al. BRAX, Brazilian Labeled Chest x-Ray Dataset. Sci. Data 2022, 9, 487. [Google Scholar] [CrossRef]
- Reis, E.P.; Paiva, J.; Bueno da Silva, M.C.; Sousa Ribeiro, G.A.; Fornasiero Paiva, V.; Bulgarelli, L.; Lee, H.; dos Santos, P.V.; brito v Amaral, L.; Beraldo, G.; et al. BRAX, a Brazilian Labeled Chest X-Ray Dataset (Version 1.1.0). PhysioNet. RRID:SCR_007345. 2022. Available online: https://doi.org/10.13026/grwk-yh18 (accessed on 26 June 2025).
- Indiana University; U.S. National Library of Medicine. Open-I: Open Access Biomedical Image Search Engine. Available online: https://openi.nlm.nih.gov/faq (accessed on 28 May 2025).
- Feng, S.; Azzollini, D.; Kim, J.S.; Jin, C.-K.; Gordon, S.P.; Yeoh, J.; Kim, E.; Han, M.; Lee, A.; Patel, A.; et al. Curation of the CANDID-PTX Dataset with Free-Text Reports. Radiol. Artif. Intell. 2021, 3, e210136. [Google Scholar] [CrossRef] [PubMed]
- CANDID-PTX Dataset. 2021. Available online: https://doi.org/10.17608/k6.auckland.14173982 (accessed on 28 May 2025).
- Wang, Y.; Wang, K.; Peng, X.; Shi, L.; Sun, J.; Zheng, S.; Shan, F.; Shi, W.; Liu, L. DeepSDM: Boundary-Aware Pneumothorax Segmentation in Chest X-Ray Images. Neurocomputing 2021, 454, 201–211. [Google Scholar] [CrossRef]
- Wang, Y. PTX-498: A Multi-Center Pneumothorax Segmentation Chest X-Ray Image Dataset. Zenodo 2021. [Google Scholar] [CrossRef]
- Zhang, Y.; Liu, M.; Hu, S.; Shen, Y.; Lan, J.; Jiang, B.; de Bock, G.H.; Vliegenthart, R.; Chen, X.; Xie, X. Development and Multicenter Validation of Chest X-Ray Radiography Interpretations Based on Natural Language Processing. Commun. Med. 2021, 1, 1–12. [Google Scholar] [CrossRef]
- Liu, M.; Xie, X. Chest Radiograph at Diverse Institutes (CRADI) Dataset. Zenodo 2021. [Google Scholar] [CrossRef]
- Development of a Digital Image Database for Chest Radiographs With and Without a Lung Nodule. Available online: https://www.ajronline.org/doi/epdf/10.2214/ajr.174.1.1740071 (accessed on 28 May 2025).
- Gohagan, J.K.; Prorok, P.C.; Greenwald, P.; Kramer, B.S. The PLCO Cancer Screening Trial: Background, Goals, Organization, Operations, Results. Rev. Recent Clin. Trials 2015, 10, 173–180. [Google Scholar] [CrossRef] [PubMed]
- National Lung Screening Trial Research Team Data from the National Lung Screening Trial (NLST). 2013. Available online: https://doi.org/10.7937/TCIA.HMQ8-J677 (accessed on 26 June 2025).
- Liu, Y.; Wu, Y.-H.; Ban, Y.; Wang, H.; Cheng, M.-M. TBX11K: Rethinking Computer-Aided Tuberculosis Diagnosis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 2646–2655. [Google Scholar] [CrossRef]
- Jaeger, S.; Candemir, S.; Antani, S.; Wáng, Y.-X.J.; Lu, P.-X.; Thoma, G. Two Public Chest X-Ray Datasets for Computer-Aided Screening of Pulmonary Diseases. Quant. Imaging Med. Surg. 2014, 4, 475–477. [Google Scholar] [CrossRef]
- Ogawa, R.; Kido, T.; Kido, T.; Mochizuki, T. Effect of Augmented Datasets on Deep Convolutional Neural Networks Applied to Chest Radiographs. Clin. Radiol. 2019, 74, 697–701. [Google Scholar] [CrossRef] [PubMed]
- Tan, M.; Le, Q.V. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. arXiv 2020, arXiv:1905.11946. [Google Scholar] [CrossRef]
- Huang, G.; Liu, Z.; van der Maaten, L.; Weinberger, K.Q. Densely Connected Convolutional Networks. arXiv 2016, arXiv:1608.06993. [Google Scholar] [CrossRef]
- Thian, Y.L.; Ng, D.; Hallinan, J.T.P.D.; Jagmohan, P.; Sia, S.Y.; Tan, C.H.; Ting, Y.H.; Kei, P.L.; Pulickal, G.G.; Tiong, V.T.Y.; et al. Deep Learning Systems for Pneumothorax Detection on Chest Radiographs: A Multicenter External Validation Study. Radiol. Artif. Intell. 2021, 3, e200190. [Google Scholar] [CrossRef] [PubMed]
- Haque, M.I.U.; Dubey, A.K.; Danciu, I.; Justice, A.C.; Ovchinnikova, O.S.; Hinkle, J.D. Effect of Image Resolution on Automated Classification of Chest X-Rays. J. Med. Imaging 2023, 10, 044503. [Google Scholar] [CrossRef]
- Pereira, S.C.; Rocha, J.; Campilho, A.; Sousa, P.; Mendonça, A.M. Lightweight Multi-Scale Classification of Chest Radiographs via Size-Specific Batch Normalization. Comput. Methods Programs Biomed. 2023, 236, 107558. [Google Scholar] [CrossRef]
- Wollek, A.; Hyska, S.; Sabel, B.; Ingrisch, M.; Lasser, T. Higher Chest X-Ray Resolution Improves Classification Performance. arXiv 2023, arXiv:2306.06051. [Google Scholar] [CrossRef]
- Comparison of Fine-Tuning Strategies for Transfer Learning in Medical Image Classification. Available online: https://arxiv.org/html/2406.10050v1 (accessed on 2 April 2025).
- Mosquera, C.; Ferrer, L.; Milone, D.H.; Luna, D.; Ferrante, E. Class Imbalance on Medical Image Classification: Towards Better Evaluation Practices for Discrimination and Calibration Performance. Eur. Radiol. 2024, 34, 7895–7903. [Google Scholar] [CrossRef]
- Cohen, J.P.; Hashir, M.; Brooks, R.; Bertrand, H. On the Limits of Cross-Domain Generalization in Automated X-Ray Prediction. arXiv 2020, arXiv:2002.02497. [Google Scholar] [CrossRef]
- Rueckel, J.; Huemmer, C.; Fieselmann, A.; Ghesu, F.-C.; Mansoor, A.; Schachtner, B.; Wesp, P.; Trappmann, L.; Munawwar, B.; Ricke, J.; et al. Pneumothorax Detection in Chest Radiographs: Optimizing Artificial Intelligence System for Accuracy and Confounding Bias Reduction Using in-Image Annotations in Algorithm Training. Eur. Radiol. 2021, 31, 7888–7900. [Google Scholar] [CrossRef] [PubMed]
- Pooch, E.H.P.; Ballester, P.L.; Barros, R.C. Can We Trust Deep Learning Models Diagnosis? The Impact of Domain Shift in Chest Radiograph Classification. arXiv 2019, arXiv:1909.01940. [Google Scholar] [CrossRef]
- Bercean, B.; Buburuzan, A.; Birhala, A.; Avramescu, C.; Tenescu, A.; Marcu, M. Breaking Down Covariate Shift on Pneumothorax Chest X-Ray Classification. In Proceedings of the Uncertainty for Safe Utilization of Machine Learning in Medical Imaging; Sudre, C.H., Baumgartner, C.F., Dalca, A., Mehta, R., Qin, C., Wells, W.M., Eds.; Springer Nature: Cham, Switzerland, 2023; pp. 157–166. [Google Scholar]
- Jiménez-Sánchez, A.; Juodelyte, D.; Chamberlain, B.; Cheplygina, V. Detecting Shortcuts in Medical Images—A Case Study in Chest X-Rays. In Proceedings of the 2023 IEEE 20th International Symposium on Biomedical Imaging (ISBI), Cartagena, Colombia, 18–21 April 2023; pp. 1–5. [Google Scholar]
- Seah, J.; Tang, C.; Buchlak, Q.D.; Milne, M.R.; Holt, X.; Ahmad, H.; Lambert, J.; Esmaili, N.; Oakden-Rayner, L.; Brotchie, P.; et al. Do Comprehensive Deep Learning Algorithms Suffer from Hidden Stratification? A Retrospective Study on Pneumothorax Detection in Chest Radiography. BMJ Open 2021, 11, e053024. [Google Scholar] [CrossRef] [PubMed]
- Rajpurkar, P.; Irvin, J.; Zhu, K.; Yang, B.; Mehta, H.; Duan, T.; Ding, D.; Bagul, A.; Langlotz, C.; Shpanskaya, K.; et al. CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning. arXiv 2017, arXiv:1711.05225. [Google Scholar]
- Taylor, A.G.; Mielke, C.; Mongan, J. Automated Detection of Moderate and Large Pneumothorax on Frontal Chest X-Rays Using Deep Convolutional Neural Networks: A Retrospective Study. PLOS Med. 2018, 15, e1002697. [Google Scholar] [CrossRef]
- Gündel, S.; Grbic, S.; Georgescu, B.; Liu, S.; Maier, A.; Comaniciu, D. Learning to Recognize Abnormalities in Chest X-Rays with Location-Aware Dense Networks. In Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications: CIARP 2018; Lecture Notes in Computer Science (LNIP, Volume 11401); Springer: Cham, Switzerland; pp. 757–765. [CrossRef]
- Cid, Y.D.; Macpherson, M.; Gervais-Andre, L.; Zhu, Y.; Franco, G.; Santeramo, R.; Lim, C.; Selby, I.; Muthuswamy, K.; Amlani, A.; et al. Development and Validation of Open-Source Deep Neural Networks for Comprehensive Chest X-Ray Reading: A Retrospective, Multicentre Study. Lancet Digit. Health 2024, 6, e44–e57.
- Wang, C.-H.; Lin, T.; Chen, G.; Lee, M.-R.; Tay, J.; Wu, C.-Y.; Wu, M.-C.; Roth, H.R.; Yang, D.; Zhao, C.; et al. Deep Learning-Based Diagnosis and Localization of Pneumothorax on Portable Supine Chest X-Ray in Intensive and Emergency Medicine: A Retrospective Study. J. Med. Syst. 2023, 48, 1.
- Hillis, J.M.; Bizzo, B.C.; Mercaldo, S.; Chin, J.K.; Newbury-Chaet, I.; Digumarthy, S.R.; Gilman, M.D.; Muse, V.V.; Bottrell, G.; Seah, J.C.Y.; et al. Evaluation of an Artificial Intelligence Model for Detection of Pneumothorax and Tension Pneumothorax in Chest Radiographs. JAMA Netw. Open 2022, 5, e2247172.
- Feng, S.; Liu, Q.; Patel, A.; Bazai, S.U.; Jin, C.-K.; Kim, J.S.; Sarrafzadeh, M.; Azzollini, D.; Yeoh, J.; Kim, E.; et al. Automated Pneumothorax Triaging in Chest X-Rays in the New Zealand Population Using Deep-Learning Algorithms. J. Med. Imaging Radiat. Oncol. 2022, 66, 1035–1043.
- Sze-To, A.; Riasatian, A.; Tizhoosh, H.R. Searching for Pneumothorax in X-Ray Images Using Autoencoded Deep Features. Sci. Rep. 2021, 11, 9817.
- Wollek, A.; Graf, R.; Čečatka, S.; Fink, N.; Willem, T.; Sabel, B.O.; Lasser, T. Attention-Based Saliency Maps Improve Interpretability of Pneumothorax Classification. Radiol. Artif. Intell. 2023, 5, e220187.
- Arun, N.; Gaw, N.; Singh, P.; Chang, K.; Aggarwal, M.; Chen, B.; Hoebel, K.; Gupta, S.; Patel, J.; Gidwani, M.; et al. Assessing the Trustworthiness of Saliency Maps for Localizing Abnormalities in Medical Imaging. Radiol. Artif. Intell. 2021, 3, e200267.
- Zhou, L.; Yin, X.; Zhang, T.; Feng, Y.; Zhao, Y.; Jin, M.; Peng, M.; Xing, C.; Li, F.; Wang, Z.; et al. Detection and Semiquantitative Analysis of Cardiomegaly, Pneumothorax, and Pleural Effusion on Chest Radiographs. Radiol. Artif. Intell. 2021, 3, e200172.
- David, R.; Duke, J.; Jain, A.; Reddi, V.J.; Jeffries, N.; Li, J.; Kreeger, N.; Nappier, I.; Natraj, M.; Wang, T.; et al. TensorFlow Lite Micro: Embedded Machine Learning for TinyML Systems. Proc. Mach. Learn. Syst. 2021, 3, 800–811.
Dataset | Source | Date | Projections | # Patients | # Images | # Pneumothorax | # Normal |
---|---|---|---|---|---|---|---|
NIH | National Institute of Health, USA | 2016 | Frontal | 30,805 | 112,120 | 5302 (4.73%) | 60,361 (53.83%) |
CheXpert | Stanford University, USA | 2019 | Frontal/Lateral | 65,240 | 224,316 | 19,466 (8.68%) | 22,528 (10.04%) |
MIMIC-CXR | MIT, USA | 2019 | Frontal/Lateral | 65,379 | 377,110 | 14,239 (3.78%) | 143,363 (38.02%) |
PadChest | Universidad de Alicante, Spain | 2019 | Frontal/Lateral/Others | 69,882 | 160,868 | 851 (0.52%) | 50,616 (31.47%) |
VinDr-CXR | VinBigBrain Group, Vietnam | 2020 | Frontal | 18,000 | 18,000 | 76 (0.42%) | 12,657 (70.3%) |
SIIM-ACR | SIIM-ACR Pneumothorax Challenge, USA | 2019 | Frontal | 12,047 | 12,047 | 2669 (22%) | 9378 (78%) |
BRAX | Hospital Israelita Albert Einstein, Brazil | 2022 | Frontal/Lateral | 18,442 | 40,967 | 214 (0.52%) | 29,009 (71%) |
CANDID-PTX | Dunedin Hospital, New Zealand | 2021 | Frontal | 13,744 | 19,237 | 3196 (16.61%) | - |
Indiana | Indiana University, USA | 2015 | Frontal/Lateral | 3851 | 7470 | 54 (0.72%) | 2696 (36.09%) |
PTX-498 | Shanghai, China | 2021 | Frontal | 498 | 498 | 498 (100%) | 0 (0%) |
CRADI | Shanghai, China | 2021 | Frontal | - (10,440 ¹) | 74,082 (10,440 ¹) | 201 (1.93%) | 2737 (26.22%) |
Dataset | # Images | # Pneumothorax |
---|---|---|
NIH | 52,157 | 5036 (9.66%) |
CheXpert | 93,797 | 16,962 (18.08%) |
MIMIC-CXR | 112,806 | 10,794 (9.57%) |
PadChest | 49,100 | 393 (0.8%) |
VinDr-CXR | 8006 | 84 (1.05%) |
SIIM-ACR | 6703 | 2572 (38.37%) |
Total | 322,569 | 35,841 (11.1%) |
Dataset | # Images | # Pneumothorax |
---|---|---|
BRAX | 19,429 | 158 (0.81%) |
CANDID-PTX | 19,237 | 3196 (16.61%) |
Indiana | 3822 | 28 (0.73%) |
PTX-498 | 498 | 498 (100%) |
CRADI | 10,440 | 201 (1.93%) |
Total | 53,426 | 4081 (7.64%) |
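The totals and prevalences in the external test table follow directly from the per-dataset counts. The short check below is only an illustrative sketch: the counts are copied from the table, and the variable names are hypothetical.

```python
# External test sets as (images, pneumothorax cases), copied from the table above.
test_sets = {
    "BRAX": (19_429, 158),
    "CANDID-PTX": (19_237, 3_196),
    "Indiana": (3_822, 28),
    "PTX-498": (498, 498),
    "CRADI": (10_440, 201),
}

total_images = sum(images for images, _ in test_sets.values())
total_ptx = sum(cases for _, cases in test_sets.values())

# Prevalence is the share of images labelled as pneumothorax.
for name, (images, cases) in test_sets.items():
    print(f"{name:<11} {100 * cases / images:6.2f}%")
print(f"{'Total':<11} {100 * total_ptx / total_images:6.2f}%  "
      f"({total_ptx}/{total_images})")
```

The pooled prevalence is dominated by CANDID-PTX and PTX-498, which together contribute most of the positive cases.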
Dataset | Prevalence | Sensitivity (Recall) | Specificity | PPV (Precision) | NPV | F1-score | ROC-AUC | AP (PR-AUC) |
---|---|---|---|---|---|---|---|---|
BRAX | 0.8% (158/19,429) | 0.538 (0.464–0.617) | 0.953 (0.949–0.956) | 0.085 (0.068–0.101) | 0.996 (0.995–0.997) | 0.147 (0.119–0.173) | 0.846 (0.813–0.881) | 0.199 (0.137–0.264) |
CANDID-PTX | 16.6% (3196/19,237) | 0.874 (0.862–0.886) | 0.929 (0.925–0.933) | 0.71 (0.695–0.725) | 0.974 (0.971–0.976) | 0.783 (0.772–0.794) | 0.959 (0.955–0.963) | 0.892 (0.883–0.9) |
CRADI | 1.9% (201/10,440) | 0.902 (0.859–0.94) | 0.96 (0.956–0.963) | 0.305 (0.266–0.342) | 0.998 (0.997–0.999) | 0.455 (0.41–0.497) | 0.972 (0.956–0.984) | 0.726 (0.661–0.789) |
Indiana | 0.7% (28/3822) | 0.751 (0.579–0.897) | 0.941 (0.933–0.948) | 0.086 (0.054–0.126) | 0.998 (0.996–0.999) | 0.154 (0.1–0.218) | 0.899 (0.812–0.967) | 0.536 (0.344–0.704) |
PTX-498 ¹ | 100.0% (498/498) | 0.881 (0.851–0.91) | - | 1.0 (1.0–1.0) | - | - | - | - |
Macro-Average ² | 7.6% (4081/53,426) | 0.789 (0.637–0.942) | 0.946 (0.932–0.959) | 0.437 (0.032–0.842) | 0.992 (0.980–1.003) | 0.385 (0.083–0.687) | 0.919 (0.861–0.977) | 0.588 (0.291–0.886) |
Overall ³ (micro-average) | 7.6% (4081/53,426) | 0.862 (0.852–0.872) | 0.945 (0.943–0.947) | 0.566 (0.554–0.577) | 0.988 (0.987–0.989) | 0.683 (0.673–0.693) | 0.961 (0.957–0.964) | 0.825 (0.814–0.834) |
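The two summary rows of the table above answer different questions: the macro-average weights every dataset equally, whereas the overall (micro-average) figure pools all images, so large datasets dominate. The sketch below illustrates the difference for sensitivity using the rounded per-dataset values from the table. Two assumptions are made here and are not stated in the table itself: the pooled sensitivity is approximated by weighting each dataset's sensitivity by its number of pneumothorax cases, and the bracketed range on the macro row is read as the mean plus or minus one sample standard deviation, which is what the tabulated numbers are consistent with.

```python
from statistics import mean, stdev

# Per-dataset sensitivity and pneumothorax counts, copied from the table above.
sensitivity = {"BRAX": 0.538, "CANDID-PTX": 0.874, "CRADI": 0.902,
               "Indiana": 0.751, "PTX-498": 0.881}
positives = {"BRAX": 158, "CANDID-PTX": 3196, "CRADI": 201,
             "Indiana": 28, "PTX-498": 498}

values = list(sensitivity.values())

# Macro-average: unweighted mean over datasets.
macro = mean(values)
# Assumed reading of the bracketed range: mean +/- one sample standard deviation.
spread = stdev(values)

# Micro-average (approximation): weight each dataset by its positive cases,
# which is equivalent to pooling the true positives and false negatives.
micro = (sum(sensitivity[d] * positives[d] for d in sensitivity)
         / sum(positives.values()))

print(f"macro = {macro:.3f} ({macro - spread:.3f} to {macro + spread:.3f})")
print(f"micro = {micro:.3f}")
```

Because CANDID-PTX holds most of the positive cases and has high sensitivity, the pooled figure sits above the macro-average, which is pulled down by the small BRAX and Indiana sets.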
Class | Prevalence | Sensitivity (Recall) | Specificity | PPV (Precision) | NPV | F1-score | ROC-AUC | AP (PR-AUC) |
---|---|---|---|---|---|---|---|---|
Pneumothorax | 7.6% (4081/53,426) | 0.862 (0.852–0.872) | 0.945 (0.943–0.947) | 0.566 (0.554–0.577) | 0.988 (0.987–0.989) | 0.683 (0.673–0.693) | 0.961 (0.957–0.964) | 0.825 (0.814–0.834) |
Support devices | 14.2% (7593/53,426) | 0.014 (0.012–0.017) | 0.999 (0.999–1.0) | 0.785 (0.721–0.855) | 0.86 (0.856–0.863) | 0.028 (0.023–0.033) | 0.763 (0.757–0.769) | 0.449 (0.437–0.461) |
No finding | 72.1% (38,527/53,426) | 0.558 (0.553–0.563) | 0.646 (0.639–0.654) | 0.803 (0.798–0.808) | 0.361 (0.356–0.367) | 0.658 (0.654–0.663) | 0.653 (0.648–0.658) | 0.814 (0.81–0.819) |
AP | CheXpert | MIMIC-CXR | NIH | PadChest | SIIM-ACR | VinDr-CXR | Multi-Source (All) |
---|---|---|---|---|---|---|---|
BRAX | 0.211 (0.148–0.278) | 0.2 (0.139–0.265) | 0.173 (0.117–0.233) | 0.065 (0.036–0.101) | 0.05 (0.033–0.073) | 0.008 (0.006–0.009) | 0.199 (0.137–0.264) |
CANDID-PTX | 0.841 (0.83–0.852) | 0.829 (0.817–0.84) | 0.753 (0.74–0.766) | 0.619 (0.601–0.637) | 0.645 (0.628–0.663) | 0.424 (0.407–0.442) | 0.892 (0.883–0.9) |
CRADI | 0.361 (0.292–0.433) | 0.615 (0.544–0.682) | 0.542 (0.471–0.615) | 0.287 (0.215–0.353) | 0.514 (0.444–0.583) | 0.122 (0.094–0.156) | 0.726 (0.661–0.789) |
Indiana | 0.356 (0.186–0.531) | 0.346 (0.174–0.514) | 0.273 (0.122–0.422) | 0.114 (0.035–0.23) | 0.085 (0.039–0.148) | 0.07 (0.026–0.143) | 0.536 (0.344–0.704) |
Macro-Average | 0.442 (0.167–0.717) | 0.497 (0.218–0.777) | 0.435 (0.172–0.698) | 0.271 (0.021–0.522) | 0.324 (0.023–0.624) | 0.156 (−0.029–0.341) | 0.588 (0.291–0.886) |
Overall | 0.741 (0.728–0.753) | 0.724 (0.711–0.737) | 0.679 (0.667–0.692) | 0.44 (0.425–0.455) | 0.569 (0.552–0.585) | 0.218 (0.208–0.227) | 0.825 (0.814–0.834) |
ROC-AUC | CheXpert | MIMIC-CXR | NIH | PadChest | SIIM-ACR | VinDr-CXR | Multi-Source (All) |
---|---|---|---|---|---|---|---|
BRAX | 0.764 (0.722–0.809) | 0.796 (0.757–0.836) | 0.829 (0.794–0.864) | 0.704 (0.658–0.75) | 0.769 (0.728–0.809) | 0.483 (0.438–0.526) | 0.828 (0.792–0.865) |
CANDID-PTX | 0.945 (0.94–0.949) | 0.937 (0.932–0.942) | 0.901 (0.894–0.907) | 0.839 (0.83–0.847) | 0.863 (0.856–0.87) | 0.758 (0.749–0.767) | 0.962 (0.958–0.966) |
CRADI | 0.914 (0.889–0.936) | 0.954 (0.935–0.97) | 0.924 (0.901–0.944) | 0.849 (0.818–0.875) | 0.886 (0.856–0.915) | 0.831 (0.803–0.858) | 0.969 (0.952–0.982) |
Indiana | 0.934 (0.852–0.983) | 0.9 (0.834–0.952) | 0.847 (0.751–0.927) | 0.748 (0.626–0.836) | 0.841 (0.743–0.92) | 0.851 (0.787–0.906) | 0.896 (0.809–0.965) |
Macro-Average | 0.889 (0.805–0.974) | 0.897 (0.826–0.968) | 0.875 (0.831–0.920) | 0.785 (0.714–0.856) | 0.840 (0.789–0.890) | 0.731 (0.561–0.901) | 0.914 (0.848–0.980) |
Overall | 0.941 (0.937–0.945) | 0.926 (0.921–0.931) | 0.916 (0.91–0.921) | 0.828 (0.822–0.835) | 0.895 (0.89–0.9) | 0.783 (0.776–0.79) | 0.961 (0.958–0.965) |
Sensitivity | CheXpert | MIMIC-CXR | NIH | PadChest | SIIM-ACR | VinDr-CXR | Multi-Source (All) |
---|---|---|---|---|---|---|---|
BRAX | 0.513 (0.443–0.589) | 0.463 (0.39–0.543) | 0.361 (0.295–0.43) | 0.278 (0.208–0.346) | 0.629 (0.556–0.704) | 0.032 (0.006–0.061) | 0.506 (0.434–0.582) |
CANDID-PTX | 0.919 (0.909–0.929) | 0.72 (0.705–0.736) | 0.678 (0.662–0.695) | 0.464 (0.447–0.482) | 0.925 (0.916–0.934) | 0.241 (0.227–0.255) | 0.878 (0.866–0.889) |
CRADI | 0.905 (0.861–0.942) | 0.836 (0.785–0.884) | 0.726 (0.665–0.784) | 0.612 (0.547–0.672) | 0.851 (0.803–0.898) | 0.094 (0.056–0.136) | 0.906 (0.865–0.944) |
Indiana | 0.931 (0.823–1.0) | 0.573 (0.385–0.75) | 0.574 (0.387–0.75) | 0.357 (0.2–0.524) | 0.823 (0.667–0.957) | 0.251 (0.097–0.421) | 0.751 (0.579–0.897) |
PTX-498 | 0.958 (0.938–0.976) | 0.794 (0.757–0.831) | 0.663 (0.62–0.707) | 0.556 (0.512–0.598) | 0.844 (0.809–0.876) | 0.348 (0.305–0.39) | 0.871 (0.839–0.9) |
Macro-Average | 0.845 (0.658–1.032) | 0.677 (0.521–0.833) | 0.600 (0.456–0.745) | 0.453 (0.316–0.591) | 0.814 (0.704–0.925) | 0.193 (0.065–0.321) | 0.782 (0.617–0.948) |
Overall | 0.908 (0.898–0.916) | 0.723 (0.709–0.737) | 0.666 (0.651–0.68) | 0.475 (0.46–0.489) | 0.899 (0.89–0.909) | 0.238 (0.225–0.251) | 0.863 (0.852–0.873) |
Specificity | CheXpert | MIMIC-CXR | NIH | PadChest | SIIM-ACR | VinDr-CXR | Multi-Source (All) |
---|---|---|---|---|---|---|---|
BRAX | 0.872 (0.867–0.877) | 0.959 (0.956–0.962) | 0.98 (0.978–0.982) | 0.95 (0.947–0.953) | 0.793 (0.788–0.799) | 0.92 (0.916–0.924) | 0.956 (0.952–0.958) |
CANDID-PTX | 0.812 (0.806–0.818) | 0.961 (0.958–0.963) | 0.934 (0.93–0.938) | 0.957 (0.954–0.96) | 0.501 (0.494–0.509) | 0.962 (0.959–0.965) | 0.933 (0.929–0.937) |
CRADI | 0.724 (0.716–0.734) | 0.956 (0.952–0.96) | 0.97 (0.967–0.973) | 0.909 (0.904–0.915) | 0.723 (0.715–0.732) | 0.991 (0.989–0.993) | 0.952 (0.948–0.956) |
Indiana | 0.883 (0.873–0.893) | 0.963 (0.957–0.969) | 0.957 (0.95–0.963) | 0.94 (0.933–0.947) | 0.691 (0.677–0.705) | 0.982 (0.978–0.986) | 0.934 (0.926–0.941) |
Macro-Average | 0.823 (0.750–0.896) | 0.960 (0.957–0.963) | 0.960 (0.940–0.980) | 0.939 (0.918–0.960) | 0.677 (0.552–0.802) | 0.964 (0.932–0.995) | 0.944 (0.932–0.956) |
Overall | 0.823 (0.819–0.826) | 0.959 (0.957–0.961) | 0.961 (0.96–0.963) | 0.943 (0.941–0.945) | 0.676 (0.672–0.68) | 0.953 (0.951–0.955) | 0.946 (0.944–0.948) |
PPV-Precision | CheXpert | MIMIC-CXR | NIH | PadChest | SIIM-ACR | VinDr-CXR | Multi-Source (All) |
---|---|---|---|---|---|---|---|
BRAX | 0.032 (0.025–0.038) | 0.085 (0.066–0.103) | 0.13 (0.1–0.161) | 0.044 (0.031–0.056) | 0.024 (0.02–0.029) | 0.003 (0.001–0.006) | 0.085 (0.067–0.102) |
CANDID-PTX | 0.494 (0.481–0.506) | 0.784 (0.771–0.799) | 0.674 (0.658–0.689) | 0.682 (0.662–0.702) | 0.27 (0.262–0.278) | 0.555 (0.529–0.579) | 0.724 (0.71–0.739) |
CRADI | 0.061 (0.052–0.069) | 0.271 (0.237–0.306) | 0.322 (0.281–0.366) | 0.117 (0.098–0.136) | 0.057 (0.049–0.066) | 0.173 (0.108–0.242) | 0.27 (0.236–0.304) |
Indiana | 0.056 (0.037–0.077) | 0.105 (0.06–0.159) | 0.09 (0.051–0.129) | 0.043 (0.019–0.066) | 0.02 (0.012–0.028) | 0.096 (0.032–0.173) | 0.078 (0.048–0.114) |
Macro-Average | 0.329 (−0.093–0.751) | 0.449 (0.031–0.867) | 0.443 (0.056–0.831) | 0.377 (−0.062–0.816) | 0.274 (−0.144–0.693) | 0.365 (−0.047–0.778) | 0.431 (0.019–0.844) |
Overall | 0.298 (0.29–0.305) | 0.594 (0.58–0.608) | 0.588 (0.573–0.602) | 0.408 (0.395–0.422) | 0.187 (0.181–0.192) | 0.297 (0.28–0.311) | 0.569 (0.556–0.58) |
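The very low PPV on BRAX and Indiana, compared with CANDID-PTX, is largely a consequence of prevalence rather than of the classifier alone. As a rough illustration, PPV can be recomputed from sensitivity, specificity, and prevalence with Bayes' rule. The sketch below uses the rounded multi-source values from the sensitivity and specificity tables, so the results only approximate the tabulated PPVs; the function name is hypothetical.

```python
def ppv_from_rates(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value via Bayes' rule.

    PPV = sens * prev / (sens * prev + (1 - spec) * (1 - prev))
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Multi-source model, rounded values taken from the tables in this section.
brax = ppv_from_rates(0.506, 0.956, 158 / 19_429)        # prevalence ~0.8%
candid = ppv_from_rates(0.878, 0.933, 3_196 / 19_237)    # prevalence ~16.6%

print(f"BRAX PPV       ~ {brax:.3f}")
print(f"CANDID-PTX PPV ~ {candid:.3f}")
```

With similar specificity on both datasets, the roughly twentyfold difference in prevalence accounts for most of the gap between a PPV near 0.09 and one near 0.72.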
F1-score | CheXpert | MIMIC-CXR | NIH | PadChest | SIIM-ACR | VinDr-CXR | Multi-Source (All) |
---|---|---|---|---|---|---|---|
BRAX | 0.06 (0.048–0.072) | 0.143 (0.114–0.171) | 0.191 (0.151–0.233) | 0.076 (0.055–0.096) | 0.047 (0.038–0.055) | 0.006 (0.001–0.012) | 0.146 (0.118–0.173) |
CANDID-PTX | 0.643 (0.631–0.654) | 0.751 (0.739–0.763) | 0.676 (0.664–0.689) | 0.553 (0.536–0.568) | 0.418 (0.408–0.428) | 0.336 (0.319–0.353) | 0.794 (0.783–0.804) |
CRADI | 0.114 (0.099–0.128) | 0.409 (0.365–0.451) | 0.446 (0.398–0.494) | 0.196 (0.167–0.226) | 0.107 (0.092–0.122) | 0.121 (0.074–0.173) | 0.416 (0.372–0.457) |
Indiana | 0.105 (0.071–0.143) | 0.176 (0.104–0.256) | 0.154 (0.092–0.218) | 0.076 (0.034–0.116) | 0.038 (0.024–0.054) | 0.138 (0.048–0.235) | 0.141 (0.088–0.2) |
Macro-Average | 0.231 (−0.046–0.507) | 0.370 (0.089–0.650) | 0.367 (0.123–0.610) | 0.225 (0.000–0.451) | 0.152 (−0.027–0.332) | 0.150 (0.013–0.287) | 0.374 (0.066–0.682) |
Overall | 0.448 (0.439–0.457) | 0.652 (0.641–0.664) | 0.624 (0.612–0.636) | 0.439 (0.426–0.452) | 0.309 (0.302–0.316) | 0.264 (0.251–0.278) | 0.686 (0.675–0.695) |
Threshold | PPV (Precision) | Sensitivity (Recall) | Specificity | NPV | F1-score | TP | FP | FN | TN | Total |
---|---|---|---|---|---|---|---|---|---|---|
0.0 | 0.0506 | 1.0000 | 0.0000 | 0.0000 | 0.0963 | 1393 | 26,142 | 0 | 0 | 27,535 |
0.1 | 0.1665 | 0.9017 | 0.7595 | 0.9931 | 0.2811 | 1256 | 6287 | 137 | 19,855 | 27,535 |
0.2 | 0.2432 | 0.8241 | 0.8633 | 0.9893 | 0.3755 | 1148 | 3573 | 245 | 22,569 | 27,535 |
0.3 | 0.3103 | 0.7645 | 0.9095 | 0.9864 | 0.4415 | 1065 | 2367 | 328 | 23,775 | 27,535 |
0.4 | 0.3663 | 0.7064 | 0.9349 | 0.9835 | 0.4825 | 984 | 1702 | 409 | 24,440 | 27,535 |
0.5 | 0.4181 | 0.6490 | 0.9519 | 0.9807 | 0.5086 | 904 | 1258 | 489 | 24,884 | 27,535 |
0.6 | 0.4955 | 0.5937 | 0.9678 | 0.9781 | 0.5402 | 827 | 842 | 566 | 25,300 | 27,535 |
0.7 | 0.5604 | 0.5126 | 0.9786 | 0.9741 | 0.5354 | 714 | 560 | 679 | 25,582 | 27,535 |
0.8 | 0.6366 | 0.4049 | 0.9877 | 0.9689 | 0.4950 | 564 | 322 | 829 | 25,820 | 27,535 |
0.9 | 0.7715 | 0.2764 | 0.9956 | 0.9627 | 0.4070 | 385 | 114 | 1008 | 26,028 | 27,535 |
1.0 | 0.0000 | 0.0000 | 1.0000 | 0.9494 | 0.0000 | 0 | 0 | 1393 | 26,142 | 27,535 |
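Every metric column in the threshold table above is a function of the four confusion-matrix counts in the same row. The following minimal sketch recomputes the row for threshold 0.5 from its TP, FP, FN, and TN counts; the helper name is hypothetical and the formulas are the standard definitions.

```python
def confusion_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts."""

    def ratio(num: float, den: float) -> float:
        # A zero denominator means the metric is undefined; return NaN.
        return num / den if den else float("nan")

    return {
        "ppv": ratio(tp, tp + fp),               # precision
        "sensitivity": ratio(tp, tp + fn),       # recall
        "specificity": ratio(tn, tn + fp),
        "npv": ratio(tn, tn + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Row for threshold 0.5 in the table above.
row = confusion_metrics(tp=904, fp=1258, fn=489, tn=24_884)
for name, value in row.items():
    print(f"{name:<12} {value:.4f}")
```

At the 0.0 and 1.0 extremes some denominators are zero (no predicted negatives or no predicted positives). The table reports 0.0000 in those cells, whereas this sketch returns NaN to make the undefined case explicit.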
Target Metric | Target Value | Achieved Metric | Achieved Value | Threshold |
---|---|---|---|---|
Recall | 0.80 | Precision | 0.264 | 0.232 |
Recall | 0.85 | Precision | 0.213 | 0.160 |
Recall | 0.90 | Precision | 0.167 | 0.101 |
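The table above pairs each target recall with the precision obtained at the corresponding threshold. One simple way to find such an operating point is to lower the threshold over the sorted scores until the target recall is reached. The sketch below is an illustration only, not necessarily the procedure used to produce the table; the function name and the toy labels and scores are hypothetical.

```python
def threshold_for_recall(y_true, scores, target_recall):
    """Return (threshold, precision, recall) for the highest threshold whose
    recall is at least target_recall, predicting positive when score >= threshold."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive example")

    # Walk the examples from highest to lowest score.
    ranked = sorted(zip(scores, y_true), reverse=True)
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every example tied at this score before evaluating.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        recall = tp / n_pos
        if recall >= target_recall:
            return threshold, tp / (tp + fp), recall
    raise ValueError("target_recall must be at most 1.0")


# Hypothetical toy data: 5 positives and 5 negatives.
y_true = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]
scores = [0.95, 0.90, 0.85, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10]

for target in (0.8, 1.0):
    thr, prec, rec = threshold_for_recall(y_true, scores, target)
    print(f"target recall {target:.2f}: threshold={thr:.2f}, "
          f"precision={prec:.3f}, recall={rec:.2f}")
```

As in the table, demanding a higher recall pushes the threshold down and admits more false positives, so precision falls.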
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Ibañez Caturla, S.; Berná Mestre, J.d.D.; Mozos, O.M. Towards Automatic Detection of Pneumothorax in Emergency Care with Deep Learning Using Multi-Source Chest X-ray Data. Future Internet 2025, 17, 292. https://doi.org/10.3390/fi17070292