International Multicenter Validation of an Expanded AI Diagnostic System for 18 Pathologies in Thoracic and Musculoskeletal Radiography
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Design
2.2. Inclusion and Exclusion Criteria
2.2.1. Inclusion Criteria
- Examinations acquired via direct digital radiography (DR) or computed radiography (CR) systems, including bedside imaging performed with portable units. Radiographic fluoroscopy (RF) images from musculoskeletal (MSK) examinations were also included if they were in DICOM format.
- Thoracic Imaging: All chest X-rays from inpatient and outpatient settings, including anteroposterior (AP), posteroanterior (PA), and lateral views. All patient positions were included: standing, sitting, and supine, as well as lateral decubitus views when available.
- MSK Imaging: The entire appendicular and axial skeleton, including upper and lower limbs, pelvis, and the spine (cervical, thoracic, lumbar), encompassing all standard radiographic projections, weight-bearing (under load) and non-weight-bearing positions.
- Patient Demographics: There was no age restriction. The study intentionally included the full pediatric spectrum, including neonates and infants under 1 year of age, as well as adult and geriatric populations. The Rayvolve® AI suite is designed and CE-marked for use across all age groups without restriction. Its performance in pediatric populations, characterized by complex skeletal maturation and open growth plates, has been specifically validated in previous multicenter studies [23,26], while its thoracic capabilities have been established across broad clinical cohorts [25].
2.2.2. Exclusion Criteria and “Real-World” Quality Paradigm
- Anatomical Exclusions: Radiographs of the skull, facial bones, and dental imaging were excluded as these regions are not currently supported by the Rayvolve® AI suite.
- No Technical Exclusions: No images were excluded based on technical quality. To evaluate the system’s performance within a natural distribution of clinical data, radiographs with suboptimal exposure (over/under-exposed), patient rotation, or overlapping medical devices (e.g., tubes, lines, or implants) were retained in the final cohort. This approach was adopted to avoid “cherry-picking” bias and to rigorously assess the AI’s robustness against the inherent heterogeneity of global radiological workflows.
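
The inclusion and exclusion rules above can be sketched as a single eligibility check. This is a minimal illustration, not the study's pipeline: the `Exam` record, the region labels, and the use of the DICOM Modality codes `CR`, `DX`, and `RF` as filter keys are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed DICOM Modality codes: CR (computed radiography), DX (digital radiography).
RADIOGRAPHY_MODALITIES = {"CR", "DX"}
# Hypothetical region labels; the study's metadata vocabulary is not given here.
EXCLUDED_REGIONS = {"skull", "facial bones", "dental"}


@dataclass(frozen=True)
class Exam:
    modality: str                              # DICOM Modality code
    region: str                                # body region label
    is_dicom: bool = True
    patient_age_years: Optional[float] = None  # deliberately ignored: no age restriction
    quality_flag: Optional[str] = None         # deliberately ignored: no technical exclusions


def is_eligible(exam: Exam) -> bool:
    """Return True if the exam satisfies the stated inclusion/exclusion criteria."""
    region = exam.region.strip().lower()
    # Anatomical exclusions: regions not supported by the AI suite.
    if region in EXCLUDED_REGIONS:
        return False
    # DR and CR acquisitions are included, portable units as well.
    if exam.modality in RADIOGRAPHY_MODALITIES:
        return True
    # RF images are included only for MSK examinations stored as DICOM.
    if exam.modality == "RF":
        return exam.is_dicom and region != "chest"
    return False
```

Age and image quality appear in the record but are never consulted, which mirrors the "no age restriction" and "no technical exclusions" rules.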
2.2.3. Metadata and Metadata Extraction
2.3. AI System Description and Inference
2.4. Reference Standard Determination
2.4.1. Expert Panel and Task Distribution
2.4.2. Annotation Methodology
2.4.3. Blinding and Mitigation of Bias
- Readers had no access to the AI suite’s predictions, bounding boxes, or confidence scores.
- All readers were blinded to original clinical indications, patient history, referral notes, and previous radiological reports.
- No access was provided to collateral imaging (e.g., CT or MRI) or longitudinal follow-up data.
- Inter-reader Blinding: Each of the first two readers was blinded to the other’s findings during the initial independent phase.
2.5. Data Analysis Plan and Statistical Analysis
3. Results
3.1. Dataset Characteristics
3.2. Primary Objective: Performance Across the Expanded Diagnostic Scope
3.3. Secondary Objective: Performance Across Historically Validated Scopes
3.4. Subgroup Performance and Generalizability
4. Discussion
4.1. Global Resilience and Algorithmic Stability
4.2. From Point Solutions to Unified Diagnostic Horizons
4.3. Clinical Significance in MSK
4.4. Thoracic Performance and the Safety Net Paradigm
4.5. Challenges in the Analysis and Current Missed Findings
4.6. Strengths and Limitations
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| AAC | Abdominal Aortic Calcification |
| AI | Artificial Intelligence |
| AP | Anteroposterior |
| AUC | Area Under the Curve |
| CI | Confidence Interval |
| CNIL | Commission on Informatics and Liberty |
| CNNs | Convolutional Neural Networks |
| CR | Computed Radiography |
| CT | Computed Tomography |
| DICOM | Digital Imaging and Communications in Medicine |
| DL | Deep Learning |
| DR | Digital Radiography |
| FN | False Negative |
| FP | False Positive |
| GT | Ground Truth |
| HRNet | High-Resolution Networks |
| IoU | Intersection-over-Union |
| kVp | Kilovoltage Peak |
| mAs | milliampere-seconds |
| MR | Méthodologie de Référence (see CNIL) |
| MRI | Magnetic Resonance Imaging |
| MSK | Musculoskeletal |
| NPV | Negative Predictive Value |
| PA | Posteroanterior |
| PPV | Positive Predictive Value |
| PVT | Pyramid Vision Transformer |
| RF | Radiographic Fluoroscopy |
| ROC | Receiver Operating Characteristic |
| STARD-AI | Standards for Reporting Diagnostic Accuracy AI |
| TN | True Negative |
| TP | True Positive |
| VGG | Visual Geometry Group |
| X-ray | Radiograph |
References
- Delrue, L.; Gosselin, R. Difficulties in Interpretation of Chest Radiography. In Comparative Interpretation of CT and Standard Radiography of the Chest; Springer: Berlin/Heidelberg, Germany, 2011; pp. 27–49.
- Longo, D.L. Harrison’s Principles of Internal Medicine, 20th ed.; McGraw-Hill Education: New York, NY, USA, 2018.
- Smith-Bindman, R.; Kwan, M.L.; Marlow, E.C.; Theis, M.K.; Bolch, W.; Cheng, S.Y.; Bowles, E.J.A.; Duncan, J.R.; Greenlee, R.T.; Kushi, L.H.; et al. Trends in Use of Medical Imaging in US Health Care Systems and in Ontario, Canada, 2000–2016. JAMA 2019, 322, 843–856.
- Wei, C.J.; Tsai, W.C.; Tiu, C.M.; Wu, H.T.; Chiou, H.J.; Chang, C.Y. Systematic analysis of missed extremity fractures in emergency radiology. Acta Radiol. 2006, 47, 710–717.
- Graber, M.L.; Franklin, N.; Gordon, R. Diagnostic error in internal medicine. Arch. Intern. Med. 2005, 165, 1493–1499.
- Shanafelt, T.D.; West, C.P.; Sinsky, C.; Trockel, M.; Tutty, M.; Satele, D.V.; Carlasare, L.E.; Dyrbye, L.N. Changes in Burnout and Satisfaction With Work-Life Integration in Physicians and the General US Working Population Between 2011 and 2017. Mayo Clin. Proc. 2019, 94, 1681–1694.
- Kuo, R.Y.L.; Harrison, C.; Curran, T.A.; Jones, B.; Freethy, A.; Cussons, D.; Stewart, M.; Collins, G.S.; Furniss, D. Artificial Intelligence in Fracture Detection: A Systematic Review and Meta-Analysis. Radiology 2022, 304, 50–62.
- Topol, E.J. High-performance medicine: The convergence of human and artificial intelligence. Nat. Med. 2019, 25, 44–56.
- Guly, H.R. Diagnostic errors in an accident and emergency department. Emerg. Med. J. 2001, 77, 263–269.
- Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2014, arXiv:1409.1556.
- Wang, J.; Sun, K.; Cheng, T.; Jiang, B.; Deng, C.; Zhao, Y.; Liu, D.; Mu, Y.; Tan, M.; Wang, X.; et al. Deep High-Resolution Representation Learning for Visual Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 3349–3364.
- McKinney, S.M.; Sieniek, M.; Godbole, V.; Godwin, J.; Antropova, N.; Ashrafian, H.; Back, T.; Chesus, M.; Corrado, G.S.; Darzi, A.; et al. International evaluation of an AI system for breast cancer screening. Nature 2020, 577, 89–94.
- Seah, J.C.Y.; Tang, C.H.M.; Buchlak, Q.D.; Holt, X.G.; Wardman, J.B.; Aimoldin, A.; Esmaili, N.; Ahmad, H.; Pham, H.; Lambert, J.F.; et al. Effect of a comprehensive deep-learning model on the accuracy of chest x-ray interpretation by radiologists: A retrospective, multireader multicase study. Lancet Digit. Health 2021, 3, e496–e506.
- Hwang, E.J.; Nam, J.G.; Lim, W.H.; Park, S.J.; Jeong, Y.S.; Kang, J.H.; Hong, E.K.; Kim, T.M.; Goo, J.M.; Park, S.; et al. Deep Learning for Chest Radiograph Diagnosis in the Emergency Department. Radiology 2019, 293, 573–580.
- Strohm, L.; Hehakaya, C.; Ranschaert, E.R.; Boon, W.P.C.; Moors, E.H.M. Implementation of artificial intelligence (AI) applications in radiology: Hindering and facilitating factors. Eur. Radiol. 2020, 30, 5525–5532.
- Oppenheimer, J.; Lüken, S.; Hamm, B.; Niehues, S.M. A Prospective Approach to Integration of AI Fracture Detection Software in Radiographs into Clinical Workflow. Life 2023, 13, 223.
- Aggarwal, R.; Sounderajah, V.; Martin, G.; Ting, D.S.W.; Karthikesalingam, A.; King, D.; Ashrafian, H.; Darzi, A. Diagnostic accuracy of deep learning in medical imaging: A systematic review and meta-analysis. NPJ Digit. Med. 2021, 4, 65.
- Dupuis, M.; Delbos, L.; Veil, R.; Adamsbaum, C. External validation of a commercially available deep learning algorithm for fracture detection in children. Diagn. Interv. Imaging 2022, 103, 151–159.
- Fu, T.; Viswanathan, V.; Attia, A.; Zerbib-Attal, E.; Kosaraju, V.; Barger, R.; Vidal, J.; Bittencourt, L.K.; Faraji, N. Assessing the Potential of a Deep Learning Tool to Improve Fracture Detection by Radiologists and Emergency Physicians on Extremity Radiographs. Acad. Radiol. 2024, 31, 1989–1999.
- Bettinger, H.; Lenczner, G.; Guigui, J.; Rotenberg, L.; Zerbib, E.; Attia, A.; Vidal, J.; Beaumel, P. Evaluation of the Performance of an Artificial Intelligence (AI) Algorithm in Detecting Thoracic Pathologies on Chest Radiographs. Diagnostics 2024, 14, 1183.
- Raj, S.; Sadegi, B.; Simon, J. Enhancing Pediatric Fracture Detection: Multicenter Evaluation of a Deep Learning AI Model and Its Impact on Radiologist Performance. Acad. Radiol. 2025, 33, 1121–1129.
- Kelly, B.S.; Judge, C.; Bollard, S.M.; Clifford, S.M.; Healy, G.M.; Aziz, A.; Mathur, P.; Islam, S.; Yeom, K.W.; Lawlor, A.; et al. Radiology artificial intelligence: A systematic review and evaluation of methods (RAISE). Eur. Radiol. 2022, 32, 7998–8007.
- Gefter, W.B.; Post, B.A.; Hatabu, H. Commonly missed findings on chest radiographs: Causes and consequences. Chest 2023, 163, 650–661.
- O’Sullivan, J.W.; Muntinga, T.; Grigg, S.; Ioannidis, J.P. Prevalence and outcomes of incidental imaging findings: Umbrella review. BMJ 2018, 361, k2387.
- Qin, Z.Z.; Ahmed, S.; Sarker, M.S.; Paul, K.; Adel, A.S.M.H.; Naheyan, T.; Barrett, R.; Banu, S.; Creswell, J. Tuberculosis detection from chest x-rays for triaging in a high tuberculosis-burden setting: An evaluation of five artificial intelligence algorithms. Lancet Digit. Health 2021, 3, e543–e554.
- Pasa, F.; Golkov, V.; Pfeiffer, F.; Kremers, D.; Pfeiffer, D. Efficient Deep Network Architectures for Fast Chest X-Ray Tuberculosis Screening and Visualization. Sci. Rep. 2019, 9, 6268.
- World Health Organization. Chest Radiography in Tuberculosis Detection: Summary of Current WHO Recommendations and Guidance on Programmatic Approaches; WHO: Geneva, Switzerland, 2021; Available online: https://www.who.int/publications/i/item/9789241511506 (accessed on 6 January 2026).
- Litjens, G.; Kooi, T.; Bejnordi, B.E.; Setio, A.A.A.; Ciompi, F.; Ghafoorian, M.; van der Laak, J.A.W.M.; van Ginneken, B.; Sánchez, C.I. A survey on deep learning in medical image analysis. Med. Image Anal. 2017, 42, 60–88.
- Sounderajah, V.; Guni, A.; Liu, X.; Collins, G.S.; Karthikesalingam, A.; Markar, S.R.; Golub, R.M.; Denniston, A.K.; Shetty, S.; Moher, D.; et al. The STARD-AI reporting guideline for diagnostic accuracy studies using artificial intelligence. Nat. Med. 2025, 31, 3283–3289.
- ISO 25237:2017; Health informatics—Pseudonymization. ISO: Geneva, Switzerland, 2017.
- Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal Loss for Dense Object Detection. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 318–327.
- AZtrauma: AI for Fracture Detection. Available online: https://www.azmed.co/azproducts-pages/aztrauma (accessed on 21 January 2026).
- AZchest: AI for Chest X-Ray Analysis. Available online: https://www.azmed.co/fr/azproducts-pages/azchest (accessed on 21 January 2026).
- Brady, A. Error and Discrepancy in Radiology: Inevitable or Avoidable? Insights Imaging 2016, 8, 171–182.
- Cohen, E.; Ouertani, M.S.; Beaumel, P.; Magetic, P.; Pedowski, P.; Pinatel, F.; Violeau, M.; Brader, P.; Bajić, S.; Malzy, P.; et al. Performance of a complete AI radiographic suite across 258,373 X-rays from 26 countries: A worldwide evaluation. Radiography 2026, 32, 103361.
- Wünnemann, F.; Rehnitz, C.; Weber, M. Incidental findings in musculoskeletal imaging. Radiologe 2017, 57, 286–295.
- Berbaum, K.S.; Franken, E.A., Jr.; Dorfman, D.D.; Rooholamini, S.A.; Kathol, M.H.; Barloon, T.J.; Behlke, F.M.; Sato, Y.; Lu, C.H.; El-Khoury, G.Y.; et al. Satisfaction of search in diagnostic radiology. Investig. Radiol. 1990, 25, 133–140.
- Rueckel, J.; Huemmer, C.; Fieselmann, A.; Ghesu, F.C.; Mansoor, A.; Schachtner, B.; Wesp, P.; Trappmann, L.; Munawwar, B.; Ricke, J.; et al. Pneumothorax detection in chest radiographs: Optimizing artificial intelligence system for accuracy and confounding bias reduction using in-image annotations in algorithm training. Eur. Radiol. 2021, 31, 7888–7900.
- World Health Organization. Global Tuberculosis Report 2025; WHO: Geneva, Switzerland, 2025; ISBN 978-92-4-011692-4.



| Population | Subgroup | Total | AZtrauma | AZchest |
|---|---|---|---|---|
| Age | 0–18 years old | 5309 | 2520 | 2653 |
| | 19–60 years old | 12,193 | 6423 | 5864 |
| | >60 years old | 4079 | 2182 | 1938 |
| Sex | Male | 10,769 | 5634 | 5193 |
| | Female | 10,812 | 5491 | 5263 |
| Country | Argentina | 553 | 189 | 364 |
| | Australia | 466 | 192 | 274 |
| | Belgium | 623 | 277 | 346 |
| | Brazil | 541 | 183 | 358 |
| | Bulgaria | 469 | 128 | 340 |
| | Canada | 473 | 386 | 87 |
| | Estonia | 615 | 160 | 455 |
| | France | 1244 | 1028 | 216 |
| | Germany | 1190 | 950 | 240 |
| | India | 1268 | 179 | 1089 |
| | Israel | 565 | 276 | 290 |
| | Italy | 953 | 226 | 727 |
| | Morocco | 1029 | 474 | 555 |
| | Poland | 858 | 634 | 224 |
| | Portugal | 1624 | 1175 | 449 |
| | Romania | 678 | 432 | 247 |
| | Spain | 2461 | 1244 | 1217 |
| | Switzerland | 1585 | 765 | 819 |
| | UK | 2078 | 416 | 1662 |
| | USA | 2309 | 1810 | 498 |
| Body Region | Ankle | 892 | 892 | NA |
| | Clavicle | 333 | 333 | NA |
| | Chest | 10,456 | NA | 10,456 |
| | Elbow | 802 | 802 | NA |
| | Femur | 401 | 401 | NA |
| | Foot | 499 | 499 | NA |
| | Forearm | 523 | 523 | NA |
| | Hand | 852 | 852 | NA |
| | Hip/Pelvis | 1034 | 1034 | NA |
| | Humerus | 106 | 106 | NA |
| | Knee | 807 | 807 | NA |
| | Ribs | 384 | 384 | NA |
| | Shoulder | 948 | 948 | NA |
| | Tibia/Fibula | 1309 | 1309 | NA |
| | Wrist | 1333 | 1333 | NA |
| | Cervical Spine | 195 | 195 | NA |
| | Thoracic Spine | 350 | 350 | NA |
| | Lumbar Spine | 357 | 357 | NA |

| Pathology (prevalence) | AUC (95% CI) | Sensitivity (95% CI) | Specificity (95% CI) | PPV (95% CI) | NPV (95% CI) |
|---|---|---|---|---|---|
| Hilar/Mediastinal adenopathy (2.3%) | 96.9% [95.5–98.1] | 95.5% [92.1–97.5] | 87.5% [86.8–88.1] | 15.3% [13.6–17.2] | 99.9% [99.8–99.9] |
| Lung Cavity (1.5%) | 97.6% [96.0–98.9] | 96.8% [92.8–98.6] | 87.5% [86.9–88.1] | 10.6% [9.1–12.3] | 99.9% [99.9–100.0] |
| Tuberculosis (1.2%) | 96.4% [93.9–98.4] | 95.2% [90.0–97.8] | 87.2% [86.5–87.8] | 8.3% [7.0–9.8] | 99.9% [99.9–100.0] |
| Focal Bone Lesions (2.4%) | 99.2% [98.5–99.7] | 98.9% [96.8–99.6] | 96.1% [95.8–96.5] | 38.6% [35.0–42.3] | 100.0% [99.9–100.0] |
| Interstitial Pattern (3.7%) | 96.3% [94.9–97.5] | 95.6% [93.1–97.2] | 87.7% [87.1–88.3] | 23.0% [21.0–25.1] | 99.8% [99.7–99.9] |
| Mediastinal Widening (1.9%) | 98.4% [97.5–99.2] | 97.5% [94.2–98.9] | 86.6% [85.9–87.2] | 12.2% [10.7–14.0] | 99.9% [99.9–100.0] |
| Pneumonia (5.8%) | 97.1% [96.1–97.9] | 96.1% [94.2–97.3] | 88.2% [87.5–88.8] | 33.4% [31.3–35.7] | 99.7% [99.6–99.8] |
| Atelectasis (6.8%) | 96.1% [95.1–97.0] | 95.1% [93.3–96.5] | 87.5% [86.8–88.2] | 35.8% [33.7–38.0] | 99.6% [99.4–99.7] |
| Old Fracture (8.4%) | 96.2% [95.3–97.0] | 94.5% [92.8–95.8] | 88.1% [87.4–88.7] | 42.2% [40.1–44.3] | 99.4% [99.2–99.6] |
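
The combination of low PPV and near-100% NPV in the table above is what Bayes' rule predicts for a sensitive test applied at low prevalence. The following is a minimal sketch, assuming PPV and NPV are derived from sensitivity, specificity, and the prevalence shown in parentheses; the function name is illustrative and does not come from the study. It uses the rounded Tuberculosis row as input.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) implied by sensitivity, specificity, and prevalence."""
    # Expected fractions of the whole population in each confusion-matrix cell.
    tp = sensitivity * prevalence
    fn = (1.0 - sensitivity) * prevalence
    tn = specificity * (1.0 - prevalence)
    fp = (1.0 - specificity) * (1.0 - prevalence)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv


# Tuberculosis row: Se 95.2%, Sp 87.2%, prevalence 1.2%.
ppv, npv = predictive_values(0.952, 0.872, 0.012)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")  # PPV = 8.3%, NPV = 99.9%
```

With only about 1 in 80 examinations positive, false positives from the 12.8% of negatives flagged outnumber true positives roughly eleven to one, which is why PPV stays below 10% despite high sensitivity.
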
| Pathology (prevalence) | AUC (95% CI) | Sensitivity (95% CI) | Specificity (95% CI) | PPV (95% CI) | NPV (95% CI) |
|---|---|---|---|---|---|
| Cardiomegaly (7.8%) | 97.9% [97.3–98.5] | 97.8% [96.5–98.6] | 84.6% [83.9–85.3] | 34.8% [32.9–36.8] | 99.8% [99.7–99.9] |
| Consolidation (3.4%) | 96.1% [94.7–97.4] | 95.2% [92.4–97.0] | 85.8% [85.1–86.5] | 18.9% [17.1–20.8] | 99.8% [99.7–99.9] |
| Pneumothorax (2.4%) | 96.6% [95.1–98.0] | 94.5% [91.0–96.7] | 89.4% [88.8–90.0] | 18.2% [16.2–20.3] | 99.9% [99.7–99.9] |
| Pulmonary Nodule (3.8%) | 96.1% [94.8–97.3] | 95.5% [93.0–97.1] | 85.0% [84.3–85.7] | 20.1% [18.4–22.0] | 99.8% [99.7–99.9] |
| Pleural Effusion (7.2%) | 97.2% [96.5–97.9] | 96.7% [95.1–97.7] | 86.5% [85.8–87.2] | 35.7% [33.6–37.8] | 99.7% [99.6–99.8] |
| Pulmonary Edema (3.8%) | 96.3% [94.9–97.6] | 95.7% [93.3–97.3] | 88.5% [87.9–89.1] | 24.8% [22.7–27.0] | 99.8% [99.7–99.9] |
| Acute/Subacute Fracture (20.4%) | 97.4% [97.0–97.9] | 97.0% [96.3–97.7] | 86.8% [86.1–87.5] | 65.3% [63.7–66.9] | 99.1% [98.9–99.3] |
| Dislocation (4.5%) | 97.0% [95.9–97.9] | 95.8% [93.6–97.2] | 85.2% [84.5–85.9] | 23.2% [21.5–25.1] | 99.8% [99.6–99.8] |
| Joint Effusion (5.7%) | 98.0% [97.2–98.7] | 97.8% [96.4–98.7] | 85.7% [85.0–86.4] | 29.4% [27.5–31.4] | 99.8% [99.7–99.9] |
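
Every estimate in these tables is reported with a 95% CI. The interval method is not stated in this part of the article, so the sketch below uses the Wilson score interval purely as an assumption: it is a common choice for proportions such as sensitivity and specificity, stays within [0, 1], and becomes asymmetric near the boundaries. The counts in the usage line are hypothetical, not study data.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.959964):
    """Wilson score interval for a binomial proportion (default z gives 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= successes <= total:
        raise ValueError("successes must lie between 0 and total")
    p = successes / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2.0 * total)) / denom
    half = z * math.sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    # Clamp to guard against floating-point overshoot at p = 0 or p = 1.
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical example: 97 of 100 reference-positive cases detected.
low, high = wilson_interval(97, 100)
print(f"Sensitivity 97.0% [{low:.1%}–{high:.1%}]")
```

The lower arm of the interval is longer than the upper arm when the estimate is close to 100%, the same asymmetry visible in the low-prevalence rows of the tables.
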
| Population | AZtrauma subgroup (prevalence) | AUC (95% CI) | Sensitivity (95% CI) | Specificity (95% CI) |
|---|---|---|---|---|
| Age | 0–18 years old (7.5%) | 96.7% [96.0–97.4] | 95.8% [94.3–96.9] | 86.1% [85.5–86.8] |
| | 19–60 years old (7.7%) | 97.5% [97.0–97.8] | 96.8% [96.0–97.4] | 90.2% [89.9–90.6] |
| | >60 years old (11.1%) | 97.7% [97.1–98.2] | 96.9% [95.7–97.7] | 86.0% [85.3–86.7] |
| Sex | Male (8.4%) | 97.0% [96.5–97.5] | 96.3% [95.4–97.0] | 88.6% [88.2–89.0] |
| | Female (8.1%) | 97.6% [97.2–98.0] | 97.0% [96.2–97.6] | 88.4% [88.0–88.8] |
| Country | Argentina (8.0%) | 97.5% [94.9–99.5] | 97.4% [90.9–99.3] | 85.9% [83.4–88.0] |
| | Australia (8.1%) | 95.1% [91.2–98.3] | 94.9% [87.5–98.0] | 85.7% [83.3–87.9] |
| | Belgium (8.4%) | 94.0% [90.9–96.9] | 92.2% [85.9–95.9] | 86.4% [84.4–88.2] |
| | Brazil (7.9%) | 95.6% [92.1–98.6] | 94.4% [86.6–97.8] | 86.6% [84.1–88.7] |
| | Bulgaria (8.7%) | 95.2% [91.2–98.5] | 92.9% [83.0–97.2] | 86.0% [83.0–88.6] |
| | Canada (8.6%) | 95.9% [93.7–97.8] | 94.6% [90.0–97.1] | 85.8% [84.1–87.3] |
| | Estonia (8.1%) | 96.3% [92.9–99.0] | 95.4% [87.3–98.4] | 86.6% [84.0–88.9] |
| | France (8.7%) | 96.4% [95.2–97.4] | 94.6% [92.1–96.4] | 90.3% [89.5–91.1] |
| | Germany (8.2%) | 98.5% [97.8–99.2] | 98.7% [97.0–99.5] | 86.6% [85.5–87.6] |
| | India (8.7%) | 97.0% [94.4–99.1] | 93.6% [85.9–97.2] | 92.0% [90.0–93.7] |
| | Israel (8.6%) | 95.9% [93.1–98.3] | 95.8% [90.5–98.2] | 86.0% [83.9–87.8] |
| | Italy (8.5%) | 96.2% [93.7–98.4] | 94.8% [88.4–97.8] | 86.3% [84.1–88.2] |
| | Morocco (8.1%) | 97.6% [96.1–98.9] | 97.4% [94.1–98.9] | 86.3% [84.8–87.7] |
| | Poland (8.1%) | 97.4% [96.2–98.6] | 96.5% [93.5–98.2] | 87.9% [86.6–89.0] |
| | Portugal (8.2%) | 98.7% [98.1–99.3] | 98.3% [96.8–99.2] | 95.4% [94.8–96.0] |
| | Romania (8.5%) | 95.8% [93.8–97.6] | 94.0% [89.6–96.6] | 86.2% [84.6–87.7] |
| | Spain (7.8%) | 97.7% [96.9–98.4] | 97.3% [95.5–98.4] | 86.1% [85.1–86.9] |
| | Switzerland (8.5%) | 97.2% [96.0–98.4] | 96.6% [94.1–98.1] | 86.0% [84.8–87.1] |
| | United Kingdom (8.2%) | 97.5% [96.1–98.7] | 97.1% [93.3–98.7] | 85.7% [84.1–87.2] |
| | United States (8.3%) | 98.2% [97.5–98.8] | 97.9% [96.6–98.7] | 91.0% [90.3–91.6] |
| Body region | Ankle (8.3%) | 98.0% [97.0–98.9] | 97.6% [95.5–98.7] | 92.6% [91.8–93.4] |
| | Clavicle (9.3%) | 95.3% [93.0–97.3] | 92.9% [87.7–96.0] | 83.9% [82.0–85.7] |
| | Elbow (6.0%) | 98.2% [97.3–99.1] | 97.9% [95.2–99.1] | 85.4% [84.2–86.5] |
| | Femur (9.9%) | 94.4% [92.0–96.6] | 92.9% [88.5–95.7] | 91.1% [89.7–92.3] |
| | Foot (8.8%) | 98.1% [97.3–98.8] | 97.4% [95.8–98.4] | 92.8% [92.2–93.5] |
| | Forearm (12.1%) | 96.2% [94.8–97.5] | 94.4% [91.2–96.5] | 88.9% [87.5–90.1] |
| | Hand (9.0%) | 98.2% [97.5–98.9] | 97.5% [95.9–98.5] | 94.4% [93.8–95.0] |
| | Hip/Pelvis (6.0%) | 98.2% [97.3–99.0] | 97.7% [95.4–98.9] | 87.4% [86.5–88.3] |
| | Humerus (13.7%) | 97.9% [96.3–98.4] | 96.6% [94.2–98.1] | 86.7% [85.2–88.0] |
| | Knee (7.2%) | 98.0% [97.2–98.8] | 98.0% [95.8–99.1] | 83.8% [82.6–84.9] |
| | Ribs (11.5%) | 94.4% [90.3–98.1] | 91.8% [82.2–96.5] | 90.0% [86.9–92.4] |
| | Shoulder (8.2%) | 97.0% [95.7–98.1] | 96.1% [93.4–97.7] | 89.4% [88.4–90.3] |
| | Tibia/Fibula (11.0%) | 98.0% [96.6–99.1] | 97.6% [94.6–99.0] | 91.1% [89.6–92.3] |
| | Wrist (6.4%) | 97.8% [96.7–98.8] | 97.4% [94.9–98.7] | 82.6% [81.4–83.7] |
| | Cervical Spine (7.8%) | 96.5% [93.3–99.1] | 94.7% [87.2–97.9] | 84.2% [81.6–86.4] |
| | Thoracic Spine (5.9%) | 96.2% [93.9–98.3] | 95.2% [89.1–97.9] | 81.5% [79.6–83.3] |
| | Lumbar Spine (5.9%) | 96.2% [93.6–98.4] | 94.3% [88.1–97.4] | 85.9% [84.2–87.5] |

| Population | AZchest subgroup (prevalence) | AUC (95% CI) | Sensitivity (95% CI) | Specificity (95% CI) |
|---|---|---|---|---|
| Age | 0–18 years old (4.0%) | 97.3% [96.7–97.8] | 96.4% [95.3–97.3] | 86.6% [86.2–87.0] |
| | 19–60 years old (3.7%) | 97.3% [96.9–97.7] | 96.5% [95.8–97.1] | 87.8% [87.6–88.0] |
| | >60 years old (4.8%) | 95.8% [95.0–96.5] | 94.7% [93.3–95.8] | 85.5% [85.1–86.0] |
| Sex | Male (3.8%) | 97.3% [96.9–97.7] | 96.9% [96.2–97.5] | 86.4% [86.1–86.7] |
| | Female (4.1%) | 96.5% [96.0–96.9] | 95.3% [94.5–96.0] | 87.8% [87.5–88.0] |
| Country | Argentina (3.9%) | 97.9% [96.5–99.2] | 97.8% [94.6–99.2] | 85.8% [84.8–86.8] |
| | Australia (4.5%) | 96.2% [94.0–98.0] | 94.3% [89.6–97.0] | 87.2% [86.0–88.3] |
| | Belgium (3.9%) | 97.9% [96.7–99.0] | 97.7% [94.3–99.1] | 86.4% [85.4–87.4] |
| | Brazil (3.5%) | 98.1% [96.9–99.2] | 98.2% [94.7–99.4] | 85.6% [84.5–86.6] |
| | Bulgaria (3.0%) | 98.1% [96.5–99.3] | 97.7% [93.5–99.2] | 85.2% [84.1–86.3] |
| | Canada (3.5%) | 97.2% [93.0–99.7] | 97.4% [86.8–99.6] | 86.3% [84.1–88.2] |
| | Estonia (4.7%) | 95.4% [93.5–97.0] | 93.5% [89.9–95.8] | 87.2% [86.3–88.1] |
| | France (3.9%) | 97.6% [95.7–99.1] | 96.4% [91.0–98.6] | 86.4% [85.1–87.7] |
| | Germany (4.6%) | 94.9% [92.3–97.2] | 93.1% [87.7–96.2] | 86.5% [85.2–87.7] |
| | India (4.3%) | 97.1% [96.1–97.9] | 96.2% [94.4–97.5] | 92.1% [91.7–92.6] |
| | Israel (3.0%) | 98.1% [96.3–99.5] | 98.2% [93.8–99.5] | 85.8% [84.6–86.9] |
| | Italy (4.6%) | 93.8% [92.2–95.3] | 91.6% [88.6–93.8] | 87.3% [86.6–88.0] |
| | Morocco (5.0%) | 95.8% [94.3–97.1] | 93.6% [90.5–95.7] | 87.6% [86.8–88.3] |
| | Poland (4.6%) | 96.9% [95.0–98.5] | 95.5% [90.6–97.9] | 87.0% [85.7–88.2] |
| | Portugal (3.4%) | 97.9% [96.7–99.0] | 98.0% [95.0–99.2] | 86.3% [85.3–87.1] |
| | Romania (4.1%) | 94.9% [92.3–97.3] | 92.4% [86.6–95.8] | 87.0% [85.7–88.1] |
| | Spain (3.1%) | 98.2% [97.4–98.8] | 98.0% [96.3–98.9] | 86.7% [86.2–87.3] |
| | Switzerland (4.0%) | 97.9% [97.0–98.7] | 97.9% [96.0–98.9] | 86.1% [85.5–86.8] |
| | United Kingdom (4.1%) | 97.5% [96.9–98.2] | 97.0% [95.7–97.9] | 86.3% [85.9–86.8] |
| | United States (3.3%) | 97.6% [96.1–98.8] | 97.7% [94.7–99.0] | 86.0% [85.1–86.8] |

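
The subgroup tables stratify sensitivity and specificity by age, sex, country, and body region. A minimal sketch of that kind of stratified tally is given below, assuming each case reduces to a subgroup label, a binary reference-standard label, and a binary AI output; the record layout, function name, and toy records are hypothetical and are not study data.

```python
from collections import defaultdict


def subgroup_metrics(records):
    """Compute sensitivity and specificity per subgroup.

    `records` is an iterable of (subgroup, truth, predicted) tuples,
    where truth and predicted are booleans.
    """
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "tn": 0, "fp": 0})
    for subgroup, truth, predicted in records:
        if truth:
            key = "tp" if predicted else "fn"
        else:
            key = "fp" if predicted else "tn"
        counts[subgroup][key] += 1

    results = {}
    for subgroup, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["tn"] + c["fp"]
        results[subgroup] = {
            # None marks a subgroup with no positive (or no negative) cases.
            "sensitivity": c["tp"] / positives if positives else None,
            "specificity": c["tn"] / negatives if negatives else None,
        }
    return results


# Toy records for illustration only.
toy = [
    ("0-18", True, True), ("0-18", True, False), ("0-18", False, False),
    (">60", True, True), (">60", False, True), (">60", False, False),
]
print(subgroup_metrics(toy))
```

Returning `None` for an empty denominator avoids reporting a misleading 0% or 100% for subgroups that contain no positive or no negative cases.
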
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Sultan, J.-L.; Beaumel, P.; Dementjeva, M.; Declercq, H.; Sultan, I.; Reinas, J.; Durán Vila, M.D. International Multicenter Validation of an Expanded AI Diagnostic System for 18 Pathologies in Thoracic and Musculoskeletal Radiography. Diagnostics 2026, 16, 1137. https://doi.org/10.3390/diagnostics16081137

