Concordance Between the Multidisciplinary Team and ChatGPT-4o Decisions: A Blinded, Cross-Sectional Concordance Study in Systemic Autoimmune Rheumatic Diseases
Abstract
1. Introduction
2. Materials & Methods
2.1. Study Features & Patient Selection
2.2. Case Presentation to GPT and Agreement Assessment
2.2.1. Standardized AI Prompt (Full Text Applied Uniformly Across All Cases)
- (1) The most likely clinical and radiological diagnosis;
- (2) The recommended management plan, including immunosuppressive and/or antifibrotic strategies; and
- (3) The need for any additional diagnostic procedures (e.g., detailed HRCT review, bronchoscopy, lung biopsy).
2.2.2. Implementation of the Prompt in the Study
- 0–0: both MDT and AI agree that no treatment/investigation is needed (full concordance)
- 1–1: both MDT and AI agree that treatment/investigation is required (full concordance)
- 0–1 or 1–0: MDT and AI disagree; one recommends intervention while the other does not (discordance)
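
The coding scheme above can be sketched in Python. This is a minimal illustration rather than the authors' implementation: the names `classify_pair` and `tally` and the label strings are hypothetical, and reading each pair as MDT first, AI second is an assumption based on how the codes are written.

```python
def classify_pair(mdt: int, ai: int) -> str:
    """Map one pair of binary decisions (MDT first, AI second) to a label."""
    if mdt not in (0, 1) or ai not in (0, 1):
        raise ValueError("decisions must be coded 0 (no) or 1 (yes)")
    if mdt == ai:
        # 0-0 and 1-1 both count as full concordance
        return "concordant-no" if mdt == 0 else "concordant-yes"
    # 0-1 or 1-0: one side recommends intervention, the other does not
    return "discordant"


def tally(pairs):
    """Count the cases falling into each (mdt, ai) cell of the 2x2 table."""
    counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for mdt, ai in pairs:
        classify_pair(mdt, ai)  # validates the 0/1 coding
        counts[(mdt, ai)] += 1
    return counts
```

Calling `tally` on the list of per-case pairs for one question gives the four cell counts from which percentage agreement and κ can be derived.
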
2.3. Statistical Methods
3. Results
4. Discussion
5. Limitations
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
| Variable | Category | n (%) or Mean ± SD |
|---|---|---|
| Gender | Male | 18 (38.30%) |
| | Female | 29 (61.70%) |
| Age (years) | | 61.74 ± 10.40 |
| Disease duration (months) | | 45.80 ± 8.62 |
| Diagnosis | Rheumatoid arthritis | 15 (31.91%) |
| | ANCA-associated vasculitis | 6 (12.77%) |
| | Interstitial pneumonia with autoimmune features | 6 (12.77%) |
| | Sarcoidosis | 6 (12.77%) |
| | Sjögren’s syndrome | 3 (6.38%) |
| | Mixed connective tissue disease | 2 (4.26%) |
| | Systemic sclerosis | 2 (4.26%) |
| | Others | 7 (14.89%) |
| Number of comorbidities | 0 | 23 (48.94%) |
| | 1 | 11 (23.40%) |
| | 2 | 9 (19.15%) |
| | 3 | 2 (4.26%) |
| | 4 | 2 (4.26%) |
| Smoking status | Current smoker | 11 (23.40%) |
| | Non-smoker | 36 (76.60%) |
| Dyspnea | Present | 19 (40.43%) |
| | Absent | 28 (59.57%) |
| Cough | Present | 19 (40.43%) |
| | Absent | 28 (59.57%) |
| Fever | Present | 5 (10.64%) |
| | Absent | 42 (89.36%) |
| Rheumatoid factor | Positive | 17 (36.17%) |
| | Negative | 30 (63.83%) |
| Anti-CCP antibody | Positive | 12 (25.53%) |
| | Negative | 35 (74.47%) |
| ANA | Positive | 22 (46.81%) |
| | Negative | 25 (53.19%) |
| ANCA | Positive | 8 (17.02%) |
| | Negative | 39 (82.98%) |
| Serum ACE | Elevated | 6 (12.77%) |
| | Normal | 41 (87.23%) |
| Ongoing steroid use | Yes | 16 (34.04%) |
| | No | 31 (65.96%) |
| CRP (mg/L) | | 26.83 ± 58.51 |
| Urea (mg/dL) | | 39.43 ± 20.60 |
| Creatinine (mg/dL) | | 0.63 ± 0.45 |
| AST (U/L) | | 18.70 ± 9.26 |
| ALT (U/L) | | 17.89 ± 8.74 |
| WBC (/mm³) | | 8755.11 ± 3192.18 |
| Hemoglobin (g/dL) | | 12.75 ± 1.95 |
| Platelets (/mm³) | | 286,042.55 ± 114,507.82 |
| FEV1 (% predicted) | | 55.19 ± 37.97 |
| | Low | 33 (70.21%) |
| | Normal | 14 (29.79%) |
| FVC (% predicted) | | 57.02 ± 38.83 |
| | Low | 33 (70.21%) |
| | Normal | 14 (29.79%) |
| DLCO (% predicted) | | 36.19 ± 31.31 |
| | Low | 46 (97.87%) |
| | Normal | 1 (2.13%) |
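
The "n (%)" cells in the table above follow from simple arithmetic on the category counts. The sketch below shows that calculation; the helper name `count_with_percent` is hypothetical, and the cohort size of 47 is inferred from the category counts (for example 18 + 29 for gender) rather than stated in this section.

```python
def count_with_percent(count: int, total: int = 47) -> str:
    """Format a category count as 'n (xx.xx%)' of the cohort.

    total=47 is inferred from the table's category counts (assumption).
    """
    if total <= 0 or not 0 <= count <= total:
        raise ValueError("count must lie between 0 and a positive total")
    return f"{count} ({100 * count / total:.2f}%)"


print(count_with_percent(18))  # male patients
print(count_with_percent(46))  # patients with low DLCO
```
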

| The Questions | Subcategory | n (%) or κ, Agreement Level |
|---|---|---|
| What is the primary clinical differential diagnosis? | Rheumatoid arthritis with ILD | 8 (17.02%) |
| | ANCA vasculitis with pulmonary involvement | 5 (10.64%) |
| | Inactive sarcoidosis | 3 (6.38%) |
| | IPAF with ILD | 2 (4.26%) |
| | Sjögren’s syndrome with ILD | 2 (4.26%) |
| | Other disease agreements | 11 (23.40%) |
| | Disagreement | 16 (34.04%) |
| | Overall agreement | κ = 0.52, Moderate |
| What is the primary radiologic differential diagnosis? | UIP | 5 (10.64%) |
| | fNSIP | 5 (10.64%) |
| | AAV with pulmonary involvement (DAH) | 2 (4.26%) |
| | Cavitating nodule | 2 (4.26%) |
| | Rheumatoid nodule | 2 (4.26%) |
| | Other agreements (OP, LIP, drug reactions) | 14 (29.79%) |
| | Disagreement | 17 (36.17%) |
| | Overall agreement | κ = 0.55, Moderate |
| Is anti-inflammatory immunosuppressive therapy indicated? (0: no, 1: yes) | Agreement (0–0) | 14 (29.79%) |
| | Agreement (1–1) | 25 (53.19%) |
| | Disagreement (0–1) | 2 (4.26%) |
| | Disagreement (1–0) | 6 (12.77%) |
| | Overall agreement | κ = 0.64, Moderate–High |
| Is antifibrotic therapy indicated? (0: no, 1: yes) | Agreement (0–0) | 39 (82.98%) |
| | Agreement (1–1) | 3 (6.38%) |
| | Disagreement (0–1) | 4 (8.51%) |
| | Disagreement (1–0) | 1 (2.13%) |
| | Overall agreement | κ = 0.49, Moderate |
| Is drug-free follow-up appropriate? (0: no, 1: yes) | Agreement (0–0) | 27 (57.45%) |
| | Agreement (1–1) | 13 (27.66%) |
| | Disagreement (0–1) | 4 (8.51%) |
| | Disagreement (1–0) | 3 (6.38%) |
| | Overall agreement | κ = 0.67, Moderate–High |
| Is additional diagnostic testing indicated? (0: no, 1: yes) | Agreement (0–0) | 28 (59.57%) |
| | Agreement (1–1) | 0 (0.00%) |
| | Disagreement (0–1) | 10 (21.28%) |
| | Disagreement (1–0) | 9 (19.15%) |
| | Overall agreement | κ = 0.45, Moderate |
| Overall agreement distribution across decision domains | 1 | 1 (2.13%) |
| | 2 | 4 (8.51%) |
| | 3 | 5 (10.64%) |
| | 4 | 12 (25.53%) |
| | 5 | 14 (29.79%) |
| | 6 | 11 (23.40%) |
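
For the four binary questions in the table above, Cohen's κ can be recomputed from the printed cell counts. The sketch below uses the standard two-rater formula κ = (p_o − p_e) / (1 − p_e). The function name `cohen_kappa_2x2`, the dictionary `TABLE_COUNTS`, and the reading of "1–0" as "first rater yes, second rater no" are assumptions made for illustration. The authors' actual software and settings are not given in this section.

```python
def cohen_kappa_2x2(n00: int, n11: int, n01: int, n10: int) -> float:
    """Cohen's kappa for two raters and a binary decision.

    n01 means first rater coded 0 and second rater coded 1; n10 the reverse.
    """
    total = n00 + n11 + n01 + n10
    if total == 0:
        raise ValueError("no cases")
    p_observed = (n00 + n11) / total
    first_yes = (n11 + n10) / total   # share of 'yes' from the first rater
    second_yes = (n11 + n01) / total  # share of 'yes' from the second rater
    p_expected = first_yes * second_yes + (1 - first_yes) * (1 - second_yes)
    if p_expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_expected) / (1 - p_expected)


# Cell counts copied from the table, in the order (0-0, 1-1, 0-1, 1-0).
TABLE_COUNTS = {
    "immunosuppressive therapy": (14, 25, 2, 6),
    "antifibrotic therapy": (39, 3, 4, 1),
    "drug-free follow-up": (27, 13, 4, 3),
    "additional diagnostic testing": (28, 0, 10, 9),
}

for question, cells in TABLE_COUNTS.items():
    print(f"{question}: kappa = {cohen_kappa_2x2(*cells):.2f}")
```

With the counts as printed, this calculation reproduces the reported values for the first three questions (0.64, 0.49, 0.67). For additional diagnostic testing, the printed counts (no 1–1 agreements) give a negative κ of about −0.25 rather than the reported 0.45, so the counts or the κ in that row may need checking against the original article.
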
| Predictor | Immunosuppressive Treatment | Antifibrotic Treatment | Follow-Up Without Medication | Further Investigation Required | Clinical Diagnosis Agreement | Radiological Diagnosis Agreement |
|---|---|---|---|---|---|---|
| Diagnosis | 0.062 (0.980) | −8.897 (0.133) | −1.303 (0.515) | 0.276 (0.570) | 271.512 (0.999) | −135.216 (0.999) |
| Age | −0.002 (0.954) | 0.099 (0.254) | 0.001 (0.961) | - | −3.943 (0.999) | 3.846 (0.999) |
| Smoking | - | −2.199 (0.996) | - | −0.937 (0.240) | - | 46.438 (1.000) |
| Number of comorbidities | −0.610 (0.090) * | - | MC | - | - | - |
| ESR | - | - | 0.017 (0.445) | - | - | - |
| CRP | - | - | −0.005 (0.563) | - | - | - |
| Dyspnea | - | - | - | −0.353 (0.622) | - | - |
| Cough | - | - | - | −1.384 (0.071) * | - | - |
| Fever | - | - | - | 1.764 (0.123) | - | - |
| FEV1 (no decrease) | 2.426 (0.206) | −20.703 (1.000) | 0.105 (0.948) | - | - | −19.805 (1.000) |
| FVC (no decrease) | −2.263 (0.256) | 21.897 (1.000) | −0.123 (0.938) | - | - | −27.742 (1.000) |
| DLCO (no decrease) | −51.857 (1.000) | −19.347 (0.999) | 17.638 (0.994) | - | - | 2.413 (1.000) |
| Current treatment | MC | MC | MC | MC | MC | MC |
| Ongoing steroid therapy (yes) | 0.666 (0.444) | - | - | - | - | - |
| RF | 0.514 (0.681) | - | - | - | 97.075 (0.999) | 66.701 (1.000) |
| CCP | 0.171 (0.907) | - | - | - | −41.449 (1.000) | −114.172 (1.000) |
| ANA | 1.726 (0.060) * | - | - | - | 48.454 (0.999) | 47.875 (0.999) |
| ANCA | 1.091 (0.323) | - | - | - | −60.243 (0.999) | −97.385 (1.000) |
| Serum ACE | - | - | - | - | −33.673 (1.000) | −4.633 (1.000) |
| Model significance (p) | 0.437 | 0.273 | 0.828 | 0.159 | 0.288 | 0.083 |
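
The cells in the table above appear to pair a regression coefficient with a p-value in parentheses. Assuming these are logistic-regression coefficients on the log-odds scale, which this section does not state, a coefficient β converts to an odds ratio as exp(β). The helper name `odds_ratio` below is hypothetical, and the sketch is illustrative only.

```python
import math


def odds_ratio(beta: float) -> float:
    """Convert a log-odds coefficient to an odds ratio (assumes a logit model)."""
    return math.exp(beta)


# Coefficients copied from the table rows marked with an asterisk.
examples = {
    "Number of comorbidities (immunosuppressive treatment)": -0.610,
    "ANA (immunosuppressive treatment)": 1.726,
    "Cough (further investigation required)": -1.384,
}

for label, beta in examples.items():
    print(f"{label}: beta = {beta:+.3f}, odds ratio = {odds_ratio(beta):.2f}")
```

Under the same assumption, coefficients of very large magnitude paired with p-values near 1 (for example 271.512 with p = 0.999) would not translate into interpretable odds ratios. In small samples, estimates of that kind commonly arise when a predictor separates the outcome perfectly or almost perfectly.
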
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Ulutaş, F.; Altınışık, G.; Güngör, G.; Çakmak, V.; Yiğit, N.; Herek, D.; Yiğit, M.; Karasu, U.; Çobankara, V. Concordance Between the Multidisciplinary Team and ChatGPT-4o Decisions: A Blinded, Cross-Sectional Concordance Study in Systemic Autoimmune Rheumatic Diseases. Diagnostics 2026, 16, 113. https://doi.org/10.3390/diagnostics16010113

