Thyroid Nodule Characterization: Which Thyroid Imaging Reporting and Data System (TIRADS) Is More Accurate? A Comparison Between Radiologists with Different Experiences and Artificial Intelligence Software
Abstract
1. Introduction
2. Methods and Materials
3. Statistical Analysis
4. Results
4.1. Agreement Among Human Observers with Different Levels of Experience
4.2. Agreement Between Human Observers with Different Levels of Experience and S-Detect
4.3. Radio/Cytological Agreement
5. Discussion
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Conflicts of Interest
Correction Statement
ACR category | High-Level Observer: Frequency | High-Level Observer: Percentage | Average-Level Observer: Frequency | Average-Level Observer: Percentage | Low-Level Observer: Frequency | Low-Level Observer: Percentage | S-Detect: Frequency | S-Detect: Percentage |
---|---|---|---|---|---|---|---|---|
TIRADS 1 | 0 | 0 | 0 | 0 | 0 | 0 | 66 | 19.76% |
TIRADS 2 | 116 | 34.73% | 105 | 31.43% | 100 | 29.94% | 148 | 44.31% |
TIRADS 3 | 142 | 42.51% | 126 | 37.72% | 128 | 38.32% | 76 | 22.75% |
TIRADS 4 | 43 | 12.87% | 60 | 17.96% | 62 | 18.56% | 33 | 9.88% |
TIRADS 5 | 33 | 9.89% | 43 | 12.87% | 44 | 13.17% | 11 | 3.29% |
EU category | High-Level Observer: Frequency | High-Level Observer: Percentage | Average-Level Observer: Frequency | Average-Level Observer: Percentage | Low-Level Observer: Frequency | Low-Level Observer: Percentage | S-Detect: Frequency | S-Detect: Percentage |
---|---|---|---|---|---|---|---|---|
TIRADS 1 | 0 | 0 | 0 | 0 | 0 | 0 | 22 | 6.58% |
TIRADS 2 | 64 | 19.16% | 45 | 13.47% | 24 | 7.18% | 128 | 38.32% |
TIRADS 3 | 172 | 51.49% | 183 | 54.79% | 196 | 58.68% | 151 | 45.20% |
TIRADS 4 | 54 | 16.16% | 57 | 17.06% | 61 | 18.26% | 11 | 3.29% |
TIRADS 5 | 44 | 13.17% | 49 | 14.67% | 53 | 15.86% | 22 | 6.58% |
K category | High-Level Observer: Frequency | High-Level Observer: Percentage | Average-Level Observer: Frequency | Average-Level Observer: Percentage | Low-Level Observer: Frequency | Low-Level Observer: Percentage | S-Detect: Frequency | S-Detect: Percentage |
---|---|---|---|---|---|---|---|---|
TIRADS 1 | 0 | 0 | 0 | 0 | 0 | 0 | 22 | 6.58% |
TIRADS 2 | 53 | 15.86% | 57 | 17.06% | 66 | 19.76% | 127 | 38.02% |
TIRADS 3 | 215 | 64.37% | 194 | 58.08% | 179 | 53.59% | 141 | 42.21% |
TIRADS 4 | 22 | 6.58% | 32 | 9.58% | 34 | 10.17% | 33 | 9.88% |
TIRADS 5 | 44 | 13.17% | 51 | 15.26% | 55 | 16.46% | 11 | 3.29% |
High-Level Observer: Cytology / Radiology | ACR POS. | ACR NEG. | EU POS. | EU NEG. | K POS. | K NEG. |
---|---|---|---|---|---|---|
Normal | 33 | 268 | 57 | 179 | 33 | 197 |
Abnormal | 33 | 0 | 33 | 0 | 33 | 0 |
TOTAL | 66 | 268 | 90 | 179 | 66 | 197 |

High-Level Observer: Metric | ACR | EU | K |
---|---|---|---|
SEN (95% CI) | 100% (89.4–100) | 100% (89.4–100) | 100% (89.4–100) |
PPV (95% CI) | 50% (37.4–62.6) | 36.7% (26.8–47.5) | 50% (37.4–62.6) |
NPV (95% CI) | 100% (98.6–100) | 100% (98–100) | 100% (98.1–100) |
SPE (95% CI) | 89% (84.9–92.3) | 75.8% (69.9–81.2) | 85.7% (80.4–89.9) |
ACC (95% CI) | 94.5% (92.8–96.3) | 87.9% (85.2–90.7) | 92.8% (90.6–95.1) |
Average-Level Observer: Cytology / Radiology | ACR POS. | ACR NEG. | EU POS. | EU NEG. | K POS. | K NEG. |
---|---|---|---|---|---|---|
Normal | 64 | 237 | 89 | 212 | 70 | 231 |
Abnormal | 21 | 12 | 21 | 12 | 21 | 12 |
TOTAL | 85 | 249 | 110 | 224 | 91 | 243 |

Average-Level Observer: Metric | ACR | EU | K |
---|---|---|---|
SEN (95% CI) | 63.6% (45.1–79.6) | 63.6% (45.1–79.6) | 63.6% (45.1–79.6) |
PPV (95% CI) | 24.7% (16–35.3) | 19.1% (12.2–27.7) | 23.1% (14.9–33.1) |
NPV (95% CI) | 95.2% (91.7–97.5) | 94.6% (90.8–97.2) | 95.1% (91.5–97.4) |
SPE (95% CI) | 78.7% (73.7–83.2) | 70.4% (64.9–75.5) | 76.7% (71.6–81.4) |
ACC (95% CI) | 71.2% (62.5–79.8) | 67% (58.3–75.8) | 70.2% (61.5–78.9) |
Low-Level Observer: Cytology / Radiology | ACR POS. | ACR NEG. | EU POS. | EU NEG. | K POS. | K NEG. |
---|---|---|---|---|---|---|
Normal | 70 | 231 | 183 | 218 | 74 | 227 |
Abnormal | 20 | 13 | 20 | 13 | 20 | 13 |
TOTAL | 90 | 244 | 203 | 231 | 94 | 240 |

Low-Level Observer: Metric | ACR | EU | K |
---|---|---|---|
SEN (95% CI) | 60.6% (42.1–77.1) | 60.6% (42.1–77.1) | 60.6% (42.1–77.1) |
PPV (95% CI) | 22.2% (14.1–32.2) | 9.9% (6.1–14.8) | 21.3% (13.5–30.9) |
NPV (95% CI) | 94.7% (91.1–97.1) | 94.4% (90.6–97) | 94.6% (90.9–97.1) |
SPE (95% CI) | 76.7% (71.6–81.4) | 54.4% (49.3–59.3) | 75.4% (70.1–80.2) |
ACC (95% CI) | 68.7% (59.9–77.5) | 57.5% (48.7–66.3) | 68% (59.2–76.8) |
S-Detect Observer: Cytology / Radiology | ACR POS. | ACR NEG. | EU POS. | EU NEG. | K POS. | K NEG. |
---|---|---|---|---|---|---|
Normal | 11 | 290 | 22 | 279 | 22 | 279 |
Abnormal | 22 | 11 | 22 | 11 | 22 | 11 |
TOTAL | 33 | 301 | 44 | 290 | 44 | 290 |

S-Detect Observer: Metric | ACR | EU | K |
---|---|---|---|
SEN (95% CI) | 66.7% (48.2–82) | 66.7% (48.2–82) | 66.7% (48.2–82) |
PPV (95% CI) | 66.7% (48.2–82) | 50% (34.6–65.4) | 50% (34.6–65.4) |
NPV (95% CI) | 96.3% (93.6–98.2) | 96.2% (93.3–98.1) | 96.2% (93.3–98.1) |
SPE (95% CI) | 96.3% (93.6–98.2) | 92.7% (89.1–95.4) | 92.7% (89.1–95.4) |
ACC (95% CI) | 81.5% (73.3–89.7) | 79.7% (71.4–88) | 79.7% (71.4–88) |
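
The point estimates in the radio/cytological tables can be reproduced from the 2×2 counts. The following is a minimal sketch, not the authors' code: it assumes that "Abnormal" cytology is the reference-positive class and that "POS." is a positive imaging call, and the names `Confusion` and `diagnostic_metrics` are hypothetical. With the S-Detect ACR counts, the tabulated ACC of 81.5% appears to match the mean of sensitivity and specificity rather than the overall proportion of correct calls, so both are returned for comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """2x2 counts; cytology is taken as the reference standard (assumption)."""
    tp: int  # cytology abnormal, imaging positive
    fn: int  # cytology abnormal, imaging negative
    fp: int  # cytology normal, imaging positive
    tn: int  # cytology normal, imaging negative


def diagnostic_metrics(c: Confusion) -> dict:
    """Return point estimates as fractions in [0, 1]."""
    sen = c.tp / (c.tp + c.fn)
    spe = c.tn / (c.tn + c.fp)
    ppv = c.tp / (c.tp + c.fp)
    npv = c.tn / (c.tn + c.fn)
    total = c.tp + c.fn + c.fp + c.tn
    return {
        "SEN": sen,
        "SPE": spe,
        "PPV": ppv,
        "NPV": npv,
        # Mean of sensitivity and specificity (balanced accuracy).
        "mean_SEN_SPE": (sen + spe) / 2,
        # Share of all nodules classified correctly.
        "overall_accuracy": (c.tp + c.tn) / total,
    }


# S-Detect, ACR column: Normal = 11 POS. / 290 NEG., Abnormal = 22 POS. / 11 NEG.
s_detect_acr = Confusion(tp=22, fn=11, fp=11, tn=290)
metrics = diagnostic_metrics(s_detect_acr)

for name, value in metrics.items():
    print(f"{name}: {value * 100:.1f}%")
```

The confidence intervals in the tables are not reproduced here, since the interval method is not stated in this section.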
Cytology (FNA) | Frequency | Percentage |
---|---|---|
TIR1 | 0 | 0 |
TIR2 | 258 | 77.24% |
TIR3a | 43 | 12.87% |
TIR3b | 0 | 0 |
TIR4 | 33 | 9.88% |
TIR5 | 0 | 0 |
TOTAL | 334 | 100% |
Inter-Observer Agreement | ACR | EU | K |
---|---|---|---|
Among all three human observers | k = 0.624 | k = 0.542 | k = 0.496 |
Between the high-level observer and S-Detect | k = 0.762 | k = 0.417 | k = 0.679 |
Between the average-level observer and S-Detect | k = 0.654 | k = 0.334 | k = 0.603 |
Between the low-level observer and S-Detect | k = 0.596 | k = 0.295 | k = 0.536 |
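
For readers who want to see how a pairwise agreement value of this kind is obtained, here is a minimal sketch of an unweighted Cohen's kappa for two readers. Which kappa variant the study used (unweighted, weighted, or a multi-rater form for the three human observers) is not stated in this section, so this is an assumption, and the ratings below are hypothetical toy data, not study data.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must score the same non-empty item set")
    n = len(ratings_a)

    # Observed agreement: share of items given the same category.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: product of the two raters' marginal proportions,
    # summed over categories.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    p_chance = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)

    if p_chance == 1.0:
        # Both raters used a single identical category; kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical TIRADS categories assigned to eight nodules by two readers.
reader_1 = [2, 3, 3, 4, 5, 2, 3, 4]
reader_2 = [2, 3, 4, 4, 5, 2, 3, 3]

kappa = cohen_kappa(reader_1, reader_2)
print(f"kappa = {kappa:.3f}")
```

In this toy case the readers agree on 6 of 8 nodules (0.75), but part of that agreement is expected by chance, which is why kappa is lower than the raw agreement.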
Human Observers’ Agreement | ACR | EU | K |
---|---|---|---|
Composition | k = 0.826 | k = 0.809 | k = 0.785 |
Shape | k = 0.793 | k = 0.716 | k = 0.687 |
Echogenicity | k = 0.498 | k = 0.441 | k = 0.389 |
Margins | k = 0.134 | k = 0.119 | k = 0.106 |
Calcifications Or Targeted Echogenic Foci | k = 0.416 | k = 0.318 | k = 0.351 |
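
The kappa values above are easier to compare once mapped to qualitative bands. The sketch below uses the widely quoted Landis and Koch (1977) benchmarks; whether the authors applied exactly these cut-offs is an assumption, and the function name `landis_koch` is hypothetical. The example values are the ACR column of the feature-level table.

```python
def landis_koch(kappa: float) -> str:
    """Map a kappa value to the Landis and Koch (1977) agreement label."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie in [-1, 1]")
    if kappa < 0:
        return "poor"
    # Upper bound of each band, checked in ascending order.
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"  # unreachable for valid input; kept for safety


# ACR column of the feature-level agreement table.
acr_feature_kappa = {
    "Composition": 0.826,
    "Shape": 0.793,
    "Echogenicity": 0.498,
    "Margins": 0.134,
    "Calcifications or targeted echogenic foci": 0.416,
}

labels = {feature: landis_koch(k) for feature, k in acr_feature_kappa.items()}
for feature, label in labels.items():
    print(f"{feature}: k = {acr_feature_kappa[feature]:.3f} ({label})")
```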
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
David, E.; Aliotta, L.; Frezza, F.; Riccio, M.; Cannavale, A.; Pacini, P.; Di Bella, C.; Dolcetti, V.; Seri, E.; Giuliani, L.; et al. Thyroid Nodule Characterization: Which Thyroid Imaging Reporting and Data System (TIRADS) Is More Accurate? A Comparison Between Radiologists with Different Experiences and Artificial Intelligence Software. Diagnostics 2025, 15, 2108. https://doi.org/10.3390/diagnostics15162108
David, E., Aliotta, L., Frezza, F., Riccio, M., Cannavale, A., Pacini, P., Di Bella, C., Dolcetti, V., Seri, E., Giuliani, L., Di Segni, M., Lo Conte, G., Bonito, G., Guerrisi, A., Mangini, F., Drudi, F. M., De Vito, C., & Cantisani, V. (2025). Thyroid Nodule Characterization: Which Thyroid Imaging Reporting and Data System (TIRADS) Is More Accurate? A Comparison Between Radiologists with Different Experiences and Artificial Intelligence Software. Diagnostics, 15(16), 2108. https://doi.org/10.3390/diagnostics15162108