Robust Multimodal Deep Learning for Lymphoma Subtype Classification Using 18F-FDG PET Maximum Intensity Projection Images and Clinical Data: A Multi-Center Study
Simple Summary
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Design and Patient Selection
2.2. Data Preprocessing
2.3. Model Architecture for Subtype Classification
2.4. Statistical Analysis
3. Results
3.1. Clinical Characteristics of the Patients
3.2. Performances of LymphoMAP
3.3. Impact of SCN on Model Performance
3.4. Impact of Input Modalities on Performance
3.5. Grad-CAM Visualization
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| 18F-FDG | Fluorine-18 fluorodeoxyglucose |
| PET | Positron emission tomography |
| MIP | Maximum intensity projection |
| HL | Hodgkin lymphoma |
| NHL | Non-Hodgkin lymphoma |
| DLBCL | Diffuse large B-cell lymphoma |
| FL | Follicular lymphoma |
| PNUH | Pusan National University Hospital |
| DICOM | Digital Imaging and Communications in Medicine |
| CT | Computed tomography |
| WBC | White blood cell count |
| ANC | Absolute neutrophil count |
| ALC | Absolute lymphocyte count |
| PLT | Platelet count |
| Hb | Hemoglobin level |
| NLR | Neutrophil-to-lymphocyte ratio |
| PLR | Platelet-to-lymphocyte ratio |
| LDH | Lactate dehydrogenase |
| CNN | Convolutional neural network |
| MCC | Matthews correlation coefficient |
| AUC | Area under the curve |
| ROC | Receiver operating characteristic |
| CI | Confidence interval |
| Grad-CAM | Gradient-weighted class activation mapping |
| 2D | Two-dimensional |
| 3D | Three-dimensional |
| ROI | Region of interest |







| Patient Characteristics | Training Cohort (n = 840) | Internal Test Cohort (n = 211) | External Test Cohort (n = 373) | p Value |
|---|---|---|---|---|
| Age (years), mean ± SD | 67 ± 16 | 68 ± 15 | 68 ± 15 | 0.55 |
| Sex, n (%) | | | | 0.17 |
| Male | 491 (58%) | 114 (54%) | 231 (62%) | |
| Female | 349 (42%) | 97 (46%) | 142 (38%) | |
| Site of involvement, n (%) | | | | <0.05 |
| Head and neck | 245 (29%) | 58 (27%) | 142 (38%) | |
| Thoracic | 51 (6%) | 13 (6%) | 32 (9%) | |
| Abdominal | 155 (18%) | 40 (19%) | 63 (17%) | |
| Pelvis | 54 (6%) | 15 (7%) | 22 (6%) | |
| Other | 200 (24%) | 52 (25%) | 114 (31%) | |
| N/A | 135 (16%) | 33 (16%) | 0 (0%) | |
| Ann Arbor stage, n (%) | | | | <0.05 |
| I | 118 (14%) | 33 (16%) | 56 (15%) | |
| II | 172 (20%) | 52 (25%) | 108 (29%) | |
| III | 148 (18%) | 39 (18%) | 55 (15%) | |
| IV | 402 (48%) | 87 (41%) | 154 (41%) | |
| Histologic subtype, n (%) | | | | <0.05 |
| HL | 65 (8%) | 16 (8%) | 21 (6%) | |
| DLBCL | 334 (40%) | 87 (41%) | 173 (46%) | |
| FL | 48 (6%) | 7 (3%) | 34 (9%) | |
| Other | 328 (46%) | 85 (48%) | 145 (39%) | |
| LDH (IU/L), mean ± SD | 381 ± 445 | 365 ± 350 | 314 ± 240 | <0.05 |

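
The table above does not say which test produced each p value. As a minimal sketch, assuming a Pearson chi-square test of independence was used for the categorical rows, the sex counts can be checked with the standard library alone. The function names are hypothetical, and the closed-form p value used here holds only for exactly 2 degrees of freedom, which is the case for a 2 × 3 table.

```python
import math

# Sex counts copied from the patient characteristics table.
# Columns: training, internal test, external test cohorts.
observed = [
    [491, 114, 231],  # male
    [349, 97, 142],   # female
]


def chi_square_statistic(table):
    """Return the Pearson chi-square statistic and degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count under independence of sex and cohort.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (obs - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def chi_square_p_value_dof2(stat):
    """Survival function of the chi-square distribution for 2 degrees of freedom only."""
    return math.exp(-stat / 2.0)


stat, dof = chi_square_statistic(observed)
p_value = chi_square_p_value_dof2(stat)
print(f"chi-square = {stat:.3f}, dof = {dof}, p = {p_value:.2f}")
```

Under this assumption the result is consistent with the reported p value of 0.17 for sex. Rows with more categories have more degrees of freedom and would need a general chi-square survival function, for example from a statistics library.
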
| Classification | Cohort | AUC (95% CI) | MCC (95% CI) | Sensitivity (95% CI) | Specificity (95% CI) |
|---|---|---|---|---|---|
| HL vs. NHL | Internal test cohort | 0.89 (0.78, 0.96) | 0.43 (0.26, 0.58) | 0.87 (0.82, 0.91) | 0.76 (0.54, 0.94) |
| HL vs. NHL | External test cohort | 0.84 (0.76, 0.92) | 0.28 (0.16, 0.38) | 0.77 (0.72, 0.81) | 0.76 (0.56, 0.93) |
| DLBCL vs. FL | Internal test cohort | 0.84 (0.74, 0.92) | 0.39 (0.17, 0.56) | 0.74 (0.65, 0.83) | 0.83 (0.56, 1.00) |
| DLBCL vs. FL | External test cohort | 0.76 (0.67, 0.84) | 0.25 (0.11, 0.38) | 0.68 (0.48, 0.75) | 0.65 (0.48, 0.81) |

| Classification | Cohort | AUC (95% CI) | MCC (95% CI) | Sensitivity (95% CI) | Specificity (95% CI) |
|---|---|---|---|---|---|
| HL vs. NHL | Internal test cohort | 0.87 (0.75, 0.95) | 0.41 (0.23, 0.56) | 0.83 (0.80, 0.88) | 0.82 (0.60, 1.00) |
| HL vs. NHL | External test cohort | 0.78 (0.68, 0.87) | 0.21 (0.07, 0.33) | 0.81 (0.76, 0.84) | 0.58 (0.35, 0.79) |
| DLBCL vs. FL | Internal test cohort | 0.81 (0.67, 0.92) | 0.26 (0.04, 0.47) | 0.74 (0.65, 0.83) | 0.64 (0.38, 0.90) |
| DLBCL vs. FL | External test cohort | 0.74 (0.66, 0.83) | 0.25 (0.10, 0.39) | 0.72 (0.66, 0.77) | 0.59 (0.42, 0.75) |

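
The performance tables report AUC, MCC, sensitivity, and specificity with 95% confidence intervals. The sketch below shows one way such values could be obtained, assuming a percentile bootstrap that resamples patients with replacement; the tables themselves do not confirm this procedure. The toy labels and scores, the 0.5 decision threshold, the choice of class 1 as the positive class, and all function names are hypothetical.

```python
import math
import random


def auc(y_true, scores):
    """Rank-based AUC: chance that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return math.nan
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def _counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity(y_true, y_pred):
    tp, _, _, fn = _counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else math.nan


def specificity(y_true, y_pred):
    _, tn, fp, _ = _counts(y_true, y_pred)
    return tn / (tn + fp) if (tn + fp) else math.nan


def mcc(y_true, y_pred):
    tp, tn, fp, fn = _counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def bootstrap_ci(metric, y_true, values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval; resamples lacking a needed class are skipped."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        value = metric([y_true[i] for i in idx], [values[i] for i in idx])
        if not math.isnan(value):
            stats.append(value)
    stats.sort()
    lo = stats[int(alpha / 2 * len(stats))]
    hi = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return lo, hi


# Hypothetical toy data: 8 positive and 4 negative cases with model scores.
y_true = [1] * 8 + [0] * 4
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.52, 0.4, 0.3, 0.1, 0.2, 0.35, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

for name, fn, values in [("AUC", auc, scores), ("MCC", mcc, y_pred),
                         ("Sensitivity", sensitivity, y_pred),
                         ("Specificity", specificity, y_pred)]:
    lo, hi = bootstrap_ci(fn, y_true, values)
    print(f"{name}: {fn(y_true, values):.2f} ({lo:.2f}, {hi:.2f})")
```

With small or imbalanced classes, such as the FL group in the internal test cohort, bootstrap intervals become wide, which matches the broad specificity intervals seen in the tables.
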
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Kim, S.; Park, J.H.; Kim, C.-H.; You, S.; Choi, J.-S.; Chang, J.W.; Jo, I.Y.; Lee, B.-J.; Park, I.-S.; Kim, H.S.; Park, Y.-J.; Heo, J. Robust Multimodal Deep Learning for Lymphoma Subtype Classification Using 18F-FDG PET Maximum Intensity Projection Images and Clinical Data: A Multi-Center Study. Cancers 2026, 18, 210. https://doi.org/10.3390/cancers18020210

