Reliable Radiologic Skeletal Muscle Area Assessment—A Biomarker for Cancer Cachexia Diagnosis
Highlights
- SMAART-AI is an uncertainty-aware CT muscle analysis pipeline that combines robust segmentation with ensemble uncertainty and triage, supporting reliable automated muscle quantification across heterogeneous cancer cohorts.
- SMAART-AI enables multimodal integration of imaging-derived muscle metrics (SMA/SMI) with clinical features, improving downstream modeling for prognostic tasks (survival) and clinical endpoints (e.g., cachexia/recurrence prediction).
- Uncertainty-based filtering creates a transparent deployment pathway by flagging higher-risk (noisy/out-of-distribution) cases for expert review while allowing scalable automated processing for routine cases.
- CT-derived muscle biomarkers can be operationalized at scale for cachexia assessment across cancers, strengthening prognostic stratification when combined with clinical data and supporting reproducible, longitudinal monitoring.
Abstract
1. Introduction
- End-to-end automated skeletal muscle quantification: SMAART-AI enables reproducible and scalable assessment of cancer cachexia across cohorts and supports longitudinal patient monitoring.
- Robust and uncertainty-aware segmentation: SMAART-AI employs a structurally diverse ensemble of nnU-Net models with random initialization to enhance robustness, particularly on noisy or out-of-distribution scans. We integrate multiple uncertainty estimation strategies and demonstrate a strong correlation between uncertainty and error, enabling performance-aware triage and reliable deployment.
- Benchmarking against existing tools: We systematically compare SMAART-AI to widely used commercial and open-source tools (ABACS, DAFS, AW Server, and TotalSegmentator). SMAART-AI demonstrates competitive or superior accuracy while providing reproducibility safeguards, open availability, and uncertainty quantification, which are absent from proprietary solutions.
- Clinical translation via multimodal prognostic modeling: By integrating SMA/SMI with clinical features, SMAART-AI improves prediction of cachexia, recurrence, and survival across multiple cancer types. This highlights the framework’s clinical utility as a data science-driven approach to prognostic modeling in oncology.
2. Materials and Methods
2.1. Datasets
2.1.1. Gastroesophageal
2.1.2. Colorectal
2.1.3. Pancreatic
2.1.4. Ovarian
2.2. Data Processing for Ground Truth Development
2.2.1. Annotations for Segmentation Model Training, Evaluation, and Comparative Analysis
2.2.2. Ground Truth Development for Cancer Cachexia Detection
2.3. SMAART-AI Framework for Reliable Skeletal Muscle Segmentation and Metric Extraction
2.3.1. Automated Selection of Axial Series and Lumbar-Level Slice
2.3.2. nnU-Net for Segmentation
2.3.3. Uncertainty Estimation Methods and Metrics
- Post hoc Calibration: The ‘netcal’ Python library (version 1.3.6) [50] with ‘LogisticCalibration’ (Platt scaling) was used. The calibration model was fitted to the deep learning (DL) model outputs and their corresponding labels. During inference, the DL model outputs were passed through the calibration model to obtain calibrated probabilities.
- Monte Carlo Dropout: A dropout layer (p = 0.20) was added after each convolutional layer in the ‘ResidualEncoderUNet’ architecture. The model was trained with 5-fold cross-validation, and inference was repeated 20 times per fold. At each iteration, the average of the 5-fold ensemble predictions was calculated. The final dropout prediction was the pixel-wise mean of these stochastic predictions, and uncertainty was computed as the mean pixel-wise variance across them.
- Model Ensemble: Ten models were used, five ‘PlainConvUNet’ and five ‘ResidualEncoderUNet’, corresponding to 5-fold cross-validation for each architecture. The final prediction was the pixel-wise mean across the ten models, and uncertainty was computed from the mean pixel-wise variance across model outputs.
- Average Probability: Calculated by taking the average of the output probabilities of the predicted class at each pixel in a single image. This metric captures the total uncertainty.
- Average Probability (SM): The average output probability over only those pixels marked as skeletal muscle (SM). This metric captures the total uncertainty.
- Average Calibrated Probability: Average of the calibrated output probabilities of the predicted class at each pixel in a single image. This metric captures the total uncertainty.
- Coefficient of Variation (pixel-wise): The pixel-wise coefficient of variation is the ratio of the standard deviation (SD) to the mean of the predicted probability across ensemble models or dropout passes; the reported value is its average over all pixels. This metric captures the epistemic uncertainty.
- Coefficient of Variation (SMA): Calculated as the ratio of the standard deviation to the mean of the SMA values estimated by each model in the ensemble, or by each repeated inference for the dropout method. This metric captures the epistemic uncertainty.
- Average Variance: It is calculated as the average of the variance computed for each pixel. The pixel-wise variance is calculated using the output probabilities from the ensemble models or multiple inferences using the dropout method. This metric captures the epistemic uncertainty.
- Average Variance (SM): Average of the variance for pixels identified as being part of the skeletal muscle (SM) only. This metric captures the epistemic uncertainty.
- Average Entropy: Estimates the total uncertainty by calculating the binary entropy at each pixel from the output probability averaged across the ensemble models or the repeated dropout inferences. The average entropy of all pixels across the image is reported.
- Expected Entropy of the Ensemble: Estimates aleatoric uncertainty by calculating the binary entropy at each pixel for all the models in the ensemble. The average entropy is computed for each pixel across all models, and the final reported value is the mean of these pixel-wise average entropies across the entire image.
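The metrics listed above can be sketched in a few lines of NumPy. This is a minimal illustration, not the SMAART-AI implementation: the function names, the 0.5 decision threshold, the `pixel_area` parameter, and the small `eps` added to denominators are all assumptions made for the example. The input is assumed to be a stack of per-pixel skeletal muscle probabilities, one map per ensemble member or dropout pass.

```python
import numpy as np


def binary_entropy(p, eps=1e-12):
    """Element-wise binary entropy (in nats) of a foreground probability."""
    p = np.clip(p, eps, 1.0 - eps)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))


def uncertainty_metrics(probs, pixel_area=1.0, threshold=0.5, eps=1e-12):
    """Summarize uncertainty for one image.

    probs: array of shape (n_members, H, W) holding the skeletal muscle
    probability from each ensemble member or dropout pass (assumed layout).
    """
    probs = np.asarray(probs, dtype=float)
    mean_p = probs.mean(axis=0)   # pixel-wise mean prediction
    var_p = probs.var(axis=0)     # pixel-wise variance across members
    sm = mean_p >= threshold      # pixels labeled skeletal muscle (assumed rule)

    # Probability of the predicted class at each pixel.
    pred_class_p = np.where(sm, mean_p, 1.0 - mean_p)
    # One SMA estimate per member: muscle pixel count times pixel area.
    sma = (probs >= threshold).sum(axis=(1, 2)) * pixel_area

    nan = float("nan")
    return {
        "average_probability": float(pred_class_p.mean()),
        "average_probability_sm": float(mean_p[sm].mean()) if sm.any() else nan,
        "cov_pixelwise": float((np.sqrt(var_p) / (mean_p + eps)).mean()),
        "cov_sma": float(sma.std() / (sma.mean() + eps)),
        "average_variance": float(var_p.mean()),
        "average_variance_sm": float(var_p[sm].mean()) if sm.any() else nan,
        # Entropy of the averaged prediction (total uncertainty).
        "average_entropy": float(binary_entropy(mean_p).mean()),
        # Per-member entropy, averaged over members and pixels (aleatoric).
        "expected_entropy": float(binary_entropy(probs).mean(axis=0).mean()),
    }
```

Because binary entropy is concave, the average entropy is never smaller than the expected entropy of the members; the gap between the two is commonly read as the epistemic share of the total uncertainty.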
2.3.4. Statistical Tests for Uncertainty Methods and Metrics
2.3.5. Methods for Identifying High-Error SMA Predictions in SMAART-AI
2.4. Survival Analysis Using SMAART-AI
2.5. Cancer Cachexia Prediction Using SMAART-AI
2.6. Recurrence Prediction Using SMAART-AI
3. Results
3.1. Comparison of the Predicted SMA Between SMAART-AI, TotalSegmentator, DAFS, ABACS, AW Server, and SliceOmatic
3.1.1. Gastroesophageal Dataset (Comparison of SMAART-AI and the Ground Truth Masks Generated by Experts Using SliceOmatic)
| Patient ID.scan | Ensemble: Area Difference Model vs. SliceOmatic (%) | Ensemble: Jaccard Score (%) | Ensemble: Dice Score (%) | Ensemble: False Positive | Ensemble: False Negative | Dropout: Area Difference Model vs. SliceOmatic (%) | Dropout: Jaccard Score (%) | Dropout: Dice Score (%) | Dropout: False Positive | Dropout: False Negative |
|---|---|---|---|---|---|---|---|---|---|---|
| 2.2 | 0.89 | 97.39 | 98.68 | 381 | 189 | 0.586 | 97.64 | 98.81 | 320 | 194 |
| 2.4 | −0.16 | 96.98 | 98.47 | 262 | 291 | 2.360 | 95.44 | 97.67 | 639 | 213 |
| 3.1 | 0.79 | 94.14 | 96.98 | 701 | 539 | −1.056 | 91.45 | 95.54 | 800 | 1016 |
| 4.1 | 0.17 | 98.98 | 99.49 | 119 | 86 | 0.240 | 98.96 | 99.48 | 129 | 81 |
| 4.2 | 1.67 | 97.04 | 98.50 | 505 | 147 | 2.149 | 96.86 | 98.41 | 577 | 115 |
| 5.1 | −1.13 | 94.52 | 97.18 | 298 | 448 | −1.277 | 94.61 | 97.23 | 281 | 451 |
| 5.2 | −0.76 | 95.10 | 97.49 | 272 | 369 | −0.820 | 95.72 | 97.82 | 226 | 331 |
| 5.3 | −0.75 | 94.54 | 97.19 | 374 | 489 | −0.052 | 94.21 | 97.02 | 456 | 464 |
| 5.4 | −0.37 | 94.59 | 97.22 | 429 | 490 | 0.272 | 93.08 | 96.42 | 617 | 572 |
| 7.3 | 5.17 | 86.69 | 92.87 | 2916 | 1393 | 6.251 | 86.56 | 92.80 | 3108 | 1267 |
| 9.1 | −0.66 | 97.28 | 98.62 | 197 | 323 | −1.451 | 97.41 | 98.69 | 109 | 384 |
| 9.2 | 1.95 | 95.01 | 97.44 | 669 | 302 | 2.288 | 94.93 | 97.40 | 709 | 279 |
| 9.3 | 11.14 | 86.11 | 92.53 | 3041 | 523 | 12.427 | 84.90 | 91.83 | 3366 | 556 |
| 15.1 | 5.87 | 92.58 | 96.14 | 488 | 73 | 5.504 | 93.33 | 96.55 | 445 | 56 |
| 15.2 | 0.83 | 98.26 | 99.12 | 259 | 93 | 0.496 | 98.33 | 99.16 | 218 | 119 |
| 15.3 | −1.97 | 96.90 | 98.42 | 106 | 471 | −1.881 | 96.73 | 98.34 | 131 | 479 |
| 15.4 | 2.64 | 92.34 | 96.02 | 1100 | 557 | 2.757 | 92.86 | 96.30 | 1054 | 488 |
| 15.5 | −0.81 | 92.98 | 96.36 | 701 | 877 | 0.597 | 93.36 | 96.56 | 815 | 685 |
| 16.1 | −0.18 | 97.94 | 98.96 | 172 | 205 | −0.320 | 97.85 | 98.91 | 168 | 226 |
| 16.2 | 1.54 | 94.12 | 96.97 | 729 | 436 | 1.799 | 93.88 | 96.84 | 779 | 436 |
| 16.3 | −0.80 | 96.21 | 98.07 | 238 | 363 | −0.436 | 96.22 | 98.07 | 266 | 334 |
| 21.1 | 0.52 | 96.85 | 98.40 | 125 | 90 | 0.328 | 97.27 | 98.62 | 104 | 82 |
| 21.3 | −0.24 | 94.84 | 97.35 | 481 | 527 | −0.173 | 94.93 | 97.40 | 479 | 512 |
| 21.5 | −0.10 | 93.99 | 96.90 | 612 | 612 | 0.015 | 94.21 | 97.02 | 582 | 579 |
| 23.2 | 19.85 | 79.75 | 88.73 | 4170 | 460 | 22.56 | 78.11 | 87.71 | 4665 | 448 |
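For readers who want to reproduce columns of this kind from their own masks, the sketch below computes the overlap metrics from a predicted and a reference binary mask. It is an illustration under stated assumptions rather than the authors' evaluation code: the function name is hypothetical, false positives and false negatives are taken to be pixel counts, and the area difference is assumed to be the signed difference in muscle pixel count relative to the reference mask (positive when the model segments more area).

```python
import numpy as np


def overlap_metrics(pred, truth):
    """Compare a predicted binary mask with a reference binary mask."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)

    tp = int(np.sum(pred & truth))    # muscle in both masks
    fp = int(np.sum(pred & ~truth))   # predicted muscle, not in reference
    fn = int(np.sum(~pred & truth))   # reference muscle missed by the model

    true_area = tp + fn
    if true_area == 0:
        raise ValueError("reference mask contains no foreground pixels")
    pred_area = tp + fp

    return {
        # Assumed sign convention: positive means the model over-segments.
        "area_difference_pct": 100.0 * (pred_area - true_area) / true_area,
        "jaccard_pct": 100.0 * tp / (tp + fp + fn),
        "dice_pct": 100.0 * 2 * tp / (2 * tp + fp + fn),
        "false_positive": fp,
        "false_negative": fn,
    }
```

The two overlap scores are tied by the identity Dice = 2J / (1 + J). For scan 2.2 in the ensemble columns, J = 0.9739 gives 2 × 0.9739 / 1.9739 ≈ 0.9868, matching the reported Dice score of 98.68%.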
3.1.2. Colorectal Dataset (Comparison of SMAART-AI, SliceOmatic, DAFS, and TotalSegmentator)
3.1.3. Pancreatic Dataset (Comparing SMAART-AI, SliceOmatic, AW Server, and TotalSegmentator)

3.1.4. Ovarian Dataset (Comparing SMAART-AI, SliceOmatic, ABACS, and TotalSegmentator)
3.2. Comparison of the Uncertainty Methods and Metrics
3.2.1. Correlation Between Model Uncertainty and SMA Estimation Difference
3.2.2. Model Uncertainty for Detecting Performance Degradation
3.3. Survival Analysis
3.4. Cancer Cachexia and Recurrence Prediction
3.5. Anecdotal Evidence of SMAART-AI Tool’s Utility
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Park, M.A.; Whelan, C.J.; Ahmed, S.; Boeringer, T.; Brown, J.; Carson, T.L.; Crowder, S.L.; Gage, K.; Gregg, C.; Jeong, D.K. Defining and Addressing Research Priorities in Cancer Cachexia through Transdisciplinary Collaboration. Cancers 2024, 16, 2364, Correction in Cancers 2025, 17, 971.
- Han, J.; Harrison, L.; Patzelt, L.; Wu, M.; Junker, D.; Herzig, S.; Berriel Diaz, M.; Karampinos, D.C. Imaging modalities for diagnosis and monitoring of cancer cachexia. EJNMMI Res. 2021, 11, 94.
- Mariean, C.R.; Tiucă, O.M.; Mariean, A.; Cotoi, O.S. Cancer cachexia: New insights and future directions. Cancers 2023, 15, 5590.
- Baracos, V.E.; Martin, L.; Korc, M.; Guttridge, D.C.; Fearon, K.C.H. Cancer-associated cachexia. Nat. Rev. Dis. Primers 2018, 4, 17105.
- Babic, A.; Rosenthal, M.H.; Sundaresan, T.K.; Khalaf, N.; Lee, V.; Brais, L.K.; Loftus, M.; Caplan, L.; Denning, S.; Gurung, A. Adipose tissue and skeletal muscle wasting precede clinical diagnosis of pancreatic cancer. Nat. Commun. 2023, 14, 4317.
- Al-Sawaf, O.; Weiss, J.; Skrzypski, M.; Lam, J.M.; Karasaki, T.; Zambrana, F.; Kidd, A.C.; Frankell, A.M.; Watkins, T.B.; Martínez-Ruiz, C. Body composition and lung cancer-associated cachexia in TRACERx. Nat. Med. 2023, 29, 846–858.
- Baba, M.R.; Buch, S.A. Revisiting cancer cachexia: Pathogenesis, diagnosis, and current treatment approaches. Asia-Pac. J. Oncol. Nurs. 2021, 8, 508–518.
- Vigano, A.A.L.; Morais, J.A.; Ciutto, L.; Rosenthall, L.; di Tomasso, J.; Khan, S.; Olders, H.; Borod, M.; Kilgour, R.D. Use of routinely available clinical, nutritional, and functional criteria to classify cachexia in advanced cancer patients. Clin. Nutr. 2017, 36, 1378–1390.
- Nakajima, N. Differential diagnosis of cachexia and refractory cachexia and the impact of appropriate nutritional intervention for cachexia on survival in terminal cancer patients. Nutrients 2021, 13, 915.
- Jiang, Y.; Zhao, Y.; Dai, J.; Yang, Q.; Tang, X.; Fu, L.; Mao, H.; Peng, X.-G. Imaging cancer-associated Cachexia: Utilizing clinical imaging modalities for early diagnosis. Radiol. Imaging Cancer 2025, 7, e240291.
- Mourtzakis, M.; Prado, C.M.M.; Lieffers, J.R.; Reiman, T.; McCargar, L.J.; Baracos, V.E. A practical and precise approach to quantification of body composition in cancer patients using computed tomography images acquired during routine care. Appl. Physiol. Nutr. Metab. 2008, 33, 997–1006.
- Shen, W.; Punyanitya, M.; Wang, Z.; Gallagher, D.; St-Onge, M.-P.; Albu, J.; Heymsfield, S.B.; Heshka, S. Total body skeletal muscle and adipose tissue volumes: Estimation from a single abdominal cross-sectional image. J. Appl. Physiol. 2004, 97, 2333–2338.
- Faron, A.; Luetkens, J.A.; Schmeel, F.C.; Kuetting, D.L.R.; Thomas, D.; Sprinkart, A.M. Quantification of fat and skeletal muscle tissue at abdominal computed tomography: Associations between single-slice measurements and total compartment volumes. Abdom. Radiol. 2019, 44, 1907–1916.
- Irlbeck, T.; Massaro, J.M.; Bamberg, F.; O’Donnell, C.J.; Hoffmann, U.; Fox, C.S. Association between single-slice measurements of visceral and abdominal subcutaneous adipose tissue with volumetric measurements: The Framingham Heart Study. Int. J. Obes. 2010, 34, 781–787.
- Styner, M.A.; Angelini, E.D. Medical Imaging 2017: Image Processing; SPIE: Bellingham, WA, USA, 2017.
- Popuri, K.; Cobzas, D.; Esfandiari, N.; Baracos, V.; Jägersand, M. Body composition assessment in axial CT images using FEM-based automatic segmentation of skeletal muscle. IEEE Trans. Med. Imaging 2015, 35, 512–520.
- Meesters, S.P.L.; Yokota, F.; Okada, T.; Takaya, M.; Tomiyama, N.; Yao, J.; Linguraru, M.G.; Summers, R.M.; Sato, Y. Multi Atlas-Based Muscle Segmentation in Abdominal CT Images with Varying Field of View. In Proceedings of the International Forum on Medical Imaging in Asia (IFMIA), Daejeon, Korea, 16–17 November 2012.
- Chung, H.; Cobzas, D.; Birdsell, L.; Lieffers, J.; Baracos, V. Automated Segmentation of Muscle and Adipose Tissue on CT Images for Human Body Composition Analysis; SPIE: Bellingham, WA, USA, 2009; pp. 197–204.
- Soria-Utrilla, V.; Sánchez-Torralvo, F.J.; Palmas-Candia, F.X.; Fernández-Jiménez, R.; Mucarzel-Suarez-Arana, F.; Guirado-Peláez, P.; Olveira, G.; García-Almeida, J.M.; Burgos-Peláez, R. AI-assisted body composition assessment using CT imaging in colorectal cancer patients: Predictive capacity for sarcopenia and malnutrition diagnosis. Nutrients 2024, 16, 1869.
- Nowak, S.; Faron, A.; Luetkens, J.A.; Geissler, H.L.; Praktiknjo, M.; Block, W.; Thomas, D.; Sprinkart, A.M. Fully automated segmentation of connective tissue compartments for CT-based body composition analysis: A deep learning approach. Investig. Radiol. 2020, 55, 357–366.
- Koitka, S.; Kroll, L.; Malamutmann, E.; Oezcelik, A.; Nensa, F. Correction to: Fully automated body composition analysis in routine CT imaging using 3D semantic segmentation convolutional neural networks. Eur. Radiol. 2020, 31, 4402.
- Park, H.J.; Shin, Y.; Park, J.; Kim, H.; Lee, I.S.; Seo, D.-W.; Huh, J.; Lee, T.Y.; Park, T.; Lee, J.; et al. Development and validation of a deep learning system for segmentation of abdominal muscle and fat on computed tomography. Korean J. Radiol. 2020, 21, 88–100.
- Dabiri, S.; Popuri, K.; Feliciano, E.M.C.; Caan, B.J.; Baracos, V.E.; Beg, M.F. Muscle segmentation in axial computed tomography (CT) images at the lumbar (L3) and thoracic (T4) levels for body composition analysis. Comput. Med. Imaging Graph. 2019, 75, 47–55.
- Bridge, C.P.; Rosenthal, M.; Wright, B.; Kotecha, G.; Fintelmann, F.; Troschel, F.; Miskin, N.; Desai, K.; Wrobel, W.; Babic, A.; et al. Fully-Automated Analysis of Body Composition from CT in Cancer Patients Using Convolutional Neural Networks; Springer: Berlin/Heidelberg, Germany, 2018; pp. 204–213.
- Magudia, K.; Bridge, C.P.; Bay, C.P.; Babic, A.; Fintelmann, F.J.; Troschel, F.M.; Miskin, N.; Wrobel, W.C.; Brais, L.K.; Andriole, K.P.; et al. Population-scale CT-based body composition analysis of a large outpatient population using deep learning to derive age-, sex-, and race-specific reference curves. Radiology 2021, 298, 319–329.
- Castiglione, J.; Somasundaram, E.; Gilligan, L.A.; Trout, A.T.; Brady, S. Automated segmentation of abdominal skeletal muscle on pediatric CT scans using deep learning. Radiol. Artif. Intell. 2021, 3, e200130.
- Dabiri, S.; Popuri, K.; Ma, C.; Chow, V.; Feliciano, E.M.C.; Caan, B.J.; Baracos, V.E.; Beg, M.F. Deep learning method for localization and segmentation of abdominal CT. Comput. Med. Imaging Graph. 2020, 85, 101776.
- Waqas, A.; Dera, D.; Rasool, G.; Bouaynaya, N.C.; Fathallah-Shaykh, H.M. Brain tumor segmentation and surveillance with deep artificial neural networks. In Deep Learning for Biomedical Data Analysis: Techniques, Approaches, and Applications; Springer: Cham, Switzerland, 2021; pp. 311–350.
- Ahmed, S.; Dera, D.; Hassan, S.U.; Bouaynaya, N.; Rasool, G. Failure detection in deep neural networks for medical imaging. Front. Med. Technol. 2022, 4, 919046.
- Dolezal, J.M.; Srisuwananukorn, A.; Karpeyev, D.; Ramesh, S.; Kochanny, S.; Cody, B.; Mansfield, A.S.; Rakshit, S.; Bansal, R.; Bois, M.C. Uncertainty-informed deep learning models enable high-confidence predictions for digital histopathology. Nat. Commun. 2022, 13, 6572.
- Nowak, S.; Theis, M.; Wichtmann, B.D.; Faron, A.; Froelich, M.F.; Tollens, F.; Geissler, H.L.; Block, W.; Luetkens, J.A.; Attenberger, U.I.; et al. End-to-end automated body composition analyses with integrated quality control for opportunistic assessment of sarcopenia in CT. Eur. Radiol. 2021, 32, 3142–3151.
- Waqas, A.; Farooq, H.; Bouaynaya, N.C.; Rasool, G. Exploring robust architectures for deep artificial neural networks. Commun. Eng. 2022, 1, 46.
- Waqas, A.; Bui, M.M.; Glassy, E.F.; El Naqa, I.; Borkowski, P.; Borkowski, A.A.; Rasool, G. Revolutionizing digital pathology with the power of generative artificial intelligence and foundation models. Lab. Investig. 2023, 103, 100255.
- Irving, B.A.; Weltman, J.Y.; Brock, D.W.; Davis, C.K.; Gaesser, G.A.; Weltman, A. NIH ImageJ and Slice-O-Matic computed tomography imaging software to quantify soft tissue. Obesity 2007, 15, 370–376.
- Rigiroli, F.; Zhang, D.; Molinger, J.; Wang, Y.; Chang, A.; Wischmeyer, P.E.; Inman, B.A.; Gupta, R.T. Automated versus manual analysis of body composition measures on computed tomography in patients with bladder cancer. Eur. J. Radiol. 2022, 154, 110413.
- Brown, L.R.; Thomson, G.G.; Gardner, E.; Chien, S.; McGovern, J.; Dolan, R.D.; McSorley, S.T.; Forshaw, M.J.; McMillan, D.C.; Wigmore, S.J. Cachexia index for prognostication in surgical patients with locally advanced oesophageal or gastric cancer: Multicentre cohort study. Br. J. Surg. 2024, 111, znae098.
- Beenish Zia, S.A.; Steve, J.; Jared, P.; SAntoine, A.; Yingpo, H.; Jerome, K.; Yannick, L.; Lionel, M.; Florentin, T. White Paper: Accelerate Your Visualization Experience; Intel Corporation, GE Healthcare: Santa Clara, CA, USA, 2020.
- Wasserthal, J.; Breit, H.-C.; Meyer, M.T.; Pradella, M.; Hinck, D.; Sauter, A.W.; Heye, T.; Boll, D.T.; Cyriac, J.; Yang, S. TotalSegmentator: Robust segmentation of 104 anatomic structures in CT images. Radiol. Artif. Intell. 2023, 5, e230024.
- Huang, L.; Ruan, S.; Xing, Y.; Feng, M. A review of uncertainty quantification in medical image analysis: Probabilistic and non-probabilistic methods. Med. Image Anal. 2024, 97, 103223.
- Faghani, S.; Moassefi, M.; Rouzrokh, P.; Khosravi, B.; Baffour, F.I.; Ringler, M.D.; Erickson, B.J. Quantifying uncertainty in deep learning of radiologic images. Radiology 2023, 308, e222217.
- Yu, J.; Spielvogel, C.; Haberl, D.; Jiang, Z.; Özer, Ö.; Pusitz, S.; Geist, B.; Beyerlein, M.; Tibu, I.; Yildiz, E. Systemic metabolic and volumetric assessment via whole-body [18F] FDG-PET/CT: Pancreas size predicts cachexia in head and neck squamous cell carcinoma. Cancers 2024, 16, 3352.
- Khosravi, P.; Fuchs, T.J.; Ho, D.J. Artificial Intelligence–Driven Cancer Diagnostics: Enhancing Radiology and Pathology through Reproducibility, Explainability, and Multimodality. Cancer Res. 2025, 85, 2356–2367.
- Rosen, A.W.; Ose, I.; Gögenur, M.; Andersen, L.P.K.; Bojesen, R.D.; Vogelsang, R.P.; Rose, M.H.; Steenfos, P.W.; Hansen, L.B.; Spuur, H.S. Clinical implementation of an AI-based prediction model for decision support for patients undergoing colorectal cancer surgery. Nat. Med. 2025, 31, 3737–3748.
- Permuth, J.B.; Dezsi, K.B.; Vyas, S.; Ali, K.N.; Basinski, T.L.; Utuama, O.A.; Denbo, J.W.; Klapman, J.; Dam, A.; Carballido, E. The Florida pancreas collaborative next-generation biobank: Infrastructure to reduce disparities and improve survival for a diverse cohort of patients with pancreatic cancer. Cancers 2021, 13, 809.
- Permuth, J.B.; Trevino, J.; Merchant, N.; Malafa, M.; Florida Pancreas Collaborative. Partnering to advance early detection and prevention efforts for pancreatic cancer: The Florida Pancreas Collaborative. Futur. Oncol. 2016, 12, 997–1000.
- Fearon, K.; Strasser, F.; Anker, S.D.; Bosaeus, I.; Bruera, E.; Fainsinger, R.L.; Jatoi, A.; Loprinzi, C.; MacDonald, N.; Mantovani, G.; et al. Definition and classification of cancer cachexia: An international consensus. Lancet Oncol. 2011, 12, 489–495.
- Khristenko, E.; Sinitsyn, V.; Rieden, T.; Girod, P.; Kauczor, H.-U.; Mayer, P.; Klauss, M.; Lyadov, V. CT-based screening of sarcopenia and its role in cachexia syndrome in pancreatic cancer. PLoS ONE 2024, 19, e0291185.
- Isensee, F.; Jaeger, P.F.; Kohl, S.A.A.; Petersen, J.; Maier-Hein, K.H. nnU-Net: A self-configuring method for deep learning-based biomedical image segmentation. Nat. Methods 2021, 18, 203–211.
- Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation; Springer: Cham, Switzerland, 2015; pp. 234–241.
- Kuppers, F.; Kronenberger, J.; Shantia, A.; Haselhoff, A. Multivariate confidence calibration for object detection. In Proceedings of the IEEE/CVF Conference On Computer Vision and Pattern Recognition Workshops, Seattle, WA, USA, 14–19 June 2020; IEEE: Piscataway, NJ, USA; pp. 326–327.
- Shen, M.; Bu, Y.; Sattigeri, P.; Ghosh, S.; Das, S.; Wornell, G. Post-hoc uncertainty learning using a dirichlet meta-model. In Proceedings of the AAAI Conference on Artificial Intelligence, Washington, DC, USA, 7–14 February 2023; AAAI Press: Palo Alto, CA, USA; pp. 9772–9781.
- Davidson-Pilon, C. lifelines: Survival analysis in Python. J. Open Source Softw. 2019, 4, 1317.
- Tripathi, A.; Waqas, A.; Venkatesan, K.; Yilmaz, Y.; Rasool, G. Building flexible, scalable, and machine learning-ready multimodal oncology datasets. Sensors 2024, 24, 1634.
- Waqas, A. From Graph Theory for Robust Deep Networks to Graph Learning for Multimodal Cancer Analysis; University of South Florida: Tampa, FL, USA, 2024.
- Waqas, A.; Naveed, J.; Shahnawaz, W.; Asghar, S.; Bui, M.M.; Rasool, G. Digital pathology and multimodal learning on oncology data. BJR/Artif. Intell. 2024, 1, ubae014.
- Waqas, A.; Tripathi, A.; Ramachandran, R.P.; Stewart, P.A.; Rasool, G. Multimodal data integration for oncology in the era of deep neural networks: A review. Front. Artif. Intell. 2024, 7, 1408843.
- Waqas, A.; Tripathi, A.; Stewart, P.; Naeini, M.; Rasool, G. Embedding-based Multimodal Learning on Pan-Squamous Cell Carcinomas for Improved Survival Outcomes. arXiv 2024, arXiv:2406.08521.
- Waqas, A.; Tripathi, A.; Ahmed, S.; Mukund, A.; Farooq, H.; Johnson, J.O.; Stewart, P.A.; Naeini, M.; Schabath, M.B.; Rasool, G. Self-Normalizing Multi-Omics Neural Network for Pan-Cancer Prognostication. Int. J. Mol. Sci. 2025, 26, 7358.
- Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic minority over-sampling technique. J. Artif. Intell. Res. 2002, 16, 321–357.
- Bland, J.M.; Altman, D. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986, 327, 307–310.
- Prado, C.M.; Lieffers, J.R.; McCargar, L.J.; Reiman, T.; Sawyer, M.B.; Martin, L.; Baracos, V.E. Prevalence and clinical implications of sarcopenic obesity in patients with solid tumours of the respiratory and gastrointestinal tracts: A population-based study. Lancet Oncol. 2008, 9, 629–635.
- Martin, L.; Birdsell, L.; MacDonald, N.; Reiman, T.; Clandinin, M.T.; McCargar, L.J.; Murphy, R.; Ghosh, S.; Sawyer, M.B.; Baracos, V.E. Cancer cachexia in the age of obesity: Skeletal muscle depletion is a powerful prognostic factor, independent of body mass index. J. Clin. Oncol. 2013, 31, 1539–1547.
- Davis, E.W.; Park, M.A.; Basinski, T.L.; Arnoletti, J.P.; Bloomston, M.; Carson, T.L.; Biachi De Castria, T.; Chen, D.-T.; Cortizas, E.M.; Crowder, S.L. The Impact of Edema on Skeletal Muscle Changes among Patients with Pancreatic Ductal Adenocarcinoma. Cancer Epidemiol. Biomark. Prev. 2025, 34, 1609–1617.




| Cancer Site | No. of Patients | No. of CT Scans | Segmentation Models: Training Set (Images) | Segmentation Models: Testing Set (Images) | Annotated (Images) | Survival/Prediction Models: Training Set (Patients) | Survival/Prediction Models: Testing Set (Patients) |
|---|---|---|---|---|---|---|---|
| Gastroesophageal | 24 | 70 | 45 | 25 | 70 | - | - |
| Colorectal | 60 | 60 | 0 | 90 * | 53 | 40 | 20 |
| Pancreatic | 153 | 222 † | 15 | 222 ^ | 109 | 100 | 30 |
| Ovarian | 324 | 324 | 0 | 324 | 154 | 125 | 50 |

| Characteristic | Colorectal | Ovarian | Pancreatic | |
|---|---|---|---|---|
| Total patient count | 60 | 175 | 130 | |
| Age at diagnosis, mean (SD) | 61.93 ± 12.50 | 64.29 ± 10.58 | 67.81 ± 10.80 | |
| BMI at diagnosis, mean (SD) | 27.50 ± 5.84 | 27.83 ± 6.00 | 28.18 ± 6.56 | |
| Weight at diagnosis, mean (SD) | 172.97 ± 42.99 | 162.17 ± 35.41 | 175.77 ± 43.59 | |
| Height at diagnosis, mean (SD) | 1.68 ± 0.10 | 1.63 ± 0.07 | 1.68 ± 0.11 | |
| Sex, N | | | | |
| Female | 28 | 175 | 58 | |
| Male | 32 | 0 | 72 | |
| Ethnicity, N | | | Race and Ethnicity, N | |
| Non-Hispanic/Non-Latinx | 53 | 165 | Non-Hispanic White | 107 |
| Hispanic/Latinx | 7 | 10 | Hispanic/Latinx | 13 |
| Race, N | | | Non-Hispanic Black | 10 |
| White | 56 | 163 | | |
| Black | 0 | 6 | | |
| Other | 4 | 6 | | |
| Stage, N | AJCC-7 | FIGO | TNM Stage (Pathological), N | |
| I | 8 | 9 | 1: 0 (T0/Tis, N0, M0) | 8 |
| II | 26 | 14 | 2: IA (T1, N0, M0) | 17 |
| III | 24 | 116 | 3: IB (T2, N0, M0) | 15 |
| IV | 1 | 36 | 4: IIA (T3, N0, M0) | 20 |
| NA | 1 | | 5: IIA (T1, N1, M0) | 1 |
| Grade/Differentiation, N | | | 6: IIA (T2, N1, M0) | 8 |
| Well | 3 | | 7: IIB (T3, N1, M0) | 4 |
| Moderate | 41 | 6 | 8: III (T4, Any N, M0) | 19 |
| Poor | 5 | 45 | 9: IV (Any T, Any N, M1) | 25 |
| Undifferentiated | 6 | 83 | 99: NA | 13 |
| NA | 5 | 41 | | |
| Tumor Sequence number *, N | | | | |
| 00 | | 5 | | |
| 01 | | 23 | | |
| 02 | | 7 | | |
| 03 | | 140 | | |

| Uncertainty Methods and Metrics | Dropout: GE | Ensemble: GE | Ensemble: CRC | Ensemble: Pan | Ensemble: Ova |
|---|---|---|---|---|---|
| Average Probability | −0.863 * | −0.842 * | −0.487 * | −0.503 * | −0.763 * |
| Average Calibrated Probability | | −0.813 * | −0.442 * | −0.316 | −0.782 * |
| Coefficient of Variation (pixel-wise) | 0.739 * | 0.852 * | 0.529 * | 0.526 * | 0.756 * |
| Coefficient of Variation (SMA) | 0.296 | 0.910 * | 0.759 * | 0.522 * | 0.660 * |
| Average Variance | 0.720 * | 0.866 * | 0.571 * | 0.546 * | 0.755 * |
| Average Variance (SM) | 0.664 * | 0.723 * | 0.647 * | 0.523 * | 0.798 * |
| Average Entropy | 0.867 * | 0.843 * | 0.474 * | 0.516 * | 0.749 * |
| Expected Entropy of the Ensemble | 0.869 * | 0.701 * | −0.442 * | −0.316 | 0.655 * |

| Dataset | Uncertainty Metric | Threshold | Flagged (%) | Threshold Percentile | Sensitivity | Specificity |
|---|---|---|---|---|---|---|
| Gastroesophageal | CoV (SMA) | 2.00 | 20.00% | 80.00% | 80.00% | 95.00% |
| Gastroesophageal | Avg variance | 0.50 | 28.00% | 72.00% | 80.00% | 85.00% |
| Colorectal | Avg variance | 0.12 | 67.90% | 32.10% | 71.00% | 36.40% |
| Colorectal | Avg variance | 0.20 | 34.00% | 66.00% | 38.70% | 72.70% |
| Colorectal | CoV (SMA) | 0.30 | 67.90% | 32.10% | 74.20% | 40.90% |
| Colorectal | CoV (SMA) | 0.50 | 43.40% | 56.60% | 45.20% | 59.10% |
| Pancreatic | Avg variance | 0.20 | 48.60% | 51.40% | 65.70% | 59.50% |
| Pancreatic | Avg variance | 0.40 | 14.70% | 85.30% | 25.00% | 90.40% |
| Pancreatic | CoV (SMA) | 0.60 | 49.50% | 50.50% | 62.90% | 56.80% |
| Pancreatic | CoV (SMA) | 1.00 | 29.40% | 70.60% | 42.90% | 77.00% |
| Ovarian | Avg variance | 0.35 | 66.23% | 33.77% | 79.60% | 70.00% |
| Ovarian | Avg variance | 0.60 | 36.36% | 63.64% | 45.60% | 89.70% |
| Ovarian | CoV (SMA) | 1.00 | 70.78% | 29.22% | 82.10% | 58.50% |
| Ovarian | CoV (SMA) | 1.50 | 46.10% | 53.90% | 57.10% | 82.90% |
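The threshold table can be read as a binary triage test: a scan is flagged when its uncertainty exceeds the threshold, and the flags are scored against a label marking high-error SMA predictions. The sketch below shows that bookkeeping. It is illustrative only; the function name, the strict `>` flagging rule, and the way scans are labeled as high-error (supplied by the caller) are assumptions, since the criterion is not given in this excerpt.

```python
import numpy as np


def triage_summary(uncertainty, is_high_error, threshold):
    """Score an uncertainty threshold as a detector of high-error scans."""
    u = np.asarray(uncertainty, dtype=float)
    err = np.asarray(is_high_error, dtype=bool)
    flagged = u > threshold  # assumed rule: flag scans above the threshold

    tp = int(np.sum(flagged & err))     # high-error scans sent to review
    fn = int(np.sum(~flagged & err))    # high-error scans that slip through
    fp = int(np.sum(flagged & ~err))    # acceptable scans sent to review
    tn = int(np.sum(~flagged & ~err))   # acceptable scans processed automatically

    nan = float("nan")
    flagged_pct = 100.0 * float(flagged.mean())
    return {
        "flagged_pct": flagged_pct,
        # Share of scans at or below the threshold.
        "threshold_percentile": 100.0 - flagged_pct,
        "sensitivity_pct": 100.0 * tp / (tp + fn) if (tp + fn) else nan,
        "specificity_pct": 100.0 * tn / (tn + fp) if (tn + fp) else nan,
    }
```

In every row of the table the flagged share and the threshold percentile sum to 100%, which is the relationship the sketch encodes. Lowering the threshold sends more scans to expert review and tends to raise sensitivity at the cost of specificity, as the paired rows for each dataset show.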

| Dataset | With BMI/SMI/SMA | With SMI/SMA | With BMI | Without BMI/SMI/SMA |
|---|---|---|---|---|
| Colorectal | 0.524 [0.29–0.80] | 0.560 [0.34–0.83] | 0.500 [0.28–0.78] | 0.548 [0.32–0.80] |
| Pancreatic | 0.660 [0.51–0.78] | 0.629 [0.48–0.76] | 0.613 [0.48–0.73] | 0.601 [0.47–0.71] |
| Ovarian | 0.676 [0.56–0.78] | 0.676 [0.54–0.77] | 0.659 [0.52–0.77] | 0.659 [0.52–0.77] |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Citation: Ahmed, S.; Parker, N.; Park, M.; Jeong, D.; Peres, L.C.; Davis, E.W.; Permuth, J.B.; Siegel, E.M.; Schabath, M.B.; Yilmaz, Y.; Rasool, G. Reliable Radiologic Skeletal Muscle Area Assessment—A Biomarker for Cancer Cachexia Diagnosis. Cells 2026, 15, 515. https://doi.org/10.3390/cells15060515

