Article

MRI-Based Radiomics for Outcome Stratification in Pediatric Osteosarcoma †

by Esther Ngan 1, Dolores Mullikin 2, Ashok J. Theruvath 3, Ananth V. Annapragada 1,3, Ketan B. Ghaghada 1,3, Andras A. Heczey 4 and Zbigniew A. Starosolski 1,3,*
1 Department of Radiology, Baylor College of Medicine, Houston, TX 77030, USA
2 Mary Bridge Children’s Hospital, Tacoma, WA 98403, USA
3 Department of Radiology, Texas Children’s Hospital, Mark A. Wallace Tower, 6701 Fannin Street, Suite 450, Houston, TX 77030, USA
4 Department of Pediatrics-Oncology, Baylor College of Medicine, Houston, TX 77030, USA
* Author to whom correspondence should be addressed.
† This paper is an extended version of our work presented in: Ngan, E.; Mullikin, D.; Theruvath, A.; Annapragada, A.; Ghaghada, K.; Heczey, A.; Starosolski, Z. Classification of osteosarcoma clinical outcomes using contrast-enhanced MRI radiomics and clinical variables. In Proceedings of the SPR 2025 Annual Meeting, Honolulu, HI, USA, 7–11 April 2025.
Cancers 2025, 17(15), 2586; https://doi.org/10.3390/cancers17152586
Submission received: 28 June 2025 / Revised: 31 July 2025 / Accepted: 3 August 2025 / Published: 6 August 2025
(This article belongs to the Special Issue The Roles of Deep Learning in Cancer Radiotherapy)

Simple Summary

Osteosarcoma (OS) is a rare and aggressive bone cancer affecting children and adolescents and is associated with a low survival rate. Predicting disease progression or response to treatment is challenging due to the tumor’s complexity. This research uses advanced MRI techniques and machine learning methods to predict important outcomes, including progressive disease, therapy response, relapse, and survival, in pediatric OS patients. The findings show that these models can be highly accurate, supporting better-informed treatment decisions. This could lead to better risk stratification, improved treatment planning, and ultimately better patient outcomes. The research also highlights the potential for these methods to be applied in other hospitals and research settings, benefiting the broader medical community.

Abstract

Background/Objectives: Osteosarcoma (OS) is the most common malignant bone tumor in children and adolescents; 5-year survival rates drop to as low as 24% in metastatic disease. Accurate prediction of clinical outcomes remains a challenge due to tumor heterogeneity and the complexity of pediatric cases. This study aims to improve predictions of progressive disease, therapy response, relapse, and survival in pediatric OS using MRI-based radiomics and machine learning methods. Methods: Pre-treatment contrast-enhanced coronal T1-weighted MR scans were collected from 63 pediatric OS patients, with an additional nine external cases used for validation. Three strategies were considered for target region segmentation (whole-tumor, tumor sampling, and bone/soft tissue) and used for MRI-based radiomics. These were then combined with clinical features to predict OS clinical outcomes. Results: The mean age of OS patients was 11.8 ± 3.5 years. Most tumors were located in the femur (65%). The osteoblastic subtype was the most common histological classification (79%). The majority of OS patients (79%) had no evidence of metastasis at diagnosis. Progressive disease occurred in 27% of patients, 59% showed an adequate therapy response, 25% experienced relapse after therapy, and 30% died from OS. Classification models based on bone/soft tissue segmentation generally performed the best, with certain clinical features improving performance, especially for therapy response and mortality. The top-performing classifier for each outcome achieved a validation ROC AUC of 0.94–1.0 and a testing ROC AUC of 0.63–1.0, while classifiers without radiomic features (RFs) generally performed suboptimally. Conclusions: This study demonstrates the strong predictive capability of MRI-based radiomics and multi-region segmentation for predicting clinical outcomes in pediatric OS.

1. Introduction

Osteosarcoma (OS) is the most common malignant bone tumor in children and adolescents. In the United States, approximately 1000 new cases of osteosarcoma are diagnosed each year, with about half occurring in children and adolescents [1]. OS arises when bone-forming cells become cancerous and is an aggressive cancer that can affect both bone and the surrounding soft tissues. While the tumor primarily originates in bone, it can invade nearby soft tissues, contributing to disease progression and complicating treatment outcomes. OS most commonly develops in the ends of long bones, particularly around the knee; it primarily occurs in the legs and arms but can also develop in other sites, such as the pelvis, shoulder, and skull [1].
The standard of care typically involves a combination of neoadjuvant chemotherapy, surgery, and adjuvant therapy to target any remaining cancer cells [1]. Radiation therapy might also be used, particularly when the cancer has spread to areas where surgery is not an option. Surgical options include limb salvage or amputation. Both of these treatment options can significantly impact the patient’s mobility and quality of life. Despite advancements in treatments including surgery and chemotherapy, outcomes remain suboptimal for certain patients. Specifically, survival rates drop significantly in cases of progressive disease during therapy, poor response to treatment, and relapse occurrence. The 5-year survival rate for localized OS is 64–76%, whereas for metastatic OS the survival rate drops significantly to 24% [2]. Diagnostic imaging plays a crucial role in the management of osteosarcoma, with X-rays, MRIs, and CT scans being commonly used. MRI is particularly valuable for providing detailed 3D images of both bone and soft tissue structures due to its superior soft-tissue contrast and non-ionizing nature [3], allowing for a precise assessment of the tumor’s extent. CT scans are primarily used to detect pulmonary metastases, as the lungs are the most common site of distant spread in OS patients [4,5].
Pediatric OS presents unique challenges compared to adult cases, as children and adolescents are still undergoing skeletal growth. This biological complexity, coupled with the heterogeneous nature of OS tumors, complicates the accurate prediction of key clinical outcomes. Existing prognostic factors such as tumor size, location, and histological response to chemotherapy provide some guidance but remain insufficient for precise risk stratification. There is a critical need for more advanced, non-invasive tools to improve the prediction of disease outcomes in pediatric OS.
Radiomics is an emerging field that involves extracting high-dimensional quantitative features from medical images, such as magnetic resonance imaging (MRI), to uncover patterns imperceptible to the human eye [6]. Radiomics data, when combined with other patient information, are analyzed using advanced bioinformatics tools to create mathematical models that have the potential to enhance diagnostic, prognostic, and predictive accuracy [6]. Radiomics has been applied to analyze different diseases, including cancers [7,8,9,10,11,12]. By analyzing these radiomic features (RFs), researchers can capture tumor heterogeneity, which is believed to reflect underlying biological processes such as tumor aggressiveness and treatment response. Radiomics has shown promise in various cancers, including brain tumors [13,14,15,16], lung cancer [17,18,19], and sarcomas [20,21,22,23], for predicting patient outcomes and guiding personalized treatment strategies. However, there is a lack of standardization in RF extraction and reporting across the field. The Image Biomarker Standardization Initiative (IBSI) has proposed guidelines to ensure reproducibility and comparability across radiomic studies [24]. Although radiomics is gaining traction in cancer research, only a few studies have explored its application in OS [10,25,26,27,28], and even fewer have followed the IBSI recommendations.
In addition to RFs, our analysis also considered demographics and pre-treatment clinical features such as skip lesions, OS subtype (osteoblastic, chondroblastic, etc.), OS location, and laterality. To further improve classification performance, we adopted a hierarchical model that included prior outcomes as predictors for subsequent outcomes. Specifically, progressive disease and therapy response were included when modeling relapse after therapy, whereas the previous three factors were included when modeling mortality. Previous studies have focused on individual clinical outcomes such as therapy response [10,29], pulmonary metastasis [5], relapse [27], and mortality [25,28,30]. However, these studies have generally examined each outcome separately without considering the interdependencies between them. The hierarchical approach enables us to capture the interdependencies of these clinical events, providing a more comprehensive view of the disease.
Furthermore, most prior studies have relied on whole-tumor segmentation without considering regional variations within the tumor, which could offer more insight into the tumor’s heterogeneity [25,26]. To address this gap, we explored three distinct segmentation strategies: (1) whole-tumor segmentation (standard practice); (2) tumor sampling from the whole tumor to account for intratumoral heterogeneity; and (3) bone/soft tissue separation. To our knowledge, no study has comprehensively explored pediatric OS by focusing on multiple clinical outcomes, MRI radiomics, and tumor structural heterogeneity through multi-region segmentation. This study seeks to address these gaps via a comprehensive approach to analyzing MRI data from pediatric OS patients with extremity tumors. Through this methodology, which integrates radiomics, advanced segmentation strategies, and combinations of clinical features, our study aims to enhance our understanding of pediatric OS and improve the prediction of critical clinical outcomes. Identifying reliable imaging biomarkers could facilitate early risk stratification, personalize treatment plans, and ultimately improve patient outcomes. This work represents a novel contribution to the field by addressing multiple clinical outcomes simultaneously and utilizing standardized radiomics features in a pediatric population. Preliminary results from this study were previously presented at the Society for Pediatric Radiology Annual Meeting [31].

2. Materials and Methods

2.1. Patients’ Cohorts

The study was conducted with approval from our institutional review board (H-50282). We identified 131 patients treated at a tertiary children’s hospital between 2006 and 2022. To identify predictors and outcomes, we searched the clinical notes in Epic using keywords such as “relapse”, “progression”, “skip lesion”, and “histology type”. Only patients with a complete medical record, pre-treatment post-contrast T1-weighted MRIs, and OS located in the upper or lower extremities were included (Figure 1). Ultimately, 63 patients were included in the analyses using whole-tumor/tumor-sampling segmentation, 26 of whom also underwent bone/soft tissue segmentation. An additional nine patients, whose pre-treatment scans were acquired outside our facility, served as an external validation cohort.

2.2. Evaluated Outcomes

This study evaluated four binary outcomes in patients diagnosed with OS. Figure 2 shows the timeline for data collection and outcome evaluation. The first outcome was “progressive disease”, which assessed whether the disease progressed during the neoadjuvant or adjuvant phase, as documented in the clinical notes. This included, but was not limited to, metastasis after diagnosis, tumor regrowth, and relapse during or at the end of therapy. The second outcome focused on “response to therapy”, defined as an “adequate” or “poor” response based on the percentage of necrosis on histopathology, with a threshold of 90% necrosis for a favorable response. If a range of necrosis percentages was reported, the midpoint was used for analysis (e.g., a range of 10–20% was recorded as 15%). The third outcome, “relapse/recurrence off therapy”, refers to any recurrence that occurred after the patient had been declared as having no evidence of disease (NED) or had completed all treatment. Patients who relapsed during or at the end of therapy were coded as “no” for relapse off therapy; such events were instead captured under progressive disease. Finally, the fourth outcome was “OS-related mortality”; patients who died from causes unrelated to OS were coded as “no”.
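As a minimal illustration of this coding rule, the sketch below converts a reported necrosis value or range into its midpoint and applies the 90% threshold; the function name and the input string format are hypothetical and not taken from the study’s pipeline.

```python
import re

def code_therapy_response(necrosis_report: str, threshold: float = 90.0) -> dict:
    """Turn a reported necrosis percentage (single value or range) into the
    binary therapy-response label: >= 90% necrosis counts as adequate."""
    # Pull the numbers out of strings such as "95%", "10-20%", or "10–20%".
    values = [float(v) for v in re.findall(r"\d+(?:\.\d+)?", necrosis_report)]
    necrosis = sum(values) / len(values)  # midpoint of a range, or the single value
    return {"necrosis_pct": necrosis, "adequate_response": necrosis >= threshold}

print(code_therapy_response("10-20%"))  # {'necrosis_pct': 15.0, 'adequate_response': False}
print(code_therapy_response("95%"))     # {'necrosis_pct': 95.0, 'adequate_response': True}
```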

2.3. Segmentation Method

The study employed three segmentation methods to delineate the tumor regions (Figure 3). The first method involved whole-tumor segmentation, where the entire tumor area was outlined to capture its complete extent; this approach might include adjacent normal tissue at the tumor’s edges. The second method utilized a tumor-sampling approach based on the whole-tumor mask, dividing the tumor into seven distinct, non-overlapping regions: top, bottom, front, back, left, right, and middle. The final method separated the bone and soft tissue regions within the tumor. The bone/soft tissue segmentation was performed by a pediatric radiologist with nine years of experience. Whole-tumor segmentation was subsequently conducted by a postdoctoral researcher with two years of experience in medical image analysis, building upon the initial bone/soft tissue segmentations. Segmentation was performed in 3D Slicer (ver. 5.2.2), and analyses were conducted in Python (ver. 3.11.4).
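The paper does not specify exactly how the seven sampling regions are carved out of the whole-tumor mask, so the NumPy sketch below illustrates one plausible partition rule under stated assumptions: voxels in the central third of the tumor’s bounding box form the “middle” region, and every other tumor voxel is assigned to the face of its dominant offset axis. The function name, the one-third core size, and the axis-to-direction naming (which depends on image orientation) are all assumptions rather than the authors’ implementation.

```python
import numpy as np

# Region 0 is "middle"; regions 1..6 are the negative/positive faces of axes 0, 1, 2.
# Which axis corresponds to left/right vs. front/back vs. top/bottom depends on
# the image orientation, so the names here are only placeholders.
REGIONS = ["middle", "left", "right", "front", "back", "bottom", "top"]

def sample_tumor_regions(mask: np.ndarray) -> dict:
    """Split a binary whole-tumor mask into seven disjoint sub-masks."""
    idx = np.argwhere(mask > 0)                      # voxel coordinates inside the tumor
    lo, hi = idx.min(axis=0), idx.max(axis=0)        # tumor bounding box
    center = (lo + hi) / 2.0
    half = np.maximum((hi - lo) / 2.0, 1e-6)
    offset = (idx - center) / half                   # normalized offset in [-1, 1] per axis

    labels = np.zeros(len(idx), dtype=int)           # default: middle region
    outside_core = np.abs(offset).max(axis=1) > 1 / 3
    dominant = np.abs(offset).argmax(axis=1)         # axis with the largest offset
    positive = offset[np.arange(len(idx)), dominant] > 0
    labels[outside_core] = (2 * dominant + 1 + positive)[outside_core]

    regions = {}
    for r, name in enumerate(REGIONS):
        sub = np.zeros_like(mask, dtype=np.uint8)
        sub[tuple(idx[labels == r].T)] = 1
        regions[name] = sub
    return regions

# Toy example: the seven sub-masks are disjoint and jointly cover the tumor.
toy = np.zeros((30, 30, 30), dtype=np.uint8)
toy[5:25, 8:22, 10:20] = 1
parts = sample_tumor_regions(toy)
assert sum(m.sum() for m in parts.values()) == toy.sum()
```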

2.4. Data Standardization and Feature Sets

Both local and external MRIs were standardized to isotropic spacing, and image pixels were min–max normalized. We selected three types of features for the classification analyses. First, only radiomic features recommended by the IBSI, with no filters applied, were analyzed (IBSI RFs, n = 107). Second, both IBSI RFs and filter-derived features were considered (all RFs, n = 107 + 1177); filters included wavelet and Laplacian of Gaussian transformations. RFs were calculated for each segmented region; thus, bone/soft tissue segmentation yielded twice as many RFs as whole-tumor segmentation, and region-based sampling yielded seven times as many. Finally, demographics and pre-treatment clinical features were included when relevant, including laterality, histology (e.g., osteoblastic subtype), tumor location, metastasis at diagnosis, and presence of a skip lesion. Clinical variables were meticulously reviewed and extracted from Epic charts by a pediatric hematologist–oncologist with nine years of experience and trained personnel to ensure accuracy and consistency. In addition to pre-treatment clinical features, we adopted a hierarchical approach to training classifiers for each of the four outcomes. For the progressive disease and therapy response outcomes, only pre-treatment clinical features were included as potential predictors. The relapse outcome classifiers incorporated pre-treatment clinical features, progressive disease, and percentage of necrosis, whereas the mortality outcome classifiers encompassed all previous features, i.e., pre-treatment clinical variables, progressive disease, percentage of necrosis, and relapse.
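A rough sketch of this preprocessing and feature-extraction step is shown below, using SimpleITK for isotropic resampling and min–max normalization and PyRadiomics for feature extraction (whose default “original” feature set appears consistent with the 107-feature count reported here, and which provides wavelet and Laplacian-of-Gaussian image filters). The 1 mm target spacing, the LoG sigma values, and the file names are placeholders, not the study’s actual settings.

```python
import SimpleITK as sitk
from radiomics import featureextractor  # PyRadiomics

def preprocess(image: sitk.Image, spacing: float = 1.0) -> sitk.Image:
    """Resample to isotropic voxels and min-max normalize intensities to [0, 1]."""
    size = [int(round(sz * sp / spacing))
            for sz, sp in zip(image.GetSize(), image.GetSpacing())]
    resampled = sitk.Resample(image, size, sitk.Transform(), sitk.sitkLinear,
                              image.GetOrigin(), [spacing] * 3, image.GetDirection(),
                              0.0, sitk.sitkFloat32)
    return sitk.RescaleIntensity(resampled, 0.0, 1.0)  # min-max normalization

# Extractor for the unfiltered ("original") feature set.
ibsi_extractor = featureextractor.RadiomicsFeatureExtractor()

# Second extractor that additionally enables wavelet and Laplacian-of-Gaussian
# filtered images for the expanded "all RFs" feature set (sigma values are placeholders).
all_extractor = featureextractor.RadiomicsFeatureExtractor()
all_extractor.enableImageTypeByName("Wavelet")
all_extractor.enableImageTypeByName("LoG", customArgs={"sigma": [1.0, 3.0]})

# Usage (file names are placeholders); the mask would need the same resampling,
# using nearest-neighbour interpolation to keep it binary.
# features = all_extractor.execute("t1_post_contrast.nii.gz", "whole_tumor_mask.nii.gz")
```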

2.5. Machine Learning

The ML pipeline began with splitting the dataset of 63 patients (or 26 for bone/soft tissue segmentation) into 80% for training and 20% for testing. RF reduction was performed on the training set in two steps. First, correlation analysis was applied to remove highly correlated RFs. With IBSI RFs (i.e., RFs computed without filters) only, a Spearman correlation of 0.9 was used as the cutoff; for feature sets involving filter use, a Spearman correlation of 0.8 was used. The varied cutoffs were chosen pragmatically based on the total number of features and the segmentation strategy, balancing feature redundancy reduction against preserving sufficient information for model training. In analyses using IBSI RFs only, the stricter 0.8 cutoff would have overly limited the feature set and reduced model flexibility in the subsequent steps. In contrast, with all RFs, the number of features increased dramatically to thousands (>7000 with tumor sampling); thus, we applied the stricter 0.8 cutoff to effectively reduce redundancy and computational burden. In the correlation-based feature reduction, priority was given to IBSI RFs. For models incorporating clinical features, these features were added to the reduced radiomic set. Neighborhood Component Analysis (NCA) was then applied to the updated feature set (whether composed of RFs alone or combined with clinical features), and all features were ranked by their relevance to the outcome. It is possible that the best-performing classifier might not include clinical features if they were not ranked highly enough for inclusion, even though they were part of the initial feature set. Following feature reduction, we applied 5-fold cross-validation for all models, except those using bone/soft tissue segmentation, where 3-fold CV was employed due to the limited number of cases. Linear and nonlinear classifiers with different parameter settings were evaluated, including logistic regression, K-nearest neighbors (KNN), linear discriminant analysis (LDA), support vector machine (SVM), random forest, naïve Bayes, ensembles, and multi-layer perceptron (MLP). Classification metrics, including receiver operating characteristic area under the curve (ROC AUC), precision–recall area under the curve (PR AUC), accuracy, sensitivity, and specificity, were calculated. Classifier performance was primarily evaluated based on the validation ROC AUC to determine the optimal classifier(s) for each outcome, feature type, and segmentation method. The same training set was used for analyses involving whole-tumor and tumor-sampling segmentation. In total, we selected 72 top-performing classifiers across four outcomes, three segmentation methods, and different combinations of radiomic and clinical features. Machine learning analyses were performed using the Scikit-learn Python package on a Linux workstation equipped with an Intel(R) Core(TM) i9-9900K CPU @ 3.60 GHz (16 logical cores), 64 GB of RAM, and a single NVIDIA GeForce GTX 1080 Ti GPU with 11 GiB of memory.
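The sketch below mirrors the described pipeline in scikit-learn on toy data: a greedy Spearman-correlation filter, NCA-based feature ranking, and 5-fold cross-validated comparison of several candidate classifiers. The greedy keep-first strategy, ranking features by the column norms of the learned NCA transformation, the top-15 cap, and the candidate model settings are illustrative assumptions, not the study’s exact implementation.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import NeighborhoodComponentsAnalysis
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def drop_correlated(X, cutoff):
    """Greedily keep features whose |Spearman rho| with every kept feature is <= cutoff."""
    rho, _ = spearmanr(X)
    rho = np.abs(rho)
    keep = []
    for j in range(X.shape[1]):
        if all(rho[j, k] <= cutoff for k in keep):
            keep.append(j)
    return keep

def rank_by_nca(X, y, top_k=15):
    """Rank features by the column norms of the learned NCA transformation."""
    nca = NeighborhoodComponentsAnalysis(random_state=0)
    nca.fit(StandardScaler().fit_transform(X), y)
    weights = np.linalg.norm(nca.components_, axis=0)
    return np.argsort(weights)[::-1][:top_k]

# Toy data standing in for the radiomic + clinical feature matrix (63 patients).
rng = np.random.default_rng(0)
X = rng.normal(size=(63, 120))
y = rng.integers(0, 2, size=63)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)

kept = drop_correlated(X_tr, cutoff=0.9)              # 0.9 for IBSI RFs, 0.8 for all RFs
ranked = [kept[i] for i in rank_by_nca(X_tr[:, kept], y_tr)]

candidates = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "SVM (rbf)": SVC(kernel="rbf"),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
}
for name, clf in candidates.items():
    auc = cross_val_score(make_pipeline(StandardScaler(), clf),
                          X_tr[:, ranked], y_tr, cv=5, scoring="roc_auc")
    print(f"{name}: validation ROC AUC {auc.mean():.2f} ± {auc.std():.2f}")
```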

2.6. Statistical Analysis

The Kruskal–Wallis test, chi-square test, or Fisher exact test, as appropriate, was used to compare the distribution of predictors and outcomes among the different cohorts: the full sample of 63 patients, the subset of 26 patients for whom bone/soft tissue segmentation was also performed, and the 9 patients with external scans. All statistical analyses used a significance level of 0.05.
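For reference, these tests map directly onto SciPy calls; the sketch below uses illustrative inputs (the sex counts are taken from Table 1, the ages are simulated) and notes that SciPy’s fisher_exact handles only 2 × 2 tables.

```python
import numpy as np
from scipy.stats import chi2_contingency, fisher_exact, kruskal

# Continuous variable (e.g., age) compared across the three cohorts: Kruskal-Wallis.
rng = np.random.default_rng(1)
age_full, age_sub, age_ext = rng.normal(12, 3.5, (3, 20))   # simulated ages
print("Kruskal-Wallis p =", kruskal(age_full, age_sub, age_ext).pvalue)

# Categorical variable (sex) across cohorts: chi-square on the contingency table
# (male/female counts per cohort, taken from Table 1).
table = np.array([[43, 20], [18, 8], [7, 2]])
chi2, p, dof, expected = chi2_contingency(table)
print("Chi-square p =", p)

# Fisher's exact test when expected counts are small; SciPy's implementation
# covers 2 x 2 tables only, e.g., comparing just two cohorts.
odds, p_fisher = fisher_exact([[18, 8], [7, 2]])
print("Fisher exact p =", p_fisher)
```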

3. Results

3.1. Descriptive Statistics

Table 1 shows the summary statistics for the three cohorts: the full sample (63 patients), the sub-cohort with bone/soft tissue segmentation (26 patients), and the external cohort (9 patients). Among the 63 patients, the mean age was 11.82 years (SD 3.53), with a predominance of males (43; 68.25%), Caucasians (52; 82.54%), and non-Hispanics (32; 50.79%). OS most frequently occurred in the femur (41; 65.08%). The osteoblastic subtype (50; 79.37%) was the most common histological classification. At the time of diagnosis, 13 (20.63%) patients had pulmonary metastases and 8 (12.70%) exhibited skip lesions. In terms of outcomes, 17 (26.98%) patients had progressive disease during therapy. The median percentage of necrosis on histopathology was 93% (mean 77.3%). A total of 16 (25.40%) patients experienced relapse after therapy, and 19 (30.16%) died from OS-related complications. No significant differences were found among the three cohorts for any variable except laterality, where patients with external scans had a significantly lower percentage of right-sided OS.

3.2. Outcome Interdependencies

Progressive disease was significantly associated with therapy response (p < 0.0001) (Table 2). Among patients with progressive disease during therapy, 14 out of 17 (82.4%) had a poor therapy response, compared to 12 out of 46 (26.1%) among those without progressive disease. Relapse off therapy was not significantly associated with any prior outcomes (p > 0.05). OS-related mortality showed a highly significant association with progressive disease during therapy (p < 0.0001) and relapse off therapy (p = 0.009). Metastasis at diagnosis was significantly associated with the occurrence of relapse off therapy and OS-related mortality (p = 0.008 and 0.037). Chondroblastic subtype was also significantly more prevalent in the deceased group (p = 0.033).

3.3. Classification Results

Depending on the feature set and segmentation approach used, the training times for classifiers varied. On average, each outcome- and segmentation-specific analysis required 1.5 to 4.5 h to complete. The shortest runtimes were observed when using IBSI RFs combined with whole-tumor segmentation, while analyses including all RFs and tumor-sampling segmentation required the longest processing times.

3.3.1. Progressive Disease

Table 3 shows the detailed classification results for progressive disease. A list of selected features can be found in Supplementary Table S1. Classifiers derived from bone/soft tissue segmentation demonstrated higher validation and testing performance in terms of AUC compared to those developed using whole-tumor or tumor-sampling segmentation methods. A tumor being located in the humerus consistently ranked as the top clinical feature in the classifiers. Comparatively, the classifiers without any RFs generally had poor validation and testing classification results.
For bone/soft tissue segmentation, all top-performing classifiers showed excellent validation performance, with ROC AUC above 0.94. With IBSI RFs, adding baseline clinical features did not improve testing performance. The ROC AUC reached a maximum of 0.88 with an LDA classifier and a tissue RF (original_glrlm_RunLengthNonUniformity_tissue). While using all RFs optimized the validation performance, it reduced the testing performance and required an additional 13 features.
For whole-tumor segmentation, the best-performing classifier in terms of validation ROC AUC was a random forest classifier with 13 filtered RFs and OS location in the humerus. The classifier achieved a validation ROC AUC of 0.92 ± 0.09, with a testing ROC AUC of 0.51. Adding clinical features to filtered RFs improved the validation ROC AUC, with the largest relative gain of 35.3%. However, the testing ROC AUC remained similar, with the highest value of 0.67 obtained without clinical features.
For tumor sampling, classifiers using all RFs slightly outperformed those using IBSI RFs in terms of validation ROC AUC. Compared to the classifier using IBSI RFs alone, adding baseline clinical features improved the validation ROC AUC to 0.84 (+12%) while decreasing the testing ROC AUC to 0.63 (−8.7%). Using only all RFs further increased the validation ROC AUC to 0.91 and the testing ROC AUC to 0.86 with an MLP classifier and 15 filtered RFs (4 left, 4 front, 2 back, 5 bottom regions). Adding clinical features slightly reduced the validation ROC AUC to 0.90 but improved the testing ROC AUC to 0.89 (+3.5%).

3.3.2. Response to Therapy

Table 4 shows the detailed classification results for response to therapy. A list of selected features can be found in Supplementary Table S2. In general, classifiers derived from bone/soft tissue segmentation demonstrated higher validation AUC values compared to those using whole-tumor or tumor-sampling segmentation methods. The chondroblastic subtype often ranked as one of the most important clinical features. Comparatively, the classifiers without any RFs generally had poor validation and testing classification results.
For bone/soft tissue segmentation, all top-performing classifiers achieved perfect validation classification results. However, on the testing set, adding clinical features led to decreased ROC AUC (max −16.7%). The best classifier overall belonged to the KNN (k = 5) model with four bone RFs with filters, which yielded perfect validation and testing classification metrics.
For whole-tumor segmentation, classifiers using all RFs slightly outperformed those based on IBSI RFs, with validation ROC AUC improving by up to 0.10 (+12.0%). Among models using IBSI RFs, an MLP classifier incorporating six RFs, chondroblastic subtype, gender, and presence of skip lesion, performed the best, with validation and testing ROC AUCs of 0.86 ± 0.09 and 0.80, respectively. With all RFs, an SVM with 14 RFs with filters had validation and testing ROC AUCs of 0.93 ± 0.06 and 0.78, respectively.
For tumor sampling, nearly all top-performing classifiers achieved a validation ROC AUC above 0.90, with the highest (0.98 ± 0.02) obtained from a polynomial SVM and 11 RFs. However, that classifier only had a testing ROC AUC of 0.6. The second-best classifier belonged to a random forest classifier using three IBSI RFs (top, left, right region), subtype OS, and gender, yielding a validation ROC AUC of 0.94 ± 0.05 and the highest testing ROC AUC of 0.76.

3.3.3. Relapse off Therapy

Table 5 shows the detailed classification results for relapse off therapy. A list of selected features can be found in Supplementary Table S3. Classifiers derived from bone/soft tissue segmentation demonstrated higher validation AUC values compared to those based on whole-tumor or tumor-sampling segmentation methods. Among demographic and clinical features, the presence of skip lesions and metastasis at diagnosis consistently ranked among the most important predictors. In contrast, adding prior outcomes (i.e., progressive disease and percentage of necrosis) did not improve classification performance.
For bone/soft tissue segmentation, all top-performing classifiers achieved perfect validation classification results, except for the model using only IBSI RFs. Models using all RFs achieved comparable validation and testing results relative to the IBSI RF models. However, this required 3× as many features and had slightly lower testing accuracy and specificity. The best-performing classifier was an SVM model using three features (skip lesion, original_glszm_ZoneVariance_tissue, original_firstorder_Energy_bone), with a testing ROC AUC and PR AUC of 0.75.
In the case of whole-tumor segmentation, the highest validation ROC AUC (0.89 ± 0.11) was obtained by an SVM classifier incorporating eight filtered RFs, along with metastasis at diagnosis and skip lesion. However, the model’s testing ROC AUC was only 0.56. In comparison, an MLP using five features (metastasis at diagnosis, skip lesion, and three filtered RFs) achieved a validation ROC AUC of 0.78 ± 0.25, a testing ROC AUC of 0.73, and perfect sensitivity (1.0).
In tumor sampling, classifiers using all RFs showed marginal improvements in validation performance and similar test ROC AUCs, consistently yielding higher testing sensitivity at the cost of reduced accuracy. An MLP classifier with ten features (metastasis at diagnosis, skip lesion, and eight filtered RFs) attained the highest validation ROC AUC (0.93 ± 0.07), though its testing ROC AUC was 0.67 with perfect sensitivity. Using IBSI RFs, a polynomial SVM with 13 features (metastasis at diagnosis, skip lesion, chondroblastic subtype, humerus location, and nine RFs) achieved a validation ROC AUC of 0.86 ± 0.13, a testing ROC AUC of 0.73, and a specificity of 0.90.

3.3.4. OS-Related Mortality

Table 6 shows the detailed classification results for OS-related mortality. A list of selected features can be found in Supplementary Table S4. Most classifiers derived from bone/soft tissue segmentation achieved perfect classification performance on both the validation and testing sets, outperforming those based on the other two segmentation approaches. These high-performing models included a random forest classifier incorporating progressive disease and three filtered RFs (two bone RFs, one tissue RF), and a sigmoid SVM with seven IBSI features (four tissue RFs, three bone RFs).
For whole-tumor segmentation, classifiers using filtered RFs generally outperformed those based on IBSI RFs (up to +44.9% in ROC AUC). The best-performing model was a KNN classifier (k = 5) using six features: progressive disease, relapse status, percentage of necrosis, and three filtered RFs. This model achieved perfect validation performance, a testing ROC AUC of 0.97, and perfect classification accuracy on the external test set. Incorporating prior clinical outcomes significantly improved the performance of classifiers using IBSI RFs but had limited benefit when all RFs were included.
With tumor-sampling segmentation, the top-performing model in terms of validation ROC AUC was an SVM classifier using only three prior outcomes: progressive disease, percentage of necrosis, and relapse status. This model achieved a validation ROC AUC of 0.98 ± 0.03 and a testing ROC AUC of 0.92. The second-best model was a KNN (k = 8) classifier with eight features (progressive disease, relapse status, six filtered RFs), achieving a validation ROC AUC of 0.98 ± 0.04 and the highest testing ROC AUC of 0.94. This model also obtained an accuracy of 0.78 on the external set.
Figure 4 shows the middle slice of an MRI from patients who were either misclassified or correctly classified by the majority of the models. The patients’ demographics and clinical information are presented in Table 7.

4. Discussion

This study explored the use of filtered and IBSI-compliant radiomics using MRI data from pediatric OS patients with extremity tumors. Our models demonstrated robust predictive capabilities for multiple clinical outcomes, including progressive disease, therapy response, relapse occurrence, and mortality. By integrating clinical features and hierarchical modeling, in which prior outcomes served as predictors for subsequent events, we captured the interdependencies of these clinical events and obtained a more comprehensive view of disease trajectories. Certain combinations of features and classifiers generalized well to external data, especially those predicting OS-related mortality, demonstrating the potential of our models to be applied across different healthcare facilities and datasets. This generalizability underscores the practical utility of our findings in real-world clinical settings.
Previous studies investigating MRI-based radiomics in OS have primarily focused on a single clinical outcome, such as chemotherapy response or survival. A multicenter T1 post-contrast MRI radiomics study reported a maximum AUC of 0.88 for predicting therapy response [32], while a study using T2-weighted MRI radiomics reported lower predictive performance (AUC = 0.708) [25]. For survival outcomes, published C-indices range from 0.741 (T2-weighted MRI radiomics) [25] to 0.813 (diffusion MRI radiomics) [28]. Although direct comparison is limited by differences in evaluation metrics, our study demonstrated comparatively higher AUC values, and the corresponding C-indices would be expected to exceed this range. Moreover, previous analyses were often not pediatric-specific and relied on whole-tumor segmentation alone, overlooking the spatial heterogeneity and tissue-specific behavior of OS. Standardized feature extraction protocols, such as those recommended by the IBSI, were not consistently applied or reported in earlier studies, potentially affecting the reproducibility and comparability of radiomic studies. In contrast, our study adopts an IBSI-compliant radiomics pipeline and applies image filters to enhance feature sensitivity. In addition, we introduced multi-region segmentation strategies (including tumor sampling and bone/soft tissue separation) to better account for structural heterogeneity in pediatric OS. By jointly modeling multiple clinical outcomes and using more biologically informed segmentation, our work offers a more comprehensive and reproducible approach, thereby advancing radiomics research in OS. Notably, models based on bone and soft tissue segmentation consistently outperformed those using whole-tumor or tumor-sampling segmentation, despite the smaller sample size. Certain RFs from bone and soft tissue regions, including filtered features that capture textural patterns imperceptible to the human eye, emerged as critical predictors across multiple outcomes. These results highlight the value of capturing the distinct growth patterns in bone and surrounding soft tissue that are characteristic of OS and may reflect different biological processes. Clinical features, such as the presence of metastasis at diagnosis, skip lesions, progressive disease, and OS subtype, further enhanced model performance. In contrast, models excluding RFs often required more features and demonstrated lower predictive performance (by as much as 45% in validation ROC AUC and 33% in testing ROC AUC).
Despite these promising findings, our study has limitations. First, the manual effort required for bone/soft tissue segmentation constrained the training sample size and precluded external validation of these models. Automated segmentation tools could address this limitation in future research, reducing labor demands and facilitating broader validation efforts. Second, due to resource limitations and the availability of only one radiologist, intra- and inter-observer variability testing was not performed. Third, our exclusive reliance on post-contrast T1-weighted MRI may have overlooked valuable information from other sequences, such as T2-weighted or diffusion-weighted imaging, which might provide additional insights into tumor characteristics and heterogeneity. Future studies incorporating multimodal imaging data are warranted.
In the future, we aim to validate these findings using larger, multicenter datasets and incorporating additional imaging modalities and longitudinal time factors. Moreover, exploring automated feature extraction and segmentation methods will be essential for enhancing reproducibility and scalability.

5. Conclusions

Our findings contribute to the growing evidence supporting the utility of radiomics in pediatric OS. We have demonstrated the potential of T1w-MRI-derived radiomics and certain pre-treatment variables for improving the prediction of critical clinical outcomes. Uniquely, this study addresses multiple outcomes in a pediatric population simultaneously using an IBSI-compliant radiomics framework, including multi-region segmentation strategies to capture tumor heterogeneity more effectively. Our work underscores the need for reproducible radiomics pipelines in future studies. The approach has the potential to support pediatric OS clinical risk stratification, inform treatment planning, and ultimately enable more tailored treatment strategies to improve patient outcomes.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/cancers17152586/s1, Table S1: Selected features from classification analysis—Progressive disease; Table S2: Selected features from classification analysis—Response to therapy; Table S3: Selected features from classification analysis—Relapse off therapy; Table S4: Selected features from classification analysis—OS related mortality.

Author Contributions

Conceptualization, E.N., D.M., A.J.T., A.A.H., K.B.G. and Z.A.S.; methodology, E.N., A.J.T. and Z.A.S.; software, E.N. and Z.A.S.; formal analysis, E.N. and Z.A.S.; investigation, E.N. and Z.A.S.; resources, E.N., D.M., A.J.T. and Z.A.S.; data curation, E.N., D.M. and A.J.T.; writing—original draft preparation, E.N.; writing—review and editing, E.N., D.M., A.J.T., A.V.A., K.B.G., A.A.H. and Z.A.S.; visualization, E.N. and Z.A.S.; supervision, A.V.A., K.B.G. and Z.A.S.; funding acquisition, A.V.A., K.B.G. and Z.A.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Barineau, Torian, and Levy Families.

Institutional Review Board Statement

The study was approved by the Institutional Review Board of Baylor College of Medicine (protocol code H-50282, approved on 6 October 2021).

Informed Consent Statement

Patient consent was waived because of the retrospective nature of the study. The research required a large number of past patients’ records; re-contacting these individuals is logistically challenging and, in some cases, impossible due to patient relocation and loss to follow-up.

Data Availability Statement

The data presented in this study are available upon request from the corresponding author due to privacy and confidentiality restrictions associated with hospital records.

Conflicts of Interest

E.N., D.M., and A.J.T. have nothing to disclose. A.V.A. owns stock in Abbott Laboratories Inc., Abbvie Inc., Alzeca Inc., and Sensulin LLC. A.V.A. holds patents in the field of nanoparticle-enhanced imaging. A.A.H. has patents related to GPC3-CARs. A.A.H. is a consultant for Waypoint Bio and served on the Scientific Advisory Board of CARGO Therapeutics. A.A.H. has equity in CARGO. A.A.H. received research support from Kuur/Athenex Therapeutics. A.A.H. has pending patent applications related to CAR T cells. K.B.G. holds patents in nanoparticle contrast-enhanced imaging. Z.A.S. owns stock in Alzeca Inc. and Advertum Biotechnologies.

Abbreviations

The following abbreviations are used in this manuscript:
OS: Osteosarcoma
RF: Radiomic feature
MRI: Magnetic resonance imaging
CT: Computed tomography
IBSI: Image Biomarker Standardization Initiative
KNN: K-nearest neighbors
SVM: Support vector machine
LDA: Linear discriminant analysis
MLP: Multilayer perceptron classifier
CV: Cross-validation
ROC AUC: Area under the receiver operating characteristic curve
PR AUC: Area under the precision–recall curve

References

  1. Callan, A. OrthoInfo. Osteosarcoma—OrthoInfo—AAOS. Available online: https://www.orthoinfo.org/en/diseases--conditions/osteosarcoma/ (accessed on 21 February 2025).
  2. American Cancer Society. Survival Rates for Osteosarcoma. Available online: https://www.cancer.org/cancer/types/osteosarcoma/detection-diagnosis-staging/survival-rates.html (accessed on 21 February 2025).
  3. Aisen, A.; Martel, W.; Braunstein, E.; McMillin, K.; Phillips, W.; Kling, T. MRI and CT evaluation of primary bone and soft-tissue tumors. Am. J. Roentgenol. 1986, 146, 749–756. [Google Scholar] [CrossRef]
  4. Meyers, P.A.; Gorlick, R. OSTEOSARCOMA. Pediatr. Clin. N. Am. 1997, 44, 973–989. [Google Scholar] [CrossRef]
  5. Pereira, H.M.; Leite Duarte, M.E.; Ribeiro Damasceno, I.; De Oliveira Moura Santos, L.A.; Nogueira-Barbosa, M.H. Machine learning-based CT radiomics features for the prediction of pulmonary metastasis in osteosarcoma. Br. J. Radiol. 2021, 94, 20201391. [Google Scholar] [CrossRef] [PubMed]
  6. Gillies, R.J.; Kinahan, P.E.; Hricak, H. Radiomics: Images Are More than Pictures, They Are Data. Radiology 2016, 278, 563–577. [Google Scholar] [CrossRef] [PubMed]
  7. Sun, R.; Limkin, E.J.; Vakalopoulou, M.; Dercle, L.; Champiat, S.; Han, S.R.; Verlingue, L.; Brandao, D.; Lancia, A.; Ammari, S.; et al. A radiomics approach to assess tumour-infiltrating CD8 cells and response to anti-PD-1 or anti-PD-L1 immunotherapy: An imaging biomarker, retrospective multicohort study. Lancet Oncol. 2018, 19, 1180–1191. [Google Scholar] [CrossRef]
  8. Coroller, T.P.; Agrawal, V.; Narayan, V.; Hou, Y.; Grossmann, P.; Lee, S.W.; Mak, R.H.; Aerts, H.J.W.L. Radiomic phenotype features predict pathological response in non-small cell lung cancer. Radiother. Oncol. 2016, 119, 480–486. [Google Scholar] [CrossRef]
  9. Wu, W.; Parmar, C.; Grossmann, P.; Quackenbush, J.; Lambin, P.; Bussink, J.; Mak, R.; Aerts, H.J.W.L. Exploratory Study to Identify Radiomics Classifiers for Lung Cancer Histology. Front. Oncol. 2016, 6, 71. [Google Scholar] [CrossRef]
  10. Bouhamama, A.; Leporq, B.; Khaled, W.; Nemeth, A.; Brahmi, M.; Dufau, J.; Marec-Bérard, P.; Drapé, J.-L.; Gouin, F.; Bertrand-Vasseur, A.; et al. Prediction of Histologic Neoadjuvant Chemotherapy Response in Osteosarcoma Using Pretherapeutic MRI Radiomics. Radiol. Imaging Cancer 2022, 4, e210107. [Google Scholar] [CrossRef]
  11. Conti, A.; Duggento, A.; Indovina, I.; Guerrisi, M.; Toschi, N. Radiomics in breast cancer classification and prediction. Semin. Cancer Biol. 2021, 72, 238–250. [Google Scholar] [CrossRef]
  12. Granzier, R.W.Y.; Verbakel, N.M.H.; Ibrahim, A.; Van Timmeren, J.E.; Van Nijnatten, T.J.A.; Leijenaar, R.T.H.; Lobbes, M.B.I.; Smidt, M.L.; Woodruff, H.C. MRI-based radiomics in breast cancer: Feature robustness with respect to inter-observer segmentation variability. Sci. Rep. 2020, 10, 14163. [Google Scholar] [CrossRef] [PubMed]
  13. Kickingereder, P.; Burth, S.; Wick, A.; Götz, M.; Eidel, O.; Schlemmer, H.-P.; Maier-Hein, K.H.; Wick, W.; Bendszus, M.; Radbruch, A.; et al. Radiomic Profiling of Glioblastoma: Identifying an Imaging Predictor of Patient Survival with Improved Performance over Established Clinical and Radiologic Risk Models. Radiology 2016, 280, 880–889. [Google Scholar] [CrossRef]
  14. Zhou, M.; Scott, J.; Chaudhury, B.; Hall, L.; Goldgof, D.; Yeom, K.W.; Iv, M.; Ou, Y.; Kalpathy-Cramer, J.; Napel, S.; et al. Radiomics in Brain Tumor: Image Assessment, Quantitative Feature Descriptors, and Machine-Learning Approaches. Am. J. Neuroradiol. 2018, 39, 208–216. [Google Scholar] [CrossRef]
  15. Aerts, H.J.W.L.; Velazquez, E.R.; Leijenaar, R.T.H.; Parmar, C.; Grossmann, P.; Carvalho, S.; Bussink, J.; Monshouwer, R.; Haibe-Kains, B.; Rietveld, D.; et al. Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach. Nat. Commun. 2014, 5, 4006. [Google Scholar] [CrossRef] [PubMed]
  16. Kotrotsou, A.; Zinn, P.O.; Colen, R.R. Radiomics in Brain Tumors. Magn. Reson. Imaging Clin. N. Am. 2016, 24, 719–729. [Google Scholar] [CrossRef] [PubMed]
  17. Chen, C.; Chen, M.; Tao, Q.; Hu, S.; Hu, C. Non-contrast CT-based radiomics nomogram of pericoronary adipose tissue for predicting haemodynamically significant coronary stenosis in patients with type 2 diabetes. BMC Med. Imaging 2023, 23, 99. [Google Scholar] [CrossRef] [PubMed]
  18. Hunter, B.; Chen, M.; Ratnakumar, P.; Alemu, E.; Logan, A.; Linton-Reid, K.; Tong, D.; Senthivel, N.; Bhamani, A.; Bloch, S.; et al. A radiomics-based decision support tool improves lung cancer diagnosis in combination with the Herder score in large lung nodules. eBioMedicine 2022, 86, 104344. [Google Scholar] [CrossRef]
  19. Pan, F.; Feng, L.; Liu, B.; Hu, Y.; Wang, Q. Application of radiomics in diagnosis and treatment of lung cancer. Front. Pharmacol. 2023, 14, 1295511. [Google Scholar] [CrossRef]
  20. Arthur, A.; Orton, M.R.; Emsley, R.; Vit, S.; Kelly-Morland, C.; Strauss, D.; Lunn, J.; Doran, S.; Lmalem, H.; Nzokirantevye, A.; et al. A CT-based radiomics classification model for the prediction of histological type and tumour grade in retroperitoneal sarcoma (RADSARC-R): A retrospective multicohort analysis. Lancet Oncol. 2023, 24, 1277–1286. [Google Scholar] [CrossRef]
  21. Peeken, J.C.; Asadpour, R.; Specht, K.; Chen, E.Y.; Klymenko, O.; Akinkuoroye, V.; Hippe, D.S.; Spraker, M.B.; Schaub, S.K.; Dapper, H.; et al. MRI-based delta-radiomics predicts pathologic complete response in high-grade soft-tissue sarcoma patients treated with neoadjuvant therapy. Radiother. Oncol. 2021, 164, 73–82. [Google Scholar] [CrossRef]
  22. Juntu, J.; Sijbers, J.; De Backer, S.; Rajan, J.; Van Dyck, D. Machine learning study of several classifiers trained with texture analysis features to differentiate benign from malignant soft-tissue tumors in T1-MRI images. Magn. Reson. Imaging 2010, 31, 680–689. [Google Scholar] [CrossRef]
  23. Fields, B.K.K.; Demirjian, N.L.; Cen, S.Y.; Varghese, B.A.; Hwang, D.H.; Lei, X.; Desai, B.; Duddalwar, V.; Matcuk, G.R. Predicting Soft Tissue Sarcoma Response to Neoadjuvant Chemotherapy Using an MRI-Based Delta-Radiomics Approach. Mol. Imaging Biol. 2023, 25, 776–787. [Google Scholar] [CrossRef]
  24. Zwanenburg, A.; Vallières, M.; Abdalah, M.A.; Aerts, H.J.W.L.; Andrearczyk, V.; Apte, A.; Ashrafinia, S.; Bakas, S.; Beukinga, R.J.; Boellaard, R.; et al. The Image Biomarker Standardization Initiative: Standardized Quantitative Radiomics for High-Throughput Image-based Phenotyping. Radiology 2020, 295, 328–338. [Google Scholar] [CrossRef] [PubMed]
  25. White, L.M.; Atinga, A.; Naraghi, A.M.; Lajkosz, K.; Wunder, J.S.; Ferguson, P.; Tsoi, K.; Griffin, A.; Haider, M. T2-weighted MRI radiomics in high-grade intramedullary osteosarcoma: Predictive accuracy in assessing histologic response to chemotherapy, overall survival, and disease-free survival. Skelet. Radiol. 2023, 52, 553–564. [Google Scholar] [CrossRef]
  26. Zhang, L.; Gao, Q.; Dou, Y.; Cheng, T.; Xia, Y.; Li, H.; Gao, S. Evaluation of the neoadjuvant chemotherapy response in osteosarcoma using the MRI DWI-based machine learning radiomics nomogram. Front. Oncol. 2024, 14, 1345576. [Google Scholar] [CrossRef] [PubMed]
  27. Chen, H.; Liu, J.; Cheng, Z.; Lu, X.; Wang, X.; Lu, M.; Li, S.; Xiang, Z.; Zhou, Q.; Liu, Z.; et al. Development and external validation of an MRI-based radiomics nomogram for pretreatment prediction for early relapse in osteosarcoma: A retrospective multicenter study. Eur. J. Radiol. 2020, 129, 109066. [Google Scholar] [CrossRef] [PubMed]
  28. Zhao, S.; Su, Y.; Duan, J.; Qiu, Q.; Ge, X.; Wang, A.; Yin, Y. Radiomics signature extracted from diffusion-weighted magnetic resonance imaging predicts outcomes in osteosarcoma. J. Bone Oncol. 2019, 19, 100263. [Google Scholar] [CrossRef]
  29. Huang, B.; Wang, J.; Sun, M.; Chen, X.; Xu, D.; Li, Z.-P.; Ma, J.; Feng, S.-T.; Gao, Z. Feasibility of multi-parametric magnetic resonance imaging combined with machine learning in the assessment of necrosis of osteosarcoma after neoadjuvant chemotherapy: A preliminary study. BMC Cancer 2020, 20, 322. [Google Scholar] [CrossRef]
  30. Wu, Y.; Xu, L.; Yang, P.; Lin, N.; Huang, X.; Pan, W.; Li, H.; Lin, P.; Li, B.; Bunpetch, V.; et al. Survival Prediction in High-grade Osteosarcoma Using Radiomics of Diagnostic Computed Tomography. EBioMedicine 2018, 34, 27–34. [Google Scholar] [CrossRef]
  31. Ngan, E.; Mullikin, D.; Theruvath, A.; Annapragada, A.; Ghaghada, K.; Heczey, A.; Starosolski, Z. Classification of osteosarcoma clinical outcomes using contrast-enhanced MRI radiomics and clinical variables. In Proceedings of the SPR 2025 Annual Meeting, Honolulu, HI, USA, 7–11 April 2025. [Google Scholar]
  32. Chen, H.; Zhang, X.; Wang, X.; Quan, X.; Deng, Y.; Lu, M.; Wei, Q.; Ye, Q.; Zhou, Q.; Xiang, Z.; et al. MRI-based radiomics signature for pretreatment prediction of pathological response to neoadjuvant chemotherapy in osteosarcoma: A multicenter study. Eur. Radiol. 2021, 31, 7913–7924. [Google Scholar] [CrossRef]
Figure 1. Recruitment flowchart.
Figure 2. Data collection timeline.
Figure 3. Comparison between three segmentation methods, (a) OS tumor without mask, (b) mask for whole-tumor segmentation (blue = whole tumor), (c) mask for bone/soft tissue segmentation (yellow = soft tissue; green = bone), (d) mask for tumor sampling (7 non-overlapping regions from front (not displayed), back (not displayed), top (orange), bottom (green), left (yellow), right (blue), and middle region (dark green), respectively).
Figure 4. Example classification results for eight patients: four frequently misclassified and four consistently correctly classified by our models. Each panel displays the middle slice from a patient’s MRI. Patient 1 was misclassified by 26 out of 72 classifiers across different segmentation methods, feature types, and outcomes. Patient 2 was misclassified by 19 classifiers across different segmentation methods (whole-tumor segmentation and tumor sampling), feature types, and outcomes. Patient 3 was misclassified by 15 classifiers across different segmentation methods, feature types, and outcomes. Patient 4 was misclassified by 20 classifiers across different segmentation methods, feature types, and outcomes. Patient 5 was correctly classified by all classifiers except the classifier using whole-tumor segmentation with IBSI RFs only for predicting therapy response. Patient 6 was correctly classified by all classifiers except (1) the classifier using whole-tumor segmentation with IBSI RFs only for predicting progressive disease and (2) classifiers using whole-tumor segmentation with all RFs for predicting therapy response. Patient 7 was correctly classified by all classifiers except the classifier using tumor sampling, IBSI RFs, and baseline clinical features for predicting OS-related mortality. Patient 8 was correctly classified by all classifiers except (1) classifiers using whole-tumor segmentation for predicting OS-related mortality and (2) the classifier using tumor sampling with IBSI RFs only for predicting relapse off therapy.
Table 1. Summary statistics.
Variable | Full Cohort (n = 63) | Bone/Soft Tissue Sub-Cohort (n = 26) | External Cohort (n = 9) | p-Value
Age (Median/Mean ± SD) | 12.29/11.82 ± 3.53 | 12.21/11.83 ± 3.98 | 11.49/12.73 ± 4.20 | 0.9263
Sex (Male) | 43 (68.25%) | 18 (69.23%) | 7 (77.78%) | 0.8451
Race |  |  |  | 0.5845
   Caucasian | 52 (82.54%) | 22 (84.62%) | 6 (66.66%) | 
   Black/African American | 9 (14.29%) | 3 (11.54%) | 3 (33.33%) | 
   Others | 2 (3.17%) | 1 (3.85%) | 0 (0%) | 
Hispanic | 31 (49.21%) | 14 (53.85%) | 4 (44.44%) | 0.8690
Laterality (right) | 36 (57.14%) | 16 (61.54%) | 1 (11.11%) | 0.0234
OS location |  |  |  | 0.0532
   Femur | 41 (65.08%) | 23 (88.46%) | 5 (55.55%) | 
   Tibia | 14 (22.22%) | 3 (11.54%) | 1 (11.11%) | 
   Humerus | 6 (9.52%) | 0 (0%) | 1 (11.11%) | 
   Fibula | 2 (3.17%) | 0 (0%) | 2 (22.22%) | 
Histological subtype |  |  |  | 
   Osteoblastic | 50 (79.37%) | 21 (80.77%) | 9 (100%) | 0.4017
   Chondroblastic | 21 (33.33%) | 10 (38.46%) | 2 (22.22%) | 0.6798
   Telangiectatic | 5 (7.94%) | 1 (3.85%) | 1 (11.11%) | 0.5812
Metastasis (Yes) | 13 (20.63%) | 7 (26.92%) | 5 (55.55%) | 0.0784
Skip lesion (Yes) | 8 (12.70%) | 4 (15.38%) | 1 (11.11%) | 0.8969
Progressive disease during therapy (Yes) | 17 (26.98%) | 6 (23.08%) | 4 (44.44%) | 0.4589
% necrosis (Median/Mean ± SD) | 93/77.32 ± 27.08 | 95/81.21 ± 24.98 | 95/81.11 ± 26.30 | 0.7054
Response to therapy (Adequate) | 37 (58.73%) | 16 (61.54%) | 6 (66.66%) | 0.9509
Relapse off therapy (Yes) | 16 (25.40%) | 7 (26.92%) | 4 (44.44%) | 0.4871
OS-related mortality (Yes) | 19 (30.16%) | 6 (23.08%) | 5 (55.55%) | 0.1885
Legend: Note that histological subtypes were not mutually exclusive; p-values were calculated for each subtype.
Table 2. Associations between outcome and major clinical features.
 | Adequate Response to Therapy (n = 37) | Poor Response to Therapy (n = 26) | p-Value
Progressive disease during therapy (Yes) | 3 (8.11%) | 14 (53.85%) | <0.0001
Metastasis at diagnosis (Yes) | 11 (29.73%) | 2 (7.69%) | 0.0557
Skip lesion (Yes) | 7 (18.92%) | 1 (3.85%) | 0.1254
Osteoblastic (Yes) | 30 (81.08%) | 20 (76.92%) | 0.6881
Chondroblastic (Yes) | 9 (24.32%) | 12 (46.15%) | 0.0704
 | Relapse off Therapy (n = 16) | No Relapse off Therapy (n = 47) | p-Value
Progressive disease during therapy (Yes) | 4 (25%) | 13 (27.66%) | 0.9999
Response to therapy (Adequate) | 12 (75%) | 25 (53.19%) | 0.1519
% necrosis (mean ± SD) | 81.13 ± 28.78 | 76.03 ± 26.97 | 0.3378
Metastasis at diagnosis (Yes) | 7 (43.75%) | 6 (12.77%) | 0.0082
Skip lesion (Yes) | 4 (25%) | 4 (8.51%) | 0.1857
Osteoblastic (Yes) | 13 (81.25%) | 37 (78.72%) | 0.9999
Chondroblastic (Yes) | 7 (43.75%) | 14 (29.79%) | 0.3062
 | Deceased (n = 19) | Alive (n = 44) | p-Value
Progressive disease during therapy (Yes) | 13 (68.42%) | 4 (9.09%) | <0.0001
Response to therapy (Adequate) | 8 (42.11%) | 29 (65.91%) | 0.0782
Relapse off therapy (Yes) | 9 (47.37%) | 7 (15.91%) | 0.0085
% necrosis (mean ± SD) | 66.58 ± 28.48 | 81.96 ± 25.72 | 0.0762
Metastasis at diagnosis (Yes) | 7 (36.84%) | 6 (13.64%) | 0.0367
Skip lesion (Yes) | 4 (21.05%) | 4 (9.09%) | 0.2286
Osteoblastic (Yes) | 16 (84.21%) | 34 (77.27%) | 0.7375
Chondroblastic (Yes) | 10 (52.63%) | 11 (25%) | 0.0327
Table 3. Classification performance for progressive disease.
Segmentation | RF Type | Val. Accuracy | Val. Sensitivity | Val. Specificity | Val. ROC AUC | Val. PR AUC | Test Accuracy | Test Sensitivity | Test Specificity | Test ROC AUC | Test PR AUC | Best Classifier | No. Features | Ext. Accuracy | Ext. Sensitivity | Ext. Specificity
(Val. = validation set; Test = testing set; Ext. = external set.)
Whole tumor | IBSI RFs | 0.84 ± 0.14 | 0.70 ± 0.27 | 0.89 ± 0.13 | 0.68 ± 0.29 | 0.66 ± 0.29 | 0.69 | 0.50 | 0.78 | 0.67 | 0.69 | MLP | 4 | 0.33 | 0.25 | 0.40
 | IBSI RFs + baseline clinical | 0.90 ± 0.11 | 0.93 ± 0.13 | 0.89 ± 0.11 | 0.81 ± 0.17 | 0.87 ± 0.16 | 0.69 | 0.25 | 0.89 | 0.58 | 0.41 | SVM rbf | 8 | 0.44 | 0.25 | 0.60
 | All RFs | 0.92 ± 0.08 | 0.93 ± 0.13 | 0.91 ± 0.07 | 0.88 ± 0.13 | 0.91 ± 0.10 | 0.77 | 0.50 | 0.89 | 0.67 | 0.65 | KNN (k = 3) | 13 | 0.44 | 0.25 | 0.60
 | All RFs + baseline clinical | 0.96 ± 0.05 | 0.93 ± 0.13 | 0.97 ± 0.06 | 0.92 ± 0.09 | 0.94 ± 0.08 | 0.77 | 0.50 | 0.89 | 0.51 | 0.56 | Random forest | 14 | 0.56 | 0.25 | 0.80
Whole tumor/tumor sampling | Only baseline clinical | 0.78 ± 0.20 | 0.93 ± 0.13 | 0.71 ± 0.30 | 0.70 ± 0.26 | 0.71 ± 0.19 | 0.77 | 0.75 | 0.78 | 0.88 | 0.78 | Random forest | 12 | 0.22 | 0.50 | 0.00
Tumor sampling | IBSI RFs | 0.82 ± 0.15 | 0.87 ± 0.16 | 0.81 ± 0.18 | 0.75 ± 0.23 | 0.80 ± 0.20 | 0.85 | 0.75 | 0.89 | 0.69 | 0.68 | MLP | 15 | 0.44 | 0.25 | 0.60
 | IBSI RFs + baseline clinical | 0.90 ± 0.09 | 1.0 ± 0 | 0.87 ± 0.13 | 0.84 ± 0.17 | 0.93 ± 0.09 | 0.69 | 0.50 | 0.78 | 0.63 | 0.56 | Gradient boosting | 6 | 0.67 | 0.50 | 0.80
 | All RFs | 0.92 ± 0.12 | 1.0 ± 0 | 0.89 ± 0.17 | 0.91 ± 0.12 | 0.94 ± 0.08 | 0.31 | 1.00 | 0.00 | 0.86 | 0.81 | MLP | 15 | 0.44 | 1.00 | 0.00
 | All RFs + baseline clinical | 0.92 ± 0.12 | 0.93 ± 0.13 | 0.93 ± 0.15 | 0.90 ± 0.13 | 0.92 ± 0.09 | 0.31 | 1.00 | 0.00 | 0.89 | 0.83 | MLP | 11 | 0.44 | 1.00 | 0.00
Bone/soft tissue | IBSI RFs | 0.95 ± 0.07 | 1.0 ± 0 | 0.93 ± 0.09 | 0.94 ± 0.08 | 0.97 ± 0.05 | 0.83 | 0.50 | 1.00 | 0.88 | 0.83 | LDA | 1 | - | - | -
 | IBSI RFs + baseline clinical | 0.95 ± 0.07 | 1.0 ± 0 | 0.93 ± 0.09 | 0.94 ± 0.08 | 0.97 ± 0.05 | 0.83 | 0.50 | 1.00 | 0.88 | 0.83 | LDA | 1 | - | - | -
 | All RFs | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 0.67 | 0.50 | 0.75 | 0.75 | 0.75 | SVM rbf | 14 | - | - | -
 | All RFs + baseline clinical | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 0.67 | 0.50 | 0.75 | 0.81 | 0.58 | Random forest | 15 | - | - | -
 | Only baseline clinical | 0.79 ± 0.21 | 1.0 ± 0 | 0.73 ± 0.25 | 0.69 ± 0.32 | 0.82 ± 0.23 | 0.67 | 0.00 | 1.00 | 0.75 | 0.75 | MLP | 9 | - | - | -
Table 4. Classification performance for therapy response.
Table 4. Classification performance for therapy response.
Val. = validation set, Test = testing set, Ext. = external set.

| Segmentation | RF Type | Val. Acc. | Val. Sens. | Val. Spec. | Val. ROC AUC | Val. PR AUC | Test Acc. | Test Sens. | Test Spec. | Test ROC AUC | Test PR AUC | Best Classifier | No. Features | Ext. Acc. | Ext. Sens. | Ext. Spec. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Whole tumor | IBSI RFs | 0.72 ± 0.04 | 0.59 ± 0.12 | 0.91 ± 0.11 | 0.83 ± 0.04 | 0.73 ± 0.13 | 0.54 | 0.25 | 1.00 | 0.69 | 0.79 | KNN (k = 10) | 3 | 0.44 | 0.50 | 0.33 |
| | IBSI RFs + baseline clinical | 0.78 ± 0.13 | 0.67 ± 0.24 | 0.95 ± 0.10 | 0.86 ± 0.09 | 0.73 ± 0.17 | 0.69 | 0.63 | 0.80 | 0.80 | 0.87 | MLP | 9 | 0.56 | 0.50 | 0.67 |
| | All RFs | 0.88 ± 0.08 | 0.79 ± 0.14 | 1.0 ± 0 | 0.93 ± 0.06 | 0.87 ± 0.12 | 0.62 | 0.63 | 0.60 | 0.78 | 0.88 | SVM rbf | 14 | 0.56 | 0.67 | 0.33 |
| | All RFs + baseline clinical | 0.88 ± 0.08 | 0.79 ± 0.14 | 1.0 ± 0 | 0.93 ± 0.06 | 0.87 ± 0.12 | 0.62 | 0.63 | 0.60 | 0.78 | 0.88 | SVM rbf | 14 | 0.56 | 0.67 | 0.33 |
| Whole tumor/tumor sampling | Only baseline clinical | 0.76 ± 0.08 | 0.76 ± 0.17 | 0.76 ± 0.16 | 0.84 ± 0.06 | 0.75 ± 0.10 | 0.77 | 0.88 | 0.60 | 0.90 | 0.95 | SVM sigmoid | 15 | 0.67 | 1.00 | 0.00 |
| Tumor sampling | IBSI RFs | 0.80 ± 0.06 | 0.69 ± 0.12 | 0.96 ± 0.08 | 0.89 ± 0.06 | 0.84 ± 0.04 | 0.62 | 0.75 | 0.40 | 0.68 | 0.77 | MLP | 5 | 0.67 | 0.83 | 0.33 |
| | IBSI RFs + baseline clinical | 0.88 ± 0.08 | 0.83 ± 0.11 | 0.95 ± 0.10 | 0.94 ± 0.05 | 0.89 ± 0.07 | 0.62 | 0.50 | 0.80 | 0.76 | 0.84 | Random forest | 5 | 0.44 | 0.50 | 0.33 |
| | All RFs | 0.94 ± 0.05 | 0.93 ± 0.09 | 0.95 ± 0.10 | 0.98 ± 0.02 | 0.97 ± 0.03 | 0.46 | 0.25 | 0.80 | 0.60 | 0.72 | SVM polynomial | 11 | 0.33 | 0.00 | 1.00 |
| | All RFs + baseline clinical | 0.94 ± 0.05 | 0.93 ± 0.09 | 0.95 ± 0.10 | 0.98 ± 0.02 | 0.97 ± 0.03 | 0.46 | 0.25 | 0.80 | 0.60 | 0.72 | SVM polynomial | 11 | 0.33 | 0.00 | 1.00 |
| Bone/soft tissue | IBSI RFs | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 0.50 | 0.25 | 1.00 | 0.75 | 0.92 | Random forest | 10 | - | - | - |
| | IBSI RFs + baseline clinical | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 0.33 | 0.25 | 0.50 | 0.63 | 0.80 | Random forest | 12 | - | - | - |
| | All RFs | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | KNN (k = 5) | 4 | - | - | - |
| | All RFs + baseline clinical | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 0.83 | 1.00 | 0.50 | 0.88 | 0.95 | LDA | 3 | - | - | - |
| | Only baseline clinical | 0.90 ± 0.12 | 0.83 ± 0.21 | 1.0 ± 0 | 0.93 ± 0.10 | 0.83 ± 0.21 | 0.50 | 0.25 | 1.00 | 0.75 | 0.92 | Logistic regression | 11 | - | - | - |
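The Best Classifier column across Tables 3-6 spans SVMs with several kernels, KNN, random forest, gradient boosting, MLP, LDA, and logistic regression. As a hedged sketch with default or placeholder hyperparameters and toy data (not the values tuned in the study), such a candidate pool can be compared by validation ROC AUC as follows.

```python
# Hedged sketch of comparing a pool of candidate classifiers like those named
# in the "Best Classifier" column. Hyperparameters are defaults/placeholders,
# and the data are synthetic stand-ins.
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

candidates = {
    "SVM rbf": SVC(kernel="rbf"),
    "SVM sigmoid": SVC(kernel="sigmoid"),
    "SVM polynomial": SVC(kernel="poly"),
    "KNN (k = 5)": KNeighborsClassifier(n_neighbors=5),
    "Random forest": RandomForestClassifier(random_state=0),
    "Gradient boosting": GradientBoostingClassifier(random_state=0),
    "MLP": MLPClassifier(max_iter=2000, random_state=0),
    "LDA": LinearDiscriminantAnalysis(),
    "Logistic regression": LogisticRegression(max_iter=1000),
}

X, y = make_classification(n_samples=63, n_features=15, random_state=0)  # toy data
results = {name: cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean()
           for name, clf in candidates.items()}
best = max(results, key=results.get)
print(f"Best by validation ROC AUC: {best} ({results[best]:.2f})")
```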
Table 5. Classification performance for relapse off therapy.
Val. = validation set, Test = testing set, Ext. = external set.

| Segmentation | RF Type | Val. Acc. | Val. Sens. | Val. Spec. | Val. ROC AUC | Val. PR AUC | Test Acc. | Test Sens. | Test Spec. | Test ROC AUC | Test PR AUC | Best Classifier | No. Features | Ext. Acc. | Ext. Sens. | Ext. Spec. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Whole tumor | IBSI RFs | 0.84 ± 0.10 | 0.77 ± 0.20 | 0.86 ± 0.16 | 0.70 ± 0.17 | 0.73 ± 0.18 | 0.77 | 0.00 | 1.00 | 0.73 | 0.42 | SVM sigmoid | 4 | 0.67 | 0.25 | 1.00 |
| | IBSI RFs + baseline clinical | 0.80 ± 0.09 | 0.80 ± 0.27 | 0.81 ± 0.18 | 0.67 ± 0.11 | 0.80 ± 0.09 | 0.69 | 0.33 | 0.80 | 0.58 | 0.35 | Random forest | 5 | 0.44 | 0.50 | 0.40 |
| | IBSI RFs + baseline clinical + prior outcomes | 0.82 ± 0.08 | 0.70 ± 0.27 | 0.87 ± 0.16 | 0.65 ± 0.14 | 0.75 ± 0.12 | 0.77 | 0.00 | 1.00 | 0.63 | 0.39 | Logistic regression | 3 | 0.56 | 0.00 | 1.00 |
| | All RFs | 0.84 ± 0.10 | 0.93 ± 0.13 | 0.81 ± 0.17 | 0.75 ± 0.18 | 0.84 ± 0.11 | 0.54 | 0.33 | 0.60 | 0.33 | 0.22 | SVM rbf | 11 | 0.78 | 0.75 | 0.80 |
| | All RFs + baseline clinical | 0.88 ± 0.10 | 1.0 ± 0 | 0.83 ± 0.14 | 0.89 ± 0.11 | 0.92 ± 0.08 | 0.46 | 0.33 | 0.50 | 0.57 | 0.36 | SVM rbf | 10 | 0.67 | 1.00 | 0.40 |
| | All RFs + baseline clinical + prior outcomes | 0.90 ± 0.09 | 0.87 ± 0.27 | 0.92 ± 0.10 | 0.78 ± 0.25 | 0.84 ± 0.20 | 0.62 | 1.00 | 0.50 | 0.73 | 0.41 | MLP | 5 | 0.56 | 1.00 | 0.20 |
| Whole tumor/tumor sampling | Only baseline clinical | 0.74 ± 0.19 | 0.93 ± 0.13 | 0.68 ± 0.24 | 0.68 ± 0.26 | 0.75 ± 0.20 | 0.62 | 0.67 | 0.60 | 0.58 | 0.53 | Random forest | 9 | 0.56 | 0.25 | 0.80 |
| | Only baseline clinical + prior outcomes | 0.78 ± 0.22 | 0.93 ± 0.13 | 0.74 ± 0.29 | 0.72 ± 0.29 | 0.76 ± 0.28 | 0.77 | 0.00 | 1.00 | 0.57 | 0.30 | MLP | 8 | 0.56 | 0.00 | 1.00 |
| Tumor sampling | IBSI RFs | 0.84 ± 0.16 | 0.87 ± 0.16 | 0.83 ± 0.23 | 0.80 ± 0.18 | 0.78 ± 0.18 | 0.77 | 0.33 | 0.90 | 0.67 | 0.43 | SVM sigmoid | 11 | 0.44 | 0.00 | 0.80 |
| | IBSI RFs + baseline clinical | 0.94 ± 0.05 | 0.93 ± 0.13 | 0.95 ± 0.07 | 0.86 ± 0.13 | 0.93 ± 0.06 | 0.77 | 0.33 | 0.90 | 0.73 | 0.61 | SVM polynomial | 13 | 0.56 | 0.25 | 0.80 |
| | IBSI RFs + baseline clinical + prior outcomes | 0.94 ± 0.05 | 0.93 ± 0.13 | 0.95 ± 0.07 | 0.86 ± 0.13 | 0.93 ± 0.06 | 0.77 | 0.33 | 0.90 | 0.73 | 0.61 | SVM polynomial | 13 | 0.56 | 0.25 | 0.80 |
| | All RFs | 0.92 ± 0.08 | 0.93 ± 0.13 | 0.92 ± 0.11 | 0.90 ± 0.08 | 0.93 ± 0.07 | 0.23 | 1.00 | 0.00 | 0.73 | 0.41 | MLP | 15 | 0.44 | 1.00 | 0.00 |
| | All RFs + baseline clinical | 0.94 ± 0.05 | 1.0 ± 0 | 0.92 ± 0.07 | 0.93 ± 0.07 | 0.97 ± 0.03 | 0.31 | 1.00 | 0.10 | 0.67 | 0.64 | MLP | 10 | 0.44 | 1.00 | 0.00 |
| | All RFs + baseline clinical + prior outcomes | 0.94 ± 0.05 | 1.0 ± 0 | 0.92 ± 0.07 | 0.93 ± 0.07 | 0.97 ± 0.03 | 0.31 | 1.00 | 0.10 | 0.67 | 0.64 | MLP | 10 | 0.44 | 1.00 | 0.00 |
| Bone/soft tissue | IBSI RFs | 0.95 ± 0.07 | 1.0 ± 0 | 0.93 ± 0.09 | 0.94 ± 0.08 | 0.97 ± 0.05 | 0.67 | 0.50 | 0.75 | 0.75 | 0.75 | Logistic regression | 3 | - | - | - |
| | IBSI RFs + baseline clinical | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 0.67 | 0.50 | 0.75 | 0.75 | 0.75 | SVM rbf | 3 | - | - | - |
| | IBSI RFs + baseline clinical + prior outcomes | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 0.67 | 0.50 | 0.75 | 0.75 | 0.75 | SVM rbf | 3 | - | - | - |
| | All RFs | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 0.50 | 0.50 | 0.50 | 0.63 | 0.50 | MLP | 15 | - | - | - |
| | All RFs + baseline clinical | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 0.50 | 0.50 | 0.50 | 0.75 | 0.75 | SVM rbf | 10 | - | - | - |
| | All RFs + baseline clinical + prior outcomes | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 0.50 | 0.50 | 0.50 | 0.75 | 0.75 | SVM rbf | 10 | - | - | - |
| | Only baseline clinical | 0.95 ± 0.07 | 1.0 ± 0 | 0.93 ± 0.09 | 0.94 ± 0.08 | 0.97 ± 0.05 | 0.67 | 0.50 | 0.75 | 0.75 | 0.58 | MLP | 11 | - | - | - |
| | Only baseline clinical + prior outcomes | 0.91 ± 0.14 | 1.0 ± 0 | 0.87 ± 0.19 | 0.92 ± 0.12 | 0.93 ± 0.09 | 0.67 | 0.50 | 0.75 | 0.75 | 0.75 | SVM sigmoid | 2 | - | - | - |
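Each model in Tables 3-6 retained between 1 and 15 features (the No. Features column). The selection method is not given in this excerpt; the sketch below shows one assumption-laden way such a retained-feature count could be tuned, using SelectKBest with an ANOVA F-score inside a cross-validated pipeline so that selection happens within each training fold.

```python
# Illustrative feature-count tuning only; the study's actual selection method
# and search range are not stated here. SelectKBest with an ANOVA F-score
# stands in for whatever ranking was actually used.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=63, n_features=100, n_informative=10,
                           random_state=0)  # toy radiomics-sized feature matrix
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(score_func=f_classif)),
    ("clf", SVC(kernel="rbf")),
])
# Search 1-15 retained features, matching the range of the "No. Features" column.
search = GridSearchCV(pipe, {"select__k": range(1, 16)},
                      scoring="roc_auc",
                      cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0))
search.fit(X, y)
print("Selected number of features:", search.best_params_["select__k"])
```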
Table 6. Classification performance for OS-related mortality.
Val. = validation set, Test = testing set, Ext. = external set.

| Segmentation | RF Type | Val. Acc. | Val. Sens. | Val. Spec. | Val. ROC AUC | Val. PR AUC | Test Acc. | Test Sens. | Test Spec. | Test ROC AUC | Test PR AUC | Best Classifier | No. Features | Ext. Acc. | Ext. Sens. | Ext. Spec. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Whole tumor | IBSI RFs | 0.80 ± 0.13 | 0.80 ± 0.16 | 0.80 ± 0.25 | 0.69 ± 0.16 | 0.73 ± 0.09 | 0.31 | 0.75 | 0.11 | 0.47 | 0.35 | SVM sigmoid | 10 | 0.67 | 1.00 | 0.25 |
| | IBSI RFs + baseline clinical | 0.88 ± 0.08 | 0.73 ± 0.25 | 0.94 ± 0.11 | 0.73 ± 0.18 | 0.77 ± 0.17 | 0.69 | 0.25 | 0.89 | 0.31 | 0.33 | KNN (k = 15) | 8 | 0.56 | 0.20 | 1.00 |
| | IBSI RFs + baseline clinical + prior outcomes | 0.98 ± 0.04 | 1.0 ± 0 | 0.97 ± 0.06 | 0.98 ± 0.03 | 0.99 ± 0.02 | 0.85 | 1.00 | 0.78 | 0.92 | 0.85 | SVM rbf | 3 | 0.56 | 0.40 | 0.75 |
| | All RFs | 0.92 ± 0.12 | 0.93 ± 0.13 | 0.91 ± 0.11 | 0.91 ± 0.15 | 0.92 ± 0.13 | 0.85 | 0.75 | 0.89 | 0.86 | 0.75 | Random forest | 15 | 0.89 | 1.00 | 0.75 |
| | All RFs + baseline clinical | 0.92 ± 0.04 | 0.80 ± 0.16 | 0.97 ± 0.06 | 0.87 ± 0.08 | 0.87 ± 0.11 | 0.77 | 0.75 | 0.78 | 0.81 | 0.57 | Random forest | 6 | 0.56 | 0.20 | 1.00 |
| | All RFs + baseline clinical + prior outcomes | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 0.92 | 1.00 | 0.89 | 0.97 | 0.95 | KNN (k = 5) | 6 | 1.00 | 1.00 | 1.00 |
| Whole tumor/tumor sampling | Only baseline clinical | 0.82 ± 0.12 | 0.73 ± 0.13 | 0.86 ± 0.16 | 0.69 ± 0.21 | 0.69 ± 0.19 | 0.62 | 0.50 | 0.67 | 0.67 | 0.46 | SVM polynomial | 15 | 0.44 | 0.20 | 0.75 |
| | Only baseline clinical + prior outcomes | 0.98 ± 0.04 | 1.0 ± 0 | 0.97 ± 0.06 | 0.98 ± 0.03 | 0.99 ± 0.02 | 0.85 | 1.00 | 0.78 | 0.92 | 0.85 | SVM rbf | 3 | 0.56 | 0.40 | 0.75 |
| Tumor sampling | IBSI RFs | 0.90 ± 0.06 | 0.73 ± 0.25 | 0.97 ± 0.06 | 0.80 ± 0.15 | 0.81 ± 0.18 | 0.62 | 0.25 | 0.78 | 0.61 | 0.53 | Random forest | 5 | 0.33 | 0.00 | 0.75 |
| | IBSI RFs + baseline clinical | 0.84 ± 0.10 | 0.87 ± 0.16 | 0.83 ± 0.17 | 0.77 ± 0.17 | 0.84 ± 0.14 | 0.69 | 0.50 | 0.78 | 0.72 | 0.47 | Random forest | 7 | 0.33 | 0.00 | 0.75 |
| | IBSI RFs + baseline clinical + prior outcomes | 0.98 ± 0.04 | 1.0 ± 0 | 0.97 ± 0.06 | 0.98 ± 0.03 | 0.99 ± 0.02 | 0.85 | 1.00 | 0.78 | 0.92 | 0.85 | SVM rbf | 3 | 0.56 | 0.40 | 0.75 |
| | All RFs | 0.94 ± 0.08 | 1.0 ± 0 | 0.91 ± 0.11 | 0.91 ± 0.11 | 0.95 ± 0.06 | 0.39 | 1.00 | 0.11 | 0.81 | 0.67 | Random forest | 12 | 0.67 | 1.00 | 0.25 |
| | All RFs + baseline clinical | 0.94 ± 0.08 | 0.93 ± 0.13 | 0.94 ± 0.11 | 0.91 ± 0.11 | 0.93 ± 0.08 | 0.62 | 1.00 | 0.44 | 0.85 | 0.73 | Random forest | 5 | 0.56 | 1.00 | 0.00 |
| | All RFs + baseline clinical + prior outcomes | 1.0 ± 0 | 1.0 ± 0 | 0.94 ± 0.11 | 0.97 ± 0.05 | 0.98 ± 0.04 | 0.85 | 0.75 | 0.89 | 0.94 | 0.92 | KNN (k = 8) | 8 | 0.78 | 0.60 | 1.00 |
| Bone/soft tissue | IBSI RFs | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | SVM sigmoid | 7 | - | - | - |
| | IBSI RFs + baseline clinical | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | KNN (k = 6) | 8 | - | - | - |
| | IBSI RFs + baseline clinical + prior outcomes | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | Random forest | 8 | - | - | - |
| | All RFs | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 0.83 | 1.00 | 0.75 | 0.88 | 0.83 | SVM rbf | 6 | - | - | - |
| | All RFs + baseline clinical | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 0.83 | 1.00 | 0.75 | 0.88 | 0.83 | SVM rbf | 6 | - | - | - |
| | All RFs + baseline clinical + prior outcomes | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | Random forest | 4 | - | - | - |
| | Only baseline clinical | 0.73 ± 0.29 | 1.0 ± 0 | 0.68 ± 0.35 | 0.57 ± 0.33 | 0.68 ± 0.35 | 0.67 | 0.50 | 0.75 | 0.63 | 0.50 | MLP | 15 | - | - | - |
| | Only baseline clinical + prior outcomes | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 1.0 ± 0 | 0.83 | 0.50 | 1.00 | 0.88 | 0.83 | SVM rbf | 4 | - | - | - |
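The testing- and external-set columns in Tables 3-6 report accuracy, sensitivity, specificity, ROC AUC, and PR AUC on held-out patients. The following minimal sketch, using toy labels and scores rather than study data, shows how these five metrics are computed from a classifier's predicted probabilities.

```python
# Minimal sketch of the held-out metrics reported in Tables 3-6 (accuracy,
# sensitivity, specificity, ROC AUC, PR AUC). Labels and scores are toy values.
import numpy as np
from sklearn.metrics import (accuracy_score, average_precision_score,
                             confusion_matrix, roc_auc_score)

y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1, 0])                       # toy held-out labels
y_score = np.array([0.2, 0.4, 0.8, 0.6, 0.1, 0.9, 0.3, 0.7, 0.55])   # toy probabilities
y_pred = (y_score >= 0.5).astype(int)                                 # threshold at 0.5

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)   # recall for the positive class
specificity = tn / (tn + fp)   # recall for the negative class

print("Accuracy:   ", accuracy_score(y_true, y_pred))
print("Sensitivity:", sensitivity)
print("Specificity:", specificity)
print("ROC AUC:    ", roc_auc_score(y_true, y_score))
print("PR AUC:     ", average_precision_score(y_true, y_score))
```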
Table 7. Demographics and clinical information of the selected patients in Figure 4.
| Patient | Gender | Age (y) | Skip Lesion | Metastasis | OS Type | Progressive Disease | % Necrosis | Relapse off Therapy | Mortality |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Female | 14.0 | No | Yes | Osteobl. | No | 100 | Yes | Yes |
| 2 | Male | 16.3 | No | Yes | Osteobl. | No | 99 | No | No |
| 3 | Female | 14.9 | Yes | Yes | Osteobl. | Yes | >99 | Yes | Yes |
| 4 | Male | 10.9 | No | Yes | Chondrobl. | No | >99 | No | No |
| 5 | Female | 9.8 | No | No | Osteobl. and telangiectatic | No | 100 | No | No |
| 6 | Male | 12.3 | Yes | Yes | Osteobl. and chondrobl. | Yes | 95 | Yes | Yes |
| 7 | Male | 9.0 | No | No | Osteobl. | No | 87 | No | No |
| 8 | Male | 14.2 | No | No | Osteobl. | Yes | 40 | No | Yes |