Article

Computed Tomography-Based Radiomics Diagnostic Model for Fat-Poor Small Renal Tumor Subtypes

1
Department of Urology, Seoul St. Mary’s Hospital, College of Medicine, The Catholic University of Korea, Seoul 06591, Republic of Korea
2
Interdisciplinary Program in Artificial Intelligence, Seoul National University, Seoul 08826, Republic of Korea
3
Department of Urology, Seoul Metropolitan Government, Seoul National University Boramae Medical Center, Seoul 07061, Republic of Korea
4
Department of Psychology, Seoul National University, Seoul 08826, Republic of Korea
5
Department of Radiology, Eunpyung St. Mary’s Hospital, College of Medicine, The Catholic University of Korea, Seoul 06591, Republic of Korea
*
Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Diagnostics 2025, 15(11), 1365; https://doi.org/10.3390/diagnostics15111365
Submission received: 10 February 2025 / Revised: 12 May 2025 / Accepted: 23 May 2025 / Published: 28 May 2025
(This article belongs to the Special Issue Machine-Learning-Based Disease Diagnosis and Prediction)

Abstract

Background: Differentiating histologic subtypes of fat-poor small renal masses using conventional imaging remains difficult due to their overlapping radiologic characteristics. We aimed to develop a machine learning-based diagnostic model using CT-derived radiomic features to classify the five most common renal tumor subtypes: clear cell RCC (ccRCC), papillary RCC (pRCC), chromophobe RCC (chRCC), angiomyolipoma (AML), and oncocytoma. Methods: A total of 499 patients with pathologically confirmed renal tumors who underwent preoperative contrast-enhanced CT and nephrectomy were retrospectively analyzed. Results: We extracted and analyzed radiomic features from 1548 multi-phase CT scans from 499 patients, focusing on fat-poor tumors. Four machine learning classifiers (Linear SVM, Rbf SVM, Random Forest, and XGBoost) were compared. Among the models, XGBoost trained on arterial-phase features showed the best classification performance, with a ccRCC-specific AU-PRC of 0.757 (standard error = 0.033) and a renal angiomyolipoma-specific AU-ROC of 0.824 (standard error = 0.023). These results outperformed other single-phase CT radiomic feature-based machine learning models trained with 20% of principal components. Conclusions: This study demonstrates the effectiveness of radiomics-based machine learning in classifying renal tumor subtypes and highlights the potential of AI in medical imaging. The findings, particularly the utility of single-phase CT and feature optimization, offer valuable insights for future precision medicine approaches. Such methods may support more personalized diagnosis and treatment planning in renal oncology.

1. Introduction

Kidney cancer represents a significant global health burden, with the American Cancer Society reporting an estimated 81,610 new cases diagnosed in the United States alone in 2024. This highlights the urgent need for advanced diagnostic and treatment strategies to improve patient outcomes. Kidney cancer, predominantly renal cell carcinoma (RCC), is complex due to its multiple subtypes, each with unique prognostic implications [1,2].
Anatomically, the kidney is enclosed by a membrane called Gerota’s fascia, which helps prevent the spread of kidney cancer; because a needle biopsy would breach this barrier, biopsy is generally avoided. For this reason, imaging diagnosis is very important in kidney cancer. Kidney cancer is usually discovered incidentally on ultrasonography or CT, and when a renal mass is diagnosed as cancer on imaging, the accuracy is reported to be about 90% [3]. However, this high diagnostic rate includes all cases that can be easily diagnosed as cancer, such as very large kidney tumors. For small kidney tumors or fat-poor tumors, the diagnostic rate drops sharply [4]. When pathological results cannot be predicted, surgical management is chosen proactively, but partial and radical nephrectomy are highly invasive procedures that require advanced surgical expertise. If the lesion turns out to be benign, these surgeries can impose significant health and economic burdens on the patient. Therefore, accurate screening through imaging studies is of paramount importance.
Accurate diagnosis of kidney cancer subtypes remains challenging due to the variability in their appearance on imaging modalities like computed tomography (CT). This variability can lead to differential diagnostic outcomes, affecting treatment decisions [5,6]. Studies such as those by Bauman et al. [7] and Tanaka et al. [8] have documented the incidence and challenges of accurately classifying these tumors, emphasizing the risk of unnecessary interventions due to diagnostic inaccuracies.
The integration of machine learning with radiomic data represents a cutting-edge approach in the diagnosis of kidney cancer. Deep learning models, particularly those employing convolutional neural networks (CNNs), have demonstrated superior accuracy in identifying and classifying renal tumors compared to traditional methods [9,10]. Such models not only streamline the diagnostic process, but also reduce the variability associated with human interpretation [11,12].
Our research team has also made several attempts to use AI to interpret CT scans, both to detect renal tumors and to classify their subtypes. We first attempted prediction using convolutional neural networks (CNNs), an AI technique known for high predictive accuracy. That study achieved an accuracy of nearly 62%, confirmed that prediction is feasible, and showed that the predictive accuracy of the AI model exceeded that of radiologists [13]. Such a model addresses critical gaps in current diagnostic approaches by providing a robust, automated system that offers high accuracy and reproducibility [14]. However, while the CNN demonstrated good predictive performance, it could not offer actionable feedback or highlight specific components for radiologists to focus on, and so could not provide feedback substantial enough to advance current medical practice.
Radiomics has emerged as a revolutionary technique in medical imaging, offering the potential to extract quantifiable data from images that are beyond human visual detection. This approach has significantly advanced the classification of renal masses, aiding in distinguishing between benign and malignant forms [15]. Sun et al. [16] and Leon et al. [17] provide evidence of the effectiveness of radiomic features in enhancing the diagnostic precision of RCC.
Radiomics, an AI-driven technology, offers significant potential as a supportive tool for radiologists. Its diagnostic decisions are based on features that can be validated by human experts, making it a reliable feedback mechanism. Leveraging this technology, we explored its application in diagnosing subtypes of renal cancer. We hypothesized that integrating radiomics into clinical practice could enhance the diagnostic accuracy of radiologists. As a precursor to this study, we first conducted research focused solely on determining the presence or absence of tumors [18]. Through this approach, we aimed to identify, using radiomics, features that should be prioritized for classification. This study is also part of the process of verifying those features.
By improving the accuracy of subtype classification, our model contributes to more tailored treatment strategies, potentially leading to better patient outcomes. The adaptability of our model across different imaging modalities and its application to other cancer types also suggests a broad potential impact, paving the way for future innovations in oncological diagnostics [19,20].

2. Materials and Methods

2.1. Patients

We utilized data from patients who underwent total or partial renal resection at a single institution between 2003 and 2021. Excluded from the study were cases of kidney tumor with a fat component exceeding 30%. Only patients with fat-poor tumors were included, resulting in a total selection of 499 patients. The average age of these patients was 56.02 ± 12.18 years, and the mean tumor size on CT was 3.515 ± 2.42 cm. The study encompassed patients with benign tumors like oncocytoma and AML, as well as those with malignant tumors such as clear cell, chromophobe, and papillary-type RCC. Each patient underwent CT imaging in one to four phases, resulting in a total of 1548 sets of CT images. Details of the patient demographics are provided in Table 1.
CT scans were obtained in various configurations: non-contrast, arterial phase (20–30 s after contrast injection), portal phase (60–70 s), and delayed phase (>180 s). For each CT scan, voxel-level segmentation labels were collected, with trained annotators manually outlining kidneys and tumors in the images. Annotations were further reviewed and refined by a radiologist with 11 years of experience. A second diagnosis was provided only if the radiologist had significant doubts about the initial diagnosis. Radiologist performance was evaluated based on the first diagnosis (top-one performance) and both the first and second diagnoses (top-two performance).
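The top-one and top-two evaluation described above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function name `top_k_accuracy` and the example reads are hypothetical, and a missing second diagnosis is represented as `None`.

```python
def top_k_accuracy(truth, diagnoses, k):
    """Fraction of cases whose true subtype is among the first k diagnoses given.

    `diagnoses` holds one (first, second) tuple per case; `second` is None
    when the reader did not offer a second diagnosis.
    """
    hits = 0
    for true_label, ranked in zip(truth, diagnoses):
        candidates = [d for d in ranked[:k] if d is not None]
        hits += true_label in candidates
    return hits / len(truth)


# Hypothetical reads, for illustration only (not data from the study).
truth = ["ccRCC", "AML", "oncocytoma", "pRCC"]
reads = [("ccRCC", None), ("ccRCC", "AML"), ("chRCC", None), ("pRCC", None)]

top_one = top_k_accuracy(truth, reads, k=1)  # only the first diagnosis counts
top_two = top_k_accuracy(truth, reads, k=2)  # first or second diagnosis counts
```

Because a second diagnosis is optional, top-two performance can only match or exceed top-one performance for the same reader.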

2.2. Radiomics Feature Analysis Workflow

Radiomics quantitatively extracts textural information from medical images, which can then be used by machine learning algorithms to support clinical decision-making. To develop multi-phase CT scan radiomic feature-based machine learning models for classifying subtypes of renal cell tumors, we initially obtained various types of radiomic features from each phase of the CT scans. Considering that some participants only underwent one or two phases of CT scans, such as the non-contrast and portal phases, we split participants with multi-phase and those with limited phase CT scans differently into training and hold-out test sets to align with our objective of developing machine learning models based on multi-phase CT scan radiomic features. We first randomly split participants with multi-phase CT scans into a training set (75%) and a hold-out test set (25%) in a stratified manner, and then we included those with single-phase or two-phase CT scans only in the hold-out test set. This resulted in both the training and the hold-out test set having a similar number of participants. The training and hold-out test sets consisted of similar proportions of renal cell tumor subtypes. Only the training set was used for feature selection and model training with 10-fold cross-validation. We compared the following four popular machine learning algorithms: Linear Support Vector Machine (Linear SVM), Radial basis function Support Vector Machine (Rbf SVM), Random Forest, and XGBoost [21]. We conducted hyperparameter tuning through 50 trials of a random hyperparameter search for renal cell tumor subtype classification. We searched hyperparameters with Optuna (version 3.1.0) [22]. The code used for our analyses is available for reproducibility testing (https://github.com/Transconnectome/Kidney_Radiomics; accessed on 9 February 2025).
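The participant split described above can be sketched in plain Python. This is an illustrative sketch rather than the authors' code (which relies on scikit-learn): the function name, the toy cohort, and the reading of "multi-phase" as three or more phases are assumptions made here for the example.

```python
import random
from collections import defaultdict


def split_participants(participants, train_fraction=0.75, seed=0):
    """Split (participant_id, subtype, n_phases) records into train/test ids.

    Participants with limited-phase CT go straight to the hold-out test set;
    the rest are split per subtype so that subtype proportions are preserved.
    """
    rng = random.Random(seed)
    train, test = [], []
    by_subtype = defaultdict(list)
    for pid, subtype, n_phases in participants:
        if n_phases >= 3:  # ASSUMPTION: "multi-phase" means three or four phases
            by_subtype[subtype].append(pid)
        else:              # single-phase or two-phase: hold-out test set only
            test.append(pid)
    for subtype, ids in sorted(by_subtype.items()):
        rng.shuffle(ids)
        n_train = round(len(ids) * train_fraction)
        train.extend(ids[:n_train])
        test.extend(ids[n_train:])
    return train, test


# Hypothetical toy cohort, for illustration only.
cohort = (
    [(f"cc{i}", "ccRCC", 4) for i in range(8)]
    + [(f"aml{i}", "AML", 3) for i in range(4)]
    + [(f"lim{i}", "ccRCC", 1 + i % 2) for i in range(3)]
)
train_ids, test_ids = split_participants(cohort)
```

Routing every limited-phase participant to the test set keeps the training data fully multi-phase, at the cost of a test set whose phase coverage differs from that of the training set.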
Due to the class imbalance among renal cell tumor subtypes, with far more participants having certain subtypes, such as clear cell renal cell carcinoma, than others, we employed oversampling during the training phase of model-based feature selection and model development. We implemented oversampling with the Synthetic Minority Over-sampling Technique (SMOTE) using imbalanced-learn (version 0.10.1) [23]. We evaluated the classification performance of our machine learning models based on their ability to accurately differentiate each renal cell tumor subtype from the others. To assess this, we calculated evaluation metrics, including accuracy, F1 score, Area Under the Receiver Operating Characteristic Curve (AU-ROC), and Area Under the Precision-Recall Curve (AU-PRC), using a One-vs-Rest approach and the “micro” averaging option in scikit-learn. The hold-out test set was used exclusively to evaluate the final performance of the machine learning models. We conducted feature selection, machine learning model development, and model evaluation using scikit-learn (version 1.2.1) [24] and Python (version 3.10.8). We followed the Checklist for Artificial Intelligence in Medical Imaging (CLAIM) to ensure the reproducibility of our study [25]. A schematic overview of the radiomics feature analysis workflow is illustrated in Figure 1.
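To make the One-vs-Rest "micro" averaging concrete, the sketch below pools true positives, false positives, and false negatives over all subtypes before computing precision, recall, and F1. It is a plain-Python illustration with hypothetical labels, not the scikit-learn call used in the study.

```python
def micro_scores(y_true, y_pred, classes):
    """Micro-averaged precision, recall, F1, plus plain accuracy."""
    tp = fp = fn = 0
    for c in classes:  # one-vs-rest: treat each subtype in turn as "positive"
        for t, p in zip(y_true, y_pred):
            if p == c and t == c:
                tp += 1
            elif p == c and t != c:
                fp += 1
            elif p != c and t == c:
                fn += 1
    precision = tp / (tp + fp) if tp else 0.0
    recall = tp / (tp + fn) if tp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return precision, recall, f1, accuracy


# Hypothetical predictions, for illustration only.
classes = ["ccRCC", "pRCC", "chRCC", "AML", "oncocytoma"]
y_true = ["ccRCC", "ccRCC", "ccRCC", "AML", "pRCC", "oncocytoma"]
y_pred = ["ccRCC", "ccRCC", "AML", "AML", "ccRCC", "oncocytoma"]
precision, recall, f1, accuracy = micro_scores(y_true, y_pred, classes)
```

When every case has exactly one true label and one predicted label, each misclassification counts once as a false positive and once as a false negative, so micro-averaged precision, recall, and F1 all coincide with accuracy.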

2.3. Radiomics Feature Extraction

We resampled 1548 multi-phase CT scans to a resolution of 1 mm × 1 mm × 1 mm, and then we extracted 1288 radiomic features from the segmented regions of interest (ROIs) of original CT scans, wavelet-filtered CT scans, and Laplacian of Gaussian-filtered scans. We extracted radiomic features utilizing the Python package PyRadiomics (version 3.1.0) [26] and Python (version 3.7). The extracted radiomic features included first-order features, three-dimensional shape features, Gray Level Co-occurrence Matrix (GLCM), Gray Level Run Length Matrix (GLRLM), Gray Level Size Zone Matrix (GLSZM), Neighboring Gray Tone Difference Matrix (NGTDM), and Gray Level Dependence Matrix (GLDM) features. Representative multi-phase CT scans with corresponding kidney ROI masks used for radiomic feature extraction are shown in Figure 2.
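As an illustration of what a texture feature measures, the sketch below builds a gray level co-occurrence matrix for a tiny 2D patch and derives the contrast feature from it. It is a deliberately simplified stand-in: PyRadiomics operates on 3D ROIs, discretizes gray levels, and aggregates over several directions, whereas this example uses one horizontal offset and a hypothetical 4 × 4 patch.

```python
def glcm(image, levels, offset=(0, 1), symmetric=True):
    """Normalized gray level co-occurrence matrix for one pixel offset."""
    dr, dc = offset
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                if symmetric:  # count each pair in both directions
                    counts[j][i] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def glcm_contrast(p):
    """Contrast: co-occurrence probabilities weighted by squared level difference."""
    n = len(p)
    return sum((i - j) ** 2 * p[i][j] for i in range(n) for j in range(n))


# Hypothetical patch with four gray levels (0 to 3).
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(patch, levels=4)
contrast = glcm_contrast(p)
```

Contrast is low when neighboring voxels tend to share similar gray levels and grows as abrupt intensity changes become more frequent, which is the kind of texture information the matrix-based feature families summarize.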

2.4. Feature Selection and Dimensionality Reduction

To ensure that the hold-out test set remained independent and unbiased for evaluating the final model’s performance, we applied several feature selection methods and a dimensionality reduction method solely to the training set. In the feature selection process, an ANOVA F-test with FDR correction using the Benjamini–Hochberg procedure (Pfdr = 0.05) was performed on 1288 z-score normalized radiomic features derived from each CT phase. This selected 846, 773, 660, and 972 features from the arterial, delayed, non-contrast, and portal phase scans, respectively. Principal Component Analysis (PCA) was then performed on those selected radiomic features to reduce dimensionality. In this study, we aimed to compare the predictive performance of machine learning models with respect to the number of principal components used, a hyperparameter typically determined by researchers using rule-of-thumb approaches [27,28]. We set the number of principal components to values corresponding to 10% and 20% of the count of features selected via the F-test. When the number of principal components was set to 10% of the selected feature count, the final feature counts were 84, 77, 66, and 97 for arterial, delayed, non-contrast, and portal phase scans, respectively. When set to 20%, the final feature counts were 169, 154, 132, and 194 for the corresponding phases. The PCA results from the training set indicated that setting the number of principal components to 10% and 20% of the count of selected features was appropriate for representing radiomic features selected via the F-test while effectively reducing the dimensionality of the feature space (the cumulative explained variance of the components was almost 100%) (Figures S1 and S2). During development of the machine learning models, we assessed model performance in relation to the number of principal components used. We standardized these principal components (i.e., PCA whitening) for further analyses.
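The two numerical steps in this paragraph can be illustrated with a short sketch: the Benjamini–Hochberg step-up rule applied to a list of p-values, and the component counts obtained as 10% and 20% of the selected feature counts. The function names and example p-values are hypothetical; rounding down is inferred here from the reported counts rather than stated in the text.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return True for each hypothesis rejected at false discovery rate alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank  # step-up: remember the largest passing rank
    keep = [False] * m
    for rank, idx in enumerate(order, start=1):
        keep[idx] = rank <= max_rank
    return keep


def n_components(n_selected, percent):
    """Number of principal components as a percentage of selected features."""
    return n_selected * percent // 100  # ASSUMPTION: fractions are rounded down


# Hypothetical p-values, for illustration only.
kept = benjamini_hochberg([0.039, 0.001, 0.035, 0.012, 0.6])

# Selected feature counts per phase, as reported in the text.
selected = {"arterial": 846, "delayed": 773, "non-contrast": 660, "portal": 972}
pcs_10 = {phase: n_components(n, 10) for phase, n in selected.items()}
pcs_20 = {phase: n_components(n, 20) for phase, n in selected.items()}
```

In the example, the p-value 0.035 fails its own rank threshold (0.03) but is still retained because a larger p-value (0.039) passes at a higher rank, which is the defining behavior of the step-up procedure.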
Additionally, we employed the Sequential Feature Selector (SFS) on these standardized principal components to identify those that could optimize model evaluation performance on the validation set. The SFS method systematically chose features through a sequential removal process, based on their impact on model performance in the validation set.
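The sequential removal process can be sketched as a greedy backward elimination. This is a generic illustration and not the scikit-learn selector used in the study: the `score` callable stands in for validation-set performance, and the stopping rule `n_keep` and the toy gain values are hypothetical.

```python
def backward_select(features, score, n_keep):
    """Greedily drop the feature whose removal gives the best-scoring subset."""
    selected = list(features)
    while len(selected) > n_keep:
        candidates = [[f for f in selected if f != drop] for drop in selected]
        selected = max(candidates, key=score)  # keep the best reduced subset
    return selected


# Hypothetical stand-in for validation performance: each component adds a
# fixed gain, and every retained component costs a small penalty.
gain = {"PC1": 0.30, "PC2": 0.02, "PC3": 0.20, "PC4": 0.01}


def toy_score(subset):
    return sum(gain[f] for f in subset) - 0.05 * len(subset)


chosen = backward_select(["PC1", "PC2", "PC3", "PC4"], toy_score, n_keep=2)
```

In practice the score would come from refitting the classifier on each candidate subset and evaluating it on the validation folds, which is what makes sequential selection computationally expensive for large component sets.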

2.5. Reproducibility and Ethical Considerations

All processes from feature extraction to model evaluation adhered to the Checklist for Artificial Intelligence in Medical Imaging (CLAIM) standards to ensure reproducibility and ethical compliance [25]. Our code is publicly available for validation and further research in our GitHub repository (https://github.com/Transconnectome/Kidney_Radiomics).

3. Results

3.1. Demographic Characteristics of Patients

In total, 499 patients were included in the development of our multi-phase CT radiomics machine learning model for renal cell tumor subtype classification. Participants with either single-phase or two-phase CT scans were included solely in the hold-out test set. Meanwhile, those with multi-phase CT scans were divided using a stratified approach, with 75% allocated to the training dataset and the remaining 25% reserved for the hold-out test dataset (Ntraining = 289, Ntest = 210). There were no significant statistical differences in demographic or pathological characteristics between the training and hold-out test datasets (p > 0.05) (Table 1).

3.2. Multi-Phase CT Radiomic Features for Renal Cell Tumor Subtype Classification

We compared the classification performance on a hold-out test set across various machine learning models with multiple evaluation metrics. The XGBoost model, trained on 20% of principal components from selected radiomic features (XGBoost20%), outperformed other algorithms in classifying renal cell tumor subtypes, achieving the highest average accuracy (ACC: mean = 0.475, standard error = 0.011), Area Under the Receiver Operating Characteristic Curve (AU-ROC: mean = 0.744, standard error = 0.004), Area Under the Precision-Recall Curve (AU-PRC: mean = 0.474, standard error = 0.006), and F1 score (mean = 0.453, standard error = 0.007). Similarly, XGBoost trained on 10% of principal components from selected radiomic features (XGBoost10%) also performed well, achieving the highest performance evaluation metrics across ACC (mean = 0.453, standard error = 0.007), AU-ROC (mean = 0.744, standard error = 0.004), AU-PRC (mean = 0.474, standard error = 0.006), and F1 (mean = 0.453, standard error = 0.007). However, XGBoost20% showed higher overall metrics than XGBoost10%. These results are detailed in Table 2.
Of note, although the performance in distinguishing a specific subtype from the other subtypes varied according to the machine learning model and the number of principal components used, XGBoost showed the best performance in classifying clear cell renal cell carcinoma against other subtypes when using either 10% (AU-PRC of XGBoost10%: mean = 0.673, standard error = 0.008; AU-ROC of XGBoost10%: mean = 0.738, standard error = 0.007) or 20% (AU-PRC of XGBoost20%: mean = 0.686, standard error = 0.012; AU-ROC of XGBoost20%: mean = 0.750, standard error = 0.011) of principal components for training. For angiomyolipoma (AML), Random Forest trained with 10% of principal components achieved the highest AU-ROC in classifying AML against other subtypes (mean = 0.769, standard error = 0.009). However, its AU-PRC for distinguishing renal angiomyolipoma from other subtypes (mean = 0.366, standard error = 0.01) was lower than the AU-PRC obtained for clear cell renal cell carcinoma. These results indicated that machine learning models were biased towards predicting clear cell renal cell carcinoma, which constitutes the majority of the dataset. The results are detailed in Table 2 and Figure 3.
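For readers less familiar with the "mean, standard error" format used throughout the results, a minimal sketch of the calculation follows. The text does not state over which replicates the standard error was computed, so the list of scores below is purely hypothetical (for example, one AU-ROC value per repeated evaluation run), and the usual definition of sample standard deviation divided by the square root of the number of replicates is assumed.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_se(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical AU-ROC values from five evaluation replicates.
scores = [0.74, 0.75, 0.73, 0.76, 0.74]
avg, se = mean_and_se(scores)
```

A small standard error relative to the differences between models suggests that a ranking is stable across replicates, whereas overlapping intervals call for caution when declaring one model better than another.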

3.3. Single-Phase CT Radiomic Features for Renal Cell Tumor Subtype Classification

In contrast with the results from the multi-phase models, the machine learning models trained on 10% of principal components from selected radiomic features generally outperformed machine learning models trained on 20% of principal components for multiple evaluation metrics. XGBoost, trained on radiomic features derived from arterial and portal phase CT scans, achieved the highest F1 scores (arterial phase: mean = 0.594, standard error = 0.037; portal phase: mean = 0.456, standard error = 0.015) and ACC (arterial phase: mean = 0.594, standard error = 0.037; portal phase: mean = 0.456, standard error = 0.015) compared to other machine learning models trained on arterial and portal phase CT scan radiomic features. Random Forest trained on portal phase CT scan radiomic features showed the highest AU-PRC (portal phase: mean = 0.473, standard error = 0.015) and AU-ROC (portal phase: mean = 0.756, standard error = 0.008). Random Forest trained on radiomic features derived from delayed and non-contrast phase CT scans also achieved the highest F1 (delayed phase: mean = 0.493, standard error = 0.015; non-contrast phase: mean = 0.461, standard error = 0.011), AU-PRC (delayed phase: mean = 0.492, standard error = 0.031; non-contrast phase: mean = 0.471, standard error = 0.013), ACC (delayed phase: mean = 0.493, standard error = 0.015; non-contrast phase: mean = 0.461, standard error = 0.011), and AU-ROC (delayed phase: mean = 0.766, standard error = 0.016; non-contrast phase: mean = 0.744, standard error = 0.008).
Similar to models trained on 10% of principal components from selected radiomic features, XGBoost and Random Forest, when trained on 20% of principal components, showed the highest evaluation performance among machine learning models using the same proportion of principal components. However, their performance was slightly lower compared to models trained with only 10% of principal components. When trained with 20% of principal components derived from arterial phase CT scan radiomic features, XGBoost achieved the best performance across multiple metrics: F1 (arterial phase: mean = 0.531, standard error = 0.002), AU-PRC (arterial phase: mean = 0.565, standard error = 0.014), and ACC (arterial phase: mean = 0.531, standard error = 0.02). XGBoost, when trained with 20% of principal components derived from delayed phase CT scans, achieved the highest AU-ROC (delayed phase: mean = 0.766, standard error = 0.001) compared to other machine learning models trained with the same components. Similarly, XGBoost trained with 20% of principal components derived from portal phase CT scans showed the highest F1 (portal phase: mean = 0.447, standard error = 0.019) and ACC (portal phase: mean = 0.447, standard error = 0.019), while Random Forest achieved the highest AU-PRC (portal phase: mean = 0.449, standard error = 0.014) and AU-ROC (portal phase: mean = 0.735, standard error = 0.007) using the same components. Additionally, Random Forest achieved the highest performance in all metrics when trained with 20% of principal components derived from delayed and non-contrast phase CT scans: F1 (delayed phase: mean = 0.497, standard error = 0.016; non-contrast phase: mean = 0.45, standard error = 0.006), AU-PRC (delayed phase: mean = 0.529, standard error = 0.02; non-contrast phase: mean = 0.433, standard error = 0.009), and ACC (delayed phase: mean = 0.497, standard error = 0.016; non-contrast phase: mean = 0.45, standard error = 0.006).
In the case of AU-ROC, Random Forest, when trained with 20% of principal components derived from either the arterial or the non-contrast phase, showed the best performance (arterial phase: mean = 0.797, standard error = 0.018; non-contrast phase: mean = 0.716, standard error = 0.007). The results from single-phase CT radiomic feature-based machine learning models are detailed in Table 3.
In subtype-specific performance, single-phase CT radiomic feature-based machine learning models also performed well on classifying clear cell renal cell carcinoma, similar to the results from multi-phase CT radiomics feature-based machine learning models. Notably, XGBoost, trained with 10% of principal components derived from selected radiomics features extracted from arterial phase CT scans, achieved the highest performance in classifying clear cell renal cell carcinoma (AU-PRC: mean = 0.757, standard error = 0.033) and renal angiomyolipoma (AU-ROC: mean = 0.824, standard error = 0.023). These results outperformed other single-phase CT radiomics feature-based machine learning models trained with 20% of principal components (Table 4A–D; Supplementary Figures S3–S6).

3.4. Comparison of Multi-Phase and Single-Phase Models

In summary, performance of machine learning models varied depending on which phase of the CT scan was used for training and which evaluation metric was applied. However, single-phase CT radiomics feature-based models generally outperformed multi-phase models. XGBoost and Random Forest consistently showed better performance than linear SVM and rbf SVM. For single-phase CT scans, machine learning models using 10% principal components derived from selected radiomic features yielded better results. In contrast, for multi-phase scans, models using 20% principal components derived from selected radiomic features performed better. Of note, regardless of the proportion of principal components used, models trained with components derived from single-phase CT scans consistently outperformed those trained with components from multi-phase CT scans. The superior performance of single-phase models, particularly those based on arterial phase scans, suggests that these may be sufficient for accurate renal cell tumor subtype classification, potentially reducing the need for multi-phase CT imaging in clinical practice (Table 4A–D; Supplementary Figures S3–S6).

4. Discussion

This study represents a significant step forward in the application of radiomics and machine learning for the classification of renal cell tumor subtypes. By leveraging multi-phase CT scans and advanced machine learning techniques, we have demonstrated a novel approach that has the potential to transform the diagnostic landscape in renal oncology [29,30,31]. Our findings address several critical challenges in the current management of renal tumors. The XGBoost model, trained on 20% of principal components from multi-phase CT radiomic features (XGBoost20%), demonstrated the best overall performance in classifying renal cell tumor subtypes.
Accurate preoperative classification of renal tumor subtypes is a significant clinical challenge, impacting treatment planning and patient outcomes [32,33]. Our radiomics-based approach offers a non-invasive and cross-sectional method to enhance diagnostic accuracy, potentially reducing the need for invasive biopsies and their associated risks [34,35].
The superior performance in identifying clear cell renal cell carcinoma (ccRCC) is particularly noteworthy, given that this subtype is associated with the worst prognosis among RCC subtypes [32]. Early and accurate identification could lead to more aggressive treatment strategies and improved patient outcomes. For ccRCC, our multi-phase CT scan radiomic feature-based machine learning models achieved an AU-PRC of 0.69 and an AU-ROC of 0.75 in the independent hold-out test set, which represents meaningful prediction performance.
A key finding of our study is that single-phase CT scans, particularly those from the arterial phase, can achieve comparable or even superior performance to multi-phase models for certain subtypes. The XGBoost model trained on arterial phase CT scans achieved the highest performance in classifying clear cell renal cell carcinoma (AU-PRC = 0.757) and renal angiomyolipoma (AU-ROC = 0.824). This suggests the potential for reduced radiation exposure and shorter imaging protocols, aligning with the broader goals of minimizing patient risk and optimizing resource utilization in healthcare [31,34].
Interestingly, models trained on 10% of principal components generally outperformed those trained on 20% for single-phase CT scans. This suggests that a more parsimonious feature set may suffice for accurate classification, potentially reducing computational complexity and improving model interpretability [33]. This finding opens new avenues for optimizing feature selection in radiomics research.
Our study also highlights the varying levels of model performance across different renal tumor subtypes. While clear cell renal cell carcinoma was consistently well-classified, other subtypes proved more challenging. This pattern aligns with the known heterogeneity of renal tumors and underscores the need for subtype-specific approaches in radiomics research [32,34]. Furthermore, our study adds to the growing body of evidence supporting the utility of radiomics in oncology [30]. By demonstrating that quantitative imaging features can capture clinically relevant information not visible to the naked eye, we reinforce the potential of radiomics to serve as a powerful tool in precision medicine [33].
Despite the promising results, our study has several limitations that warrant discussion. The predominance of ccRCC cases (64.3% in the training cohort) reflects a common challenge in medical imaging studies but may have introduced bias in our models. Our use of single-institution data may limit the generalizability of our findings to different patient populations or imaging protocols. The retrospective nature of our study could have introduced selection bias, and prospective validation studies are necessary to confirm the clinical utility of our approach. We encountered challenges in differentiating among specific subtypes, particularly in identifying oncocytomas (AUC = 0.57–0.69), highlighting the need for further refinement of our models and potentially the incorporation of additional data types. Lastly, while we used an independent hold-out test set, our study lacks external validation on datasets from different institutions or populations, which is crucial for assessing the robustness and generalizability of our findings [33,34].
The difficulty in identifying oncocytomas mirrors the difficulties faced in clinical practice. This alignment between computational and human challenges in tumor classification suggests that future improvements in AI models could have direct and significant impacts on clinical decision-making.
Looking ahead, our findings open up several exciting avenues for future research. The exploration of deep learning models, which can learn directly from image data without the need for hand-crafted features, represents a promising next step. Additionally, the integration of radiomics with other data types, such as genomics or clinical information, could lead to more comprehensive and accurate predictive models.
Our study has further limitations. First, the predictive accuracy of our models did not reach the commonly accepted threshold of 80%. However, we believe this limitation can be overcome by expanding the dataset, and we are actively collecting additional data to address this issue. Additionally, because our study was conducted retrospectively, its efficacy has not yet been fully validated. We anticipate that conducting a prospective study based on these data will demonstrate its practical value.
Furthermore, the goal of this study was to identify recommended features using radiomics techniques and utilize feedback to enhance radiologists’ predictive accuracy. However, due to the scope of this research, we were only able to confirm the model’s ability to predict subtypes. We plan to include these broader aspects in future studies to fully realize the potential of this approach.
Ultimately, if we are able to validate our hypothesis step by step, and this predictive model achieves successful outcomes, it will allow us to determine the pathology of renal cancer through imaging studies alone, without the need for biopsy. This advancement would enable us to predict the optimal timing for surgery, prepare appropriate chemotherapy protocols, avoid unnecessary surgeries and their associated risks, and achieve significant economic savings.

5. Conclusions

In conclusion, our study not only demonstrates the potential of radiomics-based machine learning in renal tumor classification, but also contributes to the broader dialogue about the role of AI in medical imaging. The patterns we observed, particularly the efficacy of single-phase CT and the potential for feature set optimization, provide valuable insights for future research directions. As we continue to refine these techniques and address current limitations, the promise of more personalized, accurate, and efficient diagnosis and treatment planning in renal oncology comes increasingly within reach. This work represents a crucial step towards realizing the potential of precision medicine in the management of renal tumors, with far-reaching implications for patient care and outcomes.

Code Availability

Predicting malignancy in renal tumors using CT radiomics was found to be feasible. Based on this technology, future advances in the diagnosis of renal tumors are expected. The relevant code is freely available for reproducibility.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/diagnostics15111365/s1, Figure S1: Scree plots of principal components derived from PCA using 10% of selected radiomics features via F-test for each CT phase, Figure S2: Scree plots of principal components derived from PCA using 10% of selected radiomics features via F-test for each CT phase, Figure S3: Renal cell tumor subtype classification performance of XGBoost trained with 10% of principal components derived from arterial phase CT scan radiomic features, Figure S4: Renal cell tumor subtype classification performance of XGBoost trained with 10% of principal components derived from delayed phase CT scan radiomic features, Figure S5: Renal cell tumor subtype classification performance of XGBoost trained with 10% of principal components derived from non-contrast phase CT scan radiomic features, Figure S6: Renal cell tumor subtype classification performance of XGBoost trained with 10% of principal components derived from portal phase CT scan radiomic features

Author Contributions

Conceptualization, S.B.; Methodology, H.W.; Software, H.W., S.-H.H. and M.H.C.; Formal analysis, S.B., H.B. and J.C.; Investigation, S.-H.H.; Resources, H.B. and J.C.; Data curation, H.B. and M.H.C.; Writing—original draft, S.B. and H.W.; Writing—review & editing, M.H.C.; Project administration, S.-H.H.; Funding acquisition, J.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Korea Medical Device Development Fund grant funded by the Korean government (Ministry of Science and ICT, Ministry of Trade, Industry and Energy, Ministry of Health & Welfare, Republic of Korea, Ministry of Food and Drug Safety) (Project Number: KMDF_PR_20200901_0096) and the Technology Development Program (RS-2024-00460263) funded by the Ministry of SMEs and Startups (MSS, Korea). This work was also supported by the Creative-Pioneering Researchers Program through Seoul National University (No. 200-20240057, 200-20240135) and by the Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) [No. RS-2021-II211343, Artificial Intelligence Graduate School Program (Seoul National University)].

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Review Board of the Catholic University of Korea, Seoul St Mary’s Hospital (Protocol code KC22RISI0753) on 29 October 2024.

Informed Consent Statement

Patient consent was waived due to the retrospective nature of the study.

Data Availability Statement

The relevant code is freely available for reproducibility (https://github.com/Transconnectome/Kidney_Radiomics, accessed on 16 October 2023).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Moch, H.; Cubilla, A.L.; Humphrey, P.A.; Reuter, V.E.; Ulbright, T.M. The 2016 WHO classification of tumours of the urinary system and male genital organs—Part A: Renal, penile, and testicular tumours. Eur. Urol. 2016, 70, 93–105. [Google Scholar] [CrossRef] [PubMed]
  2. American Cancer Society. Key Statistics About Kidney Cancer. Available online: https://www.cancer.org/cancer/types/kidney-cancer/about/key-statistics.html (accessed on 9 February 2025).
  3. Capitanio, U.; Bensalah, K.; Bex, A.; Boorjian, S.A.; Bray, F.; Coleman, J.; Gore, J.L.; Sun, M.; Wood, C.; Russo, P. Epidemiology of Renal Cell Carcinoma. Eur. Urol. 2019, 75, 74–84. (In English) [Google Scholar] [CrossRef] [PubMed]
  4. Kang, S.K.; Huang, W.C.; Pandharipande, P.V.; Chandarana, H. Solid renal masses: What the numbers tell us. AJR Am. J. Roentgenol. 2014, 202, 1196–1206. (In English) [Google Scholar] [CrossRef]
  5. Prasad, S.R.; Dalrymple, N.C.; Surabhi, V.R. Cross-sectional imaging evaluation of renal masses. Radiol. Clin. N. Am. 2008, 46, 95–111. [Google Scholar] [CrossRef]
  6. Young, J.R.; Margolis, D.; Sauk, S.; Pantuck, A.J.; Sayre, J.; Raman, S.S. Clear cell renal cell carcinoma: Discrimination from other renal cell carcinoma subtypes and oncocytoma at multiphasic multidetector CT. Radiology 2013, 267, 444–453. [Google Scholar] [CrossRef]
  7. Bauman, T.M.; Potretzke, A.M.; Wright, A.J.; Knight, B.A.; Vetter, J.M.; Figenshau, R.S. Partial nephrectomy for presumed renal-cell carcinoma: Incidence, predictors, and perioperative outcomes of benign lesions. J. Endourol. 2017, 31, 412–417. [Google Scholar] [CrossRef]
  8. Tanaka, T.; Huang, Y.; Marukawa, Y.; Tsuboi, Y.; Masaoka, Y.; Kojima, K.; Iguchi, T.; Hiraki, T.; Gobara, H.; Yanai, H. Differentiation of small (≤4 cm) renal masses on multiphase contrast-enhanced CT by deep learning. Am. J. Roentgenol. 2020, 214, 605–612. [Google Scholar] [CrossRef]
  9. De Fauw, J.; Ledsam, J.R.; Romera-Paredes, B.; Nikolov, S.; Tomasev, N.; Blackwell, S.; Askham, H.; Glorot, X.; O’Donoghue, B.; Visentin, D. Clinically applicable deep learning for diagnosis and referral in retinal disease. Nat. Med. 2018, 24, 1342–1350. [Google Scholar] [CrossRef]
  10. Esteva, A.; Kuprel, B.; Novoa, R.A.; Ko, J.; Swetter, S.M.; Blau, H.M.; Thrun, S. Dermatologist-level classification of skin cancer with deep neural networks. Nature 2017, 542, 115–118. [Google Scholar] [CrossRef]
  11. Mei, X.; Lee, H.-C.; Diao, K.-y.; Huang, M.; Lin, B.; Liu, C.; Xie, Z.; Ma, Y.; Robson, P.M.; Chung, M. Artificial intelligence–enabled rapid diagnosis of patients with COVID-19. Nat. Med. 2020, 26, 1224–1228. [Google Scholar] [CrossRef]
  12. Ardila, D.; Kiraly, A.P.; Bharadwaj, S.; Choi, B.; Reicher, J.J.; Peng, L.; Tse, D.; Etemadi, M.; Ye, W.; Corrado, G. End-to-end lung cancer screening with three-dimensional deep learning on low-dose chest computed tomography. Nat. Med. 2019, 25, 954–961. [Google Scholar] [CrossRef] [PubMed]
  13. Uhm, K.H.; Jung, S.W.; Choi, M.H.; Shin, H.K.; Yoo, J.I.; Oh, S.W.; Kim, J.Y.; Kim, H.G.; Lee, Y.J.; Youn, S.Y.; et al. Deep learning for end-to-end kidney cancer diagnosis on multi-phase abdominal computed tomography. NPJ Precis. Oncol. 2021, 5, 54. (In English) [Google Scholar] [CrossRef] [PubMed]
  14. Zhao, W.; Jiang, D.; Queralta, J.P.; Westerlund, T. MSS U-Net: 3D segmentation of kidneys and tumors from CT images with a multi-scale supervised U-Net. Inform. Med. Unlocked 2020, 19, 100357. [Google Scholar] [CrossRef]
  15. Kim, S.H.; Kim, C.S.; Kim, M.J.; Cho, J.Y.; Cho, S.H. Differentiation of clear cell renal cell carcinoma from other subtypes and fat-poor angiomyolipoma by use of quantitative enhancement measurement during three-phase MDCT. Am. J. Roentgenol. 2016, 206, W21–W28. [Google Scholar] [CrossRef]
  16. Sun, X.-Y.; Feng, Q.-X.; Xu, X.; Zhang, J.; Zhu, F.-P.; Yang, Y.-H.; Zhang, Y.-D. Radiologic-radiomic machine learning models for differentiation of benign and malignant solid renal masses: Comparison with expert-level radiologists. Am. J. Roentgenol. 2020, 214, W44–W54. [Google Scholar] [CrossRef]
  17. de Leon, A.D.; Pedrosa, I. Imaging and screening of kidney cancer. Radiol. Clin. 2017, 55, 1235–1250. [Google Scholar] [CrossRef]
  18. Bang, S.; Wang, H.-H.; Kim, H.; Choi, M.H.; Cha, J.; Choi, Y.; Hong, S.-H. Development and Validation of a Prediction Model for Differentiation of Benign and Malignant Fat-Poor Renal Tumors Using CT Radiomics. Appl. Sci. 2023, 13, 11345. [Google Scholar] [CrossRef]
  19. Oberai, A.; Varghese, B.; Cen, S.; Angelini, T.; Hwang, D.; Gill, I.; Aron, M.; Lau, C.; Duddalwar, V. Deep learning based classification of solid lipid-poor contrast enhancing renal masses using contrast enhanced CT. Br. J. Radiol. 2020, 93, 20200002. [Google Scholar] [CrossRef]
  20. Kaur, R.; Juneja, M.; Mandal, A.K. Computer-aided diagnosis of renal lesions in CT images: A comprehensive survey and future prospects. Comput. Electr. Eng. 2019, 77, 423–434. [Google Scholar] [CrossRef]
  21. Chen, T.; Guestrin, C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd Acm Sigkdd International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794. [Google Scholar]
  22. Akiba, T.; Sano, S.; Yanase, T.; Ohta, T.; Koyama, M. Optuna: A next-generation hyperparameter optimization framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 2623–2631. [Google Scholar]
  23. LemaÃŽtre, G.; Nogueira, F.; Aridas, C.K. Imbalanced-learn: A python toolbox to tackle the curse of imbalanced datasets in machine learning. J. Mach. Learn. Res. 2017, 18, 1–5. [Google Scholar]
  24. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. [Google Scholar]
  25. Klontzas, M.E.; Gatti, A.A.; Tejani, A.S.; Kahn, C.E., Jr. AI reporting guidelines: How to select the best one for your research. Radiol. Artif. Intell. 2023, 5, e230055. [Google Scholar] [CrossRef] [PubMed]
  26. Van Griethuysen, J.J.; Fedorov, A.; Parmar, C.; Hosny, A.; Aucoin, N.; Narayan, V.; Beets-Tan, R.G.; Fillion-Robin, J.-C.; Pieper, S.; Aerts, H.J. Computational radiomics system to decode the radiographic phenotype. Cancer Res. 2017, 77, e104–e107. [Google Scholar] [CrossRef]
  27. van Timmeren, J.E.; Cester, D.; Tanadini-Lang, S.; Alkadhi, H.; Baessler, B. Radiomics in medical imaging-“how-to” guide and critical reflection. Insights Imaging 2020, 11, 91. (In English) [Google Scholar] [CrossRef]
  28. Li, S.; Wang, K.; Hou, Z.; Yang, J.; Ren, W.; Gao, S.; Meng, F.; Wu, P.; Liu, B.; Liu, J.; et al. Use of Radiomics Combined With Machine Learning Method in the Recurrence Patterns After Intensity-Modulated Radiotherapy for Nasopharyngeal Carcinoma: A Preliminary Study. Front. Oncol. 2018, 8, 648. (In English) [Google Scholar] [CrossRef]
  29. Shi, Y.; Zou, Y.; Liu, J.; Wang, Y.; Chen, Y.; Sun, F.; Yang, Z.; Cui, G.; Zhu, X.; Cui, X. Ultrasound-based radiomics XGBoost model to assess the risk of central cervical lymph node metastasis in patients with papillary thyroid carcinoma: Individual application of SHAP. Front. Oncol. 2022, 12, 897596. [Google Scholar] [CrossRef]
  30. Mühlbauer, J.; Egen, L.; Kowalewski, K.-F.; Grilli, M.; Walach, M.T.; Westhoff, N.; Nuhn, P.; Laqua, F.C.; Baessler, B.; Kriegmair, M.C. Radiomics in renal cell carcinoma—A systematic review and meta-analysis. Cancers 2021, 13, 1348. [Google Scholar] [CrossRef]
  31. Bertsimas, D.; Wiberg, H. Machine learning in oncology: Methods, applications, and challenges. JCO Clin. Cancer Inform. 2020, 4, CCI-20. [Google Scholar] [CrossRef]
  32. Alhussaini, A.J.; Steele, J.D.; Jawli, A.; Nabi, G. Radiomics Machine Learning Analysis of Clear Cell Renal Cell Carcinoma for Tumour Grade Prediction Based on Intra-Tumoural Sub-Region Heterogeneity. Cancers 2024, 16, 1454. [Google Scholar] [CrossRef]
  33. Lambin, P.; Leijenaar, R.T.; Deist, T.M.; Peerlings, J.; De Jong, E.E.; Van Timmeren, J.; Sanduleanu, S.; Larue, R.T.; Even, A.J.; Jochems, A. Radiomics: The bridge between medical imaging and personalized medicine. Nat. Rev. Clin. Oncol. 2017, 14, 749–762. [Google Scholar] [CrossRef]
  34. Uhlig, A.; Uhlig, J.; Leha, A.; Biggemann, L.; Bachanek, S.; Stöckle, M.; Reichert, M.; Lotz, J.; Zeuschner, P.; Maßmann, A. Radiomics and machine learning for renal tumor subtype assessment using multiphase computed tomography in a multicenter setting. Eur. Radiol. 2024, 34, 6254–6263. [Google Scholar] [CrossRef]
  35. Jeon, H.G.; Seo, S.I.; Jeong, B.C.; Jeon, S.S.; Lee, H.M.; Choi, H.-Y.; Song, C.; Hong, J.H.; Kim, C.-S.; Ahn, H. Percutaneous kidney biopsy for a small renal mass: A critical appraisal of results. J. Urol. 2016, 195, 568–573. [Google Scholar] [CrossRef]
Figure 1. Flow chart of radiomic feature analysis. Radiomics features extracted from multi-phase CT scans were used to develop machine learning models for classifying renal cell tumor subtypes. Participants with multi-phase scans were split into training (75%) and hold-out test (25%) sets, while those with single- or two-phase scans were included only in the test set. To address class imbalance, synthetic oversampling was applied during model training and feature selection.
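The split and oversampling steps in this flow chart can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation: the function names `stratified_split` and `oversample` are hypothetical, and the oversampler is a simplified SMOTE-style interpolation that always uses the single nearest same-class neighbour, whereas library implementations such as imbalanced-learn choose among several neighbours.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx), keeping each class's share roughly equal."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


def oversample(X, y, seed=0):
    """Simplified SMOTE-style sketch (an assumption, not the paper's code):
    interpolate between a minority sample and its nearest same-class
    neighbour until every class is as large as the largest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = list(X), list(y)
    for label, rows in by_class.items():
        for _ in range(target - len(rows)):
            a = rng.choice(rows)
            others = [r for r in rows if r is not a] or [a]
            b = min(others, key=lambda r: sum((p - q) ** 2 for p, q in zip(a, r)))
            t = rng.random()  # position along the segment from a to b
            X_out.append([p + t * (q - p) for p, q in zip(a, b)])
            y_out.append(label)
    return X_out, y_out
```

In a workflow like the one described, `oversample` would be applied to the training portion only, so that the hold-out test set contains real scans exclusively.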
Figure 2. Region of interest (ROI) masks for radiomic feature extraction. (A) Non-contrast phase CT scans overlaid with kidney ROI mask. (B) Arterial phase CT scans overlaid with kidney ROI mask. (C) Delayed phase CT scans overlaid with kidney ROI mask. (D) Portal phase CT scans overlaid with kidney ROI mask.
Figure 3. Renal cell tumor subtype classification performance of XGBoost. (A) Evaluation performance of machine learning models trained with 10% of the principal components derived from selected radiomic features in stratified 10-fold cross-validation. (B) Evaluation performance of machine learning models trained with 20% of the principal components derived from selected radiomic features in stratified 10-fold cross-validation. Abbreviations: AML = renal angiomyolipoma, chRCC = chromophobe renal cell carcinoma, pRCC = papillary renal cell carcinoma, ccRCC = clear cell renal cell carcinoma.
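The "selected radiomic features" in this caption are described in the supplementary figure captions as chosen via an F-test. A minimal sketch of such a ranking follows, assuming the F-test is the one-way ANOVA F statistic computed per feature across subtype labels (the same statistic scikit-learn exposes as `f_classif`). The helper names `anova_f` and `top_fraction` are hypothetical, and the 10% default is only an illustration of the fraction mentioned in the captions.

```python
from collections import defaultdict


def anova_f(values, labels):
    """One-way ANOVA F statistic of a single feature across class labels."""
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[y].append(v)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    ss_within = sum(
        (v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g
    )
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def top_fraction(X, labels, fraction=0.10):
    """Indices of the highest-F features, keeping at least one feature."""
    n_features = len(X[0])
    scores = [anova_f([row[j] for row in X], labels) for j in range(n_features)]
    n_keep = max(1, int(n_features * fraction))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:n_keep])
```

To avoid information leakage, a ranking of this kind would normally be fitted inside each cross-validation fold on the training portion only.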
Table 1. Demographic and pathological characteristics of the patients.
| | Training Dataset | Test Dataset | p Value |
|---|---|---|---|
| Sex | 289 | 210 | 0.639 |
| Male (%) | 162 (49.2%) | 123 (58.6%) | |
| Female (%) | 127 (50.8%) | 87 (41.4%) | |
| Age (years) (range) | 57.0 (22, 83) | 57.0 (27, 81) | 0.816 |
| Cancer size (cm) (range) | 3.629 (0.8, 17.5) | 3.357 (0.7, 12.0) | 0.216 |
| Renal cell tumor subtype | | | 0.881 |
| oncocytoma (%) | 25 (8.6%) | 23 (11%) | |
| angiomyolipoma (%) | 47 (16.3%) | 31 (14.7%) | |
| chromophobe renal cell carcinoma (%) | 48 (16.6%) | 36 (17.1%) | |
| papillary renal cell carcinoma (%) | 50 (17.3%) | 39 (18.6%) | |
| clear cell renal cell carcinoma (%) | 119 (41.2%) | 81 (38.6%) | |
We used the paired sample t-test or the chi-squared test to test for differences in demographic and pathological characteristics between the training and hold-out test datasets, for continuous variables (i.e., age, cancer size) and categorical variables (i.e., sex, kidney cancer subtype), respectively.
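As a worked check of the chi-squared comparison for the subtype distribution, the sketch below applies a Pearson chi-squared test of independence to the subtype counts in Table 1. It is an illustration rather than the authors' code: the function name `chi2_independence` is hypothetical, and the p-value uses the closed-form chi-squared survival function that holds only for even degrees of freedom (here 4).

```python
import math


def chi2_independence(table):
    """Pearson chi-squared test for an r x c table of counts.

    Returns (statistic, dof, p_value). The p-value uses the closed form
    exp(-x/2) * sum_{i < dof/2} (x/2)^i / i!, valid for even dof only.
    """
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    total = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_tot[i] * col_tot[j] / total
            stat += (obs - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_tot) - 1)
    if dof % 2:
        raise ValueError("closed-form p-value in this sketch needs an even dof")
    half = stat / 2
    p = math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(dof // 2))
    return stat, dof, p


# Subtype counts from Table 1: oncocytoma, AML, chRCC, pRCC, ccRCC
train = [25, 47, 48, 50, 119]
test = [23, 31, 36, 39, 81]
stat, dof, p = chi2_independence([train, test])
print(f"chi2 = {stat:.3f}, dof = {dof}, p = {p:.3f}")
```

With these counts the statistic is about 1.18 on 4 degrees of freedom, giving p of about 0.881, which agrees with the value reported in the table for the subtype comparison.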
Table 2. Model performance of multi-phase CT radiomic feature-based machine learning models in renal cell tumor subtype classification. We compared the model performance of four different types of machine learning algorithms in classifying the subtype of malignant kidney cancer. We calculated the mean and standard error of each model's test performance across 10-fold cross-validation on a hold-out test set. (A,C) show the overall evaluation performance of machine learning algorithms trained with 10% and 20% of principal components, respectively. (B,D) show the performance of machine learning algorithms in classifying renal cell tumor subtypes via a one vs. rest approach, trained with 10% and 20% of principal components, respectively. Abbreviations: AML = renal angiomyolipoma, chRCC = chromophobe renal cell carcinoma, pRCC = papillary renal cell carcinoma, ccRCC = clear cell renal cell carcinoma. Values following the ± sign indicate the standard error.
(A)

| Model | F1 | AU-PRC | ACC | AU-ROC |
|---|---|---|---|---|
| Linear SVM | 0.423 ± 0.008 | 0.393 ± 0.016 | 0.423 ± 0.008 | 0.703 ± 0.009 |
| Rbf SVM | 0.433 ± 0.017 | 0.408 ± 0.011 | 0.433 ± 0.017 | 0.722 ± 0.006 |
| XGBoost | 0.453 ± 0.007 | 0.474 ± 0.006 | 0.453 ± 0.007 | 0.744 ± 0.004 |
| Random Forest | 0.444 ± 0.009 | 0.443 ± 0.006 | 0.444 ± 0.009 | 0.742 ± 0.003 |

(B)

| Metric | Model | Oncocytoma vs. Rest | AML vs. Rest | chRCC vs. Rest | pRCC vs. Rest | ccRCC vs. Rest |
|---|---|---|---|---|---|---|
| AU-PRC | Linear SVM | 0.096 ± 0.01 | 0.285 ± 0.039 | 0.257 ± 0.044 | 0.246 ± 0.022 | 0.613 ± 0.019 |
| AU-PRC | Rbf SVM | 0.101 ± 0.014 | 0.292 ± 0.017 | 0.242 ± 0.013 | 0.276 ± 0.02 | 0.570 ± 0.023 |
| AU-PRC | XGBoost | 0.151 ± 0.018 | 0.31 ± 0.009 | 0.3 ± 0.015 | 0.318 ± 0.015 | 0.673 ± 0.08 |
| AU-PRC | Random Forest | 0.142 ± 0.021 | 0.366 ± 0.01 | 0.259 ± 0.022 | 0.298 ± 0.017 | 0.617 ± 0.011 |
| AU-ROC | Linear SVM | 0.511 ± 0.042 | 0.688 ± 0.03 | 0.605 ± 0.037 | 0.607 ± 0.023 | 0.697 ± 0.011 |
| AU-ROC | Rbf SVM | 0.488 ± 0.042 | 0.707 ± 0.017 | 0.618 ± 0.01 | 0.648 ± 0.022 | 0.663 ± 0.015 |
| AU-ROC | XGBoost | 0.606 ± 0.021 | 0.698 ± 0.009 | 0.666 ± 0.007 | 0.681 ± 0.014 | 0.738 ± 0.007 |
| AU-ROC | Random Forest | 0.616 ± 0.017 | 0.769 ± 0.009 | 0.607 ± 0.014 | 0.686 ± 0.016 | 0.695 ± 0.01 |

(C)

| Model | F1 | AU-PRC | ACC | AU-ROC |
|---|---|---|---|---|
| Linear SVM | 0.423 ± 0.011 | 0.396 ± 0.019 | 0.423 ± 0.011 | 0.716 ± 0.001 |
| Rbf SVM | 0.424 ± 0.0 | 0.322 ± 0.001 | 0.424 ± 0.0 | 0.66 ± 0.004 |
| XGBoost | 0.475 ± 0.011 | 0.495 ± 0.009 | 0.475 ± 0.011 | 0.751 ± 0.006 |
| Random Forest | 0.433 ± 0.005 | 0.47 ± 0.011 | 0.433 ± 0.005 | 0.75 ± 0.003 |

(D)

| Metric | Model | Oncocytoma vs. Rest | AML vs. Rest | chRCC vs. Rest | pRCC vs. Rest | ccRCC vs. Rest |
|---|---|---|---|---|---|---|
| AU-PRC | Linear SVM | 0.141 ± 0.015 | 0.278 ± 0.022 | 0.230 ± 0.032 | 0.288 ± 0.020 | 0.598 ± 0.04 |
| AU-PRC | Rbf SVM | 0.096 ± 0.0 | 0.148 ± 0.0 | 0.164 ± 0.0 | 0.168 ± 0.0 | 0.424 ± 0.0 |
| AU-PRC | XGBoost | 0.124 ± 0.012 | 0.410 ± 0.029 | 0.289 ± 0.017 | 0.294 ± 0.017 | 0.686 ± 0.012 |
| AU-PRC | Random Forest | 0.183 ± 0.033 | 0.33 ± 0.015 | 0.258 ± 0.012 | 0.316 ± 0.017 | 0.677 ± 0.020 |
| AU-ROC | Linear SVM | 0.631 ± 0.027 | 0.734 ± 0.013 | 0.574 ± 0.042 | 0.658 ± 0.017 | 0.673 ± 0.026 |
| AU-ROC | Rbf SVM | 0.5 ± 0.0 | 0.5 ± 0.0 | 0.5 ± 0.0 | 0.5 ± 0.0 | 0.5 ± 0.0 |
| AU-ROC | XGBoost | 0.578 ± 0.029 | 0.748 ± 0.006 | 0.646 ± 0.013 | 0.674 ± 0.02 | 0.750 ± 0.011 |
| AU-ROC | Random Forest | 0.645 ± 0.026 | 0.743 ± 0.014 | 0.672 ± 0.016 | 0.672 ± 0.018 | 0.742 ± 0.013 |
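The one vs. rest AU-ROC values and the "mean ± standard error" summaries in Table 2 can be illustrated with a short standard-library sketch. It is an assumption-laden illustration, not the authors' code: the function names are hypothetical, AU-ROC is computed through its rank interpretation (the probability that a positive sample outscores a negative one, ties counted as one half), and the standard error is taken to be the sample standard deviation divided by the square root of the number of folds.

```python
import math
import statistics


def roc_auc(scores, is_positive):
    """AU-ROC as P(positive score > negative score), ties counted as 0.5."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(prob_rows, labels, classes):
    """prob_rows[i][k] is the predicted probability of classes[k] for sample i."""
    return {
        c: roc_auc([row[k] for row in prob_rows], [y == c for y in labels])
        for k, c in enumerate(classes)
    }


def mean_and_se(fold_scores):
    """Mean and standard error (sample SD / sqrt(n)) across folds."""
    mean = statistics.mean(fold_scores)
    se = statistics.stdev(fold_scores) / math.sqrt(len(fold_scores))
    return mean, se
```

Under this reading, an AU-ROC of 0.5 with a standard error of 0.0, as in the Rbf SVM rows of panel (D), corresponds to a model whose scores do not rank any subtype above the rest in any fold.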
Table 3. The model performance of single-phase CT radiomic feature-based machine learning models in renal cell tumor subtype classification. (A) shows the number of CT scans used for training and evaluating machine learning algorithms. (B,C) show the evaluation performance of machine learning algorithms trained with 10% and 20% of principal components, respectively. We compared the model performance of four different types of machine learning algorithms in classifying renal cell tumor subtypes with regard to the number of principal components used for training models. We evaluated the performance of each model in 10-fold cross-validation on a hold-out test set. Sample size indicates the number of labeled images used for training and evaluating machine learning algorithms.
(A)

| | Arterial Phase, Training Dataset | Arterial Phase, Test Dataset | Delayed Phase, Training Dataset | Delayed Phase, Test Dataset | Non-Contrast Phase, Training Dataset | Non-Contrast Phase, Test Dataset | Portal Phase, Training Dataset | Portal Phase, Test Dataset |
|---|---|---|---|---|---|---|---|---|
| Renal cell tumor subtype | 243 | 75 | 257 | 83 | 280 | 139 | 253 | 188 |
| oncocytoma (%) | 22 (9%) | 6 (8%) | 23 (9%) | 8 (10%) | 25 (9%) | 13 (10%) | 20 (8%) | 21 (11%) |
| angiomyolipoma (%) | 40 (16%) | 10 (13%) | 42 (16%) | 10 (12%) | 44 (16%) | 20 (14%) | 42 (17%) | 27 (14%) |
| chromophobe renal cell carcinoma (%) | 43 (18%) | 13 (17%) | 42 (16%) | 15 (18%) | 48 (17%) | 25 (18%) | 40 (16%) | 28 (15%) |
| papillary renal cell carcinoma (%) | 36 (15%) | 14 (19%) | 43 (17%) | 13 (16%) | 48 (17%) | 24 (17%) | 46 (18%) | 37 (20%) |
| clear cell renal cell carcinoma (%) | 102 (42%) | 32 (43%) | 107 (42%) | 37 (44%) | 115 (41%) | 57 (41%) | 105 (41%) | 75 (40%) |

(B)

| Model | CT Phase | F1 score | AU-PRC | ACC | AU-ROC |
|---|---|---|---|---|---|
| Linear SVM | Arterial | 0.421 ± 0.036 | 0.42 ± 0.026 | 0.421 ± 0.036 | 0.719 ± 0.022 |
| Linear SVM | Delayed | 0.378 ± 0.032 | 0.368 ± 0.032 | 0.378 ± 0.032 | 0.676 ± 0.024 |
| Linear SVM | Non-contrast | 0.397 ± 0.02 | 0.345 ± 0.018 | 0.397 ± 0.02 | 0.671 ± 0.019 |
| Linear SVM | Portal | 0.371 ± 0.011 | 0.341 ± 0.019 | 0.371 ± 0.011 | 0.659 ± 0.012 |
| Rbf SVM | Arterial | 0.461 ± 0.028 | 0.48 ± 0.051 | 0.461 ± 0.028 | 0.756 ± 0.027 |
| Rbf SVM | Delayed | 0.477 ± 0.009 | 0.416 ± 0.031 | 0.477 ± 0.09 | 0.725 ± 0.021 |
| Rbf SVM | Non-contrast | 0.425 ± 0.006 | 0.303 ± 0.024 | 0.425 ± 0.006 | 0.67 ± 0.023 |
| Rbf SVM | Portal | 0.386 ± 0.004 | 0.525 ± 0.05 | 0.386 ± 0.04 | 0.629 ± 0.018 |
| XGBoost | Arterial | 0.594 ± 0.037 | 0.608 ± 0.022 | 0.594 ± 0.037 | 0.83 ± 0.012 |
| XGBoost | Delayed | 0.479 ± 0.009 | 0.442 ± 0.18 | 0.479 ± 0.009 | 0.728 ± 0.013 |
| XGBoost | Non-contrast | 0.439 ± 0.015 | 0.419 ± 0.014 | 0.439 ± 0.015 | 0.712 ± 0.006 |
| XGBoost | Portal | 0.456 ± 0.015 | 0.463 ± 0.012 | 0.456 ± 0.015 | 0.739 ± 0.007 |
| Random Forest | Arterial | 0.478 ± 0.013 | 0.511 ± 0.016 | 0.478 ± 0.013 | 0.778 ± 0.013 |
| Random Forest | Delayed | 0.493 ± 0.015 | 0.492 ± 0.31 | 0.493 ± 0.015 | 0.766 ± 0.016 |
| Random Forest | Non-contrast | 0.461 ± 0.011 | 0.471 ± 0.013 | 0.461 ± 0.011 | 0.744 ± 0.008 |
| Random Forest | Portal | 0.441 ± 0.014 | 0.473 ± 0.015 | 0.441 ± 0.014 | 0.756 ± 0.008 |

(C)

| Model | CT Phase | F1 score | AU-PRC | ACC | AU-ROC |
|---|---|---|---|---|---|
| Linear SVM | Arterial | 0.396 ± 0.028 | 0.353 ± 0.048 | 0.396 ± 0.028 | 0.669 ± 0.036 |
| Linear SVM | Delayed | 0.385 ± 0.027 | 0.321 ± 0.036 | 0.385 ± 0.027 | 0.627 ± 0.032 |
| Linear SVM | Non-contrast | 0.414 ± 0.022 | 0.354 ± 0.031 | 0.414 ± 0.022 | 0.688 ± 0.022 |
| Linear SVM | Portal | 0.297 ± 0.045 | 0.267 ± 0.031 | 0.297 ± 0.045 | 0.581 ± 0.026 |
| Rbf SVM | Arterial | 0.437 ± 0.0 | 0.337 ± 0.01 | 0.437 ± 0.0 | 0.677 ± 0.006 |
| Rbf SVM | Delayed | 0.462 ± 0.0 | 0.337 ± 0.016 | 0.462 ± 0.0 | 0.67 ± 0.008 |
| Rbf SVM | Non-contrast | 0.413 ± 0.0 | 0.317 ± 0.002 | 0.413 ± 0.0 | 0.665 ± 0.005 |
| Rbf SVM | Portal | 0.382 ± 0.0 | 0.296 ± 0.007 | 0.382 ± 0.0 | 0.633 ± 0.007 |
| XGBoost | Arterial | 0.531 ± 0.02 | 0.565 ± 0.014 | 0.531 ± 0.002 | 0.793 ± 0.005 |
| XGBoost | Delayed | 0.466 ± 0.029 | 0.507 ± 0.021 | 0.466 ± 0.029 | 0.766 ± 0.001 |
| XGBoost | Non-contrast | 0.354 ± 0.024 | 0.382 ± 0.022 | 0.354 ± 0.024 | 0.686 ± 0.023 |
| XGBoost | Portal | 0.447 ± 0.019 | 0.429 ± 0.013 | 0.447 ± 0.019 | 0.732 ± 0.008 |
| Random Forest | Arterial | 0.504 ± 0.022 | 0.52 ± 0.026 | 0.504 ± 0.022 | 0.797 ± 0.018 |
| Random Forest | Delayed | 0.47 ± 0.016 | 0.529 ± 0.02 | 0.497 ± 0.016 | 0.73 ± 0.014 |
| Random Forest | Non-contrast | 0.45 ± 0.006 | 0.433 ± 0.009 | 0.45 ± 0.006 | 0.16 ± 0.007 |
| Random Forest | Portal | 0.427 ± 0.013 | 0.449 ± 0.014 | 0.427 ± 0.013 | 0.735 ± 0.007 |
Table 4. The model performance of single-phase CT radiomics feature-based machine learning models in renal cell tumor subtype classification. (A,B) show the AU-PRC and AU-ROC, respectively, of single-phase CT radiomics feature-based machine learning models trained with 10% of principal components in classifying a specific subtype against other subtypes. (C,D) show the AU-PRC and AU-ROC, respectively, of models trained with 20% of principal components in the same task.
(A)

| Model | CT Phase | Oncocytoma | AML | chRCC | pRCC | ccRCC |
|---|---|---|---|---|---|---|
| Linear SVM | Arterial | 0.147 ± 0.068 | 0.290 ± 0.077 | 0.280 ± 0.028 | 0.298 ± 0.078 | 0.656 ± 0.065 |
| Linear SVM | Delayed | 0.151 ± 0.056 | 0.205 ± 0.041 | 0.310 ± 0.079 | 0.260 ± 0.059 | 0.564 ± 0.037 |
| Linear SVM | Non-Contrast | 0.109 ± 0.016 | 0.214 ± 0.025 | 0.202 ± 0.051 | 0.206 ± 0.036 | 0.503 ± 0.030 |
| Linear SVM | Portal | 0.141 ± 0.016 | 0.194 ± 0.027 | 0.217 ± 0.019 | 0.275 ± 0.028 | 0.550 ± 0.054 |
| Rbf SVM | Arterial | 0.133 ± 0.038 | 0.382 ± 0.135 | 0.285 ± 0.041 | 0.598 ± 0.119 | 0.638 ± 0.085 |
| Rbf SVM | Delayed | 0.128 ± 0.059 | 0.241 ± 0.044 | 0.268 ± 0.1 | 0.421 ± 0.095 | 0.547 ± 0.07 |
| Rbf SVM | Non-Contrast | 0.084 ± 0.008 | 0.340 ± 0.105 | 0.170 ± 0.013 | 0.168 ± 0.038 | 0.463 ± 0.034 |
| Rbf SVM | Portal | 0.117 ± 0.014 | 0.190 ± 0.060 | 0.148 ± 0.008 | 0.200 ± 0.025 | 0.384 ± 0.048 |
| XGBoost | Arterial | 0.089 ± 0.017 | 0.605 ± 0.076 | 0.590 ± 0.063 | 0.393 ± 0.1 | 0.757 ± 0.033 |
| XGBoost | Delayed | 0.105 ± 0.01 | 0.420 ± 0.028 | 0.351 ± 0.038 | 0.346 ± 0.049 | 0.586 ± 0.023 |
| XGBoost | Non-Contrast | 0.111 ± 0.015 | 0.401 ± 0.022 | 0.320 ± 0.033 | 0.196 ± 0.016 | 0.556 ± 0.024 |
| XGBoost | Portal | 0.189 ± 0.033 | 0.411 ± 0.016 | 0.273 ± 0.035 | 0.333 ± 0.31 | 0.682 ± 0.02 |
| Random Forest | Arterial | 0.163 ± 0.026 | 0.384 ± 0.035 | 0.265 ± 0.049 | 0.413 ± 0.052 | 0.711 ± 0.026 |
| Random Forest | Delayed | 0.158 ± 0.023 | 0.356 ± 0.021 | 0.277 ± 0.038 | 0.344 ± 0.093 | 0.712 ± 0.046 |
| Random Forest | Non-Contrast | 0.149 ± 0.029 | 0.499 ± 0.023 | 0.239 ± 0.027 | 0.238 ± 0.019 | 0.637 ± 0.039 |
| Random Forest | Portal | 0.214 ± 0.032 | 0.479 ± 0.052 | 0.271 ± 0.027 | 0.295 ± 0.016 | 0.643 ± 0.02 |

(B)

| Model | CT Phase | Oncocytoma | AML | chRCC | pRCC | ccRCC |
|---|---|---|---|---|---|---|
| Linear SVM | Arterial | 0.625 ± 0.149 | 0.681 ± 0.073 | 0.572 ± 0.053 | 0.687 ± 0.056 | 0.663 ± 0.049 |
| Linear SVM | Delayed | 0.584 ± 0.094 | 0.631 ± 0.048 | 0.561 ± 0.084 | 0.645 ± 0.067 | 0.595 ± 0.042 |
| Linear SVM | Non-Contrast | 0.529 ± 0.056 | 0.685 ± 0.049 | 0.465 ± 0.075 | 0.492 ± 0.049 | 0.595 ± 0.022 |
| Linear SVM | Portal | 0.561 ± 0.040 | 0.639 ± 0.047 | 0.537 ± 0.053 | 0.584 ± 0.046 | 0.655 ± 0.044 |
| Rbf SVM | Arterial | 0.133 ± 0.038 | 0.382 ± 0.135 | 0.285 ± 0.041 | 0.598 ± 0.119 | 0.638 ± 0.084 |
| Rbf SVM | Delayed | 0.444 ± 0.100 | 0.570 ± 0.079 | 0.547 ± 0.117 | 0.699 ± 0.113 | 0.635 ± 0.082 |
| Rbf SVM | Non-Contrast | 0.458 ± 0.064 | 0.580 ± 0.139 | 0.415 ± 0.065 | 0.452 ± 0.090 | 0.524 ± 0.047 |
| Rbf SVM | Portal | 0.504 ± 0.055 | 0.424 ± 0.133 | 0.459 ± 0.018 | 0.442 ± 0.018 | 0.500 ± 0.074 |
| XGBoost | Arterial | 0.507 ± 0.05 | 0.843 ± 0.028 | 0.775 ± 0.032 | 0.781 ± 0.04 | 0.824 ± 0.023 |
| XGBoost | Delayed | 0.500 ± 0.042 | 0.715 ± 0.019 | 0.707 ± 0.029 | 0.616 ± 0.041 | 0.675 ± 0.011 |
| XGBoost | Non-Contrast | 0.513 ± 0.049 | 0.693 ± 0.02 | 0.668 ± 0.026 | 0.522 ± 0.014 | 0.645 ± 0.018 |
| XGBoost | Portal | 0.580 ± 0.033 | 0.785 ± 0.016 | 0.659 ± 0.024 | 0.667 ± 0.015 | 0.761 ± 0.013 |
| Random Forest | Arterial | 0.664 ± 0.052 | 0.755 ± 0.045 | 0.636 ± 0.048 | 0.769 ± 0.033 | 0.775 ± 0.026 |
| Random Forest | Delayed | 0.644 ± 0.059 | 0.749 ± 0.025 | 0.659 ± 0.037 | 0.637 ± 0.113 | 0.785 ± 0.037 |
| Random Forest | Non-Contrast | 0.603 ± 0.056 | 0.780 ± 0.026 | 0.585 ± 0.019 | 0.577 ± 0.038 | 0.706 ± 0.025 |
| Random Forest | Portal | 0.669 ± 0.033 | 0.806 ± 0.017 | 0.651 ± 0.022 | 0.656 ± 0.023 | 0.731 ± 0.017 |

(C)

| Model | CT Phase | Oncocytoma | AML | chRCC | pRCC | ccRCC |
|---|---|---|---|---|---|---|
| Linear SVM | Arterial | 0.106 ± 0.06 | 0.217 ± 0.085 | 0.188 ± 0.075 | 0.207 ± 0.063 | 0.583 ± 0.059 |
| Linear SVM | Delayed | 0.111 ± 0.025 | 0.189 ± 0.082 | 0.196 ± 0.049 | 0.218 ± 0.054 | 0.469 ± 0.061 |
| Linear SVM | Non-Contrast | 0.129 ± 0.030 | 0.331 ± 0.055 | 0.207 ± 0.04 | 0.199 ± 0.035 | 0.512 ± 0.042 |
| Linear SVM | Portal | 0.111 ± 0.017 | 0.163 ± 0.029 | 0.17 ± 0.018 | 0.217 ± 0.025 | 0.408 ± 0.04 |
| Rbf SVM | Arterial | 0.185 ± 0.072 | 0.215 ± 0.046 | 0.165 ± 0.007 | 0.168 ± 0.015 | 0.420 ± 0.019 |
| Rbf SVM | Delayed | 0.18 ± 0.049 | 0.212 ± 0.097 | 0.162 ± 0.024 | 0.181 ± 0.043 | 0.412 ± 0.031 |
| Rbf SVM | Non-Contrast | 0.091 ± 0.0 | 0.147 ± 0.0 | 0.175 ± 0.0 | 0.175 ± 0.0 | 0.413 ± 0.0 |
| Rbf SVM | Portal | 0.111 ± 0.009 | 0.248 ± 0.034 | 0.154 ± 0.009 | 0.205 ± 0.024 | 0.361 ± 0.02 |
| XGBoost | Arterial | 0.177 ± 0.056 | 0.584 ± 0.037 | 0.261 ± 0.023 | 0.393 ± 0.067 | 0.726 ± 0.026 |
| XGBoost | Delayed | 0.184 ± 0.077 | 0.399 ± 0.048 | 0.239 ± 0.026 | 0.34 ± 0.036 | 0.714 ± 0.032 |
| XGBoost | Non-Contrast | 0.119 ± 0.025 | 0.443 ± 0.048 | 0.22 ± 0.023 | 0.222 ± 0.025 | 0.549 ± 0.029 |
| XGBoost | Portal | 0.147 ± 0.016 | 0.452 ± 0.038 | 0.318 ± 0.045 | 0.394 ± 0.027 | 0.552 ± 0.025 |
| Random Forest | Arterial | 0.249 ± 0.072 | 0.451 ± 0.028 | 0.326 ± 0.079 | 0.368 ± 0.076 | 0.671 ± 0.041 |
| Random Forest | Delayed | 0.136 ± 0.02 | 0.533 ± 0.072 | 0.252 ± 0.052 | 0.388 ± 0.051 | 0.741 ± 0.032 |
| Random Forest | Non-Contrast | 0.141 ± 0.025 | 0.395 ± 0.021 | 0.212 ± 0.04 | 0.221 ± 0.032 | 0.627 ± 0.022 |
| Random Forest | Portal | 0.123 ± 0.021 | 0.428 ± 0.044 | 0.261 ± 0.027 | 0.318 ± 0.031 | 0.622 ± 0.034 |

(D)

| Model | CT Phase | Oncocytoma | AML | chRCC | pRCC | ccRCC |
|---|---|---|---|---|---|---|
| Linear SVM | Arterial | 0.509 ± 0.135 | 0.566 ± 0.135 | 0.478 ± 0.132 | 0.537 ± 0.139 | 0.580 ± 0.061 |
| Linear SVM | Delayed | 0.466 ± 0.094 | 0.514 ± 0.161 | 0.481 ± 0.079 | 0.475 ± 0.094 | 0.5 ± 0.066 |
| Linear SVM | Non-Contrast | 0.598 ± 0.068 | 0.744 ± 0.036 | 0.53 ± 0.044 | 0.488 ± 0.062 | 0.605 ± 0.034 |
| Linear SVM | Portal | 0.46 ± 0.058 | 0.495 ± 0.079 | 0.484 ± 0.043 | 0.493 ± 0.041 | 0.491 ± 0.055 |
| Rbf SVM | Arterial | 0.497 ± 0.085 | 0.56 ± 0.105 | 0.457 ± 0.045 | 0.441 ± 0.039 | 0.43 ± 0.049 |
| Rbf SVM | Delayed | 0.492 ± 0.061 | 0.529 ± 0.115 | 0.429 ± 0.066 | 0.433 ± 0.082 | 0.387 ± 0.072 |
| Rbf SVM | Non-Contrast | 0.5 ± 0.0 | 0.5 ± 0.0 | 0.5 ± 0.0 | 0.5 ± 0.0 | 0.5 ± 0.0 |
| Rbf SVM | Portal | 0.483 ± 0.041 | 0.587 ± 0.042 | 0.412 ± 0.05 | 0.518 ± 0.031 | 0.409 ± 0.045 |
| XGBoost | Arterial | 0.489 ± 0.039 | 0.805 ± 0.028 | 0.671 ± 0.027 | 0.707 ± 0.029 | 0.751 ± 0.015 |
| XGBoost | Delayed | 0.594 ± 0.054 | 0.762 ± 0.019 | 0.630 ± 0.016 | 0.766 ± 0.029 | 0.731 ± 0.023 |
| XGBoost | Non-Contrast | 0.503 ± 0.045 | 0.747 ± 0.021 | 0.545 ± 0.022 | 0.607 ± 0.041 | 0.632 ± 0.025 |
| XGBoost | Portal | 0.560 ± 0.028 | 0.796 ± 0.019 | 0.678 ± 0.031 | 0.726 ± 0.021 | 0.648 ± 0.031 |
| Random Forest | Arterial | 0.648 ± 0.074 | 0.78 ± 0.028 | 0.717 ± 0.077 | 0.717 ± 0.077 | 0.716 ± 0.024 |
| Random Forest | Delayed | 0.541 ± 0.062 | 0.799 ± 0.033 | 0.609 ± 0.045 | 0.712 ± 0.061 | 0.777 ± 0.03 |
| Random Forest | Non-Contrast | 0.581 ± 0.037 | 0.757 ± 0.016 | 0.513 ± 0.031 | 0.554 ± 0.048 | 0.723 ± 0.028 |
| Random Forest | Portal | 0.567 ± 0.049 | 0.804 ± 0.015 | 0.658 ± 0.036 | 0.669 ± 0.025 | 0.706 ± 0.019 |
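When reading the AU-PRC panels, it helps to recall that the chance level of AU-PRC is the prevalence of the positive class, not 0.5 as for AU-ROC. The sketch below illustrates this with a step-wise average precision; it is an assumption that this estimator resembles the one behind the table (the paper may have used a different interpolation), and the function name `average_precision` is hypothetical.

```python
def average_precision(scores, is_positive):
    """Step-wise area under the precision-recall curve.

    Samples sharing the same score are treated as a single threshold.
    """
    n_pos = sum(1 for p in is_positive if p)
    if n_pos == 0:
        raise ValueError("need at least one positive sample")
    ranked = sorted(zip(scores, is_positive), key=lambda t: t[0], reverse=True)
    ap, tp, prev_recall = 0.0, 0, 0.0
    i = 0
    while i < len(ranked):
        j = i
        # Consume every sample tied at the current score.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            tp += bool(ranked[j][1])
            j += 1
        recall = tp / n_pos
        precision = tp / j  # j samples have been predicted positive so far
        ap += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return ap


# An uninformative model that gives every scan the same score:
# 23 oncocytomas among 210 test patients (counts from Table 1).
baseline = average_precision([0.5] * 210, [1] * 23 + [0] * 187)
print(round(baseline, 3))
```

Because oncocytoma makes up roughly 8 to 11% of each single-phase dataset (Table 3A), an oncocytoma AU-PRC near 0.1 is close to this chance level, whereas the same value for ccRCC (about 40% of cases) would be well below it.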
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
