Differential Diagnosis of Prostate Cancer Grade to Augment Clinical Diagnosis Based on Classifier Models with Tuned Hyperparameters

Simple Summary
Multiparametric MRI radiomics features derived from T2WI and ADC maps were used to distinguish non-tumor regions from significant cancer and to predict the Gleason score, using support vector machine (SVM) and random forest (RF) classification methods with tuned hyperparameters together with recursive feature elimination (RFE) and least absolute shrinkage and selection operator (LASSO) feature selection methods. A novel machine-learning approach that combines recursive feature elimination with random forest and support vector classifiers allowed stratification of Gleason scores in clinical cohorts at a sensitivity greater than 0.91.

Abstract
We developed a novel machine-learning algorithm to augment the clinical diagnosis of prostate cancer, using first- and second-order texture analysis metrics in a novel application of machine-learning radiomics analysis. We successfully discriminated between significant prostate cancers and non-tumor regions and accurately predicted Gleason score cohorts, with sensitivities of 0.82, 0.81 and 0.91 in three separate pathology classifications. Tumor heterogeneity and prediction of the Gleason score were quantified using two feature selection approaches and two separate classifiers with tuned hyperparameters. A total of 71 patients were analyzed in this study. Multiparametric MRI, incorporating T2WI and ADC maps, was used to derive radiomics features. Recursive feature elimination (RFE), the least absolute shrinkage and selection operator (LASSO), and two classification approaches, a support vector machine (SVM) (with randomized search) and random forest (RF) (with grid search), were used to differentiate between non-tumor regions and significant cancer while also predicting the Gleason score. In T2WI images, the RFE feature selection approach combined with RF and SVM classifiers outperformed LASSO with SVM and RF classifiers. The best performance was achieved by combining LASSO and SVM in a model that used both T2WI and ADC images; this model had an area under the curve (AUC) of 0.91. Radiomic features computed from ADC and T2WI images were used to predict three Gleason score groups using two feature selection methods (RFE and LASSO) and RF and SVM classifier models with tuned hyperparameters. Using combined sequences (T2WI and ADC map images) and combined radiomics (1st order and GLCM features), LASSO feature selection with RF predicted G3 with the highest sensitivity, at an AUC of 0.92. For a single sequence (T2WI images) using GLCM features, LASSO with SVM predicted G3 with the highest sensitivity, at an AUC of 0.92.


Introduction
Prostate cancer is the second most common cancer in men globally, with 1,276,106 new cases and 358,989 deaths in 2018 [1,2]. That corresponds to 7.1% of new cases and 3.8% of all male cancer mortality in 2018 [3]. Globally, the median age at detection of prostate cancer is 66 years, and both the recurrence and fatality rates rise with age [4,5]. Early detection of tumors increases the chances of a cure because treatment is most effective while the cancer is still localized.
Multiparametric magnetic resonance imaging (mp-MRI) has been used extensively in prostate cancer (PCa) scanning, identification, and grading throughout the last few decades [6,7]. The mp-MRI technique provides high-resolution anatomical and functional images [8]. T1-weighted images (T1WI) and T2-weighted images (T2WI) are the anatomic sequences used in multiparametric prostate MRI. The zonal structure and tumor foci cannot be identified using T1WI, but T1WI can reveal biopsy-associated haemorrhage, which can interfere with the capacity of other PCa MRI techniques to provide accurate diagnoses. T2WI provides the best soft-tissue imaging for malignancies, zonal morphology, the seminal vesicle (SV), anterior fibromuscular stroma (AFS), neurovascular bundles, and the capsule [9]. Diffusion-weighted imaging (DWI), magnetic resonance spectroscopic imaging (MRSI), and dynamic contrast-enhanced (DCE) imaging are functional MRI sequences [10]. The DWI technique was originally developed and implemented to detect acute cerebrovascular stroke. DWI compares water diffusion in soft tissues and free solution to produce image contrast. As a PCa grows, cellularity increases and the ductal architecture degrades, which limits fluid flow through the prostate [11]. b-value images and Apparent Diffusion Coefficient (ADC) maps are the two types of images used for analysis in DWI. Tumor diagnostic outcome is improved by using b-values between 1400 and 2000 s/mm² [12–14]. Clinical interpretation of DWI is subjective; nevertheless, the restriction of water molecules can be measured quantitatively. Interpretation is performed with ADC maps and ADC measurements (mm²/s), and ADC values are related to Gleason scores [15,16]. By using a machine learning approach with clinically relevant radiomics metrics as inputs, we aim to improve the interpretation and augment clinical diagnosis.
Radiomics can automatically generate more than 200 statistical variables from medical images. Patient anatomy can vary significantly in shape and texture depending on the imaging technique used [17]. Automated or semi-automated radiomic metrics could therefore improve diagnostic accuracy. Textural analysis has been used to extract tissue information from medical images since the 1980s [18,19]. Intratumor heterogeneity, which has significant implications for cancer research, may be represented by tumor texture [20,21]. Radiomics relies heavily on texture analysis (TA), a necessary part of the process [21,22]. Radiomics is the technique used to collect essential and extensive data from clinical images and to derive variables that can assist in detection, prognosis, and assessment of treatment response [22–27]. When developing a radiomics model, selecting the best machine learning (ML) model is key, and different ML approaches may perform differently when applied to different tissues [27–33].
Many of the parameters used by a machine learning (ML) algorithm are derived during training, but most contemporary ML algorithms also have parameters, referred to as hyperparameters, that must be tuned to improve feature identification [34–36]. The hyperparameters are fine-tuned to optimize an algorithm for a specific learning task [37]. Hyperparameter optimization usually employs grid search and random search techniques [38]. Grid search evaluates every combination of the specified hyperparameter values. The training data and the number of layers can be adjusted in a grid search as hyperparameters [39]. In contrast to grid search, randomized search does not investigate the hyperparameter space exhaustively. Nonetheless, it allows a wider variety of hyperparameter settings to be investigated more effectively and at lower cost [39]. Weerts et al. [37] reported an increased tuning risk and relative tuning risk for the random forest's max features and the SVM's gamma and C, suggesting that it is essential to tune these hyperparameters. In the domain of prostate cancer classification and grading, many prior studies have applied machine learning techniques with default hyperparameters, often without extensive hyperparameter optimization. In contrast, our research prioritizes hyperparameter tuning. This deliberate optimization process enhances the precision and reliability of our ML models, contributing to high precision in clinically relevant results. Our work aims to advance the field by systematically refining the parameters that underpin the diagnosis of prostate cancer.
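The difference between the two search strategies can be illustrated with a minimal sketch. The search space below is hypothetical and chosen only for illustration (it is not the space used in this study); `ParameterGrid` and `ParameterSampler` are the scikit-learn helpers that enumerate or sample candidate settings.

```python
from scipy.stats import loguniform
from sklearn.model_selection import ParameterGrid, ParameterSampler

# Hypothetical SVM search space; the values are illustrative assumptions.
grid_space = {
    "C": [0.1, 1, 10, 100],
    "gamma": [0.001, 0.01, 0.1, 1],
    "kernel": ["rbf", "linear"],
}

# Grid search enumerates every combination: 4 * 4 * 2 = 32 candidates.
grid_candidates = list(ParameterGrid(grid_space))

# Randomized search draws a fixed budget of candidates from distributions,
# so continuous ranges can be explored without enumerating them.
random_space = {
    "C": loguniform(1e-2, 1e3),
    "gamma": loguniform(1e-4, 1e1),
    "kernel": ["rbf", "linear"],
}
random_candidates = list(
    ParameterSampler(random_space, n_iter=10, random_state=0)
)

print(len(grid_candidates), len(random_candidates))
```

The grid grows multiplicatively with each added hyperparameter, whereas the randomized budget (`n_iter`) is fixed by the user.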

Patient Group
This study utilized a dataset from The Cancer Imaging Archive (TCIA), funded through the SPIE, NCI/NIH, AAPM, and Radboud University [40]. The population used in this work consists of 99 patients with T2WI and Apparent Diffusion Coefficient (ADC) map series from the open-source, freely released SPIE-AAPM-NCI PROSTATEx-2 dataset [34]. All patients (n = 99) received a 3T mp-MRI using a body coil. Patients with a non-significant tumor (Gleason grade 3 + 3; n = 29) were excluded, and n = 71 patients with a significant tumor (Gleason score (GS) ≥ 3 + 4) were analyzed.
Images were obtained using a Siemens 3T MRI scanner (MAGNETOM Skyra, Siemens Healthcare, Erlangen, Germany) with a pelvic phased-array coil. The axial T2WI and ADC maps were employed for imaging assessment. Current clinical practice uses T2WI and at least one, if not two, functional approaches (e.g., DCE and spectroscopic imaging) to identify prostate cancer [8,41,42]. For precise localization, all biopsies were done under MR monitoring. A pathologist then rated the biopsy specimens, which served as the ground truth. T2WI was obtained using a turbo spin echo sequence with 0.5 mm in-plane resolution and 3.6 mm slice thickness. The diffusion-weighted images were obtained using a single-shot echo-planar imaging procedure with 2 mm in-plane resolution, 3.6 mm slice thickness, and three-dimensional diffusion encoding gradients. The scanner software generated the ADC map from three b-values (50, 400, and 800 s/mm²). Table 1 contains a description of the mp-MRI acquisition settings. The images were collected without an endorectal coil, in line with PI-RADS recommendations for prostate MRI [41].

Segmentation
Regions of interest (ROIs) for significant cancer were segmented manually on T2WI and ADC images using the predefined ROIs from the PROSTATEx-2 Challenge, which is available on TCIA [40,43]. The LIFEx package was used for the segmentation process [20]. Non-tumor regions were segmented to match the significant-cancer region, but in a different location, for every subject's lesion. Figure 1 illustrates a typical cancer segmentation on mp-MRI.

Feature Extraction
Pre-processing, including intensity normalization and spatial resampling, was conducted for all mp-MRI images using LIFEx to derive radiomics features. The voxel dimensions were rescaled to 0.5 × 0.5 × 3 mm, preserving the dataset's in-plane and inter-plane resolutions. Uniformity of the radiomics features was achieved using grey-level discretization defined between 1 and 128 bits/pixel. Absolute resampling between fixed minimum and maximum bounds for all ROIs was used as the intensity rescaling parameter [20]. Figure 2 demonstrates the analysis procedures. For each ROI, (a) five features were computed from the histogram, and (b) six features were computed from the grey-level co-occurrence matrix, leading to 11 features per ROI for each patient.
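As a rough illustration of absolute discretization followed by histogram features, the sketch below maps intensities onto 128 grey levels between fixed bounds and computes five first-order features. It is not the LIFEx implementation: the function names, the bounds (0 to 600), the synthetic ROI, and the kurtosis convention (non-excess) are all assumptions made for the example.

```python
import numpy as np
from scipy.stats import kurtosis, skew


def discretize_absolute(values, lower, upper, n_levels=128):
    """Map intensities to integer grey levels 0..n_levels-1 using fixed bounds."""
    clipped = np.clip(np.asarray(values, dtype=float), lower, upper)
    levels = np.floor((clipped - lower) / (upper - lower) * n_levels).astype(int)
    # An intensity equal to the upper bound belongs to the last bin.
    return np.minimum(levels, n_levels - 1)


def first_order_features(levels, n_levels=128):
    """Five histogram features of a discretized ROI (hypothetical helper)."""
    flat = np.asarray(levels).ravel()
    counts = np.bincount(flat, minlength=n_levels)
    p = counts / counts.sum()          # grey-level probabilities
    nz = p[p > 0]                      # avoid log(0)
    return {
        "skewness": float(skew(flat)),
        "kurtosis": float(kurtosis(flat, fisher=False)),  # assumed convention
        "entropy_log2": float(-np.sum(nz * np.log2(nz))),
        "entropy_log10": float(-np.sum(nz * np.log10(nz))),
        "uniformity": float(np.sum(p ** 2)),
    }


# Synthetic ROI intensities, for illustration only.
rng = np.random.default_rng(0)
roi = rng.normal(300.0, 40.0, size=500)
features = first_order_features(discretize_absolute(roi, 0.0, 600.0))
print(features)
```

Using the same fixed bounds for every ROI keeps the grey-level bins comparable across patients, which is the purpose of absolute rather than per-ROI rescaling.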

Feature Selection
Feature selection refers to the process of selecting the essential features in predictive models. Irrelevant features contribute little to the prediction model and can degrade it [6]. Model overfitting arises when there are too many features in the algorithm. Feature selection addresses this by reducing the feature set to a smaller number of significant features while retaining high precision [44]. Recursive feature elimination (RFE) is a popular method for selecting the best features from a dataset [31,45–48]. The least absolute shrinkage and selection operator (LASSO) and RFE were employed in this study for feature selection due to their high performance and widespread use. The Python environment with scikit-learn (version 1.0.2) was used to implement these feature selection algorithms.
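A minimal sketch of the two selection strategies with scikit-learn is given below. The data are synthetic, and the base estimator for RFE (a linear SVM), the number of retained features, and the cross-validated choice of the LASSO penalty are assumptions for illustration; the study's exact settings are not stated here.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for an 11-feature radiomics table (illustrative only).
X, y = make_classification(
    n_samples=142, n_features=11, n_informative=4, n_redundant=2,
    n_clusters_per_class=1, random_state=0,
)
X = StandardScaler().fit_transform(X)

# RFE: repeatedly fit the estimator and drop the weakest feature
# until the requested number of features remains.
rfe = RFE(estimator=SVC(kernel="linear"), n_features_to_select=5, step=1)
rfe_idx = np.flatnonzero(rfe.fit(X, y).support_)

# LASSO: the L1 penalty shrinks some coefficients exactly to zero;
# features with non-zero coefficients are kept.
lasso = LassoCV(cv=5, random_state=0).fit(X, y)
lasso_idx = np.flatnonzero(lasso.coef_ != 0)

print("RFE keeps:", rfe_idx)
print("LASSO keeps:", lasso_idx)
```

In a full analysis the selection step would normally be placed inside the cross-validation loop (for example in a `Pipeline`) so that held-out folds do not influence which features are chosen.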

Classification and Prediction
Both a support vector machine (SVM) with hyperparameter tuning via grid search [45,48–50] and a random forest (RF) with hyperparameter tuning via randomized search [30,51] were used to achieve optimal classification performance for significant cancer versus non-tumor regions. Hyperparameter tuning was performed using the scikit-learn library for Python (version 1.0.2). These classification techniques were selected and assessed because they have been used extensively in different organs, as mentioned in previous studies [28,45,47,52]. To identify regions of significant cancer, we employed radiomics parameters based on 1st order statistical features and 2nd order features derived from the Gray-Level Co-occurrence Matrix (GLCM). Our approach used two ML classifiers: the SVM algorithm and the RF algorithm. For the RF model, we conducted a randomized search to fine-tune its hyperparameters, which included the number of estimators, criterion, max depth, and max features. For the SVM model, we used a grid search to optimize hyperparameters such as C, gamma, and the choice of kernel function.
To assess the effectiveness and dependability of these models, we carried out a K-fold cross-validation (CV) procedure with K set to 5. This validation process was used to confirm that the models could accurately differentiate between areas with significant cancer and non-tumor regions based on radiomics statistics.
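The tuning scheme described above could be set up along the following lines. This is a sketch on synthetic data, not the study's code: the candidate values, the number of randomized iterations, the `roc_auc` scoring choice, and the feature scaling step are assumptions; only the tuned hyperparameter names (C, gamma, kernel; number of estimators, criterion, max depth, max features) and the 5-fold CV come from the text.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import (
    GridSearchCV, RandomizedSearchCV, StratifiedKFold,
)
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic binary problem standing in for cancer vs. non-tumor ROIs.
X, y = make_classification(
    n_samples=142, n_features=11, n_informative=4, random_state=0
)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# SVM tuned by exhaustive grid search over C, gamma and kernel.
svm_search = GridSearchCV(
    make_pipeline(StandardScaler(), SVC()),
    param_grid={
        "svc__C": [0.1, 1, 10],
        "svc__gamma": ["scale", 0.01, 0.1],
        "svc__kernel": ["rbf", "linear"],
    },
    scoring="roc_auc",
    cv=cv,
).fit(X, y)

# RF tuned by randomized search over the four hyperparameters named above.
rf_search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions={
        "n_estimators": randint(20, 101),
        "criterion": ["gini", "entropy"],
        "max_depth": [None, 3, 5, 10],
        "max_features": ["sqrt", "log2", None],
    },
    n_iter=5,
    scoring="roc_auc",
    cv=cv,
    random_state=0,
).fit(X, y)

print(svm_search.best_params_, round(svm_search.best_score_, 3))
print(rf_search.best_params_, round(rf_search.best_score_, 3))
```

`best_score_` is the mean cross-validated AUC of the best candidate; reporting a mean ± standard deviation across folds, as in the results, can be read from `cv_results_`.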
To predict outcomes within the GS cohorts, radiomics parameters, specifically the first-order features and the second-order GLCM features, were used as input to an RF classifier tuned with randomized search and an SVM classifier tuned with grid search. The intention was to demonstrate the statistical significance of these parameters. We trained the RF model using a randomized search over various hyperparameter settings, including the number of estimators, criterion, max depth, and max features.
The SVM model was trained with grid search over different hyperparameter settings (including C, gamma, and kernel). The models were then evaluated using stratified K-fold cross-validation (k = 5). The tuned RF and SVM classifiers were interpreted using a binary classification scheme, with G2 versus rest, G3 versus rest, and G4 versus rest used to compute the AUC-ROC. Because of class imbalance, a classifier's performance may suffer if all samples are assigned to the majority class, leading to high classification accuracy but low specificity or sensitivity [53]. Ways to deal with this issue include oversampling [54] and sample weighting [55]. In the one-versus-rest scheme, G2 was first classified against G3 and G4 combined, with ROC-AUC computed as for a binary classification. The areas under the curve for G3 and G4 were calculated using the same approach (one vs. rest). This study used Python's scikit-learn (v.1.0.2) library to verify model validity using a five-fold cross-validation approach.
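The one-versus-rest evaluation can be sketched as follows. The data are synthetic and imbalanced by construction, and the use of `class_weight="balanced"` is an assumption chosen to illustrate sample weighting; the text lists weighting and oversampling as options without stating which was applied.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict

# Synthetic three-class, imbalanced data standing in for G2/G3/G4 cohorts.
X, y = make_classification(
    n_samples=150, n_features=11, n_informative=5, n_classes=3,
    n_clusters_per_class=1, weights=[0.5, 0.3, 0.2], random_state=0,
)
labels = np.array(["G2", "G3", "G4"])[y]

# Stratified folds preserve the class proportions in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
clf = RandomForestClassifier(
    n_estimators=100, class_weight="balanced", random_state=0
)

# Out-of-fold class probabilities; columns follow the sorted class labels.
proba = cross_val_predict(clf, X, labels, cv=cv, method="predict_proba")
classes = np.unique(labels)

# One-vs-rest: treat each grade in turn as the positive class.
ovr_auc = {
    c: roc_auc_score(labels == c, proba[:, i]) for i, c in enumerate(classes)
}
print(ovr_auc)
```

Each entry of `ovr_auc` corresponds to one binary ROC curve (for example G3 versus G2 and G4 combined).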

Statistical Analysis
Each radiomics parameter was tested for significance using the Kruskal-Wallis test. The correlation between radiomic features and significant cancer versus non-tumor regions in PCa patients was assessed using Spearman correlation. Statistical significance was determined using the Holm-Bonferroni method at a p-value of <0.05 [56].
Each radiomics feature was also tested with the Kruskal-Wallis test for significance across the GS cohorts. The correlation between radiomics features and the GS groups of prostate cancer subjects was measured using Spearman correlation. Statistical significance was again determined using the Holm-Bonferroni method at a p-value of <0.05 [56].
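The statistical workflow (Kruskal-Wallis per feature, Spearman correlation with the group label, Holm-Bonferroni correction across features) can be sketched as below. The feature values are synthetic, the ordinal coding of the groups as 0, 1, 2 is an assumption, and `holm_bonferroni` is a hypothetical helper written for this example.

```python
import numpy as np
from scipy.stats import kruskal, spearmanr


def holm_bonferroni(p_values, alpha=0.05):
    """Step-down Holm procedure; returns True where the null is rejected."""
    p = np.asarray(p_values, dtype=float)
    order = np.argsort(p)
    reject = np.zeros(p.size, dtype=bool)
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is compared with alpha / (m - k + 1).
        if p[idx] <= alpha / (p.size - rank):
            reject[idx] = True
        else:
            break  # stop at the first non-rejection
    return reject


rng = np.random.default_rng(0)
groups = np.repeat([0, 1, 2], 25)  # assumed ordinal coding of G2, G3, G4
features = {
    # Synthetic feature that decreases with grade, and a pure-noise feature.
    "uniformity": rng.normal(0.0, 1.0, 75) - 1.5 * groups,
    "skewness": rng.normal(0.0, 1.0, 75),
}
names = list(features)

# Kruskal-Wallis: do the three groups share the same distribution?
p_kw = [
    kruskal(*(features[n][groups == g] for g in range(3))).pvalue
    for n in names
]
# Spearman: monotonic association between feature value and group.
rho = {n: spearmanr(features[n], groups)[0] for n in names}
significant = dict(zip(names, holm_bonferroni(p_kw)))
print(significant, rho)
```

The Holm procedure controls the family-wise error rate across all tested features while being less conservative than a plain Bonferroni correction.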

Relation between Radiomic Attributes and Significant versus Non-Tumor Regions
Radiomics features were extracted from each prostate cancer patient's T2WI and ADC map images. The Kruskal-Wallis test was used to ascertain whether any radiomics feature differed significantly between significant tumor and non-tumor regions. The radiomics features were correlated with significant cancer versus non-tumor regions using Spearman correlation.

Classifiers and Feature Selection Performance
The radiomics features were fed into a model that used RF and SVM classifiers with tuned hyperparameters to distinguish between significant cancer and non-tumor regions in 71 PCa patients. For T2WI images, RFE combined with the RF classification algorithm obtained the maximum AUC of 0.95 ± 0.01 (with 5-fold CV), and RFE combined with the SVM classification algorithm obtained the second-highest AUC of 0.94 ± 0.01. With LASSO feature selection, the SVM classifier obtained an AUC of 0.93 ± 0.01 and the RF classifier an AUC of 0.88 ± 0.02 (both with 5-fold CV).
For ADC images, LASSO combined with the SVM classification algorithm obtained the maximum AUC of 0.89 ± 0.00 (with 5-fold CV), and LASSO combined with the RF classification algorithm obtained the second-highest AUC of 0.89 ± 0.02. With RFE feature selection, the RF classifier obtained an AUC of 0.85 ± 0.02 and the SVM classifier an AUC of 0.84 ± 0.01 (both with 5-fold CV). For the most appropriate feature selection technique and classification algorithm, Figures 2 and 3 depict the receiver operating characteristic area under the curve (ROC-AUC) for T2WI and ADC map images, respectively.
For combined sequences (T2WI and ADC map images), LASSO combined with the SVM classification algorithm obtained the maximum AUC of 0.91, and RFE combined with the RF classification algorithm obtained the second-highest AUC of 0.88. RFE combined with the SVM classification algorithm obtained an AUC of 0.81, and LASSO combined with the RF classification algorithm obtained an AUC of 0.84.

Relationship between GS and Radiomics Attributes
After radiomics features were extracted from the T2WI and ADC map images of every prostate cancer subject, the Kruskal-Wallis test was used to ascertain whether any radiomics feature differed significantly between the GS groups. The radiomics features and GS cohorts were correlated using Spearman's correlation.
The Kruskal-Wallis test showed that the three GS cohorts (G2, G3, and G4) were statistically different in uniformity (Table 4). After applying the Holm-Bonferroni correction, no other features were significantly different between GS groups. The correlation coefficients for entropylog2, entropylog10, uniformity, and the angular second moment were 0.23, 0.23, −0.24, and −0.26, respectively. These values indicate weak correlations (Table 5).

Prediction of Gleason Score
The RF and SVM classifier models with tuned hyperparameters predicted the GS groups of 71 prostate cancer subjects using all radiomics features. For ADC map images, using 1st order features, LASSO combined with the RF classification algorithm achieved an AUC of 0.82 for G2 subjects, 0.53 for G4 subjects, and 0.50 for G3 subjects. RFE combined with the RF classification algorithm achieved an AUC of 0.77 for G3 subjects, 0.71 for G3 subjects, and 0.43 for G4 subjects. RFE combined with the SVM classification algorithm achieved an AUC of 0.81 for G3 subjects, 0.48 for G2 subjects, and 0.25 for G4 subjects. LASSO combined with the SVM classification algorithm achieved an AUC of 0.77 for G4 subjects, 0.40 for G2 subjects, and 0.22 for G4 subjects. Overall, for ADC map images using 1st order features, LASSO with the RF classification algorithm obtained the highest AUC, 0.82, for predicting G2 (Figure 4).
For combined sequences (T2WI and ADC map images) and features (1st order and GLCM), LASSO combined with the RF classification algorithm had an AUC of 0.92 for G3 subjects. RFE combined with the RF classification algorithm achieved an AUC of 0.73 for G3 subjects, 0.61 for G4 subjects, and 0.54 for G2 subjects. LASSO combined with the SVM classification algorithm achieved an AUC of 0.78 for G4 subjects, 0.65 for G3 subjects, and 0.62 for G2 subjects. RFE combined with the SVM classification algorithm was also evaluated for the combined sequences and features. Overall, for combined sequences, LASSO with the RF classification algorithm obtained the maximum AUC, 0.92, for predicting G3 (Figure 5).
For T2WI images, using 1st order features, LASSO combined with the RF classification algorithm achieved an AUC of 0.81 for G4 subjects, 0.67 for G3 subjects, and 0.63 for G2 subjects. This was the highest AUC for predicting G4 from T2WI images using 1st order features (Figure 6).

Figure 6. ROC-AUC for predicting the GS of prostate cancer with the RF classifier (using LASSO feature selection) and 1st order features obtained from T2WI images.

Discussion
In PCa assessment, mp-MRI has been demonstrated to be a superior technique, allowing for greater accuracy when detecting cancerous growths. It is the only imaging approach with enough spatial resolution and soft-tissue contrast to identify prostate cancer effectively [8] without using ionising radiation. Prostate tumor aggressiveness can be evaluated using artificial intelligence approaches such as radiomics [57]. Consequently, radiomics could be an innovative and effective method for extracting further clinically relevant data [17]. Radiomics can help diagnose prostate cancer early, grade it according to the Gleason score, determine therapy response, and anticipate biochemical recurrence [57].
Different clinical settings may require different ML techniques. For discriminating between sacral chordoma and sacral giant cell malignancies, LASSO with a generalised linear model (GLM) significantly outperformed other approaches [29,48]. However, for scoring colon microarray gene expression and identifying meningioma, random forest and eXtreme Gradient Boosting (XGBoost) classification methods achieved the best performance [30–32,48,58]. Wang et al. reported that recursive feature elimination using a support vector machine is better than other feature selection and classification methods [48]. It is therefore essential, and recommended, to identify appropriate machine learning approaches for each clinical application in future studies. In the context of prostate cancer classification and grading, our research stands out due to its focus on hyperparameter tuning. While many prior studies have applied machine learning techniques with default hyperparameters, we have systematically optimized these parameters to enhance the precision and robustness of our models. This approach has demonstrated its potential to contribute to more accurate and clinically relevant diagnoses, highlighting the critical role of hyperparameter optimization in medical applications of machine learning.

Significant Cancer versus Non-Tumor Regions
The Kruskal-Wallis test was utilised to examine the relevance of radiomics features in differentiating significant cancer from non-tumor regions. Spearman correlation was then performed to determine the association between radiomics features and significant cancer versus non-tumor regions. Two feature selection methods (RFE and LASSO) and two classifiers (RF and SVM) with tuned hyperparameters (randomised search and grid search) were used to create an effective ML algorithm. The analysis of radiomics features against significant versus non-tumor regions revealed eleven statistically significant radiomics features (i.e., skewness, kurtosis, entropylog10, entropylog2, uniformity, jointEntropyLog2, jointEntropyLog10, correlation, contrast, dissimilarity, and angular second moment) with the capacity to discriminate between the significant and non-tumor regions.
Skewness and kurtosis reflect the distribution and shape of pixel intensities, indicating tissue composition or structural variations. Both entropylog10 and entropylog2 measure randomness in pixel distribution, revealing spatial tumor cell distribution. Uniformity indicates pixel intensity homogeneity, suggesting uniform tissue composition or density. Joint entropy reflects spatial relationship randomness, correlating with tumor heterogeneity. Correlation measures linear intensity relationships, indicating tissue structure homogeneity. Contrast reveals local intensity variations, suggesting distinct tumor features. Dissimilarity measures intensity differences between neighboring voxels, reflecting tissue heterogeneity. Angular second moment quantifies intensity uniformity, indicating tissue texture homogeneity. Overall, these features provide insights into tumor heterogeneity by quantifying pixel intensity distribution, texture, and spatial relationships within the tumor region.
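To make the second-order features concrete, the sketch below builds a grey-level co-occurrence matrix for a single 2D offset and derives the six GLCM features named above. It is a simplified illustration under stated assumptions, not the LIFEx computation: the function `glcm_features`, the single horizontal offset, the symmetric normalization, and the random test image are all choices made for the example, and the correlation is undefined for a constant image.

```python
import numpy as np


def glcm_features(levels, n_levels, offset=(0, 1)):
    """Symmetric, normalized GLCM for one 2D offset and six derived features."""
    levels = np.asarray(levels)
    dr, dc = offset
    rows, cols = levels.shape
    # Pair each pixel (a) with its neighbour (b) at (row + dr, col + dc).
    a = levels[max(0, -dr):rows - max(0, dr), max(0, -dc):cols - max(0, dc)]
    b = levels[max(0, dr):rows - max(0, -dr), max(0, dc):cols - max(0, -dc)]

    glcm = np.zeros((n_levels, n_levels), dtype=float)
    np.add.at(glcm, (a.ravel(), b.ravel()), 1.0)  # count co-occurrences
    glcm = glcm + glcm.T                          # count both directions
    p = glcm / glcm.sum()                         # joint probabilities

    i, j = np.indices(p.shape)
    mu_i, mu_j = np.sum(i * p), np.sum(j * p)
    sd_i = np.sqrt(np.sum((i - mu_i) ** 2 * p))
    sd_j = np.sqrt(np.sum((j - mu_j) ** 2 * p))
    nz = p[p > 0]                                 # avoid log(0)
    return {
        "contrast": float(np.sum((i - j) ** 2 * p)),
        "dissimilarity": float(np.sum(np.abs(i - j) * p)),
        "angular_second_moment": float(np.sum(p ** 2)),
        "correlation": float(
            np.sum((i - mu_i) * (j - mu_j) * p) / (sd_i * sd_j)
        ),
        "joint_entropy_log2": float(-np.sum(nz * np.log2(nz))),
        "joint_entropy_log10": float(-np.sum(nz * np.log10(nz))),
    }


# Random 8-level image, for illustration only.
rng = np.random.default_rng(0)
image = rng.integers(0, 8, size=(16, 16))
print(glcm_features(image, n_levels=8))
```

A heterogeneous region spreads probability over many grey-level pairs, which raises joint entropy, contrast, and dissimilarity and lowers the angular second moment; a homogeneous region does the opposite.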
Prostate cancer discrimination employing multiparametric MRI radiomics was designed and tested in this study, and the technique consistently performed well. As this study reveals, classification accuracy varies between ML techniques. For T2WI, RF and SVM classifiers were very effective when used with RFE (AUC = 0.95 ± 0.01 and 0.94 ± 0.01, respectively). The second-best result was observed using LASSO selection with SVM and RF classifiers (AUC = 0.93 ± 0.01 for T2WI and 0.89 ± 0.00 for the ADC map, respectively). This is consistent with previous findings showing that this approach performs well relative to other feature selection techniques and classifiers in various organs [32,45,55,57,59,60]. With support vector machine and random forest classifiers, the AUC for the T2WI sequence was highest when RFE was used for feature selection.
According to Sun et al. [61], radiomic features can be used to identify the T1-2 and T3-4 stages using an unsupervised clustering algorithm and the supervised LASSO technique. This finding might be linked to the fact that morphological T2WI depends on the tumor signal for its assessment. The second-highest AUC was achieved using LASSO selection with SVM and RF. Wang et al. achieved the best result when combining a support vector machine with recursive feature reduction [62].
Nevertheless, the T2WI model performed better than the ADC model (AUCs of 0.95 vs. 0.89, respectively). The classification algorithm generated from T2WI images using RFE feature selection with the RF classifier achieved the maximum AUC of 0.95 ± 0.01. In addition, RFE combined with the SVM classifier obtained the second-highest AUC of 0.94 ± 0.01 (with 5-fold cross-validation). Additionally, T2WI could enable a non-invasive analysis of PCa biological growth, which might assist in classifying patients for adequate treatment. It also provides morphologic data for cancer diagnosis, localisation, and staging [62]. For classification of significant cancer versus non-tumor regions from ADC map images, the SVM and RF classifiers combined with LASSO (AUC of 0.89 ± 0.00 and 0.89 ± 0.02, respectively) and with RFE (AUC of 0.84 ± 0.01 and 0.85 ± 0.02, respectively) yielded lower AUCs than those obtained from T2WI images. For combined sequences (T2WI and ADC map images), LASSO combined with the SVM classifier had an AUC of 0.91. The second-highest AUC was 0.88 for RFE with the RF classifier. Features from several sequences achieved lower performance compared to single-sequence features.
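The AUC values above are reported as a mean ± spread over 5-fold cross-validation. The sketch below shows one way such a figure can be produced: a rank-based AUC (the probability that a randomly chosen positive case scores higher than a randomly chosen negative one), a simple stratified fold assignment, and a mean and standard deviation over per-fold AUCs. The function names and the per-fold values are hypothetical, and the use of the sample standard deviation for the ± term is an assumption rather than something stated in the study.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k=5):
    """Assign sample indices to k folds so each fold keeps the class proportions."""
    folds = [[] for _ in range(k)]
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)  # deal each class out round-robin
    return folds


def summarize_folds(fold_aucs):
    """Mean and sample standard deviation of per-fold AUCs."""
    return mean(fold_aucs), stdev(fold_aucs)


# Hypothetical per-fold AUCs (illustration only, not results from the study).
fold_aucs = [0.94, 0.95, 0.96, 0.95, 0.95]
avg, spread = summarize_folds(fold_aucs)
print(f"AUC = {avg:.2f} ± {spread:.2f}")
```

In a full pipeline, each fold would in turn be held out for scoring while feature selection and hyperparameter tuning are fitted on the remaining folds, so that the held-out AUC is not influenced by the selection step.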

GS Prediction
The Kruskal-Wallis test assessed the ability of radiomics features to predict GS in prostate cancer patients. Radiomics features and GS cohorts were then correlated using Spearman correlation. The ML algorithm was developed using feature selection methods (RFE and LASSO) and classifiers (RF and SVM) with tuned hyperparameters (randomised search and grid search). ADC map images revealed one radiomics feature, uniformity, that could distinguish GS cohorts. Uniformity refers to the homogeneity of pixel intensities, indicating a consistent tissue composition or density. This feature offers insights into tumor heterogeneity by assessing the distribution of pixel intensities, texture variations, and spatial relationships within the tumor region, providing a comprehensive view of its internal characteristics.
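For reference, uniformity and the other first-order descriptors can be computed directly from the discretized intensities inside an ROI. The sketch below uses the common histogram-based definitions (uniformity as the sum of squared gray-level probabilities, entropy as the Shannon entropy of the histogram, and moment-based skewness and kurtosis); the function name and example values are hypothetical, and the exact definitions and binning used in the study are not restated in this section.

```python
import math
from collections import Counter


def first_order_features(intensities):
    """First-order descriptors of a list of discretized ROI intensities."""
    n = len(intensities)
    probs = [count / n for count in Counter(intensities).values()]
    mu = sum(intensities) / n
    m2 = sum((x - mu) ** 2 for x in intensities) / n  # population variance
    m3 = sum((x - mu) ** 3 for x in intensities) / n
    m4 = sum((x - mu) ** 4 for x in intensities) / n
    return {
        "uniformity": sum(p ** 2 for p in probs),  # 1.0 for a single gray level
        "entropy_log2": -sum(p * math.log2(p) for p in probs),
        "entropy_log10": -sum(p * math.log10(p) for p in probs),
        # A flat ROI has no spread; 0.0 is returned by convention (assumed here).
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        "kurtosis": m4 / m2 ** 2 if m2 > 0 else 0.0,
    }


# Hypothetical ROIs (illustration only): one homogeneous, one heterogeneous.
print(first_order_features([5, 5, 5, 5]))
print(first_order_features([1, 2, 3, 4]))
```

A homogeneous ROI gives a uniformity of 1 and an entropy of 0, while an ROI spread evenly over four gray levels gives a uniformity of 0.25, which illustrates why lower uniformity is read as greater heterogeneity.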
The results we obtained agree with those of several other studies using texture analysis [63,64]. First- and second-order texture features derived from ADC and T2WI, combined with sample augmentation, have been shown to achieve reasonably accurate classification of Gleason patterns [55]. Our findings align with employing the Gleason score as the primary criterion for differentiating benign from significant prostate tumors.
This research has limitations. The number of patients is relatively low. A larger subject cohort (raw dataset) is required to fully validate and optimize the performance for application in a clinical setting. In addition, clinical settings involve compromises on mismatched resolutions. Ideally, all our data and all clinical data would be acquired at the same resolution, field strength, etc., providing uniformity in data acquisition so that this step could be avoided. Owing to the constraints on clinical MRI time and the time requirements of the different sequences employed, this mismatch of resolutions will persist for the near future.

Conclusions
Within the scope of this study, the classification of prostate cancer and prediction of GS groups using multiparametric MRI-based radiomics has been achieved. By prioritizing hyperparameter tuning, we have significantly improved the precision and dependability of our ML approaches. This work underscores the importance of meticulous parameter optimization in enhancing the accuracy of medical diagnoses. Radiomics analysis based on multiparametric MRI showed excellent results in discriminating non-tumor regions from significant prostate cancer. It also demonstrated superior outcomes in distinguishing between GS groups. Our approach suggests that using multiple features and classifiers with tuned hyperparameters provides a more clinically dependable method of identifying clinically relevant features.

Figure 1. The prostate cancer classification approach entails three primary steps: (i) utilizing Regions of Interest (ROIs) that correspond to cancer locations on histology slides and MRI, specifically T2-weighted images and Apparent Diffusion Coefficient map images from 71 subjects; (ii) extracting both 1st and 2nd order features; and (iii) conducting ROC-AUC analysis, which includes the generation of ROC curves.

Figure 2. The classification of prostate cancer as significant versus non-tumor regions depends on RFE using mp-MRI within a 5-fold cross-validation.

Figure 3. The classification of prostate cancer as significant versus non-tumor regions depends on LASSO using mp-MRI within a 5-fold cross-validation. For combined sequences (T2WI and ADC map images), the LASSO combined SVM classification algorithm obtained the maximum AUC of 0.91. RFE combined with the RF

Figure 4. ROC-AUC of predicting GS of prostate cancer from RF classifier (using LASSO feature selections) using 1st order features obtained from ADC map images.

Figure 5. ROC-AUC of predicting GS of prostate cancer from SVM classifier (using LASSO feature selections) using GLCM features obtained from T2WI images.

Figure 6. ROC-AUC of predicting GS of prostate cancer from RF classifier (us selections) using 1st order features obtained from T2WI image.

Table 2. Collation of radiomics parameters of significant cancer versus non-tumor regions.

Table 3. Features related to significant malignancy versus non-tumor regions that are considered correlated.
depict the significant receiver operating characteristic area under curves (ROC-AUC) for T2WI and ADC map images, respectively.

Table 4. Collation of radiomics parameters of PCa that are related to the GS.

Table 5. Features related to the GS that are considered correlates.