Advancements in MRI-Based Radiomics and Artificial Intelligence for Prostate Cancer: A Comprehensive Review and Future Prospects

Simple Summary
The integration of artificial intelligence (AI) into radiomic models has become increasingly popular due to advances in computer-aided diagnosis tools. These tools utilize statistical and machine learning methods to evaluate various medical image analysis modalities. In the case of prostate cancer, there are multiple areas in the radiomics pipeline that can be improved. This article explores the latest developments in mpMRI for PCa and examines the radiomic flowchart, as well as the fusion of traditional medical imaging with AI to overcome challenges and limitations in clinical applications. Furthermore, it addresses challenges related to radiomics, radiogenomics, and multi-omics in prostate cancer and suggests the necessary critical steps for clinical validation.

Abstract
The use of multiparametric magnetic resonance imaging (mpMRI) has become a common technique used in guiding biopsy and developing treatment plans for prostate lesions. While this technique is effective, non-invasive methods such as radiomics have gained popularity for extracting imaging features to develop predictive models for clinical tasks. The aim is to minimize invasive processes for improved management of prostate cancer (PCa). This study reviews recent research progress in MRI-based radiomics for PCa, including the radiomics pipeline and potential factors affecting personalized diagnosis. The integration of artificial intelligence (AI) with medical imaging is also discussed, in line with the development trend of radiogenomics and multi-omics. The survey highlights the need for more data from multiple institutions to avoid bias and generalize the predictive model. The AI-based radiomics model is considered a promising clinical tool with good prospects for application.


Introduction
Prostate cancer (PCa) is a malignant tumor of the male genitourinary system that arises from epithelial cells. It is among the most prevalent malignant tumors in men, ranking as the second most frequently diagnosed cancer and the fifth leading cause of cancer-related death in men worldwide. It is the most frequently diagnosed cancer in men in 112 countries and the leading cause of cancer death in men in 48 countries [1]. According to the latest statistics, in 2020, there were approximately 1.4 million newly diagnosed PCa cases and 375,000 deaths worldwide [2]. Since PCa develops slowly in its early stages, older men, who are at high risk, may not realize that they are affected. Early detection and prompt treatment are therefore important and can significantly reduce PCa-related deaths. In summary, the contributions of this survey can be listed as follows:
• We provide a brief overview of radiomics models used for PCa. A detailed analysis of the key motivations for radiomics applications using current feature extraction, feature selection, and machine learning techniques is also included.
• We analyze the clinical value of mpMRI in PCa, such as guiding treatment and showing the pathological areas of tumors, and state the current challenges with mpMRI.
• We present the development of radiogenomics and multi-omics with PCa applications.
• We discuss the recent challenges related to current PCa radiomics, radiogenomics, and multi-omics, together with future directions in these topics.
The remainder of this paper is structured as follows. Section 2 briefly describes the impact of mpMRI for PCa. Section 3 presents the standard radiomic model. Section 4 discusses the stability of radiomics. Section 5 introduces the predictive models for classifying PCa with MRI scans. Sections 6 and 7 highlight the research value of radiogenomics and multi-omics for analyzing PCa, respectively. Section 8 discusses the future perspective and limitations. Section 9 summarizes the work and contribution of this paper.

Multi-Parametric MRI Imaging of the Prostate
Multi-parametric MRI (mpMRI) combines anatomical sequences, i.e., T1-weighted (T1W) and T2-weighted (T2W) imaging, with functional sequences, including diffusion-weighted imaging (DWI) and dynamic contrast-enhanced (DCE) imaging. Because T1W is of limited value for evaluating prostate morphology or identifying intraglandular tumors, mpMRI relies on T2W, DWI, and DCE, which have high sensitivity and specificity for detecting significant abnormal tissue. The quality of these sequences depends on the hardware and software used and the scanning parameters chosen, as well as on several other factors, including bowel motility, rectal dilation, the presence of a total hip replacement, and post-biopsy bleeding [9].

Impact of Multi-Parametric MRI
The most important clinical value of mpMRI is to guide biopsy of suspicious areas, enabling direct evaluation of the location, size, and stage of cancer foci in the prostate. The accuracy of mpMRI-guided biopsy depends on how well PCa can be observed on mpMRI. A study comparing standard biopsy with MRI-guided biopsy showed that MRI-guided biopsy has higher sensitivity for detecting clinically significant PCa and reduces the probability of over-examination and overtreatment [14]. In addition, men with positive MRI results should undergo standard biopsy together with targeted biopsy [14,15]. The Prostate Imaging Reporting and Data System (PI-RADS) was introduced to standardize the acquisition, interpretation, and reporting of prostate MRI. The first version, proposed in 2012, included the essential scoring criteria [16]. The second version refined the system in 2015 [17], and PI-RADS v2.1 updated it further in 2019 [18]. This development promotes the standardization of MRI and supports wider clinical application. The scoring system provides a framework for evaluating the individual T2W, DWI, and DCE sequences and integrates these individual scores into overall risk assessment categories from 1 to 5, which inform the decision to biopsy [19]. For example, under PI-RADS v2.0, a biopsy is required for a lesion with a score of 4 or 5 and is not required for a lesion with a score of 1 or 2, whereas a score of 3 indicates that the lesion may require biopsy, depending on clinical factors [17]. mpMRI is therefore considered both in detecting PCa and in planning treatment. A recent survey indicated that MRI-based radiomics research on PCa has the potential to enhance PI-RADS reporting in the future, specifically by improving the diagnosis and risk stratification of PCa [20].
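The PI-RADS v2.0 biopsy rule described in this subsection can be written as a small decision function. The following is only an illustrative sketch: the function name and the boolean `clinical_risk` flag (a stand-in for the unspecified "clinical factors" that decide a score of 3) are assumptions introduced here, not part of PI-RADS itself.

```python
def biopsy_recommendation(pirads_score: int, clinical_risk: bool = False) -> str:
    """Map an overall PI-RADS category (1-5) to the biopsy rule given in the text.

    `clinical_risk` is a hypothetical flag summarizing the clinical factors
    that decide the equivocal category 3.
    """
    if pirads_score not in (1, 2, 3, 4, 5):
        raise ValueError("PI-RADS category must be an integer from 1 to 5")
    if pirads_score >= 4:
        # Categories 4 and 5: biopsy is required.
        return "biopsy"
    if pirads_score == 3:
        # Category 3: biopsy may be required, depending on clinical factors.
        return "biopsy" if clinical_risk else "no biopsy"
    # Categories 1 and 2: biopsy is not required.
    return "no biopsy"
```

In practice the decision for category 3 involves several clinical variables rather than a single flag; the flag only marks where that judgment enters the rule.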
Table 1 reports recently published literature using mpMRI to detect PCa. These papers indicate that mpMRI is presently the most frequently used technique for identifying PCa. Specifically, these works showed that using mpMRI can improve the detection rate of clinically significant PCa (csPCa) [14,21-24]. The authors of [25] found no significant differences in the detection of PCa and csPCa between MRI in-bore and MRI-TRUS fusion targeted biopsy (TBx). In terms of imaging, the study in [26] combined MRI with prostate-specific membrane antigen (PSMA) to improve the negative predictive value (NPV) and sensitivity for csPCa. In [27], the detection of csPCa with miniature ultrasound biopsies was not significantly different from that with mpMRI. In [28], mpMRI outperformed the Foggia Prostate Cancer Risk Calculator (FPC-RC) and was similar to the European Randomized Study of Screening for Prostate Cancer RC (ERSPC-RC) and the Prostate Biopsy Collaborative Group RC (PBCG-RC) in predicting csPCa, which suggests that mpMRI can improve the diagnostic accuracy of risk calculators. In addition, an mpMRI risk calculator was shown to avoid unnecessary biopsies [29]. Combining mpMRI with MR spectroscopy improved the performance of PCa diagnosis [30]. Another study showed that combining fusion-guided biopsy with systematic biopsy could improve the detection of PCa by 10% and identify csPCa [31]. In [32], adding MRI-lesion targeted biopsy (MRI-TB) in MRI-positive patients improved the detection rate of csPCa.

Table 1. Summary of multi-parametric MRI in detecting PCa.

Ref. | Biopsy | Method | Conclusion
[14] | No | MRI-TBx | When detecting csPCa, MRI provides a higher DR than the standard biopsy.
[29] | Yes | mpMRI-RCs | When predicting csPCa, RC-R has a higher AUC than RC-A.
[30] | No | mpMRI, mpMRI-DW-DCE-MRSI | When diagnosing PCa, mpMRI-DW-DCE-MRSI has higher sensitivity and specificity than MRI.
[31] | No | MRI/ultrasound fusion biopsy, SB | When detecting PCa, combined biopsy provides a higher DR than SB and fusion-guided biopsy alone. When detecting csPCa, fusion-guided biopsy alone provides a higher detection rate than SB alone.
[32] | Yes | MRI-TBx, SB | When detecting PCa and csPCa, MRI-TBx provides a higher DR than SB alone.

mpMRI also guides treatment, including radical prostatectomy, definitive radiotherapy, and active monitoring [19]. For example, mpMRI images can show the location of the lesion and help segment the tumor volume correctly, which simplifies treatment management [34]. Patients with PCa who undergo surgery or radiotherapy may experience recurrence, including biochemical recurrence (BCR), local recurrence, and distant metastases. Before treatment, mpMRI scans can be combined with clinical variables (clinical stage, PSA, and biopsy Gleason score) to determine the risk of recurrence. Using clinical variables and medical images, AI models can improve performance in multitask prediction related to PCa [35,36]. mpMRI can also improve PCa segmentation. For example, combining T2W and apparent diffusion coefficient (ADC) images can enhance the evaluation of PCa in both visual quality and objective assessment [37]. In [38], a deep learning model, ProGNet, was developed to automatically segment the prostate on MRI. ProGNet outperformed U-Net and reduced the time radiologic technologists need to segment the prostate clinically, which facilitates targeted biopsy and may improve biopsy accuracy.
Despite the critical role of mpMRI in PCa management, some challenges remain. For example, mpMRI is reliable for excluding clinically significant PCa, but whether a biopsy is needed after a negative mpMRI is still controversial, especially for young patients [39]. Interpretation differs significantly between radiologists and institutions, so the same patient may receive different examination results at different institutions [19]. Before mpMRI becomes the standard for PCa management, large multicenter datasets are required. mpMRI is difficult to obtain for patients with limitations such as pacemakers, metal implants, and claustrophobia, and it has inherent problems, such as variability and challenges in image acquisition and interpretation [40]. All these limitations require further investigation to improve PCa management.

Radiomics Analysis for PCa
By avoiding invasive procedures, such as obtaining pathological specimens through surgery, radiomics offers a more patient-friendly option that provides information without causing undue discomfort to patients. The standard steps of the method are illustrated in Figure 2. A standard radiomic pipeline typically involves four key steps. First, an MRI scanner captures multi-parametric MRI images. Second, the images are segmented to label abnormal regions, i.e., the region of interest (ROI). Third, texture, shape, and intensity features (and/or deep features from convolutional neural network (CNN) layers) are extracted from the images. Finally, the imaging features are combined with relevant clinical variables in a classifier model to predict clinical outcomes such as the Gleason score of prostate cancer.

Image Acquisition
Radiomics leverages various medical images, such as ultrasound, X-ray, CT, MRI, and PET scans. Public repositories such as The Cancer Imaging Archive (TCIA) provide medical images with detailed manual annotations, including Gleason scores and recommended treatments, labeled by clinicians (radiologists, oncologists, and/or pathologists). Among these modalities, MRI is preferred for prostate examinations because of its superior soft tissue imaging and sensitivity to metastases [41]. Image acquisition is nevertheless a critical step in the radiomic pipeline. Limitations in technology or equipment may prevent some biological abnormalities from being displayed, leading to unreliable results. To address this issue, initiatives such as the Quantitative Imaging Biomarkers Alliance (QIBA) [42], the Image Biomarker Standardisation Initiative (IBSI) [43], and the European Imaging Biomarkers Alliance (EIBALL) [44] have been proposed to promote quantitative imaging and ensure reliability. These initiatives specify measurement accuracy requirements for quantitative imaging biomarkers and outline procedures to achieve optimal accuracy while minimizing possible biases.
Extracting radiomic features from medical images requires a series of preprocessing steps, because the accuracy and reliability of the extracted features depend heavily on the quality of the input data. Denoising is commonly used to reduce noise, which can otherwise degrade the accuracy of the radiomic features. Standardization scales the data to a common range, which makes data from different sources or modalities comparable. Resampling adjusts the resolution of the data to a common scale so that features are computed on uniform data. Several normalization methods have been proposed, such as linear scaling, Gaussian, and z-score normalization, and the choice of method has been shown to significantly impact the results of predictive models [45]. After image acquisition and preprocessing, the next step in the radiomic process is to label the region of interest, such as lesion regions, using segmentation techniques that identify and separate the region of interest from the surrounding tissues. Together, preprocessing and segmentation provide a solid foundation for the accurate and reliable extraction of radiomic features.
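Two of the normalization schemes mentioned above, z-score and linear (min-max) scaling, can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function names are hypothetical, intensities are passed as a flat list, and real pipelines usually operate on image arrays and often restrict the statistics to a reference region.

```python
from statistics import mean, pstdev


def zscore_normalize(intensities):
    """Rescale intensities to zero mean and unit (population) standard deviation."""
    mu = mean(intensities)
    sigma = pstdev(intensities)
    if sigma == 0:
        # A constant image carries no contrast; map every voxel to 0.
        return [0.0 for _ in intensities]
    return [(x - mu) / sigma for x in intensities]


def minmax_normalize(intensities, new_min=0.0, new_max=1.0):
    """Linearly rescale intensities to the range [new_min, new_max]."""
    lo, hi = min(intensities), max(intensities)
    if hi == lo:
        return [new_min for _ in intensities]
    scale = (new_max - new_min) / (hi - lo)
    return [new_min + (x - lo) * scale for x in intensities]
```

Z-score normalization depends on the intensity distribution of each image, while min-max scaling depends only on its extremes and is therefore more sensitive to outlier voxels. This is one reason why the choice of method can change downstream model results.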

Segmentation
Segmentation of PCa in MRI images is the process of identifying and isolating cancerous tissue within the prostate gland. MRI scans of the prostate provide high-resolution images that can be used to distinguish between cancerous and non-cancerous tissue. However, interpreting these images can be challenging due to the complex anatomy of the prostate gland and the variability of cancerous lesions. Segmentation techniques aim to automate this process and improve the accuracy and efficiency of diagnosis and treatment. Several approaches have been developed for PCa segmentation in MRI images, including manual segmentation by radiologists, semi-automated methods using thresholding and region growing, and fully automated methods using machine learning algorithms [36,46]. Machine learning algorithms such as convolutional neural networks (CNNs) have shown promising results in segmenting PCa in MRI images with high accuracy. Segmentation identifies and separates regions of interest (ROI) in two-dimensional (2D) space or volumes of interest (VOI) in three-dimensional (3D) space. Accurate ROI labeling is important in studies where pathological regions require precise boundaries, which can be challenging to delineate. Automatic segmentation algorithms for ROIs still require improvement, while manual segmentation is time-consuming, particularly for large data sets. Both manual and semi-automatic segmentation can be affected by observers, leading to deviations. Therefore, the reproducibility of radiomics features derived from manual or semi-automatic image segmentation and correction should be evaluated for intraobserver and interobserver variability, and non-reproducible features should be excluded from further analysis [47]. Fully automatic segmentation is expected to become the dominant method soon [48].
For example, CNNs have been widely employed for automatic segmentation [49,50]. In [51], a multiregional automatic segmentation model based on CNNs was developed using an intercontinental cohort of PCa MRI. In [52], an end-to-end CNN was proposed to automatically segment csPCa lesions, and its segmentation accuracy was higher than that of other methods (Dice and sensitivity were 0.7014 and 0.8652, respectively).
Additionally, combining a CNN (V-Net T2) with an Active Shape Model (ASM) increased the Dice Similarity Coefficient (DSC) from 0.840 to 0.851 and reduced the Hausdorff distance from 10.74 to 7.55 mm, improving the segmentation performance [53]. Despite many advances in automatic segmentation, semi-automatic segmentation, which leaves options to clinicians, remains the most recommended approach. For example, the 3D Slicer (www.Slicer.org (accessed on 7 April 2023)) and ITK-SNAP (http://www.itksnap.org (accessed on 7 April 2023)) tools are used to label tumors.
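The two segmentation metrics quoted above, the Dice similarity coefficient and the Hausdorff distance, can be illustrated with a minimal sketch. Masks are represented here as sets of voxel coordinate tuples, which is an assumption made for brevity; the function names are hypothetical, and the brute-force Hausdorff computation is only practical for small masks or boundary point sets.

```python
from math import dist


def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient between two sets of voxel coordinates."""
    if not mask_a and not mask_b:
        return 1.0  # Two empty masks agree perfectly by convention.
    return 2.0 * len(mask_a & mask_b) / (len(mask_a) + len(mask_b))


def hausdorff_distance(points_a, points_b):
    """Symmetric Hausdorff distance between two non-empty point sets.

    The result is in coordinate units (voxels); scale the coordinates by
    the voxel spacing beforehand to obtain a distance in millimetres.
    """
    if not points_a or not points_b:
        raise ValueError("both point sets must be non-empty")

    def directed(src, dst):
        # Largest distance from any source point to its nearest target point.
        return max(min(dist(p, q) for q in dst) for p in src)

    return max(directed(points_a, points_b), directed(points_b, points_a))


# Two 2x2 squares that overlap in one column.
reference = {(0, 0), (0, 1), (1, 0), (1, 1)}
predicted = {(0, 1), (1, 1), (0, 2), (1, 2)}
print(dice_coefficient(reference, predicted))    # 0.5
print(hausdorff_distance(reference, predicted))  # 1.0
```

Dice rewards volumetric overlap, whereas the Hausdorff distance is driven by the single worst boundary error, so the two metrics are usually reported together.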
So far, one challenge in PCa segmentation is the presence of false positives and false negatives, which can lead to over or underestimation of the extent of the PCa. To address this, more work is needed to integrate multiple MRI sequences with advanced image processing techniques to improve segmentation accuracy.

Feature Extraction, Selection, and Construction
Feature extraction follows ROI segmentation and is the core part of radiomics. The extracted features describe biological information and important characteristics of abnormal tissue and are used as input to predictive models. Currently, radiomic features include morphological/shape features, texture features (such as the gray-level co-occurrence matrix (GLCM), gray-level size zone matrix (GLSZM), gray-level run length matrix (GLRLM), gray-level dependence matrix (GLDM), neighboring gray-tone difference matrix (NGTDM), etc.), and high-order statistical features [54]. The Joint Intensity Matrix (JIM) [55] and deep texture [56] have been proposed as radiomic features for predicting the Gleason Score (GS) of prostate cancer using mpMRI scans. Shape features in conventional radiomics are typically derived from ROIs that have been manually labeled. However, it is important to consider inter-observer variability during segmentation, as this can impact the reliability of selected features. To address this, images can be segmented by multiple observers and the resulting features compared using metrics such as the intraclass correlation coefficient (ICC) and the concordance correlation coefficient (CCC). Only variables that meet specific thresholds for robustness should be selected [57]. With CNN models, features are extracted and selected based on CNN layers (such as feature maps and pooling layers) [3]. The extraction of radiomics features typically results in a high-dimensional feature space, which can lead to overfitting when the features are used as inputs to predictive models. This feature space includes redundant and noisy information, which can introduce errors in practical applications and affect accuracy. When the feature dimension exceeds a certain limit, the classifier's performance may decline, and training time will increase.
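As an illustration of how a texture feature is computed, the sketch below builds a gray-level co-occurrence matrix (GLCM) for one pixel offset and derives two descriptors from it, contrast and angular second moment. It is a simplified sketch, not the implementation of any particular radiomics package: the function names are hypothetical, the image is a small 2D list of already discretized gray levels, and only a single offset is used.

```python
def glcm(image, levels, offset=(0, 1), symmetric=True):
    """Normalized GLCM of a 2D list of integer gray levels in [0, levels)."""
    d_row, d_col = offset
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                if symmetric:
                    # Count each pair in both directions.
                    counts[j][i] += 1
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("offset yields no pixel pairs for this image size")
    return [[v / total for v in row] for row in counts]


def glcm_contrast(p):
    """Contrast: co-occurrence probabilities weighted by squared level difference."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def glcm_asm(p):
    """Angular second moment: sum of squared co-occurrence probabilities."""
    return sum(v * v for row in p for v in row)


# Small 4-level example image with a horizontal neighbor offset.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(image, levels=4)
print(glcm_contrast(p))  # 14/24, about 0.583
```

Radiomics toolkits typically aggregate such descriptors over several offsets and directions, which multiplies the number of features and contributes to the high-dimensional feature space discussed above.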
Feature dimensionality reduction is therefore required to reduce errors, improve the efficiency of radiomics feature data, enhance the model's prediction ability, and shorten training time. Table 2 reports the recent literature on feature selection techniques. The feature selection methods include Random Forest (RF), the least absolute shrinkage and selection operator (LASSO), principal component analysis (PCA), maximal relevance and minimal redundancy (mRMR), etc. LASSO, PCA, and RF are frequently used methods for feature selection. According to a comprehensive study by Zebari et al. [58], PCA is the most commonly employed algorithm for dimensionality reduction. The feature selection methods can also be divided into the following categories: (1) Filter methods, which evaluate features according to their divergence or correlation, set a threshold, and then select features, for example by correlation analysis, analysis of variance, or the rank sum test; (2) Wrapper methods, which select or exclude several features according to an objective function, such as recursive feature elimination; and (3) Embedded methods, in which a machine learning model is first trained to obtain a weight for each feature, and the features are then selected according to these weights, as in logistic regression [59]. Despite the advantages of feature selection methods, more investigation is still needed to solve the overfitting problem.
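A filter method of the kind described in category (1) can be sketched as a correlation-based redundancy filter. This is one simple illustrative variant, assuming a hypothetical threshold of 0.9 and a greedy keep-first strategy; it is not the procedure of any cited study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # A constant feature has no defined correlation.
    return sxy / sqrt(sxx * syy)


def drop_correlated_features(features, threshold=0.9):
    """Greedy filter: keep a feature only if its absolute correlation with
    every feature kept so far does not exceed `threshold`.

    `features` maps a feature name to its values, one value per patient.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature table for four patients.
table = {
    "volume": [1.0, 2.0, 3.0, 4.0],
    "surface": [2.0, 4.0, 6.0, 8.1],   # nearly proportional to "volume"
    "entropy": [4.0, 1.0, 3.0, 2.0],
}
print(drop_correlated_features(table))  # ['volume', 'entropy']
```

Because the filter is greedy, the result depends on feature order; ranking features by relevance to the outcome before filtering is a common refinement.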

Building Predictive Models
Modeling methods in radiomics can be divided into unsupervised and supervised categories. Unsupervised methods, such as k-means clustering and hierarchical clustering, are used for datasets without labels, while supervised methods, such as random forest, support vector machine, artificial neural network, and logistic regression, are used for labeled datasets. While no single classification method has been identified as universally superior in radiomics, supervised methods are generally used more frequently. For example, the logistic regression model is often preferred for its simplicity and has become the most commonly used method for building models.
A common practice is to split datasets into training (70%) and test (30%) groups. The model is constructed using the training dataset and is fine-tuned through internal validation methods such as k-fold cross-validation. The test dataset is used to evaluate the performance of the predictive model [73]. A variety of metrics can be used to assess the model's performance, including the area under the receiver operating characteristic curve (AUC-ROC), sensitivity (SE), specificity (SP), accuracy (ACC), and decision curve analysis.
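The splitting scheme and metrics above can be sketched as follows. This is a minimal standard-library illustration under stated assumptions: the function names are hypothetical, labels are binary (1 = positive), the split is not stratified, and the AUC is computed by pairwise comparison of positive and negative scores.

```python
import random


def train_test_split(indices, test_fraction=0.3, seed=0):
    """Shuffle sample indices and split them into (train, test) lists."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def k_fold(indices, k=5):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    indices = list(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


def confusion_metrics(y_true, y_pred):
    """Sensitivity (SE), specificity (SP), and accuracy (ACC) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return {"SE": tp / (tp + fn), "SP": tn / (tn + fp), "ACC": (tp + tn) / len(pairs)}


def roc_auc(y_true, scores):
    """AUC as the probability that a positive case scores above a negative one
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present in y_true")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

For example, `roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])` gives 0.75, because three of the four positive-negative pairs are ranked correctly. Cross-validation folds should be drawn from the training indices only, so that the test set stays untouched until the final evaluation.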
Many modeling techniques commonly used in radiomics are reported in Table 3. In [74], a radiomic model based on quantitative imaging features was used to predict clinically significant PCa. In [75], a radiomics model with mpMRI was shown to help improve the diagnostic performance of PI-RADS v2.1 in PCa. In [61,76], radiomics was used to actively monitor the progression of PCa. In [77,78], a radiomic model was used to study the risk of lymph node invasion in PCa patients in order to avoid extended pelvic lymph node dissection. In addition, radiomic models based on combined PET and ADC scans have complementary value [79]. In [80], the 3T-DWI b2000 sequence was used for prognosis and targeted biopsy, demonstrating the feasibility of ADC for PCa detection. In [60], a nomogram showed that a radiomic model combining MRI and PI-RADS is a noninvasive method capable of predicting PCa. As reported in Table 3, radiomics models demonstrated a noticeable improvement in the detection rate of PCa.

Radiomics Stability
A major challenge facing radiomics is model stability [81]. Factors to consider when studying stability include (1) feature importance, (2) generalizability, (3) stability testing, and (4) failure examination. For example, the relative importance of features in a trained radiomic model can be estimated and evaluated [82]. In addition, testing the model using different patient groups can assess generalizability. Furthermore, the stability and reproducibility of features may be improved through the use of standardized protocol guidelines and software [83]. Radiomic stability is explained in more detail as follows.

Feature Importance
To develop a stable model, it is important to identify significant radiomic features that are relevant to clinical tasks. This may involve identifying and avoiding redundant features that can lead to scalability issues. By excluding highly correlated or redundant features, a more stable model can be built [84,85]. To minimize the risk of using unstable and unrepeatable features in the radiomics analysis, it is suggested to perform test-retest analyses that control for treatment site, scanner, and imaging protocol in order to evaluate the impact of each factor on the model [86]. In addition, radiomic features need to remain stable when various data sources are used.

Generalizability
Multicenter data, large samples, and additional clinical features are required to achieve model generalizability. For example, it may be beneficial to train the model with diverse groups of patients derived from multiple sites. This approach can help to increase the model's ability to perform well on unseen data and in different clinical settings. Training the model on a single group of patients from a particular site may lead to overfitting, which can limit the model's performance on new data. In contrast, training the model with patients from various sites can help capture the heterogeneity in the radiomic features across different patient populations and imaging protocols. Since data in clinical applications may come from different imaging acquisition protocols or devices, these factors should be considered when training radiomic models that will be applied to other patients [87]. In [88], the generalizability of a model was evaluated using two external datasets, and performance decreased significantly compared to internal cross-validation (average AUC 0.54 vs. 0.75). In [89], federated learning was reported to improve the generalization performance of PCa models across institutions while protecting data privacy. Domain adaptation and federated learning can thus be valuable techniques for enhancing the generalizability of machine learning models, particularly in medical imaging, where data can be diverse and challenging to obtain [90].
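To make the idea of federated learning concrete, the sketch below shows the aggregation step of federated averaging (FedAvg), one widely used scheme in which each institution trains locally and only model parameters, not patient data, are shared. FedAvg is named here as a common example; it is an assumption that it resembles the approach in [89]. The function name and the flat-list representation of model parameters are also hypothetical.

```python
def federated_average(site_weights, site_sizes):
    """Aggregate per-site model parameters into a global model.

    Each site's parameter vector is weighted by its number of training
    samples, as in the FedAvg aggregation step.
    """
    if len(site_weights) != len(site_sizes) or not site_weights:
        raise ValueError("need one sample count per site")
    total = sum(site_sizes)
    n_params = len(site_weights[0])
    return [
        sum(w[i] * n for w, n in zip(site_weights, site_sizes)) / total
        for i in range(n_params)
    ]


# Two hypothetical hospitals with 100 and 300 patients.
global_model = federated_average([[1.0, 2.0], [3.0, 4.0]], [100, 300])
print(global_model)  # [2.5, 3.5]
```

In a full training loop this aggregation alternates with rounds of local training at each site, starting each round from the current global parameters.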

Stability Testing
As previously mentioned, stability can be assessed through reproducibility. Common indicators include the concordance correlation coefficient (CCC), coefficient of variation (CoV or CV), Pearson or Spearman correlation, and, for mean or median comparison, parametric or nonparametric statistical tests (t-test, analysis of variance, Wilcoxon test, Friedman test, etc.) [91]. Many studies performed radiology-assisted experiments to improve repeatability and reproducibility and to identify stable features. However, unstable features may also carry relevant information, and including them can lead to an overestimation of model performance. In [92], a data analysis method was proposed to evaluate the stability of radiomics features obtained from MRI. This method showed that a substantial proportion of the ADC-based radiomic features (25-29%) exhibit test-retest stability across various tissues, MR systems, and vendors. In addition, different observers, and even the same observer, may evaluate the same image differently, which can lead to variations in the results. For instance, in a study where a pathology team of four experts evaluated 425 internal biopsy tissues [93], two uropathologists exhibited an interobserver agreement of 0.89 quadratic-weighted kappa (κ_quad), while the agreement among general surgical pathologists was 0.69 κ_quad. The agreement between uropathologists and general surgical pathologists ranged from 0.50 to 0.59 κ_quad. To reduce errors, it is possible to train observers, standardize techniques and judgment criteria, estimate the degree of disagreement between observers, and randomly assign patients to observers.
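Three of the agreement measures mentioned above can be sketched with the standard library alone: the concordance correlation coefficient for test-retest feature values, the coefficient of variation for repeated measurements, and the quadratic-weighted kappa for ordinal ratings by two observers. The function names are hypothetical, and the code is a minimal illustration rather than a validated statistical implementation.

```python
from math import sqrt


def concordance_cc(x, y):
    """Lin's concordance correlation coefficient between two measurement runs."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denom = var_x + var_y + (mean_x - mean_y) ** 2
    if denom == 0:
        return 1.0  # Both runs are the same constant value.
    return 2 * cov / denom


def coefficient_of_variation(values):
    """Sample standard deviation divided by the absolute mean."""
    n = len(values)
    m = sum(values) / n
    sd = sqrt(sum((v - m) ** 2 for v in values) / (n - 1))
    return sd / abs(m)


def quadratic_weighted_kappa(rater_a, rater_b, n_categories):
    """Quadratic-weighted kappa for ratings coded 0 .. n_categories - 1."""
    k, n = n_categories, len(rater_a)
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1
    row = [sum(r) for r in observed]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2
            num += weight * observed[i][j]     # observed disagreement
            den += weight * row[i] * col[j] / n  # disagreement expected by chance
    if den == 0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1.0 - num / den
```

Unlike the Pearson correlation, the CCC penalizes systematic offsets: `concordance_cc([1, 2, 3], [2, 3, 4])` is 4/7 (about 0.57) even though the two runs are perfectly correlated.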

Failure Examination
Failure examination can reveal potential defects in established radiomics models. It requires establishing a quality management procedure to check whether a model is still valid after updating [87]. A summary of the relevant radiomics stability studies is presented in Table 4. In [94], 2D-based radiomic features of MRI models showed good stability in identifying GS. In addition, image normalization may be considered a stability factor in data preprocessing. For example, normalization is also applied to multisource data for prognosis modeling [95]. Multi-modal radiomics models have been developed and customized according to specific radiology protocols, which can solve particular problems related to single-center and multi-center research [96]. In addition, radiomic features of phantoms and volunteers with low CoV and high ICC can be considered good candidates for MRI radiomics studies [97]. As stability-oriented radiomic techniques progress, more investigation is recommended into applying them while minimizing model bias.

Table 4. Summary of radiomics stability studies.

Ref. | Stability | Purpose | Conclusion
[94] | MS and FS | To explore the potential of radiomic features in identifying GS <7, =7 and >7 | The 2D model performed better than the 3D model.
[95] | FS | To evaluate the effect of different image normalization methods on the robustness of MRI features in a multicenter setting. | The percentage of stable features varies from 3.4% to 8% depending on the normalization method.
[98] | MS | To assess the potential of clinic-based models, radiomics based on multiparameter ultrasound, and combined models to predict PCa. | The combined model achieved better predictive performance than the radiomics and the clinical model.
[99] | FS | To explore the stability of the radiomics features extracted from T2 weighted MR Linac images for the five common influencing factors | 25 of 1409 radiomics features remained robust.

Radiomics Related to Prostate Cancer
Many machine learning algorithms have been used to classify prostate lesions on MRI. The classification tasks include malignant versus (vs.) benign, csPCa vs. clinically insignificant prostate cancer (ciPCa), multi-class invasiveness (aggressive, indolent, and indeterminate), and GS groups. In [102], texture features derived from T2W images and ADC maps were combined in a support vector machine to separate low- from high-aggressiveness PCa, yielding an AUC of 0.96 compared with 0.55 for the ADC map alone. In [102], a fully automatic computer-aided diagnosis system was developed that correctly identifies patients with invasive PCa, removes the need for manual segmentation, and can analyze data sets from multiple centers. In [103], the authors presented CorrSigNIA, a model that combines radiomics and pathology to differentiate indolent from aggressive cancers on MRI, which achieved an accuracy of 80%. Another study aimed to predict GS and established a radiomics model using T2WI, ADC, and diffusion kurtosis imaging (DKI) sequences. The radiomic model, which used imaging features together with lesion size and PI-RADS score, predicted PCa with GS ≥ 8 [70]. A radiomics model based on MRI images could also distinguish csPCa from ciPCa [71]. In addition, a radiomics model using DCE-MRI sequences with logistic regression showed feasible diagnostic performance in predicting the aggressiveness of PCa [72]. Compared with T2WI and DWI sequences, prostate DCE-MRI displays the tumor boundary better, which benefits segmentation of the ROI. However, that study considered only the radiomics features of the DCE-MRI sequence, and future studies need to combine it with other sequences to improve the diagnostic performance of the radiomics model [72]. Chaddad et al. proposed a new radiomic signature based on the joint intensity matrix (JIM) to predict the Gleason score (GS) of PCa patients. The predictive model achieved AUC values of 78.40% for GS ≤ 6, 82.35% for GS = 3 + 4, and 64.76% for GS ≥ 4 + 3 [55]. In another study, texture features used with a random forest model achieved average AUC values of 83.40%, 72.71%, and 77.35% for predicting GS = 6; 6 < GS < 3 + 4 and GS ≥ 4 + 3, respectively [54]. Performance in predicting the GS improves significantly when the imaging features are extracted from CNN layers, known as deep radiomic features [56]. Despite these approaches, more investigation is needed to consider all MRI sequences with AI models in monitoring patients with PCa. To narrow the gap between academic research on AI in PCa and clinical diagnosis support, and to improve the interpretability of AI models, it is suggested to address the shortage of labeled data, carry out further development and validation through multi-reader studies and prospective evaluation, and formulate and refine standard evaluation criteria [104].
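The AUC values quoted above summarize how well a model's scores rank positive cases above negative ones. As a minimal sketch (not the method of any cited study), the AUC can be computed from binary labels and scores with the pairwise, Mann-Whitney style formulation. The function name `roc_auc` and the toy labels and scores are illustrative assumptions.

```python
def roc_auc(labels, scores):
    """Return the AUC as the fraction of (positive, negative) pairs in which
    the positive case has the higher score; ties count as one half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores for six lesions (1 = high grade, 0 = low grade).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)  # 8 of 9 pairs are ranked correctly
```

An AUC of 0.5 corresponds to chance-level ranking, which is why a value such as 0.55 indicates almost no discriminative ability.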

Radiogenomics in Prostate Cancer
Advances in gene expression profiling have strongly promoted the rapid development of genomics. By combining imaging and genomics data, radiogenomics provides a more accurate method for diagnosis and for avoiding overtreatment of low-risk tumors [56]. Specifically, radiogenomics may use imaging features to predict (or combine with) the status of genes and guide the diagnosis, treatment, and prognosis of PCa [105]. For example, the combination of mpMRI and gene expression data can detect the radiomic signature of PCa. Because PCa is susceptible to gene mutations, many genes are included in gene testing guidelines to assess the risk of PCa and provide guidance for targeted personalized therapy. Common genes used as biomarkers include the breast cancer (BRCA) genes, the E-twenty-six (ETS)-related gene (ERG), hypoxia genes, and the ATM gene. Identification of BRCA mutations can be used for PCa screening strategies; BRCA1 and BRCA2 are key genes associated with PCa susceptibility and are related to hereditary breast and ovarian cancer syndrome [106]. The TMPRSS2-ERG fusion joins the androgen receptor-regulated transmembrane protease serine 2 (TMPRSS2) gene to the ERG proto-oncogene. Hypoxia is an essential feature of the tumor microenvironment, which affects the treatment and prognosis of PCa. Hypoxic gene signatures are usually based on gene expression responses in cell lines exposed to hypoxia. In [107], a risk marker constructed from two hypoxia- and immune-related genes, ISG15 and ZFP36, showed significant PCa prediction ability and was helpful for the prognosis of PCa. Table 5 presents recent radiogenomic studies of PCa and their findings. In [108], PTEN and ERG were found to be correlated with PCa visibility on MRI. In clinical trials, prophylactic PCa resection is the primary prevention choice for BRCA2 carriers [109]. Detecting BRCA gene mutations in PCa patients helps guide treatment and further genetic testing [110]. 
Hyperpolarized (HP) 13C-MRI can distinguish indolent from aggressive PCa based on unique metabolic features [111]. Tumor visibility on mpMRI increased when tumor evolution produced numerous protein groups that differ from normal tissue [112]. The Ragnum signature has been further developed as a biopsy-derived hypoxia biomarker for PCa [113]. The combination of SelectMDx and PI-RADS is more sensitive in detecting PCa and may avoid unnecessary biopsy [114]. Early gene mutation detection, including BRCA1/2, can improve the survival rate of patients [115]. Furthermore, RNA sequencing of benign biopsies revealed the upregulation of NKX3-1 and HOXB13 in the absence of T cells, which may help identify a higher risk of PCa [116]. Hyperlipidemia is associated with invasive features of PCa without TMPRSS2-ERG fusion or PTEN deletion/mutation [117]. Eleven miRNAs were identified as sensitive biomarkers for early detection of clinically significant PCa [118]. Additionally, recent research has found that ANGPTL4, VEGFA, and P4HA1 (hypoxia-related genes) are related to PCa texture features [119]. In [120], Fischer et al. identified four biomarkers, belonging to genes and miRNAs that play important roles in PCa, which can differentiate between T2c and T3b stages. Benafif et al. [121] demonstrated the feasibility of using germline SNPs in targeted PCa population screening in the UK community through the BARCODE1 study. Together, these studies demonstrate the importance of radiogenomic research in understanding PCa and identifying potential biomarkers for early detection, risk assessment, and treatment guidance. Furthermore, genomic measurements are typically assessed on a small tumor sample and therefore reflect only one aspect of tumor heterogeneity. With its ability to characterize tumor heterogeneity, radiogenomics offers a personalized approach to risk stratification in patients with PCa [122]. It can also guide clinical treatment strategies based on individual clinical risk factors. 
For example, one personalized method of PCa risk calculation includes patients' clinical data, consisting of PSA levels and prostate cancer antigen 3 (PCA3) and TMPRSS2-ERG (T2:ERG) expression [105]. Because medical datasets are limited, a short-term solution is transfer learning or data augmentation, and a long-term solution is multi-institutional data, facilitated by the development of online databases [123]. Personalized treatment requires sequencing a patient's genome, transcriptome, or proteome [124]. Using genome sequencing to classify cancers and identify patients with actionable targets may help clinicians make more accurate treatment decisions. Targeted sequencing is currently used to detect genetic changes. The development of next-generation sequencing (NGS) technology is a further major advance. It helps record unique genetic alterations, enabling the generation of large datasets of genomic, transcriptomic, and/or epigenetic features of tumor cells. DNA or RNA sequencing can help detect changes in gene expression and gene mutations in cancer. RNA sequencing can help identify new long non-coding RNAs and gene fusions in PCa. DNA sequencing has become more sensitive and scalable with the help of NGS. Genome-wide association studies (GWAS) generate large amounts of genomic data and link these data to cancers such as PCa. Thus, integrating genomic and radiomic data helps to understand their correlation. In [125], the web-based platform ImaGene takes oncology and imaging data sets as input, analyzes the correlation between them, and builds an AI model. Although radiogenomics improves model performance by combining genomic and imaging data, data heterogeneity, arising mainly from inconsistencies between radiomic and genomic data sources, remains a challenge.
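A common first step in relating imaging to genomic data is a rank correlation between one radiomic feature and one gene expression value across patients. The sketch below computes Spearman's correlation in plain Python. The feature name, the expression values, and the helper names are hypothetical, and the code illustrates the idea only; it is not the ImaGene pipeline [125].

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-patient values: a texture feature and one gene's expression.
texture_entropy = [2.1, 3.4, 1.8, 4.0, 2.9]
gene_expression = [5.0, 7.2, 4.1, 9.5, 6.3]
rho = spearman(texture_entropy, gene_expression)
```

In practice many feature-gene pairs are tested at once, so the resulting p-values would need a multiple-testing correction before any pair is reported as associated.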

Multi-Omics for PCa
Omics is the comprehensive and quantitative analysis of molecular classes in biological samples. It includes genomics, epigenomics, transcriptomics, proteomics, and metabolomics analyses. Omics studies a medical problem holistically from a biological point of view, to better achieve a predetermined clinical effect through a single model or a specific feature. It can be used to understand and define changes in biomolecules as complex diseases develop and change. By analyzing these complex biological macromolecules, scientists can search for associations between organisms and construct accurate disease biomarkers. Multi-omics combines these different types of omics data to determine the universal disease-pheno-envirotype relationship or association. Gene expression signatures are the gold standard for guiding clinical decision-making, but some questions remain about their clinical utility and interpretability. In 2003, the Human Genome Project was completed, and the information contained in the DNA sequence was deciphered [126]. Since then, omics data associated with the genome, transcriptome, proteome, epigenome, and metabolome have rapidly increased. Furthermore, as the technology matures and costs decrease, the likelihood of using omics data to guide clinical practice increases.
Epigenomics studies genome modifications that affect gene expression without altering the DNA sequence. Epigenetic regulatory mechanisms controlling gene expression in PCa mainly include DNA methylation and histone post-translational modifications. DNA methylation is predominantly seen at CpG dinucleotides and leads to gene silencing [127]. Histone post-translational modifications can enhance or attenuate gene expression [128]. These studies facilitate the discovery of new biomarkers or new targeted drugs. In contrast, transcriptomics studies gene expression at the RNA level. Gene signatures of the PCa cell lines LNCaP and VCaP with pre-existing or treatment-induced resistance have been established using single-cell sequencing [129]. In addition, a single-cell transcriptomic study identified a population of luminal cells with progenitor functions as a possible contributor to prostate carcinogenesis [130].
Proteomics is the large-scale study of proteins, including their expression levels, post-translational modifications, and protein-protein interactions. It provides knowledge about disease occurrence at the protein level. Proteomics can also discover new molecular biomarkers with high clinical potential, especially for routine monitoring, because their expression can reflect disease activity in real time [131].
Metabolomics quantifies metabolites in an organism and relates them to physiopathological changes. Analytical techniques are mainly based on nuclear magnetic resonance spectroscopy and mass spectrometry. For example, metabolomics has led to a renewed focus on urine as a valuable biomarker source because PCa cells or their substances can be found in prostate fluid. This makes it possible to detect PCa in urine samples [132]. Moreover, metabolomics studies can lead to a better understanding of disease pathogenesis and therefore better interventions [133]. For example, 26 metabolites were significantly altered in PCa tissues, indicating dysregulation of 13 metabolic pathways associated with PCa development. The most affected metabolic pathways were amino acid metabolism, nicotinate and nicotinamide metabolism, purine metabolism, and glycerophospholipid metabolism [134].
Compared with single-omics analyses, multi-omics studies can better describe cancer progress [135], give a more comprehensive view of the factors leading to pathological changes [136], support the development of new biomarkers, and improve clinical management of patients [137,138]. Despite advances in multi-omics analysis, research combining radiomics with multi-omics is still limited. More investigation in this direction may reveal more biomarkers of PCa. Table 6 lists the recent literature on multi-omics in PCa, including the specific type of omics, research objectives, and experimental results. The reported results of multi-omics studies are superior to those of single-omics studies, although the scope of multi-omics is broad and the specific methods differ considerably. For example, multimodal molecular analysis based on cell network biology provides robust prognostic biomarkers to detect and identify high-level diseases [139]. In another study, multi-omics analysis integrating genomics, methylomics, and transcriptomics was used to assess the risk correlation between DNA methylation and PCa [140]. Therefore, we suggest explaining the incidence and prognosis of PCa from multi-omics dimensions.
Table 6. Summary of recent multi-omics studies.
[reference not recoverable] Result: The DL-based model was proven robust by external validation.
[reference not recoverable] Result: Signature 1 is significantly prognostic in the high-Gleason risk group, and signature 2 is significantly prognostic in the low-Gleason group.
[142] Omics: mRNA, microRNA, long noncoding RNA, DNA methylation, and somatic mutation. Objective: Accurately identify specific molecular signatures and judge potential clinical outcomes from a multi-omics perspective. Result: Identified three clusters independently of ten multi-omics integrative clustering algorithms.
[143] Omics: Untargeted RNA sequencing, proteomics, and metabolomics. Objective: Test the feasibility of applying a multi-omics approach on an in vivo panel of paired HN and CRPC tumor models. Result: Metabolomics identifies increased N-acetyl aspartate (NAA) and N-acetyl aspartyl glutamate (NAAG) in all three models of CRPC.
[144] Omics: Somatic mutations, somatic copy number alterations (SCNAs), DNA methylation, and mRNA expression. Objective: Provide a comprehensive evaluation of GPCR expression in primary PCa. Result: GPCRs exhibit low expression levels and mutation frequencies, which should contribute to the focus on GPCRs in oncology.
[145] Omics: mRNA, miRNA, methylation, CNA, and SNV. Objective: Perform a multi-omics analysis to identify immune genes associated with PCa. Result: The data point toward a role for LILRB molecules, especially LILRB1, and suggest that these receptors could play a role in the resistance of PCa to antitumor immune response.
[140] Omics: DNA methylation. Objective: Build genetic models to predict methylation and perform association analysis with PCa risk. Result: 759 CpG sites were identified whose predicted DNA methylation levels correlated after Bonferroni correction.
[146] Omics: m6A methylation. Objective: Determine gene expression, DNA methylation status, and CNVs for each putative m6A regulator. Result: Of 27 genes, 18 showed significant differential expression between normal and PCa samples.
[147] Omics: mRNA, miRNA, lncRNA, DNA methylation, and gene mutation. Objective: Identify and judge potential clinical outcomes based on multi-omics data. Result: When the number of clusters is 3, the scores of the two methods are closer.
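The CpG-site result listed for [140] in Table 6 relies on Bonferroni correction, which controls the family-wise error rate by dividing the significance level by the number of tests performed. A minimal sketch follows; the p-values and the function name are made up for illustration and are not taken from that study.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Return the indices of tests that remain significant after Bonferroni
    correction, i.e. those with p <= alpha / number_of_tests."""
    if not p_values:
        return []
    threshold = alpha / len(p_values)
    return [i for i, p in enumerate(p_values) if p <= threshold]


# Hypothetical p-values from five association tests.
toy_p_values = [0.0004, 0.03, 0.012, 0.0001, 0.2]
kept = bonferroni_significant(toy_p_values)  # per-test threshold is 0.05 / 5
```

With five tests the per-test threshold is 0.01, so p-values of 0.03 and 0.012, which would pass an uncorrected 0.05 cutoff, are no longer counted as significant.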

Future Perspective and Limitations
Radiomics for PCa is gaining increasing attention as a research direction. While MRI is the primary modality used in current radiomics studies of PCa due to its broad clinical application, numerous challenges remain for future research and application.
Firstly, the majority of current radiomics studies on PCa are single-center, retrospective studies with small sample sizes, which can limit the accuracy of the research results. Therefore, there is a need for multi-center, prospective studies with larger sample sizes to further validate the research findings.
Secondly, DCE-MRI sequences are commonly included in clinical prostate MRI scans, but most current radiomics studies of PCa do not incorporate these sequences. It is suggested to include DCE-MRI sequences to improve the efficiency of image data utilization.
Thirdly, while manual segmentation is currently the primary method used for delineating the region of interest, automatic segmentation algorithms for the prostate could be improved. This is significant in the clinical practice of oncology, as automatic segmentation can enhance the accuracy of biopsy positioning and allow for more precise and repeatable evaluation of metastatic lesions [148].
Finally, since most prostate lesions have low malignancy, prostatectomy is not typically performed, and the diagnosis of suspicious lesions relies heavily on pathological examination. However, inaccurate pathological results from a needle biopsy can negatively impact the diagnostic performance of radiomics models, which rely on pathological findings.
Early detection of most cases of PCa is highly challenging. Currently, the primary means of diagnosing suspected PCa is through pathological examination. However, this process relies heavily on needle biopsy, which carries a risk of missed or incorrect diagnoses, leading to inaccurate results. These errors in pathology directly impact the diagnostic accuracy of radiomics. Therefore, enhancing the precision of pathological examination can help improve the performance of radiomics models.
As is widely acknowledged, the approach to treating tumors depends on a range of factors, including the tumor's pathological type, disease stage, patient condition, cytogenetic changes, and other considerations. The efficacy of treatment can vary from patient to patient, and before administering genomic targeted therapy, a gene test is typically required. Additional assessments, such as radiomics, may also be necessary during treatment to monitor the development of drug resistance. It is noted that gene testing remains a costly, invasive, and time-consuming procedure [149], whereas radiomics is a relatively inexpensive and non-invasive alternative. As a result, genomic targeting may not be a viable treatment option for all patients with PCa.
Radiogenomics associates imaging data with genome maps, but the availability of these data is affected by the databases (e.g., TCGA, TCIA) and the heterogeneity of tumors. It also requires standardization of imaging and biochemical analysis techniques to identify stable and repeatable biomarkers. Obtaining reliable results requires large cohorts and extensive biological sample collection to ensure stability.
Advances in omics, such as genomics, transcriptomics, proteomics, and metabolomics, have begun to enable personalized medicine at a highly detailed molecular level. However, a single omics type cannot capture the entire biological complexity of most human diseases. Integrating multiple omics features (radiomics + radiogenomics + omics) provides a more comprehensive view of biology and disease [135]. In addition, among the studies that performed biomarker validation, few used independent sample cohorts to exclude false positives caused by sample collection and processing. Moreover, the discovery cohorts were small, and standardized methods for sample collection and processing, data acquisition, and bioinformatics analysis are still needed. Finally, few studies shared the same biomarker candidates [131]. These challenges call for large-scale investigation and a collaborative way of sharing findings between federated hospital systems.
AI techniques rely increasingly on large and suitable datasets. It is important to note that data sets have varying feature distributions, and differences arise across techniques when multiple data sources are used. Therefore, identifying and preprocessing appropriate data can result in more valuable research outcomes. Table 7 lists the most common public data sets that can be used for PCa studies. They include MRI images with benign and malignant labels acquired on different types of scanners; manual labels distinguishing csPCa from ciPCa; clinical variables covering examination, diagnosis, and treatment, including PSA and other biochemical data; and microscopic scans of prostate biopsy samples with imperfect labels and large images. In the past, open-source datasets were typically constructed to meet specific research needs, which may not align with current research requirements. To better serve a wider range of communities, it is preferable to provide clean data in multiple formats. However, current datasets face several challenges, such as low data reading speeds, the presence of multiple data types, and complex data processing requirements. In the near future, researchers are likely to adopt a responsible approach to data collection and annotation, as well as data set maintenance and problem formulation, in order to mitigate these challenges [162].
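When features come from several scanners or institutions, one simple preprocessing step is to standardize each feature within each site before pooling the data. The sketch below shows per-site z-scoring in plain Python. It is an illustration under stated assumptions: the site labels and values are hypothetical, the method assumes that between-site differences are purely technical, and it is not a harmonization procedure taken from the datasets in Table 7.

```python
from statistics import mean, pstdev


def zscore_by_site(values, sites):
    """Standardize one feature within each acquisition site.

    values: feature values, one per case.
    sites:  parallel list of site labels, one per case.
    Returns z-scores in the original case order.
    """
    if len(values) != len(sites):
        raise ValueError("values and sites must have the same length")
    grouped = {}
    for v, s in zip(values, sites):
        grouped.setdefault(s, []).append(v)
    # Per-site mean and population standard deviation.
    stats = {s: (mean(vs), pstdev(vs)) for s, vs in grouped.items()}
    standardized = []
    for v, s in zip(values, sites):
        mu, sd = stats[s]
        # A constant feature within a site carries no spread; map it to 0.
        standardized.append(0.0 if sd == 0 else (v - mu) / sd)
    return standardized


# Hypothetical feature measured at two sites with very different scales.
toy_values = [10.0, 12.0, 14.0, 100.0, 120.0, 140.0]
toy_sites = ["A", "A", "A", "B", "B", "B"]
standardized = zscore_by_site(toy_values, toy_sites)
```

After this step the two sites share the same scale, at the cost of also removing any genuine between-site difference in case mix, which is why the assumption above matters.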
The translation of radiomics models constructed from medical images into clinical applications faces challenges in terms of interpretability. Specifically, there is a lack of transparent explanation of the relationship between selected features and clinical outcomes. To ensure interpretability and assist clinicians in making clinical decisions, it is important to understand the decision-making process behind the radiomics pipeline thoroughly, especially before incorporating AI techniques such as DL methods. Without adequate interpretability, the decision-making process and possible biases are not well accounted for, which limits the clinical utility of radiomics features and models. The General Data Protection Regulation (GDPR) in the European Union requires an explanation of an algorithm's decision-making process, and data subjects are entitled to meaningful information about the logic involved [163]. Explainable Artificial Intelligence (XAI) can help to interpret the information behind the "black box" model, showing transparently how the decision was made and thus enhancing the credibility of the model. Different XAI techniques, such as class activation map (CAM), local interpretable model-agnostic explanations (LIME), Shapley additive explanations (SHAP), gradient-weighted class activation mapping (Grad-CAM), attention, and saliency, can be used to improve algorithm performance [90,164]. The explanation forms generated by XAI can be categorized as feature-based, text-based, and example-based, and they improve the credibility of AI at different levels. Interpretable methods have been associated with various tasks in radiomics, including image segmentation, lesion and organ detection, image registration, computer-aided diagnosis and staging, prognosis, radiotherapy planning, disease progression monitoring, classification, and image reconstruction [165]. 
In one study, multi-modal volumetric concept activation was used to provide an explanation, which showed that the detection was mainly based on the location of metastatic PCa in CT anatomy, and the reliability of PET detection was high [166]. In another study, a model that fused multiple DL methods was used to examine PCa on MRI images, and XAI then explained how the model differentiated benign from malignant PCa [167]. Many other radiomics, AI, and radiogenomics works could also be discussed; this study collects the most common models that are used for PCa analysis.
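To make the idea behind Shapley additive explanations concrete, the sketch below computes exact Shapley values for a toy scoring function over three features. The model, the feature names, and the baseline and input values are hypothetical. Practical SHAP tools rely on approximations for larger models, and this is not the implementation used in the cited studies.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values for one prediction.

    Each feature's value is its average marginal contribution over all
    subsets of the other features. Features outside a subset are set to
    their baseline value.
    """
    names = list(instance)
    n = len(names)

    def value(subset):
        x = {f: (instance[f] if f in subset else baseline[f]) for f in names}
        return model(x)

    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for k in range(n):  # subset sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_risk_score(x):
    # Hypothetical model: two linear terms plus one interaction term.
    return (2.0 * x["texture_entropy"]
            + 1.0 * x["lesion_volume"]
            + 0.5 * x["texture_entropy"] * x["psa"])


instance = {"texture_entropy": 1.0, "lesion_volume": 2.0, "psa": 4.0}
baseline = {"texture_entropy": 0.0, "lesion_volume": 0.0, "psa": 0.0}
phi = shapley_values(toy_risk_score, instance, baseline)
```

The attributions sum to the difference between the prediction for the instance and the prediction for the baseline, and the interaction term is split equally between the two features involved. The cost grows exponentially with the number of features, which is why exact computation is only feasible for small feature sets.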
Radiomics analysis based on mpMRI can not only improve the detection rate of PCa but also predict prognosis; its texture features can reflect the heterogeneity of lesions. After radiation therapy, radiomics during follow-up can be used to evaluate the efficacy of treatment and tumor recurrence. Comparing the radiomic results before and after treatment allows tumor shrinkage, tissue recovery, and the presence of residual or new lesions to be evaluated, which helps determine whether further treatment is needed and thus improves the survival rate. The combination of mpMRI and Prostate Health Index (PHI) in radiomics may help to better estimate the risk categories of prostate cancer at the initial diagnosis, thus achieving personalized treatment methods [168]. We note that the prediction of cancer prognosis is based on statistical data and models, which provide a probability estimate rather than an absolute prediction. Each patient's cancer is unique in its pathological features, the patient's health status, and personal factors, all of which may affect prognosis. Therefore, a predicted prognosis should serve as a reference for decision-making rather than its only basis. The final treatment decision should consider multiple factors comprehensively and be personalized to individual circumstances.
As PCa is increasingly being diagnosed at an early stage, with excellent survival rates, the rationale for patients' primary treatment selection has switched to health-related quality of life (HRQOL). Using mpMRI to detect suspicious PCa before biopsy reduces the number of unnecessary biopsies and avoids the risk of overdiagnosis and overtreatment. At the same time, combining systematic and fusion-targeted biopsy can further improve the detection rate of PCa and reduce the risk of missing csPCa [169]. Studies usually conduct follow-ups or send questionnaires (e.g., the Expanded Prostate Cancer Index Composite (EPIC) questionnaire and the Short-Form 12 Item Health (SF-12)) at baseline and at 3, 6, 12, and 24 months after treatment to collect patient-reported QOL outcomes. The EPIC complements existing instruments by measuring a broad range of urinary, bowel, sexual, and hormonal symptoms, allowing for a more comprehensive assessment of important HRQOL issues in contemporary PCa management [170]. In [171], the authors examined a prospective serial cohort of low-dose-rate (LDR) brachytherapy for PCa using MRI and explored factors associated with toxicity and QOL, as assessed by EPIC and the International Prostate Symptom Score (IPSS). In [172], a prospective phase II clinical study was developed to evaluate outcomes in patients treated with MRI-guided whole-gland prostate high-dose-rate brachytherapy (HDR-BT) augmentation, with an assessment of toxicity and HRQOL outcomes. In [173], the HRQOL of early PCa patients who did not receive hormone therapy within 3 years after radiotherapy was examined using the 15D instrument and the FACT-P questionnaire, and the HRQOL was the same in the radiotherapy group and the age-standardized general male population. 
The treatment of PCa is mainly curative in intent, but the treatment options are usually accompanied by a high rate of urinary problems and/or erectile dysfunction, significant loss of quality of life, and high treatment costs. Curing PCa, resolving possible complications during treatment, and improving quality of life are the common goals of doctors and PCa patients. According to [174], palliative transurethral resection of the prostate (pTURP) combined with intermittent androgen deprivation therapy (ADT) can be used in the treatment of elderly patients with localized PCa to resolve dysuria and improve QOL. The personalized treatment of PCa remains a challenging area that requires further investigation.

Conclusions
This paper presents advances in MRI-based PCa radiomics and discusses the steps and details of the radiomic flow chart. It describes the integration of AI with traditional medical imaging for radiomics to address the limitations and challenges of clinical applications, in line with the development trend of the big data era. Currently, the application of radiomics in PCa extends to almost every stage of patient care, from diagnosis to grading of PCa, and from adjuvant treatment to prediction of prognosis. Radiomics combined with ML methods could diagnose PCa relatively objectively and predict patients' treatment response, which is in line with the concept of precision medicine and personalized treatment. Related studies now combine PCa radiomics and genomics to form radiogenomics. Imaging genomics is expected to become a valuable method for detecting PCa genotypes and a tool to assist in the diagnosis and treatment of PCa. With the further development of AI and the improvement of radiomics technology, radiomics will play an increasingly important role in more fields of PCa, with good application prospects. Future research must improve the generalizability and quality of radiomics models with larger multi-institutional data to complete their translation into clinical applications.