Recent Radiomics Advancements in Breast Cancer: Lessons and Pitfalls for the Next Future

Radiomics is an emerging translational field of medicine based on the extraction of high-dimensional data from radiological images, with the aim of building reliable models that can be applied in clinical practice for diagnosis, prognosis and evaluation of disease response to treatment. We aim to provide basic information on radiomics to radiologists and clinicians who are focused on breast cancer care, encouraging cooperation with scientists to mine data for better application in clinical practice. We investigate the workflow and clinical application of radiomics in breast cancer care, as well as the outlook and challenges based on recent studies. Currently, radiomics has the potential to distinguish between benign and malignant breast lesions and to predict breast cancer's molecular subtypes, the response to neoadjuvant chemotherapy and lymph node metastases. Even though radiomics has been used in tumor diagnosis and prognosis, it is still in the research phase, and some challenges must be addressed before clinical translation. In this review, we discuss the current limitations and promises of radiomics to guide further research.


Introduction
In the last few years, the inclusion of standard digital imaging among the possible sources of big data for precision medicine has represented one of the new frontiers of research. In particular, radiomics, the "omic" field related to diagnostic imaging, has been viewed as a great opportunity for several medical fields, yielding the most interesting results in oncology. Radiomic tumor analysis, including intra- and inter-tumor heterogeneity, tumoral micro-environment and infiltrating cells, aims to extract quantitative features from medical imaging that are potentially beyond the perception of the human eye, in order to uncover novel features that are associated with treatment outcomes, disease molecular expressions or patient survival. Radiomics, by providing quantitative data expressing different tumor properties, has gained recognition as a new tool in the field of cancer care for non-invasive profiling of breast cancer (BC) [1,3,7,8].
Particularly, the ultimate purpose of radiomics applied in BC care should be early diagnosis of BC and prediction of its clinical course and biological aggressiveness in order to optimize treatment [23].
The imaging evaluation of BC through mammography, ultrasound (US) or magnetic resonance imaging (MRI) is currently essentially qualitative. It relies on subjective evaluations such as tumor morphology/structure, type of enhancement and anatomic relationship to the surrounding tissues. However, truly personalized medicine also demands a quantitative evaluation [3]. Data derived from radiomics investigation, such as intensity, shape and texture-related features and wavelet-related transforms, may provide valuable information to differentiate benign from malignant lesions, to predict treatment response, to assess the cancer molecular profile and to derive robust models that combine multidisciplinary information [24][25][26][27][28][29][30][31].

The Workflow of a Radiomic Study
Most radiomics studies concern applications in oncology, and the first step is generally to acquire appropriate images. The Quantitative Imaging Biomarker Alliance and the Quantitative Imaging Network have defined standardized imaging protocols and recommendations in the field of quantitative imaging [32] to improve the reproducibility of radiomics studies; poor reproducibility remains one of the biggest drawbacks currently limiting their clinical application.
Radiomics features are generally extracted from routine medical images and encode information about a region of interest (ROI), which is specified to limit the spatial extent of the analysis. The ROI can be delineated manually, semi-automatically or automatically; textural features extracted with automatic segmentation algorithms are more reproducible than those obtained with free-hand region delineation [33]. Features are extracted from the ROIs using specific algorithms and are thus objective imaging features, with standard mathematical definitions available for the most common ones [17].
An example of MRI-based radiomics workflow for features extraction is shown in Figure 1.
The features can be broadly classified into four categories: morphological, histogram-based, textural (for example, those related to the gray-level co-occurrence matrix) and transform-based features [32]. Morphological features describe different aspects of the lesion shape, such as volume, surface area, convexity or border heterogeneity. Histogram-based features characterize the histogram of voxel intensities, including the average value, standard deviation and parameters related to the histogram shape such as skewness and kurtosis. Textural features focus on the spatial arrangement of voxel intensities, trying to capture different properties of their distribution in terms of heterogeneity, randomness, presence of clusters or privileged signal directions. All these features can be calculated from the images as they are, or after applying mathematical transforms, such as wavelet or Laplacian of Gaussian (LoG) filters, resulting in the so-called transform-based features. While hundreds or thousands of features may be computed, only a small selection of more specific features is required to compute a clinically useful radiomic signature. Features whose value is not stable when images are repeatedly acquired under the same experimental conditions (referred to as unstable or non-repeatable features) should be identified a priori, by means of phantom studies or, if feasible, test-retest acquisitions in the clinical setting, and eliminated [34]. Usually, a big gap remains between the number of features extracted (p) within a study and the number of patients actually recruited (n), commonly leading to p>>n. This carries the risk of building radiomic models with high predictive accuracy in the experimental dataset but extremely poor generalizability, because they model dataset "noise" rather than the true biological behavior.
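The histogram-based and textural feature families described above can be illustrated with a minimal, pure-Python sketch. It is not the implementation used by any of the cited studies: the function names (`first_order_features`, `glcm_features`), the number of bins and the single co-occurrence offset (one pixel to the right) are assumptions chosen for brevity, and real pipelines typically rely on dedicated, standardized software.

```python
import math
from collections import Counter


def first_order_features(voxels, n_bins=8):
    """Histogram-based features of the voxel intensities inside a ROI."""
    n = len(voxels)
    mean = sum(voxels) / n
    var = sum((v - mean) ** 2 for v in voxels) / n  # population variance
    m3 = sum((v - mean) ** 3 for v in voxels) / n   # third central moment
    m4 = sum((v - mean) ** 4 for v in voxels) / n   # fourth central moment
    skewness = m3 / var ** 1.5 if var > 0 else 0.0
    kurtosis = m4 / var ** 2 if var > 0 else 0.0
    # Entropy of the intensity histogram after discretization into n_bins bins.
    lo, hi = min(voxels), max(voxels)
    width = (hi - lo) / n_bins or 1.0
    counts = Counter(min(int((v - lo) / width), n_bins - 1) for v in voxels)
    entropy = sum(-(c / n) * math.log2(c / n) for c in counts.values())
    return {"mean": mean, "sd": math.sqrt(var), "skewness": skewness,
            "kurtosis": kurtosis, "entropy": entropy}


def glcm_features(image, levels):
    """Textural features from a symmetric gray-level co-occurrence matrix.

    `image` is a 2D list of integer gray levels in [0, levels). Only
    horizontally adjacent pixel pairs are counted (an assumed, single offset).
    """
    glcm = [[0.0] * levels for _ in range(levels)]
    for row in image:
        for a, b in zip(row, row[1:]):
            glcm[a][b] += 1
            glcm[b][a] += 1  # count each pair in both directions
    total = sum(map(sum, glcm))
    cells = [(i, j, glcm[i][j] / total)
             for i in range(levels) for j in range(levels)]
    contrast = sum(p * (i - j) ** 2 for i, j, p in cells)
    energy = sum(p ** 2 for _, _, p in cells)
    return {"contrast": contrast, "energy": energy}


if __name__ == "__main__":
    print(first_order_features([1, 2, 3, 4, 5]))
    print(glcm_features([[0, 1, 0], [1, 0, 1]], levels=2))
```

In this sketch a perfectly uniform patch gives zero contrast and maximal energy, whereas a checkerboard-like patch gives higher contrast, which is the kind of heterogeneity that textural features are meant to capture.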
To overcome this problem, feature selection and dimension reduction are of utmost importance, and different approaches can be used, including rigorous algorithms such as principal component analysis, LASSO or Boruta [32]. The desired response variable differs based on the study, and models are built using the selected features to suit specific aims. For classification problems (e.g., benign vs. malignant lesions), various classifiers are used, including support vector machine (SVM), random forest (RF) and XGBoost classifiers. To predict continuous variables, such as the expression of biological markers, various regression methods including linear regression, regularized linear regression and RF are commonly used. For prediction of survival, Cox regression models with or without a LASSO approach are used.
Figure 1. Example of MRI-based radiomics workflow. The first phase is image acquisition (i.e., breast MRI with contrast-enhanced sequences); then (orange arrow) the ROI is segmented manually or with automatic or semi-automatic software; finally (orange arrow) the radiomic features are extracted and selected by algorithms. An example of semi-automatic segmentation by a threshold value method is shown in the three figures below (blue arrows). ROI: region of interest, DCE-MRI: dynamic contrast-enhanced magnetic resonance imaging.
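To make the LASSO-based feature selection mentioned above more concrete, the following is a minimal sketch of LASSO regression solved by coordinate descent, written in pure Python for a continuous response. It is an illustration under stated assumptions rather than the procedure of any cited study: the function names, the penalty value `alpha` and the toy data are hypothetical, and the objective minimized is (1/2n)·||y − Xw||² + alpha·||w||₁ on standardized features. In practice an established library implementation would normally be preferred.

```python
def standardize(columns):
    """Scale each feature column to zero mean and unit (population) variance."""
    out = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
        out.append([(v - mean) / sd if sd > 0 else 0.0 for v in col])
    return out


def soft_threshold(x, t):
    """Shrink x towards zero by t; values within [-t, t] become exactly zero."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def lasso_coordinate_descent(columns, y, alpha, n_iter=200):
    """Return one LASSO weight per feature (columns = p lists of n values)."""
    n = len(y)
    X = standardize(columns)
    y_mean = sum(y) / n
    resid = [v - y_mean for v in y]  # residual when all weights are zero
    w = [0.0] * len(X)
    for _ in range(n_iter):
        for j, col in enumerate(X):
            z = sum(c * c for c in col) / n
            if z == 0:
                continue  # constant feature carries no information
            # Correlation of feature j with the residual excluding its own effect.
            rho = sum(c * (r + c * w[j]) for c, r in zip(col, resid)) / n
            new_w = soft_threshold(rho, alpha) / z
            delta = new_w - w[j]
            if delta != 0.0:
                resid = [r - c * delta for c, r in zip(col, resid)]
                w[j] = new_w
    return w


def selected_features(weights):
    """Indices of features whose LASSO weight was not shrunk to zero."""
    return [j for j, wj in enumerate(weights) if wj != 0.0]


if __name__ == "__main__":
    # Toy data: the response depends strongly on feature 0, weakly on feature 2.
    cols = [[1, 1, -1, -1], [1, -1, 1, -1], [1, -1, -1, 1]]
    y = [3.1, 2.9, -3.1, -2.9]
    print(selected_features(lasso_coordinate_descent(cols, y, alpha=0.5)))
```

The design point is that the L1 penalty sets the weights of weakly informative features exactly to zero, so the surviving features form a compact signature; a larger `alpha` keeps fewer features.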
Most radiomics studies involve a mixture of biomedical imaging techniques related to signal processing and proper AI applications. AI is a broad field of computational techniques that includes machine learning (ML) and deep learning (DL) algorithms, the latter often being "black-box", self-learning neural networks with less dependence on human input in the model-building step [33]. Given the high number of features obtained in radiomics studies and the often non-linear relationships involved, these techniques offer a better approach to clinical predictive modeling than traditional inferential statistics and, if properly applied, can limit model overfitting. Since many radiomics studies focused on BC are limited to single-center data lacking external validation, cross-validation (leave-one-out or k-fold) or bootstrapping can be adopted, splitting the data into training and validation sets [33].
The optimal method of validation nevertheless remains independent validation on an external dataset, which is typically accomplished in multi-center studies. Because acquiring multi-center data is challenging, one solution may be to leverage an open database, such as The Cancer Genome Atlas (TCGA) program, to acquire the external validation data.
As noted above, the reproducibility and standardization of radiomics analysis are currently the biggest issues. This is partly because of the intrinsically high number of steps involved and partly because each step can be performed in several different ways. The retrospective nature of studies, the heterogeneity of software and the variability of the radiomics features that can be extracted in the different studies raise legitimate concerns regarding the potential lack of reproducibility in radiomics. It is good practice to acquire imaging data using standardized settings that are well documented in published papers, so that they can be accurately evaluated during peer review and are available to different research teams working in the same field. Data obtained under such settings should be shared on public repositories in order to receive appropriate external validation.
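The internal validation schemes mentioned above (k-fold, leave-one-out and bootstrapping) differ only in how patient indices are split into training and validation sets. The sketch below shows that splitting logic in pure Python; the function names and the fixed random seeds are assumptions made for illustration, and leave-one-out is obtained as the special case in which k equals the number of patients.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, validation) index lists for k-fold cross-validation.

    Every sample is used for validation exactly once; k == n_samples gives
    leave-one-out cross-validation.
    """
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        validation = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, validation


def bootstrap_indices(n_samples, seed=0):
    """Return (train, out_of_bag) indices for one bootstrap resample.

    The training set is drawn with replacement; samples never drawn form the
    out-of-bag set that can serve for validation.
    """
    rng = random.Random(seed)
    train = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(train)
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return train, out_of_bag


if __name__ == "__main__":
    for train, validation in k_fold_indices(10, k=5):
        print(sorted(validation), "held out;", len(train), "used for training")
```

Whatever scheme is chosen, any data-driven step (such as feature selection) would need to be repeated inside each training split, otherwise information from the validation samples leaks into the model.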

Radiomics Application in Breast Cancer
Even though there are studies about radiomics based on mammography, digital breast tomosynthesis (DBT), US and even PET/CT, in BC imaging radiomics approaches have been investigated mainly with MRI and, in the last few years, with contrast-enhanced spectral mammography (CESM). However, the results of most studies have been derived from relatively pure study designs, with homogeneous patient populations in which MRI was acquired on specific scanner systems and at a single field strength. This currently limits their wider applicability and generalizability.
Tables 1 and 2 summarize what are, in our opinion, the most relevant original studies and reviews, respectively, on radiomics in breast imaging published in peer-reviewed journals from 01/2018 to 01/2021. Results from other interesting studies are briefly discussed only in the text.
In the following sections, studies on the current main applications of radiomics in BC care are discussed.
Table 1. Original studies on radiomics in breast imaging published in peer-reviewed journals from 01/2018 to 01/2021, classified by modality/technique and ordered by newest first. Relevant papers were obtained with a scoping review approach, using the following set of keywords and the related controlled vocabulary terms (MeSH/Emtree): (radiomic* OR textur*) AND (breast) AND (cancer* OR malign* OR neoplas* OR metast* OR tumor* OR tumour*). The same approach was used for Table 2, which includes only review papers. ALN: axillary lymph node, AUC: area under the curve, BC: breast cancer, CESM: contrast-enhanced spectral mammography, DCE: dynamic contrast-enhanced, ML: machine learning, NACT: neoadjuvant chemotherapy, pCR: pathological complete response, SD: standard deviation, TNBC: triple-negative breast cancer, T1WI: T1-weighted imaging, T2WI: T2-weighted imaging, VOI: volume of interest.
Table 1 (surviving entry; Modality/Technique: mammography). The radiomics nomogram, comprising PR status, molecular subtype and radiomics signature, showed excellent calibration and better performance for metastatic ALN detection (AUC 0.883 and 0.863 in the primary and validation cohorts) than each independent clinical feature or the radiomics signature alone. The mammography-based radiomics nomogram could be used as a non-invasive and reliable tool in predicting ALN metastasis.

Discrimination between Benign and Malignant Breast Lesions
The early identification and characterization of BC is essential to improve outcomes in patients because small non-metastatic disease can be effectively treated with curative intent [11,52,53].
To detect a malignant breast lesion, dynamic contrast-enhanced MRI (DCE-MRI) is currently the most accurate imaging technique [54]. Accordingly, most radiomics studies are based on this technique. A recent DCE-MRI-based radiomics study by Zhou et al. [41] (Table 1) used 99 texture and histogram parameters from 133 patients to differentiate between benign and malignant lesions. Their model reached an accuracy of 91% when the smallest bounding box of peritumoral tissue was used in segmentation, showing that including proximal peritumoral tissue provided higher accuracy than segmenting the tumor alone or using larger boxes. Other MRI-based studies considered both DCE and diffusion-weighted imaging (DWI), such as the one by Xie et al. [43] (Table 1). They analyzed features extracted from images of 134 invasive ductal cancers, finding the highest accuracy, 91%, when comparing triple-negative with non-triple-negative cancers. Previously, Jiang et al. [55] showed that combining DCE-MRI and DWI with apparent diffusion coefficient (ADC) values increased the overall accuracy for discriminating malignant from benign breast nodules to 0.90. In 2018, a retrospective study [56] examined unenhanced DWI-based radiomics to predict the malignant nature of suspicious breast lesions detected on screening mammography, showing that, in lesions classified as BI-RADS 4 or 5 by mammography, false-positive findings could be reduced by 70% while retaining 98% sensitivity.
Using mammography, Li et al. [57] analyzed the radiomic features of the breast with a suspicious lesion together with those of the contralateral healthy breast in 182 patients (106 malignant and 76 benign), showing that a combined lesion and parenchyma classifier differentiated malignant from benign mammographic lesions better than the lesion features alone. In 2019, a sub-study of a prospective multi-center study led by Tagliafico et al. [58] applied a radiomics approach to digital breast tomosynthesis for the first time to differentiate normal from malignant breast tissue in patients with dense breasts; although the sample was small (40 patients), the results were encouraging.
Recently, Massafra et al. [59] used CESM to discriminate benign from malignant breast lesions based on radiomic analysis of 53 patients with BC; the random forest classifier gave the best benign/malignant prediction, with median sensitivity and specificity of 88.37% and 100%, respectively.
Finally, there are examples of US-based radiomics for the discrimination of malignant breast lesions, such as the recent study by Luo et al. [60] based on 315 patients with BC, which showed that nomograms combining the radiomics score and the BI-RADS category discriminated benign from malignant lesions better than either the radiomics score or the BI-RADS category alone.

Prediction of Breast Cancer's Molecular Subtypes
Once a breast lesion is diagnosed as malignant, its molecular subtype has to be assessed. In 2017, Fan et al. (Table 1) showed the usability of a simplified and rapid approach for MRI-based tumor decoding and phenotyping of BC in a population of 98 patients. Notably, they evaluated the molecular subtype, hormonal receptor status, Ki67 and HER2 expression, ALN metastasis and grading, considering 13,118 radiomic features extracted with a VOI-based approach. Involvement of the ALN could be predicted with an AUC of 0.80, while ALN metastasis yielded an AUC of 0.71. Receptor status predictions yielded AUCs of 0.67-0.69, Ki67 0.81 and HER2 expression 0.62, which are promising results but not sufficient for radiomics to replace tissue sampling in clinical practice. In 2018, Liang et al. [44] (Table 1) proposed a noninvasive predictor of Ki67 status based on radiomics features extracted from 318 breast MRI examinations. Their customized radiomic score based on T2WI was significantly associated with Ki67 status, suggesting that a new radiomics marker might pre-operatively predict Ki67 expression in patients with BC.

Prediction of Response to Neoadjuvant Chemotherapy
In the last decade, neoadjuvant chemotherapy (NACT) has been increasingly used in the treatment of operable BC; it is associated with a positive response, especially in women with ER-negative BC, and it decreases the rates of recurrence and of BC mortality [64].
The achievement of pathological complete response (pCR) is a powerful prognostic factor for long-term outcome, and it is considered the only currently validated biomarker of survival, but it can only be assessed at surgery [10,65]. Therefore, radiomics may allow non-invasive and earlier detection of resistance to treatment, sparing some patients (namely, the non-responders to NACT) unnecessary toxicity and delayed access to other potentially effective therapies. At the same time, the neoadjuvant setting provides a unique opportunity for in vivo assessment of tumor response, evaluation of biological markers of responsiveness or resistance and the study of intermediate endpoints [66].
In 2020, Choudhery et al. [5] (Table 1) used morphological and 3D textural features to predict the molecular subtype and pCR in 259 women with BC who underwent NACT. Significant differences in minimum signal intensity and entropy were found among the tumor subtypes. Sphericity in HER2+ BCs and entropy in luminal BCs were significantly associated with pCR. Multiple features demonstrated a significant association with pCR and residual tumor burden in TNBC, with SD of intensity achieving the highest AUC for pCR in TNBC.
In 2017, Braman et al. [71] evaluated radiomic features based on both peri- and intratumoral regions on pre-treatment DCE-MRI to predict pCR to NACT in 117 BC patients. Their results showed that peritumoral radiomics contributed to the prediction of pCR in HER2+ BC patients, yielding a maximum AUC of 0.74 within the testing set.
Previously, other authors [22,61,70,72,73] showed that quantitative analyses of radiomic features (morphologic, texture and dynamic features) from pretreatment breast DCE-MRI data in BC patients could be used as valuable image markers associated with pCR to NACT. In the above-mentioned studies, DCE was used more frequently than DWI to extract radiomics features, as it can provide the kinetic characteristics of the contrast agent by producing pharmacokinetic maps.
In a multicenter study, Liu et al. [67] utilized multiple MRI sequences, including DWI, to predict pCR to NACT in patients with BC. In 586 patients, a radiomic score was calculated from 13,950 quantitative MRI features, providing a promising tool for predicting the response of patients with advanced BC and showing potential practical value in clinical practice.
Parikh et al. [74] used unenhanced MRI data evaluating whether changes in MRI textural features could predict the pCR in a small number of patients with BC who underwent NACT. Using histogram-based features, they showed how an increase in T2WI uniformity and a decrease in T2WI entropy after NACT could predict pCR as compared to BC size change.
In addition to MRI, Wang et al. [75] recently developed and validated a CESM-based radiomics nomogram to predict NACT-insensitive BC prior to treatment. In 117 patients, their radiomics nomogram that incorporates 11 radiomics features and 3 independent clinical risk factors, including Ki-67 index, background parenchymal enhancement (BPE) and HER-2 status, showed an encouraging discrimination power with AUCs of 0.877 (95% CI 0.816 to 0.924) and 0.81 (95% CI 0.575 to 0.948) in the training and validation sets, respectively.
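Because most of the studies above report discrimination as an AUC, often with a 95% confidence interval, a small worked sketch of how such figures can be obtained may be helpful. The code below computes the AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, together with sensitivity and specificity at a threshold and a percentile bootstrap interval. The function names, the toy scores and the choice of a percentile bootstrap are assumptions for illustration and do not reproduce the statistical methods of the cited papers.

```python
import random


def auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Labels are 1 (e.g., responder or malignant) or 0; ties count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def bootstrap_auc_ci(scores, labels, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(scores)
    values = []
    while len(values) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both classes to define an AUC
            values.append(auc([scores[i] for i in idx], ys))
    values.sort()
    lower = values[int((1 - level) / 2 * n_boot)]
    upper = values[min(n_boot - 1, int((1 + level) / 2 * n_boot))]
    return lower, upper


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]  # hypothetical model outputs
    labels = [1, 1, 0, 1, 0, 1, 0, 0]
    print(auc(scores, labels))
    print(sensitivity_specificity(scores, labels, threshold=0.5))
    print(bootstrap_auc_ci(scores, labels, n_boot=200))
```

With small validation cohorts the bootstrap interval becomes wide, which is one way to read the broad confidence interval reported for the validation set above.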

Prediction of Lymph Node Metastases
Involvement of ALN is an independent predictor for disease outcomes in patients with BC [76][77][78]. At present, definitive diagnosis is reliant on pathological examination by invasive lymph node tissue sampling from surgery or biopsy. This is because imaging assessment based on nodal size measurement and/or morphological criteria has limited accuracy, and apical nodes in the axilla are poorly visualized by US at the time of diagnosis [79,80].
In some studies [6,81,82,83], prediction using a combination of radiomics features and clinical risk factors led to further improvement in the identification of nodal status. In particular, Dong et al. [82] showed that radiomics features extracted from DWI sequences were more highly correlated with ALN metastases than those extracted from ADC maps. Moreover, some studies have investigated radiomics nomograms based on mammography [86], CESM [36], US [87] and even CT [88] to predict axillary lymph node metastases preoperatively. Once again, the authors built the radiomic score from a very large number of radiomic features and then incorporated additional radiological and clinicopathological findings.

What Next?
As more research is conducted, the body of published literature is rapidly growing. In the near future, radiomics may become the standard in MRI-based tumor assessment, with AI algorithms fully capable of carrying out complex data analysis under the precise guidance of radiologists.
Currently, radiomics is an appealing research technique, but it has not yet been fully applied in the clinical setting: multidisciplinary and translational studies are still required to gather the amount of data needed to implement radiomics on a wide scale. Because it is difficult to acquire consistent images and obtain uniform results that can be applied in clinical practice [89] (owing to acquisition on different machines, varied technical parameters and slice thickness, and diverse reconstruction algorithms), techniques to deal with multicentric data, such as ComBat [90], have been proposed. Moreover, radiomics models based on transparent methodologies and developed using standardized acquisition techniques and high-quality data can overcome confounders arising from differences among centers' workflows. Finally, the sample size of radiomics analyses is another crucial issue in predictive models: larger samples can increase prognostic accuracy and, at the same time, make it possible to use AI techniques and DL algorithms with more robust results. However, the samples of the above-mentioned studies are not large enough, and these models should be validated in further research.
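The idea behind harmonization of multicentric data can be sketched with a deliberately simplified location-scale adjustment of a single feature. This is not the ComBat algorithm cited above: it omits ComBat's empirical Bayes estimation and its preservation of biological covariates, and the function names and example values are hypothetical. It only illustrates the basic step of removing center-specific shifts and scalings so that every center shares a common mean and spread.

```python
import math


def mean_sd(values):
    """Mean and population standard deviation of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    return mean, math.sqrt(sum((v - mean) ** 2 for v in values) / n)


def harmonize_location_scale(values, centers):
    """Align one radiomic feature across centers (simplified, not ComBat).

    Each value is standardized within its own center and then mapped to the
    grand mean and the pooled within-center standard deviation.
    """
    grand_mean = sum(values) / len(values)
    groups = {}
    for v, c in zip(values, centers):
        groups.setdefault(c, []).append(v)
    stats = {c: mean_sd(vs) for c, vs in groups.items()}
    # Pooled within-center variance: squared deviations from each center mean.
    within = sum((v - stats[c][0]) ** 2 for v, c in zip(values, centers))
    pooled_sd = math.sqrt(within / len(values))
    harmonized = []
    for v, c in zip(values, centers):
        center_mean, center_sd = stats[c]
        z = (v - center_mean) / center_sd if center_sd > 0 else 0.0
        harmonized.append(grand_mean + pooled_sd * z)
    return harmonized


if __name__ == "__main__":
    # Hypothetical feature values from two scanners with different offsets.
    values = [10, 12, 14, 20, 24, 28]
    centers = ["A", "A", "A", "B", "B", "B"]
    print(harmonize_location_scale(values, centers))
```

A caveat of such a naive adjustment is that it would also remove genuine differences between centers, for example when one center treats more advanced disease, which is why dedicated methods model clinical covariates explicitly.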

Role of Artificial Intelligence and Big Data in Radiomics
The possibilities of using texture analysis and other advanced approaches such as ML and DL in radiomics are wide open as radiomics studies have numerous degrees of freedom and need large datasets [91][92][93][94].
At present, the best method to analyze big data is based on AI technology, which consists of flexible mathematical models that use algorithms to identify complex nonlinear relationships within such data. ML is a subfield of AI that allows machines to learn without being explicitly programmed, and it has been applied in radiomics [95,96]. Among the techniques that fall under the ML umbrella, DL has emerged as one of the most promising [97]. This is essentially because DL allows continuous improvement towards better performance with a lower error rate, while conventional ML reaches an error rate that cannot be lowered further by adding more data [98]. Moreover, while in conventional ML handcrafted features are pre-defined using domain expertise, in DL the algorithm is able to learn specific features from the data themselves [33]. Accordingly, there is no need to specify pre-defined features, as the same algorithm can solve many different tasks. On the other hand, DL still needs time before playing a significant practical role in radiomics, especially in applied cancer research, because the available big data usually still lack complete characterization of the patients and individual datasets are poorly integrated [99]. Finally, AI suffers from an interpretability issue, which may represent a major challenge for researchers who want to understand how certain studies come to their conclusions, which features have been selected, and therefore how to recognize (and interpret) possible failures. This also has a practical drawback when communicating the results to clinicians, who may not be able to understand all the processes behind the clinical response proposed by DL.
However, AI already showed its utility in radiomics studies. Nie et al. [100] generated higher-quality images adopting DL-based image synthesis to match different imaging settings. Havaei et al. [101] used a DL segmentation algorithm to automatically perform difficult segmentation tasks and their good results suggest the possibility to eliminate the need for manual segmentation. Finally, DL could help in the feature extraction step providing insights for new features as it may be capable of learning relevant features from the data themselves [33].

Standardization and Curation of Radiomics Data
Since developing an AI-supported radiomic workflow requires a large training dataset [102], radiology is well poised to benefit from it thanks to its abundance of data. Nevertheless, irregular completeness and variable quality of data entry, as well as limited interoperability between different providers, are still non-negligible obstacles to the further development of radiomics in clinical practice. Indeed, the abundance of data acquired with radiological exams might not be enough to appropriately train AI algorithms, because most health-related data are still unstructured and not standardized [98,103]. Imaging data are far from standardized, as different hospitals use different systems to acquire and store them, and they may not be comparable owing to the fast development of new equipment, devices and systems and to differences in the technical implementations used by different vendors. In multicenter collaborations [104,105], the situation could be even more difficult, as radiomic data are more heterogeneous and variable [106]. In this regard, standardization is the process of transforming data into a common format that can be understood and shared across different tools and methodologies. This is a crucial point in radiomic studies. The first step is the standardization of techniques [107,108]: in radiology, for instance, the size of a mass in an organ can be compared consistently if the comparisons are performed with exactly the same imaging protocol [104]. In addition, segmentation of an index lesion in the training process of an AI system requires the most uniform images possible. Unfortunately, no widely accepted standard to store and communicate segmentation results between different tools from different vendors/sources is commonly used yet [1,109,110].
Two main examples of high-quality standardized annotation methods, namely the Annotation and Image Markup standard and the DICOM Presentation State [111,112], are still little used by software developers to report annotations [113]. Without early efforts to optimize interoperability, the practical effectiveness of AI in radiomics will be severely limited. Therefore, a set of standards is necessary to allow integration between these different algorithms and to allow AI techniques to be used in different centers, by different users, on different equipment.
The collection and sharing of datasets that are heterogeneous from the imaging point of view, but properly curated and homogeneous from a clinical standpoint, can be extremely useful to train and test feature selection and harmonization methods, with the aim of identifying robust procedures and/or features able to guarantee generalizability. This is especially relevant when ML techniques are used to analyze handcrafted radiomic data extracted from images, an approach that should be preferred when the amount of available data is not large enough to allow the use of DL methodology. Conversely, with sufficiently large datasets, DL methodology might be favored, and image dataset heterogeneity would represent an advantage rather than an issue, since DL architectures have the potential to incorporate image processing and feature selection tasks in the deep layers of the network.
Broadly speaking, the lack of appropriate big datasets is a key obstacle to a large introduction of AI systems in healthcare [94,96,114], and the needed level of data curation is strictly related to the number of available cases and to the AI methodology implemented, and the other way around. Scientists and researchers have the fundamental role of choosing the best methodology in relation to the dataset characteristics, and to collect adequate datasets for the aim of the study, also in relation to the methodology they plan to implement.
The collection of external, independent datasets is fundamental for testing the performance of predictive models in a validation setting. The standards for prediction model reporting that are currently used need to be updated. Radiomics datasets for AI algorithm training, testing and validation should be updated and developed to include statistical metrics for validation, parameters for clinical integration and pathways for assessing algorithm performance in research and even in clinical practice [106,115].
A standardized evaluation of the performance, reproducibility and clinical utility of radiomics findings is needed. The radiomics quality score is a specific system of metrics intended to determine the validity and completeness of radiomics studies [116,117], similarly to the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) initiative [13]. These indications are applied to a radiomics-specific design that considers high-dimensional data and modeling, and highlights the clinical adoption of modeling research as in the TRIPOD guidelines [118].

Radiomics Data Sharing
Since radiomic data are needed for validation and multi-center cooperation, they may need to be shared across multiple institutions and across nations for widespread implementation. Accordingly, there is a need to comply with regulatory frameworks when using personal data such as health information [1]. Data need to be anonymized, as the rules of patient privacy and cybersecurity measures will be increasingly important in radiomic research [1,119,120].
In the EU, regulators recently updated the legislation concerning data protection and cybersecurity with the GDPR [99,121]. Furthermore, with the Cybersecurity Directive [122], they set out a number of requirements for EU Member States to prevent cyberattacks and, should they occur, keep their consequences under control [122]. In the US, the Health Insurance Portability and Accountability Act is a compliance focus for health information [123], defining standards and safeguards that protect confidential data and personal health information and that apply to all healthcare providers, insurers and other healthcare entities.
On the other hand, the current healthcare environment still holds little incentive for data sharing [124]. Some policymakers have proposed creating anonymized benchmarking datasets including a local calibration, which is crucial because radiomics features may have local or cultural-specific parameters that may not be generalizable to different populations.
Concerning radiomics studies in the field of BC, the Cancer Imaging Archive [125] is a good example of a functional service that hosts a large archive of anonymized medical images of tumors with related data (e.g., patient outcomes, treatment details, genomics, pathology, expert analyses). This archive is accessible for public download.
Other examples of data-sharing efforts include biobanks and international consortia for medical imaging databases, such as the Cardiac Atlas Project [126], the Visual Concept Extraction Challenge in Radiology Project [127], the UK Biobank [128] and the Kaggle Data Science Bowl [129].
Finally, the other milestone in radiomics will be transparency. The accuracy of radiomics performance relies massively on the quality of the input data: accordingly, poorly labelled data will yield poor results [130], and transparency of labelling allows others to critically evaluate the radiomic workflow process.

Conclusions
The recent studies discussed in this review show that radiomics applied to BC is an expanding and promising research topic. However, the application of radiomics in clinical practice is still hampered by some pitfalls. Lessons from recent experience, further advances in technology (including the development of AI) and efforts in the curation and standardization of data and methodologies among researchers would make radiomics a more robust and trustworthy field in both research and clinical BC care.