Addressing the Clinical Feasibility of Adopting Circulating miRNA for Breast Cancer Detection, Monitoring and Management with Artificial Intelligence and Machine Learning Platforms

Detecting breast cancer (BC) at the initial stages of progression has always been regarded as a lifesaving intervention. With modern technology, extensive studies have unraveled the complexity of BC, but the current standard practice of early breast cancer screening and clinical management of cancer progression is still heavily dependent on tissue biopsies, which are invasive and limited in capturing definitive cancer signatures for more comprehensive applications to improve outcomes in BC care and treatment. In recent years, reviews and studies have increasingly shown that liquid biopsies in the form of blood, containing free circulating and exosomal microRNAs (miRNAs), are a potential minimally invasive alternative to tissue biopsy or a complement to existing biomarkers in assessing and classifying BC. As such, this review addresses the potential of miRNAs as the key BC signatures in liquid biopsy, including the role of artificial intelligence (AI) and machine learning (ML) platforms in capitalizing on the big data of miRNA for a more comprehensive assessment of the cancer, leading to practical clinical utility in BC management.


Int. J. Mol. Sci. 2022, 23, 15382

Introduction
Globally, breast cancer (BC) remains the most common cancer and the leading cause of cancer death for women [1]. Despite the advancement in BC screening, diagnostics and therapy, the overall rates of BC incidence and mortality around the world have been on an increasing trend. Although BC mortality rates have declined over time in most high-income countries (HICs) [2], they remain high and are increasing in many low-middle-income and low-income countries [3,4], partly due to poor awareness and perception of early BC detection, leading to delays in diagnosis and treatment [1]. Recent studies show that the mortality rate is also compounded by disparities in BC screening between rural and urban areas [5] as well as among those from different socio-economic and ethnic backgrounds in HICs [2]. Nevertheless, there has been a more established understanding of the heterogeneous nature of BC and its application in the development of personalized medicine and targeted therapy [6]. However, the complex interaction within the tumor microenvironment (TME) and the influence of cancer stem cells (CSCs), from cancer recurrence to drug resistance [7], as well as population-based variation in terms of immunological response [8], continue to pose challenges in deciphering feasible approaches to managing BC diagnosis and monitoring and to administering effective treatment options. In the past decade, the evolution of BC diagnosis and classification has resulted in a greater range of supporting tools, from classical mammography and histopathology [9] to molecular-based markers and multigene prognosticators [10], which ultimately guide the overall management of BC (Table 1) [11][12][13][14]. Still, many if not all of these existing tools are heavily reliant on invasive tissue biopsies as the starting point for screening and monitoring BC progression, evaluating cancer prognosis and deciding on the best therapeutic options.
Moreover, studies have shown that BC signatures are often not clearly manifested or well represented in the current screening methods, namely mammography [15] and tissue biopsies [16], and even less so in the diagnosis of metastatic cancers [17]. Liquid biopsy (LB), on the other hand, has emerged as a potentially feasible approach to overcoming these shortfalls, from obtaining samples in a noninvasive manner to early detection of cancer and more comprehensive monitoring [18,19], thereby offering patients a less stressful experience and a better sense of worth in managing cancer treatments. Among the commonly known cancer biomarkers in LB [19], circulating microRNAs (miRNAs) stand out as a feasible and practical option [20]. Hence, this review addresses the role of miRNA as a feasible candidate for liquid biopsy, not only in personalized BC management and targeted therapy but also in consolidating the big data of miRNA with artificial intelligence (AI) and machine learning (ML) platforms for a more comprehensive and inclusive approach to promoting effective BC patient care and outcomes.

miRNA as Liquid Biopsy in Personalized Breast Cancer Management and Targeted Therapy
The liquid biopsy (LB) approach makes it possible to obtain essential information on the tumor and on cancer progression from simple body-fluid samples, mainly through routine blood sampling. As gene-regulatory molecules in the body, circulating microRNAs (miRNAs) may be readily detected in the plasma or serum of blood samples, and measurable changes in their levels are associated with various conditions of the body, including cancers. Moreover, with the rapid advancement in bioinformatics for molecular data analysis, the association of miRNAs with oncogene targets, cancer signaling pathways, survival analysis, prognostic values and drug targets is commonly obtainable [21], with cross-validation from clinical cancer-associated databases [22,23]. Although the application of miRNA as LB for BC in the clinical setting is fairly new, with only two clinical studies recorded in www.clinicaltrials.gov as of October 2022 (Table 2), it may complement the existing standard clinical approach (Table 1) and enhance the diagnosis and monitoring of BC progression as well as of the response to treatments.

Current Trends and Research Outcomes of Circulating miRNA as Liquid Biopsy
Circulating miRNAs are extracellular miRNAs that are present in body fluids, such as blood, serum, plasma, milk, saliva and urine, either in the form of free-circulating miRNAs or encapsulated within extracellular vesicles (EVs), such as exosomes [24][25][26]. The advancement in molecular biology techniques has allowed scientists to employ different methods, such as quantitative real-time polymerase chain reaction (qPCR) [27], miRNA-sequencing (miRNA-seq) [28] and microarray [29], to detect the levels of circulating miRNAs among BC patients. These techniques have been reported to play essential roles in the diagnosis, classification and prognosis of BC [30,31]. This section provides an updated overview of the research findings that reported the association of circulating miRNAs with the diagnosis and prognosis of BC. Notably, EV-derived and exosomal miRNAs have attracted remarkable interest due to their superior stability compared with free-circulating miRNAs. As such, a separate cluster of studies that specifically researched miRNAs derived from EVs is also highlighted in this review.

Prognostic Significance of Circulating miRNAs in Human Breast Cancer
Apart from being employed as minimally invasive biomarkers in diagnosing and classifying different stages or types of BC (Tables 3 and 4) [57], circulating miRNAs are also important in predicting the prognosis and treatment responses of BC patients [38,58]. For instance, the expression of miR-155 and miR-1246 was elevated in the plasma exosomes isolated from BC patients, and the upregulation of both miRNAs was linked to poor survival, recurrence and trastuzumab resistance among BC patients [38]. In another study [39], downregulation of plasma miR-140-5p was correlated with increased chemoresistance, reduced event-free survival (EFS) and increased recurrence among BC patients. On the other hand, upregulation of exosomal miR-21 [58] and of exosomal miR-34a, miR-182 and miR-183 [59] was shown to contribute to poor chemotherapy response among BC patients, while the elevation of blood exosomal miR-2392, miR-4448 and miR-4800-3p was demonstrated to correlate with good response after neoadjuvant chemotherapy [60]. In terms of response to targeted therapy, such as trastuzumab, decreased serum levels of miR-16-5p, miR-17-3p, miR-451a and miR-940 were observed in trastuzumab-resistant BC, and increased expression of these miRNAs was shown to promote treatment response to trastuzumab and improve survival among BC patients [50]. In another Arabic study [30], dysregulation of seven circulating miRNAs, namely miR-19a, miR-19b-3p, miR-22-3p, miR-25-3p, miR-93-5p, miR-199a-3p and miR-210-3p, was shown to be related to resistance to both chemotherapy and targeted therapy. In addition, the upregulation of circulating miR-21 correlated closely with radioresistance among BC patients [36]. Notably, all the circulating miRNAs that were reported to modulate treatment responses in BC patients appeared to share a common function in promoting uncontrolled proliferation and apoptosis evasion among the BC cells [61,62]. For example, miR-373 was reported to upregulate the expression of vascular endothelial growth factor (VEGF) in BC cells, which would lead to enhanced proliferation and angiogenesis [62].
By studying the relationships between circulating miRNA levels, clinical conditions and treatment responses, clinicians could predict the survival and likelihood of disease recurrence among BC patients [63]. Elevation of several circulating exosomal miRNAs was shown to be associated with good survival, and it was suggested that the upregulation of these miRNAs may improve patient survival by enhancing patient response to chemotherapy [60]. On the other hand, the upregulation of exosomal miR-200c [64] and miR-24-3p [47] was shown to correlate with poor overall survival (OS) among BC patients, as the increased expression of these miRNAs was hypothesized to correlate directly with advanced disease staging [47,64]. Similarly, the downregulation of several circulating miRNAs, such as miR-34a [37] and miR-335 [54], was shown to reduce BC patient survival, which would contribute to an increased likelihood of relapse and recurrence. Metastasis is one of the important factors causing BC recurrence, and the dysregulation of eight miRNAs, namely miR-296-3p, miR-575, miR-3610-5p, miR-4483, miR-4710, miR-4755-3p, miR-5698 and miR-8089, was demonstrated to predict the likelihood of recurrence secondary to BC metastasis [31]. Other circulating miRNAs that are reported to play vital roles in influencing patient survival include miR-17, miR-18b, miR-103, miR-107, miR-652, miR-26b-5p, miR-106b-5p, miR-142-3p, miR-142-5p, miR-185-5p and miR-362-5p [45,65,66]. Patients with dysregulated circulating levels of these miRNAs were found to have more advanced disease staging and were more prone to disease relapse and recurrence with a reduced survival rate [45,65,66]. In two other studies [29,31], at least 20 circulating or exosomal miRNAs were reported to be sensitive and useful in distinguishing between recurrent and non-recurrent BC cases, which is helpful in predicting patient prognosis and survival.

Multifunctional Roles of Circulating miRNAs as Potential Biomarker for Human Breast Cancer
Evidently, circulating miRNAs have great potential to be employed as minimally invasive biomarkers for diagnosing BC at an early stage and for complementing the differentiation of BC based on its clinical and histopathological grading [62,67]. In addition, circulating miRNAs are also helpful in predicting the likelihood of relapse, recurrence and treatment responses among BC patients, which is particularly useful in guiding clinicians in planning a personalized treatment approach for different BC patients [30,68]. Given the challenge of identifying a miRNA panel useful for these functions within the growing number of related studies, this section provides an overview of the current status of the reported multifunctional roles of circulating miRNAs. Based on our findings from Tables 3 and 4 as well as Supplementary Tables S1 and S2, the miRNAs were further classified by their reported roles in diagnosis, staging classification and the prediction of relapse, treatment outcome and survival prognosis for BC patients (Figure 1a,b). We found that most of the reported free-circulating miRNAs were suitable for diagnosis only or concurrently for diagnosis and staging classification (Figure 1a). On the other hand, more of the reported exosomal and EV miRNAs were classified with either diagnosis only or staging only (Figure 1b), suggesting that exosomal and EV miRNAs may have more specific targets than free-circulating miRNAs to be translated into clinical validation for diagnostic and staging purposes. Interestingly, two free-circulating miRNAs (miR-21, miR-140-5p) and two exosomal and EV miRNAs (miR-155, miR-1246) were reported with multifunctional roles, i.e., diagnosis, staging, treatment, survival and relapse.

Sensitivity and Specificity Levels of miRNA Detection in BC Patients
Several potential miRNA biomarkers have been identified in BC patients' serum or plasma. In a receiver operating characteristic (ROC) curve analysis, miR-21-5p was shown to have greater potential than miR-221-3p in discriminating between BC patients and the control group [69]. Additionally, a recent meta-analysis on miR-21-5p and BC comprising six publications, with Asian and Caucasian study cohorts, further confirmed the potential early diagnostic role of miR-21-5p in BC patients due to its high pooled area under the ROC curve (AUC) and diagnostic odds ratio [70]. Another recent meta-analysis, conducted to determine the overall diagnostic performance across 56 eligible studies involving circulating miRNAs measured via qPCR, revealed a pooled sensitivity and specificity of 0.85 and 0.83, respectively [71]. Moreover, multiple-miRNA panels, with sensitivity and specificity of 0.90 and 0.86, respectively, performed significantly better than single-miRNA panels, with corresponding sensitivity and specificity of 0.82 and 0.83. With regard to specimen type, the pooled sensitivity and specificity were 0.83 and 0.85 for plasma and 0.87 and 0.83 for serum, respectively, indicating little difference in diagnostic performance between serum and plasma samples. These studies revealed that cell-free circulating miRNA could function as a promising early diagnostic biomarker for the detection of BC [71].
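The measures quoted above all derive from comparing test results against known disease status. As a minimal sketch, with hypothetical counts and scores that are not taken from any cited study, sensitivity, specificity, diagnostic odds ratio (DOR) and a rank-based AUC could be computed as follows:

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity and diagnostic odds ratio (DOR)."""
    sensitivity = tp / (tp + fn)   # fraction of BC cases correctly flagged
    specificity = tn / (tn + fp)   # fraction of controls correctly cleared
    # DOR: odds of a positive test in cases over odds of a positive test
    # in controls (undefined when fp or fn is zero).
    dor = (tp * tn) / (fp * fn)
    return sensitivity, specificity, dor


def auc(case_scores, control_scores):
    """AUC as the probability that a random case outscores a random control."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5        # ties count as half a win
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical cohort: 100 BC patients and 100 healthy controls.
sens, spec, dor = diagnostic_metrics(tp=85, fn=15, tn=83, fp=17)

# Hypothetical miRNA-based risk scores for three cases and three controls.
auc_value = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
```

Pooled estimates in a meta-analysis are obtained with dedicated statistical models rather than by simply summing counts; the sketch only illustrates what each single-study quantity means.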

Current Challenges and Issues in Circulating miRNAs as a Common Candidate for Liquid Biopsy in BC Management
Although significantly dysregulated miRNA expression has been established in blood samples, studies have shown contrary findings depending on the use of serum or plasma as starting material for obtaining miRNAs, on multicentered heterogeneous patient samples and on the use of various statistical models. The lack of standardization of pre-analytical variables, namely sample processing, storage and handling, as well as of the data normalization strategy for the quantification of miRNA, is often highlighted as a possible cause of the discordant outcomes of the identified miRNAs as diagnostic, predictive or prognostic markers [72].

Biological Parameters
A comparison of miRNA expression profiles from human BC serum and tumors using RNA sequencing revealed a total of 109 significantly differentially expressed miRNAs between patients' serum and healthy individuals' serum. In addition, 174 significantly differentially expressed miRNAs were observed between normal tissues and tumors, of which only 10 miRNAs were differentially expressed in both serum and tumor biopsy [73]. Furthermore, an in-depth analysis of data obtained from the HMDD v3.0 database and individual papers showed that circulating miRNAs lack specificity as BC diagnostic biomarkers because of their different expression in the tissues and blood of cancer patients; even miR-21-5p, cited as the most commonly dysregulated miRNA in BC studies, was shown to be highly expressed in other cancer types and diseases [74]. On the other hand, a study of circulating miRNA among BC tumors, serum and normal tissues using microarray and qPCR showed that the miRNA profiles of tumors matched those of their corresponding serum samples, indicating the possible selective release of miRNAs from the tumor site into the blood [75]. Another study examined the influence of a heterogeneous population setting on the miRNA profile in two different geographical populations, one from Belgium (n = 110; primary BC = 55, healthy individuals = 55) and another from Rwanda (n = 110; primary BC = 55, healthy individuals = 55), using qPCR on plasma samples, and revealed two distinct pools of circulating miRNA corresponding to each of the studied populations [76]. However, a multicenter study comprising Caucasian and Asian ethnicities from five different geographical locations obtained a common pool of miRNA, with AUC values ranging from 0.88 to 0.97 in the detection of early BC [15].

Statistical Models
Detecting dysregulated miRNA expression levels requires normalization to remove variation across samples, and differences in normalization methods were shown to have likely contributed to differences in the miRNA profiles obtained. In one study, qPCR of plasma circulating miRNAs, validated using a >2-fold change criterion, yielded 24 significantly upregulated and 16 significantly downregulated miRNAs in BC patients compared to controls; however, only three miRNAs (miR-22-5p, miR-27b-3p, miR-423-5p) were able to distinguish cancer patients from healthy individuals [77]. Another study also used qPCR for the detection of possible circulating miRNA biomarkers in BC, creating a panel from an unbiased exploration among all expressed miRNAs via two-fold cross-validation combining logistic regression with a feature selection algorithm in the discovery cohort. The study identified six potential miRNA biomarker panels with an AUC of 0.78 and 0.77 in the discovery and validation cohorts, respectively, using the global geometric mean normalization method [15].
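To make the normalization step concrete, the following is a minimal sketch of global mean normalization of qPCR quantification cycle (Cq) values followed by a fold-change filter. It assumes the common 2^-ΔΔCq convention; the Cq values and miRNA names are hypothetical placeholders, and the exact pipelines of the cited studies are not described here and may differ.

```python
from statistics import mean


def global_mean_normalize(cq_by_mirna):
    """Subtract the sample's mean Cq from every miRNA Cq (delta-Cq).

    Cq is on a log2 scale, so the arithmetic mean of Cq values corresponds
    to the geometric mean of the underlying expression levels.
    """
    reference = mean(cq_by_mirna.values())
    return {name: cq - reference for name, cq in cq_by_mirna.items()}


def fold_change(delta_cq_patient, delta_cq_control):
    """Relative expression (patient vs control) by the 2^-ddCq convention."""
    return 2 ** -(delta_cq_patient - delta_cq_control)


# Hypothetical Cq values for one patient and one control sample.
patient = {"miR-A": 24.0, "miR-B": 28.0, "miR-C": 32.0}
control = {"miR-A": 26.0, "miR-B": 28.0, "miR-C": 30.0}

norm_patient = global_mean_normalize(patient)
norm_control = global_mean_normalize(control)

changes = {m: fold_change(norm_patient[m], norm_control[m]) for m in patient}

# Keep miRNAs changed more than two-fold in either direction.
flagged = sorted(m for m, fc in changes.items() if fc > 2 or fc < 0.5)
```

Swapping the reference (for example, a spike-in control instead of the global mean) changes every delta-Cq and can therefore change which miRNAs pass the filter, which is one way different normalization choices lead to different reported profiles.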
Numerous studies (Tables 3 and 4) have identified dysregulated miRNA profiles that significantly represent early cancer detection, molecular subtype classification status and monitoring signatures of recurrence and metastatic BC progression. However, these studies often reveal a diverse pool of miRNA, with roles typically associated with the various hallmarks of cancer [78]. Therefore, by consolidating the big data of miRNA with artificial intelligence (AI) and machine learning (ML) platforms, a more comprehensive and inclusive approach may be established for complementing clinical decisions in promoting effective BC patient care and recovery outcomes.

Machine Learning and Deep Learning Approaches in BC Research
The field of healthcare has been transformed by technological advancements, such as the generation of large digital datasets. Over the past few decades, many researchers have explored the application of machine learning (ML) in healthcare, including cancer detection, diagnosis, prognosis, treatment and recurrence prediction [79][80][81][82]. There are two main types of ML techniques: (i) supervised learning and (ii) unsupervised learning. The main difference between them is the need for labelled training data. Supervised learning relies on labelled input and output training data. The model first learns the relationship between the inputs and outputs before it can be used to classify new, unseen data and predict outcomes. In contrast, unsupervised learning processes unlabelled or raw data. It is often used to identify trends or patterns in raw datasets and to perform initial data analysis.
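The distinction can be illustrated with a toy example. The sketch below uses hypothetical one-feature data (a normalized expression level per sample) and two deliberately simple methods chosen for illustration only: a nearest-centroid classifier that needs labels, and a two-means clustering that never sees them.

```python
from statistics import mean

# Hypothetical one-feature data: a normalized miRNA expression level per sample.
values = [0.9, 1.1, 1.0, 3.0, 3.2, 2.9]
labels = ["control", "control", "control", "BC", "BC", "BC"]


# Supervised: learn one centroid per label, then classify unseen values.
def fit_centroids(xs, ys):
    return {lab: mean(x for x, y in zip(xs, ys) if y == lab) for lab in set(ys)}


def predict(centroids, x):
    return min(centroids, key=lambda lab: abs(x - centroids[lab]))


centroids = fit_centroids(values, labels)
prediction = predict(centroids, 2.7)


# Unsupervised: two-means clustering sees only the values, never the labels.
def two_means(xs, iterations=10):
    c0, c1 = min(xs), max(xs)            # simple deterministic initialization
    for _ in range(iterations):
        g0 = [x for x in xs if abs(x - c0) <= abs(x - c1)]
        g1 = [x for x in xs if abs(x - c0) > abs(x - c1)]
        c0, c1 = mean(g0), mean(g1)      # move each center to its group mean
    return g0, g1


low_group, high_group = two_means(values)
```

The clustering recovers two groups but cannot say which one corresponds to BC; attaching that meaning requires labels or follow-up analysis, which is why unsupervised methods are typically used for initial exploration.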

Machine Learning and Deep Learning for Detection and Diagnosis
Detection and diagnosis of BC at an early stage helps to reduce the fatality rate considerably. Rana et al. [83] conducted a comparative experiment with four different supervised algorithms, namely the support vector machine (SVM), logistic regression (LR), k-nearest neighbor (KNN) and Naïve Bayes (NB), on the Wisconsin Breast Cancer Diagnostic dataset (WBCD) in predicting and diagnosing BC. Based on their analysis, the KNN technique provided the best results, and NB and LR also performed well in BC diagnosis. Nonetheless, they highlighted that SVM is a strong and sophisticated machine learning algorithm, especially for predictive analysis, and is therefore also the most suitable technique for predicting recurrence or non-recurrence of BC. Similar findings were reported by other researchers [84,85], in whose studies SVM demonstrated its efficiency in BC prediction and diagnosis and achieved the best performance in terms of accuracy and precision.
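For readers unfamiliar with how such classifiers are compared, the following is a minimal sketch of a KNN classifier scored by leave-one-out accuracy. The two-feature samples are hypothetical and are not WBCD records, and leave-one-out is an assumption made for brevity; it is not necessarily the evaluation protocol of the cited studies.

```python
from collections import Counter
from math import dist


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to `query`."""
    order = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], query))
    nearest = order[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def leave_one_out_accuracy(xs, ys, k=3):
    """Hold out each sample in turn and predict it from all the others."""
    hits = 0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        hits += knn_predict(rest_x, rest_y, xs[i], k) == ys[i]
    return hits / len(xs)


# Hypothetical two-feature samples with known class labels.
xs = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (3.0, 3.1), (3.2, 2.9), (2.9, 3.3)]
ys = ["benign", "benign", "benign", "malignant", "malignant", "malignant"]

accuracy = leave_one_out_accuracy(xs, ys)
new_case = knn_predict(xs, ys, (3.1, 3.0))
```

Running the same held-out evaluation for each candidate algorithm on the same data is what makes accuracy figures comparable across methods.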
In addition to BC detection and diagnosis, machine learning techniques have also been used for subtype classification. Three different ML techniques, namely fuzzy SVM, Bayesian classifier and random forest (RF), were compared in categorizing the types of cancer from a sequence of mammography images in the MIAS database. Fuzzy SVM was found to have the best performance among these techniques, with over 90% accuracy, sensitivity, specificity, precision and recall [86]. Similar findings were obtained by Wu and Hicks [87], in whose study the SVM technique was effective in discriminating triple-negative breast cancer (TNBC) based on RNA sequencing datasets. The clinical significance of this investigation is that ML algorithms could be used not only to improve diagnostic accuracy but also to identify women at high risk of developing TNBC, who could be prioritized for treatment.
ML techniques have also been widely utilized for cancer prognosis and survival prediction. For instance, in a study [88] that used eight different ML techniques to develop models for identifying and visualizing relevant prognostic indicators of BC survival rates, based on five years of BC patient data (2006 to 2010) from the National Cancer Institute's SEER Program, the RF technique was found to be the best, with an accuracy of 94.64%. Another study used machine learning to analyze the impact of chemotherapy and to establish a prognosis prediction model in early elderly TNBC, with 4696 patients in the SEER Database who were 70 years or older and diagnosed with primary early TNBC from 2010 to 2016 [89]. The propensity-score-matched method was utilized to reduce covariable imbalance. Univariable and multivariable analyses were used to compare BC-specific survival (BCSS) and overall survival (OS). Nine models were developed by ML to predict the 5-year OS and BCSS for patients who received chemotherapy. The multivariate analyses showed better survival in the chemotherapy group, and the Light Gradient Boosting Machine (LightGBM) was found to be a practical model for predicting survival and providing precise systemic treatment for patients who received chemotherapy [90].
Researchers have also used deep learning (DL), a subset of ML, in cancer applications. Based on the literature review, some DL applications were found to perform better than conventional ML techniques. For instance, an ensemble deep learning approach for the definite classification of non-carcinoma and carcinoma BC histopathology images was able to show a sensitivity of 97.73% for carcinoma classification, with an overall accuracy of 95.29% [91]. In addition, the particle-swarm-optimized wavelet neural network (PSOWNN) was found to be relatively superior to other techniques, such as CNN, KNN and SVM [92]. Meanwhile, the deep-learning-assisted efficient AdaBoost algorithm (DLA-EABA), a combined ML approach with the AdaBoost algorithm as the base, showed a high accuracy of 97.2%, a sensitivity of 98.3% and a specificity of 96.5% for early BC detection [93]. Apart from that, this method was reported to increase the patient survival rate.

Studies with miRNAs as Breast Cancer Biomarkers with ML/DL Approaches
MicroRNAs (miRNAs) have been suggested as biomarkers or therapeutic targets in BC [94][95][96]. However, there are not many BC studies utilizing ML or DL approaches on miRNA biomarkers. Table 5 summarizes some of the related work for BC that adopted ML or DL approaches on miRNA biomarkers. As the amount of miRNA expression data in the Genomic Data Commons (GDC) Data Portal has increased dramatically, several researchers have proposed feature-selection methods to reduce the size of the datasets before proceeding with their analysis. An ensemble feature-selection methodology for miRNA signatures was proposed to identify the most robust and reliable miRNAs to be used in clinically relevant prediction tasks [97]. In that research, which included over 8000 samples from TCGA, 100 miRNA signatures were identified that distinguished between tumor and normal tissues. The proposed approach provided better accuracy after 10-fold cross-validation with different ML classifiers, showing over 90% classification accuracy. In another study, the ensemble methodology was used to identify the important biomarkers for BC, which were then classified by different ML techniques, such as NB, LR, KNN, SVM and multilayer perceptron [98]. In the preliminary analysis of that study, default parameters were changed only when experimentation showed that classifier performance generally improved significantly across all datasets. Rehman et al. [99] applied four different feature-selection methods, including Information Gain (IG), Chi-Squared (CHI2) and Least Absolute Shrinkage and Selection Operator (LASSO), to identify the most specific and effective miRNAs in discriminating normal and cancerous tissues. After feature selection, they applied the RF and SVM algorithms to identify cancerous cells. The study demonstrated that the miRNAs ranked higher by their analysis yielded higher classifier performance. Performance decreased as the rank of the miRNA decreased, which shows that these miRNAs had different degrees of importance as biomarkers.
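As an illustration of filter-style feature selection, the sketch below ranks features by information gain, one of the criteria named above. The samples, the discretization into "high" and "low" expression and the miRNA names are hypothetical placeholders; the cited studies worked on continuous expression data and their exact procedures are not reproduced here.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature, labels):
    """Entropy reduction in `labels` after splitting on a discrete feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature):
        subset = [lab for f, lab in zip(feature, labels) if f == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


# Hypothetical tissue labels and discretized expression per placeholder miRNA.
labels = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]
features = {
    "miR-X": ["high", "high", "high", "low", "low", "low"],   # separates classes
    "miR-Y": ["high", "high", "low", "high", "low", "low"],   # partly informative
    "miR-Z": ["high", "low", "low", "high", "low", "low"],    # uninformative
}

# Rank miRNAs from most to least informative about the tissue label.
ranking = sorted(
    features,
    key=lambda name: information_gain(features[name], labels),
    reverse=True,
)
```

Keeping only the top-ranked features before training a classifier reduces dataset size, and the falling gain down the ranking mirrors the reported drop in classifier performance for lower-ranked miRNAs.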
Tree-based ML models were normally applied on specific miRNAs for classifying the upregulated and downregulated BC cells [100]. In addition, several supervised methods, such as decision tree (DT), NB, neural network and DL, were adopted to classify cancer cells based on miRNA gene expression, in order to find the best method for gene analysis [101]. It was found that the DL method, which was developed based on a multilayer feed-forward artificial neural network (ANN) trained with stochastic gradient descent using backpropagation, outperformed the other conventional ML methods [101].

Conclusions
Key cancer signatures are represented by the dysregulated expression of miRNAs detected in liquid biopsy and shown to be associated with BC diagnosis, subtype classification and recurrence, as well as metastatic spread of the cancerous cells. Therefore, miRNA-based liquid biopsy may be much more feasible in BC management and may also come with greater patient acceptance, as it could be conducted as a simple routine blood collection; however, the amount of information contained in the miRNA profile is immense. With bioinformatics and with current and emerging AI and ML platforms, this huge amount of miRNA data may be analyzed in ways that provide cancer progression indicators that complement standard clinical practices. However, translation of the miRNA targets selected by ML requires clinical validation to arrive at the number of biomarkers that can accurately and cost-effectively perform the expected roles. This may eventually result in better treatment outcomes and supportive BC management from early detection to personalized therapy, ultimately improving the quality of life among BC patients.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement: No new data were created or analyzed in this study. Data sharing is not applicable for this study.