Article

Radiation-Free Microwave Technology for Breast Lesion Detection Using Supervised Machine Learning Model

1
School of Engineering, London South Bank University, London SE1 0AA, UK
2
Breast Screening and Diagnostic Breast Cancer Unit, AUSL Umbria 2, 06034 Foligno, Italy
3
Department of Diagnostic Imaging, Perugia Hospital, 06156 Perugia, Italy
4
Umbria Bioengineering Technologies (UBT) Srl, 06081 Perugia, Italy
*
Author to whom correspondence should be addressed.
Tomography 2023, 9(1), 105-129; https://doi.org/10.3390/tomography9010010
Received: 12 December 2022 / Revised: 8 January 2023 / Accepted: 9 January 2023 / Published: 12 January 2023
(This article belongs to the Special Issue Radiation Protection Opportunities in Medical Imaging)

Abstract

Mammography is the gold standard technology for breast screening, and randomized controlled trials have demonstrated that it reduces breast cancer mortality. However, mammography has limitations and potential harms, such as the use of ionizing radiation. To overcome the ionizing radiation exposure issue, a novel device (MammoWave) based on low-power radio-frequency signals has been developed for breast lesion detection. MammoWave is a microwave device currently under clinical validation in several hospitals across Europe. The device transmits non-invasive microwave signals through the breast and records the backscattered (returned) signatures, commonly denoted as the $S_{21}$ signals in engineering terminology. The backscattered complex $S_{21}$ signals exploit the contrast in dielectric properties between breasts with and without lesions. This research aims to automatically separate these two types of signal response by applying an appropriate supervised machine learning (ML) algorithm to the collected data. A support vector machine with a radial basis function kernel has been employed. The algorithm has been trained and tested using microwave breast response data collected at one of the clinical validation centres. Statistical evaluation indicates that the proposed ML model can distinguish MammoWave signals of breasts with no radiological finding (NF) from those of breasts with radiological findings (WF), i.e., breasts in which benign or malignant lesions may be present. A sensitivity of 84.40% and a specificity of 95.50% have been achieved in NF/WF recognition using the proposed ML model.

1. Introduction

Breast cancer is the most common cancer in women worldwide, affecting 1 in every 8 women [1]. Mammography is the gold standard technology for breast screening, but it uses ionizing radiation, which carries potential harms for patients; specifically, the cumulative effect of routine mammography screening may increase women’s risk of developing radiation-induced breast cancer [2]. Thus, the screening age and frequency have been defined taking into account the risk-benefit ratio of mammography. Women also experience some pain and discomfort [3] when undergoing mammography, and many avoid mammography screening for fear of pain, embarrassment, discomfort, and radiation [1]. Additionally, anxiety about the screening outcome is a tangible factor reducing adherence to mammography screening, although attitudes differ depending on age, profession, marital status, ethnicity, race and education [4,5].
To overcome the fear of pain, the discomfort and the ionizing radiation exposure, a novel device (MammoWave) based on low-power radio-frequency signals has been developed for breast lesion detection by the team at UBT S.R.L. (Italy) [6].
Moreover, in some cases lesions prove difficult to detect with mammography, especially when the breast is highly dense [7,8,9] or when it contains small, elongated, salt-like microcalcification particles [10]. However, the evolution of machine learning (ML) algorithms and the greater availability of medical datasets from different modalities are enabling improved assisted detection and raising hopes of better performance [11,12,13].
A recent case study on early breast cancer detection from mammographic images using AI methods is presented in [14]. This study focuses on data collected from UK and US clinical trials. The authors show a cancer case (a small, irregular mass with associated microcalcifications) that was missed by six readers in the examination but correctly identified by the AI system. In this case, three-level deep learning was used to train the model for breast cancer detection. In [15], a background on the key ethical, technical, legal and regulatory challenges of AI in breast imaging, and on its performance in breast screening, is provided.
Recently, microwave-based techniques have emerged and received attention as an alternative breast-screening tool [6,16,17,18,19]. Ultrawide band microwave breast imaging (UWB-MWBI) has provided strong evidence of the dielectric property contrast between healthy and malignant tissues. Usually, a multistatic or monostatic radar mechanism is employed to measure the dielectric property contrast of breast tissues in the spectrum from 0.5 GHz to 9 GHz [20,21]. To date, seven clinically tested MWBI operational systems have been reported in the literature, including: (a) UBT Srl, Italy, designed an MWBI system named MammoWave to differentiate healthy tissues from tissues with lesions; MammoWave has been tested on 150 patients [22]; (b) Dartmouth College, USA, constructed an MWBI device for breast cancer identification, where patterns were inspected by the Surgical Pathology team at Dartmouth-Hitchcock Medical Center (DHMC); the trial was performed with 150 patients with and without lesions [23]; (c) the Multistatic Array Processing for Radiowave Image Acquisition (MARIA) system was developed at the University of Bristol, UK, and a clinical trial was carried out with over 300 patients, including healthy and lesion breast patterns [24,25]; (d) the Tissue Sensing Adaptive Radar (TSAR) system was developed at the University of Calgary, Canada, and a clinical trial was reported with a small group of patients [26,27]; (e) Southern University of Science and Technology, China, constructed an MWBI system and completed its first trial with a small group of patients [28]; (f) Hiroshima University, Japan, trialled an MWBI system for cancer identification with a small group of patients [29]. Additionally, McGill University, Canada [30] and Shizuoka University, Japan [31] have developed their own MWBI systems and recently started trials.
Microwave breast screening is non-ionising, non-invasive, and painless, as it does not involve any form of breast compression. Only a handful of microwave-based studies have used real data to detect breast cancer; the majority rely on either numerical simulations or phantoms. UBT Srl’s MammoWave is one of the few ultra-wideband (UWB) microwave breast screening prototypes to have been built, tested, and validated. MammoWave uniquely functions in air, with 2 antennas rotating in the azimuth plane and operating within the frequency band of 1–9 GHz. MammoWave examinations record the complex $S_{21}$ (backscattered, i.e., returned, complex signals) in the frequency domain. Artefact removal is performed through appropriate mathematical procedures, namely rotation subtraction [32,33].

Contribution towards This Research

This work demonstrates that machine learning (ML) can help interpret the frequency spectrum collected through MammoWave in response to the stimulus, segregating breasts with and without lesions. Specifically, ML can distinguish the MammoWave signal response of breasts with no radiological finding (NF) from that of breasts with radiological findings (WF), i.e., with lesions (findings) which may be benign or malignant. The contributions of this study are: (a) the experiment has been performed across 61 breasts, enabling the exploration of lesions with different dimensions; (b) the new data appear differently in the hyperplane, motivating the authors to explore the Gaussian kernel of the SVM ($SVM_G$) alongside the quadratic kernel ($SVM_Q$), with the Gaussian kernel found to be more efficient in this case; (c) better use of the frequency response signals has been explored, and it is shown experimentally that 50 components obtained using principal component analysis (PCA) provide the best classification in this case; (d) the prediction results have been analysed by the team of researchers and radiologists through statistical measurements to understand the false positive and false negative classifications, revealing that lesion size and breast density have an effect on the microwave response as well as on the ML predictions.

2. Materials and Methods

A diagrammatic interpretation of the proposed work is presented in Figure 1. Each breast has a corresponding radiologist study assessment, which has been used as the gold standard for dividing the breasts into two categories: breasts with no radiological finding (NF), and breasts with radiological findings (WF), i.e., with lesions which may be benign or malignant.
Regarding the radiological study review, the radiologist’s assessment (NF/WF) used, at his/her discretion, one or more of the following conventional techniques: mammography, performed using the Selenia LORAD Mammography System (Hologic, Marlborough, MA, USA); echography, performed using the MyLab 70 xvg Ultrasound Scanner (Esaote, Genova, Italy); magnetic resonance imaging, performed with a 3.0 T MAGNETOM scanner (Siemens Healthcare, Erlangen, Germany). The gold standard labels of the breasts (NF or WF) have been employed to train and test the ML algorithms to automatically identify breast response signals backscattered from lesions via MammoWave. The approach follows six primary steps: (i) patients undergo breast examinations through conventional approaches, and the breast type (WF or NF) is annotated by the resident radiologist; (ii) patients undergo breast examination through MammoWave and the microwave signal data are collected; (iii) the resulting microwave signals are classified; (iv) the microwave feature space is reduced by applying PCA to improve the classification result; (v) the signals are classified employing only the real parts of the complex signals; (vi) the feature space of the real parts is reduced to improve the classification result. Each phase is explained in the following subsections.
The proposed research aims to identify the optimal classification settings for the research question using the components of the complex $S_{21}$ numbers. There are four possible ways to employ the complex numbers (without altering the form of the original signal) and perform the classification task: (i) applying the actual complex $S_{21}$ responses, (ii) using features extracted from the complex $S_{21}$ responses, (iii) using the resistance values (real part of the complex $S_{21}$ responses) from the transmitted voltage and current (the reactance, or imaginary part of $S_{21}$, varies inversely with increasing frequency and is thus unsuitable for this classification task), (iv) using features extracted from the real part of $S_{21}$. The classification directions followed in the proposed study to improve breast lesion identification performance are listed below.
  • MammoWave breast signal classification considering features extracted from the complex $S_{21}$ responses. Features have been extracted using PCA, a powerful mathematical tool for multivariate data transformation. $SVM_Q$ was chosen first for the ML task, following the team’s previous research. Subsequently, $SVM_G$ was also tested after observing spherical data shapes. $SVM_G$ performed better than $SVM_Q$ here; thus, $SVM_G$ has been adopted for the following classification tasks.
  • MammoWave breast signal classification considering the real parts of the complex $S_{21}$ responses and employing $SVM_G$.
  • MammoWave breast signal classification considering features extracted (by PCA) from the real parts of the complex $S_{21}$ and employing $SVM_G$.
Characteristics of the data used in each of these stages and their performance summaries are explained in the following sections.

Apparatus Description and Data Collection

The MammoWave device (shown in Figure 2a) has been designed by Umbria Bioengineering Technologies (UBT), Italy. The device comprises a cylindrical aluminium container equipped with a chamber and a cup in which the breast is comfortably placed while the woman (or participant) lies down to be screened. One transmitting antenna ($t_x$) and one receiving antenna ($r_x$) operate from 1 to 9 GHz to obtain the breast’s response to the applied microwave signals. The 2 antennas rotate around the breast, illuminating it from a number of angles. Figure 2b displays how patients undergo the MammoWave breast examination, placing the breast inside the chamber with no breast compression required. To support all patients, three different cup sizes are available, and the best fit for the patient’s breast is selected. The largest cup has a diameter of 135 mm; thus, it cannot accommodate very large breasts. The cups are made of polylactic acid (PLA) to ensure biocompatibility [34] and have a wall thickness of 1 mm. This thickness is based on modelling and experimental investigations by the team, which demonstrated that it has no effect on the microwave imaging outcomes [35].
The functioning principle of the MammoWave system is based on the contrast in dielectric properties (i.e., relative dielectric constant and conductivity) between normal breast tissue and tissue with lesions at microwave frequencies. The antennas inside the container (covered to absorb microwaves) are fitted at the same height in free space and can rotate across the azimuth to collect the microwave signals from diverse angular positions. The transmitting and receiving antennas are attached to a 2-port VNA (Cobalt C1209, Copper Mountain, Indianapolis, IN, USA) that operates up to 9 GHz. Measurements have been performed by recording the complex $S_{21}$ in a multi-bistatic fashion. For every transmitting and receiving position, the complex $S_{21}$ is collected from 1 to 9 GHz with 5 MHz sampling. Let $r_x$ rotate across the azimuth, collecting the microwave signals from diverse angular positions at a radius $a_0$. The received signals are indexed as follows: $n = 1, 2, \ldots, 80$ denotes the receiving points; $m = 1, 2, \ldots, 5$ indicates the transmitting sections; $p = 1, 2$ indicates the position inside each transmitting section; and $f$ is the frequency.
MammoWave uniquely functions in air, with 2 antennas rotating in the azimuth plane and operating within the frequency band of 1–9 GHz. The 2 antennas, one transmitting and one receiving, both rotate around the breast, illuminating it from a number of angles. MammoWave examinations are performed by recording the $S_{21}$ (backscattered, i.e., returned, complex signals) in the frequency domain. Artefact removal is performed through appropriate mathematical procedures, namely rotation subtraction [32,33]. The MammoWave system has been employed to collect patient data: according to the conventional radiologist review, 25 breasts without lesions and 36 breasts with lesions have been used in the work. Different $S_{21}$ signatures are found when the microwave signals interact with breast tissues. These signatures are due to the contrast in dielectric properties, i.e., permittivity and conductivity, within the spectrum of microwave frequencies as the signals pass through the breasts. A high contrast (up to 5) has been reported [36] between healthy breast tissue and malignant tissue, while recent studies confirm a high contrast only between fatty and malignant breast tissues, with a lower contrast between healthy fibroglandular and malignant tissues [37].
MammoWave removes the need for any matching liquid. For each breast, measurements have been performed by recording the complex $S_{21}$ in a multibistatic fashion. Specifically, we employed 15 transmitting positions, arranged in 5 triplets centred at 0°, 72°, 144°, 216°, and 288°; in each triplet, the transmitting positions are displaced by 4.5°. For each transmitting position $t_{x_m}$, the receiving antenna is moved to measure the received signal at 80 receiving points $r_{x_n}$, equally spaced along the azimuth at intervals of 4.5°. For each transmitting and receiving position, we recorded the complex $S_{21}$ in the frequency band from 1 to 9 GHz, using a frequency sampling of 5 MHz. Thus, for each breast, the raw data can be represented by a matrix of complex $S_{21}$ with dimension $(15 \times 80) \times (1601 \times 2)$. Figure 2c shows a pictorial view of the measurement setup. A breast scan is completed in 10 min, while the person lies in a horizontal, face-down position on a thin mattress, with no breast compression. The numbers and displacements of the transmitting and/or receiving positions can be changed [35]; specifically, increasing the number of transmitting and/or receiving positions increases the measurement time. We verified in phantoms that the proposed configuration allows detection in a reasonable measurement time [35], i.e., 10 min, a duration similar to that of a traditional X-ray breast screening examination.
The MammoWave feasibility clinical validation has been performed in Perugia and Foligno Hospitals, Italy (Ethical Committee of Umbria, Italy, approval N. 6845/15/AV/DM of 14/10/2015, N. 10352/17/NCAV of 16/03/2017, N 13203/18/NCAV of 17/04/2018). The corresponding clinical protocol aims to quantify the device’s accuracy in breast lesion detection. As an inclusion criterion, the subjects should have a radiologist study output obtained through conventional examinations (mammography and/or ultrasound and/or magnetic resonance imaging) within the last month. The protocol and procedures were in accordance with institutional and ethical standards in research, and with the Declaration of Helsinki (1964) and its later amendments.
For this study, we used data from 61 breasts, each with its own corresponding radiologist study review output, which has been used as the gold standard for classifying the breasts into two categories: breasts with no radiological finding (NF), and breasts with radiological findings (WF), i.e., with lesions which may be benign or malignant. Some details of the detected or suspected lesions have been collected for WF breasts; moreover, pathology and/or clinical follow-up has been performed for the final assessment of the lesions (benign/malignant). The subjects’ information is given in Table 1. In Table 2, some details of the radiologist study review are given for WF breasts.
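The raw-data dimensions quoted above follow directly from the acquisition geometry, and the arithmetic can be checked with a short sketch. The constant and function names below are illustrative only, and the layout of each triplet (centre, centre minus 4.5°, centre plus 4.5°) is an assumption; the text states only that positions within a triplet are displaced by 4.5°.

```python
# Acquisition geometry as described in the text; all names are illustrative.
TRIPLET_CENTRES_DEG = [0, 72, 144, 216, 288]  # centres of the 5 transmitting triplets
TRIPLET_STEP_DEG = 4.5                        # displacement inside each triplet
RX_POINTS = 80                                # receiving points per transmitting position
RX_STEP_DEG = 4.5                             # angular spacing of the receiving points
F_START_MHZ, F_STOP_MHZ, F_STEP_MHZ = 1000, 9000, 5


def transmitting_angles():
    """Return the 15 transmitting angles in degrees.

    Assumption: each triplet is laid out symmetrically around its centre.
    """
    angles = []
    for centre in TRIPLET_CENTRES_DEG:
        for offset in (-TRIPLET_STEP_DEG, 0.0, TRIPLET_STEP_DEG):
            angles.append((centre + offset) % 360)
    return angles


def raw_data_shape():
    """Return (rows, columns) of the per-breast raw-data matrix."""
    n_tx = len(transmitting_angles())                        # 15 transmitting positions
    n_freq = (F_STOP_MHZ - F_START_MHZ) // F_STEP_MHZ + 1    # 1601 frequency samples
    rows = n_tx * RX_POINTS                                  # one row per tx/rx pair
    cols = n_freq * 2                                        # real and imaginary parts
    return rows, cols


print(raw_data_shape())
```

The 80 receiving points at 4.5° spacing cover the full 360° azimuth, and the 5 MHz sampling over 1 to 9 GHz gives the 1601 frequency samples per signal.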

3. Proposed Methodology

The proposed ML approach for segregating breasts with and without lesions from the MammoWave signal consists of three main stages. First, PCA is applied to the data collected from 61 breasts to make optimal use of the frequency response signals. Then, SVM is applied to the features extracted by PCA for classification, and the results obtained with two different kernel functions are compared. Finally, the performance of the classification method is analysed and the results are validated statistically. A flow chart of the proposed method is shown in Figure 3.

3.1. Principal Component Analysis

Principal components (PCs) have been extracted from the MammoWave’s complex $S_{21}$ responses, represented as $\lambda = \sum_{n=1}^{N_F} \lambda_{real}(n) + j\lambda_{img}(n)$, where $n$ indexes the frequencies, $N_F = 1601$ is the number of frequencies, and $\lambda_{real}$ and $\lambda_{img}$ are the real and imaginary components respectively. $\lambda_{real}$ indicates the resistance of the dielectric materials (tissues) to the transmitted signals over the $N_F$ frequencies. $\lambda_{img}$ indicates the capacitance of the dielectric materials (tissues) for the transmitted signals over the $N_F$ frequencies. PCA has been implemented on both the $\lambda_{real}$ and $\lambda_{img}$ components, using the covariance matrix and its eigenvectors, to extract the principal components (PCs) used for classifying the signals obtained from the breasts.
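As a rough illustration of this step, the following sketch extracts PCs and their percentage of variance from a set of complex signals using NumPy. It is not the authors' implementation: stacking the real and imaginary parts side by side (giving 3202 columns per signal), the function name `pca_features`, and the random demo data are all assumptions made for the example.

```python
import numpy as np


def pca_features(signals, n_components):
    """Project complex signals onto their leading principal components.

    signals: complex array of shape (n_signals, n_freq).
    Returns (scores, explained_pct) for the first n_components PCs.
    """
    # Assumption: real and imaginary parts are stacked as separate features.
    X = np.hstack([signals.real, signals.imag])
    Xc = X - X.mean(axis=0)  # centre each feature
    # The rows of Vt are the eigenvectors of the covariance matrix of Xc.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    eigvals = s**2 / (X.shape[0] - 1)              # covariance eigenvalues
    explained_pct = 100.0 * eigvals / eigvals.sum()
    scores = Xc @ Vt[:n_components].T              # PC scores per signal
    return scores, explained_pct[:n_components]


# Synthetic stand-in for measured S21 data (random values, not real signals).
rng = np.random.default_rng(0)
demo = rng.normal(size=(200, 1601)) + 1j * rng.normal(size=(200, 1601))
scores, pct = pca_features(demo, 50)
print(scores.shape)
```

The `explained_pct` vector corresponds to the percentage of variance per component, which is the quantity used later to decide how many PCs to retain.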

3.2. Basic Theory of the Proposed Algorithms

Several ML algorithms for classification are available in the literature, although selecting an appropriate method is not straightforward and must be done heuristically. In a support vector machine (SVM), each breast signal is represented by an $S$-dimensional feature vector, and a class is assigned to each corresponding pattern [38]. The SVM assigns a class label to each signal pattern on the basis of its position with respect to a decision hyperplane, defined by Equation (1), where $w$ is the weight vector normal to the hyperplane, $b$ is the bias and $x$ is the feature vector. The margin is the region around the decision hyperplane bounded by $w^{T}x + b = \pm 1$, and the distance between these two boundaries is called the margin width. In an SVM, the kernel function must be chosen carefully to achieve good performance, since it defines the structure of the high-dimensional feature space in which the maximum-margin hyperplane of Equation (2) is determined.
$$w^{T}x + b = 0 \tag{1}$$
$$(\hat{w}, \hat{b}) = \arg\min_{w,b} \frac{1}{2}\lVert w \rVert^{2} \tag{2}$$
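For readers less familiar with SVMs, the link between Equation (2) and the margin can be made explicit. The following is the standard hard-margin formulation rather than anything specific to this study; the class labels $y_i \in \{-1, +1\}$ and the sample count $N$ are notation introduced here for illustration. The two margin boundaries $w^{T}x + b = \pm 1$ are separated by a distance of $2/\lVert w \rVert$, so minimising $\frac{1}{2}\lVert w \rVert^{2}$ maximises the margin width:

```latex
\begin{align}
\text{margin width} &= \frac{2}{\lVert w \rVert}, \\
(\hat{w}, \hat{b}) &= \arg\min_{w,b} \tfrac{1}{2}\lVert w \rVert^{2}
\quad \text{subject to} \quad y_i\,(w^{T}x_i + b) \ge 1, \quad i = 1,\dots,N .
\end{align}
```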
The data distribution appears to be non-linearly separable; hence, to handle this while avoiding model overfitting, the quadratic kernel ($SVM_Q$) and the Gaussian kernel ($SVM_G$) functions have been chosen for the NF and WF signal classification.
$SVM_Q$ has been employed to separate the data points of the two groups. The quadratic kernel function is defined by Equation (3), where $x_i$, $x_j$ are real-valued feature vectors and $d$ is the degree of the polynomial; here $d = 2$ (quadratic), since larger values tend to overfit.
$$k(x_i, x_j) = (x_i \cdot x_j + 1)^{d} \tag{3}$$
$SVM_G$ uses a popular kernel function, known for its excellent learning performance and its ability to approximate bounded and continuous functions arbitrarily well, defined by Equation (4). The non-linear dependence on the Euclidean distance is controlled by the kernel width parameter $\sigma$, which has a great influence on the classification accuracy, as it determines the feature space onto which the samples are mapped. As $\sigma \to 0$, all training samples are classified correctly, but generalisation is poor and the SVM cannot classify new samples; whereas as $\sigma \to \infty$, all samples are assigned to one class. This function creates the best hyperplane for classifying the subjects here.
$$k(x_i, x_j) = \exp\left(-\frac{\lVert x_i - x_j \rVert^{2}}{2\sigma^{2}}\right) \tag{4}$$
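The two kernels of Equations (3) and (4) can be written out directly, which also makes the limiting behaviour of $\sigma$ easy to see. This is a minimal sketch with illustrative function names, not the code used in the study.

```python
import numpy as np


def quadratic_kernel(xi, xj, d=2):
    """Polynomial kernel of Equation (3); d = 2 gives the quadratic case."""
    return (np.dot(xi, xj) + 1.0) ** d


def gaussian_kernel(xi, xj, sigma=1.0):
    """Gaussian (RBF) kernel of Equation (4) with width parameter sigma."""
    diff = np.asarray(xi, dtype=float) - np.asarray(xj, dtype=float)
    return np.exp(-np.dot(diff, diff) / (2.0 * sigma**2))


a, b = [0.0, 0.0], [1.0, 0.0]
# Very small sigma: distinct samples look completely dissimilar (risk of overfitting).
print(gaussian_kernel(a, b, sigma=1e-3))
# Very large sigma: all samples look alike (everything falls into one class).
print(gaussian_kernel(a, b, sigma=1e3))
```

With a very small $\sigma$ the kernel value between any two distinct samples collapses towards 0, while with a very large $\sigma$ it approaches 1 for every pair, which matches the two limiting cases described above.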

3.3. Performance Analysis

In this work, 61 breasts were used, of which 26 were NF and 35 were WF breasts (see Table 2). Summary statistics of the two raw microwave scan populations are shown in Table 3. The numerical range of the features in each population is not large, and the values are compact. Hence, normalisation could make the features insignificant for machine learning and has been avoided in this work.
MammoWave’s breast screening data are non-linearly separable, whereas logistic regression assumes linearity between the dependent and independent variables. Hence, logistic regression has not been attempted for classifying the data into the WF and NF classes. In the case of a decision tree, a small change in the test/unseen data can cause a large change in the structure of the tree, causing instability. A decision tree is also very sensitive to new data, which may put the whole system at risk. Therefore, decision trees have not been employed for the classification task. The leading supervised non-linear classifiers, such as k-nearest neighbour, multi-layer perceptron neural network, and support vector machines, have been attempted before and reported in a previously published article [39]. The quadratic kernel of the support vector machine ($SVM_Q$) was selected following those research findings. In that work, $SVM_Q$ was applied to a limited number of patients after dedicated $S_{21}$ pre-processing for artefact removal. Here, the aim is to explore the applicability of ML algorithms directly to $S_{21}$, i.e., the raw data, in which spherical data shapes were observed. Hence, the two algorithms $SVM_G$ and $SVM_Q$ are investigated and compared to obtain improved classification performance. The proposed work also explores possible ways of improving the previous results using principal component analysis (PCA) [40] on the raw $S_{21}$ signals (described in Section 4.1), the real parts of the $S_{21}$ signals (described in Section 4.2), and PCA over the real parts of the $S_{21}$ signals (described in Section 4.3), minimising false positive and false negative signals and identifying the appropriate numerical sequence for the classification of NF and WF signals. Even though the data are labelled into two groups by the radiologists, it is possible that the differences between the groups are too small to support classification once the elements of $S_{21}$ and the principal components are prepared.
Therefore, a two-sample t-test has been performed three times before applying the machine learning algorithm. The null hypothesis ($H_0$) of the proposed work assumes that the $S_{21}$ values, or the features extracted from them by PCA, of the two populations (NF and WF groups) come from independent random samples from normal distributions with equal means and equal but unknown variances. The alternative hypothesis ($H_a$) assumes that the $S_{21}$ values, or the features extracted by PCA, of the two populations (NF and WF groups) come from populations with unequal means. Hence, $H_a$ needs to be accepted in order to perform NF-WF signal classification. A significance level of $\alpha = 0.05$ has been used for accepting or rejecting the null hypothesis, with the p-value used to determine statistical significance. Additionally, the confidence interval for the difference in the population means of the NF and WF signals has been considered, where $C_L$ and $C_U$ denote the lower and upper limits of the confidence interval. The ML experiment has been divided into two major parts: (a) identification of the optimal feature combination through a training and validation phase (described in Sections 4.1 to 4.3), and (b) testing of the trained model with the optimal settings (described in Section 4.4). The data have been partitioned using Monte Carlo cross validation (MCCV) [41] in both cases, because it randomly selects the training and validation datasets, helping to understand the impact of risk and uncertainty on NF-WF breast signal prediction. The performance outcomes (statistical metrics) have been aggregated and averaged over all the rounds. Several statistical metrics, namely accuracy, sensitivity, and specificity, have been used to investigate the classification performance of the classifiers [42]. In addition, the Matthews Correlation Coefficient ($MCC$) [43] has been computed to assess the quality of the classification and the probability of an informed decision.
A receiver operating characteristic (ROC) curve has been generated for the best-performing ML model to show the diagnostic ability and stability of the classification system at different discrimination thresholds. Hyperparameter optimisation has been conducted in the training-validation phase, and the best operating point has been chosen by analysing the ROC curve. The optimised parameters and the ROC threshold have then been used to produce the final testing result (described in Section 4.4).
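The evaluation metrics and the random partitioning described above can be sketched as follows. This is an illustrative outline rather than the authors' code: the function names are hypothetical, WF is assumed to be the positive class, and the defaults of 80% training data and 25 rounds are taken from the settings reported in the results.

```python
import math
import random


def confusion_counts(y_true, y_pred, positive="WF"):
    """Count TP, FP, TN, FN, treating `positive` (assumed WF) as the positive class."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t == positive:
            fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity and MCC (both classes must be present)."""
    ac = (tp + tn) / (tp + fp + tn + fn)
    se = tp / (tp + fn)
    sp = tn / (tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Ac": ac, "Se": se, "Sp": sp, "MCC": mcc}


def mccv_splits(n_samples, train_fraction=0.8, n_rounds=25, seed=0):
    """Yield (train_idx, val_idx) pairs from independent random partitions."""
    rng = random.Random(seed)
    n_train = int(round(train_fraction * n_samples))
    for _ in range(n_rounds):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        yield idx[:n_train], idx[n_train:]


print(metrics(tp=8, fp=1, tn=9, fn=2))
```

In a full experiment, a classifier would be trained on each `train_idx` partition, scored on the matching `val_idx` partition, and the resulting metric dictionaries averaged over all rounds.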

4. Results Analysis

According to the radiologist’s review, a total of 34 patients have been included in this study, with a total of 61 breasts examined (see Table 1). Among the examined breasts, 25 NF and 36 WF breasts underwent the MammoWave exam, in which the $S_{21}$ raw data were collected. WF breasts are breasts having lesions, which may be benign or malignant; the lesions were found to have dimensions ranging up to 32 mm. The raw data of each breast are represented by a matrix of complex $S_{21}$ having dimensions $1200 \times 3202$ (described in Section 2), where each complex signal contains 1601 real and 1601 imaginary components. Hence, the classification experiment has been conducted on a total of 73,200 signals (31,200 NF signals and 42,000 WF signals). The complex $S_{21}$ raw data signals and their components have been employed individually for experimental purposes, where the prediction efficiency is influenced by the discriminating ability of the individual features, i.e., the real and imaginary parts of the complex $S_{21}$. Each simulation has been run twenty-five times to observe the stability of the results before reporting the average performance metrics.

4.1. Classification Applying PCs of Complex $S_{21}$

Figure 4 shows the percentage of total variance obtained from each PC for the $S_{21}$ of two different breasts; the first 80 PCs are found to be quantitatively significant. Beyond the first 80 PCs, the variance values are of the order of $10^{-5}$, hence extremely small and quantitatively insignificant; thus, the team investigates the first 80 PCs only. Figure 4a,b show an example of the percentage variance for the first 80 PCs, where the x axis and y axis represent the number of components and the percentage of variance respectively. Figure 4a displays the percentage of variance of an NF breast and Figure 4b that of a WF breast. Hence, the experiment with a reduced feature space has been started from 80 PCs, and features are progressively eliminated until the optimal performance is achieved. Maximum variances of 23.774% and 32.289% were obtained for the NF and WF breasts, respectively, within the 80 PCs mentioned above.
Table 4 shows the outcomes of the t-test, where $p < \alpha$ rejects the null hypothesis $H_0$ ($H_0 = 1$) and accepts the alternative hypothesis $H_a$, and the confidence interval for the difference in population means lies between $-3 \times 10^{-5}$ and $-1.500 \times 10^{-5}$. Hence, the acceptance of the alternative hypothesis indicates that the $\lambda_{real}$ data come from populations with unequal means and can be employed for the NF-WF signal classification task.
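The two-sample t-test with equal but unknown variances described in Section 3.3 can be sketched using only the Python standard library. The function name and the returned keys are hypothetical. As a stated simplification, the p-value and confidence limits below use the normal approximation to the t distribution, which is reasonable only for large samples such as the tens of thousands of signals per class used here; an exact test would use the t distribution with $n_a + n_b - 2$ degrees of freedom.

```python
import math
from statistics import NormalDist, fmean


def two_sample_t_test(a, b, alpha=0.05):
    """Pooled-variance two-sample test of equal means (large-sample approximation)."""
    na, nb = len(a), len(b)
    ma, mb = fmean(a), fmean(b)
    va = sum((x - ma) ** 2 for x in a) / (na - 1)   # sample variance of a
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)   # sample variance of b
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    se = math.sqrt(pooled * (1 / na + 1 / nb))      # standard error of the difference
    t = (ma - mb) / se
    z = NormalDist()
    p = 2 * (1 - z.cdf(abs(t)))                     # two-sided p-value (normal approx.)
    crit = z.inv_cdf(1 - alpha / 2)
    diff = ma - mb
    # h = 1 means the null hypothesis of equal means is rejected at level alpha.
    return {"h": int(p < alpha), "t": t, "p": p,
            "CL": diff - crit * se, "CU": diff + crit * se}


# Synthetic example: two groups whose means differ by 1.
group_a = [0.0, 1.0] * 500
group_b = [1.0, 2.0] * 500
print(two_sample_t_test(group_a, group_b))
```

When the whole confidence interval lies on one side of zero, as in the reported limits $C_L$ and $C_U$, the null hypothesis of equal means is rejected at the chosen significance level.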
The results obtained here are based on the selection of 80 PCs following the investigation of Figure 4. With the number of PCs varied from 80 to 40 in steps of 10, the variation in the classification performance of $SVM_Q$ and $SVM_G$ was obtained and is tabulated in Table 5. Accuracy ($A_c$), sensitivity ($S_e$), specificity ($S_p$), and the Matthews Correlation Coefficient ($MCC$) have been computed to investigate the predictions provided by $SVM_Q$ and $SVM_G$. Table 5a,b show the classification performance of $SVM_Q$ and $SVM_G$ respectively, with one set of performance metrics for each number of principal components (PCs). The optimal performance of both $SVM_Q$ and $SVM_G$ in this case was obtained with the first 50 PCs. The outcomes have been plotted in Figure 5 for graphical comparison. Figure 5a,c,e,g show the $A_c$, $S_e$, $S_p$, and $MCC$ respectively obtained with $SVM_Q$. The best sensitivity $S_e$ obtained by $SVM_Q$, employing 50 PCs, is 0.448, which is not satisfactory for breast lesion identification. Although the specificity $S_p$ achieved is 0.820, the misclassification of WF signals (signals reflected from lesions) lowered the overall $A_c$ and $MCC$. The best performance of $SVM_Q$ was achieved employing 80% training data (and 20% validation data) with 50 PCs, where $A_c = 66.80\%$, $S_e = 44.80\%$, $S_p = 82\%$, and $MCC = 29\%$.
Figure 5b,d,f,h show the $A_c$, $S_e$, $S_p$, and $MCC$ respectively obtained with $SVM_G$. For $SVM_G$, all the performance metrics achieved are satisfactory. The best performance of $SVM_G$ was obtained using 80% training data (i.e., 20% validation data) with 50 PCs, giving $A_c = 90.90\%$, $S_e = 84.30\%$, $S_p = 95.30\%$, and $MCC = 81\%$. Further reducing the number of PCs (to 40) lowers the classification performance of $SVM_G$ (shown in Figure 5 and Table 5b).

4.2. Classification Applying Real Parts of Complex $S_{21}$

In the previous experiments the SVM performed well with the Gaussian kernel, so $\mathrm{SVM}_G$ has been retained here to obtain optimal outcomes for NF and WF breast signal classification. Only the real components $\sum_{n=1}^{N_F} \lambda_{real}(n)$ of the complex $S_{21}$ signals $\lambda_n = \sum_{n=1}^{N_F=1601} \lambda_{real}(n) + j\lambda_{img}(n)$, $N_F = 1601$, have been chosen for this classification, employing $\mathrm{SVM}_G$ and the workflow shown in Figure 3. The resistance components $\lambda_{real}(n)$ have been employed as features to inspect whether they are more revealing for NF-WF signal prediction with $\mathrm{SVM}_G$ and whether they increase the number of true predictions.
The $\lambda_{real}(n)$ data have been analysed with a two-sample t-test comparing the average values of $\lambda_{real}(n)$ in the two classes, i.e., NF and WF signal data. Table 6 shows the outcomes of the t-test: p < α rejects the null hypothesis $H_0$ ($H_0 = 1$) in favour of the alternative hypothesis $H_a$, and the true mean of the population lies between 6.600 × 10 5 and 4.600 × 10 5 . Hence, accepting the alternative hypothesis indicates that the real $S_{21}$ data come from populations with unequal means and can be employed for the NF-WF signal classification task.
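The two-sample test described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: it uses the unequal-variance (Welch) form of the statistic and a normal approximation for the p-value and interval, which is reasonable only for large samples. The function name and the synthetic input values are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def two_sample_test(a, b, alpha=0.05):
    """Two-sample test for unequal means (Welch form, large-sample p-value)."""
    diff = mean(a) - mean(b)
    # Standard error of the difference, without assuming equal variances.
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    t_stat = diff / se
    # Assumption: normal approximation to the t distribution.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    ci = (diff - z * se, diff + z * se)  # interval for the difference in means
    return {"t": t_stat, "p": p_value, "ci": ci, "reject_h0": p_value < alpha}


# Synthetic, deterministic stand-ins for NF and WF feature values.
nf_values = [float(i % 10) for i in range(200)]
wf_values = [float(i % 10) + 1.0 for i in range(200)]
result = two_sample_test(nf_values, wf_values)
print(result)
```

When the interval for the difference in means excludes zero, the null hypothesis of equal means is rejected at level α, which is the condition the text uses to justify the features for classification.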
Table 7 shows the NF-WF signal classification performance using $\lambda_{real}(n)$ with $\mathrm{SVM}_G$. The best classification performance was obtained with 80% training data (i.e., 20% validation data), where $A_c$, $S_e$, $S_p$, and $MCC$ are 0.798, 0.704, 0.863, and 0.577 respectively. The performance metrics did not improve on the previous test (using PCs obtained from the actual complex $S_{21}$ sequences). The sensitivity $S_e = 0.704$ clearly indicates an increase in the misidentification of WF (or lesion) signals. These misidentifications also lower the $MCC$ measure ($= 0.577$). All these results are shown in Figure 6. This parameter setting is unreliable, as a significant number of misidentifications were found, which is also reflected in the performance metrics. The 20% validation data corresponds to 14,640 signals, of which approximately 3597 were misidentified (approximately 2962 false positive signals and 635 false negative signals) in each run.
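The validation-set size quoted above follows from the 61 breasts in the study and the 1200 complex $S_{21}$ signals per raw scan stated in this article, and the misidentification count is the sum of the two reported error counts:

```latex
N_{\text{signals}} = 61 \times 1200 = 73{,}200, \qquad
N_{\text{val}} = 0.20 \times 73{,}200 = 14{,}640, \qquad
2962 + 635 = 3597
```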

4.3. Classification Applying PCs of Real Parts of Complex $S_{21}$

The third phase of the proposed work extracts principal components from $\lambda_{real}(n)$ and applies $\mathrm{SVM}_G$ for NF-WF classification (as stated in Figure 3). The PCs extracted from $\lambda_{real}(n)$ are denoted $\sigma_1, \sigma_2, \ldots, \sigma_n$. Two vectors of PC variances, one from an NF breast and one from a WF breast, have been examined to study the magnitude of the variances and thereby choose the number of PCs for the classification task, as shown in Figure 7. Significant variance was found up to 80 PCs ($\sigma_1, \sigma_2, \ldots, \sigma_{80}$), which were selected for NF-WF signal classification. The number of PCs was then varied in anticipation of improved performance. In addition, the spherical data distribution and greater compactness than before may help classification.
The variances of the PCs are close to each other in Figure 7. The two-sample t-test has been repeated on the PCs to assess their capability to represent the two signal groups and the data compactness. The probability was found to be less than the significance level, p < α. Hence, the t-test accepts the alternative hypothesis $H_a$ and clearly demonstrates the presence of two different means for the two populations. Moreover, the difference between the lower and upper boundaries ( 1.770 × 10 4 and 1.570 × 10 4 ) is reduced, which implies improved data compactness compared with before, as detailed in Table 8.
NF-WF signal classification has been performed with the PCs extracted from $\lambda_{real}(n)$ (up to 80 PCs) using $\mathrm{SVM}_G$ for breast lesion identification. The classification results are tabulated in Table 9, where $A_c$, $S_e$, $S_p$, and $MCC$ are presented for a varying number of PCs (in steps of 10). The optimal classification performance was achieved with 50 PCs and begins to decrease upon further reduction (i.e., 40 PCs). The resulting metrics are plotted in Figure 8a–d to compare each metric across the number of PCs. $A_c$, $S_e$, $S_p$, and $MCC$ increased from 90.90% to 91%, 84.30% to 84.40%, 95.30% to 95.50%, and 81% to 81.20% respectively when using the PCs ($\sigma_1, \sigma_2, \ldots, \sigma_{50}$) extracted from $\lambda_{real}(n)$ instead of the PCs extracted from $\lambda_n = \sum_{n=1}^{N_F=1601} \lambda_{real}(n) + j\lambda_{img}(n)$. Classification performance thus improved slightly on that achieved in Section 4.1, suggesting that PCs derived from the resistance components are more informative than PCs derived from both the $S_{21}$ resistance and reactance components for representing NF-WF breast signals (as well as the NF-WF breasts). Hence, the performance obtained from 50 PCs with $\mathrm{SVM}_G$ is considered the optimal validation classification performance. Two further parameters, the regularisation parameter and the kernel scaling parameter, have been tuned by searching the space of possible hyperparameter values for those that give the best validation results for the proposed model. A regularisation parameter of 1 was found to be optimal, while the model was sensitive to variation of the kernel scale. Therefore, the ROC curve has been calculated while varying the kernel scaling parameter in order to select its optimal value; the curve is displayed in Figure 9, and the measured area under the ROC curve is AUC = 0.99. The x- and y-axes of Figure 9 represent the false positive rate (FPR), i.e., 1 − specificity, and the true positive rate (TPR), i.e., sensitivity, respectively.
The ROC analysis identified a kernel scaling parameter of 1.8 as the operating point that balances the true and false positive rates. Hence, the optimal validation results, $A_c$ = 95.07%, $S_e$ = 92.40%, $S_p$ = 98.32%, and $MCC$ = 89.90%, were obtained with a kernel scaling parameter of 1.8.
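The selection step described above can be illustrated with a short sketch. It assumes that each candidate kernel scale yields one (FPR, TPR) pair, that the area under the curve is estimated with the trapezoidal rule, and that the balanced operating point is the one maximising TPR − FPR (Youden's J); the article does not state which balancing criterion was used, so this choice is an assumption. All numbers below are hypothetical and are not values from the study.

```python
def roc_auc(points):
    """Trapezoidal area under a ROC curve given (FPR, TPR) points."""
    # Anchor the curve at (0, 0) and (1, 1) and order the points by FPR.
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


def best_operating_point(candidates):
    """Pick the candidate with the largest TPR - FPR (Youden's J)."""
    return max(candidates, key=lambda c: c["tpr"] - c["fpr"])


# Hypothetical (kernel scale, FPR, TPR) triples, for illustration only.
candidates = [
    {"scale": 0.5, "fpr": 0.00, "tpr": 0.60},
    {"scale": 1.0, "fpr": 0.01, "tpr": 0.85},
    {"scale": 2.0, "fpr": 0.05, "tpr": 0.95},
    {"scale": 4.0, "fpr": 0.30, "tpr": 0.99},
]
auc = roc_auc([(c["fpr"], c["tpr"]) for c in candidates])
best = best_operating_point(candidates)
print(auc, best["scale"])
```

Other criteria, such as the point closest to the top-left corner of the ROC plot, would be equally defensible and may select a different scale.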

4.4. Classification Applying Optimal Settings: Training, Validation, Testing Experiment

The training and validation process indicated that the 50 PCs extracted from the real parts of the complex $S_{21}$ are the most efficient features for classifying NF and WF signals among all the feature extraction combinations employed above. The experiment was then reorganised with the data split into training, validation, and testing sets to avoid data contamination and to assess the final unbiased model performance on a truly unseen test set. Table 10 reports the final performance obtained in the proposed work. As only a limited number of breasts (data from 61 breasts) is available at this stage of the research, 20% (12 randomly selected breasts) and 25% (15 randomly selected breasts) of the data were stratified and held out as test sets using the MCCV procedure. The remaining 80% and 75% of the data were used for the training-validation process with the optimal 50 PCs of the real parts and $\mathrm{SVM}_G$, allocating 80% to training and 20% to validation, as settled in Section 4.3. The same performance metrics (accuracy, sensitivity, specificity, and MCC) were calculated to analyse the performance. The trained and validated $\mathrm{SVM}_G$ model with 50 PCs of the real parts of the complex $S_{21}$ performs best with the 20% testing dataset (and 80% training-validation set). The attained test performance is accuracy $A_c$ = 95.50%, sensitivity $S_e$ = 97.20%, specificity $S_p$ = 94.50%, and $MCC$ = 90.90%, whereas the validation metrics were $A_c$ = 95.00%, $S_e$ = 92.40%, $S_p$ = 98.30%, and $MCC$ = 89.90% before.
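A breast-level, stratified hold-out repeated several times is one way to realise the MCCV test split described above. The sketch below is illustrative only: the helper names, the number of repeats, the seed, and in particular the NF/WF class balance are hypothetical, since the per-class breast counts are not assumed here. Splitting by breast rather than by signal keeps all signals from one breast on the same side of the split.

```python
import random


def stratified_holdout(labels, test_fraction, rng):
    """Hold out a fraction of breasts per class; return (train_ids, test_ids)."""
    test_ids = []
    for cls in sorted(set(labels.values())):
        ids = sorted(b for b, lab in labels.items() if lab == cls)
        rng.shuffle(ids)
        n_test = round(test_fraction * len(ids))
        test_ids.extend(ids[:n_test])
    test_set = set(test_ids)
    train_ids = [b for b in labels if b not in test_set]
    return train_ids, sorted(test_set)


def monte_carlo_splits(labels, test_fraction, n_repeats, seed=0):
    """Repeat the random stratified hold-out n_repeats times."""
    rng = random.Random(seed)
    return [stratified_holdout(labels, test_fraction, rng)
            for _ in range(n_repeats)]


# Hypothetical class balance for 61 breasts (36 WF, 25 NF).
labels = {i: ("WF" if i < 36 else "NF") for i in range(61)}
splits = monte_carlo_splits(labels, test_fraction=0.20, n_repeats=5)
print([len(test) for _, test in splits])
```

With this hypothetical balance, a 20% hold-out yields 12 test breasts and a 25% hold-out yields 15, matching the counts quoted in the text; other class balances may round differently.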

5. Discussion & Conclusions

The proposed methodology has three primary focuses: (1) the absence of radiation exposure with MammoWave has many potential benefits compared to mammography; specifically, women will benefit from breast cancer screening that is safe, accessible, radiation-free, more inclusive (no age limitation), and more comfortable (no breast compression); (2) the study is part of a retrospective clinical trial; (3) the adoption of robust machine learning models for microwave breast imaging makes it a cutting-edge solution for safe breast lesion detection.
The structure of the proposed support vector machine is simple to interpret and adapted flexibly to the data. The proposed approach performed well on statistically and biologically dependent data, since the signals are generated and transmitted in the same frequency band for every breast scan while each patient’s body responds differently to microwave signal transmission. The signal data are fused while being prepared for the ML application. The procedure is computationally light and fast. Results are cross-validated using the MCCV method. Therefore, the proposed research is protected from several limitations. The experiment in [39] showed that the breast frequency response is discriminative and independent, and that an SVM with a quadratic kernel can differentiate the signal responses reflected from NF and WF breasts with acceptable sensitivity and specificity. However, in [39], $\mathrm{SVM}_Q$ was applied to a limited number of patients after dedicated $S_{21}$ pre-processing for artefact removal. The proposed work aims to provide a quantitative portrayal of NF-WF breast classification through MammoWave by applying the ML algorithm directly to the raw $S_{21}$, i.e., raw data. In this scenario, $\mathrm{SVM}_G$ outperformed $\mathrm{SVM}_Q$. Specifically, $\mathrm{SVM}_G$ was executed on the NF-WF signals of 61 breasts from 35 patients who participated in the feasibility clinical trial, showing satisfactory and improved prediction performance. Accuracy > 91%, sensitivity > 84%, specificity > 95%, and MCC > 81% were achieved in this study. Hence, the proposed study attained strong performance in the task of automatically separating NF and WF (or lesion) signals from raw signal data. The achieved sensitivity (84.40%) and specificity (95.50%) are similar to those of digital mammography [44,45]. However, MammoWave breast screening is non-invasive and painless, and can be used at all ages, during pregnancy, and multiple times.
The sensitivity value aligns with that of the MARIA system, which has been used in symptomatic patients [24,25]. The MARIA system (Micrima Ltd, UK) uses an array of 60 antennas and a matching liquid to carry out the radar approach, which is far more complex than the one presented here. The proposed classification algorithm predicted false negatives (actually WF but predicted as NF) in some cases, affecting the sensitivity measure (84.40%). This was due to the presence of very small lesions (a few mm or smaller). The differences between the signal values of NF breasts and of WF breasts with very small lesions are low. Therefore, NF breasts and WF breasts with very small lesions behave similarly numerically, which misguides the classification process and results in false negative predictions. This issue will be addressed in our future work by modifying the conventional $\mathrm{SVM}_G$ kernel structure and performing advanced research on feature representation. Moreover, we will investigate the use of microwave image features [22] for dedicated machine learning models, following a procedure similar to [46], where the authors used radiomics derived from Contrast-Enhanced Spectral Mammography images, obtaining a sensitivity and specificity of 88.37% and 100%.
The raw scan of each breast contains 1200 complex $S_{21}$ signals. Classifying NF-WF signals with high specificity and sensitivity will help to decide the threshold at which a breast is labelled NF or WF, which in turn will help to determine the further clinical procedure for the patient. Research on determining this threshold is underway. Further research and more breast data are also required to generalise that threshold for breast classification, because a main limitation of the study is that a limited number of breasts has been used. Research is in progress using data collected in other, recently completed MammoWave clinical trials (https://clinicaltrials.gov/ct2/show/NCT04253366). In this context, the ML study will be continued with ongoing MammoWave clinical trial data [47]. Moreover, further clinical trials are planned to enlarge the resulting research database [48], paving the way for the use of microwave imaging in clinical practice as a complementary tool for the screening of asymptomatic women of any age and without any safety restrictions. It is worth pointing out that one of the goals of the HORIZON-MISS-2021-CANCER-02-01 scheme is to validate new methods and technologies for cancer screening and early detection, preferably non-invasive and more inclusive than current approaches. In this context, one of the selected projects aims to generate evidence on thousands of women regarding the use of MammoWave as a breast cancer screening technique [48].
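The breast-level decision mentioned above could, for example, take the form of a threshold on the share of a scan's 1200 signals that are predicted WF. The sketch below only illustrates that idea: the function name is hypothetical and the threshold of 0.5 is a placeholder, since the article states that the actual threshold is still under investigation.

```python
def classify_breast(signal_predictions, wf_fraction_threshold):
    """Label a breast WF when the share of WF-predicted signals meets the threshold."""
    if not signal_predictions:
        raise ValueError("no signal predictions supplied")
    wf_share = (sum(1 for p in signal_predictions if p == "WF")
                / len(signal_predictions))
    label = "WF" if wf_share >= wf_fraction_threshold else "NF"
    return label, wf_share


# 1200 signal-level predictions for one hypothetical breast scan.
predictions = ["WF"] * 700 + ["NF"] * 500
# Placeholder threshold; the study has not yet fixed this value.
label, share = classify_breast(predictions, wf_fraction_threshold=0.5)
print(label, round(share, 3))
```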

Author Contributions

G.T. and the team have designed and manufactured the UWB microwave non-ionising apparatus with associated signal processing techniques comprising MammoWave. Subjects have been recruited and screened through MammoWave. R.L. and M.D. (Michele Duranti) have conducted the conventional radiological investigation on the same subjects and confirmed the outcomes of UWB MammoWave experiment. S.P.R. analysed, interpreted the data for ML application, performed the ML algorithms, and made the draft. S.P.R. and M.D. (Maitreyee Dey) have analysed the prediction outcomes for automatic breast lesion detection through clinical UWB MammoWave. S.D. and G.T. have supervised the work and managed the experiments performed along with the co-authors at LSBU. S.D., G.T. and M.G. instigated the collaborative work on this paper between the teams at UBT, Perugia and LSBU. All authors have read and agreed to the published version of the manuscript.

Funding

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 830265. This project leading to this application has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No. 793449. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No. 872752.

Institutional Review Board Statement

The protocol and procedures were conducted in accordance with institutional and ethical standards in research, with Declaration of Helsinki (1964) and its later amendments. The MammoWave feasibility clinical validation has been performed in Perugia and Foligno Hospitals, Italy (Ethical Committee of Umbria, Italy, approval N. 6845/15/AV/DM of 14/10/2015, N. 10352/17/NCAV of 16/03/2017, N 13203/18/NCAV of 17/04/2018).

Informed Consent Statement

Prior to the trial, informed consent was obtained from all participants involved in the study.

Data Availability Statement

The datasets that support the findings of this study are not publicly available, but will be made available upon reasonable request, following ethics committee approval and a data transfer agreement to guarantee the General Data Protection Regulation. Please contact the authors, Soumya Prakash Rana (Email: [email protected], [email protected]) or Gianluigi Tiberi (Email: [email protected], [email protected]) to request access to the data.

Conflicts of Interest

Gianluigi Tiberi is a shareholder of UBT (Umbria Bioengineering Technologies). This does not alter our adherence to MDPI journal policies on sharing data and materials.

References

  1. Yedjou, C.G.; Sims, J.N.; Miele, L.; Noubissi, F.; Lowe, L.; Fonseca, D.D.; Alo, R.A.; Payton, M.; Tchounwou, P.B. Health and racial disparity in breast cancer. In Breast Cancer Metastasis Drug Resistance; Springer: Basel, Switzerland, 2019; pp. 31–49.
  2. Miglioretti, D.L.; Lange, J.; Van Den Broek, J.J.; Lee, C.I.; Van Ravesteyn, N.T.; Ritley, D.; Kerlikowske, K.; Fenton, J.J.; Melnikow, J.; De Koning, H.J.; et al. Radiation-induced breast cancer incidence and mortality from digital mammography screening: A modeling study. Ann. Intern. Med. 2016, 164, 205–214.
  3. Seiffert, K.; Thoene, K.; Zu Eulenburg, C.; Behrens, S.; Schmalfeldt, B.; Becher, H.; Chang-Claude, J.; Witzel, I. The effect of family history on screening procedures and prognosis in breast cancer patients-Results of a large population-based case-control study. Breast 2021, 55, 98–104.
  4. Miller, A.M.; Champion, V.L. Attitudes about breast cancer and mammography: Racial, income, and educational differences. Women Health 1997, 26, 41–63.
  5. Loving, V.A.; Aminololama-Shakeri, S.; Leung, J.W. Anxiety and its association with screening mammography. J. Breast Imaging 2021, 3, 266–272.
  6. Tiberi, G.; Raspa, G. Apparatus for Testing the Integrity of Mammary Tissues. US Patent 10,349,863, 2019.
  7. Sensitivity, Specificity, and False Negative Rate for 1,682,504 Screening Mammography Examinations from 2007–2013. In Technical Report; Breast Cancer Surveillance Consortium (BCSC): Seattle, WA, USA, 2017.
  8. Stout, N.K.; Lee, S.J.; Schechter, C.B.; Kerlikowske, K.; Alagoz, O.; Berry, D.; Buist, D.S.; Cevik, M.; Chisholm, G.; De Koning, H.J.; et al. Benefits, harms, and costs for breast cancer screening after US implementation of digital mammography. JNCI: J. Natl. Cancer Inst. 2014, 106, dju092.
  9. Nelson, H.D.; Pappas, M.; Cantor, A.; Griffin, J.; Daeges, M.; Humphrey, L. Harms of breast cancer screening: Systematic review to update the 2009 US Preventive Services Task Force recommendation. Ann. Intern. Med. 2016, 164, 256–267.
  10. Fanizzi, A.; Basile, T.; Losurdo, L.; Amoroso, N.; Bellotti, R.; Bottigli, U.; Dentamaro, R.; Didonna, V.; Fausto, A.; Massafra, R.; et al. Hough transform for clustered microcalcifications detection in full-field digital mammograms. In Applications of Digital Image Processing XL; International Society for Optics and Photonics: San Diego, CA, USA, 2017; Volume 10396, p. 1039616.
  11. Bibault, J.E.; Giraud, P.; Burgun, A. Big data and machine learning in radiation oncology: State of the art and future prospects. Cancer Lett. 2016, 382, 110–117.
  12. Zhang, B.; He, X.; Ouyang, F.; Gu, D.; Dong, Y.; Zhang, L.; Mo, X.; Huang, W.; Tian, J.; Zhang, S. Radiomic machine-learning classifiers for prognostic biomarkers of advanced nasopharyngeal carcinoma. Cancer Lett. 2017, 403, 21–27.
  13. Huang, S.; Yang, J.; Fong, S.; Zhao, Q. Artificial intelligence in cancer diagnosis and prognosis: Opportunities and challenges. Cancer Lett. 2020, 471, 61–71.
  14. McKinney, S.M.; Sieniek, M.; Godbole, V.; Godwin, J.; Antropova, N.; Ashrafian, H.; Back, T.; Chesus, M.; Corrado, G.S.; Darzi, A.; et al. International evaluation of an AI system for breast cancer screening. Nature 2020, 577, 89–94.
  15. Hickman, S.E.; Baxter, G.C.; Gilbert, F.J. Adoption of artificial intelligence in breast imaging: Evaluation, ethical constraints and limitations. Br. J. Cancer 2021, 125, 15–22.
  16. Meaney, P.M.; Fanning, M.W.; Raynolds, T.; Fox, C.J.; Fang, Q.; Kogel, C.A.; Poplack, S.P.; Paulsen, K.D. Initial clinical experience with microwave breast imaging in women with normal mammography. Acad. Radiol. 2007, 14, 207–218.
  17. Bahramiabarghouei, H.; Porter, E.; Santorelli, A.; Gosselin, B.; Popović, M.; Rusch, L.A. Flexible 16 antenna array for microwave breast cancer detection. IEEE Trans. Biomed. Eng. 2015, 62, 2516–2525.
  18. Bond, E.J.; Li, X.; Hagness, S.C.; Van Veen, B.D. Microwave imaging via space-time beamforming for early detection of breast cancer. IEEE Trans. Antennas Propag. 2003, 51, 1690–1705.
  19. Fear, E.C.; Li, X.; Hagness, S.C.; Stuchly, M.A. Confocal microwave imaging for breast cancer detection: Localization of tumors in three dimensions. IEEE Trans. Biomed. Eng. 2002, 49, 812–822.
  20. Meaney, P.M.; Fanning, M.W.; Li, D.; Poplack, S.P.; Paulsen, K.D. A clinical prototype for active microwave imaging of the breast. IEEE Trans. Microw. Theory Tech. 2000, 48, 1841–1853.
  21. O’Loughlin, D.; O’Halloran, M.; Moloney, B.M.; Glavin, M.; Jones, E.; Elahi, M.A. Microwave breast imaging: Clinical advances and remaining challenges. IEEE Trans. Biomed. Eng. 2018, 65, 2580–2590.
  22. Sani, L.; Vispa, A.; Loretoni, R.; Duranti, M.; Ghavami, N.; Alvarez Sánchez-Bayuela, D.; Caschera, S.; Paoli, M.; Bigotti, A.; Badia, M.; et al. Breast lesion detection through MammoWave device: Empirical detection capability assessment of microwave images’ parameters. PLoS ONE 2021, 16, e0250005.
  23. Meaney, P.M.; Kaufman, P.A.; Muffly, L.S.; Click, M.; Poplack, S.P.; Wells, W.A.; Schwartz, G.N.; di Florio-Alexander, R.M.; Tosteson, T.D.; Li, Z.; et al. Microwave imaging for neoadjuvant chemotherapy monitoring: Initial clinical experience. Breast Cancer Res. 2013, 15, 1–16.
  24. Preece, A.W.; Craddock, I.; Shere, M.; Jones, L.; Winton, H.L. MARIA M4: Clinical evaluation of a prototype ultrawideband radar scanner for breast cancer detection. J. Med Imaging 2016, 3, 033502.
  25. Massey, H.; Ridley, N.; Lyburn, I.; Taylor, S.; Schoenleber-Lewis, M.; Bannister, P.; Shere, M.H. Radio-wave detection of breast cancer in the symptomatic clinic: a multi-centre study. In Proceedings of the International Cambridge Conference on Breast Imaging, Cambridge, UK, 3–4 July 2017; pp. 3–4.
  26. Curtis, C.; Lavoie, B.R.; Fear, E. An analysis of the assumptions inherent to near-field beamforming for biomedical applications. IEEE Trans. Comput. Imaging 2017, 3, 953–965.
  27. Yang, F.; Sun, L.; Hu, Z.; Wang, H.; Pan, D.; Wu, R.; Zhang, X.; Chen, Y.; Zhang, Q. A large-scale clinical trial of radar-based microwave breast imaging for Asian women: Phase I. In Proceedings of the 2017 IEEE International Symposium on Antennas and Propagation & USNC/URSI National Radio Science Meeting, San Diego, CA, USA, 9–14 July 2017; pp. 781–783.
  28. Kurrant, D.; Bourqui, J.; Fear, E. Surface estimation for microwave imaging. Sensors 2017, 17, 1658.
  29. Song, H.; Sasada, S.; Kadoya, T.; Okada, M.; Arihiro, K.; Xiao, X.; Kikkawa, T. Detectability of breast tumor by a hand-held impulse-radar detector: Performance evaluation and pilot clinical study. Sci. Rep. 2017, 7, 1–11.
  30. Porter, E.; Duff, K.; Popovic, M.; Coates, M. Investigation of time-domain microwave radar with breast clinic patients. In Proceedings of the 2016 10th European Conference on Antennas and Propagation (EuCAP), Davos, Switzerland, 10–15 April 2016; pp. 1–3.
  31. Kuwahara, Y.; Malik, A. Microwave imaging for early breast cancer detection. New Perspectives Breast Imaging; IntechOpen: London, UK, 2017; pp. 45–71.
  32. Ghavami, N.; Tiberi, G.; Edwards, D.J.; Monorchio, A. UWB microwave imaging of objects with canonical shape. IEEE Trans. Antennas Propag. 2011, 60, 231–239.
  33. Tiberi, G.; Sani, L.; Ghavami, N.; Paoli, M.; Vispa, A.; Raspa, G.; Vannini, E.; Saracini, A.; Duranti, M. Sensitivity assessment of a microwave apparatus for breast cancer detection. In Proceedings of the European Congress of Radiology-ECR 2018, Vienna, Austria, 28 February–4 March 2018.
  34. Da Silva, D.; Kaduri, M.; Poley, M.; Adir, O.; Krinsky, N.; Shainsky-Roitman, J.; Schroeder, A. Biocompatibility, biodegradation and excretion of polylactic acid (PLA) in medical implants and theranostic systems. Chem. Eng. J. 2018, 340, 9–14.
  35. Sani, L.; Ghavami, N.; Vispa, A.; Paoli, M.; Raspa, G.; Ghavami, M.; Sacchetti, F.; Vannini, E.; Ercolani, S.; Saracini, A.; et al. Novel microwave apparatus for breast lesions detection: Preliminary clinical results. Biomed. Signal Process. Control 2019, 52, 257–263.
  36. Lazebnik, M.; Popovic, D.; McCartney, L.; Watkins, C.B.; Lindstrom, M.J.; Harter, J.; Sewall, S.; Ogilvie, T.; Magliocco, A.; Breslin, T.M.; et al. A large-scale study of the ultrawideband microwave dielectric properties of normal, benign and malignant breast tissues obtained from cancer surgeries. Phys. Med. Biol. 2007, 52, 6093.
  37. Conceição, R.C.; Mohr, J.J.; O’Halloran, M. An Introduction to Microwave Imaging for Breast Cancer Detection; Springer: Basel, Switzerland, 2016.
  38. Noble, W.S. What is a support vector machine. Nat. Biotechnol. 2006, 24, 1565–1567.
  39. Rana, S.P.; Dey, M.; Tiberi, G.; Sani, L.; Vispa, A.; Raspa, G.; Duranti, M.; Ghavami, M.; Dudley, S. Machine learning approaches for automated lesion detection in microwave breast imaging clinical data. Sci. Rep. 2019, 9, 1–12.
  40. He, Q.; Kong, F.; Yan, R. Subspace-based gearbox condition monitoring by kernel principal component analysis. Mech. Syst. Signal Process. 2007, 21, 1755–1772.
  41. Xu, Q.S.; Liang, Y.Z. Monte Carlo cross validation. Chemom. Intell. Lab. Syst. 2001, 56, 1–11.
  42. Powers, D.M. Evaluation: From precision, recall and F-measure to ROC, informedness, markedness and correlation. arXiv 2020, arXiv:2010.16061.
  43. Matthews, B.W. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim. Et Biophys. Acta (BBA)-Protein Struct. 1975, 405, 442–451.
  44. Pisano, E.D.; Hendrick, R.E.; Yaffe, M.J.; Baum, J.K.; Acharyya, S.; Cormack, J.B.; Hanna, L.A.; Conant, E.F.; Fajardo, L.L.; Bassett, L.W.; et al. Diagnostic accuracy of digital versus film mammography: Exploratory analysis of selected population subgroups in DMIST. Radiology 2008, 246, 376–383.
  45. Zeeshan, M.; Salam, B.; Khalid, Q.S.B.; Alam, S.; Sayani, R. Diagnostic accuracy of digital mammography in the detection of breast cancer. Cureus 2018, 10.
  46. Massafra, R.; Bove, S.; Lorusso, V.; Biafora, A.; Comes, M.C.; Didonna, V.; Diotaiuti, S.; Fanizzi, A.; Nardone, A.; Nolasco, A.; et al. Radiomic Feature Reduction Approach to Predict Breast Cancer by Contrast-Enhanced Spectral Mammography Images. Diagnostics 2021, 11, 684.
  47. ClinicalTrials.gov. 2020. Available online: https://clinicaltrials.gov/ct2/show/NCT04253366 (accessed on 24 August 2021).
  48. Tiberi, G. Presenting MammoScreen project: Innovative and safe microwave-based imaging technology to make breast cancer screening more accurate, inclusive, and female-friendly, ECR 2023, RPS 702, Advanced applications in breast imaging. 2023; Vienna, Austria on 1–5 March 2023. (Just accepted).
Figure 1. Proposed flow chart of MammoWave signal classification for the breast lesion detection.
Figure 2. (a) MammoWave device designed by Umbria Bioengineering Technologies (UBT), Italy. (b) Patient’s breast examination procedure with MammoWave. (c) Pictorial view of the transmitting-receiving antenna positions of MammoWave measurement.
Figure 3. Flowchart of the classification stages involved in the proposed ML experiment. $\mathrm{SVM}_G$ and $\mathrm{SVM}_Q$ have been trialled and compared, employing PCs of the complex $S_{21}$ data (raw data). The investigation continued with $\mathrm{SVM}_G$, comparing its performance with $\mathrm{SVM}_Q$. Subsequently, real parts and PCs obtained from the real parts of the complex $S_{21}$ have been employed in the second and third stages respectively to classify the NF-WF signals with $\mathrm{SVM}_G$. The optimal performance was attained in the third stage, applying PCs of the real parts of the complex $S_{21}$ with $\mathrm{SVM}_G$.
Figure 4. Example of significant variance achieved applying PCA on complex $S_{21}$ signals. (a) Significant variance achieved applying PCA on complex $S_{21}$ signals of an NF breast. (b) Significant variance achieved applying PCA on complex $S_{21}$ signals of a WF breast.
Figure 5. NF and WF signal classification results obtained from $\mathrm{SVM}_Q$ and $\mathrm{SVM}_G$ applying PCA over MammoWave’s complex $S_{21}$ data. (a) Accuracy obtained from $\mathrm{SVM}_G$ varying PCs and validation data. (b) Accuracy obtained from $\mathrm{SVM}_Q$ varying PCs and validation data. (c) Sensitivity obtained from $\mathrm{SVM}_G$ varying PCs and validation data. (d) Sensitivity obtained from $\mathrm{SVM}_Q$ varying PCs and validation data. (e) Specificity obtained from $\mathrm{SVM}_G$ varying PCs and validation data. (f) Specificity obtained from $\mathrm{SVM}_Q$ varying PCs and validation data. (g) $MCC$ obtained from $\mathrm{SVM}_G$ varying PCs and validation data. (h) $MCC$ obtained from $\mathrm{SVM}_Q$ varying PCs and validation data.
Figure 6. NF and WF signal prediction results (accuracy, sensitivity, specificity, and $MCC$) obtained using real parts of MammoWave’s complex $S_{21}$ signals applying $\mathrm{SVM}_G$ over different amounts of validation data.
Figure 7. Example of significant variance achieved applying PCA on real part of complex S 21 signals. (a) Significant variance achieved applying PCA on real part of complex S 21 signals of a NF breast. (b) Significant variance achieved applying PCA on real part of complex S 21 signals of a WF breast.
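Figure 7 concerns how much of the signal variance the leading principal components capture. A minimal sketch of that bookkeeping follows, assuming the PCA eigenvalues are already available. The function names and the example eigenvalues are hypothetical and are not taken from the study.

```python
def explained_variance_ratio(eigenvalues):
    """Fraction of the total variance carried by each principal component."""
    total = sum(eigenvalues)
    return [value / total for value in eigenvalues]


def components_for_threshold(eigenvalues, threshold=0.95):
    """Smallest number of leading components whose cumulative
    explained-variance ratio reaches `threshold`."""
    ordered = sorted(eigenvalues, reverse=True)  # largest variance first
    cumulative = 0.0
    for count, ratio in enumerate(explained_variance_ratio(ordered), start=1):
        cumulative += ratio
        if cumulative >= threshold:
            return count
    return len(ordered)


# Hypothetical eigenvalues, for illustration only (not from the study).
example_eigenvalues = [8.0, 4.0, 2.0, 2.0]
ratios = explained_variance_ratio(example_eigenvalues)
needed = components_for_threshold(example_eigenvalues, threshold=0.75)
```

With these example values the first two components carry three quarters of the variance, so `needed` is 2.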
Figure 8. NF and WF signal classification results obtained from SVM_G applying PCA over real parts of MammoWave’s complex S21 data. (a) Accuracy obtained from SVM_G varying PCs and validation data. (b) Sensitivity obtained from SVM_G varying PCs and validation data. (c) Specificity obtained from SVM_G varying PCs and validation data. (d) MCC obtained from SVM_G varying PCs and validation data.
Figure 9. ROC curve obtained from SVM_G employing principal components of real parts of S21 signals for NF-WF signal classification.
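An ROC curve such as the one in Figure 9 is traced by sweeping the decision threshold over the classifier scores. The sketch below shows one common way to build the curve and its area, assuming WF is the positive class, higher scores mean "more WF-like", and both classes are present. The function names and the example labels and scores are hypothetical, not the study's data.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) points.

    labels: 1 for WF (positive), 0 for NF (negative).
    scores: classifier outputs, higher meaning more likely WF.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        points.append((false_pos / negatives, true_pos / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as ordered (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical labels and scores, for illustration only.
example_labels = [1, 1, 0, 1, 0, 0]
example_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
example_auc = area_under_curve(roc_points(example_labels, example_scores))
```

The area equals the fraction of (WF, NF) pairs in which the WF sample receives the higher score, which is a convenient sanity check on small examples.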
Table 1. Summary of the patient population used in this study.

| Parameter | Value |
| --- | --- |
| Total patients | 34 |
| Total subjects (breasts) | 61 |
| Number of patients aged 20–49 years | 23 |
| Number of patients aged 50–80 years | 38 |
| Mean patient age (years) | 52 |
| Standard deviation of patient age (years) | 12 |
Table 2. Details and related radiologist’s review for the breasts with radiological findings (WF).

| Age | Breast (L/R) | ACR Breast Density | Mammography BI-RADS | Echography BI-RADS | Radiologist’s Output Details: Sizes (mm) & Notes (if Available) | Pathology or 1-Year Clinical Follow-up Output |
| --- | --- | --- | --- | --- | --- | --- |
| 48 | L | D | 3 | - | Microcalcifications | Benign |
| 65 | L | C | 4 | - | Cluster of microcalcifications | Benign |
| 40 | L | B | 2 | 2 | Three masses: 15 mm, 21 mm, and 23 mm | Benign |
|  | R | B | 2 | 2 | Microcalcifications | Not available |
| 52 | L | C | 5 | - | Microcalcifications | Malignant |
| 47 | L | D | 2 | 2 | Microcalcifications | Benign |
| 55 | R | C | 2 | 2 | 1.6 mm microcalcifications | Benign |
|  | L | C | 2 | 2 | 3.8 mm microcalcifications | Benign |
| 51 | L | C | 2 | 2 | Presence of metallic marker | Benign |
| 54 | R | A | 2 | 2 | Microcalcifications | Benign |
| 77 | R | D | - | 5 | 17 mm mass | Malignant |
| 61 | R | C | 4 | - | Multifocal lobular type suspected carcinoma (MRI BI-RADS 4) | Malignant |
|  | L | C | 2 | - | Macrocalcification and Focal contrast enh. (MRI BI-RADS 3) | Not available |
| 50 | L | B | 2 | 2 | 10 mm mass | Benign |
| 67 | L | C | 4 | - | Microcalcifications | Malignant |
| 49 | L | A | 3 | - | Microcalcifications | Benign |
| 70 | L | D | 3 | 4 | Mass | Malignant |
| 42 | L | C | 2 | 3 | 7 mm mass, hypoechoic | Benign |
| 67 | L | B | 3 | - | Architectural distortion | Benign |
| 56 | R | B | 4 | 4 | 31 mm mass, hypoechoic, irregular borders | Malignant |
| 43 | R | D | 1 | 3 | 12 mm mass | Benign |
| 51 | L | C | 3 | - | Microcalcifications | Benign |
| 59 | L | B | - | 4 | 11 mm areolar, suspicious of malignancy | Malignant |
| 40 | L | D | 2 | 2 | 30 mm mass | Benign |
| 35 | R | C | 2 | 3 | 7 mm, hypoechoic | Benign |
| 37 | L | A | 2 | 3 | 25 mm mass | Benign |
| 43 | R | B | 3 | 2 | Microcalcifications | Malignant |
| 54 | R | B | 2 | 2 | 18 mm mass | Benign |
| 49 | L | A | 2 | 3 | 16 mm mass | Benign |
| 56 | L | D | 4 | 4 | 27 mm mass | Malignant |
| 63 | L | A | 3 | 4 | 6 mm mass | Malignant |
| 55 | R | C | 4 | 4 | 23 mm mass | Malignant |
|  | L | C | 2 | 2 | Multiple cysts | Benign |
| 64 | R | B | 3 | - | 1.6 mm microcalcifications | Benign |
| 37 | R | - | - | 3 | 15.4 mm mass | Benign |
|  | L | - | - | 2 | Multiple cysts | Not available |
Table 3. Approximate quantitative summary of raw S21 signals; these values may vary with the addition of new patients.

| Breast Type | Mean | Maximum | Minimum | Median | Standard Deviation | Variance |
| --- | --- | --- | --- | --- | --- | --- |
| No-Finding (NF) | 2.179 × 10^−5 | 0.114 | −0.105 | −8.286 × 10^−5 | 0.011 | 4.578 × 10^−7 |
| With-Finding (WF) | 1.985 × 10^−5 | 0.118 | −0.108 | −1.640 × 10^−4 | 0.011 | 4.651 × 10^−7 |
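The six quantities summarised in Table 3 can be computed with Python's standard `statistics` module. The sketch below is illustrative only: the function name and sample values are hypothetical, and it assumes the sample (n − 1) estimators for standard deviation and variance, which the source does not specify.

```python
import statistics


def summarize(samples):
    """Return the summary quantities listed in Table 3 for one signal group."""
    return {
        "mean": statistics.fmean(samples),
        "maximum": max(samples),
        "minimum": min(samples),
        "median": statistics.median(samples),
        # Assumption: sample (n - 1) estimators; the source does not say which.
        "standard_deviation": statistics.stdev(samples),
        "variance": statistics.variance(samples),
    }


# Hypothetical values, for illustration only (not MammoWave data).
example_summary = summarize([1.0, 2.0, 3.0, 4.0])
```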
Table 4. Two-sample t-test on PCA features extracted from MammoWave’s complex S21 data.

| Null Hypothesis (H_null) | Probability (p) | Confidence Interval (C_L) | Confidence Interval (C_U) |
| --- | --- | --- | --- |
| 1 | 7.516 × 10^−10 | 3.000 × 10^−5 | 1.500 × 10^−5 |
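Table 4 reports a test decision, a p-value, and a confidence interval for a two-sample t-test. The sketch below shows how such quantities can be obtained, under two assumptions that the source does not confirm: the unequal-variance (Welch) form of the statistic, and a normal approximation to the t distribution, which is reasonable only for large samples. The function name, the decision flag `h` (1 meaning the null hypothesis of equal means is rejected), and the example data are hypothetical.

```python
import math
from statistics import NormalDist, fmean, variance


def two_sample_test(group_a, group_b, alpha=0.05):
    """Approximate two-sample test for a difference in means.

    Uses Welch's standard error and a normal approximation for the
    p-value and confidence interval (an assumption, valid for large n).
    """
    mean_diff = fmean(group_a) - fmean(group_b)
    std_err = math.sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    t_stat = mean_diff / std_err
    normal = NormalDist()
    p_value = 2.0 * (1.0 - normal.cdf(abs(t_stat)))  # two-sided
    z = normal.inv_cdf(1.0 - alpha / 2.0)
    return {
        "h": int(p_value < alpha),  # 1: reject the null hypothesis
        "t": t_stat,
        "p": p_value,
        "ci": (mean_diff - z * std_err, mean_diff + z * std_err),
    }


# Hypothetical data, for illustration only.
example_result = two_sample_test([11.0, 12.0, 13.0, 14.0, 15.0],
                                 [1.0, 2.0, 3.0, 4.0, 5.0])
```

For small samples an exact t distribution would be needed in place of the normal approximation.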
Table 5. The classification results over reduced feature space (i.e., extracted features from MammoWave’s original frequency response) after applying SVMs. (a) The classification results over reduced feature space after applying SVM_Q. (b) The classification results over reduced feature space after applying SVM_G.

(a) SVM_Q

| Validation Data | PC-80 Ac | PC-80 Se | PC-80 Sp | PC-80 MCC | PC-70 Ac | PC-70 Se | PC-70 Sp | PC-70 MCC | PC-60 Ac | PC-60 Se | PC-60 Sp | PC-60 MCC | PC-50 Ac | PC-50 Se | PC-50 Sp | PC-50 MCC | PC-40 Ac | PC-40 Se | PC-40 Sp | PC-40 MCC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 95% | 0.644 | 0.509 | 0.737 | 0.251 | 0.635 | 0.434 | 0.774 | 0.221 | 0.626 | 0.489 | 0.720 | 0.213 | 0.616 | 0.416 | 0.754 | 0.180 | 0.601 | 0.410 | 0.733 | 0.150 |
| 90% | 0.670 | 0.510 | 0.781 | 0.302 | 0.653 | 0.471 | 0.781 | 0.264 | 0.647 | 0.468 | 0.772 | 0.251 | 0.630 | 0.422 | 0.775 | 0.210 | 0.611 | 0.368 | 0.779 | 0.161 |
| 85% | 0.682 | 0.518 | 0.797 | 0.328 | 0.670 | 0.501 | 0.787 | 0.301 | 0.654 | 0.455 | 0.793 | 0.264 | 0.642 | 0.427 | 0.791 | 0.235 | 0.621 | 0.370 | 0.796 | 0.183 |
| 80% | 0.690 | 0.522 | 0.807 | 0.345 | 0.680 | 0.535 | 0.781 | 0.326 | 0.666 | 0.458 | 0.811 | 0.289 | 0.649 | 0.423 | 0.806 | 0.248 | 0.629 | 0.391 | 0.794 | 0.202 |
| 75% | 0.693 | 0.528 | 0.808 | 0.352 | 0.684 | 0.516 | 0.799 | 0.330 | 0.666 | 0.462 | 0.808 | 0.289 | 0.653 | 0.430 | 0.808 | 0.257 | 0.632 | 0.343 | 0.832 | 0.201 |
| 70% | 0.706 | 0.565 | 0.804 | 0.380 | 0.689 | 0.542 | 0.791 | 0.345 | 0.676 | 0.498 | 0.799 | 0.313 | 0.658 | 0.434 | 0.813 | 0.268 | 0.630 | 0.345 | 0.830 | 0.201 |
| 65% | 0.704 | 0.549 | 0.812 | 0.376 | 0.695 | 0.551 | 0.795 | 0.357 | 0.673 | 0.488 | 0.801 | 0.305 | 0.654 | 0.433 | 0.807 | 0.259 | 0.630 | 0.318 | 0.847 | 0.195 |
| 60% | 0.708 | 0.570 | 0.805 | 0.387 | 0.692 | 0.527 | 0.808 | 0.350 | 0.676 | 0.487 | 0.808 | 0.313 | 0.659 | 0.443 | 0.808 | 0.270 | 0.634 | 0.349 | 0.832 | 0.207 |
| 55% | 0.715 | 0.559 | 0.824 | 0.400 | 0.700 | 0.531 | 0.818 | 0.366 | 0.681 | 0.492 | 0.813 | 0.324 | 0.662 | 0.425 | 0.826 | 0.277 | 0.629 | 0.301 | 0.855 | 0.189 |
| 50% | 0.717 | 0.583 | 0.809 | 0.405 | 0.702 | 0.536 | 0.818 | 0.371 | 0.682 | 0.501 | 0.809 | 0.327 | 0.663 | 0.423 | 0.831 | 0.280 | 0.629 | 0.319 | 0.846 | 0.195 |
| 45% | 0.718 | 0.577 | 0.815 | 0.405 | 0.697 | 0.541 | 0.805 | 0.360 | 0.683 | 0.503 | 0.807 | 0.327 | 0.661 | 0.417 | 0.829 | 0.273 | 0.634 | 0.328 | 0.848 | 0.207 |
| 40% | 0.713 | 0.568 | 0.813 | 0.394 | 0.701 | 0.532 | 0.819 | 0.368 | 0.680 | 0.494 | 0.808 | 0.320 | 0.661 | 0.424 | 0.825 | 0.273 | 0.636 | 0.345 | 0.838 | 0.211 |
| 35% | 0.718 | 0.571 | 0.820 | 0.406 | 0.708 | 0.558 | 0.811 | 0.384 | 0.685 | 0.501 | 0.812 | 0.331 | 0.666 | 0.440 | 0.822 | 0.286 | 0.637 | 0.356 | 0.830 | 0.213 |
| 30% | 0.718 | 0.570 | 0.822 | 0.407 | 0.714 | 0.564 | 0.818 | 0.396 | 0.686 | 0.499 | 0.817 | 0.335 | 0.665 | 0.428 | 0.830 | 0.284 | 0.631 | 0.307 | 0.857 | 0.197 |
| 25% | 0.722 | 0.576 | 0.822 | 0.413 | 0.706 | 0.559 | 0.806 | 0.379 | 0.690 | 0.503 | 0.818 | 0.341 | 0.668 | 0.423 | 0.839 | 0.291 | 0.637 | 0.324 | 0.857 | 0.216 |
| 20% | 0.723 | 0.583 | 0.818 | 0.415 | 0.710 | 0.565 | 0.810 | 0.388 | 0.688 | 0.513 | 0.807 | 0.336 | 0.668 | 0.448 | 0.820 | 0.290 | 0.638 | 0.327 | 0.855 | 0.217 |

(b) SVM_G

| Validation Data | PC-80 Ac | PC-80 Se | PC-80 Sp | PC-80 MCC | PC-70 Ac | PC-70 Se | PC-70 Sp | PC-70 MCC | PC-60 Ac | PC-60 Se | PC-60 Sp | PC-60 MCC | PC-50 Ac | PC-50 Se | PC-50 Sp | PC-50 MCC | PC-40 Ac | PC-40 Se | PC-40 Sp | PC-40 MCC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 95% | 0.633 | 0.118 | 0.990 | 0.233 | 0.633 | 0.129 | 0.983 | 0.228 | 0.641 | 0.163 | 0.972 | 0.242 | 0.649 | 0.235 | 0.936 | 0.247 | 0.649 | 0.311 | 0.883 | 0.241 |
| 90% | 0.664 | 0.202 | 0.985 | 0.320 | 0.674 | 0.237 | 0.978 | 0.338 | 0.677 | 0.264 | 0.964 | 0.334 | 0.687 | 0.350 | 0.923 | 0.342 | 0.682 | 0.381 | 0.890 | 0.322 |
| 85% | 0.703 | 0.307 | 0.978 | 0.405 | 0.711 | 0.347 | 0.965 | 0.414 | 0.719 | 0.375 | 0.957 | 0.426 | 0.726 | 0.450 | 0.918 | 0.428 | 0.716 | 0.481 | 0.878 | 0.399 |
| 80% | 0.731 | 0.383 | 0.972 | 0.461 | 0.736 | 0.414 | 0.960 | 0.467 | 0.748 | 0.456 | 0.950 | 0.485 | 0.756 | 0.532 | 0.912 | 0.491 | 0.736 | 0.538 | 0.874 | 0.445 |
| 75% | 0.753 | 0.442 | 0.970 | 0.508 | 0.766 | 0.491 | 0.957 | 0.527 | 0.771 | 0.514 | 0.951 | 0.536 | 0.775 | 0.556 | 0.929 | 0.536 | 0.759 | 0.579 | 0.884 | 0.495 |
| 70% | 0.779 | 0.501 | 0.972 | 0.559 | 0.792 | 0.548 | 0.960 | 0.579 | 0.793 | 0.566 | 0.951 | 0.579 | 0.794 | 0.612 | 0.921 | 0.573 | 0.778 | 0.619 | 0.888 | 0.534 |
| 65% | 0.800 | 0.556 | 0.970 | 0.602 | 0.807 | 0.588 | 0.958 | 0.608 | 0.810 | 0.610 | 0.949 | 0.612 | 0.812 | 0.651 | 0.925 | 0.611 | 0.797 | 0.657 | 0.894 | 0.576 |
| 60% | 0.815 | 0.595 | 0.969 | 0.631 | 0.826 | 0.630 | 0.962 | 0.648 | 0.827 | 0.651 | 0.950 | 0.647 | 0.828 | 0.690 | 0.923 | 0.642 | 0.809 | 0.682 | 0.897 | 0.600 |
| 55% | 0.832 | 0.631 | 0.972 | 0.664 | 0.842 | 0.670 | 0.960 | 0.678 | 0.842 | 0.680 | 0.955 | 0.678 | 0.842 | 0.710 | 0.935 | 0.674 | 0.820 | 0.699 | 0.904 | 0.625 |
| 50% | 0.843 | 0.660 | 0.971 | 0.685 | 0.851 | 0.692 | 0.960 | 0.695 | 0.852 | 0.704 | 0.955 | 0.698 | 0.855 | 0.743 | 0.932 | 0.699 | 0.833 | 0.728 | 0.906 | 0.653 |
| 45% | 0.857 | 0.690 | 0.973 | 0.712 | 0.866 | 0.715 | 0.971 | 0.729 | 0.866 | 0.734 | 0.957 | 0.726 | 0.870 | 0.763 | 0.943 | 0.731 | 0.844 | 0.743 | 0.914 | 0.675 |
| 40% | 0.867 | 0.716 | 0.971 | 0.730 | 0.876 | 0.743 | 0.968 | 0.748 | 0.877 | 0.759 | 0.958 | 0.747 | 0.879 | 0.783 | 0.945 | 0.749 | 0.850 | 0.761 | 0.911 | 0.687 |
| 35% | 0.881 | 0.740 | 0.978 | 0.760 | 0.887 | 0.764 | 0.974 | 0.772 | 0.885 | 0.773 | 0.964 | 0.765 | 0.883 | 0.795 | 0.944 | 0.758 | 0.861 | 0.772 | 0.924 | 0.713 |
| 30% | 0.887 | 0.756 | 0.979 | 0.772 | 0.892 | 0.775 | 0.974 | 0.781 | 0.892 | 0.787 | 0.965 | 0.779 | 0.893 | 0.812 | 0.949 | 0.778 | 0.871 | 0.786 | 0.930 | 0.733 |
| 25% | 0.891 | 0.763 | 0.981 | 0.781 | 0.906 | 0.804 | 0.975 | 0.807 | 0.897 | 0.795 | 0.969 | 0.790 | 0.900 | 0.825 | 0.951 | 0.792 | 0.873 | 0.799 | 0.924 | 0.736 |
| 20% | 0.905 | 0.796 | 0.979 | 0.806 | 0.911 | 0.820 | 0.974 | 0.818 | 0.910 | 0.826 | 0.968 | 0.815 | 0.909 | 0.843 | 0.953 | 0.810 | 0.883 | 0.806 | 0.936 | 0.757 |
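The four columns reported for each feature dimension (Ac, Se, Sp, MCC) all derive from the counts of a binary confusion matrix. The following sketch uses the standard definitions, assuming WF is the positive class. The function name and the example counts are hypothetical and are not taken from the study.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity, and Matthews correlation
    coefficient (MCC) from binary confusion-matrix counts.

    tp/fn: WF samples predicted as WF / NF (WF assumed positive).
    tn/fp: NF samples predicted as NF / WF.
    """
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {"Ac": accuracy, "Se": sensitivity, "Sp": specificity, "MCC": mcc}


# Hypothetical counts, for illustration only.
example_metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

Unlike accuracy, MCC stays informative when the two classes are imbalanced, which is why it is useful to read it alongside sensitivity and specificity.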
Table 6. Two-sample t-test on real parts of MammoWave’s S21 data.

| Null Hypothesis (H_0) | Probability (p) | Confidence Interval (C_L) | Confidence Interval (C_U) |
| --- | --- | --- | --- |
| 1 | 8.864 × 10^−27 | 6.600 × 10^−5 | 4.600 × 10^−5 |
Table 7. NF and WF signal classification results obtained from SVM_G applying real parts of MammoWave’s complex S21 data.

| Validation Data | Ac | Se | Sp | MCC |
| --- | --- | --- | --- | --- |
| 95% | 0.623 | 0.400 | 0.777 | 0.190 |
| 90% | 0.652 | 0.427 | 0.808 | 0.255 |
| 85% | 0.673 | 0.455 | 0.824 | 0.303 |
| 80% | 0.694 | 0.513 | 0.820 | 0.353 |
| 75% | 0.709 | 0.542 | 0.825 | 0.385 |
| 70% | 0.724 | 0.568 | 0.832 | 0.418 |
| 65% | 0.729 | 0.578 | 0.834 | 0.430 |
| 60% | 0.746 | 0.610 | 0.840 | 0.465 |
| 55% | 0.751 | 0.615 | 0.847 | 0.478 |
| 50% | 0.760 | 0.625 | 0.855 | 0.497 |
| 45% | 0.767 | 0.640 | 0.855 | 0.511 |
| 40% | 0.775 | 0.659 | 0.855 | 0.528 |
| 35% | 0.779 | 0.660 | 0.862 | 0.538 |
| 30% | 0.786 | 0.678 | 0.861 | 0.552 |
| 25% | 0.793 | 0.685 | 0.868 | 0.567 |
| 20% | 0.798 | 0.704 | 0.863 | 0.577 |
Table 8. Two-sample t-test on PCA features extracted from real parts of MammoWave’s S21 data.

| Null Hypothesis (H_0) | Probability (p) | Confidence Interval (C_L) | Confidence Interval (C_U) |
| --- | --- | --- | --- |
| 1 | 8.219 × 10^−23 | 1.770 × 10^−4 | 1.570 × 10^−4 |
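Table 9. Classification results applying PCs extracted from real parts of MammoWave’s complex S21 signals with SVM_G.

| Validation Data | PC-80 Ac | PC-80 Se | PC-80 Sp | PC-80 MCC | PC-70 Ac | PC-70 Se | PC-70 Sp | PC-70 MCC | PC-60 Ac | PC-60 Se | PC-60 Sp | PC-60 MCC | PC-50 Ac | PC-50 Se | PC-50 Sp | PC-50 MCC | PC-40 Ac | PC-40 Se | PC-40 Sp | PC-40 MCC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 95% | 0.632 | 0.118 | 0.990 | 0.232 | 0.632 | 0.127 | 0.983 | 0.224 | 0.638 | 0.150 | 0.977 | 0.237 | 0.650 | 0.243 | 0.932 | 0.248 | 0.652 | 0.313 | 0.887 | 0.248 |
| 90% | 0.664 | 0.203 | 0.985 | 0.320 | 0.682 | 0.269 | 0.967 | 0.346 | 0.686 | 0.288 | 0.962 | 0.354 | 0.691 | 0.357 | 0.923 | 0.351 | 0.686 | 0.424 | 0.867 | 0.330 |
| 85% | 0.704 | 0.308 | 0.979 | 0.407 | 0.712 | 0.351 | 0.964 | 0.416 | 0.712 | 0.362 | 0.957 | 0.413 | 0.729 | 0.464 | 0.913 | 0.433 | 0.711 | 0.482 | 0.870 | 0.388 |
| 80% | 0.732 | 0.385 | 0.972 | 0.463 | 0.746 | 0.449 | 0.952 | 0.482 | 0.750 | 0.473 | 0.942 | 0.487 | 0.754 | 0.528 | 0.911 | 0.487 | 0.732 | 0.512 | 0.886 | 0.437 |
| 75% | 0.754 | 0.442 | 0.971 | 0.509 | 0.767 | 0.494 | 0.957 | 0.529 | 0.766 | 0.503 | 0.950 | 0.524 | 0.777 | 0.580 | 0.915 | 0.538 | 0.758 | 0.579 | 0.883 | 0.493 |
| 70% | 0.780 | 0.503 | 0.972 | 0.561 | 0.790 | 0.550 | 0.957 | 0.575 | 0.788 | 0.557 | 0.950 | 0.570 | 0.798 | 0.621 | 0.922 | 0.582 | 0.780 | 0.630 | 0.883 | 0.538 |
| 65% | 0.800 | 0.555 | 0.971 | 0.602 | 0.812 | 0.605 | 0.955 | 0.617 | 0.810 | 0.601 | 0.956 | 0.615 | 0.814 | 0.658 | 0.922 | 0.614 | 0.792 | 0.649 | 0.891 | 0.565 |
| 60% | 0.816 | 0.597 | 0.969 | 0.631 | 0.825 | 0.632 | 0.959 | 0.645 | 0.832 | 0.657 | 0.953 | 0.657 | 0.831 | 0.698 | 0.923 | 0.649 | 0.806 | 0.679 | 0.893 | 0.594 |
| 55% | 0.833 | 0.633 | 0.973 | 0.667 | 0.842 | 0.672 | 0.958 | 0.677 | 0.844 | 0.681 | 0.956 | 0.681 | 0.846 | 0.718 | 0.935 | 0.682 | 0.822 | 0.702 | 0.905 | 0.628 |
| 50% | 0.843 | 0.660 | 0.972 | 0.686 | 0.854 | 0.696 | 0.964 | 0.705 | 0.851 | 0.697 | 0.957 | 0.695 | 0.852 | 0.739 | 0.930 | 0.693 | 0.829 | 0.724 | 0.902 | 0.644 |
| 45% | 0.858 | 0.690 | 0.973 | 0.713 | 0.864 | 0.714 | 0.968 | 0.724 | 0.866 | 0.727 | 0.963 | 0.727 | 0.865 | 0.765 | 0.935 | 0.721 | 0.841 | 0.731 | 0.918 | 0.670 |
| 40% | 0.867 | 0.717 | 0.971 | 0.731 | 0.873 | 0.739 | 0.966 | 0.741 | 0.873 | 0.753 | 0.956 | 0.739 | 0.875 | 0.785 | 0.937 | 0.741 | 0.854 | 0.756 | 0.923 | 0.697 |
| 35% | 0.881 | 0.741 | 0.977 | 0.760 | 0.888 | 0.775 | 0.967 | 0.772 | 0.881 | 0.766 | 0.962 | 0.758 | 0.886 | 0.800 | 0.947 | 0.765 | 0.862 | 0.776 | 0.922 | 0.713 |
| 30% | 0.888 | 0.757 | 0.979 | 0.775 | 0.896 | 0.785 | 0.972 | 0.787 | 0.890 | 0.781 | 0.966 | 0.776 | 0.894 | 0.811 | 0.951 | 0.781 | 0.867 | 0.787 | 0.922 | 0.723 |
| 25% | 0.892 | 0.765 | 0.981 | 0.782 | 0.900 | 0.798 | 0.970 | 0.795 | 0.905 | 0.809 | 0.971 | 0.805 | 0.900 | 0.821 | 0.956 | 0.795 | 0.879 | 0.806 | 0.930 | 0.749 |
| 20% | 0.906 | 0.800 | 0.980 | 0.810 | 0.910 | 0.813 | 0.977 | 0.817 | 0.905 | 0.810 | 0.972 | 0.806 | 0.910 | 0.844 | 0.955 | 0.812 | 0.885 | 0.823 | 0.928 | 0.760 |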
Table 10. Training, validation, and testing dataset: classification results applying PCs extracted from real parts of MammoWave’s complex S21 signals with SVM_G.

| Total Breasts | Training-Validation Data | Training Data | Validation Data | Training-Validation Ac (PC-50) | Training-Validation Se (PC-50) | Training-Validation Sp (PC-50) | Training-Validation MCC (PC-50) | Testing Data | Testing Ac (PC-50) | Testing Se (PC-50) | Testing Sp (PC-50) | Testing MCC (PC-50) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 61 Breasts | 75% of Data (46 Breasts) | 80% | 20% | 84.20% | 88.20% | 82.20% | 67.40% | 25% of Data (15 breasts) | 94.40% | 96.20% | 93.40% | 88.50% |
| 61 Breasts | 80% of Data (49 Breasts) | 80% | 20% | 85.40% | 88.80% | 83.60% | 69.70% | 20% of Data (12 breasts) | 95.50% | 97.20% | 94.50% | 90.90% |
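The breast counts in Table 10 (46/15 and 49/12 out of 61) are consistent with rounding the training-validation share to the nearest whole breast and assigning the remainder to testing. A minimal sketch of such a split follows. The function names, the shuffling step, and the seed are illustrative assumptions, and the source does not describe how individual breasts were assigned.

```python
import random


def split_counts(total, first_fraction):
    """Split `total` items into two groups, rounding the first group
    to the nearest integer and giving the remainder to the second."""
    first = round(total * first_fraction)
    return first, total - first


def split_indices(total, first_fraction, seed=0):
    """Shuffle indices 0..total-1 reproducibly (hypothetical seed) and
    cut them into two disjoint groups of the sizes from split_counts."""
    indices = list(range(total))
    random.Random(seed).shuffle(indices)
    first, _ = split_counts(total, first_fraction)
    return indices[:first], indices[first:]


# Counts for the two configurations listed in Table 10.
counts_75 = split_counts(61, 0.75)
counts_80 = split_counts(61, 0.80)
train_val_ids, test_ids = split_indices(61, 0.75)
```

Splitting by breast rather than by individual signal keeps all measurements of one breast on the same side of the split, which avoids leaking information from training into testing.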
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Rana, S.P.; Dey, M.; Loretoni, R.; Duranti, M.; Ghavami, M.; Dudley, S.; Tiberi, G. Radiation-Free Microwave Technology for Breast Lesion Detection Using Supervised Machine Learning Model. Tomography 2023, 9, 105-129. https://doi.org/10.3390/tomography9010010

AMA Style

Rana SP, Dey M, Loretoni R, Duranti M, Ghavami M, Dudley S, Tiberi G. Radiation-Free Microwave Technology for Breast Lesion Detection Using Supervised Machine Learning Model. Tomography. 2023; 9(1):105-129. https://doi.org/10.3390/tomography9010010

Chicago/Turabian Style

Rana, Soumya Prakash, Maitreyee Dey, Riccardo Loretoni, Michele Duranti, Mohammad Ghavami, Sandra Dudley, and Gianluigi Tiberi. 2023. "Radiation-Free Microwave Technology for Breast Lesion Detection Using Supervised Machine Learning Model" Tomography 9, no. 1: 105-129. https://doi.org/10.3390/tomography9010010
