Article

Breast Cancer Mammograms Classification Using Deep Neural Network and Entropy-Controlled Whale Optimization Algorithm

1 Computer Science Department, University of Gujrat, Gujrat 50700, Pakistan
2 Information Sciences Department, University of Education Lahore, Jauhrabad Campus, Khushab 41200, Pakistan
* Author to whom correspondence should be addressed.
Diagnostics 2022, 12(2), 557; https://doi.org/10.3390/diagnostics12020557
Submission received: 9 December 2021 / Revised: 22 January 2022 / Accepted: 30 January 2022 / Published: 21 February 2022
(This article belongs to the Special Issue AI as a Tool to Improve Hybrid Imaging in Cancer)

Abstract

Breast cancer has affected many women worldwide. Because inspection of mammogram images by radiologists is a difficult and time-consuming task, many computer-aided diagnosis (CAD) systems have been established to detect and classify breast cancer, enabling earlier diagnosis and better treatment. There is still a need to improve existing CAD systems by incorporating new methods and technologies that provide more precise results. This paper aims to investigate ways to prevent the disease as well as to provide new classification methods that reduce the risk of breast cancer in women's lives. Feature optimization is performed to classify the results accurately, and the CAD system's accuracy is improved by reducing false-positive rates. The Modified Entropy Whale Optimization Algorithm (MEWOA) is proposed, based on fusion, for deep feature extraction and classification. In the proposed method, fine-tuned MobilenetV2 and Nasnet Mobile are applied for simulation, features are extracted, and optimization is performed. The optimized features are fused and optimized again using MEWOA. Finally, using the optimized deep features, machine learning classifiers are applied to classify the breast cancer images. Three publicly available datasets are used to extract the features and perform the classification: INbreast, MIAS, and CBIS-DDSM. The maximum accuracy achieved is 99.7% on INbreast, 99.8% on MIAS, and 93.8% on CBIS-DDSM. Finally, a comparison with other existing methods demonstrates that the proposed algorithm outperforms the other approaches.

1. Introduction

Cancer is a fatal disease, with an estimated ten million deaths and 19.3 million new cancer cases reported in 2020 [1]. Breast cancer is the second most common cancer after lung cancer [2], and the fifth leading cause of death in women [2,3]. In 2020, 684,996 deaths from breast cancer occurred and 2.3 million new cases were diagnosed in women (https://gco.iarc.fr/today/data/factsheets/cancers/20-Breast-fact-sheet.pdf (accessed on 20 October 2021)) [1]. In less developed countries, breast cancer is the foremost cause of death [4,5]. The cells in the breast tissues change and divide into multiple cells, causing a mass or lump. Cancer begins in the ducts or lobules that are connected to the nipples (https://www.cancer.org/content/dam/cancer-org/research (accessed on 11 November 2021)) [3]. Most masses in the breast are benign, that is, noncancerous, and cause fibroids, tenderness, thickened areas, or lumps [3]. Most breast tumors have no signs when small in size and can be easily treated (https://www.cancer.org/content/dam/cancer-org/research (accessed on 11 November 2021)). A painless mass is a sign of abnormal cells. Family history, reproductive factors, personal characteristics, excess body weight, diet, alcohol, tobacco, environmental factors, and other risk factors, such as night-shift work, all contribute to breast cancer. In its primary phase, breast cancer spreads slowly, but with the passage of time it affects other body parts (https://www.cancer.org/content/dam/cancer-org/research (accessed on 11 November 2021)).
Many tests are recommended for the diagnosis of breast tumors, including mammography, magnetic resonance imaging (MRI) [6], ultrasound [7], and digital breast tomosynthesis (https://www.cancer.org/content/dam/cancer-org/research (accessed on 11 November 2021)) [1]. The most recommended test at an early stage is mammography, an affordable, low-radiation test suggested for early diagnosis of breast tumors [1,8]. MRI is an alternative test used to confirm the presence of a tumor; an allergic reaction to the contrast dye may occur during the MRI test, which is an unintended consequence of this examination. At an initial phase, the recommended test is mammography, and at this stage treatment of breast cancer is possible [8]. There are many treatment methods, such as surgery to remove the affected area, medication, radiation therapy, chemotherapy, hormonal therapy, and immunotherapy (https://www.cancer.org/content/dam/cancer-org/research (accessed on 11 November 2021)) [2]. These treatments, when administered early on, have the potential to save lives. If the disease is detected at an initial stage, the survival rate is 90% in developed countries, 66% in India, and 40% in South Africa. Low-income countries have fewer resources, so early diagnosis methods and treatments can help save women's lives (https://www.who.int/news-room/fact-sheets/detail/breast-cancer (accessed on 18 November 2021)) [8].
Medical imaging has been used extensively in the diagnosis of breast cancer, and detection of breast tumors at an initial stage is critical. Mammography is the recommended procedure for early detection (https://www.cancer.org/content/dam/cancer-org/research (accessed on 11 November 2021)). In diagnostic centers, more than one radiologist is present to read mammograms for breast tumors. Radiologists perform single readings with or without CAD systems, as well as double readings in a blinded or non-blinded manner. For accurate results, a double reading is recommended for confirmation; in the Netherlands, double reading is standard practice. In double reading, the radiologists can classify the results into different types and stages, but it can still generate false-positive results [9]. To reduce the burden on human observers and to minimize false-negative results, many hospitals implemented double reading. However, double reading is not always an appropriate procedure due to cost and time constraints [10,11], and the results of two readings can still be inaccurate. False-positive rates can be high, so there is a need to precisely diagnose abnormal regions and classify them as malignant or benign. In the medical field, CAD systems are always required to support radiologists as a second opinion, and they have been investigated as a way to reduce perceptual errors. Computerized methods are used in CAD systems to detect image anomalies and perform tests. Human perception and decision-making abilities are aided by CAD systems, while the medical diagnostician makes the final decision. CAD systems help radiologists detect and differentiate between normal and abnormal tissues [8,11,12]. Masses are symptoms of breast tumors and can be benign or malignant: benign masses are round or oval-shaped, while malignant masses are round or irregular in shape. The most whitened area in mammogram images is the mass [5,8]. Mammogram images have a complex structure, making it difficult for radiologists to extract features and precisely classify the images. Many researchers have introduced numerous methods for feature extraction and disease classification, but these methods still need to be improved.
Different deep learning models are available that perform different tasks, such as object detection [13], visual tracking [14], semantic segmentation [15], and classification [16]. Researchers have proposed models such as AlexNet, GoogleNet, ResNet, MobileNet, and EfficientNet to perform classification [17].
In this research study, features are extracted using fine-tuned MobilenetV2 and fine-tuned Nasnet Mobile to achieve the best performance of the model. The Modified Entropy Whale Optimization Algorithm (MEWOA) is applied to refine the features. The processing speed of the system is increased by minimizing the computational cost. Feature selection is performed: feature selection methods increase the performance of the model by decreasing the computational cost and the load on the classifier, and the feature selection technique chooses the model's most relevant features, which improves the model's accuracy. Feature fusion is performed to capture the internal information of multiple inputs by joining them into a single feature vector; it helps combine multiple sources of data in a single place.
In this paper, three datasets are used. Data augmentation is performed to increase the number of images. The Modified Entropy-controlled Whale Optimization Algorithm (MEWOA) is proposed for optimal feature selection. The major contributions are as follows:
  • Data augmentation is performed using three operations: horizontal flip (left to right), vertical flip (up to down), and rotation by 90°.
  • Two deep learning pre-trained models, Nasnet Mobile and MobilenetV2, are fine-tuned, and deep features are extracted from the middle layer (average pool) instead of the FC layer.
  • We propose a Modified Entropy-controlled Whale Optimization Algorithm for optimal feature selection that reduces the computational cost.
  • We fuse the optimal deep learning features using a serial-based threshold approach.

Literature Review

Many models have been proposed to perform feature extraction and classification [18]. A CAD system was developed by deploying a Deep Convolutional Neural Network (DCNN) and the AlexNet model to extract features and classify mammogram images into malignant and benign [19]. An SVM is connected to the last fully connected layer to achieve good accuracy, and fine-tuning is also performed; an AUC of 0.94 and an accuracy of 87.2% are achieved [20]. The DCNN model is used for mammogram image detection, and the model is fine-tuned [21]. To classify malignant and non-malignant images, a CAD system is proposed that uses a K-means clustering technique and an SVM classifier; a sensitivity and specificity of 96% are achieved on the DDSM dataset [22]. The researchers took the INbreast and CBIS-DDSM datasets in PNG form and resized the images; VGG and ResNet methods are used for classification, achieving an AUC of 0.90 on CBIS-DDSM and 0.98 on INbreast [23]. The deep learning models VGG, ResNet, and Xception are applied to the CBIS-DDSM dataset; transfer learning and fine-tuning are used to address the overfitting problem, and an AUC of 0.84 is achieved on CBIS-DDSM [24]. To perform the classification, researchers proposed a Multi-View Feature Fusion (MVFF) method on the mini-MIAS and CBIS-DDSM datasets, achieving an AUC of 0.932 [25]. The researchers used the MobilenetV2 model and performed transfer learning on the CBIS-DDSM dataset to perform the classification, achieving 74.5% accuracy; data resizing and augmentation are performed [26]. Multi-level thresholding and radial region-growing methods are used on the DDSM dataset with an accuracy of 83.30% and an AUC of 0.92, which reduces the false-positive rate [27]. A CAD system is proposed using the DDSM and mini-MIAS datasets: histogram regions are used for segmentation and classification, K-means analysis is used to segment the images, shape and texture features are extracted, and an SVM classifier performs the classification. The classification accuracy on mini-MIAS is 94.2% with an AUC of 0.95, and on CBIS-DDSM an accuracy of 90.44% with an AUC of 0.90 is achieved [28].
The Fuzzy Gaussian Mixture Model (FGMM) is used to classify the mammogram DDSM images. The FGMM achieves 93% accuracy, 90% sensitivity, and 96% specificity [29].
A CAD system is proposed to classify the INbreast dataset using a deep CNN model; an accuracy of 95.64%, an AUC of 94.78%, and an F1-score of 96.84% are achieved [30]. In another study, a Modified VGG (MVGG) model is used to classify the CBIS-DDSM dataset. A hybrid transfer learning fusion approach is applied to the MVGG and ImageNet models: the modified MVGG alone achieves 89.8% accuracy, while MVGG and ImageNet combined by the fusion method achieve 94.3% accuracy [31]. In another study, the researchers extract features using a Maximum Response (MR) filter bank convolved with a CNN to perform the classification, and a fusion approach is applied to address the mass features; on the CBIS-DDSM dataset, the fusion-reduction approach achieves an accuracy of 94.3%, an AUC of 0.97, and a specificity of 97.19% [32]. An ensemble transfer learning approach is used to extract features, and neural networks perform the classification; 88% accuracy and an AUC of 0.88 are achieved on CBIS-DDSM [33]. To generate the ROIs and classify the INbreast dataset, a CAD system is proposed using deep learning techniques such as a Gaussian mixture model and a deep belief network. A cascade deep learning method is used to reduce false-positive results, Bayesian optimization is performed to learn and segment the ROIs, and finally a deep learning classifier classifies the INbreast images, achieving an accuracy of 91% and an AUC of 0.76 [34].
The transfer learning [13] approach is used to improve the efficiency of the training models that perform the classification; it makes learning faster and easier. Transfer learning is helpful when data are not available in large amounts. Transfer learning with fine-tuning is usually faster, and training is easier when the weights are initialized from a pre-trained model; transferable features can be learned quickly from a small number of training samples [33,34,35,36]. The transfer learning approach with CNNs has been used to classify different types of images, such as histological cancer images, digital mammograms, and chest X-ray images [37].
To classify the INbreast and DDSM datasets, the deep learning models CNN, ResNet-50, and Inception-ResNetV2 were used to classify mammogram images as benign or malignant; on the INbreast dataset, accuracies of 88.74%, 92.55%, and 95.32% were achieved [38]. In another study, Faster-RCNN is used for detection and classification on the INbreast and CBIS-DDSM datasets, and an AUC of 0.95 is achieved on INbreast [39]. Deep learning models require a large amount of data for training, so augmentation of the mini-MIAS dataset is performed using rotation and flipping; 450,000 MIAS images after augmentation are taken and resized to 192 × 192. The images are classified into three categories, normal, benign, and malignant, using a multiscale convolutional neural network (MCNN); the AUC is 0.99 and the sensitivity is 96% [40]. A random forest (RF) on a CNN with a pre-training approach is used together with hand-crafted features from the INbreast dataset, and 91.0% accuracy is achieved [41]. The authors used a physics-informed neural network (PINN) with adaptive activation functions in regression to predict smooth and discontinuous functions and to solve linear and non-linear differential equations. The nonlinear Klein–Gordon equation is solved to provide a smooth solution, while the non-linear Burgers equation and the Helmholtz equation in particular are used for high-gradient solutions. To achieve the network's best performance, the activation function hyperparameter is optimized by changing the topology of the loss function that participates in the optimization process. The adaptive activation function outperforms fixed activations in terms of learning capability, improving the convergence rate during early training as well as solution accuracy, and efficiency can be increased by using this method [42]. To improve the performance of PINNs, adaptive activation functions have been used in layer-wise and neuron-wise fashions: a scalable parameter is initialized in each layer and neuron to perform local adaptation of the activation function and is updated during optimization by the stochastic gradient descent algorithm; slope-based activation with a loss term is applied to increase the training speed [43]. Adaptive activation functions are also utilized to propose Kronecker neural networks (KNNs); the number of parameters in a large network is reduced by KNNs using the Kronecker product. KNNs induce faster loss decay compared to feed-forward networks, and the global convergence of gradient descent is established for KNNs. The Rowdy activation function removes the saturation region from trainable parameters by using sinusoidal fluctuations [44].

2. Methods and Materials

This section illustrates the proposed methodology, which involves six steps. In the first step, data augmentation is applied to increase the number of training samples. In the second step, fine-tuning is performed on two selected deep models: MobilenetV2 and Nasnet Mobile. The fine-tuned models are used to extract features from the global average pool layer. In the third step, a Modified Entropy Whale Optimization Algorithm (MEWOA) is applied to the extracted deep features. In the fourth step, features are fused using a serial-based non-redundant approach. In the fifth step, MEWOA is applied again to reduce the computational time, and finally, classification is performed using machine learning classifiers. Figure 1 shows the detailed architecture of the proposed method. The details of each step are given below.
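Before the individual steps are detailed, the sketch below indicates how steps four and five could be wired together in Python. The serial fusion is shown as plain concatenation; the non-redundancy threshold is not reproduced here, and `mewoa_select` is a hypothetical placeholder rather than the authors' implementation.

```python
import numpy as np

def serial_fusion(feat_mobilenet: np.ndarray, feat_nasnet: np.ndarray) -> np.ndarray:
    """Concatenate MobilenetV2 (N x 1280) and Nasnet Mobile (N x 1056) features -> N x 2336."""
    return np.concatenate([feat_mobilenet, feat_nasnet], axis=1)

# fused = serial_fusion(f_mobilenet, f_nasnet)
# selected = mewoa_select(fused)   # hypothetical: MEWOA applied again to the fused vector
```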

2.1. Datasets

In this work, three publicly available mammography datasets are utilized for the experimental process: CBIS-DDSM [45], INbreast [46], and MIAS (http://peipa.essex.ac.uk/info/mias.html (accessed on 10 October 2019)). For evaluation of the proposed framework, a 50:50 approach is adopted, which means that 50% of the images of each dataset are used for training and the remainder for testing. A few sample images of each dataset are illustrated in the figures. Each dataset is described below.
CBIS-DDSM: The Curated Breast Imaging Subset of the Digital Database for Screening Mammography (CBIS-DDSM) is an improved and standardized form of DDSM, curated by a trained mammographer. The images are in DICOM format, and ROI annotations of the images are also provided. Two views, craniocaudal (CC) and mediolateral oblique (MLO), are available. A total of 1696 mass images with pathological information are available for training and testing [45]. Figure 2 shows a few examples of images from this dataset.
INbreast: The INbreast dataset was generated by a Portuguese breast research group. The database includes 410 images from 115 patients, of which 108 mass mammogram images with BI-RADS information are available and 107 images have mass annotations. The INbreast images are available in DICOM format, and the image size is 3328 × 4084 or 2560 × 3328 pixels [46]. The 108 mass mammogram images are taken for the experiment. Figure 3 presents a few sample images.
mini-MIAS (Mammographic Image Analysis Society): This is a publicly available dataset containing 322 images. MIAS images have been reduced to a 200-micron pixel edge, and every image is 1024 × 1024 pixels. Benign, malignant, and normal images are provided, and complete information regarding the normal, benign, and malignant cases is available (http://peipa.essex.ac.uk/info/mias.html (accessed on 10 October 2019)). The images are provided in portable gray map (PGM) format. The 300 images without calcification cases are taken for the experiment. Figure 4 presents sample images of this dataset.
The images of these three datasets, CBIS-DDSM, INbreast, and mini-MIAS, are converted into portable network graphics (PNG) format [47] and resized to 256 × 256 using the nearest-neighbor interpolation method.

2.2. Data Augmentation

To enable the deep learning models to train effectively, the number of sample images is increased using data augmentation [47], since deep learning models give promising results on large amounts of data. In this work, three operations are implemented: flip left to right, flip up to down, and rotation by 90 degrees. Algorithm 1 for data augmentation is presented below.
Algorithm 1: Data Augmentation
For (i = 1 to number of target images)
Step 1: Read the input image
Step 2: Flip left to right
Step 3: Flip up to down
Step 4: Rotate the image by 90°
Step 5: Write the image from Step 2
Step 6: Write the image from Step 3
Step 7: Write the image from Step 4
End
Figure 5 presents the data augmentation of the CBIS-DDSM [45] images. The augmentation is performed by flipping left to right, flipping up to down, and rotating by 90°.
Table 1 shows detailed information on the three datasets, CBIS-DDSM, INbreast, and MIAS, including the number of original images and the number after data augmentation.
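For concreteness, a minimal Python sketch of Algorithm 1 is given below, assuming the Pillow library; the folder layout and file naming are illustrative assumptions, not the authors' MATLAB implementation.

```python
from pathlib import Path
from PIL import Image, ImageOps

def augment_image(src_path: Path, out_dir: Path) -> None:
    """Write three augmented copies of one mammogram: left-right flip, up-down flip, 90-degree rotation."""
    img = Image.open(src_path)
    out_dir.mkdir(parents=True, exist_ok=True)
    ImageOps.mirror(img).save(out_dir / f"{src_path.stem}_flip_lr.png")       # Steps 2 and 5
    ImageOps.flip(img).save(out_dir / f"{src_path.stem}_flip_ud.png")         # Steps 3 and 6
    img.rotate(90, expand=True).save(out_dir / f"{src_path.stem}_rot90.png")  # Steps 4 and 7

# Example: augment every PNG of a target class (paths are hypothetical).
# for p in Path("CBIS-DDSM/mass").glob("*.png"):
#     augment_image(p, Path("CBIS-DDSM/mass_augmented"))
```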

2.3. Convolutional Neural Network

There are several layers in a CNN model, including an input layer, convolutional layers, batch normalization layers, pooling, ReLU, and softmax layers, and one output layer. The input layer takes an input image of dimensions a × b × c, where c is the number of channels. The convolutional layer, which is the main and first processing layer, operates on the three inputs a, b, and c. Feature mapping is performed in the convolutional layer, and these features are used for visualization and passed to the activation layer.
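As a generic illustration of the layer types just listed (not the specific architectures used in this work), a small Keras stack could be assembled as follows; the filter count, input size, and three-class output are arbitrary placeholders.

```python
import tensorflow as tf

# Generic CNN layer stack: input, convolution, batch normalization, ReLU, pooling, softmax output.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(256, 256, 3)),             # a x b x c input, c = number of channels
    tf.keras.layers.Conv2D(32, 3, padding="same"),    # convolutional feature mapping
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.ReLU(),                           # activation layer
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(3, activation="softmax"),   # output layer
])
```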

2.4. Fine-Tuned MobilenetV2

The MobilenetV2 model is a portable, custom-based model for computer vision. This model sustains the same accuracy while decreasing the number of operations and consuming a small amount of memory. It includes an inverted residual layer with a linear bottleneck. In this model, a compressed low-dimensional input representation is used, which is expanded to a higher dimension using lightweight depth-wise convolution filters [48].
MobilenetV2 performs efficiently in any framework. The model reduces the need for main memory in many embedded hardware designs by requiring only a small amount of cache memory, which increases the speed and efficiency of the system. MobilenetV2 performs well in object detection, semantic segmentation, and classification tasks. In MobilenetV2, depth-wise convolution, linear bottlenecks, inverted residuals, and information flow interpretation are used [49].
The depth-wise separable convolution blocks achieve good performance. In MobilenetV2, the standard convolution layers are replaced with two other layers. The depth-wise convolution, which is the first layer, uses a single convolution filter per input channel to perform lightweight filtering. The pointwise convolution, which is the second layer, generates new features by computing linear combinations of the input channels. In the residual bottleneck, the information in the deep convolutional layer is encoded in a manifold that resides in a low-dimensional subspace. This can be captured by reducing the layer dimensionality and the operating space dimensionality.
The manifold assumption allows the activation space dimensionality to be reduced. The deep convolutional neural network has ReLU, a nonlinear per-coordinate transformation, which complicates this intuition. If the volume of the manifold of interest after the ReLU transformation remains non-zero, the transformation is effectively linear. ReLU preserves complete information about the input manifold only if the manifold lies in a low-dimensional subspace of the input space. The inverted residuals built on this idea are more memory efficient [49].
In the fine-tuned MobilenetV2, the last three layers are replaced with new layers sized according to the target datasets (mini-MIAS, CBIS-DDSM, and INbreast). A transfer learning approach is used to train the fine-tuned model. In the training process, 100 epochs, a learning rate of 0.00001, and a batch size of 8 are used. The Single Shot Multibox Detector (SSD) [50] and the Adam optimizer are utilized for the learning method. To quantize the bounding box space, the SSD uses default anchor boxes with different fractions and measures and adds different feature layers at the end of the network [2]. Finally, deep features are extracted for further processing from the global average pool (GAP) layer of the fine-tuned model. The output vector size of this layer is N × 1280.
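A hedged Keras sketch of this fine-tuning setup is shown below; the paper's experiments were run in MATLAB, so the layer replacement, the dataset pipeline (train_ds), and the class count here are illustrative assumptions rather than the authors' code, and the SSD component is omitted.

```python
import tensorflow as tf

# Backbone with the original top layers removed and a global-average-pool output (N x 1280).
base = tf.keras.applications.MobileNetV2(include_top=False, weights="imagenet",
                                          input_shape=(256, 256, 3), pooling="avg")
# New classification head sized to the target classes (3 is a placeholder).
outputs = tf.keras.layers.Dense(3, activation="softmax")(base.output)
model = tf.keras.Model(base.input, outputs)
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),   # lr = 0.00001
              loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, epochs=100)   # a batch size of 8 would be set in the dataset pipeline

# Deep features for MEWOA are taken from the GAP output, not from the new head.
feature_extractor = tf.keras.Model(base.input, base.output)
# deep_features = feature_extractor.predict(images)    # shape: (N, 1280)
```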

2.5. Fine-Tuned Nasnet Mobile

Nasnet Mobile comes from neural architecture search. An architectural building block is searched for on a small dataset and then transferred to a large dataset. The best cells, or convolutional layers, are searched for and applied to ImageNet by stacking more copies of these convolutional cells. A new regularization method, ScheduledDropPath, is proposed that improves the generalization of the Nasnet models [51]. Sampled child networks are generated by a recurrent neural network (RNN) controller in the NAS framework [52]. Each child network is trained to obtain its accuracy, and the resulting accuracies are used to update the controller so that it generates better architectures; the controller weights are updated using the gradient. The RNN controller returns only the structure of the normal and reduction cells [51]. The Nasnet search space draws on CNN architecture engineering because it identifies motifs such as convolutional filter bank combinations, nonlinearities, and a prudent selection of connections [53,54,55].
The above-mentioned studies suggest predicting generic convolutional cells that are used to express motifs for the controller RNN. To control the filter depth and spatial dimensions of the input, the cells are stacked in series. In Nasnet, the overall structure of the convolutional network is set manually; it is built up from convolutional cells repeated several times, with different weights but the same architecture [51].
In the Nasnet proposed by [51], a reinforcement learning search method is used to search for the blocks. The number of initial convolutional filters and the number of motif repetitions N are free parameters used for scaling. Feature maps of the same dimensions are returned by the normal convolutional cells, while the reduction cells return feature maps whose height and width are reduced by a factor of two.
The Nasnet model learns scalable convolutional cells from data that can be transferred to other image classification tasks. The parameters and computational cost of the architecture are quite flexible, and the model can be used for many different problems. The search space is designed to decouple the architecture complexity from the depth of the network: it achieves good architectures on small datasets, and the learned architecture is then transferred to the classification task.
The last three layers of Nasnet are replaced with new layers based on the target dataset during the fine-tuning phase. The target dataset consists of mini-MIAS, CBIS-DDSM, and INbreast. The transfer learning approach is used to train the fine-tuned models. The number of epochs is 100, the learning rate is 0.00001, and the batch size is 8. The Adam optimizer and SSD are used for learning [50]. To quantize the bounding box space, the SSD uses default anchor boxes with different fractions and measures and adds different feature layers at the end of the network [2]. Finally, deep features are extracted for further processing from the Global Average Pool (GAP) layer of the fine-tuned model. This layer's output vector size is N × 1056.
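An analogous Keras sketch for the Nasnet Mobile feature extractor is given below; the 224 × 224 input size is an assumption tied to the ImageNet weights, and images would be resized accordingly. As before, this is illustrative rather than the authors' MATLAB code.

```python
import tensorflow as tf

# NASNetMobile backbone with a global-average-pool output; the feature dimension is 1056.
base = tf.keras.applications.NASNetMobile(include_top=False, weights="imagenet",
                                           input_shape=(224, 224, 3), pooling="avg")
feature_extractor = tf.keras.Model(base.input, base.output)
# deep_features = feature_extractor.predict(images)    # shape: (N, 1056)
```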

2.6. Transfer Learning

Transfer learning makes use of an already trained model, reusing it as the foundation for a new task and model. A model used for one task can be repurposed for other tasks as an optimization to improve performance. By applying transfer learning, the model can be trained with a small volume of data, which saves time and achieves good results [56,57].
In the transfer learning approach, we transfer knowledge from the source mammogram input images $I_s$ to the target domain mammogram mass images $I_T$. The target classifier $T_c(M_t)$ is trained from the input mammogram images $I_s$ to the target images $I_T$ to obtain the classifier prediction $BMN_{T_i}$, which stands for benign, malignant, and normal. The transferred layers are used to extract the features; the top layer of the classifier is retrained on the new target classes while the other layers are kept frozen.
$$BMN_{T_i} = T_c(M_t)$$
The transfer learning approach is used to extract the features from MobilenetV2 and Nasnet. In Figure 6, knowledge from multiple source classes is transferred to the two target classes.

2.7. Whale Optimization Algorithm (WOA)

To explore feasible solutions to problems in the search space, whale individuals in a community are used. Three operations are performed by WOA: encircling, shrinking, and hunting. In the exploitation phase, the encircling and shrinking operations are used, while in the exploration phase, the hunting operation is used [58].
To solve an optimization problem of dimension DO, the following procedures are applied to the ith individual in the cth generation to find the best solution.
The WOA procedures are as follows.
Encircling Operation
$$ESH_{ij}(c+1) = ESH^{*}_{j}(c) - B \cdot O_{ij}(c) \quad (1)$$
Shrinking Operation
$$ESH_{ij}(c+1) = ESH^{*}_{j}(c) + e^{et} \cdot \cos(2\pi t) \cdot O'_{ij}(c) \quad (2)$$
Hunting Operation
$$ESH_{ij}(c+1) = ESH_{kj}(c) - B \cdot O^{*}_{ij}(c) \quad (3)$$
$$B = 2\left(1 - \frac{c}{c_{max}}\right) \cdot (2rd - 1) \quad (4)$$
The arbitrary number in the range [0, 1] is denoted by $rd$, the current iteration number by $c$, the maximum number of iterations by $c_{max}$, and the position vector of the best solution by $ESH^{*}(c)$. The constant $e$ defines the logarithmic spiral shape, and $t$ is a random number in [−1, 1]. The arbitrary position vector $ESH_{k}(c)$ is selected from the present population. The three distances are as follows: the first is $O_{ij}(c) = |2rd \cdot ESH^{*}_{j}(c) - ESH_{ij}(c)|$, the second is $O'_{ij}(c) = |ESH^{*}_{j}(c) - ESH_{ij}(c)|$, and the third is $O^{*}_{ij}(c) = |2rd \cdot ESH_{kj}(c) - ESH_{ij}(c)|$. According to the probability $prob$, Equations (1)–(3) are executed by WOA. The whale individuals are updated by Equation (1) when $prob < 0.5$ and $|B| < 1$; otherwise, individuals are updated by Equation (3) when $|B| \geq 1$. Equation (2) is used to update the individuals when $prob \geq 0.5$.
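To make the three update rules concrete, a minimal NumPy sketch of one WOA iteration is given below; the population layout, the scalar form of B, the spiral constant e, and the function name are illustrative assumptions rather than a reference implementation.

```python
import numpy as np

def woa_step(whales, best, c, c_max, e=1.0):
    """One WOA iteration: encircling (Eq. 1), spiral shrinking (Eq. 2), or hunting (Eq. 3)
    applied to each row of a (PN, DO) population matrix; 'best' plays the role of ESH*(c)."""
    PN, DO = whales.shape
    new = whales.copy()
    for i in range(PN):
        rd = np.random.rand()                            # arbitrary number in [0, 1]
        B = 2 * (1 - c / c_max) * (2 * rd - 1)           # Eq. (4), scalar form for simplicity
        t = np.random.uniform(-1, 1)                     # random number in [-1, 1]
        if np.random.rand() < 0.5:                       # prob < 0.5
            if abs(B) < 1:                               # encircling the best solution (Eq. 1)
                O = np.abs(2 * rd * best - whales[i])
                new[i] = best - B * O
            else:                                        # hunting a random whale (Eq. 3)
                k = np.random.randint(PN)
                O_star = np.abs(2 * rd * whales[k] - whales[i])
                new[i] = whales[k] - B * O_star
        else:                                            # spiral shrinking toward best (Eq. 2)
            O_prime = np.abs(best - whales[i])
            new[i] = best + np.exp(e * t) * np.cos(2 * np.pi * t) * O_prime
    return new
```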

2.7.1. Modified Entropy Whale Optimization Algorithm (MEWOA)

The WOA learns from the best current solution in the exploitation phase, which easily succumbs to local optima and reduces population diversity. The random individual learning operation has a degree of blindness and does not perform any effective exchange of information between groups in the exploration phase, which disrupts the algorithm's convergence rate. The WOA therefore needs to be improved to reduce these issues, and the new algorithm MEWOA is proposed. To balance the WOA's exploration and exploitation functions, the control parameter B is used, but the exploration probability in WOA is only 0.1535 during the iterative process, so WOA has limited exploration ability. The exploitation and exploration process in MEWOA is instead controlled by a linearly increasing probability. Individual quality in large animal groups improves when individuals learn from the elite and other members of the group. Individual neighborhoods are formed through adaptive social learning procedures that use the individual's social position, social influence, and social network formation. The adaptive social networking approach is used to build the whales' adaptive community, to improve interaction between groups, and to improve MEWOA's calculation accuracy; the new neighborhood-based approach also increases population diversity. MEWOA's convergence speed increases when the population jumps out of a local optimum by introducing the wavelet mutation strategy, whereas the algorithm exhibits premature convergence when the population falls into a local optimum [58].

2.7.2. Linear Increasing Probability

The control parameter $|B| \in [0, 2]$ in the WOA; global exploration is performed by the algorithm when $|B| \geq 1$. As presented in Equation (4), when $c \geq \frac{1}{2}c_{max}$, $|B| < 1$ is always true, so the algorithm has weak exploration ability in the second half of the iterations. Let $q = 2\left(1 - \frac{c}{c_{max}}\right)$ and $\Lambda = 2rd - 1$; then $B = q\Lambda$ over the whole iteration, and the probability of $|B| \geq 1$ is
$$Prob(|B| \geq 1) = 0 + \int_{1}^{2}\int_{1/q}^{1} d\Lambda\, dq \approx 0.307 \quad (5)$$
The WOA performs exploitation operations when $prob \geq 0.5$, so the exploration probability over the iterations is only 0.5 × 0.307 = 0.1535. The search ability is therefore not well maintained by $|B|$ alone, owing to this weak exploration ability, so in MEWOA exploitation and exploration are handled by a probability $P_i$ that increases linearly with the number of iterations to conduct global exploration.
$$P_i = 0.5 + x \cdot \frac{c}{c_{max}} \quad (6)$$
where $0.2 \leq x < 0.5$.
$r_{no}$ is an arbitrary number in [0, 1]. The exploitation operation is performed when $r_{no} < P_i$; otherwise, an exploration operation is performed by the algorithm. Because the coefficient of $c/c_{max}$ is less than 0.5, global exploration retains a probability of about 0.1 even in the last iteration, which raises the algorithm's capacity to jump out of local optimization.
The average exploration probability according to Equation (6) is
$$\bar{P}_i = 1 - \frac{1}{c_{max}} \sum_{c=1}^{c_{max}} \left(0.5 + 0.4 \cdot \frac{c}{c_{max}}\right) = 0.3 - \frac{0.2}{c_{max}} \quad (7)$$
When $c_{max} \geq 2$, $\bar{P}_i \geq 0.2 > 0.1535$. Exploitation and exploration are thus controlled by the linearly increasing probability $P_i$, which increases the algorithm's search ability.
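The effect of the linearly increasing probability can be checked numerically; the short sketch below uses x = 0.4 from Equation (7) and an arbitrary value of c_max.

```python
import numpy as np

# Average exploration probability of MEWOA (1 - P_i) versus WOA's 0.1535; c_max is arbitrary.
c_max = 100
c = np.arange(1, c_max + 1)
P_i = 0.5 + 0.4 * c / c_max      # exploitation probability per iteration (Eq. 6, x = 0.4)
print(np.mean(1.0 - P_i))        # 0.298, matching 0.3 - 0.2/c_max and well above 0.1535
```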

2.7.3. Adaptive Social Learning Strategy

In social behavior, each whale can build neighborhood membership relationships and can change its behavior by imitating the current best solution. The MEWOA moves away from local optimal solutions by improving and enhancing information sharing between groups. For the current population $G(c) = \{ESH_1(c), ESH_2(c), \ldots, ESH_{PN}(c)\}$, the population size is denoted by PN. The fitness values are computed and ordered from smallest to largest to obtain the sorted population $G_1(c) = \{ESH_{(1)}(c), ESH_{(2)}(c), \ldots, ESH_{(PN)}(c)\}$, and the social ranking of $ESH_{(i)}(c)$ is defined as
$$SR_{(i)}(c) = PN + 1 - i, \quad i = 1, 2, \ldots, PN \quad (8)$$
The social impact of $ESH_{(i)}(c)$ is
$$t_{(i)}(c) = \frac{SR_{(i)}(c) \cdot S_{if}}{PN}, \quad i = 1, 2, \ldots, PN \quad (9)$$
The social influence factor is represented by $S_{if}$, with $S_{if} \leq 0.4$. Equations (8) and (9) show that a greater social ranking gives a greater social impact, denoting a better individual, and the specific limit on $S_{if}$ bounds the influence. For the population $G_1(c)$, the social network is constructed according to social influence. The relationship between $ESH_{(i)}(c)$ and $ESH_{(j)}(c)$ is defined as
$$SR_{(ij)}(c) = \begin{cases} 1, & \text{if } rd_1 \leq \max\left(t_{(i)}(c), t_{(j)}(c)\right) \\ 0, & \text{otherwise} \end{cases} \quad (10)$$
where $rd_1$ is a random number in [0, 1]. As Equation (10) shows, the greater an individual's social influence, the stronger its connection with other individuals and the higher the likelihood ($t_{(j)}(c)$) of forming a relationship; when the social influence is smaller, the likelihood ($t_{(i)}(c)$) of a relationship between the individual and other individuals still increases with its own influence. More individuals can thus adopt the behavior of the best individuals, and the greater an individual's social influence, the more interaction there is between individuals. The adaptive neighborhood of individual $ESH_{(i)}(c)$ is built from these relationships between individuals:
$$PN_{(i)}(c) = \left\{ ESH_{(j)}(c) \;\middle|\; j \in [1, PN] \;\text{and}\; j \neq i \;\text{and}\; SR_{(ij)}(c) = 1 \right\} \quad (11)$$
In the algorithm, the exploitation stage is centered on the best solution found so far, and exploration is accomplished through interaction between the group members. The new search strategy of a whale combines the community adaptive strategy with the linearly increasing probability, and is described here.
If $prob_1 < P_i$, the jth dimension of the ith individual $ESH_{(i)}(c)$ in population $G_1(c)$ updates its position as follows.
$$ESH_{(i)j}(c+1) = \begin{cases} ESH_{(1)j}(c) - B \cdot O_{(i)j}(c), & rd < 0.5 \\ ESH_{(1)j}(c) + e^{et} \cdot \cos(2\pi t) \cdot O'_{(i)j}(c), & rd \geq 0.5 \end{cases} \quad (12)$$
where $O_{(i)j}(c) = |2rd \cdot ESH_{(1)j}(c) - ESH_{(i)j}(c)|$ and $O'_{(i)j}(c) = |ESH_{(1)j}(c) - ESH_{(i)j}(c)|$. If $prob_1 \geq P_i$, the adaptive neighborhood procedure is used by the algorithm to explore. This process is described in Equation (13). Let the following:
$$f_{(i)1} = B \cdot \sum_{U = 1,\; ESH_{m_U}(c) \in PN_{(i)}(c)}^{U_i} W_U \cdot \left(2rd \cdot ESH_{(i)j}(c) - ESH_{m_U j}(c)\right)$$
$$f_{(i)2} = e^{et} \cdot \cos(2\pi t) \cdot \sum_{U = 1,\; ESH_{m_U}(c) \in PN_{(i)}(c)}^{U_i} W_U \cdot \left(ESH_{(i)j}(c) - ESH_{m_U j}(c)\right)$$
Then,
$$ESH_{(i)j}(c+1) = \begin{cases} ESH_{(1)j}(c) - f_{(i)1}, & rd' < 0.5 \\ ESH_{(1)j}(c) + f_{(i)2}, & rd' \geq 0.5 \end{cases} \quad \text{when } prob_2 < 0.5 \quad (13)$$
where $prob_1$, $prob_2$, $rd$, and $rd'$ are arbitrary numbers in [0, 1], and $P_i$ is given in Equation (6). $U_i$ is the cardinality of $PN_{(i)}(c)$, $U_i = |PN_{(i)}(c)|$, and $W_U$ is the weight, $W_U = SR_{m_U}(c) \big/ \sum_{U=1}^{U_i} SR_{m_U}(c)$. Updating individuals using Equations (12) and (13) fully utilizes the most recent best solution and the individual's adaptive neighborhood information while effectively increasing population diversity.
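A compact NumPy sketch of the social ranking, social impact, and neighborhood construction (Equations (8)–(11)) is shown below, assuming fitness minimization and toy parameter values; it is an illustration, not the authors' implementation.

```python
import numpy as np

def build_neighborhoods(fitness, S_if=0.4):
    """Return the adaptive neighborhood (index sets) of each whale from its fitness values."""
    PN = len(fitness)
    order = np.argsort(fitness)                 # sort from smallest to largest fitness
    SR = np.empty(PN, dtype=int)
    SR[order] = PN - np.arange(PN)              # social ranking (Eq. 8): best individual gets PN
    t = SR * S_if / PN                          # social impact (Eq. 9)
    neighborhoods = []
    for i in range(PN):
        members = [j for j in range(PN)
                   if j != i and np.random.rand() <= max(t[i], t[j])]   # relationship test (Eq. 10)
        neighborhoods.append(members)           # adaptive neighborhood PN_(i)(c) (Eq. 11)
    return neighborhoods

# Example with toy fitness values for five whales.
# print(build_neighborhoods(np.array([0.9, 0.1, 0.5, 0.3, 0.7])))
```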

2.7.4. Morlet Wavelet Mutation

MEWOA needs a mechanism for breaking out of local optima in optimization problems with densely distributed extreme points. In biological evolution, mutation is the main driver of change, and the mutation space is adjusted dynamically to increase the quality of solutions. By fixing the wavelet function, extending its parameters, and shrinking the mutation space as the number of iterations approaches its limit, the mutation operation achieves a fine-tuning effect. The wavelet mutation is incorporated into the WOA to improve the algorithm's convergence speed and correctness and to release it from local optima by enhancing its search ability. The purpose of mutation in the algorithm's exploration phase is to find the best solution among all other solutions.
Suppose $prob_m$ is the mutation probability, and $rd$ is a random number in [0, 1]. When $prob_1 \geq P_i$ and $rd \leq prob_m$, the modified wavelet mutation determines the position of the whale according to $prob_m$:
$$f_{(i)3} = \sigma_j \cdot \left(y_j - ESH_{(i)j}(c)\right)$$
$$f_{(i)4} = \sigma_j \cdot \left(ESH_{(i)j}(c) - t_j\right)$$
$$ESH_{(i)j}(c+1) = \begin{cases} ESH_{(i)j}(c) + f_{(i)3}, & rd < 0.5 \\ ESH_{(i)j}(c) + f_{(i)4}, & rd \geq 0.5 \end{cases} \quad \text{when } prob_2 \geq 0.5 \quad (14)$$
The random numbers $prob_1$, $prob_2$, $rd$, and $rd'$ lie in the range [0, 1], as mentioned above. The upper and lower bounds of the jth dimension are denoted by $y_j$ and $t_j$. The wavelet mutation coefficient is $\sigma_j = \frac{1}{\sqrt{v}}\,\psi\!\left(\frac{J}{v}\right)$, $j \in \{1, 2, \ldots, DO\}$, where the Morlet wavelet is $\psi(ESH) = e^{-ESH^{2}/2} \cdot \cos(5\,ESH)$. Since 99% of the function's energy is contained in [−2.5, 2.5], $J$ is taken as a random number in [−2.5v, 2.5v].
As the iterations increase, the scaling parameter $v$ also increases, which makes it possible for the algorithm to fine-tune its search for the best solution toward the end of the iterations.
$$v = a \cdot \left(\frac{1}{a}\right)^{\left(1 - \frac{c}{c_{max}}\right)}$$
The constant is denoted by $a$. The proposed MEWOA is presented in Algorithm 2 below.
Algorithm 2: Modified Entropy Whale Optimization Algorithm.
Start
Initialize the MEWOA parameters $a$, $c_{max}$, $e$, $t$, PN, $S_{if}$, $prob_m$
Randomly generate the initial population $G(0) = \{ESH_1(0), ESH_2(0), \ldots, ESH_{PN}(0)\}$
Evaluate the fitness value of each whale individual, find the best search individual $ESH^{*}(0)$, and set c = 1
While ($c < c_{max}$)
  Update $p_i$ according to Equation (6)
  For each search individual $ESH_{(i)}(c)$, compute its neighborhood $PN_{(i)}(c)$ according to Equations (8)–(11)
    If ($prob_1 < p_i$)
      Update the whale individual using Equation (12);
    Else
      If ($prob_2 < 0.5$)
        Use Equation (13);
      Else
        Use Equation (14);
      End if
    End if
  End for
  Fix the boundaries of whale individuals that go beyond the search space.
  Evaluate each whale individual's fitness value; update the global best solution $ESH^{*}(c)$.
c = c + 1 ;
End while
Output the best search individual $ESH^{*}$;
End
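For the Else branch of Algorithm 2 that invokes Equation (14), a hedged NumPy sketch of the Morlet wavelet mutation is given below; the constant a, the coordinate bounds, and the reconstruction of the scaling parameter v are illustrative assumptions, not the authors' settings.

```python
import numpy as np

def wavelet_sigma(c, c_max, a=10000.0):
    """Morlet-wavelet mutation coefficient sigma for iteration c (a is an assumed constant)."""
    v = a * (1.0 / a) ** (1.0 - c / c_max)          # scaling parameter grows from 1 toward a
    J = np.random.uniform(-2.5 * v, 2.5 * v)        # 99% of the wavelet energy lies in this range
    psi = np.exp(-((J / v) ** 2) / 2.0) * np.cos(5.0 * J / v)   # Morlet wavelet psi(J/v)
    return psi / np.sqrt(v)

def wavelet_mutate(x, lower, upper, c, c_max):
    """Equation (14): nudge coordinate x toward its upper or lower bound, chosen at random."""
    sigma = wavelet_sigma(c, c_max)
    if np.random.rand() < 0.5:
        return x + sigma * (upper - x)              # uses f_(i)3
    return x + sigma * (x - lower)                  # uses f_(i)4
```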

3. Results

The experimental results are presented in this section using three datasets: CBIS-DDSM, mini-MIAS, and INbreast. The details of the datasets are given in Section 2.1. The results for each dataset are measured by applying the deep learning models from different perspectives. For validation purposes, several machine learning classifiers are applied using 10-fold cross-validation. In the 10-fold cross-validation test, the provided learning set is divided into ten distinct subsets of comparable size.
The number of subsets created is referred to as the fold. These subsets are used for training and testing, and the loop is repeated until the model has been trained and tested on every subset; 10-fold cross-validation performed better than other choices of k.
As a result, the 10-fold cross-validation method is used to validate the models in order to avoid over- and under-fitting during the training process. Different measures, such as Sensitivity, Precision, F1-Score, AUC, FPR, Accuracy, and Time, are computed to evaluate the performance of the proposed method. All training is conducted in MATLAB2020a on a personal computer with 16 GB RAM and a 4 GB graphics card.
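As an illustration of the evaluation protocol only, the sketch below runs 10-fold cross-validation of an SVM on a feature matrix; the paper used MATLAB's classifiers, so scikit-learn, the polynomial-kernel stand-in for Cubic SVM, and the file names are assumptions.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical inputs: optimized deep features and benign/malignant/normal labels.
X = np.load("fused_mewoa_features.npy")
y = np.load("labels.npy")

# Degree-3 polynomial kernel as a rough analogue of the Cubic SVM used in the paper.
clf = make_pipeline(StandardScaler(), SVC(kernel="poly", degree=3))
scores = cross_val_score(clf, X, y, cv=10)                 # 10-fold cross-validation
print(f"10-fold accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```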

3.1. Experimental Results

Several experiments are conducted to validate the proposed method:
  • Classification using Fine-tuned MobilenetV2 deep features.
  • Classification using Fine-tuned Nasnet Mobile deep features.
  • Classification using MEWOA on Fine-tuned MobilenetV2 deep features.
  • Classification using MEWOA on Fine-tuned Nasnet Mobile deep features.
  • Classification using serial-based non-redundant fusion approach.
  • Classification using MEWOA on fused features.

3.2. Classification Results

The classification results are obtained on three datasets, and several classifiers are applied to compute them. In Table 2, the Fine-tuned MobilenetV2 model is applied to the CBIS-DDSM dataset. The deep features of the dataset are extracted from the GAP layer and fed to classifiers. The highest accuracy is 90.3%, which is achieved by the Cubic SVM classifier in 260.13 (s). The minimum time is taken by Gaussian Naïve Bayes, which is 20.432 (s), but its accuracy of 71.5% is less than that of Cubic SVM. The second highest accuracy is achieved by Weighted KNN, which is 88.2% in 72.92 (s). The sensitivity rate of each classifier is also calculated, and the best-noted value is 90.20% for Cubic SVM. The sensitivity can be confirmed using the confusion matrix shown in Figure 7. The machine learning classifiers Cubic SVM, Fine Tree, Linear SVM (LSVM), Quadratic SVM (QSVM), Fine Gaussian SVM (FG-SVM), Gaussian Naïve Bayes (GN-Bayes), Fine KNN (FKNN), Medium KNN (MKNN), Weighted KNN (WKNN), and Coarse KNN (Co-KNN) are applied to classify the mammography images.
In Table 3, the Fine-tuned Nasnet model is applied to the CBIS-DDSM dataset. The deep features of the dataset are extracted from the GAP layer and fed to classifiers. The highest accuracy is 93.9%, which is achieved by the Cubic SVM classifier in 112.96 (s). In this table, the minimum time is taken by Fine Tree, which is 16.91 (s), but the accuracy is 89.8%, which is less than Cubic SVM.
The second highest accuracy is achieved by WKNN, which is 93.6% in 59.909 (s). Each classifier's sensitivity rate is also computed, and Cubic SVM achieved the best-noted value of 94%. A confusion matrix, as shown in Figure 8, can be used to confirm it.
In Table 4, MEWOA on MobilenetV2 is applied to the CBIS-DDSM dataset. The deep features of the dataset are extracted from the GAP layer and fed to classifiers. The highest accuracy is 90.0%, which is achieved by Cubic SVM in 132.98 (s). In this table, the minimum time is taken by GN-Bayes, which is 8.70 (s), but the accuracy is 70.5%, which is less than Cubic SVM.
The second highest accuracy is achieved by WKNN, which is 87.4% in 37.385 (s). Each classifier's sensitivity rate is also calculated, and Cubic SVM achieved the best-noted value of 89.95%. A confusion matrix, as shown in Figure 9, can be used to confirm it.
In Table 5, MEWOA on Nasnet is applied to the CBIS-DDSM dataset. The deep features of the dataset are extracted from the GAP layer and fed to classifiers. The highest accuracy is 93.50%, which is achieved by Cubic SVM in 73.24 (s).
In this table, the minimum time is taken by GN-Bayes, which is 11.57 (s), but the accuracy is 83.7%, which is less than Cubic SVM. The second highest accuracy is achieved by QSVM, which is 92.30% in 75.26 (s). The sensitivity rate of each classifier is also computed, and the Cubic SVM achieves the best-noted value that is 93.50%. The sensitivity can be calculated by the confusion matrix, as illustrated in Figure 10.
In Table 6, by using CBIS-DDSM dataset, Serial Fusion on MobilenetV2 and Nasnet deep features is applied. The GAP layer extracts deep features from the dataset and feeds them to classifiers.
The highest accuracy is 94.1%, which is achieved by Cubic SVM in 314.97 (s). In this table, the minimum time is taken by GN-Bayes, which is 55.161 (s), but its accuracy of 85.5% is less than that of Cubic SVM. The second highest accuracy is achieved by QSVM, which is 93.0% in 265.45 (s). Each classifier's sensitivity rate is also calculated; Cubic SVM has the best-noted value of 94.1%, which can be confirmed using the confusion matrix described in Figure 11.
In Table 7, MEWOA on fusion is applied to the CBIS-DDSM dataset. The deep features of the dataset are extracted from the GAP layer and fed to classifiers.
The highest accuracy is 93.8%, which is achieved by Cubic SVM in 255.84 (s). In this table, the minimum time is taken by Fine Tree, which is 42.42 (s), but the accuracy is 88%, which is less than Cubic SVM.
The second highest accuracy is achieved by QSVM, which is 93.0% in 227.28 (s). Each classifier's sensitivity rate is also calculated; Cubic SVM has the best-noted value of 93.75%, which can be verified using the confusion matrix defined in Figure 12.
Figure 13 shows the time comparison graph of the deep learning models using the machine learning classifiers.
The Fine Tree classifier took the maximum time with MEWOA on the Fine-tuned Nasnet model, the FG-SVM classifier took the maximum time with serial fusion, and GN-Bayes took the minimum time.
In Table 8, the Fine-tuned MobilenetV2 model is applied to the MIAS dataset. The GAP layer is used to extract the deep features and fed to classifiers.
The highest accuracy is 99.4%, which is achieved by the Cubic SVM classifier in 85.29 (s). In this table, the minimum time is taken by Fine Tree, which is 22.82 (s), but the accuracy is 88.9%, which is less than Cubic SVM.
The second highest accuracy is achieved by QSVM, which is 99.3% in 79.88 (s). Each classifier's sensitivity is computed; Cubic SVM has the best-noted value of 98.73%, which can be verified using the confusion matrix presented in Figure 14.
In Table 9, the Fine-tuned Nasnet model is applied to the MIAS dataset. The GAP layer extracts the dataset’s deep features and feeds them to the classifiers.
The highest accuracy is 99.7%, which is achieved by the WKNN classifier in 81.459 (s). In this table, the minimum time is taken by Fine Tree, which is 35.187 (s), but the accuracy is 99.1%, which is less than WKNN.
The second highest accuracy is achieved by Cubic SVM, which is 99.6% in 267.43 (s). Each classifier's sensitivity rate is computed; WKNN has the best sensitivity rate of 99.2%, which can be confirmed using the confusion matrix described in Figure 15.
In Table 10, MEWOA on Fine-tuned MobilenetV2 model is applied to the MIAS dataset. The GAP layer is used to extract the deep features and fed to classifiers.
The highest accuracy is 99.4%, which is achieved by the Cubic SVM classifier in 75.49 (s). In this table, the minimum time is taken by Fine Tree that is 20.09 (s), but the accuracy is 89.1%, which is less than Cubic SVM.
The second highest accuracy is achieved by FKNN, which is 99.3% in 65.81 (s). Each classifier's sensitivity rate is calculated, and Cubic SVM has the best-noted value of 98.87%. The sensitivity rate can be confirmed using the confusion matrix described below in Figure 16.
In Table 11, MEWOA on the Fine-tuned Nasnet model is applied to the MIAS dataset. The deep features are extracted from the GAP layer and fed to classifiers. The highest accuracy is 99.7%, which is achieved by the WKNN classifier in 24.70 (s). In this table, the minimum time is taken by Fine Tree, which is 9.40 (s), but the accuracy is 98.9%, which is less than that of WKNN.
The second highest accuracy is achieved by Cubic SVM, which is 99.6% in 18.35 (s). Each classifier's sensitivity rate is calculated, and WKNN has the best value of 99%, which can be confirmed using the confusion matrix shown in Figure 17.
In Table 12, Fusion on Fine-tuned MobilenetV2 and the Nasnet model are applied on the MIAS dataset. The GAP layer extracts deep features from the dataset and fed them to the classifiers.
The highest accuracy is 99.8%, which is achieved by the Cubic SVM classifier in 133.46 (s). In this table, the minimum time is taken by GN-Bayes, which is 60.069 (s), but the accuracy is 96.4%, which is less than that of Cubic SVM.
The second highest accuracy is achieved by Linear SVM, which is 99.6% in 115.46 (s). Each classifier's sensitivity rate is calculated; Cubic SVM has the best value of 99.66%, which can be verified by the confusion matrix described in Figure 18.
In Table 13, MEWOA on Fusion is applied to the MIAS dataset. The deep features of the dataset are extracted from the GAP layer and fed to classifiers.
The highest accuracy is 99.8%, which is achieved by the Cubic SVM classifier in 63.287 (s). In this table, the minimum time is taken by GN-Bayes, which is 7.9 (s), but the accuracy is 95.7%, which is less than that of Cubic SVM.
The second highest accuracy is achieved by QSVM, which is 99.7% in 15.37 (s). The sensitivity rate of each classifier is also computed, and the best-noted value for Cubic SVM is 99 %, which can be verified using a confusion matrix, as shown in Figure 19.
In Figure 20, the time comparison graph of the models by using the machine learning classifiers is shown. FG-SVM utilized the maximum time in the fusion model. The second highest time was utilized by the FG-SVM classifier in MEWOA serial fusion deep features. GN-Bayes utilized minimum time.
In Table 14, Fine-tuned MobilenetV2 was applied on the INbreast dataset. The GAP layer is used to extract the deep features of the dataset and fed to classifiers. The highest accuracy is 98.3%, which is achieved by the LSVM classifier in 18.80 (s). In this table, the minimum time is taken by GN-Bayes, which is 13.53 (s), but the accuracy is 94.8%, which is less than that of Linear SVM. The second highest accuracy is achieved by QSVM, which is 98.2% in 16.11 (s).
The sensitivity rate of each classifier is also computed, and the best-noted value is 98.35% for LSVM. It can be confirmed using a confusion matrix, as illustrated in Figure 21.
In Table 15, Fine-tuned Nasnet is applied on the INbreast dataset. The deep features of the dataset are extracted from the GAP layer and fed to classifiers. The highest accuracy is 98.6%, which is achieved by the Cubic SVM classifier in 10.85 (s). In this table, the minimum time is taken by QSVM, which is 9.549 (s), and the accuracy is 98.6%.
The second highest accuracy is achieved by GN-Bayes, which is 98.4% in 13.07 (s). Each classifier's sensitivity rate is calculated, and QSVM has the best-noted value of 98.5%. The sensitivity rate can be verified by the confusion matrix described in Figure 22.
In Table 16, MEWOA was applied on Fine-tuned MobilenetV2 by using INbreast dataset. The deep features of the dataset are extracted from the GAP layer and fed to classifiers. The highest accuracy is 98.3%, which is achieved by the Fine KNN classifier in 35.41 (s). In this table, the minimum time is taken by GN-Bayes, which is 8.4453 (s), and the accuracy is 94.0%.
The second highest accuracy is achieved by QSVM, which is 98.2% in 8.86 (s). The sensitivity rate of each classifier is also computed, and the best-noted value is 98% for Cubic SVM. A confusion matrix, as shown in Figure 23, can be used to verify it.
In Table 17, MEWOA was applied on Fine-tuned Nasnet by using INbreast dataset. The GAP layer is used to extract the deep features of the dataset and features are fed to classifiers.
The highest accuracy is 98.6%, which is achieved by the Cubic SVM classifier in 6.24 (s). In this table, the minimum time is taken by QSVM, which is 4.55 (s), and the accuracy is 98.6%.
The second highest accuracy is achieved by WKNN, which is 98.5% in 10.47 (s). The sensitivity rate of each classifier is also computed, and Cubic SVM has the best-noted value of 98.5%, which can be verified using the confusion matrix described in Figure 24.
In Table 18, Fusion on MEWOA MobilenetV2 & Nasnet model is applied on the INbreast dataset. The deep features of the dataset are extracted from the GAP layer and fed to classifiers.
The highest accuracy is 99.9%, which is achieved by the Cubic SVM classifier in 23.04 (s). In this table, the minimum time is taken by Fine Tree, which is 17.63 (s), and the accuracy is 98.8%.
The second highest accuracy is achieved by QSVM, which is 99.8% in 23.68 (s). The sensitivity rate of each classifier is also computed, and the best-noted value is 99.9% for Cubic SVM. The confusion matrix is illustrated in Figure 25 to verify the results.
In Table 19, MEWOA on Fusion is applied to the INbreast dataset. The deep features of the dataset are extracted from the GAP layer and fed to classifiers.
The highest accuracy is 99.7%, which is achieved by the WKNN classifier in 6.57 (s). In this table, the minimum time is taken by Cubic SVM, which is 1.6178 (s), with an accuracy of 99.1%.
The second highest accuracy is achieved by Quadratic SVM, which is 99.6% in 1.9933 (s). The sensitivity rate of each classifier is also computed, and the Cubic SVM has a best-noted value that is 99%. A confusion matrix, as shown in Figure 26, can be used to verify the sensitivity rate.
In Figure 27, the time comparison graph of the deep learning models by using the machine learning classifiers is presented. FG-SVM utilized maximum time in the fusion model. The second highest time was utilized by the Fine KNN classifier in Fine-tuned MobilenetV2. The QSVM utilized minimum time.
Table 20 compares the classification of CBIS-DDSM images with other classification studies. The number of images, methods, sensitivity, precision, F1-score, AUC, and accuracy are listed in the table. The proposed method shows better results compared with the other studies.
Table 21 presents a comparative analysis of classification on the MIAS dataset. The number of images, methods, sensitivity, precision, F1-score, AUC, and accuracy are compared with the other studies. The proposed method shows good results compared to other studies.
Table 22 shows classification comparisons of the INbreast images with respect to other classification studies. The number of images, methodology, sensitivity, F1-score, precision, AUC, and accuracy results are listed in the table. The proposed method shows good results compared with the other studies.

4. Discussion

Breast cancer is a fatal disease for women all over the world, and lives can be saved if the cancer is detected at an early stage. Classifying mammogram images on the basis of features is difficult because extracting optimal features from mammograms is a challenging task. Three publicly available mammogram datasets, CBIS-DDSM, INbreast, and mini-MIAS, are used to extract the features and perform the classification. Data augmentation is performed to increase the volume of data, because deep learning models perform best when trained on large datasets. The datasets are processed with the Fine-tuned MobilenetV2 and Nasnet Mobile models. To improve efficiency, the deep features are extracted from the average pooling layer and fed into MEWOA, which selects the optimal features and reduces the computational cost. Serial fusion is then performed on the MEWOA-optimized MobilenetV2 and Nasnet Mobile features, and MEWOA is applied again to the fused features to select the best optimized features. Finally, the machine learning classifiers are applied. To estimate the performance of the system, different measures are computed: sensitivity, precision, F1-score, AUC, FPR, accuracy, and time. All computations are performed in MATLAB R2020a on a personal computer with 16 GB of RAM and a 4 GB graphics card. Time comparison graphs are provided to compare the different classifiers.
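Since the full MEWOA formulation is presented earlier in the paper, the following is only a simplified, illustrative sketch of the general idea of entropy-guided, wrapper-style feature selection. The toy position update, population size, bit-flip rate, and fitness function are assumptions for demonstration and do not reproduce the authors' algorithm.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def feature_entropy(X, bins=16):
    """Shannon entropy of each feature column (used to bias the initial selection)."""
    ent = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        counts, _ = np.histogram(X[:, j], bins=bins)
        p = counts / counts.sum()
        p = p[p > 0]
        ent[j] = -(p * np.log2(p)).sum()
    return ent

def fitness(mask, X, y):
    """Cross-validated accuracy of a cheap wrapper classifier on the selected columns."""
    if not mask.any():
        return 0.0
    clf = KNeighborsClassifier(n_neighbors=5)
    return cross_val_score(clf, X[:, mask], y, cv=3).mean()

def select_features(X, y, n_agents=10, n_iter=20):
    n_feat = X.shape[1]
    ent = feature_entropy(X)
    prob = ent / (ent.max() + 1e-12)                 # favour high-entropy features
    pop = rng.random((n_agents, n_feat)) < prob      # initial binary selection masks
    best_mask, best_fit = pop[0].copy(), -1.0
    for t in range(n_iter):
        for i in range(n_agents):
            f = fitness(pop[i], X, y)
            if f > best_fit:
                best_fit, best_mask = f, pop[i].copy()
        a = 2.0 * (1.0 - t / n_iter)                 # shrinking coefficient, as in WOA
        for i in range(n_agents):
            pull = rng.random(n_feat) < 0.5          # move half the bits toward the best agent
            pop[i] = np.where(pull, best_mask, pop[i])
            flip = rng.random(n_feat) < 0.05 * a     # exploration: random bit flips, decaying
            pop[i] = np.logical_xor(pop[i], flip)
    return best_mask, best_fit

# usage with placeholder deep features
X = rng.random((150, 60))
y = rng.integers(0, 2, 150)
mask, fit = select_features(X, y)
print(mask.sum(), "features kept, CV accuracy", round(fit, 3))
```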
A limitation of the proposed approach is that it requires a large volume of data, so data augmentation is used to increase the size of the datasets; the results improve when the data size is large. Deep learning training on large datasets also takes more time, which is why the transfer learning approach is used to increase the efficiency of the system.
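As an illustration of the augmentation and transfer-learning steps mentioned above, a minimal Keras sketch is given below. The rotation/flip settings, input size, frozen backbone, and two-class head are assumptions for demonstration, not the exact configuration used in this work.

```python
import tensorflow as tf

# Transfer learning: reuse an ImageNet-pretrained MobilenetV2 backbone and train
# only a small classification head on the mammogram images.
base = tf.keras.applications.MobileNetV2(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3), pooling="avg")
base.trainable = False

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(2, activation="softmax"),   # benign vs. malignant head
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="categorical_crossentropy", metrics=["accuracy"])

# Data augmentation: enlarge the training set with rotated/flipped/shifted copies.
augment = tf.keras.preprocessing.image.ImageDataGenerator(
    rotation_range=20, horizontal_flip=True, vertical_flip=True,
    width_shift_range=0.1, height_shift_range=0.1, rescale=1.0 / 255)
# train_gen = augment.flow_from_directory("mammograms/train", target_size=(224, 224))
# model.fit(train_gen, epochs=10)
```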

5. Conclusions

In the medical imaging field, extracting features and classifying images on the basis of optimized features using deep learning procedures is a core task, and machine learning classifiers are applied to generate more productive results. The proposed work employed the Fine-tuned MobilenetV2 and Nasnet Mobile models to train on three imbalanced datasets. The average pooling layer is used to extract the deep features, and transfer learning with the Adam optimizer is utilized to fine-tune the models before MEWOA is applied to their deep features. The extracted deep features of the two optimized models are fused by non-redundant serial fusion, and the fused deep features are optimized again by MEWOA. Finally, the classification results are obtained by applying the machine learning classifiers. The fusion step increases the accuracy of the results but also increases the computation time; MEWOA reduces this time by optimizing the features. Using these techniques, the false-positive and false-negative rates are reduced. This methodology can help radiologists as a second opinion by addressing the problems of optimal feature extraction and classification on the basis of the optimal features.
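As a companion to this conclusion, the sketch below illustrates what serial (column-wise) fusion of two deep-feature matrices followed by an SVM can look like. The correlation filter is only a simple stand-in for the non-redundancy step, and the feature widths, data, and threshold are placeholder assumptions.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def serial_fuse(f1, f2, corr_threshold=0.95):
    """Concatenate two feature matrices and drop near-duplicate columns."""
    fused = np.hstack([f1, f2])
    corr = np.corrcoef(fused, rowvar=False)
    keep = np.ones(fused.shape[1], dtype=bool)
    for i in range(fused.shape[1]):
        if keep[i]:
            dup = np.abs(corr[i, i + 1:]) > corr_threshold
            keep[i + 1:] &= ~dup
    return fused[:, keep]

rng = np.random.default_rng(0)
f_mobilenet = rng.random((200, 1280))   # placeholder GAP features (MobilenetV2 width)
f_nasnet = rng.random((200, 1056))      # placeholder GAP features (Nasnet Mobile width)
y = rng.integers(0, 2, 200)

X = serial_fuse(f_mobilenet, f_nasnet)
acc = cross_val_score(SVC(kernel="poly", degree=3), X, y, cv=5).mean()  # cubic-SVM analogue
print(X.shape, round(acc, 3))
```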

Author Contributions

This research was carried out as part of the PhD work of S.Z. under the supervision of U.S. and the co-supervision of I.U.L. All authors have read and agreed to the published version of the manuscript.

Funding

Not applicable.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

CBIS-DDSM, MIAS, and INbreast are publicly available datasets and are easily accessible online. The INbreast dataset was used with permission from the INbreast research group.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Sung, H.; Ferlay, J.; Siegel, R.L.; Laversanne, M.; Soerjomataram, I.; Jemal, A.; Bray, F. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 2021, 71, 209–249. [Google Scholar] [CrossRef] [PubMed]
  2. Zhao, Z.-Q.; Zheng, P.; Xu, S.-T.; Wu, X. Object detection with deep learning: A review. IEEE Trans. Neural Netw. Learn. Syst. 2019, 30, 3212–3232. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  3. Sharma, G.N.; Dave, R.; Sanadya, J.; Sharma, P.; Sharma, K. Various types and management of breast cancer: An overview. J. Adv. Pharm. Technol. Res. 2010, 1, 109. [Google Scholar]
  4. Ekpo, E.U.; Alakhras, M.; Brennan, P. Errors in mammography cannot be solved through technology alone. Asian Pac. J. Cancer Prev. APJCP 2018, 19, 291. [Google Scholar]
  5. Lauby-Secretan, B.; Scoccianti, C.; Loomis, D.; Benbrahim-Tallaa, L.; Bouvard, V.; Bianchini, F.; Straif, K. Breast-cancer screening—Viewpoint of the IARC Working Group. N. Engl. J. Med. 2015, 372, 2353–2358. [Google Scholar] [CrossRef] [Green Version]
  6. Khan, M.A.; Ashraf, I.; Alhaisoni, M.; Damaševičius, R.; Scherer, R.; Rehman, A.; Bukhari, S.A.C. Multimodal brain tumor classification using deep learning and robust feature selection: A machine learning application for radiologists. Diagnostics 2020, 10, 565. [Google Scholar] [CrossRef] [PubMed]
  7. Khan, M.A.; Sharif, M.; Akram, T.; Damaševičius, R.; Maskeliūnas, R. Skin lesion segmentation and multiclass classification using deep learning features and improved moth flame optimization. Diagnostics 2021, 11, 811. [Google Scholar] [CrossRef]
  8. Zahoor, S.; Lali, I.U.; Khan, M.A.; Javed, K.; Mehmood, W. Breast cancer detection and classification using traditional computer vision techniques: A comprehensive review. Curr. Med. Imaging 2020, 16, 1187–1200. [Google Scholar] [CrossRef]
  9. Coolen, A.M.; Voogd, A.C.; Strobbe, L.J.; Louwman, M.W.; Tjan-Heijnen, V.C.; Duijm, L.E. Impact of the second reader on screening outcome at blinded double reading of digital screening mammograms. Br. J. Cancer 2018, 119, 503–507. [Google Scholar] [CrossRef]
  10. Jung, N.Y.; Kang, B.J.; Kim, H.S.; Cha, E.S.; Lee, J.H.; Park, C.S.; Whang, I.Y.; Kim, S.H.; An, Y.Y.; Choi, J.J. Who could benefit the most from using a computer-aided detection system in full-field digital mammography? World J. Surg. Oncol. 2014, 12, 1–9. [Google Scholar] [CrossRef] [Green Version]
  11. Freer, T.W.; Ulissey, M.J. Screening mammography with computer-aided detection: Prospective study of 12,860 patients in a community breast center. Radiology 2001, 220, 781–786. [Google Scholar] [CrossRef]
  12. Warren Burhenne, L.J.; Wood, S.A.; D’Orsi, C.J.; Feig, S.A.; Kopans, D.B.; O’Shaughnessy, K.F.; Sickles, E.A.; Tabar, L.; Vyborny, C.J.; Castellino, R.A. Potential contribution of computer-aided detection to the sensitivity of screening mammography. Radiology 2000, 215, 554–562. [Google Scholar] [CrossRef]
  13. Saleem, F.; Khan, M.A.; Alhaisoni, M.; Tariq, U.; Armghan, A.; Alenezi, F.; Choi, J.-I.; Kadry, S. Human gait recognition: A single stream optimal deep learning features fusion. Sensors 2021, 21, 7584. [Google Scholar] [CrossRef]
  14. Khan, S.; Khan, M.A.; Alhaisoni, M.; Tariq, U.; Yong, H.-S.; Armghan, A.; Allenzi, F. Human Action Recognition: Paradigm of Best Deep Learning Features Selection and Serial Based Extended Fusion. Sensors 2021, 21, 7941. [Google Scholar] [CrossRef]
  15. Khan, M.A.; Alhaisoni, M.; Tariq, U.; Hussain, N.; Majid, A.; Damaševičius, R.; Maskeliunas, R. COVID-19 Case Recognition from Chest CT Images by Deep Learning, Entropy-Controlled Firefly Optimization, and Parallel Feature Fusion. Sensors 2021, 21, 7286. [Google Scholar] [CrossRef]
  16. Azhar, I.; Sharif, M.; Raza, M.; Khan, M.A.; Yong, H.-S. A Decision Support System for Face Sketch Synthesis Using Deep Learning and Artificial Intelligence. Sensors 2021, 21, 8178. [Google Scholar] [CrossRef]
  17. Chai, J.; Zeng, H.; Li, A.; Ngai, E.W. Deep learning in computer vision: A critical review of emerging techniques and application scenarios. Mach. Learn. Appl. 2021, 6, 100134. [Google Scholar] [CrossRef]
  18. Khan, M.A.; Rajinikanth, V.; Satapathy, S.C.; Taniar, D.; Mohanty, J.R.; Tariq, U.; Damasevicius, R. VGG19 Network Assisted Joint Segmentation and Classification of Lung Nodules in CT Images. Diagnostics 2021, 11, 2208. [Google Scholar] [CrossRef]
  19. Nawaz, M.; Nazir, T.; Masood, M.; Mehmood, A.; Mahum, R.; Khan, M.A.; Kadry, S.; Thinnukool, O. Analysis of brain MRI images using improved cornernet approach. Diagnostics 2021, 11, 1856. [Google Scholar] [CrossRef]
  20. Ragab, D.A.; Sharkas, M.; Marshall, S.; Ren, J. Breast cancer detection using deep convolutional neural networks and support vector machines. PeerJ 2019, 7, e6201. [Google Scholar] [CrossRef]
  21. Suzuki, S.; Zhang, X.; Homma, N.; Ichiji, K.; Sugita, N.; Kawasumi, Y.; Ishibashi, T.; Yoshizawa, M. Mass detection using deep convolutional neural network for mammographic computer-aided diagnosis. In Proceedings of the 2016 55th Annual Conference of the Society of Instrument and Control Engineers of Japan (SICE), Tsukuba, Japan, 20–23 September 2016; p. 1382. [Google Scholar]
  22. Sharma, S.; Khanna, P. Computer-aided diagnosis of malignant mammograms using Zernike moments and SVM. J. Digit. Imaging 2015, 28, 77–90. [Google Scholar] [CrossRef]
  23. Shen, L.; Margolies, L.R.; Rothstein, J.H.; Fluder, E.; McBride, R.; Sieh, W. Deep learning to improve breast cancer detection on screening mammography. Sci. Rep. 2019, 9, 1–12. [Google Scholar] [CrossRef]
  24. Falconi, L.G.; Perez, M.; Aguilar, W.G.; Conci, A. Transfer learning and fine tuning in breast mammogram abnormalities classification on CBIS-DDSM database. Adv. Sci. Technol. Eng. Syst. 2020, 5, 154–165. [Google Scholar] [CrossRef] [Green Version]
  25. Khan, H.N.; Shahid, A.R.; Raza, B.; Dar, A.H.; Alquhayz, H. Multi-view feature fusion based four views model for mammogram classification using convolutional neural network. IEEE Access 2019, 7, 165724–165733. [Google Scholar] [CrossRef]
  26. Ansar, W.; Shahid, A.R.; Raza, B.; Dar, A.H. Breast cancer detection and localization using mobilenet based transfer learning for mammograms. In Proceedings of the International Symposium on Intelligent Computing Systems, Sharjah, United Arab Emirates, 18–19 March 2020; Springer: Cham, Switzerland, 2020; pp. 11–21. [Google Scholar]
  27. Chakraborty, J.; Midya, A.; Rabidas, R. Computer-aided detection and diagnosis of mammographic masses using multi-resolution analysis of oriented tissue patterns. Expert Syst. Appl. 2018, 99, 168–179. [Google Scholar] [CrossRef]
  28. Lbachir, I.A.; Daoudi, I.; Tallal, S. Automatic computer-aided diagnosis system for mass detection and classification in mammography. Multimed. Tools Appl. 2021, 80, 9493–9525. [Google Scholar] [CrossRef]
  29. Aminikhanghahi, S.; Shin, S.; Wang, W.; Jeon, S.I.; Son, S.H. A new fuzzy Gaussian mixture model (FGMM) based algorithm for mammography tumor image classification. Multimed. Tools Appl. 2017, 76, 10191–10205. [Google Scholar] [CrossRef]
  30. Al-Antari, M.A.; Al-Masni, M.A.; Choi, M.-T.; Han, S.-M.; Kim, T.-S. A fully integrated computer-aided diagnosis system for digital X-ray mammograms via deep learning detection, segmentation, and classification. Int. J. Med. Inform. 2018, 117, 44–54. [Google Scholar] [CrossRef]
  31. Khamparia, A.; Bharati, S.; Podder, P.; Gupta, D.; Khanna, A.; Phung, T.K.; Thanh, D.N. Diagnosis of breast cancer based on modern mammography using hybrid transfer learning. Multidimens. Syst. Signal Process. 2021, 32, 747–765. [Google Scholar] [CrossRef]
  32. Zhang, Q.; Li, Y.; Zhao, G.; Man, P.; Lin, Y.; Wang, M. A novel algorithm for breast mass classification in digital mammography based on feature fusion. J. Healthc. Eng. 2020, 2020, 8860011. [Google Scholar] [CrossRef]
  33. Ridhi, A.; Rai, P.K.; Balasubramanian, R. Deep feature-based automatic classification of mammograms. Med. Biol. Eng. Comput. 2020, 58, 1199–1211. [Google Scholar]
  34. Dhungel, N.; Carneiro, G.; Bradley, A.P. A deep learning approach for the analysis of masses in mammograms with minimal user intervention. Med. Image Anal. 2017, 37, 114–128. [Google Scholar] [CrossRef] [Green Version]
  35. Garcia-Garcia, A.; Orts-Escolano, S.; Oprea, S.; Villena-Martinez, V.; Garcia-Rodriguez, J. A review on deep learning techniques applied to semantic segmentation. arXiv 2017, arXiv:1704.06857. [Google Scholar]
  36. Gardezi, S.J.S.; Elazab, A.; Lei, B.; Wang, T. Breast cancer detection and diagnosis using mammographic data: Systematic review. J. Med. Internet Res. 2019, 21, e14464. [Google Scholar] [CrossRef] [Green Version]
  37. Abbas, A.; Abdelsamea, M.M.; Gaber, M.M. Detrac: Transfer learning of class decomposed medical images in convolutional neural networks. IEEE Access 2020, 8, 74901–74913. [Google Scholar] [CrossRef]
  38. Al-Antari, M.A.; Han, S.-M.; Kim, T.-S. Evaluation of deep learning detection and classification towards a computer-aided diagnosis of breast lesions in digital X-ray mammograms. Comput. Methods Programs Biomed. 2020, 196, 105584. [Google Scholar] [CrossRef]
  39. Ribli, D.; Horváth, A.; Unger, Z.; Pollner, P.; Csabai, I. Detecting and classifying lesions in mammograms with deep learning. Sci. Rep. 2018, 8, 1–7. [Google Scholar] [CrossRef] [Green Version]
  40. Agnes, S.A.; Anitha, J.; Pandian, S.I.A.; Peter, J.D. Classification of mammogram images using multiscale all convolutional neural network (MA-CNN). J. Med. Syst. 2020, 44, 1–9. [Google Scholar] [CrossRef] [PubMed]
  41. Dhungel, N.; Carneiro, G.; Bradley, A.P. The automated learning of deep features for breast mass classification from mammograms. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Athens, Greece, 17–21 October 2016; Springer: Berlin/Heidelberg, Germany, 2016. [Google Scholar]
  42. Jagtap, A.D.; Kawaguchi, K.; Karniadakis, G.E. Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. J. Comput. Phys. 2020, 404, 109136. [Google Scholar] [CrossRef] [Green Version]
  43. Jagtap, A.D.; Kawaguchi, K.; Karniadakis, G.E. Locally adaptive activation functions with slope recovery for deep and physics-informed neural networks. Proc. R. Soc. A 2020, 476, 20200334. [Google Scholar] [CrossRef]
  44. Jagtap, A.D.; Shin, Y.; Kawaguchi, K.; Karniadakis, G.E. Deep Kronecker neural networks: A general framework for neural networks with adaptive activation functions. arXiv 2021, arXiv:2105.09513. [Google Scholar] [CrossRef]
  45. Lee, R.S.; Gimenez, F.; Hoogi, A.; Rubin, D. Curated breast imaging subset of DDSM. Cancer Imaging Arch. 2016, 8, 2016. [Google Scholar]
  46. Moreira, I.C.; Amaral, I.; Domingues, I.; Cardoso, A.; Cardoso, M.J.; Cardoso, J.S. Inbreast: Toward a full-field digital mammographic database. Acad. Radiol. 2012, 19, 236–248. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  47. Hou, X.; Bai, Y.; Xie, Y.; Li, Y. Mass segmentation for whole mammograms via attentive multi-task learning framework. Phys. Med. Biol. 2021, 66, 105015. [Google Scholar] [CrossRef] [PubMed]
  48. Attique Khan, M.; Sharif, M.; Akram, T.; Kadry, S.; Hsu, C.H. A two-stream deep neural network-based intelligent system for complex skin cancer types classification. Int. J. Intell. Syst. 2021, 1–29. [Google Scholar] [CrossRef]
  49. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.-C. Mobilenetv2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4510–4520. [Google Scholar]
  50. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.-Y.; Berg, A.C. Ssd: Single shot multibox detector. In European Conference on Computer Vision; Springer: Berlin/Heidelberg, Germany, 2016; pp. 21–37. [Google Scholar]
  51. Zoph, B.; Vasudevan, V.; Shlens, J.; Le, Q.V. Learning transferable architectures for scalable image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 8697–8710. [Google Scholar]
  52. Zhang, X.; Zhou, X.; Lin, M.; Sun, J. Shufflenet: An extremely efficient convolutional neural network for mobile devices. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 6848–6856. [Google Scholar]
  53. Szegedy, C.; Ioffe, S.; Vanhoucke, V.; Alemi, A.A. Inception-v4, inception-resnet and the impact of residual connections on learning. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017. [Google Scholar]
  54. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1–9. [Google Scholar]
  55. Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826. [Google Scholar]
  56. Weiss, K.; Khoshgoftaar, T.M.; Wang, D. A survey of transfer learning. J. Big Data 2016, 3, 9. [Google Scholar] [CrossRef] [Green Version]
  57. Pan, S.J.; Yang, Q. A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 2010, 22, 1345–1359. [Google Scholar] [CrossRef]
  58. Guo, W.; Liu, T.; Dai, F.; Xu, P. An improved whale optimization algorithm for forecasting water resources demand. Appl. Soft Comput. 2020, 86, 105925. [Google Scholar] [CrossRef]
  59. El Houby, E.M.; Yassin, N.I. Malignant and nonmalignant classification of breast lesions in mammograms using convolutional neural networks. Biomed. Signal Process. Control 2021, 70, 102954. [Google Scholar] [CrossRef]
  60. Zhang, H.; Wu, R.; Yuan, T.; Jiang, Z.; Huang, S.; Wu, J.; Hua, J.; Niu, Z.; Ji, D. DE-Ada*: A novel model for breast mass classification using cross-modal pathological semantic mining and organic integration of multi-feature fusions. Inf. Sci. 2020, 539, 461–486. [Google Scholar] [CrossRef]
  61. Tsochatzidis, L.; Costaridou, L.; Pratikakis, I. Deep learning for breast cancer diagnosis from mammograms—A comparative study. J. Imaging 2019, 5, 37. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  62. Shams, S.; Platania, R.; Zhang, J.; Kim, J.; Lee, K.; Park, S.J. Deep generative breast cancer screening and diagnosis. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Granada, Spain, 16–20 September 2018; Springer: Cham, Switzerland, 2018; pp. 859–867. [Google Scholar]
  63. Chakravarthy, S.S.; Rajaguru, H. Automatic Detection and Classification of Mammograms Using Improved Extreme Learning Machine with Deep Learning. IRBM 2021, 43, 49–61. [Google Scholar] [CrossRef]
  64. Shayma’a, A.H.; Sayed, M.S.; Abdalla, M.I.; Rashwan, M.A. Breast cancer masses classification using deep convolutional neural networks and transfer learning. Multimed. Tools Appl. 2020, 79, 30735–30768. [Google Scholar]
  65. Kaur, P.; Singh, G.; Kaur, P. Intellectual detection and validation of automated mammogram breast cancer images by multi-class SVM using deep learning classification. Inform. Med. Unlocked 2019, 16, 100151. [Google Scholar] [CrossRef]
  66. Ting, F.F.; Tan, Y.J.; Sim, K.S. Convolutional neural network improvement for breast cancer classification. Expert Syst. Appl. 2019, 120, 103–115. [Google Scholar] [CrossRef]
Figure 1. Proposed architecture of classification of breast cancer using deep learning.
Figure 2. Sample images of the CBIS-DDSM dataset.
Figure 3. Sample images of the INbreast dataset.
Figure 4. Sample images of the MIAS dataset.
Figure 5. Data augmentation.
Figure 6. Transfer learning architecture.
Figure 7. Fine-tuned MobilenetV2 TPR for CBIS-DDSM.
Figure 8. Fine-tuned Nasnet for CBIS-DDSM.
Figure 9. MEWOA TPR on MobilenetV2 for CBIS-DDSM.
Figure 10. MEWOA TPR on Nasnet Mobile for CBIS-DDSM.
Figure 11. Fusion on MobilenetV2 and Nasnet TPR for CBIS-DDSM.
Figure 12. MEWOA on fusion TPR for CBIS-DDSM.
Figure 13. Time comparison with individual machine learning classifiers of deep learning models for CBIS-DDSM.
Figure 14. Fine-tuned MobilenetV2 TPR for MIAS.
Figure 15. Fine-tuned Nasnet TPR for MIAS.
Figure 16. MEWOA on Fine-tuned MobilenetV2 TPR for MIAS.
Figure 17. MEWOA on Fine-tuned Nasnet TPR for MIAS.
Figure 18. Fusion TPR for MIAS.
Figure 19. MEWOA on fusion TPR for MIAS.
Figure 20. Time comparison with individual machine learning classifiers of deep learning models for MIAS.
Figure 21. Fine-tuned MobilenetV2 TPR for INbreast.
Figure 22. Fine-tuned Nasnet TPR for INbreast.
Figure 23. MEWOA on Fine-tuned MobilenetV2 TPR for INbreast.
Figure 24. MEWOA on Fine-tuned Nasnet TPR for INbreast.
Figure 25. Fusion TPR for INbreast.
Figure 26. MEWOA on Fusion TPR for INbreast.
Figure 27. Time comparison with individual machine learning classifiers of deep learning models for INbreast.
Table 1. Dataset information.
Dataset | Total Images | Classes | Augmented Images
CBIS-DDSM | 1696 | 2 | 14,328
INbreast | 108 | 2 | 7200
MIAS | 300 | 3 | 14,400
Table 2. Classification results using Fine-tuned MobilenetV2 deep features for CBIS-DDSM.
ModelSensitivity (%)Precision (%)F1-Score (%)AUCFPRAccuracy (%)Time (s)
Cubic SVM90.2090.2590.220.960.10090.3260.13
Fine Tree72.2572.5072.340.780.27572.630.13
LSVM77.4577.5577.490.850.22577.6296.91
QSVM85.8585.9585.890.930.14086.0287.09
FG-SVM84.2088.5086.290.940.15585.2361.36
GN-Bayes71.6571.5571.590.780.28571.520.43
FKNN87.0586.9086.970.870.13087.073.68
MKNN69.7570.0569.890.770.30569.273.00
WKNN88.2588.1088.170.960.12088.272.92
Co-KNN67.4567.5067.470.750.32567.172.90
Table 3. Classification results using Fine-tuned Nasnet deep features for CBIS-DDSM.
ModelsSensitivity (%)Precision (%)F1-Score (%)AUCFPRAccuracy (%)Time (s)
Cubic SVM94.0094.0094.000.980.06093.9112.96
Fine Tree89.5090.0089.740.930.10589.816.91
LSVM89.5089.5089.500.960.10589.4117.51
QSVM92.5092.0092.240.980.07592.4113.75
FG-SVM84.0087.5085.710.950.16084.6275.70
GN-Bayes84.5084.5084.500.860.15584.018.27
FKNN93.0093.0093.000.930.01092.960.35
MKNN87.0086.5086.740.940.13086.560.00
WKNN94.0093.5093.740.980.06093.659.90
Co-KNN86.5086.5086.500.940.13586.460.19
Table 4. Classification results using MEWOA on MobilenetV2 deep features for CBIS-DDSM.
ModelsSensitivity (%)Precision (%)F1-Score (%)AUCFPRAccuracy (%)Time (s)
Cubic SVM89.9590.0589.990.960.10590.0132.98
Fine Tree70.2070.2570.220.760.30070.414.83
LSVM75.4575.5575.490.830.24575.7146.36
QSVM85.1585.2585.190.920.15085.3150.17
FG-SVM83.4588.0585.680.940.16584.5178.98
GN-Bayes70.6070.5070.540.780.29570.58.70
FKNN86.7586.6086.670.870.13086.737.96
MKNN69.4069.5569.470.770.30568.937.01
WKNN87.4087.3587.370.960.12587.437.38
Co-KNN67.3067.2567.270.740.33067.337.49
Table 5. Classification results using MEWOA Nasnet Mobile deep features for CBIS-DDSM.
ModelsSensitivity (%)Precision (%)F1-Score (%)AUCFPRAccuracy (%)Time (s)
Cubic SVM93.5093.4593.470.980.06593.573.24
Fine Tree88.9588.9588.950.920.11089.0996.56
LSVM89.1589.2589.190.960.11089.377.09
QSVM92.2592.3592.300.970.08092.375.26
FG-SVM84.2587.6085.890.950.16085.1182.78
GN-Bayes84.2084.4584.320.860.16083.711.57
FKNN92.6592.5092.570.930.07592.640.46
MKNN86.6086.4086.490.940.13586.439.42
WKNN93.5093.4593.470.980.06593.542.39
Co-KNN86.1086.0086.040.940.14086.139.88
Table 6. Classification results using Fusion on MobilenetV2 and Nasnet deep features for CBIS-DDSM.
ModelsSensitivity (%)Precision (%)F1-Score (%)AUCFPRAccuracy (%)Time (s)
Cubic SVM94.1094.1094.100.990.06094.1314.97
Fine Tree88.6088.6588.620.910.11588.7151.53
LSVM92.0592.0592.050.980.08092.1271.30
QSVM93.0093.0593.020.980.07093.0265.45
FG-SVM50.6576.8561.050.720.49554.0795.29
GN-Bayes85.8586.3086.070.870.14585.555.16
FKNN92.8092.6092.690.930.07592.6135.51
MKNN89.3589.2089.270.960.10589.2134.29
WKNN92.1092.0092.040.980.08092.1134.09
Co-KNN88.6088.4588.520.960.15088.5483.34
Table 7. Classification results using MEWOA on fusion deep features for CBIS-DDSM.
ModelsSensitivity (%)Precision (%)F1-Score (%)AUCFPRAccuracy (%)Time (s)
Cubic SVM93.7593.8093.770.980.12093.8255.84
Fine Tree87.9087.9587.920.900.12088.042.42
LSVM91.7591.8091.770.980.08091.8241.52
QSVM92.9092.9592.920.980.51093.0227.28
FG-SVM50.7076.8561.090.690.45054.0692.97
GN-Bayes85.9586.0085.970.880.14085.559.89
FKNN92.5092.4592.470.930.07592.5408.99
MKNN88.5088.4088.440.960.11588.283.916
WKNN92.2091.7591.970.980.08591.8407.35
Co-KNN88.6088.4588.520.960.11588.5407.14
Table 8. Classification results using Fine-tuned MobilenetV2 deep features for MIAS.
ModelsSensitivity (%)Precision (%)F1-Score (%)AUCFPRAccuracy (%)Time (s)
Cubic SVM98.7399.1098.911.000.00399.485.29
Fine Tree79.7386.0382.760.910.93388.922.82
LSVM96.8698.2097.521.000.01398.497.40
QSVM98.5098.9698.721.000.00399.379.88
FG-SVM96.0398.6397.311.000.01698.2353.67
GN-Bayes89.0381.4385.060.970.06088.524.15
FKNN98.9098.9398.911.000.00099.473.83
MKNN88.2389.9089.050.990.05392.675.73
WKNN97.7098.2697.971.000.00698.875.39
Co-KNN52.0688.4065.520.960.23677.974.30
Table 9. Classification results using Fine-tuned Nasnet deep features for MIAS.
ModelsSensitivity (%)Precision (%)F1-Score (%)AUCFPRAccuracy (%)Time (s)
Cubic SVM99.0399.1399.081.000.00099.6267.43
Fine Tree98.3398.3098.311.000.00699.135.18
LSVM97.1398.6097.861.000.02098.756.34
QSVM98.0698.8698.461.000.00699.158.91
FG-SVM95.8398.5697.171.000.01698.2193.88
GN-Bayes96.0091.0093.430.980.02095.471.71
FKNN99.0399.3399.181.000.00099.685.83
MKNN97.6398.0397.831.000.01098.780.99
WKNN99.2099.3099.241.000.00099.781.45
Co-KNN96.2097.896.991.000.0198.182.53
Table 10. Classification results using MEWOA on Fine-tuned MobilenetV2 deep features for MIAS.
ModelsSensitivity (%)Precision (%)F1-Score (%)AUCFPRAccuracy (%)Time (s)
Cubic SVM98.8798.1699.011.000.00099.475.49
Fine Tree80.0086.5683.150.910.09389.120.09
LSVM96.7798.1697.461.000.01698.386.78
QSVM98.7399.0698.891.000.00099.469.90
FG-SVM95.7798.6697.190.990.01698.1305.64
GN-Bayes89.4081.8085.430.960.05688.922.38
FKNN98.7798.8398.790.990.00699.365.81
MKNN88.4089.5088.940.980.05692.465.34
WKNN97.8098.4098.091.000.00698.964.05
Co-KNN51.3088.3064.890.960.24077.667.94
Table 11. Classification results using MEWOA on Fine-tuned Nasnet deep features for MIAS.
ModelsSensitivity (%)Precision (%)F1-Score (%)AUCFPRAccuracy (%)Time (s)
CSVM99.0099.0099.001.000.00099.618.35
Fine Tree97.6698.0097.831.000.00398.99.40
LSVM97.0099.0097.981.000.01098.817.49
QSVM98.0099.0098.491.000.00699.118.92
FG-SVM96.0098.6697.311.000.01698.280.55
GN-Bayes95.0090.0092.430.970.02394.99.06
FKNN98.6698.6698.660.990.00099.625.15
MKNN97.3397.6697.491.000.01098.524.62
WKNN99.0099.0099.001.000.00099.724.70
Co-KNN95.6698.0096.811.000.01398.125.10
Table 12. Classification results using Fusion deep features for MIAS.
ModelsSensitivity (%)Precision (%)F1-Score (%)AUCFPRAccuracy (%)Time (s)
Cubic SVM99.6699.8699.731.000.00399.8133.46
Fine Tree98.0098.0098.000.980.00698.9175.03
LSVM99.2099.7399.461.000.00699.6115.46
QSVM99.6699.9099.781.000.00099.8113.38
FG-SVM47.3391.6062.050.830.26076.21276.20
GN-Bayes97.2392.7094.910.980.01696.460.06
FKNN99.1698.9699.060.990.00399.4490.17
MKNN98.3399.0098.661.000.02099.3140.46
WKNN98.7699.4699.111.000.00399.4484.52
Co-KNN96.6098.3097.441.000.01098.7141.04
Table 13. Classification results using MEWOA on Fusion deep features for MIAS.
ModelsSensitivity (%)Precision (%)F1-Score (%)AUCFPRAccuracy (%)Time (s)
Cubic SVM99.0099.3399.161.000.00099.863.28
Fine Tree97.2097.1097.140.990.01398.231.91
LSVM98.7699.1698.961.000.00099.316.55
QSVM99.4399.7099.561.000.00099.715.37
FG-SVM47.6091.6062.640.780.26076.3699.49
GN-Bayes96.0091.6693.780.980.02095.77.09
FKNN99.2099.2099.200.990.00399.562.45
MKNN98.3098.5398.411.000.00698.962.35
WKNN98.8399.4099.111.000.00399.462.49
Co-KNN96.4698.0397.241.000.01398.263.85
Table 14. Classification results using Fine-tuned MobilenetV2 deep features for INbreast.
ModelsSensitivity (%)Precision (%)F1-Score (%)AUCFPRAccuracy (%)Time (s)
Cubic SVM98.2098.1598.170.990.02098.118.89
Fine Tree97.8597.7597.790.990.02097.834.00
LSVM98.3598.2598.291.000.01598.318.80
QSVM98.2598.2098.220.990.01598.216.11
FG-SVM97.6597.6097.620.980.02597.646.86
GN-Bayes94.8594.8094.820.970.05094.813.53
FKNN98.3098.2098.240.980.01598.271.07
MKNN95.3095.2595.271.000.06595.122.20
WKNN98.0598.0098.021.000.02098.070.38
Co-KNN92.7092.8092.740.990.07592.422.05
Table 15. Classification results using Fine-tuned Nasnet deep features for INbreast.
ModelsSensitivity (%)Precision (%)F1-Score (%)AUCFPRAccuracy (%)Time (s)
Cubic SVM98.5098.5098.501.000.015098.610.85
Fine Tree98.5098.5098.501.000.015098.314.92
LSVM98.5098.5098.501.000.015098.613.23
QSVM98.5098.5098.501.000.015098.69.54
FG-SVM98.5098.0098.240.990.015098.233.00
GN-Bayes98.0098.0098.000.990.015098.413.07
FKNN98.5098.5098.500.990.015098.659.50
MKNN98.0098.0098.001.000.015098.418.79
WKNN98.5098.0098.251.000.015098.418.53
Co-KNN98.0098.0098.001.000.02098.159.31
Table 16. Classification results using MEWOA on Fine-tuned MobilenetV2 deep features for INbreast.
Models | Sensitivity (%) | Precision (%) | F1-Score (%) | AUC | FPR | Accuracy (%) | Time (s)
Cubic SVM | 98 | 98 | 98 | 0.99 | 0.02 | 98.1 | 8.77
Fine Tree | 98 | 98 | 98 | 1.00 | 0.02 | 97.8 | 24.05
LSVM | 98 | 98 | 98 | 1.00 | 0.02 | 98.1 | 10.92
QSVM | 98 | 98 | 98 | 0.99 | 0.015 | 98.2 | 8.86
FG-SVM | 97.5 | 97.5 | 97.5 | 0.98 | 0.025 | 97.8 | 19.83
GN-Bayes | 94 | 94 | 94 | 0.97 | 0.06 | 94.0 | 8.44
FKNN | 98 | 98 | 98 | 0.98 | 0.015 | 98.3 | 35.41
MKNN | 94.5 | 94.5 | 94.5 | 1.00 | 0.055 | 94.2 | 34.88
WKNN | 98 | 98 | 98 | 1.00 | 0.02 | 98.2 | 34.86
Co-KNN | 93.5 | 93.5 | 93.5 | 0.98 | 0.065 | 93.4 | 34.89
Table 17. Classification results using MEWOA Fine-tuned Nasnet deep features for INbreast.
Models | Sensitivity (%) | Precision (%) | F1-Score (%) | AUC | FPR | Accuracy (%) | Time (s)
Cubic SVM | 98.50 | 98.50 | 98.50 | 1.00 | 0.0150 | 98.6 | 6.24
Fine Tree | 98.50 | 98.50 | 98.50 | 1.00 | 0.0150 | 98.4 | 5.44
LSVM | 98.50 | 98.50 | 98.50 | 1.00 | 0.0150 | 98.6 | 7.47
QSVM | 98.50 | 98.50 | 98.50 | 1.00 | 0.0150 | 98.6 | 4.55
FG-SVM | 98.50 | 98.00 | 98.24 | 0.99 | 0.0150 | 98.2 | 14.49
GN-Bayes | 98.00 | 98.00 | 98.00 | 0.99 | 0.0150 | 98.4 | 7.47
FKNN | 98.50 | 98.50 | 98.50 | 0.99 | 0.0150 | 98.6 | 29.35
MKNN | 98.50 | 98.40 | 98.44 | 1.00 | 0.0150 | 98.4 | 9.47
WKNN | 98.50 | 98.50 | 98.50 | 1.00 | 0.0150 | 98.5 | 10.47
Co-KNN | 98.00 | 97.50 | 97.74 | 1.00 | 0.0200 | 97.8 | 9.75
Table 18. Classification results using Fusion deep features for INbreast.
Models | Sensitivity (%) | Precision (%) | F1-Score (%) | AUC | FPR | Accuracy (%) | Time (s)
Cubic SVM | 99.90 | 99.90 | 99.90 | 1.00 | 0.000 | 99.9 | 23.04
Fine Tree | 99.00 | 99.00 | 99.00 | 0.99 | 0.010 | 98.8 | 17.63
LSVM | 99.50 | 99.50 | 99.50 | 1.00 | 0.000 | 99.8 | 26.72
QSVM | 99.50 | 99.50 | 99.50 | 1.00 | 0.000 | 99.8 | 23.68
FG-SVM | 52.65 | 76.95 | 62.52 | 0.76 | 0.475 | 55.0 | 149.5
GN-Bayes | 99.00 | 99.00 | 99.00 | 0.99 | 0.010 | 99.1 | 37.75
FKNN | 99.00 | 99.00 | 99.00 | 1.00 | 0.000 | 99.6 | 44.00
MKNN | 99.00 | 99.00 | 99.00 | 1.00 | 0.010 | 98.9 | 43.68
WKNN | 99.50 | 99.50 | 99.50 | 1.00 | 0.005 | 99.5 | 43.64
Co-KNN | 98.50 | 98.50 | 98.50 | 1.00 | 0.015 | 98.5 | 43.93
Table 19. Classification results using MEWOA on Fusion deep features for INbreast.
Models | Sensitivity (%) | Precision (%) | F1-Score (%) | AUC | FPR | Accuracy (%) | Time (s)
Cubic SVM | 99.00 | 99.10 | 99.10 | 1.00 | 0.010 | 99.1 | 1.61
Fine Tree | 98.50 | 98.50 | 98.50 | 0.99 | 0.015 | 98.6 | 21.07
LSVM | 99.00 | 99.00 | 99.00 | 1.00 | 0.005 | 99.6 | 5.30
QSVM | 99.00 | 99.00 | 99.00 | 1.00 | 0.000 | 99.6 | 1.99
FG-SVM | 52.50 | 77.00 | 62.43 | 0.78 | 0.475 | 54.9 | 8.40
GN-Bayes | 98.00 | 98.50 | 98.24 | 1.00 | 0.020 | 98.2 | 5.33
FKNN | 99.00 | 99.00 | 99.00 | 0.99 | 0.010 | 99.4 | 8.16
MKNN | 99.00 | 99.00 | 99.00 | 1.00 | 0.005 | 99.4 | 6.43
WKNN | 99.00 | 99.00 | 99.00 | 1.00 | 0.000 | 99.7 | 6.57
Co-KNN | 99.00 | 99.00 | 99.00 | 1.00 | 0.005 | 99.4 | 6.95
Table 20. Comparisons with the state of the art on the CBIS-DDSM dataset.
References | Year | Method | Images | Sensitivity (%) | Precision (%) | F1-Score (%) | AUC | Accuracy (%)
[59] | 2021 | CNN | 1592 | 92.31 | 90.00 | 91.76 | 0.92 | 91.2
[24] | 2020 | Nasnet, MobileNet, VGG, Resnet, Xception | 1696 | — | — | 85.0 | 0.84 | 84.4
[26] | 2020 | MobilenetV1, MobilenetV2 | 1696 | — | 70.00 | 76.00 | — | 74.5
[60] | 2020 | DE-Ada* | — | — | — | — | 92.19 | 87.05
[23] | 2019 | VGG, Residual Network | — | 86.10 | 80.10 | — | 0.91 | —
[61] | 2019 | DCNN, Alexnet | 1696 | — | — | — | 0.80 | 75.0
[62] | 2018 | Deep GeneRAtive Multitask | — | — | — | — | 88.4 | 89
Proposed Method | 2021 | MobilenetV2, Nasnet Mobile, MEWOA | 1696 | 93.75 | 93.80 | 93.77 | 0.98 | 93.8
Table 21. Comparisons with the state of the art for the MIAS dataset.
References | Year | Method | Images | Sensitivity (%) | Precision (%) | F1-Score (%) | AUC | Accuracy (%)
[63] | 2021 | ResNet-18, (ICS-ELM) | 322 | — | — | — | — | 98.13
[59] | 2021 | CNN | 322 | 92.72 | 94.12 | 93.58 | 0.94 | 93.39
[64] | 2020 | AlexNet, GoogleNet | 68 | 100, 80 | 97.37, 94.74 | 98.3, 85.71 | 0.98, 0.94 | 98.53, 88.24
[40] | 2019 | (MA-CNN) | 322 | 96.00 | — | — | 0.99 | 96.47
[65] | 2019 | DCNN, MSVM | 322 | — | — | — | 0.99 | 96.90
[66] | 2019 | Convolutional Neural Network Improvement (CNNI-BCC) | — | 89.47 | 90.71 | — | 0.90 | 90.50
Proposed Method | 2021 | Mobilenet V2 & NasNet Mobile, MEWOA | 300 | 99.00 | 99.33 | 99.16 | 1.00 | 99.80
Table 22. Comparisons with the state of the art for the INbreast dataset.
References | Year | Method | Images | Sensitivity (%) | Precision (%) | F1-Score (%) | AUC | Accuracy (%)
[63] | 2021 | ResNet-18, (ICS-ELM) | 179 | — | — | — | — | 98.26
[59] | 2021 | CNN | 387 | 94.83 | 91.23 | 93.22 | 0.94 | 93.04
[38] | 2020 | Inception ResNet V2 | 107 | — | — | — | 0.95 | 95.32
[60] | 2020 | De-ada* | — | — | — | — | 92.65 | 87.93
[34] | 2017 | Transfer learning, Random Forest | 108 | 98.0 | 70.0 | — | — | 90.0
Proposed Method | 2021 | Fine-tuned MobilenetV2, Nasnet, MEWOA | 108 | 99.0 | 99.0 | 99.0 | 1.00 | 99.7
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

