Improved Bald Eagle Search Optimization with Synergic Deep Learning-Based Classification on Breast Cancer Imaging

Simple Summary
The manual process of microscopic inspection is laborious, and its results can be misleading due to human error. This article presents an improved bald eagle search optimization with a synergic deep learning mechanism for breast cancer diagnosis using histopathological images (IBESSDL-BCHI). The performance of the IBESSDL-BCHI system was validated on a benchmark dataset, and the results demonstrate that the model achieves better overall efficiency for BC classification.

Abstract
Medical imaging has attracted growing interest in the field of healthcare with regard to breast cancer (BC). Globally, BC is a major cause of mortality among women. At present, the examination of histopathology images is the medical gold standard for cancer diagnosis. However, the manual process of microscopic inspection is laborious, and its results can be misleading due to human error. Thus, computer-aided diagnosis (CAD) systems can be utilized to detect cancer accurately within essential time constraints, as earlier diagnosis is the key to curing cancer. The classification and diagnosis of BC using deep learning algorithms has gained considerable attention. This article presents an improved bald eagle search optimization with a synergic deep learning mechanism for breast cancer diagnosis using histopathological images (IBESSDL-BCHI). The proposed IBESSDL-BCHI model concentrates on the identification and classification of BC using HIs. To do so, the presented IBESSDL-BCHI model first applies a median filtering (MF) technique as a preprocessing step. In addition, feature extraction using a synergic deep learning (SDL) model is carried out, and the hyperparameters of the SDL mechanism are tuned using the IBES model. Lastly, long short-term memory (LSTM) is utilized to precisely categorize the HIs into two major classes, namely, benign and malignant. The performance of the IBESSDL-BCHI system was validated on a benchmark dataset, and the results demonstrate that the model achieves better overall efficiency for BC classification.


Introduction
Worldwide, the number of cancer cases is increasing faster than ever before. Multimodal medical imaging is utilized for diagnosing distinct kinds of cancers.

•	An intelligent IBESSDL-BCHI technique comprising MF-based pre-processing, SDL feature extraction, IBES-based parameter optimization, and an LSTM model for BC detection and classification using HIs is presented. To the best of our knowledge, the IBESSDL-BCHI model has never been presented in the literature.

•	A novel IBES algorithm is designed by integrating oppositional-based learning with the traditional BES algorithm.

•	Hyperparameter optimization of the SDL model using the IBES algorithm with cross-validation helps to boost the classification outcome of the IBESSDL-BCHI model for unseen data.

Related Works
In a study conducted earlier [11], a new patch-based DL technique named Pa-DBN-BC was suggested for the detection and classification of BC in histopathology images using a Deep Belief Network (DBN). In this technique, the features are derived through supervised fine-tuning and unsupervised pre-training stages, and the network automatically extracts the features from the image stains. In the literature [12], the researchers compared two ML techniques for the automatic classification of BC histology images as either malignant or benign, together with their respective sub-classes. The first technique was designed based on the extraction of a group of handcrafted features encoded with Bag of Words (BoW) and locality-constrained linear coding, and it was trained through an SVM classifier. The second method was designed based on a CNN model.
In the literature [13], the researchers suggested a method that uses DL techniques with convolutional layers to extract valuable visual features and classify BC. It was revealed that such DL techniques can derive superior features in comparison with handcrafted feature-selection methods. The study further showed that the model can be effectively improved by progressively merging the DL methods with weak classifiers into a stronger classifier. Xie et al. [14] presented a new model for the analysis of HIs of BC through unsupervised and supervised deep CNNs. At first, it adapted the Inception_ResNet_V2 and Inception_V3 architectures to the binary and multi-class BC-HI classification problems with the help of Transfer Learning (TL) approaches.
In a study conducted earlier [15], the authors recommended a system for BC classification with an Inception Recurrent Residual CNN (IRRCNN) method. The proposed IRRCNN is a powerful DCNN method since it combines the robustness of the Recurrent CNN (RCNN), ResNet, and the Inception technique. The proposed IRRCNN method achieved better outcomes than the equivalent networks, the Inception networks and the RCNNs, in object recognition tasks. Yang et al. [16] suggested employing additional regional-level supervision for BC classification of the HIs using a CNN technique. In this method, the RoIs were localized and utilized to guide the attention of the classifier network concurrently. The presented supervised attention algorithm precisely stimulated the neurons in the diagnostically relevant areas, whereas it suppressed the activations in the irrelevant and noisy regions.
Ali et al. [17] presented an effective DL model to exploit small datasets and learn generalizable, domain-invariant representations in various medical imaging applications for diseases such as malaria, Diabetic Retinopathy, and tuberculosis. This model was named the Incremental Modular Network Synthesis (IMNS), and the resultant CNNs were the Incremental Modular Networks (IMNets). The authors of an earlier study [18] developed a cloud-enabled Android app to detect breast cancer using the ResNet101 model. The proposed framework was cost-effective and demanded less human intervention because it was cloud-integrated, so a lower performance load was placed on the edge devices. Narayanan et al. [19] presented a novel Deep Convolutional Neural Network architecture for the Invasive Ductal Carcinoma (IDC) classification process.

The Proposed Model
In the current study, a new IBESSDL-BCHI method has been developed for the recognition and classification of BC using the HIs. The presented IBESSDL-BCHI method follows a series of processes, namely, MF-based noise removal, SDL feature extraction, IBES-based hyperparameter optimization, and LSTM classification. The design of the IBES algorithm helps in precisely categorizing the HIs into two major classes, namely, benign and malignant. Figure 1 depicts the workflow of the proposed IBESSDL-BCHI approach.

Image Preprocessing
Initially, the Median Filtering (MF) technique was utilized to preprocess the input HIs. MF is a nonlinear digital filtering method that is frequently utilized to remove noise from images and signals. Such noise reduction is a classical pre-processing step performed to enhance the outcomes of the later processes. The MF approach smoothens the HIs [20], and its steps are as follows:
Step 1: A 3 × 3 kernel needs zero padding of ⌊3/2⌋ = 1 column of 0s at the left and right edges, and ⌊3/2⌋ = 1 row of 0s at the top and bottom edges.
Step 2: To process the first element, the approach centers the 3 × 3 kernel on that element. The data covered by the kernel are sorted by value, and the median value is obtained.
Step 3: The process is repeated for all of the elements until the final value is obtained.
The MF function calculates the median of every pixel in the kernel window, and the central pixel is replaced with this median value. It is extremely effective in removing salt-and-pepper noise. Notably, when Gaussian and box filters are applied, the value assigned to the central element may be one that does not occur in the original image. This is not the case in the MF approach, since the central element is always replaced with one of the pixel values of the image; this reduces the noise efficiently. The size of the kernel is a positive odd integer, and the median function is calculated as given in Equation (1).

Here, X refers to the ordered list of values from the dataset, and n signifies the number of values in the dataset.
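The three steps above can be sketched in Python as follows; the function name is hypothetical, and a 3 × 3 kernel with zero padding is assumed, exactly as described in Steps 1-3.

```python
import numpy as np

def median_filter_3x3(image):
    """Apply a 3x3 median filter with zero padding, as in Steps 1-3."""
    # Step 1: zero-pad one row/column on every edge (floor(3/2) = 1).
    padded = np.pad(image, pad_width=1, mode="constant", constant_values=0)
    out = np.empty_like(image)
    rows, cols = image.shape
    for i in range(rows):
        for j in range(cols):
            # Step 2: center the 3x3 window on pixel (i, j) and take the median
            # of the sorted window values.
            window = padded[i:i + 3, j:j + 3]
            out[i, j] = np.median(window)
    # Step 3: every pixel has now been replaced by its window median.
    return out
```

Because the replacement value is always one of the existing pixel values, an isolated salt-and-pepper outlier is removed rather than blurred, which is the behavior described above.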

SDL-Based Feature Extraction
After the image preprocessing, the SDL model was utilized to derive the feature vectors. During the feature extraction procedure, the pre-processed images were fed into the SDL module to obtain a beneficial set of feature vectors [21].
The SDL model extracts the feature subsets from the pre-processed images. It represents the SDL^k through three main elements: the k DCNN components, the input layer, and the C(k, 2) synergic networks (SNs). Every DCNN component of the network provides an independent learning representation of the input dataset. Each SN has a fully connected (FC) architecture to verify whether a pair of inputs belongs to the same class, and it offers corrective feedback. The SDL system is thus divided into three sub-models. Figure 2 illustrates the architecture of the SDL network.



Components of DCNN
Due to the implicit nature of ResNet, ResNet-50 was exploited for initializing every DCNN component (a = 1, 2, . . . , n). Therefore, it can be indicated that the DCNN network comprises VGGNet, AlexNet, and GoogLeNet, which correspond to the SDL method.
This module was trained using the data sequence X = {x^(1), x^(2), . . . , x^(M)} and the corresponding sequence of class labels, Y = {y^(1), y^(2), . . . , y^(M)}. The aim is to learn a group of variables θ that minimizes the cross-entropy (CE) loss given as follows. In Equation (2), n represents the number of classes and Z^(a) = F(x^(a), θ) denotes the forward computation. The group of variables obtained for DCNN-a is indicated as θ^a, and the variables are not shared among the DCNN components.
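Equation (2) itself is not reproduced in the text above; for reference, the textbook softmax cross-entropy loss that this description corresponds to (the authors' exact notation may differ) can be written as:

```latex
\mathcal{L}\!\left(\theta^{a}\right)
  = -\sum_{i=1}^{M}\sum_{j=1}^{n}
    \mathbb{1}\!\left\{ y^{(i)} = j \right\}
    \log \frac{e^{\,z^{(i)}_{j}}}{\sum_{l=1}^{n} e^{\,z^{(i)}_{l}}}
```

Here, M is the number of training samples, n is the number of classes, and z^(i) = F(x^(i), θ^a) is the forward computation of DCNN-a, matching the symbols defined above.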

SDL Model
The DCNN components, using the synergic label of the embedded input pair, are exploited for FC learning. Assume that (Z_A, Z_B) is a data pair given as the input for two DCNN components (DCNN-a, DCNN-b). Next, the deep features of the pair are embedded as f_{A·B}, and the outcome using the synergic label is given below.
To resolve this shortcoming, the percentage of data pairs drawn from the same class needs to be higher. So, a simple zero-or-one value is used to gauge the synergic signal through a sigmoid layer, and the binary CE loss is as follows.
In Equation (6), θ_S denotes the SN attributes and ŷ_s indicates the SN forward computation. This validates whether the input dataset pair belongs to the same class, and it offers the option to remedy the synergic error.
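A minimal sketch of the synergic supervision described above (the function names are hypothetical): the synergic label of a pair is 1 when both inputs share a class and 0 otherwise, and the binary CE loss of Equation (6) penalizes the SN prediction ŷ_s against that label.

```python
import numpy as np

def synergic_label(y_a, y_b):
    """1 if the pair of inputs belongs to the same class, else 0."""
    return 1.0 if y_a == y_b else 0.0

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Binary CE between the synergic label and the SN prediction y_pred."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
```

The loss shrinks as the SN prediction approaches the true synergic label, which is what gives each pair of DCNN components its corrective feedback signal.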

Training and Testing Processes
Once the training is completed, the parameters of both the DCNN components and the SN have been improved.
In Equation (7), η(z) and S(a, b) indicate the learning rate and the synergic signal between DCNN-a and DCNN-b, respectively, as given below.
Here, λ denotes the trade-off between the classification errors of the sub-models and the synergic errors; the training processes of the SDL sub-models are thereby coupled. In the trained SDL^k, a testing sample x is classified by each DCNN component, which provides the prediction vector P^(a) = (p_1^(a), . . . , p_n^(a)) activated from its final FC layer. The class label of the testing sample is evaluated as follows.
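The label-evaluation step just described can be sketched as follows; since the fusion equation is not reproduced above, equal-weight averaging of the k component predictions is an assumption, and the function name is hypothetical.

```python
import numpy as np

def fuse_and_classify(prob_vectors):
    """Fuse the k DCNN prediction vectors and return the class index.

    prob_vectors: list of k arrays, each of shape (n_classes,),
    e.g. the softmax outputs of the final FC layers.
    """
    fused = np.mean(prob_vectors, axis=0)  # equal-weight fusion (assumption)
    return int(np.argmax(fused))
```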

Hyperparameter Tuning Using IBES Algorithm
In this study, the hyperparameters related to the SDL mechanism are fine-tuned with the help of the IBES model. BES is a meta-heuristic optimization approach that imitates the hunting behavior of bald eagles [22]. This procedure has three phases, namely, selecting the space, searching in the space, and swooping. Initially, the bald eagle chooses the best place in terms of the amount of food. Next, the eagle searches for prey within the nominated place. Finally, from the optimal location attained in the previous stage, the eagle swoops to determine the optimal hunting site.
a. Selection space: In this phase, a novel position is produced based on the subsequent formula.
P_new(i) = P_best + α · r · (P_mean − P(i))    (11)
In Equation (11), P_new(i) denotes the i-th newly produced location, P_best refers to the optimally attained location, P_mean indicates the mean location, α represents a control gain ∈ [1.5, 2], and r indicates an arbitrary number that lies in the range of [0, 1]. The fitness of every novel location is estimated; if the novel location P_new offers a better fitness than P_best, then P_best is replaced by the novel location.
b. Searching in space: After the allocation of the optimal search space (P best ) is completed, the process upgrades the location of the eagles within the searching space. The update module is given herein.
In Equation (12), P_new(i) denotes the i-th newly produced position, P_mean indicates the mean location, and x(i) and y(i) denote the directional coordinates for the i-th location, as given below.
x(i) = xr(i)/max(|xr|); xr(i) = r(i) · sin(θ(i))
y(i) = yr(i)/max(|yr|); yr(i) = r(i) · cos(θ(i))    (13)
In Equation (13), a indicates a control variable that is utilized to determine the corner between the searching point and the central point, and it takes values in the range of [5, 10]. R denotes a variable within [0.5, 2], and it is utilized to determine the number of searching cycles. The fitness of the novel position is estimated, and the P_best values are upgraded based on the attained outcomes.
c. Swooping: In this phase, the eagle moves towards the prey from the optimally attained location. The hunting model is given in the following expression.
In Equation (14), c1 and c2 denote two arbitrary numbers that lie in the range of [1, 2]; x1 and y1 indicate the directional coordinates that are determined as follows.
x1(i) = xr(i)/max(|xr|); xr(i) = r(i) · sinh(θ(i))
y1(i) = yr(i)/max(|yr|); yr(i) = r(i) · cosh(θ(i))    (15)
In the OBL approach, each individual's fitness is estimated and related to that of its equivalent opposite number, and the better of the two is carried into the next iteration. The opposite number is determined as follows.
Opposite number: Assume that x is a real number and x ∈ [lb, ub]; then, the opposite number x̃ is given by Equation (16):
x̃ = lb + ub − x    (16)
Here, lb and ub denote the lower and upper boundaries, respectively.
Opposite vector: When x = (x_1, x_2, . . . , x_D), where x_1, x_2, . . . , x_D are real numbers and x_i ∈ [lb_i, ub_i], then x̃_i is computed as given below.
At last, the current solution is updated accordingly. The IBES method resolves the Fitness Function (FF) to obtain a superior classification performance. In this study, a reduced classifier error rate is treated as the FF, as given below.
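The selection phase and the OBL step described above can be sketched as follows; this is a minimal one-dimensional illustration (function names are hypothetical, and the search and swoop spirals are omitted).

```python
import numpy as np

def opposite(x, lb, ub):
    """Opposition-based learning (Equation (16)): opposite of x in [lb, ub]."""
    return lb + ub - x

def select_space(p, p_best, p_mean, alpha, rng):
    """Selection phase (Equation (11)): P_new = P_best + alpha*r*(P_mean - P)."""
    r = rng.random(p.shape)  # arbitrary number in [0, 1]
    return p_best + alpha * r * (p_mean - p)

def obl_step(population, fitness, lb, ub):
    """Keep the better of each solution and its opposite (minimization)."""
    opposites = opposite(population, lb, ub)
    keep = fitness(population) <= fitness(opposites)
    return np.where(keep, population, opposites)
```

The greedy comparison in `obl_step` is what lets IBES bring the better of each solution/opposite pair into the next iteration, as described above.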

LSTM-Based Classification
During the image classification process, the LSTM model is used to precisely categorize the HIs into two major classes, namely, benign and malignant. Being a variant of the RNN model, the LSTM model basically differs from the classical ANN [23]. Both the LSTM and RNN are sequence-based methods with internally self-looped repeating networks. These determine the temporal relationships among the sequential data and preserve the previous information.
In the current study, the repeating module has a simple framework (a Tanh layer). f_t denotes the output of the forget gate, for which the values lie in the range of [0, 1].
For the above explanation, the mathematical expression is given below.
The next layer of the LSTM block is named the 'input gate' layer, as shown below.
Afterwards, the older cell state C_{t−1} should be updated to the new cell state C_t. The output of the forget gate f_t is the decision to forget, and i_t defines how much of the novel candidate cell state C̃_t is added. The update procedure of C_t is described below.
At last, the interacting layer is named the 'output gate' layer. The procedure of producing an output of the LSTM block is demonstrated herein.
In Equation (24), N indicates the overall number of labeled samples. During the training of the LSTM, θ is tuned continuously by minimizing the loss function via an optimization technique, namely, SGD.
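The gate computations described above can be sketched as a single LSTM time step. Since the gate equations are not fully reproduced in the text, this follows the textbook formulation (sigmoid gates on the concatenated [h_{t−1}, x_t]), which may differ in detail from the authors' exact parameterization.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step in the standard form.

    Each W[k] maps the concatenated [h_prev, x_t] to one gate pre-activation.
    """
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(W["f"] @ z + b["f"])      # forget gate, values in [0, 1]
    i_t = sigmoid(W["i"] @ z + b["i"])      # input gate
    c_tilde = np.tanh(W["c"] @ z + b["c"])  # candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde      # cell-state update
    o_t = sigmoid(W["o"] @ z + b["o"])      # output gate
    h_t = o_t * np.tanh(c_t)                # new hidden state
    return h_t, c_t
```

The cell-state line is exactly the update described above: f_t decides what to forget from C_{t−1}, and i_t decides how much of the candidate state is added.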

Results and Discussion
The proposed IBESSDL-BCHI method was experimentally validated using the benchmark Breast Cancer Histopathological Database (BreakHis) dataset [4] comprising 1820 HIs. The dataset holds a total of 588 images under the benign class and 1232 images under the malignant class, and the details are given in Table 1. A few sample images are showcased in Figure 3.
Figure 5 shows the analytical outcomes of the IBESSDL-BCHI model during distinct test runs in terms of its accuracy (accu_y), precision (prec_n), recall (reca_l), specificity (spec_y), F-score (F_score), and G-mean (G_mean). The experimental values infer that the proposed IBESSDL-BCHI method attained the maximum classification results under every run. For example, in run 1, the IBESSDL-BCHI technique attained average accu_y, prec_n, reca_l, spec_y, F_score, and G_mean values of 98.67%, 92.79%, 92.19%, 99.18%, 92.27%, and 95.55%, respectively. Additionally, in run 2, the proposed IBESSDL-BCHI approach reached average values of 99.48%, 97.22%, 97.29%, 99.68%, 97.20%, and 98.46%, correspondingly. In run 4, the IBESSDL-BCHI model accomplished average values of 98.76%, 92.99%, 94.14%, 99.26%, 93.49%, and 96.66%, correspondingly. Along with that, in run 5, the IBESSDL-BCHI methodology achieved average values of 99.12%, 94.30%, 96.27%, 99.52%, 95.21%, and 97.88%, correspondingly.
Both the Training Accuracy (TA) and Validation Accuracy (VA) values obtained by the proposed IBESSDL-BCHI method on the test dataset are depicted in Figure 6. The outcomes demonstrate that the proposed IBESSDL-BCHI methodology achieved the highest TA and VA values, while the VA values were superior to the TA values. Both the Training Loss (TL) and Validation Loss (VL) values attained by the proposed IBESSDL-BCHI methodology using the test data are depicted in Figure 7. The outcomes illustrate that the proposed IBESSDL-BCHI technique demonstrated minimal TL and VL values, while the VL values seemed to be smaller than the TL values.
A brief precision-recall inspection was conducted with the IBESSDL-BCHI method using the test data, and the results are depicted in Figure 8. It is to be noted that the proposed IBESSDL-BCHI approach obtained the maximal precision-recall performance under all of the classes. A comprehensive ROC inspection was conducted on the proposed IBESSDL-BCHI system using the test dataset, and the results are portrayed in Figure 9. The outcomes show that the proposed IBESSDL-BCHI method was capable of categorizing the test dataset into dissimilar classes.
Figure 9. ROC curve analysis results of the IBESSDL-BCHI approach.
Table 3 provides the overall comparison analysis outcomes achieved by the proposed IBESSDL-BCHI method and other existing models [14,24]. Figure 10 portrays the comparative examination outcomes of the IBESSDL-BCHI technique and other techniques in terms of accu_y. The figure implies that the proposed IBESSDL-BCHI system achieved enhanced accu_y values. With respect to accu_y, the IBESSDL-BCHI approach obtained a maximum accu_y of 0.9963, whereas the rest of the methods, such as the GLCM-KNN, GLCM-NB, GLCM-Discrete transform, GLCM-SVM, GLCM-DL, DL-INV3, and DL-INV2 models, attained lower accu_y values of 0.7617, 0.7845, 0.8500, 0.8500, 0.9244, 0.9471, and 0.8812, respectively.
Table 3. Comparative analysis outcomes of the IBESSDL-BCHI approach and other existing approaches using different measures [14,24].


Eventually, with regard to F_score, the proposed IBESSDL-BCHI methodology gained a superior F_score value of 0.9818, whereas the GLCM-KNN, GLCM-NB, GLCM-Discrete transform, GLCM-SVM, GLCM-DL, DL-INV3, and DL-INV2 models attained lower F_score values of 0.8222, 0.8697, 0.8469, 0.8162, 0.8792, 0.8186, and 0.8642, respectively. From the detailed discussion of the results, it is evident that the proposed IBESSDL-BCHI technique yielded an effective breast cancer classification performance.

Conclusions
In this study, a new IBESSDL-BCHI method has been developed for both the recognition and classification of BC using HIs. The presented IBESSDL-BCHI model follows a series of processes, namely, MF-based noise removal, SDL feature extraction, IBES-based hyperparameter optimization, and LSTM classification. The design of the IBES algorithm aids in the precise categorization of the HIs into two major classes, namely, benign and malignant. The performance of the proposed IBESSDL-BCHI mechanism was validated using a benchmark dataset, and the IBESSDL-BCHI model achieved a better general efficiency score for BC classification. Therefore, the presented model can be utilized for BC diagnosis over other models. In the future, the performance of the presented IBESSDL-BCHI algorithm can be enhanced by using an ensemble of DL models. In addition, the proposed model can also be tested on large-scale real-time datasets to assure its robustness and scalability. Moreover, the computational complexity of the proposed model can be investigated in the future.