Adaptive Aquila Optimizer with Explainable Artificial Intelligence-Enabled Cancer Diagnosis on Medical Imaging

Simple Summary

For automated cancer diagnosis on medical imaging, explainable artificial intelligence (XAI) technology uses advanced image analysis methods such as deep learning to analyze medical images and make a diagnosis, as well as to provide a clear explanation of how it arrived at that diagnosis. The objective of XAI is to give patients and doctors a better understanding of the system's decision-making process and to increase transparency and trust in the diagnostic method. The manual classification of cancer from medical images is a tedious and tiresome process, which necessitates the design of automated tools for decision-making. In this study, we explore the application of explainable artificial intelligence and an ensemble of deep-learning models for automated cancer diagnosis. To demonstrate the enhanced performance of the proposed model, an extensive comparative study is made with recent models, and the results exhibit the significance of the proposed model on benchmark test images. The proposed model therefore has potential as an automated, accurate, and rapid tool for supporting the detection and classification of cancer.

Abstract

Explainable Artificial Intelligence (XAI) is a branch of AI that focuses on developing systems that provide understandable and clear explanations for their decisions. In the context of cancer diagnosis on medical imaging, an XAI technology uses advanced image analysis methods such as deep learning (DL) to analyze medical images and make a diagnosis, as well as to provide a clear explanation of how it arrived at that diagnosis. This includes highlighting specific areas of the image that the system recognized as indicative of cancer, while also providing information on the underlying AI algorithm and decision-making process used.
The objective of XAI is to give patients and doctors a better understanding of the system's decision-making process and to increase transparency and trust in the diagnostic method. Therefore, this study develops an Adaptive Aquila Optimizer with Explainable Artificial Intelligence Enabled Cancer Diagnosis (AAOXAI-CD) technique on medical imaging. The proposed AAOXAI-CD technique aims to accomplish an effective colorectal and osteosarcoma cancer classification process. To achieve this, the AAOXAI-CD technique initially employs the Faster SqueezeNet model for feature vector generation. In addition, hyperparameter tuning of the Faster SqueezeNet model takes place using the AAO algorithm. For cancer classification, a majority weighted voting ensemble model with three DL classifiers, namely the recurrent neural network (RNN), gated recurrent unit (GRU), and bidirectional long short-term memory (BiLSTM), is used. Furthermore, the AAOXAI-CD technique incorporates the XAI approach LIME for better understanding and explainability of the black-box method for accurate cancer detection. The simulation evaluation of the AAOXAI-CD methodology is tested on medical cancer imaging databases, and the outcomes confirm its more promising performance compared with other current approaches.


Introduction
The diagnosis of cancer is an essential problem in the medical sector. Early identification of cancer is vital for better treatment outcomes and for choosing the best course of action [1]. Cancer has therefore become a major research topic, with numerous authors carrying out various studies to attain higher performance in prevention, diagnosis, and treatment. Early identification of tumors increases treatment options and patients' chances of survival. Medical images such as magnetic resonance imaging (MRI), mammograms, microscopic images, and ultrasound are the typical modalities for diagnosing cancer [2].
In recent times, computer-aided diagnosis (CAD) mechanisms have been utilized to help doctors diagnose tumors and thereby enhance diagnostic accuracy. CAD helps reduce cancer lesions missed because of medical practitioner fatigue, minimizes data overload and work pressure, and reduces intra- and inter-reader variability [3]. Technical problems related to imaging quality, together with human error, have increased the misdiagnosis of breast cancer in radiologists' interpretations. To address these limitations, CAD mechanisms were developed to automate breast cancer diagnosis and to categorize malignant and benign lesions [4]. The CAD mechanism enhances the performance of radiologists in finding and discriminating abnormal and normal tissues. Such a process can be executed only as a second reader; final decisions are made by radiologists [5]. Figure 1 represents the structure of explainable artificial intelligence.

Recent advancements in the resolution of medical imaging modalities have enhanced diagnostic accuracy [6]. The effective use of imaging data to enhance diagnosis has therefore become significant. Currently, computer-aided diagnosis (CAD) systems have opened a novel context in radiology, making use of data that can be applied to the diagnosis of different diseases and different imaging modalities [7][8][9][10].
The efficacy of radiologists' analysis can be enhanced in terms of consistency and accuracy of detection or diagnosis, while productivity can be enhanced by minimizing the hours needed to read the images. The results can be extracted through several computer vision (CV) methods to present important variables such as the likelihood of malignancy and the location of suspicious lesions [11]. DL technology has now significantly advanced, again raising expectations for computer software relevant to tumor screening. Deep learning (DL) is a type of neural network (NN); an NN has an input layer, hidden layers, and an output layer, and a DL model is an NN with many hidden layers. In the past, DL achieved incredible performance improvements, particularly in speech recognition and image classification [12]. Recently, DL has been utilized in various areas. As they can solve complicated problems, deep neural networks (DNNs) are now common in the healthcare field. However, decision-making by these methods is fundamentally a black-box procedure, making it difficult for doctors to determine whether decisions are dependable. The usage of explainable artificial intelligence (XAI) is recommended as the key to this issue [13].

Related Works
Van der Velden et al. [14] presented an overview of explainable AI (XAI) as used in DL-related medical image analysis. A framework of XAI criteria is presented for classifying DL-related medical image analysis techniques. Studies on XAI mechanisms in medical image analysis were categorized and surveyed according to this framework and anatomical location. Esmaeili et al. [15] aimed to assess the performance of selected DL methods in localizing cancer lesions and differentiating lesions from healthy areas in MRI contrasts. Despite an important correlation between lesion localization accuracy and classification, the familiar AI techniques inspected in this study categorized certain cancerous brains based on other, non-related attributes. The outcomes suggest that these AI methods can provide an intuition for method interpretability and play a significant role in the performance assessment of DL methods.
In [16], a new automatic classification system merging several DL methods was devised for identifying prostate cancer from MRI and ultrasound (US) images. To enrich the performance of the model, particularly on the MRI data, a fusion model was advanced by integrating the optimal pretrained methods as feature extractors with shallow ML techniques (e.g., k-NN, SVM, RF, and AdaBoost). Finally, the fusion model was inspected by explainable AI to identify why it classifies samples as malignant or benign stages of prostate tumors. Kobylińska et al. [17] applied selected techniques from the XAI domain to methods implemented for assessing lung cancer risk in the lung cancer screening process using low-dose CT. The usage of such methods offers a good understanding of the differences and similarities among the three typically used models in lung cancer screening: LCART, BACH, and PLCOm2012.
In [18], an explainable AI (XAI) framework was devised for presenting the local and global analysis of auxiliary identification of hepatitis while maintaining good predictive outcomes. Firstly, a public hepatitis classifier benchmark from UCI was utilized for testing the feasibility of the framework. Afterward, transparent and black-box ML methods were used to predict the deterioration of hepatitis: transparent methods such as KNN, LR, and DT were selected, while black-box methods such as RF, XGBoost, and SVM were selected. Watson and Al Moubayed [19] devised a model-agnostic, explainability-related technique for the precise identification of adversarial instances on two datasets with various properties and complexities: chest X-ray (CXR) data and Electronic Health Records (EHR). In [20], an XAI tool was applied to a breast cancer (BC) dataset and offers a graphical analysis. The medical implications and molecular processes behind circulating adiponectin, HOMA, leptin, and BC resistance were examined, and XAI techniques were utilized for constructing methods for the diagnosis of new BC biomarkers.

Paper Contributions
This study develops an Adaptive Aquila Optimizer with Explainable Artificial Intelligence Enabled Cancer Diagnosis (AAOXAI-CD) technique on medical imaging. The proposed AAOXAI-CD technique uses the Faster SqueezeNet model for feature vector generation, and hyperparameter tuning of the Faster SqueezeNet model is done with the AAO algorithm. For cancer classification, a majority weighted voting ensemble model with three DL classifiers, namely the recurrent neural network (RNN), gated recurrent unit (GRU), and bidirectional long short-term memory (BiLSTM), is used. Furthermore, the AAOXAI-CD technique incorporates the XAI approach LIME for better understanding and explainability of the black-box method for accurate cancer detection. The simulation evaluation of the AAOXAI-CD technique is tested on medical cancer imaging databases.

Materials and Methods
In this article, we develop an automated cancer diagnosis approach, the AAOXAI-CD approach, on medical images. The proposed AAOXAI-CD system achieves an effective colorectal and osteosarcoma cancer classification process. It encompasses Faster SqueezeNet-based feature vector generation, AAO-based parameter tuning, ensemble classification, and XAI modeling. Figure 2 defines the overall flow of the AAOXAI-CD approach. The overall process involved in the proposed model is given in Algorithm 1.


Feature Extraction Using Faster SqueezeNet
Primarily, the AAOXAI-CD technique employs the Faster SqueezeNet method for feature vector generation. Faster SqueezeNet was proposed to enrich the real-time performance and accuracy of cancer classification [21]. BatchNorm and a residual structure are added to prevent overfitting. Simultaneously, as in DenseNet, concat operations are employed to interconnect dissimilar layers to increase the expressiveness of the first few layers of the network. Figure 3 represents the architecture of the Faster SqueezeNet method.

Algorithm 1: Process Involved in AAOXAI-CD Technique
Step 1: Input Dataset (Training Images)
Step 2: Image Pre-Processing
Step 3: Feature Extraction Using Faster SqueezeNet Model
Step 4: Parameter Tuning Process
  Step 4.1: Initialize the Population and Its Parameters
  Step 4.2: Calculate the Fitness Values
  Step 4.3: Exploration Process and Exploitation Process
  Step 4.4: Update the Fitness Values
  Step 4.5: Obtain Best Solution
Step 5: Ensemble of Classifiers (RNN, GRU, and Bi-LSTM)
Step 6: Classification Output

Faster SqueezeNet comprises a global average pooling layer, one BatchNorm layer, three block layers, and four convolutional layers. Faster SqueezeNet is improved in the following ways: (1) To further enrich the information flow among layers, DenseNet is imitated and a distinct connection mode is devised. This covers a fire module and a pooling layer; lastly, two concat layers are interconnected to the following convolution layer.
The present layer receives every feature map of the previous layers. Taking x_0, …, x_{l−1} as input, x_l is expressed as

x_l = H_l([x_0, x_1, …, x_{l−1}]), (1)

where [x_0, x_1, …, x_{l−1}] represents the concatenation of the feature maps produced in layers 0, 1, …, l − 1, and H_l(·) concatenates more than one input. Here, x_0 characterizes the max pooling layer, x_1 designates the Fire layers, and x_l indicates the concat layer. In this way, the performance of the network is improved without excessively increasing the number of network parameters, and simultaneously, any two layers can directly transmit data.
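The DenseNet-style connection rule above can be illustrated with a small NumPy sketch (the channel counts below are hypothetical, chosen only for the example):

```python
import numpy as np

def dense_concat(features):
    """H_l: concatenate all earlier feature maps [x_0, ..., x_{l-1}]
    along the channel axis, as in DenseNet-style connectivity."""
    return np.concatenate(features, axis=-1)

# Three earlier layers producing 4, 8, and 8 channels at one spatial position
x0 = np.zeros((1, 4))
x1 = np.ones((1, 8))
x2 = np.full((1, 8), 2.0)
x3 = dense_concat([x0, x1, x2])  # input to the next layer: 20 channels
```

Because the concatenation only stacks feature maps, the layer adds no extra parameters of its own, which is why the connectivity enriches information flow without inflating the parameter count.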
(2) Learning from the ResNet structure, constituent elements comprising a fire module and a pooling layer are proposed to ensure improved network convergence. Lastly, the outputs of the two layers are summed and connected to the next convolutional layer.
In ResNet, the shortcut connection employs identity mapping, which implies that the input of a convolutional stack is added directly to the output of the convolutional stack. Formally, the underlying mapping is denoted H(x); the stacked nonlinear layers fit another mapping F(x) := H(x) − x, so the original mapping is rewritten as F(x) + x. F(x) + x is realized by the structure named the shortcut connection.
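The shortcut connection H(x) = F(x) + x reduces to a few lines; in this sketch, the hypothetical `residual_fn` stands in for the stacked nonlinear layers that learn F(x):

```python
def shortcut(x, residual_fn):
    """ResNet-style identity shortcut: output H(x) = F(x) + x,
    where residual_fn computes the residual F(x) = H(x) - x."""
    return residual_fn(x) + x

# If the stacked layers learn F(x) = 0, the block is exactly an identity mapping,
# which is why deeper networks with shortcuts cannot do worse than shallower ones.
identity_out = shortcut(5.0, lambda x: 0.0)
scaled_out = shortcut(2.0, lambda x: x * 0.5)
```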
In this work, the hyperparameter tuning of the Faster SqueezeNet method is performed using the AAO algorithm. This algorithm is based on the distinct hunting strategies of the Aquila for different prey [22]. For faster-moving prey, the Aquila needs to capture the prey in a precise and fast manner, which reflects the global exploration capability of the model. The optimizer technique is characterized by mimicking four hunting behaviors of the Aquila. Firstly, the population is randomly generated between the lower bound (LB) and upper bound (UB) of the problem, as given in Equation (2). The approximate optimal solution at each iteration is defined as the optimal solution. The present set of candidate solutions X is made at random by using the following expression:

x_{i,j} = rand × (UB_j − LB_j) + LB_j, i = 1, 2, …, n; j = 1, 2, …, D, (2)

where n signifies the overall number of candidate solutions, D indicates the dimensionality of the problem, x_{n,D} represents the location of the n-th solution in D-dimensional space, rand denotes a randomly generated value, and UB_j and LB_j signify the j-th dimensional upper and lower boundaries of the problem.
The first strategy selects the search space by high soaring with a vertical stoop: the Aquila hovers above to identify the prey area and rapidly chooses the better prey region as

X_1(t + 1) = X_best(t) × (1 − t/T) + (X_M(t) − X_best(t)) × rand,

where X_1(t + 1) symbolizes the location of the individual at time t + 1, X_best(t) signifies the present global optimum site at the t-th iteration, T and t symbolize the maximal number of iterations and the present iteration, correspondingly, X_M(t) represents the average location of the individuals at the existing iteration, and rand represents a randomly generated value within [0, 1] from a Gaussian distribution. The second strategy is a short gliding attack with contour flight: the Aquila flies over the targeted prey to prepare for the assault once the prey region has been found from a high altitude. This can be formulated as

X_2(t + 1) = X_best(t) × Levy(D) + X_R(t) + (y − x) × rand,

where X_2(t + 1) denotes the new solution for the following iteration of t, D means the spatial dimension, Levy(D) denotes the Lévy flight distribution function with s taking the value 1.5, X_R(t) indicates the location of a random Aquila in [1, N], and y and x present the spiral shape in the search region as

y = r × cos(θ), x = r × sin(θ), r = r_1 + U × D_1, θ = −ω × D_1 + 3π/2,

where r_1 takes a fixed value between 1 and 20, and D_1 denotes the integers from 1 to the length of the search region. The third strategy is low flying with a slow descent attack: the Aquila locks onto the hunting target in the hunting region and, with the attack ready, makes the initial attack in vertical descent to test the prey's response. This behavior is given as

X_3(t + 1) = (X_best(t) − X_M(t)) × α − rand + ((UB − LB) × rand + LB) × δ,

where X_3(t + 1) denotes the solution of the following iteration of t, δ and α denote the mining adjustment parameters within (0, 1), and LB and UB represent the lower and upper boundaries of the problem. The fourth strategy is walking and grabbing prey: once the Aquila approaches the prey, it starts to attack based on the arbitrary movements of the prey.
These behaviors can be described as

X_4(t + 1) = QF × X_best(t) − (G_1 × X(t) × rand) − G_2 × Levy(D) + rand × G_1,

where X_4(t + 1) denotes the new solution for the following iteration of t, QF represents the quality function leveraged for balancing the search process, G_1 represents the various motions of the Aquila used to track the prey during its escape, G_2 signifies the flight slope of the Aquila while chasing the prey from the initial location to the final location, which decreases from 2 to 0, rand denotes a random number within [0, 1] from a Gaussian distribution, and T and t denote the maximal number of iterations and the existing iteration, correspondingly. Niche thought comes from biology, in which microhabitats represent the roles or functions of organisms in a specific environment, and organisms with common features are called species. The AAO algorithm uses niche thought, applying a sharing model to compare the distances among individuals in a habitat. A specific threshold is set to increase the fitness of the individual with the highest fitness, ensuring that this individual's state remains optimal. For the individuals with the lowest fitness, a penalty is introduced to make them update and search for the optimum value in another region, guaranteeing the diversity of the population across iterations and attaining the optimum solution.
Here, the distance between individuals of the smallest habitat population is evaluated as

d_ij = ||X_i − X_j||.

The data-sharing function between individuals X_i and X_j is given by

S(d_ij) = 1 − d_ij/ρ, if d_ij < ρ; S(d_ij) = 0, otherwise,

where ρ denotes the radius of data sharing in the microhabitat, and d_ij < ρ guarantees that the individuals live in the same microhabitat environment. After sharing the data, the optimum adaptation is adjusted in time as

F_i = F_j / Σ_j S(d_ij),

where F_i means the optimum adaptation after sharing, and F_j denotes the original adaptation.
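The niche-sharing penalty can be sketched in a few lines. This is a minimal sketch under the assumption of the common triangular sharing function S(d) = 1 − d/ρ inside radius ρ and zero outside; the helper names are illustrative:

```python
import math

def sharing(d, rho):
    """Data-sharing (niche) function: decays with distance, zero outside radius rho."""
    return 1.0 - d / rho if d < rho else 0.0

def shared_fitness(fitness, positions, rho):
    """Penalize individuals in crowded microhabitats: divide each original
    fitness F_j by the niche count sum_j S(d_ij) over all individuals."""
    out = []
    for i, xi in enumerate(positions):
        # the sum includes the individual itself (d = 0 contributes S = 1)
        m = sum(sharing(math.dist(xi, xj), rho) for xj in positions)
        out.append(fitness[i] / m)
    return out
```

An isolated individual keeps its fitness unchanged (niche count 1), while individuals packed inside the same radius see their fitness reduced, which pushes the population to spread out and preserves diversity.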
The AAO method not only derives a fitness function to attain superior classification performance but also defines positive values to represent the enhanced outcomes of the candidate solutions. The reduction of the classification error rate is treated as the fitness function:

fitness(x_i) = Classifier Error Rate = (number of misclassified samples / total number of samples) × 100.
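The four hunting strategies can be combined into a compact optimizer loop. The following is a simplified sketch, not the paper's exact implementation: δ = α = 0.1 is an assumed setting, the spiral term of the contour flight is replaced by a small random offset, and a toy sphere function stands in for the classification error rate of the tuned network:

```python
import math
import numpy as np

def levy(dim, rng, beta=1.5):
    """Levy flight step via Mantegna's algorithm."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2) /
             (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    return rng.normal(0, sigma, dim) / np.abs(rng.normal(0, 1, dim)) ** (1 / beta)

def aquila_optimize(fitness, dim, lb, ub, n=20, T=200, seed=1):
    """Minimize `fitness` with the four Aquila hunting strategies (simplified)."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, (n, dim))                 # Eq. (2): random population
    fit = np.array([fitness(x) for x in X])
    best = X[fit.argmin()].copy()
    fbest = float(fit.min())
    for t in range(1, T + 1):
        XM = X.mean(axis=0)                           # mean location of the flock
        G2 = 2 * (1 - t / T)                          # flight slope, 2 -> 0
        for i in range(n):
            if t <= 2 * T / 3:                        # exploration phase
                if rng.random() < 0.5:                # X1: high soar, vertical stoop
                    new = best * (1 - t / T) + (XM - best) * rng.random()
                else:                                 # X2: contour flight with Levy
                    new = (best * levy(dim, rng) + X[rng.integers(n)]
                           + (rng.random(dim) - 0.5) * rng.random())
            else:                                     # exploitation phase
                if rng.random() < 0.5:                # X3: slow descent attack
                    new = ((best - XM) * 0.1 - rng.random()
                           + ((ub - lb) * rng.random(dim) + lb) * 0.1)
                else:                                 # X4: walk and grab prey
                    G1 = 2 * rng.random() - 1
                    QF = t ** ((2 * rng.random() - 1) / (1 - T) ** 2)
                    new = (QF * best - G1 * X[i] * rng.random()
                           - G2 * levy(dim, rng) + rng.random() * G1)
            new = np.clip(new, lb, ub)
            fnew = float(fitness(new))
            if fnew < fit[i]:                         # greedy replacement
                X[i], fit[i] = new, fnew
                if fnew < fbest:
                    best, fbest = new.copy(), fnew
    return best, fbest
```

For hyperparameter tuning, `fitness` would evaluate the classification error rate of the Faster SqueezeNet model at a candidate hyperparameter vector; here a sphere function is used only to show the loop converging, e.g. `aquila_optimize(lambda x: float(np.sum(x ** 2)), dim=3, lb=-5.0, ub=5.0)`.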

Ensemble Learning-Based Classification
In this work, the DL paradigms are integrated, and the best outcome is selected by the weighted voting method. Given D base classification models and n classes, the predicted class c_k of weighted voting for every instance k is

c_k = arg max_{j=1,…,n} Σ_{i=1}^{D} w_i Δ_{ji},

where Δ_{ji} signifies a binary parameter: if the i-th base classifier classifies instance k into the j-th class, then Δ_{ji} = 1; otherwise, Δ_{ji} = 0. w_i represents the weight of the i-th base classifier in the ensemble. The ensemble accuracy is

Acc = |{k | c_k is the true class of instance k}| / (size of test instances) × 100%.
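The weighted voting rule and the accuracy formula can be sketched directly; the labels and weights below are hypothetical, with the three per-instance predictions standing in for the RNN, GRU, and BiLSTM base classifiers:

```python
def weighted_vote(predictions, weights):
    """Majority weighted voting: predictions[i] is base classifier i's class
    label for one instance; the class with the largest total weight wins."""
    scores = {}
    for label, w in zip(predictions, weights):
        scores[label] = scores.get(label, 0.0) + w
    return max(scores, key=scores.get)

def ensemble_accuracy(pred_matrix, weights, y_true):
    """Acc = (# correctly voted instances / # test instances) * 100%."""
    votes = [weighted_vote(p, weights) for p in pred_matrix]
    return 100.0 * sum(v == y for v, y in zip(votes, y_true)) / len(y_true)

# Two test instances, three base classifiers with example weights
preds = [["malignant", "malignant", "benign"],
         ["benign", "benign", "benign"]]
weights = [0.4, 0.35, 0.25]
```

Note that a weighted vote can overturn a simple majority only when the weights are unequal; with the example weights above, the two classifiers agreeing on "malignant" (0.75 total) outweigh the single "benign" vote (0.25).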

RNN Model
Elman first recommended the recurrent unit as its essential block (1990). If applied to exceedingly long sequences, the elementary RNN cell has the common problems of exploding and vanishing gradients [23]. In fact, the elementary RNN cell cannot hold long-term dependencies, which demonstrates that this cell has shortcomings. The backpropagated gradient tends to shrink when the sequence is particularly long, which prevents effective updating of the weights. Conversely, when the gradients are substantial, they may explode across a long sequence, rendering the weight matrix unstable. These two difficulties stem from the intractable nature of the gradient, which makes it more difficult for RNN cells to identify and account for long-term relationships. Equations (24) and (25) demonstrate the mathematical expression of the RNN architecture:

h_t = Tanh(P_h h_{t−1} + P_x x_t + B_a), (24)
y_t = σ(P_o h_t + B_o), (25)
where h_t denotes the hidden state, which is the only memory in the RNN cell; P_h and P_x epitomize the weight matrices for the hidden state and the input, and P_o the weight matrix for the cell output; x_t and y_t characterize the input and output of the cell at time step t, correspondingly; and B_a and B_o represent the bias vectors for the hidden state and cell output, correspondingly. The latter hidden state is conditioned on the hidden state of the previous time step and the existing input. The cellular feedback loop connects the current state to the succeeding one; this bond is crucial for considering prior data while adjusting the present cell state. Here, the hyperbolic tangent function, represented by Tanh, activates the overt state, and the sigmoid function, represented by σ, activates the latent state.
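A single step of this cell follows Equations (24) and (25); the NumPy sketch below uses the parameter names from the text (P_x, P_h, P_o, B_a, B_o), with shapes chosen for illustration:

```python
import numpy as np

def rnn_step(x_t, h_prev, Px, Ph, Ba, Po, Bo):
    """One Elman RNN step:
       h_t = tanh(Ph @ h_{t-1} + Px @ x_t + Ba)   -- Eq. (24)
       y_t = sigmoid(Po @ h_t + Bo)               -- Eq. (25)"""
    h_t = np.tanh(Ph @ h_prev + Px @ x_t + Ba)
    y_t = 1.0 / (1.0 + np.exp(-(Po @ h_t + Bo)))
    return h_t, y_t
```

Processing a sequence means calling `rnn_step` repeatedly, feeding each returned `h_t` back in as `h_prev`; that feedback loop is exactly the connection through which gradients vanish or explode over long sequences.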

GRU Model
The RNN is a kind of ANN with a cyclic structure that is appropriate for processing sequence data. However, the gradient vanishes and learning ability is greatly reduced once the time interval is large [24]. Hochreiter and Schmidhuber resolved these problems by developing the LSTM. The LSTM is extensively applied to time-series data, and its basic concept is that the cell state is interconnected like a conveyor belt, so that the gradient propagates even though the distance among states increases. In LSTM cells, the cell state is controlled by three gating functions: the forget, input, and output gates. In 2014, the GRU was developed as a network that enhances the learning accuracy of the LSTM by adjusting the LSTM model. Different from the LSTM, the GRU has a fast learning speed and comprises two gating functions. Furthermore, it has fewer parameters than the LSTM since the hidden and cell states are incorporated into a single hidden state. Accordingly, the GRU shows outstanding performance for long-term dependencies in time-series data processing and takes less computational time than the LSTM. The GRU equations to determine the hidden state are shown below:

r_t = σ(W_r x_t + U_r h_{t−1} + b_r),
z_t = σ(W_z x_t + U_z h_{t−1} + b_z),
h̃_t = Tanh(W_h x_t + U_h (r_t ⊙ h_{t−1}) + b_h),
h_t = (1 − z_t) ⊙ h_{t−1} + z_t ⊙ h̃_t,

where r_t denotes the reset gate and z_t indicates the update gate at time t, x_t represents the input value at time t, W and U indicate weights, b refers to biases, h_t denotes the hidden state at time t, and ⊙ shows the component-wise (Hadamard) multiplication.
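One GRU step can be sketched as follows (a minimal NumPy sketch of the standard gate equations; keying the weights by gate name in dictionaries is an illustrative choice, not the paper's implementation):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, W, U, b):
    """One GRU step; W, U, b are dicts keyed by gate: 'r' (reset), 'z' (update),
    'h' (candidate state)."""
    r = sigmoid(W['r'] @ x_t + U['r'] @ h_prev + b['r'])     # reset gate
    z = sigmoid(W['z'] @ x_t + U['z'] @ h_prev + b['z'])     # update gate
    h_tilde = np.tanh(W['h'] @ x_t + U['h'] @ (r * h_prev) + b['h'])
    return (1 - z) * h_prev + z * h_tilde                    # interpolated hidden state
```

The final line makes the speed advantage concrete: with only two gates and no separate cell state, the new hidden state is a direct interpolation between the old state and the candidate, controlled entirely by the update gate.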

BiLSTM Model
RNNs have the structural feature of nodes connected in a loop, making them appropriate for sequence data processing; however, they frequently confront the problem of vanishing gradients [25]. The GRU and long short-term memory (LSTM) improved on the RNN by adding several threshold gates to mitigate gradient vanishing problems and enhance classification accuracy. In particular, the LSTM method has a memory unit that prevents the network from facing gradient vanishing problems.
The LSTM remedies the deficiencies of the RNN; generally, the output at the present time is relevant to the state information of both past and future times. The Bi-LSTM network was established for the problem of integrating historical and future data by interconnecting two LSTMs. The architecture of the BiLSTM network comprises back-to-front and front-to-back LSTM layers. The forward and backward layers process the input dataset, and lastly, the outputs of the two layers are integrated to obtain the output of the BiLSTM network:

y_t = ω(o_t^f, o_t^b), (29)

In Equation (29), ω denotes the weighted parameters in the BiLSTM network, i_t shows the input at time t, o_t^f indicates the output of the forward hidden layer at time t, o_t^b represents the output of the backward hidden layer at time t, and y_t represents the final output of the network.
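The forward/backward combination can be sketched with a minimal LSTM cell. This is a simplified sketch: concatenation is used here to merge the two directions, whereas the paper merges them through the weighted parameters ω, and the dictionary-keyed weights are an illustrative choice:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step with forget/input/output gates ('f', 'i', 'o') and candidate 'g'."""
    f = sigmoid(W['f'] @ x + U['f'] @ h + b['f'])
    i = sigmoid(W['i'] @ x + U['i'] @ h + b['i'])
    o = sigmoid(W['o'] @ x + U['o'] @ h + b['o'])
    g = np.tanh(W['g'] @ x + U['g'] @ h + b['g'])
    c = f * c + i * g                 # conveyor-belt cell state
    return o * np.tanh(c), c

def bilstm(xs, fwd, bwd, hidden):
    """Run one LSTM front-to-back and one back-to-front over the sequence,
    then merge the two hidden states at each time step."""
    def run(seq, params):
        h, c, out = np.zeros(hidden), np.zeros(hidden), []
        for x in seq:
            h, c = lstm_step(x, h, c, *params)
            out.append(h)
        return out
    hf = run(xs, fwd)                 # front-to-back layer
    hb = run(xs[::-1], bwd)[::-1]     # back-to-front layer, re-aligned in time
    return [np.concatenate([f_, b_]) for f_, b_ in zip(hf, hb)]
```

At every time step the merged output sees both the history (via the forward pass) and the future (via the backward pass), which is exactly the property motivating the BiLSTM.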

Modeling of XAI Using LIME Approach
The AAOXAI-CD technique incorporates the XAI approach LIME for a better understanding and explainability of the black-box method for accurate cancer detection [26]. Local interpretable model-agnostic explanation (LIME) explains the predictions of various ML approaches by perturbing the feature values of a data sample and converting the resulting changes into per-feature contributions to the prediction; the explainer thus gives a local interpretation of a data sample. For example, the interpretable model in LIME is often a linear regression (LR) or decision tree (DT), trained on small perturbations of the input (removing specific words, hiding part of the image, or adding random noise). There is a persistent tradeoff between model accuracy and interpretability: performance can generally be improved by applying sophisticated techniques such as decision trees, random forests, boosting, and SVMs, which are "black-box" techniques, and LIME provides a clear explanation of the predictions of such black-box classifiers. LIME is a way of understanding an ML black-box method by perturbing the input dataset and observing how the prediction changes, and it can be used with any ML black-box model. The fundamental steps are as follows: a TabularExplainer is initialized with the data used for training, the feature names, and the class names.
Then, its explain_instance method accepts a reference to the instance for which an explanation is required, the number of features to include in the explanation, and the trained model's prediction function.
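The TabularExplainer/explain_instance workflow above refers to the LIME library itself; as a self-contained illustration of the underlying idea, here is a minimal NumPy sketch (the helper `lime_explain` and its parameters are hypothetical) that perturbs an instance, weights the perturbations by proximity, and fits a weighted linear surrogate whose coefficients are the per-feature contributions:

```python
import numpy as np

def lime_explain(predict_fn, instance, n_samples=500, kernel_width=0.75, seed=0):
    """Minimal LIME-style tabular explanation around one instance."""
    rng = np.random.default_rng(seed)
    X = instance + rng.normal(0, 1, (n_samples, instance.size))  # perturbations
    y = np.array([predict_fn(x) for x in X])                     # black-box outputs
    d = np.linalg.norm(X - instance, axis=1)
    w = np.exp(-(d ** 2) / kernel_width ** 2)                    # proximity kernel
    A = np.hstack([X, np.ones((n_samples, 1))])                  # add intercept column
    Aw = A * w[:, None]
    # Weighted least squares via the normal equations (A^T W A) beta = A^T W y
    coef, *_ = np.linalg.lstsq(Aw.T @ A, Aw.T @ y, rcond=None)
    return coef[:-1]                                             # per-feature weights
```

Because the surrogate is fitted only on samples near the instance, its coefficients describe the black-box model's local behavior around that one prediction, not the model globally.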

Results and Discussion
The proposed model is simulated using Python 3.6.5 tool on PC i5-8600k, GeForce 1050 Ti 4 GB, 16 GB RAM, 250 GB SSD, and 1 TB HDD. The parameter settings are given as follows: learning rate: 0.01, dropout: 0.5, batch size: 5, epoch count: 50, and activation: ReLU. In this section, the simulation values of the AAOXAI-CD technique can be tested utilizing dual datasets: the colorectal cancer dataset (dataset 1) and the osteosarcoma dataset (dataset 2). Figure 4 defines the sample images of Colorectal Cancer. For experimental validation, 70:30 and 80:20 of the training set (TRS) and testing set (TSS) is used. Dataset 1 (Warwick-QU dataset) [27] comprises 165 images with 91 malignant tumors and 74 benign tumor images. The data were collected using the Zeiss MIRAX MIDI Scanner by implementing an image data weight range of 1.187 kilobytes, 716 kilobytes, and an image data resolution range of 567 × 430 pixels to 775 × 522 pixels with all pixels having a distance of 0.6 µm from the actual distance. Next, dataset 2 [28]       In Figure 6, the cancer classifier outcomes of the AAOXAI-CD method in terms of classification performance under dataset-1. The outcomes demonstrate that the AAOXAI-CD system has identified benign and malignant samples. In Figure 6, the cancer classifier outcomes of the AAOXAI-CD method in terms of classification performance under dataset-1. The outcomes demonstrate that the AAOXAI-CD system has identified benign and malignant samples. In Table 1  In Table 1, the overall classifier results of the AAOXAI-CD method on dataset-1. The results demonstrate that the AAOXAI-CD method has identified benign and malignant samples. For instance, with 80% of TRS, the AAOXAI-CD technique reaches an average accu y of 98.65%, prec n of 98.33%, reca l of 98.65%, spec y of 98.65%, F score of 98.47%, and MCC of 96.98%. 
Meanwhile, with 20% of TSS, the AAOXAI-CD system reaches an average accuracy of 97.06%, precision of 97.06%, recall of 97.06%, specificity of 97.06%, F-score of 96.97%, and MCC of 94.12%. Furthermore, with 70% of TRS, the AAOXAI-CD algorithm reaches an average accuracy of 99%, precision of 99.24%, recall of 99%, specificity of 99%, F-score of 99.11%, and MCC of 98.24%. The training accuracy (TACY) and validation accuracy (VACY) of the AAOXAI-CD model on dataset 1 are shown in Figure 7. The figure exhibits that the AAOXAI-CD method achieves improved performance with increasing TACY and VACY values; visibly, the model attains maximum TACY outcomes. The training loss (TLOS) and validation loss (VLOS) of the AAOXAI-CD model on dataset 1 are shown in Figure 8. The figure indicates that the AAOXAI-CD approach achieves superior performance with minimal TLOS and VLOS values; notably, the model attains minimal VLOS outcomes. Table 2 and Figure 9 present the comparative analysis of the AAOXAI-CD system with recent methods on dataset 1 [29][30][31]. The figures show that the ResNet-18 (60-40), ResNet-50 (60-40), and CP-CNN models yield the lowest performance. Although the AAI-CCDC technique achieves moderately improved outcomes, the AAOXAI-CD technique accomplishes maximum performance with precision of 99.24%, recall of 99%, and accuracy of 99%. Table 2. Analysis outcome of AAOXAI-CD method with other systems on dataset-1. Figure 10 shows the cancer classification outcomes of the AAOXAI-CD system on dataset 2; the results demonstrate that the AAOXAI-CD technique correctly identifies benign and malignant samples. Table 3 reports the overall classifier results of the AAOXAI-CD system on dataset 2. For instance, with 80% of TRS, the AAOXAI-CD technique reaches an average accuracy of 98.11%, precision of 97.60%, recall of 96.77%, specificity of 98.37%, F-score of 97.16%, and MCC of 95.66%. Meanwhile, with 20% of TSS, the AAOXAI-CD algorithm reaches an average accuracy of 99.42%, precision of 99.16%, recall of 98.61%, specificity of 99.49%, F-score of 98.87%, and MCC of 98.44%. Furthermore, with 70% of TRS, the AAOXAI-CD technique reaches an average accuracy of 98.67%, precision of 97.70%, recall of 97.26%, specificity of 99.07%, F-score of 97.42%, and MCC of 96.56%. The TACY and VACY of the AAOXAI-CD model on dataset 2 are shown in Figure 11. The figure highlights that the AAOXAI-CD method achieves improved performance with increasing TACY and VACY values; remarkably, the model attains higher TACY outcomes. Cancers 2023, 15, x FOR PEER REVIEW 17 of 20
Figure 11. TACY and VACY analysis of AAOXAI-CD approach on dataset-2.

The TLOS and VLOS of the AAOXAI-CD model on dataset 2 are shown in Figure 12. The figure indicates that the AAOXAI-CD system achieves better outcomes with minimal TLOS and VLOS values; visibly, the model attains minimal VLOS outcomes. Table 4 and Figure 13 present a brief comparison of the AAOXAI-CD method with recent methods on dataset 2 [32,33]. The experimental values show that the CNN-Xception, CNN-EfficientNet, CNN-ResNet-50, and CNN-MobileNet-V2 models yield the lowest performance. Although the WDODTL-ODC and HBODL-AOC techniques achieve moderately improved outcomes, the AAOXAI-CD technique accomplishes maximum performance with precision of 99.05%, recall of 98.91%, and accuracy of 99.42%. From the above results, it is confirmed that the proposed model achieves effective classification performance over other DL models. The enhanced performance of the proposed model is due to the inclusion of AAO-based hyperparameter tuning and the ensemble classification process.
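The majority weighted voting used in the ensemble stage can be sketched as weighted hard voting over the base classifiers' predicted labels. The sketch below is a minimal illustration, not the paper's implementation: the label arrays stand in for RNN, GRU, and BiLSTM predictions, and the weights (which might come from each member's validation accuracy) are invented for the example.

```python
import numpy as np

def weighted_majority_vote(predictions, weights, n_classes=2):
    """Majority weighted voting: each classifier casts its predicted label,
    weighted by its reliability score; the label accumulating the largest
    total weight wins for each sample."""
    n_samples = len(predictions[0])
    votes = np.zeros((n_samples, n_classes))
    for preds, w in zip(predictions, weights):
        for i, label in enumerate(preds):
            votes[i, label] += w
    return votes.argmax(axis=1)

# Illustrative predicted labels from three hypothetical base classifiers
# (stand-ins for the RNN, GRU, and BiLSTM ensemble members).
rnn_preds = np.array([0, 1, 1])
gru_preds = np.array([0, 1, 0])
bilstm_preds = np.array([1, 1, 0])
final = weighted_majority_vote([rnn_preds, gru_preds, bilstm_preds],
                               weights=[0.9, 0.8, 0.7])
```

On the third sample the two weaker members jointly outvote the strongest one (0.8 + 0.7 > 0.9), showing how weighting shifts decisions relative to a plain majority vote.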
In addition, the use of LIME helps build an effective and interpretable predictive modeling technique for cancer diagnosis. Without transparency, it is hard to gain the trust of healthcare professionals and to deploy predictive approaches in their daily operations. XAI has received considerable interest in recent times: it enables users to generate explanations for individual instances and to comprehend how the classification model arrives at its results. Healthcare institutions are actively designing predictive models to support their operations, and XAI can be incorporated to improve the transparency of healthcare predictive modeling.
The interactions between healthcare professionals and the AI system are important for transferring knowledge and adopting models in healthcare operations.

Conclusions
In this study, we have developed an automated cancer diagnosis method using the AAOXAI-CD technique on medical images. The proposed AAOXAI-CD system accomplishes effective colorectal and osteosarcoma cancer classification. Primarily, the AAOXAI-CD technique utilizes the Faster SqueezeNet model for feature vector generation, and the hyperparameters of the Faster SqueezeNet model are tuned with the AAO algorithm. For cancer classification, a majority-weighted voting ensemble model with three DL classifiers, namely RNN, GRU, and BiLSTM, is used. Furthermore, the AAOXAI-CD technique incorporates the XAI approach LIME for better understanding and explainability of the black-box method for accurate cancer detection. The experimental evaluation of the AAOXAI-CD approach was conducted on medical cancer imaging databases, and the outcomes confirmed its promising performance over other recent methods. In the future, a feature-fusion-based classification model can be designed to further boost the performance of the AAOXAI-CD technique.

Data Availability Statement: Data sharing is not applicable to this article as no datasets were generated during the current study.