Brain Tumor Classification and Detection Using Hybrid Deep Tumor Network

Abstract: Brain tumors (BTs) are considered one of the deadliest, most destructive, and most aggressive diseases, shortening the average life span of patients. Patients with misdiagnosed or insufficiently treated BTs have a lower chance of survival. For tumor analysis, magnetic resonance imaging (MRI) is often utilized. However, due to the vast data produced by MRI, manual segmentation in a reasonable period of time is difficult, which limits the application of standard criteria in clinical practice; efficient and automated segmentation techniques are therefore required. The accurate early detection and segmentation of BTs is a difficult and challenging task in biomedical imaging, and automated segmentation remains an open issue because of the considerable temporal and anatomical variability of brain tumors. Early detection and treatment are therefore essential. To detect brain cancers or tumors, different classical machine learning (ML) algorithms have been utilized; however, the main difficulty with these models is their reliance on manually extracted features. This research provides a deep hybrid learning (DeepTumorNet) model for binary BTs classification that overcomes the above-mentioned problems. The proposed method hybridizes the GoogLeNet architecture with a CNN model by eliminating the last 5 layers of GoogLeNet and adding 14 layers of the CNN model, which extracts features automatically. On the same Kaggle (Br35H) dataset, the key performance indicators of the proposed model were compared with transfer learning (TL) models (ResNet, VGG-16, SqueezeNet, AlexNet, MobileNet V2) and different ML/DL models. The proposed approach outperformed them on the key performance indicators (accuracy, recall, precision, and F1-score) of BTs classification, exhibiting high classification performance measures: accuracy (99.51%), precision (99%), recall (98.90%), and F1-score (98.50%). The proposed approach thus shows its superiority over recent sibling methods for BTs classification using MRI images.


Introduction
A brain tumor (BT) is an abnormal growth of cancerous brain cells, usually rooted in unregulated and aberrant cell division; primary brain tumors are among its forms [23]. Since each neuron in an ANN is coupled to other neurons, these networks can also extract information. The deeper we go into deep learning, the more densely linked the layers become, enabling them to perform better in medical imaging. CNN, for instance, is the most often applied DL model, with its primary application being the classification of images of brain tumors [25]. Additionally, it is often advantageous to hybridize two models for better classification and detection; various researchers who used hybridization with CNN models in biomedical imaging obtained satisfactory results [24][25][26][27]. As a result, we were inspired to utilize the hybrid strategy to increase the accuracy and performance of current models in recognizing various types of BTs. To do this, we present a hybrid variant, the DeepTumorNet model, for identifying and classifying images into normal and BT classes. In this method, a deep learning mechanism is employed to extract features, and a SoftMax classification layer is utilized to account for heterogeneity. In comparison with the traditional techniques ResNet [26,27], MobileNet V2 [28][29][30][31][32][33][34], AlexNet [27][28][29][30], SqueezeNet [31,33], and VGG-16 [31,34,35] on the Kaggle brain tumor dataset, which can be accessed by the general public via figshare, the proposed model achieved the highest BTs classification accuracy recorded. Furthermore, the goal of this research is to answer the following question: How precisely and effectively can the DeepTumorNet model detect and categorize distinct forms of BT disease?
Our key contributions in this study are as follows: • We propose a hybrid DL-TL model to identify two different kinds of images: brain tumor and non-brain tumor (healthy).

• The proposed TL-DL detection technique shows superiority over current methods and has the highest accuracy on the Kaggle dataset. A large number of experiments were conducted with four distinct pre-trained DL models using TL strategies. Furthermore, to reveal the effectiveness of the prediction performance of the proposed method, it was compared with recent ML/DL and transfer learning models.
The organization of the paper is as follows. The introduction and background works are discussed in Sections 1 and 2, respectively, while the data processing and proposed model are presented in Section 3. Section 4 presents the experimental study, results, and discussion. Section 5 contains the conclusion and future work.

Related Works
For BTs categorization and detection, various ML/DL models have been employed [36]. DL models play an important role in detection and classification in different areas [37][38][39][40][41][42][43][44][45]. In the literature, several alternative ways of identifying and classifying BTs have been established using magnetic resonance (MR) FLAIR images. Zeineldin et al. [46] developed a DNN technique for automated BTs segmentation. Their concept comprises two interconnected core components, one for encoding and the other for decoding. A CNN devoted to extracting meaningful spatial features forms the encoder section; the generated semantic map is then fed into the decoder part to produce the comprehensive probability map. ResNet and DenseNet were investigated in the final stage, and ResNet-50 with TL was used to identify BTs. Their experimental results were 95% accurate. In related work, Nawab et al. [47] used block-wise transfer learning with 5-fold cross-validation and achieved 94.82% accuracy. To validate their approach, they employed a benchmark dataset based on T1-weighted contrast-enhanced magnetic resonance imaging (CE-MRI). Furthermore, Sarmad Maqsood et al. [48] implemented fuzzy logic and a U-Net CNN model for binary segmentation and classification of BTs and reported that the model performed better than other sibling methods.
The detection accuracy attained with this conceptual framework was 97.5%. In [49], Mircea et al. extracted wavelet coefficients from images using a feature-based technique. The authors contend that wavelet transforms have a temporal-resolution edge over Fourier transforms, allowing the frequency content of the images to be localized in space. The classifier used with this technique reached 91% accuracy. V. Rajinikanth et al. [50] presented a CADD system with a CNN model for segmentation and classification; they explained and investigated different CADD systems. After evaluation and investigation, the SVM model achieved 97% accuracy using 10-fold cross-validation.
Before this model could be trained, it was put through a validation process using the deep learning algorithms Inception-v3 and DenseNet201; they achieved 89% accuracy.
This collection has 155 illustrations of malignant BTs as well as normal, healthy tissue. Furthermore, it was not possible to fine-tune the CNNs because the dataset was small, and the testing set was also insufficient to verify the correctness of the proposed model.
A model for the automatic classification of BTs was proposed using VGG-16 and the BraTS dataset [51]. B. Badjie et al. [52] implemented DCNN model learning for binary MRI image segmentation and classification; their AlexNet CNN model indicated the best accuracy, up to 90%. P. Dvorak and colleagues selected the convolutional neural network as the learning approach in [53] due to its ability to cope with feature correlation. They tested the technique on the publicly accessible dataset (BRATS2014), which includes three separate multimodal segmentation tasks. As a consequence, they acquired cutting-edge findings for the BT segmentation dataset, which contained 254 multimodal volumes, and required only thirteen seconds to process each volume.
S. Irsheidat and colleagues created a model based on an ANN in [54]. This model is capable of taking magnetic resonance images and analyzing them using matrix operations and mathematical formulas. To generate reliable predictions concerning the presence of brain cancers, this neural network was trained using magnetic resonance images of 155 healthy brains and 98 tumors. The collection comprised 253 MRI images in total.
Sravya et al. [55] investigated the detection of BTs and presented various important topics and approaches. Dolphin-SCA is a unique optimized DL approach for the identification and classification of BTs described by Kumar et al. [56]. The process is powered by a deep CNN; the researchers used a fuzzy-based model in conjunction with a dolphin echolocation-based sine cosine algorithm (Dolphin-SCA) for segmentation. The obtained characteristics were utilized in a deep neural network built on power statistical features and LDP, using Dolphin-SCA as its basis. The proposed technique obtained its highest accuracy rate at 81.6%. S. Maqsood et al. [57] introduced a support vector machine (SVM) with DCNNs for multi-modal BTs detection with 96% accuracy. Waghmare et al. [58] identified and classified BTs using a range of CNN architectures. All of the methods mentioned above have issues with the classification performance of BTs images, which the proposed deep tumor network resolves.

Methodology
This section presents the proposed deep tumor network, which includes two major steps: data processing (data collection, data augmentation, and class labeling) and the training of the suggested method, together with the process of classifying the Kaggle (Br35H) image dataset into tumor and non-tumor classes, as shown in Figures 1 and 2. In addition, the proposed model's performance has been assessed using the major performance indicators (accuracy, recall, precision, and F1-score).

Brain Tumor Kaggle Dataset
The experiments described in this study were performed by utilizing a publicly accessible dataset acquired from Kaggle (Br35H) [59]. This dataset consisted of 1500 brain MRI images with tumors and 1500 brain MRI images without tumors. All images were two-dimensional with a height and width of 256 × 256 pixels. All images were skull-stripped and labeled "yes" if they contained a tumor and "no" if they did not. Figure 1 shows the dataset of images with and without tumors, labeled yes and no, respectively. The descriptions of the training and testing datasets are listed in Table 1 and Figures 3 and 4.


Data Augmentation
The Kaggle dataset (Br35H) included 3000 images, which were insufficient, so data augmentation was needed to increase the dataset size by scaling and rotating the images and adding noise. The images were zoomed vertically and horizontally at certain angles, and the brightness was increased, which improved the training and classification performance of the proposed model. Additionally, each image of the Kaggle dataset was augmented 17 times relative to the original dataset to avoid the overfitting issue [11]. Some of the data augmentation techniques used in this research are as follows:
A. Position augmentation: In this process, the pixel positions of the brain MRI images are changed.
B. Scaling: In the scaling process, the brain images are resized.
C. Cropping: In the cropping process, a small portion of the brain MRI image is selected; in this study, we selected the center of the brain images.
D. Brightness: In this step, the brightness of the brain images is changed from the original to a lighter one.
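As an illustration only (not the exact augmentation pipeline or parameter values used in this work), the techniques listed above can be sketched in NumPy for a 256 × 256 grayscale image; the brightness factor and noise level below are assumed values:

```python
import numpy as np

def augment(img, rng):
    """Produce a few augmented variants of a 2-D grayscale image:
    flips (position), a center crop scaled back up, a brightness
    shift, and additive Gaussian noise. Factors are illustrative."""
    out = []
    out.append(np.fliplr(img))                    # horizontal flip
    out.append(np.flipud(img))                    # vertical flip
    h, w = img.shape
    crop = img[h // 4:h // 4 + h // 2, w // 4:w // 4 + w // 2]
    out.append(np.kron(crop, np.ones((2, 2))))    # center crop, naive 2x upscale
    out.append(np.clip(img * 1.2, 0, 255))        # brightness increase (assumed 1.2x)
    out.append(np.clip(img + rng.normal(0, 5, img.shape), 0, 255))  # noise
    return out

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(256, 256)).astype(float)
augmented = augment(img, rng)
```

Each variant keeps the original 256 × 256 shape, so the augmented images can be fed to the same input layer as the originals.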

Row Major Order
In row-major order, the pixels of a multi-dimensional image array are stored row by row in a single contiguous sequence to ease computational processing, because RGB and grayscale images are complex multi-dimensional arrays that would otherwise need more computing resources.
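A minimal NumPy example of this storage order: flattening a small 2 × 3 "image" in row-major (C) order walks each row left to right before moving to the next row.

```python
import numpy as np

# A 2x3 "image": two rows of three pixels.
img = np.array([[1, 2, 3],
                [4, 5, 6]])

# NumPy uses row-major (C) order by default, so flattening
# concatenates the rows into one contiguous sequence.
flat = img.flatten(order="C")
print(flat.tolist())  # [1, 2, 3, 4, 5, 6]
```

Column-major (Fortran) order, by contrast, would yield [1, 4, 2, 5, 3, 6] for the same array.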

Proposed Model
The hybrid DeepTumorNet model consists of a CNN as the fundamental model hybridized with GoogLeNet. Training a CNN model from scratch may occasionally take a few days and is a difficult task [30][31][32][33][34][35][36][37][38][39][40][41][42][43][44][45][46][47][48]; it is therefore preferable to first train the proposed model with the TL model before hybridizing it with the CNN model. In this study, we implemented the GoogLeNet [26] model as the foundational model because it won the ILSVRC (2014) ImageNet competition. The basic GoogLeNet model consists of a total of 22 layers, including convolution layers, average pooling layers (APs), normalization layers, global max-pooling layers (GMXs), inception modules, and fully connected layers (FCLs). A new input layer with dimensions 224 × 224 × 1 was implemented in GoogLeNet. In the pre-trained GoogLeNet, the ReLU activation function (AF) was used; ReLU ignores negative values and substitutes zero for them. In the proposed model, ReLU was upgraded to Leaky ReLU, in which negative values are scaled by a small factor instead of being zeroed. Furthermore, the last 5 layers of GoogLeNet were replaced by 14 additional layers of the CNN model, and ReLU was also replaced by Leaky ReLU in the CNN layers. These changes were accomplished without altering the primary structure of the proposed model. After adding these layers, the total number of layers was 27.
For the first CLs, the image was shuffled and the filter size was 8 × 8. The second stage had two deep CLs consisting of a 1 × 1 convolution block, which achieved dimensionality reduction and a decrease in the number of parameters. GoogLeNet consists of different inception modules with various convolution kernel (CK) sizes, from 1 × 1 to 5 × 5, capturing different features. At the beginning of the process, the important features were extracted; similarly, the 1 × 1 CKs reduce processing time while providing enough information, as described in Table 2. In addition, the CLs capture more robust and precise feature information, because the first CLs extract minute features while the later CLs extract high-level features. Furthermore, the addition of GaPLs improved the validation accuracy of the proposed model, and the Leaky ReAFs improved the model's expressiveness and solved the dying-ReLU issue, resulting in improved classification performance, as shown in Figure 4. Due to these layers, the proposed model was able to extract the most important, deep, and discriminative features, which improved classification performance compared with other recent state-of-the-art methods and ML/DL models.

Input Image Data
The input to the proposed DeepTumorNet model starts at the image input layer. In this study, an image size of 224 × 224 × 1 was provided in grayscale, which denotes the width, height, and corresponding channel size; a grayscale image consists of a single channel, whereas a color image consists of three (red, green, and blue). In the initial training process, these images were passed through the input layers.
A. Convolutional layer: The two major inputs to this layer are the image matrix and a filter. The mathematical operation involves convolving the filter with the image, generating the input feature map.
B. Activation layer: In this layer, rectified linear units (ReLUs) and their leaky variant were used, which speed up the training process and give nonlinearity to the network model. The mathematical expression of the activation function is shown in Equation (1). For positive inputs y, the activation function returns y as the output; for negative inputs, the leaky variant returns the much smaller value 0.01y. As a result, no neuron is rendered fully inactive, and we no longer encounter dead neurons.
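The activation just described, f(y) = y for y > 0 and f(y) = 0.01y otherwise (Equation (1)), can be sketched directly in NumPy:

```python
import numpy as np

def leaky_relu(y, alpha=0.01):
    """Equation (1): returns y for positive inputs and alpha * y otherwise,
    so negative inputs are scaled down rather than zeroed and no
    neuron's gradient is ever exactly zero ("dead")."""
    return np.where(y > 0, y, alpha * y)

vals = leaky_relu(np.array([2.0, -3.0]))  # positive passes through, negative is scaled
```

Setting alpha = 0 recovers the plain ReLU used in the pre-trained GoogLeNet.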
C. Batch normalization layers: The outputs created by the convolution layers were normalized by applying a batch normalization layer. Normalization shortens the training duration of the proposed model and makes the learning process both more efficient and more rapid.
D. Pooling layer: The convolutional layer's primary limitation is that it only captures location-dependent features; the classification therefore ends up being inaccurate if there is even a small shift in the position of a feature inside the image. By rendering the image more compact through pooling, the network is able to bypass this constraint, and the representation becomes invariant to relatively small changes and particulars. Max pooling and average pooling were applied so that the characteristics might be linked to one another.
E. Fully connected layer: In this layer, the features generated by the CLs are fed into the FC layers. In the FC layer, every node is connected to every other node, establishing the relation between an input image and its associated class. This layer implements SoftMax activation.
F. Loss function: During training, this function must be reduced. After the image has been processed through all of the preceding layers, the output is calculated; the error rate is computed by comparing it to the expected outcome using the loss function. This process is repeated until the loss is reduced. We used binary cross-entropy (BCE) as our loss function; its mathematical expression is shown in Equation (2).
In binary classification, the actual value of y may only take one of two forms: 0 or 1. Therefore, to accurately determine the loss between the expected and actual results, it is necessary to compare the actual value, 0 or 1, with the probability that the input belongs to that category (where p(i) is the probability that the category is 1, and 1 − p(i) is the probability that the category is 0).
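The BCE described here (Equation (2)) averages −[y log p + (1 − y) log(1 − p)] over the batch; a small NumPy sketch:

```python
import numpy as np

def bce(y_true, p):
    """Equation (2): binary cross-entropy averaged over the batch.
    y_true is 0 or 1; p is the predicted probability that the class is 1."""
    p = np.clip(p, 1e-12, 1 - 1e-12)  # avoid log(0)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

# Two confident, correct predictions give a small loss.
loss = bce(np.array([1, 0]), np.array([0.9, 0.1]))
```

The loss shrinks toward zero as the predicted probabilities approach the true labels, which is exactly what the training loop minimizes.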
G. SoftMax layer: The FC layer's outcomes are normalized by this activation function. SoftMax performs the probabilistic computation for the network and generates positive values for each class.
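The SoftMax computation just described, exponentiating the FC outputs and normalizing them into a probability distribution over the classes, can be sketched as:

```python
import numpy as np

def softmax(z):
    """Exponentiate and normalize the FC-layer outputs so each class gets
    a positive value and all values sum to one (a posterior distribution)."""
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()

probs = softmax(np.array([2.0, 1.0, -1.0]))  # largest logit -> largest probability
```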

H. Classification Layer
The classification layer is the model's final layer. This layer is utilized to generate the output by merging each input. As a consequence of the SoftMax AF, a posterior distribution is obtained [34].
I. Grid search hyperparameter optimization: Grid search is a hyperparameter optimization approach that methodically builds and evaluates a model for each combination of algorithm parameters specified in a grid. In this problem, we tuned the hyperparameters using grid search to find the optimal values based on classification performance. The grid search selected the following hyperparameters: epoch size = 100, epsilon = 0.002, filter size = 1 × 1, batch size = 100, and learning rate = 0.009. Furthermore, the grid search optimization used 10-fold cross-validation, in which both training and testing are carried out exactly once within each fold. Ten-fold cross-validation is a good technique for avoiding overfitting: k-fold validation reduces variance by averaging over k different partitions, so the performance estimate is less sensitive to how the data are partitioned. In the 10-fold cross-validation process, the dataset is split into 10 equal parts using a random number generator; nine of those parts are used for training, while the remaining tenth is set aside for evaluation. This process is carried out ten times, each time holding out a different tenth for evaluation.
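The mechanics of grid search over k folds can be sketched as follows. This is not the paper's training loop: the real CNN is far too heavy for a snippet, so a toy one-feature threshold classifier on synthetic data stands in for it, and the grid contains a single assumed parameter.

```python
import numpy as np
from itertools import product

def kfold_indices(n, k, rng):
    """Shuffle the sample indices and split them into k roughly equal folds."""
    return np.array_split(rng.permutation(n), k)

def cv_accuracy(X, y, threshold, k=10, rng=None):
    """Mean accuracy over k held-out folds for a toy threshold classifier
    (the stand-in for the real model being tuned)."""
    rng = rng or np.random.default_rng(0)
    folds = kfold_indices(len(X), k, rng)
    accs = []
    for i in range(k):
        test = folds[i]                       # fold i is held out for evaluation
        preds = (X[test] > threshold).astype(int)
        accs.append((preds == y[test]).mean())
    return float(np.mean(accs))

# Tiny synthetic "dataset": the feature value correlates with the label.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)
X = y + rng.normal(0, 0.3, size=200)

# Grid search: evaluate every combination in the grid, keep the best scorer.
grid = {"threshold": [0.25, 0.5, 0.75]}
best = max(
    (dict(zip(grid, combo)) for combo in product(*grid.values())),
    key=lambda params: cv_accuracy(X, y, **params),
)
```

With more parameters in the grid, `product` enumerates every combination, which is why grid search cost grows multiplicatively with grid size.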

Transfer Learning Model
Transfer learning employs a model that has already been trained, reusing the characteristics learned for one problem as a springboard for solving other problems. In this work, we employed five pre-trained CNN architectures that predict 1000 classes: AlexNet, ResNet, VGG-16, MobileNet-v2, and SqueezeNet. These architectures were trained using 1.2 M images. The labels of each object in the image are generated probabilistically by these networks using the full image as an input.

A.
A. ResNet: This model is Microsoft Research's 50-layer Residual Network [60]. ResNet employs shortcut connections to speed up training, which can decrease errors as complexity rises. The residual connections are linked to feature deduction, and ResNet also addresses the issue of degrading accuracy. Figure 5 depicts the ResNet model design.

D. SqueezeNet
SqueezeNet is an 18-layer deep convolutional neural network. A pretrained variant of the network, trained on more than a million images from the ImageNet database, may be loaded; as a consequence, the network has learned detailed visual features for a diverse set of images. This method returns a SqueezeNet v1.1 network, which has similar accuracy to SqueezeNet v1.0 but requires fewer floating-point computations per prediction [63], as shown in Figure 7.

E. AlexNet
In AlexNet, the network is divided into 11 different layers. The large number of layers makes feature extraction easier; on the other hand, the extensive number of parameters has a negative influence on overall performance. The first layer of AlexNet is a convolution layer; another convolution layer comes third, after the maximum pooling and normalizing layers. The classification procedure concludes with the application of the SoftMax layer [64], as shown in Figure 8.

Table 3 presents the experimental setup.

Evaluation Metrics
To evaluate the performance of the proposed model, the key metrics (accuracy, precision, recall, and F1-score) are used. These are computed from the true positives (TP), false negatives (FN), true negatives (TN), and false positives (FP), as shown in Equations (3)-(7). This section compares the performance metrics of the proposed system with those of transfer learning (TL) models such as VGG-16, SqueezeNet, MobileNet V2, ResNet, and AlexNet in brain tumor detection and classification.
Accuracy (%) is one of the core aspects that exhibits per-class efficiency in classification performance. Additionally, precision indicates the proportion of positive predictions that are correct, and specificity presents the percentage of the negative class correctly identified. In the evaluation of the proposed model, the key performance indicators were compared with those of the other TL methods, which shows that the proposed model achieves the best classification performance in terms of accuracy (99.20%), precision (99.10%), specificity (98.2%), recall (98.60%), and F1-score (98%). Figure 9 shows that the SqueezeNet model has the lowest performance metrics.
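The metrics above (Equations (3)-(7)) all derive from the four confusion-matrix counts; a small sketch, using hypothetical counts rather than the paper's actual confusion matrix:

```python
def metrics(tp, fp, tn, fn):
    """Compute the key performance indicators from the four
    confusion-matrix counts (TP, FP, TN, FN)."""
    acc = (tp + tn) / (tp + fp + tn + fn)   # overall correctness
    prec = tp / (tp + fp)                   # correct share of positive predictions
    rec = tp / (tp + fn)                    # recall / sensitivity / TPR
    spec = tn / (tn + fp)                   # specificity / TNR
    f1 = 2 * prec * rec / (prec + rec)      # harmonic mean of precision and recall
    return acc, prec, rec, spec, f1

# Hypothetical counts for a binary tumor/non-tumor test split.
acc, prec, rec, spec, f1 = metrics(tp=90, fp=1, tn=95, fn=2)
```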

Confusion Matrix
A confusion matrix is a performance assessment indicator that measures the detection of each class. In this investigation, the proposed deep tumor network's confusion matrix achieved good classification performance for binary tumor detection and properly classified each type of brain tumor. Figure 10 shows that the TL models have the lowest performance metrics.

ROC Analysis
The receiver operating characteristic (ROC) curve is critical for evaluating brain tumor detection. The ROC curve depicts the ratio of TPR to FPR for each class's detection performance. Figure 11 demonstrates that the proposed method outperforms the other TL techniques on the basis of the ROC curve.
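The (FPR, TPR) pairs that make up an ROC curve come from sweeping a decision threshold over the classifier's scores; a sketch with made-up scores and labels (not results from this study):

```python
import numpy as np

def roc_points(scores, labels, thresholds):
    """TPR and FPR at each threshold: the coordinates of the ROC curve."""
    pts = []
    for t in thresholds:
        preds = (scores >= t).astype(int)
        tp = np.sum((preds == 1) & (labels == 1))
        fp = np.sum((preds == 1) & (labels == 0))
        fn = np.sum((preds == 0) & (labels == 1))
        tn = np.sum((preds == 0) & (labels == 0))
        pts.append((fp / (fp + tn), tp / (tp + fn)))  # (FPR, TPR)
    return pts

# Hypothetical tumor-probability scores and true labels.
scores = np.array([0.1, 0.4, 0.35, 0.8, 0.9, 0.6])
labels = np.array([0, 0, 1, 1, 1, 0])
pts = roc_points(scores, labels, thresholds=[0.0, 0.5, 1.1])
```

A threshold of 0 accepts everything, giving the (1, 1) corner; an impossible threshold rejects everything, giving (0, 0); a better classifier bends the intermediate points toward the top-left.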


TNR, TPR, and MCC Analysis
This subsection presents the analysis of the TNR, TPR, and MCC of the proposed model against the best-performing TL models (AlexNet and MobileNet) on the Kaggle dataset. Figure 12 shows that the proposed model achieved excellent values of TPR, TNR, and MCC compared with the other TL models.
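The Matthews correlation coefficient (MCC) reported here is computed from the same four confusion-matrix counts as the other metrics; a sketch with hypothetical counts:

```python
import math

def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient: +1 for perfect prediction,
    0 for random prediction, -1 for total disagreement."""
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0

score = mcc(tp=90, fp=1, tn=95, fn=2)  # hypothetical counts, close to +1
```

Unlike accuracy, MCC stays informative when the two classes are imbalanced, which is why it is reported alongside TPR and TNR.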


Time Complexity
The detection time is an important metric indicating the effectiveness of the model; it reflects the model's internal ability to find the features and perform classification. The proposed model achieved a detection time as low as 3 ms, as shown in Figure 13. Furthermore, the time complexity of the proposed model is expressed using big O notation. Big O notation (e.g., O(n^2), where n represents the number of input samples) is the metric most frequently used to describe time complexity. Big O refers precisely to the worst-case scenario and may represent either the execution time of an algorithm or the space it occupies.

The efficiency of the proposed model was tested by comparing it with other DL models using the same dataset. Compared with these models, the proposed method achieved higher classification performance.


Comparative Results with Existing ML/DL Model
We compared the proposed deep tumor network with other strong benchmark algorithms such as LSTM, GRU, and CNN. Figure 14 shows the accuracy metric used to check the performance of the models. Although the deep tumor network achieves spectacular outcomes, all of these methods were evaluated in terms of these parameters. Furthermore, the proposed model has some limitations: it needs high computing resources (a good GPU) for the training process, which entails a high time complexity (ms).
When compared to other baseline methods from the existing literature, the performance of our proposed model in binary tumor classification is remarkable, as shown in Table 4.

Conclusions
This study proposed a hybrid of the GoogLeNet and CNN models, defined as the deep tumor network, for BTs detection. The GoogLeNet model was adopted as the foundation for the proposed model. The final five layers of GoogLeNet were deleted and replaced by 14 layers of the CNN model, each one deeper than the prior one. Furthermore, the ReLU AFs were changed to leaky ReLU AFs, although the basic CNN architecture remained unchanged. The total number of layers increased from 22 to 33 once the changes were implemented. The recommended hybrid model attained the highest classification accuracy of 99.10%. In addition, to identify the BT types, we used the Kaggle brain tumor dataset to train five deep CNN models implementing the TL technique; the results of these models were then compared with those of the proposed model. The outcomes of the investigations indicated that the proposed model was capable of distinguishing brain tumors with greater accuracy. Furthermore, the proposed approach was capable of computing more descriptive and discriminatory information, as well as precise features for brain tumor detection, resulting in a high degree of accuracy compared with existing state-of-the-art techniques. The results of the studies also show clearly that the CNN models using transfer learning methods offered the best potential performance level; in contrast to the other pre-trained models, the hybrid framework achieved the highest level of accuracy.
Furthermore, in future work, we will conduct experiments on a dataset with a limited number of MRI scans of the brain, including malignant lesions and a significant number of normal scans, with the proposed model trying to extract information that is more comprehensive and discriminatory, with precise features. As a result, before categorizing the Kaggle brain dataset into two groups (brain tumor and non-brain tumor), an effective segmentation technique for brain MRI data should be applied. Furthermore, we wish to assess the efficacy of the presented hybrid approach for application to different types of data in biomedical imaging, such as COVID-19, lung disease, and asthma diagnosis.

Data Availability Statement:
The data used to support the findings of this study are available at https://www.kaggle.com/ahmedhamada0/brain-tumour-detection (accessed on 1 October 2022).