1. Introduction
Brain tumors represent one of the most aggressive and life-threatening neurological disorders, and their early detection is crucial for effective treatment and improving patient survival rates. Magnetic Resonance Imaging (MRI) is the most widely used non-invasive diagnostic modality for brain tumor detection owing to its exceptional soft tissue contrast and high-resolution imaging capabilities. However, manual interpretation of MRI scans is time-consuming, highly subjective, and prone to human error. To address these challenges, deep learning (DL) and transfer learning (TL) algorithms have emerged as powerful tools for automated medical image analysis. Transfer learning, in particular, is effective for limited datasets, as pretrained models such as VGG16 and EfficientNetB4 can extract discriminative features from MRI images while reducing computational costs and mitigating overfitting. Recent studies have increasingly applied TL and artificial intelligence (AI) to enhance the diagnostic accuracy of medical imaging systems (Rahman et al., 2025) [
1]. In continuation of these advancements, the present study utilizes the BR35H dataset to evaluate and compare multiple DL-based transfer learning models for the automated detection and classification of brain tumors from MRI images.
Brain tumors are among the leading causes of cancer-related mortality in both children and adults worldwide. Accurate grading of tumors—particularly distinguishing between low-grade and high-grade gliomas—is essential for prognosis and treatment planning. Clinically, tumors are classified based on their cellular origin and whether they are malignant or benign [
2]. Higher-grade tumors tend to exhibit more aggressive and malignant behavior. Diagnosis commonly relies on radiological tools such as Positron Emission Tomography (PET), MRI, and Computed Tomography (CT), which provide detailed anatomical insights. MRI, in particular, has proven highly effective in detecting a wide spectrum of central nervous system abnormalities, including more than 120 types of tumor variations [
3]. Despite its diagnostic strengths, MRI-based systems are limited by their inability to capture real-time physiological changes, which may delay early intervention.
To overcome this limitation, researchers have proposed integrating continuous monitoring through biosensors and physiological signal acquisition. Multimodal data streams—such as vital signs or biosensor-based signals—could provide temporal insights into tumor progression, thereby enhancing AI model adaptability. Additionally, electromagnetic sensing methods, including eddy current detection and dielectric property analysis, have shown potential in identifying tissue abnormalities prior to structural changes in conventional imaging [
4]. By combining pre-imaging physiological data with MRI and DL models, more proactive diagnostic frameworks could be developed. Techniques such as Electrical Impedance Tomography (EIT) have also shown promise in detecting carcinomas through impedance-based measurements, offering a complementary approach to traditional imaging modalities [
5]. Moreover, the integration of surface electromyography (sEMG) with MRI could improve anomaly detection through multimodal data fusion, potentially leading to more robust AI-based medical diagnosis systems.
Biomechanical modeling—such as finite element analysis of tissue deformation—may further enhance tumor growth prediction by incorporating stress-level assessments of surrounding brain tissue. In addition, integrating MRI with real-time ultrasound data could enable dynamic monitoring and improve diagnosis accuracy through enhanced spatial and temporal representation.
Despite significant progress, achieving accurate brain tumor detection from limited MRI datasets remains a major challenge. This study introduces a novel framework that combines EfficientNetB4 with dataset-specific preprocessing, data augmentation, and optimal optimizer selection to improve model generalization and prevent overfitting. Experimental evaluation on the BR35H dataset demonstrates that the proposed approach outperforms traditional CNN and VGG16 architectures in both accuracy and F1-score, while maintaining computational efficiency. The use of stratified k-fold cross-validation further supports generalization; because the BR35H dataset lacks patient identifiers, strict patient-level splitting could not be enforced, a limitation discussed in Section 3.2.
The objectives of this research are as follows: (1) To detect brain tumors from MRI images using deep transfer learning; (2) To compare multiple models for automated tumor classification on the BR35H dataset; (3) To apply data augmentation to address the limited dataset problem; (4) To evaluate model performance using standardized metrics.
To keep the focus on the study's distinctive contributions (model comparison, data augmentation, and high-accuracy MRI-based brain tumor diagnosis), extended background explanations of CNNs, transfer learning, and digital image processing have been kept to a minimum.
The key contributions of this study are summarized as follows: (1) A novel brain tumor detection framework based on EfficientNetB4, enhanced through targeted preprocessing and augmentation tailored for small MRI datasets. (2) Stratified k-fold cross-validation combined with regularization strategies to improve model generalization and reduce overfitting. (3) A comprehensive investigation of optimizers (Adam, Nadam, Adagrad, RMSprop) to support practical model selection. (4) Empirical demonstration that EfficientNetB4 achieves superior results over CNN and VGG16 on the BR35H dataset, achieving an F1-score of 100% and an accuracy of 99.66%. (5) A methodological innovation in fine-tuning pretrained models for limited medical imaging datasets, paving the way for future multimodal and real-time biomedical applications.
The rest of the paper is organized as follows:
Section 2 reviews existing literature;
Section 3 details the proposed methodology;
Section 4 presents experimental results and discussion; and
Section 5 concludes with key findings and potential directions for future research.
2. Literature Review
This section presents a review of MRI-based methods for brain tumor detection, with a focus on image categorization and related approaches using deep learning and machine learning techniques. Medical imaging modalities are generally divided into two categories: anatomical imaging, which visualizes structural information, and functional imaging, which provides insights into metabolic activity. MRI is a widely adopted anatomical imaging technique for identifying brain tumors due to its superior soft tissue contrast and non-invasive nature [
6].
A variety of methods have been developed for brain tumor classification. Deep learning (DL), particularly convolutional neural networks (CNNs), has gained prominence for extracting high-level features from MRI scans. Several studies have explored the ability of CNN-generated deep features to predict patient survival, emphasizing the importance of domain-specific fine-tuning to enhance model performance. For instance, a pretrained CNN model achieved 81% cross-validation accuracy on a publicly available combined dataset. Additionally, hybrid approaches that integrate support vector machines (SVM) for classification and genetic algorithms (GA) for feature extraction have been proposed to improve tissue characterization in MRI-based tumor imaging [
7]. In this framework, extracted visual features are compared with stored feature representations, enabling efficient decision-making. However, selecting optimal feature compositions using GA remains a challenging task due to the high dimensionality of MRI data. The employment of this hybrid technique has shown promising results for MRI brain tumor classification and can support physicians by providing a reliable second opinion during treatment planning [
8].
Furthermore, a three-step methodology for brain tumor identification was proposed in [
9]. In the first step, relevant features are extracted from normalized and transformed MRI images. These features are then used as input to machine learning classifiers for tumor detection. The second step extends this approach by replacing the SVM-based classifier with a Random Forest (RF) framework, resulting in improved classification performance. The findings highlight that deep learning has emerged as a dominant technique in medical imaging, particularly in handling complex computational tasks.
Moreover, the combination of discrete wavelet transform (DWT) and principal component analysis (PCA) has been shown to significantly enhance brain tumor processing and classification accuracy by improving feature extraction and dimensionality reduction [
10,
11]. These advancements underscore the potential of integrating DL-based frameworks with traditional feature extraction techniques to improve diagnostic accuracy in MRI-based brain tumor detection.
Recent research has explored the classification of MRI brain tumor images into benign and malignant categories using deep features combined with machine learning techniques. Although traditional machine learning classifiers trained on deep features show reasonable performance, convolutional neural network (CNN) architectures have consistently demonstrated superior results across multiple studies [
12]. Researchers have developed methodologies that leverage both deep feature extraction and machine learning algorithms, yet CNN-based classifiers—owing to their ability to automatically learn hierarchical and discriminative features—outperform these hybrid models. In particular, adapted deep convolutional neural networks (DCNN) have been shown to provide highly accurate tumor classification results. One study outlined a systematic protocol for analyzing T1-weighted, T2-weighted, and T2-FLAIR MRI scans from 220 patients, demonstrating that the DCNN-based approach effectively distinguishes between benign and malignant brain tumors, thereby reinforcing the clinical utility of deep learning in medical imaging applications [
13].
Digital image processing plays a vital role in medical imaging, as advanced technologies enable the automatic extraction of meaningful features from MRI data for accurate brain tumor diagnosis. Through techniques such as image augmentation, segmentation, and feature extraction, deep learning models can effectively differentiate between benign and malignant tissues with high precision. Digital image processing is generally categorized into four main subfields: image enhancement, image restoration, image analysis, and image compression. Among these, heuristic and analytical methods are frequently employed to process medical images, enabling the automatic extraction of diagnostic information from imaging data [
14].
Image analysis specifically encompasses a range of techniques, including image segmentation, edge detection, texture analysis, and motion analysis, all of which support the identification of tumor-specific patterns. Machine learning methods have further highlighted several critical research directions in medical imaging. The transition toward digitized medical records has been facilitated by machine learning algorithms, allowing for improved accuracy, accessibility, data maintenance, sharing efficiency, reliability, privacy, and cost-effectiveness.
However, medical image processing still faces considerable challenges, particularly due to insufficient and imbalanced training datasets. Data scarcity, uneven class distribution, and limited sample availability reduce the generalizability of models and hinder performance in real-world clinical environments [
15,
16]. Addressing these challenges is therefore essential for developing robust and reliable AI-based diagnostic systems for brain tumor detection.
Despite the inclusion of a large number of wavelengths (1500), the primary objective of the referenced study was to conduct a comparative analysis of different models against previous related work, with the aim of achieving superior segmentation and classification performance. The researchers introduced an MRI-based brain tumor detection methodology that integrates several preprocessing and classification techniques [
17]. Initially, Otsu’s thresholding algorithm was employed to determine the optimal threshold and preprocess the MRI images. Subsequently, K-means clustering was applied to isolate malignant regions within the images.
Feature extraction was performed using discrete wavelet transform (DWT) and Gabor wavelets, followed by dimensionality reduction using principal component analysis (PCA) to retain the most informative features while reducing computational complexity. Finally, a support vector machine (SVM) classifier was used to determine whether the tumor was benign or malignant.
Moreover, to address the complexities of multimodal MRI data, the authors proposed a novel cross-modality deep learning framework for brain tumor segmentation. The U-Net architecture—a widely recognized model in medical image segmentation—was utilized to accurately segment tumor regions across different MRI modalities [
18,
19,
20,
21]. This approach highlights the potential of combining traditional feature-based methods with modern deep learning architectures to improve the precision and efficiency of brain tumor diagnosis.
To address the challenge of limited training data, several researchers have employed autoencoders to facilitate automated segmentation of brain tumors from 3D MRI images. This approach has demonstrated effective and safe segmentation by leveraging multi-resolution analysis for medical volume reconstruction, producing accurate and reliable segmented outputs. Unlike traditional 2D image processing, the use of 3D MRI volumes allows for more precise tumor localization; however, it also increases computational complexity due to higher-dimensional data processing requirements.
Despite these advances, current methodologies still exhibit notable limitations. Many existing approaches perform suboptimally in recognition and classification tasks due to their reliance on manually delineated tumor regions prior to classification, which introduces variability and limits automation [
22,
23,
24]. The primary issue arises from the data-intensive nature of convolutional neural networks (CNNs) and deep learning models. Achieving high performance typically requires a large volume of well-labeled training images—a process that is both time-consuming and resource-intensive, particularly in medical imaging, where expert annotation is required.
Transfer learning methods offer a potential solution by enabling the use of pretrained models; however, their effectiveness depends heavily on the similarity between the target task (e.g., brain tumor detection) and the source task for which the model was originally trained. Significant task disparity may lead to reduced diagnostic accuracy. Additionally, many existing datasets suffer from class imbalance, where the number of samples for each tumor type varies substantially. This imbalance leads to biased training and reduced classification reliability, particularly in CNN- and transfer learning-based frameworks [
25,
26]. Overall, these challenges underscore the need for more robust models that can perform accurately on limited and imbalanced medical datasets, while minimizing dependency on manual feature extraction and large-scale annotation.
This study introduces a novel and comprehensive framework for the identification and classification of brain tumors using MRI data. The proposed methodology is designed to be seamlessly integrated into existing MRI-based diagnostic systems and addresses critical challenges associated with small and imbalanced medical datasets. Specifically, the framework employs two distinct deep learning models. The first model is a state-of-the-art generative model that captures the distribution of relevant features within a class-imbalanced dataset, enabling the automatic conversion of limited and imbalanced data into larger, balanced datasets suitable for training. The second model functions as a classification network trained on the newly generated, class-balanced data to accurately identify brain tumors from MRI images. This dual-model strategy distinguishes our approach from previous studies, which typically utilize a single model for both feature extraction and classification.
To tackle visual recognition tasks, the framework leverages convolutional neural network (CNN) architectures, including variants inspired by AlexNet. One of the primary barriers to deploying deep learning in medical imaging is the scarcity of labeled training data [
27]. This limitation is effectively mitigated through data-augmentation techniques, which increase the dataset size by generating additional labeled images and consequently enhance model accuracy. Furthermore, transfer learning using CNN-based models facilitates autonomous feature learning without requiring domain experts, making it particularly valuable for analyzing and interpreting MRI brain scans. These capabilities have positioned CNNs as a cornerstone of data-driven decision-making in medical diagnosis.
The literature reveals a steady increase in CNN-based medical imaging research since 2014, especially in the context of brain tumor detection. To further contribute to this domain, our study also provides access to a publicly available MRI brain tumor dataset, thereby supporting ongoing research and encouraging the development of new automated diagnostic models. Additionally, the proposed framework incorporates sequential minimal optimization (SMO) to train a support vector machine (SVM) classifier capable of distinguishing malignant tumor types such as metastatic bronchogenic carcinoma, glioblastoma, and sarcoma [
28,
29]. Collectively, the findings demonstrate the effectiveness of combining generative modeling, data augmentation, and transfer learning to advance automated brain tumor detection and classification.
Convolutional Neural Networks (CNNs), a prominent class of deep learning models, have revolutionized medical image analysis and brain tumor detection. Early CNN architectures relied on manually engineered feature extraction combined with shallow classification layers. However, more advanced models—such as AlexNet, VGGNet, and ResNet—introduced deeper convolutional hierarchies capable of automatically learning discriminative features directly from raw MRI data. Through transfer learning and the utilization of large-scale image datasets, recent research has substantially improved classification performance, even when trained on limited medical data. These advancements have positioned CNNs as powerful tools for automated brain tumor diagnosis, surpassing traditional machine learning approaches based on handcrafted features and rule-based systems.
Despite these advancements, several challenges persist. Many CNN-based models are not designed for real-time diagnostic support, which limits their integration into clinical workflows. The incorporation of real-time data processing and connectivity with Internet of Medical Things (IoMT) devices could enable continuous patient monitoring and earlier intervention [
30,
31,
32]. Furthermore, deep models often suffer from overfitting, dataset imbalance, and limited cross-institutional generalizability, which undermine their robustness. To address these issues, recent studies have proposed hybrid deep learning architectures that integrate CNNs with transformer-based or recurrent layers to enhance contextual understanding and improve resilience against variability in imaging data [
33,
34].
The central research gap addressed in this study lies in the development of a reliable, accurate, and computationally efficient CNN-based framework capable of real-time brain tumor detection from MRI scans. By evaluating advanced transfer learning models such as VGG16 and EfficientNetB4 on the BR35H dataset, this work extends existing research and overcomes several previously identified limitations, thereby contributing to the advancement of automated medical diagnosis systems.
3. Materials and Methods
This section presents the methodology adopted in the study. The aim is to develop an automated system for the detection and classification of brain tumors from MRI images. Accordingly, a structured framework was designed to integrate artificial intelligence (AI) into diagnostic procedures using convolutional neural networks (CNNs), VGG16, and EfficientNetB4 as the core models for tumor classification. The proposed methodology is organized into a series of well-defined phases that reflect the logical progression of AI-based diagnostic modeling: data acquisition from the BR35H dataset, image preprocessing and augmentation, feature extraction with pretrained CNNs, classification, fine-tuning, and performance evaluation.
Each phase plays a critical role in enhancing diagnostic accuracy and ensuring the robustness of the system. The subsequent subsections provide a detailed explanation of these components alongside the system architecture used for brain tumor categorization.
3.1. Implementation Framework
In this study, an automated brain tumor detection and classification system was developed using CNN, VGG16, and EfficientNetB4 models. The framework was implemented in Python 3.9.25 employing TensorFlow and Keras, which facilitated efficient model development, ensured reproducibility, and enabled GPU acceleration for large-scale experimentation. The overall methodology is illustrated in
Figure 1. Although the present work primarily focuses on static MRI-based imaging, the proposed architecture possesses the potential to incorporate dynamic physiological data streams. Such integration would allow AI systems to analyze temporal patterns in biosensor data, enabling early detection of pathological changes and facilitating continuous monitoring of disease progression.
Recent advancements in photon-based signal acquisition have shown promise in enhancing MRI quality under low-signal conditions, which may further improve the diagnostic performance of AI-based models. While no specialized equipment was utilized in the current study, future research could leverage these techniques to expand the MRI dataset and improve image reliability. Studies have indicated that photon-based signal augmentation considerably enhances dataset robustness for deep learning applications [
35].
Furthermore, the proposed system can be extended to incorporate additional biological signals—such as electromyography (EMG)—to enable multimodal data fusion. Integrating MRI with complementary physiological signals may improve diagnostic generalizability, enhance classification accuracy, and facilitate the development of more comprehensive AI-driven medical diagnostic systems.
3.2. Dataset Description
The BR35H dataset from Kaggle [
36] consists of 3000 pre-labeled MRI brain scans, equally divided into 1500 tumor images and 1500 normal samples. This dataset has been widely utilized for supervised learning and classification tasks. A detailed overview of the dataset, including representative samples of benign and malignant MRI images, is presented in
Table 1. However, a key limitation of the BR35H dataset is that it provides images only at the slice level, without metadata regarding the number of unique patients. Consequently, each MRI slice was treated as an independent data point during model development.
For training, validation, and testing, the dataset was partitioned using an 80/10/10 split. Additionally, to mitigate overfitting and enhance model generalizability, 5-fold cross-validation (k = 5) was applied to the training set. In clinical imaging studies, patient-level splitting is essential to avoid data leakage, as MRI slices from the same individual may exhibit similar anatomical features. However, due to the absence of patient identifiers in the BR35H dataset, strict patient-level separation could not be implemented.
Dataset Splitting
The dataset was partitioned into three subsets: 80% for training, 10% for validation, and 10% for testing. To ensure robust model generalization and reduce overfitting, 5-fold cross-validation (k = 5) was applied to the training portion of the dataset. Furthermore, several data-augmentation techniques—such as translation, rotation, and horizontal flipping—were employed to artificially expand the training set, enhance model robustness, and increase the effective size and diversity of the dataset.
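For clarity, the splitting procedure can be sketched as follows. This is a minimal illustration assuming the images and labels have already been loaded as NumPy arrays; the function and variable names are placeholders, not the study's actual code.

```python
import numpy as np
from sklearn.model_selection import train_test_split, StratifiedKFold

# images: array of shape (N, 224, 224, 3); labels: array of shape (N,) with 0 = normal, 1 = tumor.
def split_dataset(images, labels, seed=42):
    # Hold out 20% of the data, then split it evenly into validation and test (80/10/10 overall).
    x_train, x_hold, y_train, y_hold = train_test_split(
        images, labels, test_size=0.2, stratify=labels, random_state=seed)
    x_val, x_test, y_val, y_test = train_test_split(
        x_hold, y_hold, test_size=0.5, stratify=y_hold, random_state=seed)
    return x_train, y_train, x_val, y_val, x_test, y_test

def kfold_indices(y_train, k=5, seed=42):
    # 5-fold cross-validation folds built only from the training portion.
    skf = StratifiedKFold(n_splits=k, shuffle=True, random_state=seed)
    return list(skf.split(np.zeros((len(y_train), 1)), y_train))
```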
3.3. Image Preprocessing
Data preparation is a critical component of medical image analysis, as it directly influences model accuracy and robustness. In this study, MRI images underwent a series of preprocessing steps—including contour-based cropping, dilation, and erosion—to reduce noise and artifacts and to enhance classifier performance. Cropping was conducted using a contour detection system that identifies extreme points and curves to isolate the region of interest.
Figure 2 presents representative examples of cropped brain tumor images obtained using parameter-based calculations.
Although the present framework is limited to MRI-based analysis, it has the potential to incorporate additional biomechanical features and multimodal data in future research, such as integrating ultrasound imaging to improve diagnostic precision. As illustrated in
Figure 3, raw MRI images from the BR35H dataset were imported and subjected to preprocessing. Initially, all RGB images were converted to grayscale, followed by binary conversion through thresholding. Subsequently, dilation and erosion operations were applied to minimize minor interferences and enhance structural boundaries. Contours were then detected on the thresholded images, and the largest contour was selected to determine the extrema points, which were used to crop the final region of interest.
Since MRI scans in the dataset varied in spatial resolution, all images were standardized to a uniform size of 224 × 224 × 3 to ensure compatibility across all deep learning models utilized in this study. This normalization step is essential for maintaining consistency during training and optimizing model performance.
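A minimal OpenCV sketch of this preprocessing chain is given below; the threshold value, smoothing kernel, and iteration counts are illustrative assumptions rather than the exact settings used in the study.

```python
import cv2

def crop_brain_region(image_bgr, output_size=(224, 224)):
    # Grayscale conversion followed by binary thresholding (threshold value is illustrative).
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    _, thresh = cv2.threshold(blurred, 45, 255, cv2.THRESH_BINARY)
    # Erosion and dilation suppress small interferences and reinforce structural boundaries.
    thresh = cv2.erode(thresh, None, iterations=2)
    thresh = cv2.dilate(thresh, None, iterations=2)
    # Detect contours, keep the largest one, and locate its extreme points.
    contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    c = max(contours, key=cv2.contourArea)
    left = tuple(c[c[:, :, 0].argmin()][0])
    right = tuple(c[c[:, :, 0].argmax()][0])
    top = tuple(c[c[:, :, 1].argmin()][0])
    bottom = tuple(c[c[:, :, 1].argmax()][0])
    # Crop the region of interest and resize to the uniform 224 x 224 x 3 input size.
    cropped = image_bgr[top[1]:bottom[1], left[0]:right[0]]
    return cv2.resize(cropped, output_size)
```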
Data Augmentation
Data augmentation is a crucial strategy in medical imaging research, particularly when working with limited datasets. In this study, the dataset comprises 3000 MRI images, which is insufficient for training deep learning models that typically require large amounts of data to achieve high generalization capability. To address this limitation, data augmentation was applied to artificially expand the training set and reduce the risk of overfitting. Prior studies have demonstrated that augmentation strategies significantly improve model robustness and mitigate the tendency of deep learning models to overfit small datasets.
In addition to conventional augmentation techniques—such as translation, horizontal flipping, and rotation—emerging hybrid frameworks that combine Finite Element Method (FEM) simulations with AI-based augmentation have shown promising potential. These approaches can simulate variations in tumor morphology, tissue deformation, and imaging conditions, thereby enriching the dataset with physiologically realistic information. Integrating FEM simulations into AI pipelines has been shown to enhance classification performance by modeling real-world variations in tumor size, shape, and intensity [
37].
Table 2 presents the augmentation strategies used in this study, including rotation, translation, and horizontal flipping, which were employed to generate additional training samples and improve model generalizability. The literature consistently reports that data augmentation contributes to increased classification accuracy and improved diagnostic reliability in brain tumor detection tasks.
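As a rough illustration, the augmentation strategies of Table 2 can be expressed with the Keras ImageDataGenerator; the numeric ranges below are placeholders, and the actual values are those listed in Table 2.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Rotation, translation, and horizontal flipping as in Table 2; ranges are placeholders.
train_datagen = ImageDataGenerator(
    rotation_range=15,        # random rotation (degrees)
    width_shift_range=0.1,    # horizontal translation (fraction of width)
    height_shift_range=0.1,   # vertical translation (fraction of height)
    horizontal_flip=True,     # random horizontal flipping
    rescale=1.0 / 255.0)      # intensity scaling to [0, 1]

# Augmentation is applied only to the training split, e.g.:
# train_generator = train_datagen.flow(x_train, y_train, batch_size=32)
```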
3.4. Feature Extraction Using Pretrained CNN
Convolutional Neural Networks (CNNs) extract hierarchical features from MRI images through a sequence of convolutional and pooling layers, followed by fully connected layers for classification. The architectural design of CNNs enables the automatic learning of complex, high-level representations that cannot be effectively captured using traditional neural networks. Owing to their built-in filters and minimal preprocessing requirements, CNNs serve as the foundation of modern computer vision and have been widely applied in object recognition, surveillance, and medical imaging.
In this study, pretrained CNN architectures were employed to extract discriminative features from MRI scans for brain tumor classification. While the primary focus is on deep transfer learning-based MRI analysis, future research could extend this framework to incorporate real-time physiological monitoring—such as tissue temperature variations or specific absorption rate (SAR) measurements from wearable or IoMT-based devices. The integration of multimodal biosignals with MRI data has the potential to enhance diagnostic accuracy, provide richer clinical insights, and enable earlier detection of abnormal tissue activity.
Current MRI-based diagnostic systems primarily detect tumors only after structural abnormalities become visible in imaging, which may delay timely intervention. However, advancements in non-invasive electromagnetic sensing—such as eddy current monitoring and dielectric property measurement—have demonstrated the ability to detect biological anomalies before morphological changes appear on conventional scans. The fusion of such pre-imaging physiological data with MRI and deep learning pipelines could therefore establish a proactive diagnostic paradigm, improving both early detection and classification performance.
The main objective of this study is to classify brain tumors by exploiting spatial features embedded in MRI imagery. Nonetheless, temporal information derived from sequential or functional imaging could further enhance model performance by capturing physiological changes over time. Hybrid spatiotemporal AI models have already shown improved robustness and predictive accuracy in medical imaging. Thus, future investigations should consider integrating temporal data into transfer learning frameworks to facilitate earlier diagnosis and more comprehensive tumor evaluation.
A typical CNN architecture is illustrated in
Figure 4. Input images are first processed through convolutional and pooling layers to extract deep features, followed by dense layers that perform final classification.
3.5. Classification Models
3.5.1. Convolutional Neural Network (CNN)
Convolutional Neural Networks (CNNs) are a class of deep learning models designed to process visual data by mimicking certain aspects of human perception. They have become fundamental in computer vision tasks, including image classification, segmentation, and medical image analysis. CNNs automatically learn discriminative features from input images through a hierarchical structure of layers, requiring significantly less manual preprocessing compared to traditional machine learning methods.
A CNN is composed of convolutional layers, pooling layers, activation functions, and fully connected layers. In convolutional layers, a set of neurons (filters) scans local regions of the input image, capturing spatial patterns such as edges, textures, and shapes. Each neuron’s output—known as its activation—is determined by the surrounding pixels within its receptive field. As the network deepens, successive layers detect increasingly abstract and complex features, forming a hierarchical representation of the input data.
The efficiency and effectiveness of CNNs depend on several factors, including the number of layers, filter size, activation function, normalization strategy, and architectural configuration. These hyperparameters are typically determined empirically through systematic experimentation and optimization. When properly trained, CNNs outperform traditional filtering-based methods by learning robust and generalized feature representations from large datasets.
The architecture of a single-layer CNN is illustrated in
Figure 5, highlighting the fundamental components involved in feature extraction and classification. This foundation serves as the basis for more advanced models such as VGG16, ResNet, and EfficientNet, which build upon the core principles of convolutional feature learning to achieve state-of-the-art performance in medical imaging applications.
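To make the layer stack concrete, a minimal Keras sketch of such a baseline CNN is shown below; the layer counts, filter sizes, and dense-layer width are illustrative assumptions and do not reproduce the exact architecture evaluated in Section 4.1.

```python
import tensorflow as tf

def build_baseline_cnn(input_shape=(224, 224, 3), num_classes=2):
    # Convolution and pooling layers extract features; dense layers perform classification.
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, (3, 3), activation="relu", input_shape=input_shape),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
```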
3.5.2. Transfer Learning
Transfer Learning (TL) enhances feature extraction and classification performance on small MRI datasets by leveraging pretrained models that were originally trained on large-scale datasets such as ImageNet. Instead of training a deep learning model from scratch, TL utilizes the learned representations of a pretrained network and adapts them to a new task—in this case, brain tumor detection. The ImageNet-trained CNN models, having been exposed to millions of diverse images, possess the ability to capture generalizable visual patterns such as edges, textures, and spatial relationships, which can be effectively transferred to medical imaging tasks. When the available dataset is limited, as is common in medical applications, training deep models from scratch often results in overfitting and poor generalization. Transfer learning mitigates this issue by enabling faster convergence, reducing the need for large training datasets, and improving model accuracy. In recent years, TL has been successfully applied to a wide range of tasks, including medical image classification, object detection, segmentation, and diagnostic automation.
Overall, the use of transfer learning provides several key benefits:
accelerated training,
reduced risk of overfitting,
improved performance on small datasets, and
efficient utilization of computational resources.
These advantages make TL an ideal strategy for MRI-based brain tumor detection, especially when annotated medical data are scarce.
3.5.3. VGG16
As illustrated in
Figure 6, VGG16 is a widely adopted convolutional neural network architecture consisting of 16 layers, primarily used for image classification tasks. Its structure is characterized by sequential blocks of 3 × 3 convolutional layers, followed by max-pooling layers and fully connected layers, enabling efficient extraction of hierarchical image features. In this study, transfer learning was applied using the pretrained VGG16 model from ImageNet to extract discriminative features from MRI brain scans.
The popularity of VGG16 arises from its architectural simplicity, ease of implementation, and publicly available pretrained weights, which facilitate integration into various computer vision and medical imaging applications. The model expects an input image of size 224 × 224 × 3, corresponding to height, width, and color channels. During feature extraction, the network evaluates whether to activate a neuron based on the weighted sum of inputs and the chosen activation function, allowing VGG16 to automatically learn meaningful spatial representations.
Despite being surpassed by more advanced architectures in recent years, the simplicity and reliability of VGG16 make it an effective baseline model for medical image classification tasks—particularly when used in conjunction with transfer learning and MRI-based datasets.
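A minimal sketch of VGG16 used as a frozen ImageNet feature extractor is given below; the classification head shown here is an illustrative choice rather than the exact head used in the experiments.

```python
import tensorflow as tf

# VGG16 backbone with ImageNet weights; convolutional blocks are frozen for feature extraction.
base = tf.keras.applications.VGG16(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False

inputs = tf.keras.Input(shape=(224, 224, 3))
x = tf.keras.applications.vgg16.preprocess_input(inputs)
x = base(x, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(2, activation="softmax")(x)  # illustrative two-class head
vgg16_classifier = tf.keras.Model(inputs, outputs)
```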
3.5.4. EfficientNetB4 Network Architecture
EfficientNet is a family of deep learning models ranging from B0 to B7, designed to provide high accuracy with significantly reduced computational cost. These models were trained on the ImageNet dataset and are widely recognized for their scalability and efficiency in feature extraction and classification tasks. The EfficientNet architecture employs a compound scaling strategy, which systematically scales depth, width, and resolution of the network using a single optimized coefficient. This method enables the expansion of the base model into larger variants while maintaining computational efficiency and improving predictive performance. EfficientNetB4, used in this study, consists of a series of Mobile Inverted Bottleneck Convolution (MBConv) blocks with varying kernel sizes, typically ranging from 3 × 3 to 5 × 5. These MBConv layers play a crucial role in reducing parameters while preserving feature richness, making the architecture well-suited for medical image analysis. As illustrated in
Figure 7, EfficientNet’s compound scaling approach outperforms traditional scaling methods by simultaneously increasing model depth, width, and resolution in a balanced manner.
d = α^φ, w = β^φ, r = γ^φ,
where d is depth, w is width, r is resolution, φ is the compound scaling coefficient, and α, β, γ are constants determined by grid search.
By utilizing EfficientNetB4 within a transfer learning framework, this study leverages its computational efficiency and strong feature extraction capabilities to enhance MRI-based brain tumor classification performance. The compound scaling coefficient φ governs how EfficientNet scales its depth, width, and resolution while considering available computational resources. The baseline model, EfficientNetB0, is initialized with scaling parameters φ = 0, w = 1, d = 1, and r = 1, representing equal allocation of depth, width, and resolution. This model employs MBConv1 and MBConv6 blocks, which are central to the architecture’s efficiency and feature extraction capabilities. As the value of φ increases, the network scales accordingly. For instance, EfficientNetB3, configured with φ = 3 and correspondingly larger depth, width, and resolution, requires higher computational resources but delivers improved predictive performance. It incorporates multiple MBConv6 operations utilizing inverted residual blocks and advanced activation functions, enabling richer feature representation. Compared to the baseline version, EfficientNetB3 introduces a deeper architecture, enhanced feature diversity, and broader adaptability to complex classification tasks. The increased computational cost—measured in terms of floating-point operations (FLOPs)—is counterbalanced by improved accuracy and parameter efficiency. Consequently, EfficientNetB3 demonstrates strong performance in classification tasks by effectively capturing salient and discriminative features. The scaling behavior of the base EfficientNet model is illustrated in
Figure 7, highlighting the relationship between model size and performance.
3.6. Mathematical Modeling of Deep Learning Framework
Deep learning models for brain tumor classification operate on an input image denoted as A ∈ R^(h×w×c), where h, w, and c represent the height, width, and number of channels of the MRI image, respectively. Each input image is associated with a class label H ∈ {0, 1}, where 0 denotes a non-tumor case and 1 represents a tumor case. The objective of the model is to learn a discriminative function f(A) → H that maps an MRI image to its correct class using learned parameters.
This task is carried out by the convolutional layers of the convolutional neural network (CNN), each of which computes
F_l = σ(E_l ∗ F_(l−1) + C_l), with F_0 = A.
This mapping is achieved through a series of convolutional layers, where each layer extracts hierarchical features by applying convolutional filters to localized regions of the input. These layer-wise transformations progressively encode spatial and structural information, enabling robust classification of MRI scans into tumor and non-tumor categories. In this framework, σ denotes the activation function—typically the Rectified Linear Unit (ReLU)—while E and C represent the learnable weights and biases, respectively. Following a series of convolution and pooling operations that progressively reduce the spatial dimensions of the feature maps, fully connected layers are employed for final classification. These layers utilize the SoftMax function to generate class probabilities, enabling the model to assign each MRI image to the appropriate diagnostic category.
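For completeness, the SoftMax output and the cross-entropy objective referred to here (and in Section 3.10) can be written explicitly. This is a standard restatement using the symbols above; the logits z = (z_0, z_1) and one-hot labels y are assumed notation rather than symbols defined in the original text.

```latex
% SoftMax over the two output logits z = (z_0, z_1) yields the class probabilities:
P(H = j \mid A) = \frac{e^{z_j}}{e^{z_0} + e^{z_1}}, \qquad j \in \{0, 1\}.

% Training minimizes the categorical cross-entropy against the one-hot label y:
\mathcal{L} = -\sum_{j \in \{0,1\}} y_j \, \log P(H = j \mid A).
```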
The VGG16 architecture employs a series of stacked 3 × 3 convolutional layers, allowing it to effectively capture and represent hierarchical spatial features within the input image. In contrast, the EfficientNetB4 model introduces a compound scaling strategy that uniformly adjusts the input resolution (r), network depth (d), and width (w) using a scaling coefficient φ. This approach enables more efficient and balanced model scaling, resulting in improved accuracy and computational efficiency for classification tasks.
This scaling is applied subject to the constraint α · β² · γ² ≈ 2, with α ≥ 1, β ≥ 1, and γ ≥ 1.
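As a brief worked illustration, using the constants reported for the original EfficientNet family (α = 1.2, β = 1.1, γ = 1.15, taken from that work rather than re-derived here), setting φ = 1 approximately doubles the FLOPs of the baseline model:

```latex
% Constants from the original EfficientNet work; illustrative only.
\alpha\,\beta^{2}\gamma^{2} \approx 1.2 \times 1.1^{2} \times 1.15^{2}
                           \approx 1.2 \times 1.21 \times 1.3225 \approx 1.92 \approx 2,
\qquad d \approx 1.2,\quad w \approx 1.1,\quad r \approx 1.15 \quad (\phi = 1).
```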
Due to its balanced compound scaling strategy, EfficientNetB4 achieves reliable and efficient classification of MRI images, delivering higher accuracy while utilizing fewer parameters compared to conventional deep learning architectures.
3.7. Fine-Tuning with Global Average Pooling
Global Average Pooling (GAP) plays a critical role in reducing the spatial dimensions of feature maps while preserving essential semantic information. By aggregating the average value of each feature map, GAP significantly decreases the number of trainable parameters, thereby enhancing model efficiency and mitigating overfitting. This dimensionality reduction facilitates more robust classification while maintaining the discriminative power of extracted features. In TensorFlow and Keras, the GAP operation is implemented using the tf.keras.layers.GlobalAveragePooling2D() layer. For example, a four-dimensional tensor of shape (1, 4, 4, 3) can be transformed into a 2-dimensional tensor of shape (1, 3), resulting in a compact feature vector suitable for input into fully connected or classification layers. This approach contributes to both computational efficiency and improved model generalization in deep learning architectures applied to MRI-based tumor classification.
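The shape reduction described above can be verified with a few lines of TensorFlow; the random tensor simply stands in for a batch of feature maps.

```python
import tensorflow as tf

# A (1, 4, 4, 3) feature map is averaged per channel into a (1, 3) vector.
feature_maps = tf.random.uniform((1, 4, 4, 3))
pooled = tf.keras.layers.GlobalAveragePooling2D()(feature_maps)
print(pooled.shape)  # (1, 3)
```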
3.8. Fine-Tuning with Dropout Layer
To accelerate training and prevent overfitting, a dropout layer was incorporated into the EfficientNetB4 architecture. Due to the large number of parameters generated by its deep structure, the model is prone to overfitting, particularly when trained on limited datasets. The dropout technique mitigates this issue by randomly deactivating a subset of neurons during training, thereby reducing reliance on specific features and encouraging the network to learn more generalized representations.
During inference, the dropout mechanism effectively approximates the ensemble behavior of multiple sparse subnetworks, allowing predictions to be made using the full network with scaled weights. In this study, a dropout rate of 0.2 was selected to balance regularization with model performance. Experimental results demonstrate that incorporating a dropout layer enhances the robustness of the EfficientNetB4 model and improves classification accuracy, particularly for medical image analysis tasks such as brain tumor detection.
3.9. Fine-Tuning Fully Connected Layer
In this stage, all feature vectors generated by the previous layers, including those refined by dropout, are aggregated and passed to the final classification layers. Since the pretrained EfficientNetB4 model was originally trained on the ImageNet dataset, its default output layer predicts 1000 classes. To adapt the model to the binary classification task of distinguishing between malignant and benign lesions, the top layer was modified and fine-tuned to output only two classes.
A SoftMax activation function was employed in the final layer to compute the class probabilities for each input image. The difference between predicted and true labels was quantified using an appropriate loss function, enabling the model to optimize its parameters during training. In implementation, the final dense layer was constructed using tf.keras.layers.Dense(), while the modified end-to-end architecture was defined using tf.keras.Model() to integrate input and output layers seamlessly.
This fine-tuning approach enables the EfficientNetB4 architecture to be effectively repurposed for medical classification tasks, enhancing diagnostic accuracy through transfer learning.
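Putting Sections 3.7, 3.8, and 3.9 together, a minimal sketch of the fine-tuned classification head is shown below; freezing the backbone during the initial training stage is an assumption made for illustration, not a statement of the study's exact training schedule.

```python
import tensorflow as tf

# EfficientNetB4 backbone with ImageNet weights, topped by GAP -> Dropout(0.2) -> Dense(2, softmax).
base = tf.keras.applications.EfficientNetB4(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # assumed frozen for the first training stage

inputs = tf.keras.Input(shape=(224, 224, 3))
x = base(inputs, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)               # Section 3.7
x = tf.keras.layers.Dropout(0.2)(x)                           # Section 3.8
outputs = tf.keras.layers.Dense(2, activation="softmax")(x)   # Section 3.9
model = tf.keras.Model(inputs, outputs)
```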
3.10. Regularization and Optimization Techniques
To prevent overfitting during model training, several regularization and optimization strategies were employed. L2 regularization was applied to constrain the magnitude of learned weights, thereby reducing model complexity and discouraging over-reliance on specific features. Additionally, batch normalization and global average pooling were integrated into the architecture to stabilize training and further enhance generalization.
Multiple overfitting mitigation techniques were used throughout the preprocessing and training phases. Data augmentation was first applied to increase dataset variability and improve the robustness of the learning process. Early stopping was implemented to halt training when validation performance ceased to improve, preventing divergence and unnecessary parameter updates. Dropout layers were also incorporated to randomly deactivate a subset of neurons during training, which both accelerates training and significantly reduces overfitting.
For loss minimization, categorical cross-entropy was utilized as the loss function. To optimize the model parameters, four state-of-the-art optimization algorithms—Adam, Nadam, Adagrad, and RMSprop—were employed. These optimizers were systematically compared to determine their effectiveness in MRI-based brain tumor classification and to identify the most suitable technique for the proposed diagnostic framework.
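The optimizer comparison can be organized as in the sketch below. Here build_model, train_ds, and val_ds are placeholders (for example, the EfficientNetB4 head from Section 3.9 and the augmented data splits), and the learning rate, patience, and epoch count are illustrative assumptions rather than the study's reported settings.

```python
import tensorflow as tf

optimizers = {
    "adam": tf.keras.optimizers.Adam(learning_rate=1e-4),
    "nadam": tf.keras.optimizers.Nadam(learning_rate=1e-4),
    "adagrad": tf.keras.optimizers.Adagrad(learning_rate=1e-4),
    "rmsprop": tf.keras.optimizers.RMSprop(learning_rate=1e-4),
}

# Early stopping halts training once validation performance stops improving.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True)

histories = {}
for name, opt in optimizers.items():
    model = build_model()  # placeholder: e.g., the EfficientNetB4 head sketched in Section 3.9
    model.compile(optimizer=opt, loss="categorical_crossentropy", metrics=["accuracy"])
    histories[name] = model.fit(train_ds, validation_data=val_ds,
                                epochs=50, callbacks=[early_stop])
```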
4. Results and Discussion
This study aimed to diagnose brain tumor conditions using MRI brain images through the implementation of deep learning-based classification models. Three architectures—CNN, VGG16, and EfficientNetB4—were trained and evaluated to assess their diagnostic performance. The results obtained from these models are analyzed and discussed in this section, with a focus on classification accuracy, generalization capability, and the effectiveness of transfer learning in handling medical imaging data.
4.1. Results of CNN Model
A series of experimental evaluations was conducted to assess the performance of the proposed CNN model. All experiments were implemented in a Python environment with GPU support to ensure computational efficiency. Prior to model training, MRI images underwent preprocessing using max–min normalization to enhance contrast and improve feature visibility.
As illustrated in
Figure 8, the CNN model achieved an accuracy of 97% on the training set and 94.78% on the test set, demonstrating strong generalization capability. Details regarding the training configuration—including the chosen optimizer, activation functions, and number of epochs—are provided in the Methods section. Overall, the CNN model exhibited stable convergence and reliable performance in detecting brain tumors from MRI images.
Figure 9 illustrates the training and validation performance of the CNN model during the final four epochs. A noticeable decline in performance is observed, with the validation accuracy stabilizing at approximately 83% while the loss remains relatively high at 0.96. Moreover, the combination of plateauing validation accuracy and persistently high validation loss suggests potential issues with model generalization, possibly indicating overfitting or suboptimal parameter tuning. Overall, the training history reflects a negative performance trend during the later stages of training, emphasizing the need for further optimization to achieve stable convergence.
4.2. Results of the VGG16 Model
Prior to training the VGG16 model, the dataset underwent preprocessing steps that included threshold-based image cropping, resizing, and data augmentation to increase dataset variability and improve model generalization. The pretrained ImageNet weights were used as the base model for transfer learning, enabling efficient feature extraction from MRI brain images.
The VGG16 model was trained using the following hyperparameters.
The proposed framework employs a modified Deep Convolutional Neural Network (DCNN) architecture capable of accurately classifying MRI brain images into benign and malignant categories. In addition, a Deep Learning-based Opposition Crow Search (DL-OCS) optimization technique was explored to enhance classification efficiency and improve parameter tuning.
As illustrated in
Figure 10, the training history of the VGG16 model demonstrates a positive trend, with increasing accuracy and decreasing loss over time. The model achieved an overall training accuracy of 93% and a validation accuracy of 89%, indicating strong performance when compared to the sequential neural network baseline.
4.3. Results of EfficientNetB4
Figure 11 illustrates the accuracy and loss curves for the EfficientNetB4 model. The results show a clear positive training trend, characterized by steadily increasing accuracy and consistently decreasing loss over successive epochs. The model achieved an overall training accuracy of 99%, while the validation accuracy likewise reached 99%, demonstrating excellent generalization and stability during training. These results significantly outperform those obtained using the sequential neural network and VGG16 models, highlighting the robustness of EfficientNetB4 in handling MRI-based brain tumor classification tasks. The compound scaling strategy and advanced feature extraction capabilities of EfficientNetB4 appear to contribute to its superior performance, confirming its suitability for medical image analysis.
The EfficientNetB4 model demonstrated outstanding performance, achieving an F1-score of 100% on the test set. Out of 300 test images, only one instance was misclassified, indicating exceptionally high precision and recall.
Figure 12 presents a sample prediction output of the model, further illustrating its strong capability in distinguishing between tumor and non-tumor MRI images.
4.4. Comparison and Visual Evaluations
This section presents a comparative analysis of the deep learning models evaluated in this study, namely CNN, VGG16, and EfficientNetB4. As shown in the Classification Report Heatmap in
Figure 13, the EfficientNetB4 model achieved perfect performance, obtaining precision, recall, and F1-score values of 1.00 for both tumor (YES) and non-tumor (NO) categories using 999 test samples. These results indicate that the model correctly classified all instances without any misclassifications, demonstrating 100% classification accuracy.
The superior performance of EfficientNetB4 highlights the effectiveness of its compound scaling strategy and advanced feature extraction capabilities, proving its advantage over traditional CNN and VGG16 architectures in MRI-based brain tumor detection. The Confusion Matrix presented in
Figure 14 provides a visual representation of the model’s classification performance. Out of 400 test samples, only two instances were misclassified—one as a false positive and one as a false negative—resulting in an impressive overall accuracy of 99.5%. The strong diagonal dominance within the matrix highlights the model’s high discriminative capability and its ability to accurately distinguish between tumor and non-tumor MRI scans.
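The classification report and confusion matrix discussed above can be reproduced with scikit-learn as sketched below; model, x_test, and y_test are placeholders for the trained classifier and the held-out test split.

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

# Predict class probabilities on the test split and take the arg-max as the predicted label.
probs = model.predict(x_test)
y_pred = np.argmax(probs, axis=1)

print(classification_report(y_test, y_pred, target_names=["NO", "YES"]))
print(confusion_matrix(y_test, y_pred))
```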
Table 3 presents a comparative analysis of the performance of the three models evaluated in this study—CNN, VGG16, and EfficientNetB4. The results clearly indicate that EfficientNetB4 achieved superior performance, surpassing both CNN and VGG16 across all evaluation metrics. The model’s high accuracy and exceptional F1-score demonstrate its effectiveness in correctly classifying MRI brain images into tumor and non-tumor categories, underscoring its potential for reliable clinical application.
In addition to the ADAM optimizer, the influence of several other optimizers—namely Nadam, Adagrad, and RMSprop—on model convergence and classification performance was systematically evaluated. All experiments were conducted under identical conditions, using the same batch size, learning rate, and data split to ensure a fair comparison. The results demonstrate that EfficientNetB4, coupled with the ADAM optimizer, achieved the best overall performance, yielding 99.66% accuracy, 99.68% precision, and a 100% F1-score, while maintaining stable convergence across epochs. Among the remaining optimizers, Nadam outperformed Adagrad, which exhibited greater fluctuations in validation loss despite achieving 97.84% accuracy. RMSprop delivered results comparable to Nadam but required additional epochs to stabilize. These findings suggest that ADAM’s adaptive moment estimation mechanism effectively balances the learning rate, making it particularly well-suited for both complex architectures like EfficientNetB4 and small, domain-specific datasets such as BR35H. Nadam did not surpass ADAM in this case, even with the incorporation of Nesterov momentum—likely due to the limited dataset size and high feature redundancy present in MRI slices. Overall, the EfficientNetB4 model demonstrated superior predictive capability and resilience compared to conventional convolutional networks for MRI-based brain tumor classification. Furthermore, the results indicate that advanced segmentation approaches—such as active contour models or U-Net architectures—may present promising directions for future research, enabling precise tumor localization in addition to classification. The quantitative and visual evidence provided in
Figure 13 and
Figure 14 substantiates the effectiveness and practical applicability of the proposed framework.
4.5. Discussion
The superior performance of the EfficientNetB4 model highlights the novelty of the proposed approach compared to conventional CNN implementations. The results demonstrate that carefully fine-tuning pretrained models—combined with dataset-specific preprocessing and targeted data augmentation—can significantly improve classification accuracy, even when working with relatively small MRI datasets. Among the evaluated architectures, EfficientNetB4 consistently outperformed both CNN and VGG16, achieving an F1-score of 100% and an accuracy of 99.66%, thereby exhibiting strong capability in extracting and leveraging discriminative features from limited data. This improvement is primarily attributed to EfficientNetB4’s enhanced architecture and compound scaling strategy, which enables comprehensive feature extraction while maintaining computational efficiency. Data augmentation techniques—such as rotation, translation, and horizontal flipping—contributed to mitigating overfitting by promoting the learning of generalizable tumor characteristics rather than patient-specific patterns. Despite the dataset comprising only 3000 images, transfer learning using pretrained ImageNet weights enhanced the robustness and convergence of the models, further supporting accurate classification. These findings suggest that, when properly calibrated and supported by augmentation and transfer learning, deep learning models can achieve precise and reliable MRI-based brain tumor classification. From a clinical perspective, such accuracy is particularly valuable, as early and accurate diagnosis enables timely intervention and more effective treatment planning. Consequently, the proposed approach carries strong potential for real-world clinical integration and may contribute to improved patient outcomes.
4.6. Limitations of the Study
Despite the high level of accuracy achieved in this study, several limitations should be acknowledged. First, the BR35H dataset contains only 3000 images, and potential class imbalances among tumor types may limit the model’s ability to generalize to broader clinical populations. Second, the high training accuracy suggests a risk of overfitting, which may reduce performance on unseen real-world clinical data. Third, external validation using independent datasets and patient-level splitting is necessary to assess the model’s robustness and clinical applicability. Furthermore, the current framework relies solely on MRI data and does not incorporate additional physiological indicators or multimodal imaging techniques—such as PET, CT, EMG, or functional MRI—which could enhance early tumor detection and improve diagnostic confidence. To strengthen the generalizability and clinical relevance of the proposed approach, future research should consider using multi-center datasets, implementing patient-level cross-validation, exploring multimodal data fusion, and evaluating model performance in real-world clinical environments.
5. Conclusions and Future Research
Accurate diagnosis of brain tumors in clinical practice is of paramount importance, as the variability of medical imaging presents significant challenges for interpretation. Convolutional Neural Networks (CNNs) and other deep learning (DL) techniques have markedly advanced the detection, identification, and classification of brain tumors from magnetic resonance imaging (MRI), which remains the most widely used imaging modality in neuro-oncology. Among these techniques, deep learning has proven highly effective at extracting discriminative features from MRI data and enabling more efficient diagnosis compared to traditional approaches. Rapid, non-invasive, and cost-effective diagnostic systems have the potential to greatly improve early cancer detection, thereby saving lives.
In this study, we identified the optimal combination of transfer learning models and optimizers for the BR35H dataset and proposed a robust framework for comparing different strategies for MRI-based brain tumor classification. To mitigate overfitting, stratified 5-fold cross-validation and data augmentation were employed; however, because the BR35H dataset lacks patient identifiers, strict patient-level splitting could not be enforced, as discussed in Section 3.2. The proposed methodology—based on the pretrained VGG16 and EfficientNetB4 architectures—demonstrated high diagnostic accuracy while maintaining computational efficiency. The integration of deep learning with transfer learning enabled effective training on a relatively small dataset, particularly when supported by appropriate data-augmentation techniques. These findings highlight the practicality and potential clinical impact of deep transfer learning for binary classification of brain MRI images.
However, this study has several limitations. Most notably, the absence of an independent external dataset restricts the ability to assess the generalizability of the model across diverse patient populations. Future research should focus on evaluating model performance using multi-institutional and multi-center datasets to validate robustness. Additionally, the integration of biosignals (e.g., EEG) and physiological monitoring with MRI-based imaging may enhance diagnostic confidence and provide real-time information on tumor progression. Future investigations should aim to develop automated and dynamic tumor detection systems that account for tumor shape, size, and anatomical location.
Promising directions for future research include:
Employing advanced segmentation and registration methods to accurately differentiate diseased and healthy brain regions.
Enhancing the computational efficiency of deep learning models to facilitate use in resource-constrained clinical environments.
Exploring multimodal data fusion to integrate MRI with physiological signals and improve real-time diagnostic capabilities.
In conclusion, this study provides strong evidence that deep transfer learning can effectively and accurately classify brain tumors using MRI data, even with relatively small datasets. Continued research in multimodal integration, multi-center validation, and real-time diagnostic frameworks will be essential to translating these advancements into clinical practice and improving patient outcomes.