A Unified Deep Learning Framework for Robust Multi-Class Tumor Classification in Skin and Brain MRI
Abstract
1. Introduction
1.1. Motivation
1.2. Contribution
- Dermatoscopy branch: Suppresses non-biological artifacts (hairs, air bubbles) through inpainting-based masking.
- Neuroimaging branch: Applies skull stripping and bias field correction to isolate parenchyma [16].
1.3. Paper Structure
2. Related Work
3. Methodology and Processing
3.1. Data Collection and Preprocessing
- Skin Cancer: We utilize dermoscopic images from publicly available datasets, such as the Human Against Machine (HAM10000) archive [26]. These images are close-up, magnified views of skin lesions captured with a specialized device called a dermatoscope. The HAM archive provides dermoscopic pictures from a diverse population with various skin conditions, including the following classifications:
- MEL: A severe form of skin cancer arising from pigment-producing cells.
- NV: Commonly known as a mole, it is a common, usually benign growth of pigment cells.
- BCC: The most frequent type of skin cancer, typically slow-growing and treatable.
- AKIEC: A precancerous skin lesion that may develop into squamous cell carcinoma if left untreated.
- BKL: A noncancerous, scaly growth on the skin.
- DF: A benign skin tumor composed of fibrous tissue.
- VASC: A noncancerous abnormality of blood vessels in the skin.
- Brain Tumors: Brain tumor data is obtained from MRI scans sourced from the BraTS 2020 challenge dataset [27]. MRI scans provide detailed anatomical information about the brain and surrounding tissues. The BraTS 2020 dataset specifically focuses on gliomas, a common type of brain tumor. It includes multi-modal MRI scans (e.g., T1-weighted, T2-weighted, contrast-enhanced) that capture different aspects of the tumor and surrounding brain tissue. Our system is designed to distinguish between brain scans with tumors and those with normal, healthy brain tissue.
Image Preprocessing and Normalization
- Improved Training Stability: Deep learning models rely on gradient descent for optimization. Normalization prevents features with large values from dominating the gradients, leading to smoother and more stable training.
- Faster Convergence: When pixel intensities are on a similar scale, the model can learn the optimal weights and biases more efficiently, resulting in faster convergence during training.
- Reduced Sensitivity to Preprocessing: Normalization minimizes the impact of minor variations in image acquisition conditions (e.g., lighting, camera settings) on pixel intensities. This makes the model less sensitive to preprocessing variations within the datasets.
- Enhanced Activation Function Performance: Certain activation functions employed in deep learning models, such as sigmoid or tanh, exhibit optimal performance ranges. Normalization ensures pixel values fall within these ranges, maximizing the effectiveness of these activation functions.
- Bridging the Gap Between Modalities: Dermoscopic images and MRI scans employ different technologies, resulting in a range of pixel intensities. Normalization helps bridge this gap and ensures both types of images are processed on a similar scale within the model.
- Managing Data Variability: Skin lesions and brain tumors can exhibit diverse appearances. Normalization helps manage this variability by focusing on the relative relationships between pixel intensities within each image, allowing the model to learn more effectively from both datasets combined.
- Mean and Standard Deviation Normalization: We initially subtracted the per-image mean RGB values (averaged across each image) and divided the result by the per-image standard deviation, following the approach suggested by [28].
- Training Set Mean Normalization: Another approach involved subtracting the mean RGB values, calculated from the training set images only, followed by division by their standard deviation [29].
- ImageNet Mean Subtraction: We also employed pre-computed ImageNet mean subtraction as a normalization step. This method utilizes a constant value derived from the extensive ImageNet image database [30].
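To make the three normalization variants concrete, the sketch below shows one possible Python implementation. This is an illustration rather than the authors' code; the array shapes, the small epsilon, and the listed ImageNet channel means are assumptions of the sketch.

```python
# Illustrative sketch (not the authors' exact code) of the three normalization
# variants compared above; shapes, epsilon, and constants are assumptions.
import numpy as np

# Commonly used pre-computed ImageNet channel means (RGB, 0-255 scale).
IMAGENET_MEAN = np.array([123.68, 116.779, 103.939], dtype=np.float32)

def per_image_normalize(img):
    """Subtract each image's own RGB mean and divide by its own std."""
    img = img.astype(np.float32)
    mean = img.mean(axis=(0, 1), keepdims=True)
    std = img.std(axis=(0, 1), keepdims=True) + 1e-7  # avoid division by zero
    return (img - mean) / std

def training_set_normalize(img, train_mean, train_std):
    """Normalize with statistics computed once over the training set only."""
    return (img.astype(np.float32) - train_mean) / (train_std + 1e-7)

def imagenet_mean_subtract(img):
    """Subtract constant ImageNet channel means (no std scaling)."""
    return img.astype(np.float32) - IMAGENET_MEAN
```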
3.2. Data Augmentation
3.2.1. Deep Convolutional Generative Adversarial Network (DCGAN)
3.2.2. Augmentor Library
- Horizontal and vertical flips: Mirroring the image horizontally or vertically (as presented in Figure 5d).
- Contrast adjustments: Increasing or decreasing the contrast between light and dark areas in the image (as presented in Figure 5b).
- Brightness corrections: Adjusting the image to be brighter or darker (as shown in Figure 5b); a minimal augmentation pipeline illustrating these operations is sketched after this list.
- Class-Conditioned Focus: Our DCGAN used explicit class labels during training, guiding feature learning toward specific cancer types despite limited data.
- Augmentation Synergy: We combined DCGAN outputs with traditional augmentations (rotation/flipping from Section 3.2) to make the most of the limited samples.
- Progressive Training: We started with a higher sampling of the minority class before fine-tuning with balanced batches.
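For the geometric and photometric operations listed above, a minimal pipeline built with the Augmentor library might look as follows. The directory path, probabilities, contrast/brightness factors, and the per-class target of 7000 samples (cf. Table 2) are placeholders, not the authors' exact settings.

```python
# Minimal Augmentor pipeline sketch for the operations listed above.
# Path, probabilities, factors, and sample count are illustrative placeholders.
import Augmentor

pipeline = Augmentor.Pipeline("data/train/MEL")   # one pipeline per class folder

# Geometric transforms: horizontal/vertical mirroring (cf. Figure 5d).
pipeline.flip_left_right(probability=0.5)
pipeline.flip_top_bottom(probability=0.5)

# Photometric transforms: contrast and brightness jitter (cf. Figure 5b).
pipeline.random_contrast(probability=0.5, min_factor=0.8, max_factor=1.2)
pipeline.random_brightness(probability=0.5, min_factor=0.8, max_factor=1.2)

# Draw augmented samples until the class reaches the balanced target size.
pipeline.sample(7000)
```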
3.3. Data Segmentation
- Skin Lesions: In skin lesion images, hair and surrounding healthy skin can introduce noise, making it difficult for the model to distinguish the lesion itself. The model may focus on irrelevant features instead of the lesion characteristics that are crucial for accurate classification.
- Brain Tumors: Similarly, in brain MRI scans, the model may be overwhelmed by information from healthy brain tissue rather than focusing solely on the tumor region. This can negatively influence the model’s ability to segment the tumor accurately.
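Because this segmentation step relies on a U-Net-style encoder-decoder, the compact Keras sketch below illustrates the idea. The depth, filter counts, and 256 × 256 single-channel input are assumptions for illustration, not the configuration used in this work.

```python
# Compact U-Net sketch in Keras (illustrative only; depth, filter counts, and
# the 256x256 grayscale input are assumptions).
from tensorflow.keras import layers, Model

def conv_block(x, filters):
    """Two 3x3 convolutions with ReLU, the basic U-Net building block."""
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x

def build_unet(input_shape=(256, 256, 1)):
    inputs = layers.Input(input_shape)

    # Contracting path: extract features while halving spatial resolution.
    c1 = conv_block(inputs, 32)
    p1 = layers.MaxPooling2D()(c1)
    c2 = conv_block(p1, 64)
    p2 = layers.MaxPooling2D()(c2)
    c3 = conv_block(p2, 128)  # bottleneck

    # Expanding path: upsample and concatenate the skip connections.
    u2 = layers.Conv2DTranspose(64, 2, strides=2, padding="same")(c3)
    c4 = conv_block(layers.Concatenate()([u2, c2]), 64)
    u1 = layers.Conv2DTranspose(32, 2, strides=2, padding="same")(c4)
    c5 = conv_block(layers.Concatenate()([u1, c1]), 32)

    # One-channel sigmoid output: per-pixel lesion/tumor probability mask.
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(c5)
    return Model(inputs, outputs)

model = build_unet()
model.compile(optimizer="adam", loss="binary_crossentropy")
```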
3.4. Feature Extraction and Deep Learning Architectures
3.5. The Power of Convolution
- Input Layer: This is the starting point, where the preprocessed medical image data is fed into the network. It specifies the image size (width and height) and the number of channels (e.g., RGB for color images).
- Convolutional Layer: The convolutional layer is the primary component of a CNN, extracting features from the input image. A small filter, called a kernel, moves across the image and calculates the dot product between its weights and the corresponding pixel values. The kernel can be thought of as a magnifying glass that closely examines small patches of the image to identify patterns. The size of the kernel is significant:
- Smaller kernels are adept at capturing localized features like edges and textures.
- Larger kernels can learn more complex patterns that span larger image regions.
- Activation Function: Not all extracted features are equally important. Here, the activation function acts as a gatekeeper, allowing only significant activations to pass through. Popular choices, such as ReLU (Rectified Linear Unit), suppress insignificant features and address the vanishing gradient problem that can hinder training in deep networks [36].
- Pooling Layer: This layer helps to simplify the data, making it easier for the network to process. Techniques like max-pooling focus on the strongest activation in a small area, which summarizes the local information. This not only speeds up processing time but also helps the network recognize objects even if they shift slightly in the image [37].
- Fully Connected Layer: After the convolutional and pooling layers have extracted and summarized local features, the fully connected layer becomes essential. In this layer, every neuron connects to all neurons in the previous layer, unlike the localized connections seen before. This connection allows the network to combine the extracted features and understand the entire image. The last fully connected layer typically employs a SoftMax activation function. This function changes the network’s output into class probabilities for image classification tasks [38].
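The following toy Keras model maps the layer roles described above onto code. The filter counts, kernel sizes, and 128 × 128 RGB input are illustrative assumptions rather than the architecture used in this study.

```python
# Toy CNN sketch connecting the layer roles above to Keras code
# (layer sizes and the 128x128 RGB input are illustrative assumptions).
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(128, 128, 3)),             # input layer: image size + RGB channels
    layers.Conv2D(32, (3, 3), activation="relu"),  # small kernel captures local edges/textures
    layers.MaxPooling2D((2, 2)),                   # pooling keeps the strongest local activation
    layers.Conv2D(64, (5, 5), activation="relu"),  # larger kernel spans wider patterns
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),          # fully connected layer combines features
    layers.Dense(10, activation="softmax"),        # softmax turns scores into class probabilities
])
model.summary()
```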
CNN Architectures: InceptionV3 and ResNet
- InceptionV3 [39]:
- ResNet-50 [40]:
3.6. Transfer Learning for Skin and Brain Tumor Classification
- Source Task: A pre-trained CNN, like InceptionV3 or ResNet-50, serves as the starting point.
- Target Task: Our dataset (described in Section 3.1) containing labeled images represents the target task.
- Transfer: The initial layers of the pre-trained CNN, which have learned general image recognition features, are retained.
- Fine-tuning: The final layers of the pre-trained CNN are fine-tuned on our specific skin lesion dataset. This fine-tuning process enables the model to specialize in identifying relevant patterns within skin lesions, allowing it to distinguish between benign and malignant types.
- Reduced Training Data Requirements: Acquiring and labeling large datasets of medical images can be a challenging task. Transfer learning makes it possible to build effective models for skin and brain tumor classification even with limited labeled data.
- Faster Training Time: By leveraging pre-trained knowledge from the initial layers, transfer learning significantly reduces training time compared to training a model from scratch.
- Improved Performance: Transfer learning can lead to enhanced performance, particularly when working with limited medical image datasets. The pre-trained CNN provides a solid foundation for learning features relevant to medical images, even if the source task (ImageNet) involves general images.
- Level of Transfer: The level of transfer learning can be adjusted. In medical image classification, fine-tuning only the final layer of the pre-trained CNN is often more effective than complete transfer (using the entire pre-trained model).
- Task Similarity: While the source task (general image recognition) might seem quite different from the target task (medical image classification), both tasks involve recognizing patterns within images. This underlying similarity allows transfer learning to be effective in this scenario.
4. Results
4.1. Data Augmentation and Class Balancing
- Combined Dataset (described in Section 3.1): We leverage a rich dataset that combines skin lesion images from the publicly accessible Human Against Machine (HAM) archive, and brain MRI scans from the BraTS 2020 challenge [26,27]. All training and test images were uniformly formatted in RGB with a fixed size of 600 × 450 pixels. Figure 3a,b illustrate dermoscopic skin lesions from the HAM archive. Figure 3c depicts an MRI brain scan from the BraTS 2020 dataset.
- Data Augmentation: To address potential limitations in dataset size and variations in image quality (e.g., illumination errors, inconsistent staining, image noise), we employ data augmentation techniques described in Section 3.2. These techniques, such as rotation, flipping, and scaling, artificially expand the dataset and help the model learn from a wider range of image presentations. The effectiveness of data augmentation is visually demonstrated in Figure 4 and Figure 5, which show the original images alongside their augmented counterparts.
- Class Balancing: The data is carefully balanced to ensure each cancer type (e.g., Melanoma, Basal Cell Carcinoma) has enough representative images (details in Table 2). This is crucial to prevent the model from biasing its predictions towards more frequent classes.
4.2. Feature Extraction and Cancer Type Classification with U-Net
- U-Net Segmentation (described in Section 3.3): a U-Net architecture is employed to extract informative features from the preprocessed images. This segmentation step focuses on identifying the region of interest (ROI) within the image, which contains the potential cancerous tissue (skin lesion or brain tumor). This targeted approach allows the model to concentrate on the most relevant image area for classification. The segmentation results for sample images are illustrated in Figure 7, demonstrating the U-Net’s ability to identify potential problem regions.
- ROI Cropping: Following U-Net segmentation, the generated mask is used to crop the image to a smaller size (400 × 400 pixels), as demonstrated in Figure 11. This reduces the computational burden on the classification model without compromising essential information.
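A possible way to implement the mask-guided cropping is sketched below. The 400 × 400 window follows the text, while the 0.5 mask threshold and the centroid-based positioning are assumptions of this illustration, not the authors' exact procedure.

```python
# Sketch of cropping the region of interest from the predicted U-Net mask;
# the threshold and centroid placement are assumptions, the 400x400 size follows the text.
import numpy as np
import cv2

def crop_roi(image, mask, size=400):
    """Crop a fixed-size window centred on the segmented region."""
    ys, xs = np.where(mask > 0.5)                # pixels the U-Net marked as lesion/tumor
    if len(xs) == 0:                             # no region found: fall back to a centre crop
        cy, cx = image.shape[0] // 2, image.shape[1] // 2
    else:
        cy, cx = int(ys.mean()), int(xs.mean())  # centroid of the predicted region
    half = size // 2
    y0 = int(np.clip(cy - half, 0, max(image.shape[0] - size, 0)))
    x0 = int(np.clip(cx - half, 0, max(image.shape[1] - size, 0)))
    roi = image[y0:y0 + size, x0:x0 + size]
    return cv2.resize(roi, (size, size))         # rescale if the crop is smaller than 400x400
```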
4.3. Deep Learning Classification with Transfer Learning
- Transfer Learning with InceptionV3 (described in Section 3.4): We leverage the power of transfer learning by utilizing the pre-trained InceptionV3 architecture. InceptionV3, trained on a massive image dataset like ImageNet, has already learned valuable features for recognizing shapes, textures, and patterns within images.
- Fine-tuning: The pre-trained weights of InceptionV3 are fine-tuned for the specific task of cancer type classification. This involves adjusting the final layers of the model to adapt to the new dataset and cancer classification problem.
- Multi-class Classification: The fine-tuned InceptionV3 model is equipped to differentiate between various skin cancers (Melanoma (MEL), Nevus (NV), Basal Cell Carcinoma (BCC), etc.) and brain tumors. The output layer comprises multiple neurons corresponding to each cancer type, enabling the model to predict the cancer class for a given image.
5. Discussion
5.1. Experimental Setup and Hyperparameter Tuning
- A batch size of 32 balanced training speed and memory usage.
- A learning rate of 0.001 provided stable improvement from a common starting point.
- Freezing 70% of the layers proved critical, as it kept most of the initial patterns learned from general images unchanged while updating only the later layers for cancer-specific details. This reduced the number of trainable parameters from 21.8 million to 12.9 million (Table 4), speeding up training while reducing the risk of overfitting.
- Activation Function: The ReLU (Rectified Linear Unit) activation function [45] was employed in all layers due to its computational efficiency and effectiveness in deep learning architectures.
- Dropout Layers: Dropout layers with a rate of 0.5 [46] were incorporated after the convolutional layers to randomly drop a fraction of neurons during training. This helps mitigate overfitting by preventing co-adaptation of features.
- Optimizer: The Adam optimizer [47], with a momentum of 0.99 and a learning rate of 0.001, was used for training. Adam is an adaptive learning rate optimization algorithm that has proven effective in training various deep learning models.
- Epochs: The model was trained for a maximum of 20 epochs.
- Classification Layer: The final layer of the pre-trained InceptionV3 was replaced with a new fully connected layer with ten output neurons corresponding to the seven skin lesion classes, the two brain classes (tumor and normal), and the undefined class. A SoftMax activation function was applied to this layer to normalize the output probabilities.
- Freezing Layers (70%): These layers are not trainable, and their weights remain unchanged during the fine-tuning process. Freezing layers help leverage the pre-trained features for general image recognition tasks while reducing the number of trainable parameters.
- Removing the Final Layer: The final classification layer of the pre-trained InceptionV3, which is designed for the ImageNet categories, is removed.
- Adding a New Fully Connected Layer: A new fully connected layer is added at the end of the architecture. This layer has a number of neurons equal to the number of output classes (10).
- Activation Function: The new fully connected layer uses a SoftMax activation function to normalize the output probabilities.
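Putting this configuration together, a hedged Keras sketch of the fine-tuning setup could look as follows. The frozen fraction, dropout rate, optimizer, learning rate, class count, epochs, and batch size follow the values reported above; the GlobalAveragePooling2D head, the exact 70% layer cutoff computation, and the data arrays are assumptions of the sketch.

```python
# Hedged sketch of the fine-tuning setup described above: ImageNet-pretrained
# InceptionV3, ~70% of layers frozen, new 10-class softmax head, Adam at 1e-3,
# dropout 0.5, 20 epochs, batch size 32. The data arrays are assumed to exist.
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras import layers, models, optimizers

base = InceptionV3(weights="imagenet", include_top=False, input_shape=(299, 299, 3))

# Freeze roughly the first 70% of layers so general ImageNet features stay fixed.
cutoff = int(len(base.layers) * 0.7)
for layer in base.layers[:cutoff]:
    layer.trainable = False

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.5),                     # regularization against overfitting
    layers.Dense(10, activation="softmax"),  # 7 skin + 2 brain + undefined classes
])

model.compile(optimizer=optimizers.Adam(learning_rate=0.001),
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# Placeholder training call; train_images/train_labels stand in for the prepared data.
# model.fit(train_images, train_labels, epochs=20, batch_size=32, validation_split=0.1)
```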
5.2. Model Training Progress
5.3. Examining Model Performance
- Precision: This metric reflects the proportion of positive predictions that were truly positive for a specific class. In simpler terms, it indicates how often the model correctly identified a cancer type out of all the images it predicted as that cancer type (e.g., a precision of 0.97 for AKIEC signifies that 97% of AKIEC predictions were truly AKIEC).
- Recall: This metric signifies the proportion of actual positive cases (images with a specific cancer type) that the model correctly identified. In other words, it indicates how often the model identified a particular cancer type out of all the images that had that cancer (e.g., a recall of 0.97 for AKIEC means the model identified 97% of the actual AKIEC images).
- F1-score: This metric strikes a balance between precision and recall, providing a more comprehensive view of the model’s performance for each class (ideally close to 1.00).
- True Positive Rate (TPR): This metric represents the proportion of actual positive cases (images with a specific cancer type) that the model correctly identified. It is also known as recall.
- False Positive Rate (FPR): This metric represents the proportion of negative cases (images without the specific cancer type) that the model incorrectly classified as positive.
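These metrics can be reproduced from model predictions with scikit-learn, as in the sketch below; y_true and y_pred are placeholder arrays standing in for the test-set labels and predictions over the ten classes.

```python
# Sketch of computing per-class precision/recall/F1 plus TPR/FPR with scikit-learn;
# y_true and y_pred are placeholder arrays, not the study's actual predictions.
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

y_true = np.array([0, 1, 2, 1, 0, 2])   # placeholder ground-truth class indices
y_pred = np.array([0, 1, 2, 0, 0, 2])   # placeholder model predictions

# Precision, recall, and F1-score per class, plus macro/weighted averages.
print(classification_report(y_true, y_pred, digits=2))

# TPR and FPR per class derived from the confusion matrix (one-vs-rest).
cm = confusion_matrix(y_true, y_pred)
tp = np.diag(cm)
fn = cm.sum(axis=1) - tp
fp = cm.sum(axis=0) - tp
tn = cm.sum() - (tp + fn + fp)
tpr = tp / (tp + fn)    # true positive rate = recall
fpr = fp / (fp + tn)    # false positive rate
print("TPR per class:", tpr)
print("FPR per class:", fpr)
```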
5.4. Comparison with Recent Models
5.5. Evaluation of the Inception Model for Cancer Classification and Localization
- Image Acquisition: Single test images were presented to the model.
- Image Processing: The fine-tuned Inception model performed the following actions on each image:
- Classification: The model classified the image into one of ten categories:
- Seven classes of skin cancer: If a cancerous lesion was identified (e.g., potentially AKIEC skin cancer, as suggested in Figure 16a), the model attempted to classify the specific cancer type.
- Two brain classes: Brain MRI scans were classified as either containing a tumor or showing normal brain tissue.
- Undefined Image: Images falling outside the expected range for skin or brain tissue were classified as “undefined.”
- Localization (if applicable): For classified skin or brain cancer images (as exemplified in Figure 16a,b), the model aimed to localize the potential lesion within the image.
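The single-image workflow above could be wired together as in the following sketch. The model file names, the 299 × 299 input size, the rescaling, and the rule for when to run localization are illustrative assumptions rather than the authors' implementation.

```python
# Illustrative single-image inference flow matching the steps above; file names,
# preprocessing, and the localization rule are assumptions of this sketch.
import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image

CLASS_NAMES = ["AKIEC", "BCC", "BKL", "DF", "MEL", "NV", "VASC",
               "Brain tumor", "Normal brain", "Undefined"]

classifier = load_model("inception_finetuned.h5")   # placeholder model path
segmenter = load_model("unet_segmenter.h5")         # placeholder model path

def analyze(path):
    img = image.load_img(path, target_size=(299, 299))
    x = image.img_to_array(img)[np.newaxis] / 255.0   # assumed preprocessing

    probs = classifier.predict(x)[0]
    label = CLASS_NAMES[int(np.argmax(probs))]

    mask = None
    if label not in ("Undefined", "Normal brain"):     # localize only when a lesion/tumor is predicted
        mask = segmenter.predict(x)[0, ..., 0] > 0.5   # binary localization mask
    return label, float(probs.max()), mask
```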
6. Conclusions and Future Prospects
6.1. Strengths of the Proposed Approach
- Leverages Complementary Architectures: Inception excels at capturing intricate spatial relationships, while U-Net is well-suited for segmentation tasks. This combination could offer robust feature extraction and precise localization of cancer regions, aiding in both classification and treatment planning.
- Data Augmentation and Transfer Learning: The framework incorporates data augmentation techniques to address the limitations of potentially smaller datasets. Additionally, transfer learning from pre-trained models helps expedite the learning process and enhance network performance.
6.2. Future Work
- Interpretability Enhancement: We aim to integrate techniques like Grad-CAM (Gradient-weighted Class Activation Mapping) to enhance model interpretability, allowing for better understanding of the decision-making process behind classifications.
- Improved Discrimination Accuracy: We will continue to refine our model architecture and training strategies to achieve even higher accuracy in differentiating between various skin cancer types (including potentially rarer types) and healthy tissues. We will also explore techniques for handling imbalanced datasets, which are common in medical image classification tasks.
- Generalization to Diverse Cancers: We plan to expand the applicability of our framework by evaluating it on datasets encompassing a broader range of cancer types, demonstrating its potential for broader clinical utility. Additionally, we will investigate the feasibility of implementing the model in a real-time or near-real-time setting for potential use in clinical settings.
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
AI | Artificial Intelligence |
AKIEC | Actinic Keratosis |
BCC | Basal Cell Carcinoma |
BKL | Benign Keratosis |
BraTS | Brain Tumor Segmentation |
cGANs | Conditional Generative Adversarial Networks |
CNN | Convolutional Neural Network |
DCGAN | Deep Convolutional Generative Adversarial Network |
DF | Dermatofibroma |
ESRGANs | Enhanced Super-Resolution GANs |
GANs | Generative Adversarial Networks |
HAM10000 | Human Against Machine (10k-image dataset) |
ISIC | International Skin Imaging Collaboration |
MEL | Melanoma |
MDPI | Multidisciplinary Digital Publishing Institute |
MRI | Magnetic Resonance Imaging |
NV | Nevus |
RGB | Red, Green, Blue |
SCC | Squamous Cell Carcinoma |
SVM | Support Vector Machine |
VASC | Vascular Lesion |
References
- Patel, R.H.; Foltz, E.A.; Witkowski, A.; Ludzik, J. Analysis of artificial intelligence-based approaches applied to non-invasive imaging for early detection of melanoma: A systematic review. Cancers 2023, 15, 4694. [Google Scholar] [CrossRef] [PubMed]
- Güler, M.; Namlı, E. Brain Tumor Detection with Deep Learning Methods’ Classifier Optimization Using Medical Images. Appl. Sci. 2024, 14, 642. [Google Scholar] [CrossRef]
- Melarkode, N.; Srinivasan, K.; Qaisar, S.M.; Plawiak, P. AI-Powered Diagnosis of Skin Cancer: A Contemporary Review, Open Challenges and Future Research Directions. Cancers 2023, 15, 1183. [Google Scholar] [CrossRef]
- Schadendorf, D.; van Akkooi, A.C.J.; Berking, C.; Griewank, K.G.; Gutzmer, R.; Hauschild, A.; Stang, A.; Roesch, A.; Ugurel, S. Melanoma. Lancet 2018, 392, 971–984. [Google Scholar] [CrossRef] [PubMed]
- Bernal, J.; Kushibar, K.; Asfaw, D.S.; Valverde, S.; Oliver, A.; Martí, R.; Lladó, X. Deep convolutional neural networks for brain image analysis on magnetic resonance imaging: A review. Artif. Intell. Med. 2019, 95, 64–81. [Google Scholar] [CrossRef]
- Asiri, A.A.; Shaf, A.; Ali, T.; Aamir, M.; Irfan, M.; Alqahtani, S.; Mehdar, K.M.; Halawani, H.T.; Alghamdi, A.H.; Alshamrani, A.F.A.; et al. Brain Tumor Detection and Classification Using Fine-Tuned CNN with ResNet50 and U-Net Model: A Study on TCGA-LGG and TCIA Dataset for MRI Applications. Life 2023, 13, 1449. [Google Scholar] [CrossRef]
- Kulkarni, A.J.; Satapathy, S.C. Optimization techniques for machine learning. In Optimization in Machine Learning and Applications; Springer: Singapore, 2020; pp. 31–50. [Google Scholar] [CrossRef]
- Gao, Y.; Li, J.; Zhou, Y.; Xiao, F.; Liu, H. Optimization Methods for Large-Scale Machine Learning. In Proceedings of the 2021 18th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), Chengdu, China, 17–19 December 2021; pp. 304–308. [Google Scholar] [CrossRef]
- Esteva, A.; Kuprel, B.; Novoa, R.A.; Ko, J.; Swetter, S.M.; Blau, H.M.; Thrun, S. Dermatologist-level classification of skin cancer with deep neural networks. Nature 2017, 542, 115–118. [Google Scholar] [CrossRef]
- Oliveira, R.B.; Filho, E.M.; Ma, Z.; Papa, J.P.; Pereira, A.S.; Tavares, J.M.R.S. Computational methods for the image segmentation of pigmented skin lesions: A review. Comput. Methods Programs Biomed. 2016, 131, 127–141. [Google Scholar] [CrossRef]
- Sayedelahl, M.A. A novel edge detection filter based on fractional order Legendre-Laguerre functions. Int. J. Intell. Syst. Technol. Appl. 2023, 21, 321–343. [Google Scholar] [CrossRef]
- Xu, M.; Yoon, S.; Fuentes, A.; Park, D.S. A Comprehensive Survey of Image Augmentation Techniques for Deep Learning. Pattern Recognit. 2023, 137, 109347. [Google Scholar] [CrossRef]
- Lee, T.; Ng, V.; Gallagher, R.; Coldman, A.; McLean, D. Dullrazor®: A Software Approach to Hair Removal from Images. Comput. Biol. Med. 1997, 27, 533–543. [Google Scholar] [CrossRef]
- Toossi, M.T.B.; Pourreza, H.R.; Zare, H.; Sigari, M.H.; Layegh, P.; Azimi, A. An effective hair removal algorithm for dermoscopy images. Skin Res Technol. 2013, 19, 230–235. [Google Scholar] [CrossRef]
- Mir, A.N.; Nissar, I.; Rizvi, D.R.; Kumar, A. LesNet: An Automated Skin Lesion Deep Convolutional Neural Network Classifier through Augmentation and Transfer Learning. Procedia Comput. Sci. 2024, 235, 112–121. [Google Scholar] [CrossRef]
- Smith, S.M. Fast Robust Automated Brain Extraction. Hum. Brain Mapp. 2002, 17, 143–155. [Google Scholar] [CrossRef] [PubMed]
- Daimary, D.; Bora, M.B.; Amitab, K.; Kandar, D. Brain Tumor Segmentation from MRI Images using Hybrid Convolutional Neural Networks. Procedia Comput. Sci. 2020, 167, 2419–2428. [Google Scholar] [CrossRef]
- Innani, S.; Dutande, P.; Baid, U.; Pokuri, V.; Bakas, S.; Talbar, S.; Baheti, B.; Guntuku, S.C. Generative Adversarial Networks for Skin Lesion Classification. Sci. Rep. 2023, 13, 13467. [Google Scholar] [CrossRef]
- Vega-Huerta, H.; Rivera-Obregón, M.; Maquen-Niño, G.L.E.; De-la-Cruz-VdV, P.; Lázaro-Guillermo, J.C.; Pantoja-Collantes, J.; Cámara-Figueroa, A. Classification model of skin cancer using convolutional neural network. Ingénierie Des Systèmes D’information 2025, 30, 387–394. [Google Scholar] [CrossRef]
- Yu, L.; Chen, H.; Dou, Q.; Qin, J.; Heng, P.-A. Automated Melanoma Recognition in Dermoscopy Images via Very Deep Residual Networks. IEEE Trans. Med. Imaging 2016, 36, 994–1004. [Google Scholar] [CrossRef]
- Kavitha, C.; Priyanka, S.; Kumar, M.P.; Kusuma, V. Skin Cancer Detection and Classification using Deep Learning Techniques. Sensors 2020, 20, 3206. [Google Scholar] [CrossRef]
- El-Shafai, W.; El-Fattah, I.A.; Taha, T.E. Deep learning-based hair removal for improved diagnostics of skin diseases. Multimed. Tools Appl. 2024, 83, 27331–27355. [Google Scholar] [CrossRef]
- Qin, Z.; Liu, Z.; Zhu, P.; Xue, Y. A GAN-based image synthesis method for skin lesion classification. Comput. Methods Programs Biomed. 2020, 195, 105568. [Google Scholar] [CrossRef]
- Quishpe-Usca, A.; Cuenca-Dominguez, S.; Arias-Viñansaca, A.; Bosmediano-Angos, K.; Villalba-Meneses, F.; Ramírez-Cando, L.; Tirado-Espín, A.; Cadena-Morejón, C.; Almeida-Galárraga, D.; Guevara, C. The effect of hair removal and filtering on melanoma detection: A comparative deep learning study with AlexNet CNN. PeerJ Comput. Sci. 2024, 10, e1953. [Google Scholar] [CrossRef]
- Ali, A.; Sharif, M.; Faisal, C.M.S.; Rizwan, A.; Atteia, G.; Alabdulhafith, M. Brain Tumor Segmentation Using Generative Adversarial Networks. IEEE Access 2024, 12, 183525–183541. [Google Scholar] [CrossRef]
- Tschandl, P.; Rosendahl, C.; Kittler, H. The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions. In Data Repository (Harvard Dataverse); Harvard University: Cambridge, MA, USA, 2018; pp. 1–3. Available online: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/DBW86T (accessed on 2 January 2024).
- Musthafa, N.; Memon, Q.A.; Masud, M.M. Advancing Brain Tumor Analysis: Current Trends, Key Challenges, and Perspectives in Deep Learning-Based Brain MRI Tumor Diagnosis. Eng 2025, 6, 82. [Google Scholar] [CrossRef]
- Roy, S.; Jain, A.K.; Lal, S.; Kini, J. A study about color normalization methods for histopathology images. Micron 2018, 114, 42–61. [Google Scholar] [CrossRef] [PubMed]
- Picon, A.; Bereciartua-Perez, A.; Eguskiza, I.; Romero-Rodriguez, J.; Jimenez-Ruiz, C.J.; Eggers, T.; Klukas, C.; Navarra-Mestre, R. Deep convolutional neural network for damaged vegetation segmentation from RGB images based on virtual NIR-channel estimation. Artif. Intell. Agric. 2022, 6, 199–210. [Google Scholar] [CrossRef]
- Deng, J.; Dong, W.; Socher, R.; Li, L.; Li, K.; Li, F.-F. ImageNet: A Large-Scale Hierarchical Image Database. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, USA, 20–25 June 2009; pp. 248–255. [Google Scholar] [CrossRef]
- Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; Chapter 9: Generative Adversarial Networks; MIT Press: Cambridge, MA, USA, 2016; Available online: https://www.deeplearningbook.org/ (accessed on 2 January 2024).
- Augmentor Team. Augmentor: Image Augmentation Library for Python. Available online: https://augmentor.readthedocs.io/ (accessed on 2 January 2024).
- Al-Kababji, A.; Bensaali, F.; Dakua, S.P.; Himeur, Y. Automated liver tissues delineation techniques: A systematic survey on machine learning current trends and future orientations. Eng. Appl. Artif. Intell. 2023, 117, 105532. [Google Scholar] [CrossRef]
- Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015, Proceedings of the 18th International Conference, Munich, Germany, 5–9 October 2015; Proceedings, Part III 18; Springer International Publishing: Berlin/Heidelberg, Germany, 2015. [Google Scholar] [CrossRef]
- Suganyadevi, S.; Seethalakshmi, V.; Balasamy, K. A review of deep learning on medical image analysis. Int. J. Multimed. Inf. Retr. 2021, 11, 19–38. [Google Scholar] [CrossRef]
- Ravanmehr, R.; Mohamadrezaei, R. Deep Learning Overview. In Session-Based Recommender Systems Using Deep Learning; Springer: Cham, Switzerland, 2024; pp. 27–72. [Google Scholar] [CrossRef]
- Nair, V.; Hinton, G.E. Rectified linear units improve restricted boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning (ICML-10), Haifa, Israel, 21–24 June 2010. [Google Scholar]
- Gayatri, K.; Vora, D. Activation functions and training algorithms for deep neural network. UGC Approv. J. Int. J. Comput. Eng. Res. Trends 2018, 5, 98–104. [Google Scholar]
- Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016. [Google Scholar] [CrossRef]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016. [Google Scholar] [CrossRef]
- Weiss, K.; Khoshgoftaar, T.M.; Wang, D. A survey of transfer learning. J. Big Data 2016, 3, 9. [Google Scholar] [CrossRef]
- Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. [Google Scholar]
- Gulli, A.; Pal, S. Deep Learning with Keras; Packt Publishing Ltd.: Birmingham, UK, 2017. [Google Scholar]
- Abadi, M.; Barham, P.; Chen, J.; Chen, Z.; Davis, A.; Dean, J.; Devin, M.; Ghemawat, S.; Irving, G.; Isard, M.; et al. TensorFlow: A system for Large-Scale machine learning. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), Savannah, GA, USA, 2–4 November 2016. [Google Scholar]
- Agarap, A.F. Deep learning using rectified linear units (ReLU). arXiv 2018, arXiv:1803.08375. [Google Scholar]
- Park, S.; Kwak, N. Analysis on the dropout effect in convolutional neural networks. In Computer Vision–ACCV 2016, Proceedings of the 13th Asian Conference on Computer Vision, Taipei, Taiwan, 20–24 November 2016; Revised Selected Papers, Part II 13; Springer International Publishing: Berlin/Heidelberg, Germany, 2017. [Google Scholar] [CrossRef]
- Zhang, Z. Improved Adam optimizer for deep neural networks. In Proceedings of the 2018 IEEE/ACM 26th International Symposium on Quality of Service (IWQoS), Banff, AB, Canada, 4–6 June 2018. [Google Scholar] [CrossRef]
Paper | Year | Dataset | Method | Accuracy | Strengths | Limitations |
---|---|---|---|---|---|---|
Soyal et al. [18] | 2023 | ISIC | CNN (VGG16, VGG19) + SVM | 96% (VGG19 + SVM) | High accuracy with ensemble learning, data augmentation (GANs, ESRGANs) | Limited evaluation metrics |
Lembhe et al. [19] | 2020 | Not specified | CNN + Image Super-Resolution | 95.10% | Explores image super-resolution for skin cancer classification | Lacks details on specific techniques and evaluation metrics |
Yu et al. [20] | 2020 | HAM10000 | CNN (VGG19, Inceptionv3) with data augmentation (GANs, geometric) | 96.90% | State-of-the-art accuracy highlights data augmentation importance | Evaluation on a single dataset (HAM10000) |
Nasr-Esfahani et al. [21] | 2020 | HAM10000 | Fine-tuned Inception-ResNetV2 & NasNet Mobile | 97.10% | High accuracy, explores transfer learning with hyperparameter tuning | Limited evaluation on newer datasets (e.g., ISIC 2023) |
Guo et al. [22] | 2021 | Not specified | Deep learning-based hair removal (CNN + GAN) + classification | Not reported | Addresses the hair interference issue for improved classification | Lack of specific accuracy data |
Behara et al. [23] | 2021 | Not specified | Improved DCGAN for lesion synthesis and classification | Not reported | Contributes to enhancing deep learning for skin lesion analysis | Classification accuracy data not available |
Li et al. [24] | 2021 | Not specified | Hair removal + Efficient Attention Net CNN | 94.10% | Effective hair removal for preserving lesion information | Slightly lower accuracy compared to some other studies |
Ishan et al. [25] | 2022 | Not specified | cGANs for synthetic brain tumor MRI + 3D CNN | Not reported (segmentation) | Addresses class imbalance, enriches training data | Classification accuracy not reported |
A.A. Khan [17] | 2021 | BraTS 2020 | Cascaded Ensemble of CNNs (CE-CNN) | 98.34% | High accuracy with transfer learning | Requires significant computational resources |
A.A. Asiri [6] | 2023 | TCGA-LGG & TCIA | Fine-tuned ResNet50 and U-Net models | 94.21% | Good performance with limited data | May not generalize well to unseen data |
S. Şengör et al. [2] | 2022 | Rembrandt (public) | VGG16 with transfer learning | 99.04% | Excellent accuracy potential with pre-trained models | Relies heavily on pre-trained model performance |
Class Name | Before Augmentation | After Augmentation |
---|---|---|
Normal brain | 1500 | 7000 |
Brain tumor | 1200 | 7000 |
MEL skin | 1113 | 7000 |
NV skin | 6705 | 7000 |
BCC skin | 514 | 7000 |
AKIEC skin | 327 | 7000 |
BKL skin | 1099 | 7000 |
DF skin | 115 | 7000 |
VASC skin | 142 | 7000 |
Model | Simple Preprocessing | U-Net Segmentation |
---|---|---|
Inception v3 | 0.5767 | 0.9686 |
Parameter | Number | Size |
---|---|---|
Total | 21,823,274 | 83.25 MB |
Trainable | 12,980,618 | 49.52 MB |
Non-trainable | 8,842,656 | 33.73 MB |
Class | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
AKIEC | 0.97 | 0.97 | 0.97 | 1050 |
BCC | 0.96 | 0.98 | 0.97 | 1050 |
BKL | 0.92 | 0.93 | 0.93 | 1050 |
DF | 0.99 | 0.99 | 0.99 | 1050 |
MEL | 0.92 | 0.94 | 0.93 | 1050 |
NV | 0.93 | 0.88 | 0.91 | 1050 |
VASC | 0.99 | 1.00 | 0.99 | 1050 |
Brain tumor | 1.00 | 1.00 | 1.00 | 1049 |
Normal brain | 1.00 | 1.00 | 1.00 | 1047 |
Undefined | 1.00 | 1.00 | 1.00 | 1095 |
Accuracy | – | – | 0.97 | 10,541 |
Macro avg | 0.97 | 0.97 | 0.97 | 10,541 |
Weighted avg | 0.97 | 0.97 | 0.97 | 10,541 |