1. Introduction
Breast cancer is the second leading cause of cancer-related mortality worldwide, following lung cancer [1]. Without early detection, it represents a major source of female mortality [2]. Conversely, timely diagnosis significantly improves survival rates, as breast cancer is among the most treatable cancers when detected early [3], reducing both mortality and patient distress [4]. Various imaging modalities, including digital mammography, ultrasound, MRI, and histopathology, are widely used for the early detection and diagnosis of breast cancer [5].
Digital mammography remains the primary screening tool, producing high-resolution images to identify calcifications and masses, yet its performance declines in dense breast tissue. Ultrasound is widely used to distinguish solid from cystic lesions and to guide biopsies, while MRI is particularly valuable for high-risk patients and ambiguous cases. Despite their utility, these modalities suffer from variability in interpretation and intrinsic limits of sensitivity and specificity, often leading to false positives or negatives [6]. These constraints highlight the need for advanced diagnostic approaches to improve accuracy and reliability.
Artificial intelligence (AI) has emerged as a transformative solution in medical imaging, capable of analyzing large datasets such as mammograms [7] and MRIs with performance comparable to or exceeding that of human experts [8,9]. By detecting subtle patterns beyond human perception [9], AI has proven effective in diverse applications, including skin leishmaniasis [10] and breast cancer detection [11].
However, most computer-aided diagnosis (CAD) systems remain modality-specific, limiting adaptability and requiring multiple frameworks to handle mammography, ultrasound, MRI, or histopathology. To overcome these challenges, we propose a unified convolutional neural network (CNN) framework capable of analyzing multiple modalities within a single model. This modality-agnostic approach leverages shared imaging patterns, streamlining deployment, reducing costs, and improving flexibility, while maintaining robust diagnostic performance across imaging types.
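To make the modality-agnostic design concrete, the following is a minimal sketch of a single shared-backbone CNN, assuming all modalities are resampled to a common 224×224 grayscale input; the layer widths and regularization choices are illustrative, not the exact architecture proposed in this study.

```python
# Minimal sketch of a modality-agnostic CNN: one backbone shared by
# mammography, ultrasound, MRI, and histopathology inputs, all resized
# to a common grayscale format. Layer sizes are illustrative only.
from tensorflow.keras import layers, models

def build_unified_cnn(input_shape=(224, 224, 1), num_classes=2):
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.GlobalAveragePooling2D(),  # small head helps curb overfitting
        layers.Dropout(0.5),              # regularization, as motivated above
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_unified_cnn()
model.summary()
```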
Deep learning, particularly deep convolutional neural networks (DCNNs), has demonstrated remarkable success in medical image analysis by automatically extracting complex features [12], surpassing traditional machine learning methods that rely on manual feature engineering [13].
The objective of this study is to develop a unified, multimodal CNN framework that minimizes overfitting through optimized architecture design, enhances diagnostic accuracy, and eliminates manual feature extraction. By integrating multiple imaging modalities with deep learning, the proposed system aims to advance breast cancer diagnosis, enable earlier detection, and ultimately improve patient outcomes.
2. Related Works
Machine learning (ML) and deep learning (DL) have been extensively applied to breast cancer diagnosis, primarily for binary or multi-class classification, evaluated using metrics such as accuracy, precision, recall, and F1 score. CNN-based CAD systems offer faster, more reliable detection across modalities including ultrasound, MRI, X-ray, and mammography [14], with AI in digital mammography and tomosynthesis matching or exceeding conventional CADe/CADx performance [15]. DL also enables analysis of genetic and histopathological data for early detection, supporting timely diagnosis and improved outcomes [16]. AI-assisted mammography improves detection and reduces radiologist workload despite challenges such as false positives and variable subgroup performance [17], and AI-powered CAD has notably enhanced mammography accuracy, showing strong potential for future breast cancer screening [18].
Mammogram Images: DCNNs with preprocessing and augmentation effectively classify mammograms into benign, malignant, and normal classes while handling class imbalance [19] (a common weighting strategy is sketched below). Castro-Tapia et al. [20] used segments from the INbreast and MIAS datasets to analyze and compare breast lesion classification architectures such as AlexNet, GoogleNet, VGG19, and ResNet50. In line with previous studies, they also evaluated 14 microcalcification and mass classifiers (malignant vs. benign) from prior research, with CNNs yielding exceptional results. GoogleNet emerged as the most accurate model in their CAD system for breast cancer diagnosis, achieving an F1 score of 91.92%, an AUC of 99.29%, a precision of 92.15%, an accuracy of 91.92%, a specificity of 97.66%, and a sensitivity of 91.70% on a balanced dataset. Chougrad et al. [21] developed a CNN-based breast cancer screening method to enhance mammographic image classification on the DDSM dataset.
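The class-imbalance handling noted above is often implemented with inverse-frequency class weights passed to the training loop; the sketch below shows that generic strategy with illustrative class counts, not the specific method of [19].

```python
# Sketch of inverse-frequency class weighting for an imbalanced
# benign/malignant/normal dataset; counts are illustrative placeholders.
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

y_train = np.array([0] * 500 + [1] * 150 + [2] * 80)  # 0=benign, 1=malignant, 2=normal
weights = compute_class_weight("balanced", classes=np.unique(y_train), y=y_train)
class_weight = dict(enumerate(weights))  # e.g. {0: 0.49, 1: 1.62, 2: 3.04}
# model.fit(X_train, y_train, class_weight=class_weight, ...)  # Keras usage
```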
Rahman et al. [22] employed pre-trained convolutional neural network (CNN) architectures, specifically ResNet50 and InceptionV3, to categorize mammographic lesions as benign or malignant. Due to the limited availability of data, methods such as data augmentation, preprocessing, and transfer learning were implemented, with some approaches also incorporating encoder mechanisms. The ResNet50 model achieved an accuracy of 85.7%, while InceptionV3 recorded an accuracy of 79.6%.
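As an illustration of this transfer-learning setup, the sketch below fine-tunes an ImageNet-pretrained ResNet50 for binary lesion classification; the classification head and freezing policy are assumptions, not the exact configuration of [22].

```python
# Sketch of ResNet50 transfer learning for benign/malignant mammogram
# classification; the head and freezing policy are illustrative.
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import ResNet50

base = ResNet50(weights="imagenet", include_top=False,
                input_shape=(224, 224, 3))
base.trainable = False  # freeze ImageNet features; unfreeze later to fine-tune

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(1, activation="sigmoid"),  # benign vs. malignant
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="binary_crossentropy", metrics=["accuracy"])
```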
Sun et al. [23] integrated features derived from multiple views (MLO and CC) within a convolutional neural network framework. This model introduced a penalty term and utilized features across various scales, resulting in an accuracy of 82.02%. In [24], Jafari et al. extracted features from several pre-trained CNN models to identify breast cancer. The most relevant features were selected using mutual information and classified using neural networks (NN), k-nearest neighbors (kNN), random forests (RF), and support vector machines (SVM). This approach achieved an accuracy of 92% on the RSNA dataset, 94.5% on MIAS, and 96% on DDSM.
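A minimal sketch of this select-then-classify pipeline is shown below, with random placeholders standing in for the deep features of [24]; the feature dimensionality and the number of retained features are illustrative.

```python
# Sketch of a deep-features -> mutual-information selection -> classical
# classifier pipeline; features here are random stand-ins for CNN embeddings.
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 512))    # placeholder deep-feature matrix
y = rng.integers(0, 2, size=200)   # placeholder benign/malignant labels

clf = make_pipeline(
    SelectKBest(mutual_info_classif, k=64),  # keep the most informative features
    SVC(kernel="rbf"),
)
print(cross_val_score(clf, X, y, cv=5).mean())
```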
Muduli et al. [25] developed a CNN with five learnable layers (four convolutional and one fully connected) for breast cancer classification. The model automates feature extraction with fewer parameters and was rigorously tested on multiple mammography and ultrasound datasets (MIAS, DDSM, INbreast, BUS-1, BUS-2). It outperformed several methods, achieving accuracies of 96.55%, 90.68%, and 91.28% on the MIAS, DDSM, and INbreast datasets, respectively, and 100% and 89.73% on the BUS-1 and BUS-2 datasets.
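The following sketch illustrates a compact network of this shape (four convolutional layers plus one fully connected layer); filter counts, kernel sizes, and the input resolution are assumptions rather than the exact design of [25].

```python
# Sketch of a compact CNN with five learnable layers: four convolutional
# layers and a single fully connected output layer. Sizes are illustrative.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(128, 128, 1)),
    layers.Conv2D(16, 5, activation="relu"), layers.MaxPooling2D(),
    layers.Conv2D(32, 5, activation="relu"), layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"), layers.MaxPooling2D(),
    layers.Conv2D(128, 3, activation="relu"), layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(2, activation="softmax"),  # the single fully connected layer
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```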
A hybrid mammography computer-aided diagnosis (CADx) system was developed by Rouhi and Jafari in [26], integrating region-based, contour-based, and clustering segmentation methodologies. The system employs spatial frequency components (SFC), enhanced region growing (RG), or convolutional neural networks (CNN) for preliminary segmentation and utilizes genetic algorithm-artificial neural networks (GA-ANN) or genetic algorithm-multiple artificial neural networks (GA-MA-ANN) for the optimization of level set parameters. Neoplasms were categorized using classifiers, including artificial neural networks (ANN), random forests, and support vector machines (SVM), achieving high levels of sensitivity, specificity, accuracy, and area under the curve (AUC) across various datasets (MIAS, DDSM, INbreast). The multilayer perceptron (MLP) classifier achieved accuracies of 90.94%, 88.61%, and 89.23% for the aforementioned datasets, respectively.
A novel feature extraction technique based on the Dual Contourlet Transform (Dual-CT) was introduced by Dong et al. in [27] for breast cancer diagnosis. In conjunction with an enhanced k-nearest neighbor (kNN) classifier, the methodology involved extracting regions of interest (ROI) from the MIAS database, followed by decomposition using Dual-CT, contourlet, and wavelet transforms, and extracting texture features. This approach achieved classification accuracies of 94.14% and 95.76%, surpassing conventional techniques. The enhanced kNN classifier demonstrated accuracies of 95.76%, 86.54%, and 89.30% for the MIAS, DDSM, and INbreast datasets, respectively.
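A hedged sketch of this texture-feature-plus-kNN pipeline appears below. Since the Dual Contourlet Transform has no mainstream Python implementation, a standard wavelet decomposition (PyWavelets) stands in for it, with subband energies as the texture features; the ROIs and labels are placeholders.

```python
# Sketch of texture-feature extraction followed by kNN; a wavelet transform
# stands in for the Dual-CT, and subband energies serve as texture features.
import numpy as np
import pywt
from sklearn.neighbors import KNeighborsClassifier

def wavelet_energy_features(roi, wavelet="db2", level=2):
    coeffs = pywt.wavedec2(roi, wavelet, level=level)
    feats = [np.mean(np.square(coeffs[0]))]            # approximation energy
    for detail_level in coeffs[1:]:                    # (cH, cV, cD) per level
        feats.extend(np.mean(np.square(band)) for band in detail_level)
    return np.array(feats)

rng = np.random.default_rng(0)
rois = rng.normal(size=(100, 64, 64))    # placeholder ROIs
labels = rng.integers(0, 2, size=100)    # placeholder benign/malignant labels
X = np.stack([wavelet_energy_features(r) for r in rois])
knn = KNeighborsClassifier(n_neighbors=5).fit(X, labels)
```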
Aguerchi et al. [11] employed transfer learning with pre-trained convolutional neural network (CNN) architectures, specifically VGG16, ResNet50, and InceptionV3, to classify mammography images as benign or malignant. The models were fine-tuned on the Digital Database for Screening Mammography (DDSM) dataset using pre-trained weights derived from ImageNet. Among the architectures evaluated, ResNet50 demonstrated superior performance, achieving an accuracy of 88%, precision of 85%, recall of 90%, and a ROC AUC of 0.92, outperforming both VGG16 and InceptionV3 across all metrics. These findings highlight the effectiveness of transfer learning in improving the accuracy and reliability of breast cancer detection.
A sophisticated deep ensemble transfer learning model, combined with a neural network classifier, was proposed by Arora et al. in [28] for the automated extraction of features and classification of mammographic images. This approach involved the pre-processing of images, the extraction of robust features using the ensemble model, and the optimization of these features into a cohesive vector for classification. The neural network classifier successfully distinguished between benign and malignant tumors, achieving an accuracy of 88% and an AUC of 0.88, thereby demonstrating the promising capabilities of this robust computer-aided diagnosis (CADx) system for breast cancer classification.
Aguerchi et al. [7] developed a highly accurate convolutional neural network (CNN) model for breast cancer detection from mammographic images. The methodology is based on the Particle Swarm Optimization (PSO) algorithm, which identifies optimal hyperparameters and structural configurations for the CNN. The PSO-tuned CNN achieved accuracies of 98.23% and 97.98% on the DDSM and MIAS datasets, respectively.
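The sketch below illustrates the general PSO loop for hyperparameter search; the two-dimensional search space (learning-rate exponent and dropout rate) and the surrogate fitness function are placeholders for an actual train-and-validate run, not the setup of [7].

```python
# Minimal PSO sketch for CNN hyperparameter search. The fitness function
# is a placeholder; in practice it would train a CNN with the candidate
# hyperparameters and return validation accuracy.
import numpy as np

rng = np.random.default_rng(0)

def fitness(params):
    lr_exp, dropout = params  # hypothetical search dimensions
    # Placeholder objective standing in for "train CNN, return val accuracy".
    return -((lr_exp + 3.0) ** 2 + (dropout - 0.5) ** 2)

n_particles, n_iters, dim = 10, 30, 2
lo, hi = np.array([-5.0, 0.1]), np.array([-1.0, 0.9])  # per-dimension bounds
pos = rng.uniform(lo, hi, size=(n_particles, dim))
vel = np.zeros_like(pos)
pbest, pbest_val = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[pbest_val.argmax()]

for _ in range(n_iters):
    r1, r2 = rng.random((2, n_particles, dim))
    # Inertia + cognitive (personal best) + social (global best) terms.
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, lo, hi)
    vals = np.array([fitness(p) for p in pos])
    improved = vals > pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    gbest = pbest[pbest_val.argmax()]

print("best learning-rate exponent and dropout:", gbest)
```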
Histopathological images: Mansour [29] introduced a computer-assisted system for breast cancer detection, employing an adaptive learning-based Gaussian Mixture Model (GMM) alongside feature extraction using AlexNet-DNN, complemented by principal component analysis (PCA) and linear discriminant analysis (LDA). The proposed approach achieved a performance score of 96.70% for the AlexNet-FC7 model on the BreakHis dataset.
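A minimal sketch of the PCA-plus-LDA stage is given below, with random placeholders standing in for the AlexNet-FC7 features of [29]; the component count is illustrative.

```python
# Sketch of dimensionality reduction (PCA) followed by linear discriminant
# analysis on deep features; the feature matrix is a random placeholder.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4096))   # placeholder for AlexNet FC7 features
y = rng.integers(0, 2, size=300)   # placeholder benign/malignant labels

pipe = make_pipeline(PCA(n_components=50), LinearDiscriminantAnalysis())
print(cross_val_score(pipe, X, y, cv=5).mean())
```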
Agarwal et al. [30] evaluated the efficacy of CNNs using four widely recognized architectures: VGG16, VGG19, MobileNet, and ResNet50, for breast cancer classification on histopathological images from the BreakHis dataset. Among the evaluated classifiers, VGG16 exhibited superior performance, achieving an accuracy of 94.67%, precision of 92.60%, F1-score of 85.21%, and recall of 80.52%, substantiating its effectiveness in distinguishing malignant from non-malignant tumors.
Spanhol et al. [31] published a dataset containing 7,909 breast cancer histology images, classified into benign and malignant categories. The primary objective of this dataset is to facilitate the automatic classification of these images into two groups, providing medical professionals with a useful computer-aided diagnosis tool. Their analysis achieved accuracies of 80% to 85%, indicating room for improvement. The researchers employed machine learning algorithms such as kNN, SVM, quadratic discriminant analysis, and random forests for feature analysis.
MRI images: Zhou et al. [4] utilized a three-dimensional DCNN to detect and localize breast cancer in dynamic contrast-enhanced MRI datasets. Despite relatively limited training, the 3D-CNN achieved an accuracy of 83.7%, showcasing the potential of CNNs for MRI-based breast cancer detection.
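A small 3D CNN of the kind described can be sketched as follows; the volume size and layer widths are assumptions, not the architecture of [4].

```python
# Sketch of a compact 3D CNN for volumetric DCE-MRI input; the input
# dimensions (depth x height x width x channel) are illustrative.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(32, 64, 64, 1)),
    layers.Conv3D(16, 3, activation="relu", padding="same"),
    layers.MaxPooling3D(),
    layers.Conv3D(32, 3, activation="relu", padding="same"),
    layers.MaxPooling3D(),
    layers.GlobalAveragePooling3D(),
    layers.Dense(1, activation="sigmoid"),  # lesion vs. no lesion
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```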
Yurttakal et al. [32] developed a multilayer convolutional neural network (CNN) that used pixel information and on-line data augmentation to detect malignant or benign lesions in MRI images. Their model achieved an accuracy of 98.33%, underscoring the potential of pixel-based feature extraction and data augmentation to improve model efficacy.
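On-line augmentation of this sort is commonly implemented with per-batch random transformations; the sketch below uses Keras's ImageDataGenerator with illustrative parameter values and a hypothetical directory path, not the exact settings of [32].

```python
# Sketch of on-line (per-batch) data augmentation during training;
# parameter values and the data path are illustrative placeholders.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

augmenter = ImageDataGenerator(
    rotation_range=15,        # small random rotations
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True,
    rescale=1.0 / 255,
)
# train_gen = augmenter.flow_from_directory(
#     "mri_lesions/",           # hypothetical dataset directory
#     target_size=(64, 64), color_mode="grayscale", class_mode="binary")
# model.fit(train_gen, epochs=...)  # augmentation happens on the fly, per epoch
```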
In the field of ultrasound imaging, Ragab et al. [33] aimed to identify and classify breast cancer using an innovative ensemble deep learning-based clinical decision support system. To accurately identify tumor-affected regions, the researchers developed an optimal multilevel thresholding technique for image segmentation. In addition, they established a feature extraction ensemble comprising three distinct deep learning models, combined with an effective machine learning classifier for breast cancer diagnosis.
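The segmentation stage can be approximated as follows; scikit-image's multi-Otsu thresholding stands in for the optimized multilevel thresholding of [33], and the input image and region labels are placeholders.

```python
# Sketch of multilevel-threshold segmentation of an ultrasound image;
# multi-Otsu stands in for the paper's optimized thresholding method.
import numpy as np
from skimage.filters import threshold_multiotsu

rng = np.random.default_rng(0)
img = rng.random((128, 128))                      # placeholder ultrasound image
thresholds = threshold_multiotsu(img, classes=3)  # two thresholds -> three regions
regions = np.digitize(img, bins=thresholds)       # region labels 0/1/2 (illustrative)
```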
Eroğlu et al. [34] designed a hybrid CNN system that utilizes ultrasonography images for breast cancer diagnosis. This system extracts features from AlexNet, MobileNetV2, and ResNet50, concatenates them, and applies the mRMR (minimum Redundancy Maximum Relevance) feature selection method to identify the most significant features. Classification was performed using SVM and kNN classifiers, leading to an outstanding accuracy of 95.6%.
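A sketch of this fuse-select-classify strategy is shown below. scikit-learn has no mRMR implementation, so a mutual-information ranking serves as a simplified proxy; the feature matrices are random placeholders for the three backbones' embeddings.

```python
# Sketch of multi-backbone feature fusion, relevance-based selection, and
# SVM classification; mutual information is a simplified proxy for mRMR.
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.svm import SVC

rng = np.random.default_rng(0)
f_alexnet   = rng.normal(size=(150, 256))  # placeholder AlexNet features
f_mobilenet = rng.normal(size=(150, 256))  # placeholder MobileNetV2 features
f_resnet    = rng.normal(size=(150, 256))  # placeholder ResNet50 features
y = rng.integers(0, 2, size=150)           # placeholder labels

X = np.concatenate([f_alexnet, f_mobilenet, f_resnet], axis=1)  # fusion
X_sel = SelectKBest(mutual_info_classif, k=200).fit_transform(X, y)
svm = SVC(kernel="rbf").fit(X_sel, y)
```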
One study [
35] utilized two breast ultrasound datasets from different platforms, with breast ultrasound images as the primary dataset. The BUSI dataset contains 780 images, including 133 normal, 210 malignant, and 437 benign specimens. Another dataset (referred to as dataset B) includes 163 images comprising 110 normal and 53 malignant specimens. To improve the dataset, generative adversarial networks (GANs) were used to augment the data. The study explored the classification of breast ultrasound images using deep learning (DL), comparing CNN-AlexNet and transfer learning approaches, both with and without data augmentation. Over 60 training epochs with a learning rate of 0.0001, their model achieved an accuracy of 94% for the BUSI dataset, 92% for dataset B, and 99% when data augmentation was applied.
6. Conclusions
This research presents a CNN-based approach for the automated prediction and diagnosis of breast cancer using various imaging modalities, including mammography, ultrasound, MRI, and histopathology datasets. The model achieved exceptional accuracy rates, demonstrating its ability to identify significant features and provide reliable predictions. Its simplicity, efficiency, and adaptability make it suitable for clinical applications, especially in contrast to more complex ensemble or hybrid models. The model's capacity to generalize across diverse imaging datasets positions it as a transformative tool in computer-aided diagnosis, offering valuable support to radiologists in the early detection of breast cancer.
While the model performs well, the study identified key limitations, particularly related to dataset size, diversity, and image quality. These factors may hinder the model’s clinical utility despite its strong performance. These findings suggest the need for further research to evaluate the model on larger, more diverse datasets, and to explore enhancements such as integrating clinical and genomic data for more personalized predictions. In conclusion, this study lays a strong foundation for deep learning-based breast cancer diagnosis. By addressing its limitations and leveraging the model’s scalability, future research can improve early detection and patient outcomes, facilitating its integration into clinical workflows.